# Llama Index RAG Retrieval and Ingestion

This notebook presents a straightforward implementation of a Retrieval-Augmented Generation (RAG) system using the llama_index library. The system ingests documents, creates a vector index, and retrieves the most relevant content based on a query.

## System Introduction

In this system, documents are ingested from either text strings or a directory, indexed using llama_index, and later retrieved using a query. A language model is used in other parts of the overall system (not shown here) to generate queries or hypothetical documents, but this implementation focuses on the vector database management with llama_index.

## Underlying Concept

Building retrieval by hand over a large document corpus usually means writing custom code for loading, chunking, embedding, and searching. llama_index handles document ingestion, vector indexing, and similarity search behind a small API, so the system needs little custom code and can be paired with a language model.

## System Components

1. **Document Retrieval Module:** Loads a persisted llama_index vector store and retrieves documents relevant to a query.
2. **Text Ingestion Module:** Ingests a list of raw text strings into the vector store. It creates or appends to an existing index and persists the update.
3. **Corpus Ingestion Module:** Reads documents from a specified directory using the SimpleDirectoryReader, then creates or updates the llama_index vector store.

## How It Works

### 1. Text Preprocessing and Vector Indexing

- The retrieval module loads the persisted index from disk using a specified directory.
- Ingestion modules process raw text or directory files and update the vector store accordingly.
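
The create-or-append-then-persist flow described above can be illustrated without llama_index. The following is a minimal sketch that uses a plain JSON file as a stand-in for the vector store. The file name `toy_store.json` and the functions `load_or_create` and `ingest` are hypothetical and are not part of llama_index or of the modules shown later.

```python
import json
import tempfile
from pathlib import Path

STORE_FILE = "toy_store.json"  # hypothetical file name, not one llama_index uses


def load_or_create(persist_dir: str) -> list[str]:
    """Return the persisted texts, or an empty store if none exists yet."""
    path = Path(persist_dir) / STORE_FILE
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return []


def ingest(texts: list[str], persist_dir: str) -> int:
    """Append texts to the store, persist it, and return the new store size."""
    store = load_or_create(persist_dir)  # reuse an existing store if present
    store.extend(texts)                  # append instead of overwriting
    out_dir = Path(persist_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / STORE_FILE).write_text(json.dumps(store), encoding="utf-8")
    return len(store)


# Two ingestion passes against the same directory, then a fresh reload.
with tempfile.TemporaryDirectory() as demo_dir:
    first = ingest(["alpha", "beta"], demo_dir)   # creates the store
    second = ingest(["gamma"], demo_dir)          # appends to it
    reloaded = load_or_create(demo_dir)           # what a restart would see
```

The second call finds the store written by the first, which is the property that lets the real system be updated incrementally.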

### 2. Retrieval Mechanism

- A query is submitted to the retrieval module, which loads the index and searches for the most similar nodes/documents.
- The retrieved nodes are then returned as a list of text strings.
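
To make "most similar" concrete, here is a small sketch of top-k retrieval by cosine similarity over toy vectors. It assumes cosine similarity as the metric and hand-written 3-dimensional vectors in place of real embeddings. It illustrates the idea only and is not how llama_index implements its search.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, nodes, k=2):
    """nodes is a list of (text, vector) pairs; return the k most similar texts, best first."""
    ranked = sorted(nodes, key=lambda n: cosine(query_vec, n[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy "embeddings": made-up vectors, not produced by any model.
nodes = [
    ("cats purr", [1.0, 0.0, 0.0]),
    ("dogs bark", [0.0, 1.0, 0.0]),
    ("kittens meow", [0.9, 0.1, 0.0]),
]
results = top_k([1.0, 0.0, 0.0], nodes, k=2)
print(results)  # ['cats purr', 'kittens meow']
```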

### 3. Corpus Ingestion

- The corpus ingestion module reads all documents from a directory, ingests them into the index, and persists the updated index.

## System Advantages

- **Automated Vector Management:** llama_index simplifies document ingestion and indexing.
- **Scalability:** Easily ingest and index new documents from text or entire corpora without significant re-engineering.
- **Seamless Integration:** Works alongside modern language models for end-to-end RAG pipelines.

## Practical Benefits

- **Improved Retrieval Accuracy:** Similarity search over vector embeddings matches passages by meaning rather than by exact keyword overlap.
- **Flexibility:** Supports ingestion from both raw text and document directories.
- **Easy Deployment:** Because the index is persisted to disk, the system can be restarted without re-indexing and updated incrementally.

## Implementation Insights

- The **retrieve** module leverages llama_index's `StorageContext` and `VectorIndexRetriever` to load and search the index.
- The **ingest_texts** module creates or updates the index using a list of text strings and persists changes using the storage context.
- The **ingest_corpus** module uses the `SimpleDirectoryReader` to load files from a directory, then either creates a new index or appends to an existing one before persisting it.

Each ingestion module persists the index after every update, so the on-disk vector store reflects the latest ingested documents without manual steps.
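
As a rough orientation before reading the actual modules, the three components might look something like the sketch below. The function signatures, the `persist_dir` and `top_k` parameters, and the `has_persisted_index` heuristic are assumptions made for illustration, and the real `retrieve.py`, `ingest.py`, and `ingest_corpus.py` may differ. The imports assume llama_index 0.10 or later, where the core classes live under `llama_index.core`. Running the functions also requires a configured embedding model.

```python
import os


def has_persisted_index(persist_dir: str) -> bool:
    """Heuristic (an assumption): any non-empty directory holds a persisted index."""
    return os.path.isdir(persist_dir) and bool(os.listdir(persist_dir))


def _load_index(persist_dir: str):
    # Imports are local so this file can be imported without llama_index installed.
    from llama_index.core import StorageContext, load_index_from_storage

    storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
    return load_index_from_storage(storage_context)


def retrieve(query: str, persist_dir: str, top_k: int = 3) -> list[str]:
    """Load the persisted index and return the text of the top_k most similar nodes."""
    from llama_index.core.retrievers import VectorIndexRetriever

    index = _load_index(persist_dir)
    retriever = VectorIndexRetriever(index=index, similarity_top_k=top_k)
    return [hit.node.get_content() for hit in retriever.retrieve(query)]


def _ingest_documents(documents, persist_dir: str) -> None:
    """Append to an existing index or create a new one, then persist it."""
    from llama_index.core import VectorStoreIndex

    if has_persisted_index(persist_dir):
        index = _load_index(persist_dir)
        for doc in documents:
            index.insert(doc)
    else:
        index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=persist_dir)


def ingest_texts(texts: list[str], persist_dir: str) -> None:
    """Wrap raw strings as Documents and ingest them."""
    from llama_index.core import Document

    _ingest_documents([Document(text=t) for t in texts], persist_dir)


def ingest_corpus(corpus_dir: str, persist_dir: str) -> None:
    """Read every file in corpus_dir with SimpleDirectoryReader and ingest the result."""
    from llama_index.core import SimpleDirectoryReader

    documents = SimpleDirectoryReader(input_dir=corpus_dir).load_data()
    _ingest_documents(documents, persist_dir)
```

Routing both ingestion paths through one helper keeps the create-or-append decision and the persist call in a single place.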

## Summary

This simple llama_index RAG system automates document ingestion and indexing. The library manages vector storage and similarity search, so the retrieval layer can be combined with a language model in a full RAG pipeline.

## Code Implementation

The cells below display the source of each external module with the IPython `%pycat` magic, so you can review the code without executing it.

In [None]:
%pycat retrieve.py

In [None]:
%pycat ingest.py

In [None]:
%pycat ingest_corpus.py