Code for the RAG pipeline (raw files in TXT or PDF format --> LlamaCloud --> OpenAI embeddings --> Pinecone)
The RAG pipeline processes and stores documents so that relevant information can be searched and retrieved when needed. It breaks documents into smaller chunks, stores their embeddings in a vector database (Pinecone), and uses OpenAI models to answer questions based on those documents. This guide explains how to set up and use the system step by step.
Ensure the following prerequisites are in place:
- Python version: Python 3.8 or higher (Python 3.10 recommended).
- Required libraries: refer to `requirements.txt`.
- OpenAI API key: obtain your OpenAI API key and store it in a `.env` file in the root directory.
- Pinecone account: needed to create the index and obtain the Pinecone API key.
- `.env` variables:

      OPENAI_API_KEY=your_openai_api_key
      PINECONE_API_KEY=your_pinecone_api_key
      LLAMA_CLOUD_API_KEY=your_llama_cloud_api_key
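Before running the notebooks, it can help to confirm that the `.env` file defines all three keys. The sketch below is a minimal, standard-library-only check. The helper names `parse_env` and `missing_keys` are hypothetical and not part of this repository, and the notebooks themselves may load the file differently.

```python
# Hypothetical helpers for illustration; not part of the repository.
REQUIRED_KEYS = ("OPENAI_API_KEY", "PINECONE_API_KEY", "LLAMA_CLOUD_API_KEY")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty, in order."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


sample = "OPENAI_API_KEY=sk-example\nPINECONE_API_KEY=\n"
print("Missing keys:", missing_keys(sample))
```

To check a real file, pass the contents of `.env` (for example, `open(".env").read()`) to `missing_keys` instead of the sample string.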
- Log in to Pinecone:
  - Navigate to Pinecone.
  - Use admin credentials to log in.
- Create a new index:
  - Click Create Index.
  - Provide a name (e.g., `ragimplementation-demo-index`).
  - Under Configuration, select `text-embedding-ada-002`.
  - Leave the metric as `cosine`.
  - Click Create Index.
- Get the API key:
  - From the left menu, click API Key.
  - Copy the API key and store it securely.
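The cosine metric scores two vectors by the angle between them and ignores their length. As a rough illustration of what the index computes at query time, the sketch below implements the metric for toy 3-dimensional vectors. Pinecone performs this internally, and real `text-embedding-ada-002` vectors have 1536 dimensions; the function name is an illustrative choice.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings.
query = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]  # same direction, different length
orthogonal = [0.0, 1.0, 0.0]      # shares no component with the query

print(cosine_similarity(query, same_direction))  # close to 1
print(cosine_similarity(query, orthogonal))      # zero
```

Because the metric ignores vector length, two chunks that point in the same direction in embedding space score as near-identical regardless of magnitude.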
- Purpose: The parsing notebook (`parsing_kb_rag.ipynb`) converts documents such as PDFs and text files into Markdown (.md), which is easier to use and process in later steps.
- Key features:
  - Reads documents: it looks for files in a specific folder.
  - Converts content: it organizes the content into clean Markdown, making it easier to work with.
- Setup:
  - Create two folders at the root level: `input_docs` and `parsed_docs`.
  - Add the files: place the documents you want to process in the `input_docs` folder.
  - Set the folder paths: update the file paths in the program to match where your documents are stored.
  - LlamaParse is used to parse the input files:

        input_raw_file_dir = "./input_docs"    # Directory containing documents
        output_raw_file_dir = "./parsed_docs"  # Directory to store markdown files
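One way to picture the folder setup is as a mapping from each supported file in `input_docs` to a markdown file in `parsed_docs`. The sketch below only plans that mapping and does not call LlamaParse. The helper `plan_outputs` and the extension set are assumptions based on the formats mentioned in this guide (txt and pdf), not code from the notebook.

```python
from pathlib import Path

input_raw_file_dir = "./input_docs"    # Directory containing documents
output_raw_file_dir = "./parsed_docs"  # Directory to store markdown files

# Assumption: only the formats named in this guide are processed.
SUPPORTED_EXTENSIONS = {".pdf", ".txt"}


def plan_outputs(file_names, output_dir):
    """Map each supported input file name to its markdown output path."""
    plan = {}
    for name in sorted(file_names):
        path = Path(name)
        if path.suffix.lower() in SUPPORTED_EXTENSIONS:
            plan[name] = Path(output_dir) / (path.stem + ".md")
    return plan


# List the input folder if it exists; otherwise plan nothing.
input_dir = Path(input_raw_file_dir)
names = [p.name for p in input_dir.iterdir() if p.is_file()] if input_dir.is_dir() else []
for source, target in plan_outputs(names, output_raw_file_dir).items():
    print(f"{source} -> {target}")
```

Unsupported files are skipped silently here; a real run might prefer to log them so that missing documents are noticed early.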
This notebook (referred to later in this guide as rag_pinecone) processes the parsed markdown files, generates embeddings, and uploads them to a Pinecone vector database. Storing the knowledge in vectorized form is what enables efficient retrieval-augmented generation (RAG).
- Reads parsed markdown files from the output directory of `parsing_kb_rag.ipynb`.
- Splits content into smaller, meaningful chunks optimized for embedding and retrieval.
- Generates embeddings using the `text-embedding-ada-002` model.
- Uploads embeddings to a Pinecone database under a specified namespace for organized storage.
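The chunking step can be sketched as follows. This is a simplified character-based splitter with overlap, not the splitter the notebook actually uses; the function name and the `chunk_size` and `chunk_overlap` defaults are illustrative assumptions.

```python
def split_into_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size chunks where neighbours share an overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    chunks = []
    step = chunk_size - chunk_overlap  # how far the window advances each time
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


markdown_text = "# Title\n\nFirst paragraph of the parsed document. " * 5
for i, chunk in enumerate(split_into_chunks(markdown_text, chunk_size=80, chunk_overlap=20)):
    print(i, repr(chunk[:40]))
```

The overlap means a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks, at the cost of embedding some text twice.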
- Specify the input directory containing parsed markdown files:
input_parsed_file_dir = "./parsed_docs" # Directory with parsed markdown files
- Authenticate with Pinecone using admin credentials.
- Create an index in Pinecone with the following configuration:
  - Index name: choose a name relevant to your project, as needed.
  - Embedding model: `text-embedding-ada-002`.
  - Namespace: assign a namespace for structured storage, as needed.
This setup ensures that your markdown content is effectively chunked, vectorized, and indexed for seamless retrieval.
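To make the upload step more concrete, the sketch below shows one plausible shape for the records sent to Pinecone: an id, the embedding values, and metadata holding the chunk text. The `fake_embed` function is a placeholder for the OpenAI embedding call, and the id scheme, metadata fields, and batch size are assumptions rather than details taken from the notebook.

```python
def build_records(source, chunks, embed):
    """Turn text chunks into records with an id, vector values, and metadata."""
    records = []
    for i, chunk in enumerate(chunks):
        records.append({
            "id": f"{source}-{i}",      # assumed id scheme: file name plus chunk index
            "values": embed(chunk),     # embedding vector for this chunk
            "metadata": {"source": source, "text": chunk},
        })
    return records


def batched(items, batch_size=100):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def fake_embed(text):
    """Placeholder embedder; a real one would return 1536 floats from the API."""
    return [float(len(text)), 0.0, 1.0]


records = build_records("guide.md", ["first chunk", "second chunk"], fake_embed)
for batch in batched(records, batch_size=1):
    print([record["id"] for record in batch])
```

With the Pinecone Python client, each batch would typically be passed to the index's upsert call together with the chosen namespace.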
This notebook is designed to query the Pinecone database and generate responses based on retrieved document chunks. It serves as a testing and evaluation tool for retrieval-augmented generation (RAG) performance.
- Retrieves relevant document chunks from the Pinecone vector database using similarity search.
- Generates responses using the ChatOpenAI model (`gpt-4o-mini`).
- Utilizes a LangChain prompt template to structure queries effectively.
- Filters results based on similarity thresholds to ensure high-quality and contextually accurate responses.
- Configure environment variables:
  - Store your OpenAI API key and other necessary credentials in a `.env` file.
- Update the Pinecone index and namespace in the notebook:

      index_name = "<put your index name here which was used in rag_pinecone>"
      namespace = "<put your namespace name here which was used in rag_pinecone>"
- Define queries in a list within the notebook.
- Retrieve relevant chunks using Pinecone’s similarity search.
- Filter results based on a similarity threshold to eliminate low-relevance data.
- Generate responses by passing the retrieved context and query to the LLM.
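The query flow above can be sketched with stubbed search results. The match dictionaries mimic an id/score/metadata result shape, the 0.75 threshold is an arbitrary illustrative value, and the prompt wording is a plain-string stand-in for the LangChain prompt template. None of these are taken from the notebook, and the final LLM call is left out.

```python
# Assumed prompt wording; the notebook's actual template may differ.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n"
    "If the context is insufficient, say so.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
)


def filter_matches(matches, threshold=0.75):
    """Keep matches scoring at or above the threshold, best first."""
    kept = [m for m in matches if m["score"] >= threshold]
    return sorted(kept, key=lambda m: m["score"], reverse=True)


def build_prompt(question, matches):
    """Join the retrieved chunk texts into a context block and fill the template."""
    context = "\n---\n".join(m["metadata"]["text"] for m in matches)
    return PROMPT_TEMPLATE.format(context=context, question=question)


# Stubbed results standing in for a Pinecone similarity search.
matches = [
    {"id": "guide.md-0", "score": 0.91, "metadata": {"text": "Indexes use the cosine metric."}},
    {"id": "guide.md-3", "score": 0.42, "metadata": {"text": "Unrelated passage."}},
    {"id": "guide.md-1", "score": 0.80, "metadata": {"text": "Embeddings come from text-embedding-ada-002."}},
]

relevant = filter_matches(matches)
print(build_prompt("Which metric does the index use?", relevant))
```

A higher threshold trades recall for precision: fewer chunks reach the model, but those that do are more likely to be on topic.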
This setup ensures accurate, context-aware responses by leveraging vector-based retrieval and OpenAI’s language model.
