
If you find this repository helpful, please consider giving it a star⭐️

Advanced RAG Cookbooks👨🏻‍💻

Welcome to the comprehensive collection of advanced Retrieval-Augmented Generation (RAG) techniques.

Introduction🚀

RAG is a popular method that improves accuracy and relevance by finding the right information from reliable sources and transforming it into useful answers. This repository covers the most effective advanced RAG techniques with clear implementations and explanations.

The main goal of this repository is to provide a helpful resource for researchers and developers looking to use advanced RAG techniques in their projects. Building these techniques from scratch takes time, and finding proper evaluation methods can be challenging. This repository simplifies the process by offering ready-to-use implementations and guidance on how to evaluate them.

Note

This repository starts with naive RAG as a foundation and progresses to advanced techniques. It also includes research papers/references for each RAG technique, which you can explore for further reading.

Introduction to RAG💡

Large Language Models are trained on fixed datasets, which limits their ability to handle private or recent information. They can sometimes "hallucinate", providing incorrect yet believable answers. Fine-tuning can help, but it is expensive and impractical to repeat every time new data arrives. The Retrieval-Augmented Generation (RAG) framework addresses this by supplying external documents to the LLM at inference time through in-context learning. RAG ensures that the information provided by the LLM is not only contextually relevant but also accurate and up to date.

[Diagram: overview of the RAG workflow]

There are four main components in RAG:

Indexing: First, documents (in any format) are split into chunks, and embeddings for these chunks are created. These embeddings are then added to a vector store.

Retrieval: The retriever then finds the most relevant documents for the user's query, using techniques such as vector similarity search over the vector store.

Augmentation: The augmentation step combines the user's query with the retrieved context into a single prompt, ensuring the LLM has the information it needs to generate an accurate response.

Generation: Finally, the augmented prompt is passed to the model, which generates the final response to the user's query.
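The four steps above can be sketched end to end in a few lines of Python. This is a deliberately toy illustration: it uses word-count vectors in place of a real embedding model and stubs out the LLM call, but the indexing → retrieval → augmentation → generation flow is the same one the notebooks implement with real components.

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding": word counts. A real system would
    # call a neural embedding model here instead.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. Indexing: split documents into chunks and store their embeddings.
documents = [
    "RAG combines retrieval with generation.",
    "Fine-tuning retrains a model on new data.",
]
index = [(chunk, embed(chunk)) for chunk in documents]

# 2. Retrieval: rank chunks by similarity to the query.
query = "What does RAG combine?"
q_emb = embed(query)
ranked = sorted(index, key=lambda item: cosine(q_emb, item[1]), reverse=True)
context = ranked[0][0]

# 3. Augmentation: merge the query and retrieved context into a prompt.
prompt = f"Context: {context}\n\nQuestion: {query}\nAnswer:"

# 4. Generation: pass the augmented prompt to an LLM (stubbed out here).
print(prompt)
```

In a production pipeline the `embed` function and the final LLM call would be replaced by, for example, a hosted embedding model and chat model, with the vector store handling similarity search.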

These components of RAG allow the model to access up-to-date, accurate information and generate responses based on external knowledge. However, to ensure RAG systems are functioning effectively, it’s essential to evaluate their performance.

RAG Evaluation📊

Evaluating RAG applications is important for understanding how well these systems work. By checking accuracy and relevance, we can see how effectively they combine information retrieval with generative models. Evaluation helps improve RAG applications in tasks like text summarization, chatbots, and question answering, and it identifies areas for improvement, ensuring that these systems remain trustworthy as the underlying information changes. Overall, effective evaluation optimizes performance and builds confidence in RAG applications for real-world use. Each notebook in this repository contains an end-to-end RAG implementation together with a RAG evaluation step in Athina AI.
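As a minimal illustration of what retrieval evaluation measures (the notebooks themselves use Athina AI's evaluators), here is a sketch of two common retrieval metrics, hit rate and mean reciprocal rank (MRR), over a tiny hand-labelled set. The document ids and relevance labels are made up for the example:

```python
def evaluate(retrieved_lists, relevant_ids):
    # retrieved_lists: per query, the document ids a retriever returned,
    # in rank order. relevant_ids: the single correct id per query.
    hits, rr_total = 0, 0.0
    for retrieved, relevant in zip(retrieved_lists, relevant_ids):
        if relevant in retrieved:
            hits += 1
            # Reciprocal rank: 1 / position of the correct document.
            rr_total += 1.0 / (retrieved.index(relevant) + 1)
    n = len(retrieved_lists)
    return {"hit_rate": hits / n, "mrr": rr_total / n}

results = evaluate(
    retrieved_lists=[["d1", "d3"], ["d2", "d4"], ["d5", "d6"]],
    relevant_ids=["d1", "d4", "d7"],
)
print(results)  # hit_rate = 2/3, mrr = (1 + 0.5 + 0) / 3 = 0.5
```

Generation quality (faithfulness to the retrieved context, answer relevance) needs LLM-based or human judgments on top of retrieval metrics like these, which is the part the Athina AI evaluation step covers.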

[Diagram: RAG evaluation workflow in Athina AI]

RAG Techniques⚙️

Here are the details of all the RAG techniques covered in this repository.

| Technique | Tools | Description | Notebooks |
| --- | --- | --- | --- |
| Naive RAG | LangChain, Pinecone, Athina AI | Combines retrieved data with LLMs for simple and effective responses. | Open In Colab |
| Hybrid RAG | LangChain, Chromadb, Athina AI | Combines vector search with traditional methods like BM25 for better information retrieval. | Open In Colab |
| HyDE RAG | LangChain, Weaviate, Athina AI | Creates hypothetical document embeddings to find relevant information for a query. | Open In Colab |
| Parent Document Retriever | LangChain, Chromadb, Athina AI | Splits large documents into small chunks and retrieves the full parent document when a chunk matches the query. | Open In Colab |
| RAG Fusion | LangChain, LangSmith, Qdrant, Athina AI | Generates sub-queries, ranks documents with Reciprocal Rank Fusion, and uses the top results for accurate responses. | Open In Colab |
| Contextual RAG | LangChain, Chromadb, Athina AI | Compresses retrieved documents to keep only relevant details for concise and accurate responses. | Open In Colab |
| Rewrite Retrieve Read | LangChain, Chromadb, Athina AI | Rewrites the query, retrieves better data, and generates accurate answers. | Open In Colab |
| Corrective RAG | LangChain, LangGraph, Chromadb, Athina AI | Grades retrieved documents, discards irrelevant ones, or falls back to web search. | Open In Colab |
| Self RAG | LangChain, LangGraph, FAISS, Athina AI | Reflects on retrieved data to ensure accurate and complete responses. | Open In Colab |
| Adaptive RAG | LangChain, LangGraph, FAISS, Athina AI | Adjusts the retrieval method to the query type, using indexed data or web search. | Open In Colab |
| Unstructured RAG | LangChain, LangGraph, FAISS, Athina AI, Unstructured | Designed to handle documents that combine text, tables, and images. | Open In Colab |
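As one concrete example from the table, Reciprocal Rank Fusion (the merging step behind RAG Fusion, and also a common way to fuse vector and BM25 rankings in Hybrid RAG) can be sketched in a few lines. The document ids and rankings below are hypothetical; `k = 60` is the conventional smoothing constant from the original RRF formulation:

```python
def rrf(rankings, k=60):
    # Each document's fused score is the sum over all rankings of
    # 1 / (k + rank), so documents ranked highly by several retrievers
    # rise to the top even if no single retriever ranks them first.
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_a", "doc_b", "doc_c"]  # hypothetical vector-search ranking
bm25_hits = ["doc_b", "doc_d", "doc_a"]    # hypothetical BM25 ranking
fused = rrf([vector_hits, bm25_hits])
print(fused)  # doc_b first: it ranks 2nd and 1st, beating doc_a's 1st and 3rd
```

The same fusion applies in RAG Fusion, except the input rankings come from running the retriever once per generated sub-query rather than from different retrieval methods.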

Demo🎬

A quick demo of how each notebook works:

demo.mp4

Getting Started🛠️

First, clone this repository by using the following command:

```shell
git clone https://github.com/athina-ai/rag-cookbooks.git
```

Next, navigate to the project directory:

```shell
cd rag-cookbooks
```

Once you are in the `rag-cookbooks` directory, follow the detailed implementation for each technique.

Creators + Contributors👨🏻‍💻

Contributors

Contributing🤝

If you have a new technique or improvement to suggest, we welcome contributions from the community!

License📝

This project is licensed under the MIT License.
