# RAG with LangChain

## Overview
* One of the most powerful applications enabled by LLMs is question-answering chatbots.
* These apps can answer questions about specific source information.
* They use a technique known as Retrieval-Augmented Generation (RAG).

#### What is RAG?
* RAG is the technique of augmenting an LLM's knowledge with additional data it was not trained on:
    * Private data.
    * Recent data (introduced after the LLM's training cutoff date).

#### LangChain and RAG
* LangChain has a number of components designed to help build RAG apps.

#### Our focus now: RAG for unstructured data
* For LangChain RAG with structured data, see:
    * [RAG with SQL data](https://python.langchain.com/docs/use_cases/sql/)
    * [RAG with code data](https://python.langchain.com/docs/use_cases/code_understanding)

## RAG Architecture
* A typical RAG app has two main components (the second is sometimes split in two, giving three):
    * Indexing: load the source data and index it.
    * Retrieval and generation: retrieve the data relevant to a question, then generate the answer.

#### Steps of the Indexing phase
* Load text with a [Document Loader](https://python.langchain.com/docs/modules/data_connection/document_loaders/).
* Split text into small chunks with a [Splitter](https://python.langchain.com/docs/modules/data_connection/document_transformers/).
* Convert chunks into [embeddings](https://python.langchain.com/docs/modules/data_connection/text_embedding/) and store them in a [vector database](https://python.langchain.com/docs/modules/data_connection/vectorstores/), also called a vector store.
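
The three indexing steps above can be sketched with the Python standard library alone. This is a toy illustration, not LangChain's API: `split_text`, `embed`, and `build_index` are hypothetical names, the "loader" is just an in-memory string, the "embedding" is a plain word count standing in for a real embedding model, and the "vector store" is a Python list.

```python
import re
from collections import Counter


def split_text(text, chunk_size=80, overlap=20):
    """Split text into character chunks of at most chunk_size, overlapping by `overlap`."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    chunks, start, step = [], 0, chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def embed(text):
    """Toy embedding (assumption): sparse word counts instead of a learned dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(documents, chunk_size=80, overlap=20):
    """Split every document, embed every chunk, and keep (chunk, vector) pairs in a list."""
    return [
        (chunk, embed(chunk))
        for doc in documents
        for chunk in split_text(doc, chunk_size, overlap)
    ]


# "Loading" is simulated here with an in-memory string (hypothetical sample text).
documents = ["RAG augments an LLM with external data. " * 5]
index = build_index(documents)
```

In a real app, each of these stand-ins would be replaced by the corresponding LangChain component linked in the steps above. The overlap between chunks is a common choice so that a sentence cut at a chunk boundary still appears whole in at least one chunk.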

#### Steps of the Retrieval and Generation phase
* Given a question, the most relevant chunks are retrieved from the vector database using a [Retriever](https://python.langchain.com/docs/modules/data_connection/retrievers/).
* The question and the chunks are sent to the [LLM](https://python.langchain.com/docs/modules/model_io/llms/) in a prompt. The LLM then produces the answer. You can use a [ChatModel](https://python.langchain.com/docs/modules/model_io/chat) instead of an LLM.
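
The two steps above can be sketched in plain Python as follows. Again, this is a minimal illustration under stated assumptions rather than LangChain's API: `embed`, `cosine`, `retrieve`, `build_prompt`, and `answer` are hypothetical names, the word-count "embedding" stands in for a real embedding model, the sample chunks are made up, and the model is any callable that takes a prompt string and returns text.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): sparse word counts instead of a learned dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, index, k=2):
    """Return the k chunks whose vectors are most similar to the question's vector."""
    query = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Combine the retrieved chunks and the question into a single prompt."""
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(question, index, llm, k=2):
    """Retrieve relevant chunks, build the prompt, and pass it to the model callable."""
    return llm(build_prompt(question, retrieve(question, index, k)))


# Hypothetical sample chunks, indexed as (chunk, vector) pairs.
chunks = [
    "LangChain provides document loaders and text splitters.",
    "A vector store holds embeddings of text chunks.",
    "Paris is the capital of France.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]
question = "What is the capital of France?"
prompt = build_prompt(question, retrieve(question, index, k=1))
```

Passing the model in as a callable keeps the sketch independent of any provider; an LLM or a ChatModel wrapper could be plugged in at that point.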

## Resources available from LangChain
* [RAG Quickstart Guide](https://python.langchain.com/docs/use_cases/question_answering/quickstart).
* [How to get the source documents used to produce a RAG answer](https://python.langchain.com/docs/use_cases/question_answering/sources).
* [How to stream RAG answers](https://python.langchain.com/docs/use_cases/question_answering/streaming).
* [How to add chat history to a RAG app](https://python.langchain.com/docs/use_cases/question_answering/chat_history).
* [How to do RAG when each user has their own private data](https://python.langchain.com/docs/use_cases/question_answering/per_user).
* [How to use Agents for RAG](https://python.langchain.com/docs/use_cases/question_answering/conversational_retrieval_agents).
* [How to use RAG with local models](https://python.langchain.com/docs/use_cases/question_answering/local_retrieval_qa).