# Build a Retrieval Augmented Generation (RAG) App

One of the most powerful applications enabled by LLMs is sophisticated question-answering (Q&A) chatbots: applications that can answer questions about specific source information. They use a technique known as Retrieval Augmented Generation, or RAG.

This tutorial will show how to build a simple Q&A application
over a text data source. Along the way we’ll go over a typical Q&A
architecture and highlight additional resources for more advanced Q&A techniques. We’ll also see
how LangSmith can help us trace and understand our application.
LangSmith will become increasingly helpful as our application grows in
complexity.

## What is RAG?

RAG is a technique for augmenting LLM knowledge with additional data.

LLMs can reason about wide-ranging topics, but their knowledge is limited to the public data they were trained on, up to a specific cutoff date. If you want to build AI applications that can reason about private data or data introduced after a model's cutoff date, you need to augment the model's knowledge with the specific information it needs. The process of retrieving the appropriate information and inserting it into the model prompt is known as Retrieval Augmented Generation (RAG).
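To make "inserting it into the model prompt" concrete, here is a minimal sketch in plain TypeScript. The helper name `buildAugmentedPrompt`, the prompt wording, and the sample chunks are all made up for illustration; none of this is LangChain's API, and a real application would get the chunks from a retriever.

```typescript
// Hypothetical helper (not part of LangChain): place retrieved text
// into the prompt so the model can answer from it.
function buildAugmentedPrompt(question: string, retrievedChunks: string[]): string {
  // Number each chunk so the model (and a human reader) can tell them apart.
  const contextBlock = retrievedChunks
    .map((chunk, i) => `[${i + 1}] ${chunk}`)
    .join("\n");

  return [
    "Answer the question using only the context below.",
    "If the context does not contain the answer, say you don't know.",
    "",
    "Context:",
    contextBlock,
    "",
    `Question: ${question}`,
  ].join("\n");
}

// Illustrative data only: pretend these chunks came from a private knowledge base.
const augmentedPrompt = buildAugmentedPrompt("What is our refund window?", [
  "Refunds are accepted within 30 days of purchase.",
  "Shipping takes 3 to 5 business days.",
]);

console.log(augmentedPrompt);
```

The model never needs to have seen this data during training; it only needs to read it from the prompt at query time.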

LangChain has a number of components designed to help build Q&A applications, and RAG applications more generally. 

**Note**: Here we focus on Q&A for unstructured data. If you are interested in RAG over structured data, check out our tutorial on doing [question/answering over SQL data](/v0.2/docs/tutorials/sql_qa).

## Concepts
A typical RAG application has two main components:

**Indexing**: a pipeline for ingesting data from a source and indexing it. *This usually happens offline.*

**Retrieval and generation**: the actual RAG chain, which takes the user query at run time and retrieves the relevant data from the index, then passes that to the model.

The most common full sequence from raw data to answer looks like:

### Indexing
1. **Load**: First we need to load our data. This is done with [DocumentLoaders](/v0.2/docs/concepts/#document-loaders).
2. **Split**: [Text splitters](/v0.2/docs/concepts/#text-splitters) break large `Documents` into smaller chunks. This is useful both for indexing data and for passing it into a model, since large chunks are harder to search over and won't fit in a model's finite context window.
3. **Store**: We need somewhere to store and index our splits, so that they can later be searched over. This is often done using a [VectorStore](/v0.2/docs/concepts/#vectorstores) and [Embeddings](/v0.2/docs/concepts/#embedding-models) model.

![index_diagram](../../../static/img/rag_indexing.png)
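The load, split, and store steps above can be sketched without any libraries. This is a toy illustration under stated assumptions, not LangChain's implementation: the "load" step is an in-memory string, the splitter uses fixed-size character windows with overlap, and `embedWords` is a word-count stand-in for a real embedding model. All function and type names are hypothetical.

```typescript
// Toy indexing pipeline: load -> split -> embed -> store.
// Every name here is hypothetical; this is not LangChain's API.

interface IndexedChunk {
  text: string;
  vector: Record<string, number>; // word -> count, standing in for a real embedding
}

// Split: fixed-size character windows that overlap, so a sentence cut at a
// boundary still appears whole in at least one neighbouring chunk.
function splitText(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be smaller than chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - chunkOverlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}

// Embed: count lower-cased words. A real embedding model returns a dense vector.
function embedWords(text: string): Record<string, number> {
  // Null-prototype object so words like "constructor" are counted safely.
  const counts: Record<string, number> = Object.create(null);
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  words.forEach((word) => {
    counts[word] = (counts[word] ?? 0) + 1;
  });
  return counts;
}

// Store: keep each chunk next to its vector. A vector store plays this role
// in a real application.
function buildIndex(rawText: string, chunkSize: number, chunkOverlap: number): IndexedChunk[] {
  return splitText(rawText, chunkSize, chunkOverlap).map((text) => ({
    text,
    vector: embedWords(text),
  }));
}

// Load: here the "document" is just an in-memory string.
const sourceText =
  "Retrieval Augmented Generation pairs a retriever with a language model. " +
  "The retriever finds relevant chunks and the model answers using them.";

const toyIndex = buildIndex(sourceText, 60, 15);
console.log(toyIndex.map((entry) => entry.text));
```

Because indexing does not depend on the user's question, it can run once ahead of time, which is why this stage usually happens offline.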

### Retrieval and generation
4. **Retrieve**: Given a user input, relevant splits are retrieved from storage using a [Retriever](/v0.2/docs/concepts/#retrievers).
5. **Generate**: A [ChatModel](/v0.2/docs/concepts/#chat-models) / [LLM](/v0.2/docs/concepts/#llms) produces an answer using a prompt that includes the question and the retrieved data.

![retrieval_diagram](../../../static/img/rag_retrieval_generation.png)
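The retrieve and generate steps above can be sketched in the same library-free style. This is an illustration under assumptions, not LangChain's retriever: the vectors are hand-written three-dimensional placeholders, similarity is plain cosine similarity, and the model is passed in as a synchronous function (`ModelFn`, a hypothetical type) so the sketch runs without a network call. A real chat model call would be asynchronous.

```typescript
// Toy retrieve-then-generate step.
// Every name here is hypothetical; this is not LangChain's API.

interface StoredChunk {
  text: string;
  vector: number[];
}

// Cosine similarity between two vectors of the same length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Retrieve: score every stored chunk against the query and keep the best k.
function retrieveTopK(queryVector: number[], stored: StoredChunk[], k: number): StoredChunk[] {
  return stored
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((left, right) => right.score - left.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Stand-in for a chat model: takes the full prompt text, returns an answer.
type ModelFn = (promptText: string) => string;

// Generate: put the retrieved text and the question into one prompt.
function answerQuestion(
  question: string,
  queryVector: number[],
  stored: StoredChunk[],
  model: ModelFn,
  k: number = 2
): string {
  const contextText = retrieveTopK(queryVector, stored, k)
    .map((chunk) => chunk.text)
    .join("\n\n");
  const promptText =
    "Use the context to answer the question.\n\n" +
    `Context:\n${contextText}\n\nQuestion: ${question}`;
  return model(promptText);
}

// Made-up chunks and vectors, for illustration only.
const storedChunks: StoredChunk[] = [
  { text: "Task decomposition breaks a large task into smaller steps.", vector: [1, 0, 0] },
  { text: "Memory lets an agent recall earlier interactions.", vector: [0, 1, 0] },
  { text: "Tool use lets an agent call external APIs.", vector: [0, 0, 1] },
];

// A stub model that echoes its prompt, so we can see what a real model would receive.
const echoModel: ModelFn = (promptText) => promptText;

const modelInput = answerQuestion(
  "What is task decomposition?",
  [0.9, 0.1, 0], // pretend this is the embedded question
  storedChunks,
  echoModel
);
console.log(modelInput);
```

Passing the model in as a function keeps retrieval and generation separate, which makes it easy to inspect exactly which chunks reached the prompt.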


## Setup

### Installation

To install LangChain run:

```bash npm2yarn
npm i langchain
```

For more details, see our [Installation guide](/docs/get_started/installation).

### LangSmith

Many of the applications you build with LangChain will contain multiple steps with multiple LLM calls.
As these applications get more and more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent.
The best way to do this is with [LangSmith](https://smith.langchain.com).

After you sign up at the link above, make sure to set your environment variables to start logging traces:

```shell
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="..."
```

## Preview

In this guide we’ll build a Q&A app over a website. The specific website we will use is the [LLM Powered Autonomous
Agents](https://lilianweng.github.io/posts/2023-06-23-agent/) blog post
by Lilian Weng, so we’ll be able to ask questions about the contents of
the post.

We can create a simple indexing pipeline and RAG chain to do this in ~20
lines of code:

```{=mdx}
import ChatModelTabs from "@theme/ChatModelTabs";

<ChatModelTabs customVarName="llm" />
```