# Lesson 1: LlamaIndex Basics

There are five key stages in RAG (loading, indexing, storing, querying, and evaluation):

![image.png](attachment:cdaaece7-a77f-44be-90b8-982a54a801ca.png)

Reference: https://docs.llamaindex.ai/en/stable/getting_started/concepts.html

## Loading Stage

#### Document
A Document is a container around any data source, for instance a PDF, an API output, or data retrieved from a database.

#### Nodes
A Node is the atomic unit of data in LlamaIndex and represents a “chunk” of a source Document. Nodes have metadata that relate them to the document they are in and to other nodes.
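
To make the Document/Node relationship concrete, here is a minimal sketch in plain Python. The `Document` and `Node` dataclasses, the `split_into_nodes` helper, and the fixed 40-character chunk size are all hypothetical stand-ins chosen for illustration; they are not LlamaIndex's own classes, which use more careful text splitters.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Document:
    """Toy stand-in for a source document (hypothetical, not the LlamaIndex class)."""
    doc_id: str
    text: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Node:
    """Toy stand-in for one chunk of a Document."""
    node_id: str
    text: str
    ref_doc_id: str                 # link back to the source document
    prev_id: Optional[str] = None   # link to the preceding chunk, if any
    next_id: Optional[str] = None   # link to the following chunk, if any
    metadata: dict = field(default_factory=dict)


def split_into_nodes(doc: Document, chunk_size: int = 40) -> list[Node]:
    """Split a document into fixed-size character chunks and link neighbours."""
    chunks = [doc.text[i:i + chunk_size] for i in range(0, len(doc.text), chunk_size)]
    nodes = [
        Node(node_id=f"{doc.doc_id}-{i}", text=chunk, ref_doc_id=doc.doc_id,
             metadata=dict(doc.metadata))  # each node inherits a copy of the doc metadata
        for i, chunk in enumerate(chunks)
    ]
    for before, after in zip(nodes, nodes[1:]):
        before.next_id = after.node_id
        after.prev_id = before.node_id
    return nodes


doc = Document("doc1", "LlamaIndex splits documents into nodes. " * 3, {"source": "example.txt"})
nodes = split_into_nodes(doc)
for node in nodes:
    print(node.node_id, node.prev_id, node.next_id, repr(node.text))
```

Each node keeps a `ref_doc_id` pointing at its document plus `prev_id`/`next_id` links to its neighbours, which is the kind of relationship metadata described above.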


## Indexing Stage

#### Indexes
Once you’ve ingested your data, LlamaIndex will help you index the data into a structure that’s easy to retrieve. This usually involves generating vector embeddings which are stored in a specialized database called a vector store. 

#### Embeddings
LLMs generate numerical representations of data called embeddings. When filtering your data for relevance, LlamaIndex will convert queries into embeddings, and your vector store will find data that is numerically similar to the embedding of your query.
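
The idea of "numerically similar" can be illustrated with a toy example. The sketch below assumes a word-count vector as the embedding and cosine similarity as the measure; real embedding models produce dense learned vectors, and the `embed` and `cosine_similarity` functions here are illustrative names, not part of LlamaIndex.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a sparse word-count vector (assumption for illustration)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


chunks = [
    "the cat sat on the mat",
    "vector stores hold embeddings",
    "stock prices fell sharply today",
]
# A list of (text, vector) pairs plays the role of the vector store.
vector_store = [(chunk, embed(chunk)) for chunk in chunks]

query_vec = embed("where are embeddings stored")
best_chunk, _ = max(vector_store, key=lambda item: cosine_similarity(query_vec, item[1]))
print(best_chunk)
```

The query is embedded with the same function as the stored chunks, so the comparison happens in one shared vector space.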


## Querying Stage

#### Retrievers
A retriever defines how to retrieve relevant context from an index when given a query. Your retrieval strategy is key to the relevancy of the data retrieved and the efficiency with which it’s done.
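
One common retrieval strategy is top-k similarity search. The following is a minimal sketch under assumed conditions: the `TopKRetriever` class is hypothetical, and the three-dimensional vectors are made up by hand rather than produced by an embedding model.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TopKRetriever:
    """Hypothetical retriever: returns the k entries most similar to the query vector."""

    def __init__(self, index, top_k=2):
        self.index = index      # list of (text, vector) pairs
        self.top_k = top_k

    def retrieve(self, query_vector):
        scored = [(cosine(query_vector, vec), text) for text, vec in self.index]
        # nlargest returns the results sorted from most to least similar.
        return heapq.nlargest(self.top_k, scored)


index = [
    ("refund policy", [0.9, 0.1, 0.0]),
    ("shipping times", [0.1, 0.9, 0.0]),
    ("return an item", [0.8, 0.2, 0.1]),
]
retriever = TopKRetriever(index, top_k=2)
results = retriever.retrieve([1.0, 0.0, 0.0])
for score, text in results:
    print(f"{score:.3f}  {text}")
```

Raising `top_k` trades efficiency and prompt size for recall, which is the relevancy/efficiency balance mentioned above.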

#### Routers
A router determines which retriever will be used to retrieve relevant context from the knowledge base. More specifically, the RouterRetriever class is responsible for selecting one or multiple candidate retrievers to execute a query.
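
The routing idea can be sketched without any library. In the example below, `KeywordRetriever` and `SimpleRouter` are hypothetical classes, and the keyword match is only a stand-in for whatever selection logic a real router uses; it is not how `RouterRetriever` is implemented.

```python
class KeywordRetriever:
    """Hypothetical retriever that returns documents sharing any word with the query."""

    def __init__(self, name, docs):
        self.name = name
        self.docs = docs

    def retrieve(self, query):
        words = set(query.lower().split())
        return [d for d in self.docs if words & set(d.lower().split())]


class SimpleRouter:
    """Selects one or more candidate retrievers, then runs the query on each."""

    def __init__(self, routes):
        self.routes = routes  # list of (keyword set, retriever) pairs

    def select(self, query):
        words = set(query.lower().split())
        chosen = [r for keywords, r in self.routes if words & keywords]
        # If nothing matches, fall back to querying every retriever.
        return chosen or [r for _, r in self.routes]

    def retrieve(self, query):
        results = []
        for retriever in self.select(query):
            results.extend(retriever.retrieve(query))
        return results


billing = KeywordRetriever("billing", ["invoice totals are due monthly"])
support = KeywordRetriever("support", ["reset your password from settings"])
router = SimpleRouter([
    ({"invoice", "payment"}, billing),
    ({"password", "login"}, support),
])

selected = router.select("how do I reset my password")
print([r.name for r in selected])
print(router.retrieve("how do I reset my password"))
```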

#### Node Postprocessors
A node postprocessor takes in a set of retrieved nodes and applies transformations, filtering, or re-ranking logic to them.
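
As an illustration, the sketch below chains a filtering step and a re-ranking step over a list of scored nodes. The `ScoredNode` dataclass, the 0.5 cutoff, and the keyword-based re-ranking rule are assumptions made for this example, not LlamaIndex's built-in postprocessors.

```python
from dataclasses import dataclass


@dataclass
class ScoredNode:
    text: str
    score: float


def similarity_cutoff(nodes, cutoff):
    """Filter: drop nodes whose retrieval score is below the cutoff."""
    return [n for n in nodes if n.score >= cutoff]


def rerank_by_keyword(nodes, keyword):
    """Re-rank: nodes mentioning the keyword first, then by descending score."""
    return sorted(nodes, key=lambda n: (keyword.lower() not in n.text.lower(), -n.score))


def postprocess(nodes, steps):
    """Apply each postprocessing step in order."""
    for step in steps:
        nodes = step(nodes)
    return nodes


retrieved = [
    ScoredNode("Pricing starts at $10 per month.", 0.62),
    ScoredNode("Our office is in Berlin.", 0.20),
    ScoredNode("Enterprise pricing is negotiated.", 0.55),
    ScoredNode("The free tier has usage limits.", 0.70),
]
final = postprocess(retrieved, [
    lambda ns: similarity_cutoff(ns, 0.5),
    lambda ns: rerank_by_keyword(ns, "pricing"),
])
for node in final:
    print(node.score, node.text)
```

Because every step takes a list of nodes and returns a list of nodes, steps can be composed in any order.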

#### Response Synthesizers
A response synthesizer generates a response from an LLM, using a user query and a given set of retrieved text chunks.
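
A simple way to picture this step is "stuff the retrieved chunks into a prompt and call the model". The sketch below assumes that strategy; `build_prompt`, `synthesize`, and the `fake_llm` stub (which stands in for a real model so the example runs offline) are hypothetical names, and the prompt wording is made up.

```python
def build_prompt(query, chunks):
    """Combine the user query and the retrieved chunks into one prompt string."""
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
        "Answer:"
    )


def synthesize(query, chunks, llm):
    """Send the prompt to `llm` (any callable str -> str) and return its reply."""
    return llm(build_prompt(query, chunks))


def fake_llm(prompt):
    """Stub model for offline runs: reports how many context chunks it saw."""
    n_chunks = sum(1 for line in prompt.splitlines() if line.startswith("["))
    return f"(stub answer based on {n_chunks} chunks)"


answer = synthesize(
    "What is a Node?",
    ["A Node is a chunk.", "Nodes carry metadata."],
    fake_llm,
)
print(answer)
```

Swapping `fake_llm` for a function that calls a real model would leave `synthesize` unchanged.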

## Putting it all together

#### Query Engines
A query engine is an end-to-end pipeline that allows you to ask questions over your data. It takes in a natural language query, and returns a response, along with reference context retrieved and passed to the LLM.
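
The pipeline shape can be sketched as a retrieval step followed by a synthesis step, with the response carrying its source context. Everything named below (`Response`, `SimpleQueryEngine`, the word-overlap retriever, and the echo-style synthesizer that stands in for an LLM call) is a hypothetical illustration rather than LlamaIndex's implementation.

```python
from dataclasses import dataclass


@dataclass
class Response:
    text: str
    source_chunks: list   # the context that was retrieved and passed to synthesis


class SimpleQueryEngine:
    """Hypothetical end-to-end pipeline: retrieve context, then synthesize an answer."""

    def __init__(self, retrieve, synthesize):
        self.retrieve = retrieve
        self.synthesize = synthesize

    def query(self, question):
        chunks = self.retrieve(question)
        return Response(text=self.synthesize(question, chunks), source_chunks=chunks)


corpus = [
    "Nodes are chunks of documents.",
    "Retrievers fetch relevant nodes.",
    "Bananas are yellow.",
]


def word_overlap_retrieve(question, top_k=2):
    """Rank corpus entries by the number of words shared with the question."""
    q = set(question.lower().strip("?").split())
    ranked = sorted(
        corpus,
        key=lambda c: len(q & set(c.lower().strip(".").split())),
        reverse=True,
    )
    return ranked[:top_k]


def echo_synthesize(question, chunks):
    """Stand-in for an LLM call: quotes the best chunk back."""
    return f"Based on {len(chunks)} chunks: {chunks[0]}"


engine = SimpleQueryEngine(word_overlap_retrieve, echo_synthesize)
response = engine.query("What do retrievers fetch?")
print(response.text)
print(response.source_chunks)
```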

#### Chat Engines
A chat engine is an end-to-end pipeline for having a conversation with your data: multiple back-and-forth exchanges instead of the single question-and-answer interaction a query engine provides. By maintaining a history of the conversation, the chat engine can give answers that are contextually aware of previous interactions.
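
The role of conversation history can be shown with a small sketch. `SimpleChatEngine` and the `fake_llm` stub below are hypothetical; the stub only demonstrates that earlier turns are visible to the model on later turns, and retrieval is left out to keep the example short.

```python
class SimpleChatEngine:
    """Hypothetical chat loop that passes the full history to the model each turn."""

    def __init__(self, llm):
        self.llm = llm          # any callable: list of (role, text) -> str
        self.history = []       # list of (role, text) tuples

    def chat(self, message):
        self.history.append(("user", message))
        reply = self.llm(self.history)
        self.history.append(("assistant", reply))
        return reply

    def reset(self):
        """Forget the conversation so far."""
        self.history.clear()


def fake_llm(history):
    """Stub model: shows that it can see the previous user turn."""
    user_turns = [text for role, text in history if role == "user"]
    if len(user_turns) > 1:
        return f"Turn {len(user_turns)}; earlier you asked: {user_turns[-2]!r}"
    return "Turn 1; no earlier context."


engine = SimpleChatEngine(fake_llm)
first = engine.chat("What is a Node?")
second = engine.chat("And how is it created?")
print(first)
print(second)
```

The second reply can refer to the first question only because the engine stored it; after `reset()`, the next message is treated as a fresh conversation.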


# Lesson 1: LlamaIndex Implementation

from rag_llama_index import RAGLlamaIndex

# 1. code walk-through
# 2. show a few example queries