## LangChain

LangChain is an open-source framework for building applications powered by large language models (LLMs). It simplifies development by letting developers chain together components and integrate third-party tools with ease.

### Key Features of LangChain

- LangChain provides a structured way to build applications that use LLMs, making it easier to develop chatbots, virtual assistants, and other AI-driven tools.
- The framework lets us combine different components, such as data sources, databases, and APIs, to create more complex and capable applications.
- LangChain provides a set of tools that help developers experiment quickly with various configurations and integrations, reducing the time needed to build and test applications.
- It includes tools for every step of the agent development lifecycle, enabling the creation of reliable, context-aware agents.

One of the most popular uses of LangChain is building Retrieval Augmented Generation (RAG) applications.

### Retrieval Augmented Generation (RAG) Application
Retrieval Augmented Generation (RAG) is an AI technique that combines two steps:

- Retrieval: Finds relevant information from large external sources such as a database, a document collection, or the web.
- Generation: Uses an LLM to create a response based on both the retrieved information and the model's own knowledge.

#### Why is RAG needed?
While LLMs are powerful, they have two inherent limitations:

- Their knowledge is limited to the data they were trained on.
- They sometimes hallucinate (make up facts).

RAG helps address both by letting the model look up relevant information before answering, instead of relying only on what it memorized during training.

### How RAG works

- The user asks a question.
- A search engine or vector database scans the stored data and retrieves the information most relevant to the query.
- A generator (the LLM) uses the question and the retrieved information to produce a natural language response.
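
The three steps above can be sketched in plain Python. This is a toy illustration, not LangChain code: `retrieve` ranks documents by simple word overlap instead of using a real search engine or vector database, and `generate` is a hypothetical stand-in that formats a string where a real application would call an LLM. The sample documents are made up for the example.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents that share the most words with the question."""
    q_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(q_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def generate(question: str, context: list[str]) -> str:
    """Hypothetical stand-in for an LLM call: it only echoes its inputs."""
    return f"Q: {question}\nBased on: {' '.join(context)}"


# Illustrative documents (assumed for this sketch).
documents = [
    "FAISS is a library for vector similarity search.",
    "LangChain is a framework for building LLM applications.",
    "Paris is the capital of France.",
]

question = "What is the capital of France?"
context = retrieve(question, documents)   # step 2: retrieval
answer = generate(question, context)      # step 3: generation
print(answer)
```

Word overlap is only a placeholder for relevance scoring; the ingestion and retrieval steps described in the following sections replace it with embeddings and a vector store.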

### Steps in a RAG application using LangChain

#### Data Ingestion

- Load

    Data is loaded from one or more sources. Sources can include, but are not limited to, PDF, XLS, and JSON files, as well as web pages.

- Split

    The loaded data is then split into smaller text (document) chunks. This is crucial for efficient retrieval, as it lets the retriever search through smaller, more relevant pieces of information rather than entire documents.

- Embed

    In the embed stage, each chunk is converted to a vector embedding. Embeddings can be static (one fixed vector per word, as in word2vec) or contextual (the vector depends on the surrounding text, as in transformer-based models).

- Store

    These embeddings are then stored in a vector store. Popular options include FAISS, Chroma, and Astra DB.
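
The four ingestion steps can be sketched with the standard library alone. This is a simplified illustration under several assumptions, not LangChain's loaders, splitters, or vector stores: the "document" is an in-memory string, `split_text` splits on words with a small overlap, `embed` uses word counts as a toy substitute for a learned embedding, and the "vector store" is a plain Python list. All names are hypothetical.

```python
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into chunks of `chunk_size` words that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: a sparse word-count vector, not a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


# Load: a real pipeline would read this text from a PDF, JSON file, web page, etc.
raw_documents = {
    "notes.txt": (
        "LangChain is a framework for building applications powered by "
        "large language models. It helps chain components together."
    ),
}

# Split, embed, and store each chunk together with its source as metadata.
vector_store = []
for source, text in raw_documents.items():
    for chunk in split_text(text):
        vector_store.append({"text": chunk, "vector": embed(chunk), "source": source})

for entry in vector_store:
    print(entry["source"], "|", entry["text"])
```

The overlap between neighbouring chunks is a common design choice: it reduces the chance that a sentence relevant to a question is cut in half at a chunk boundary.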

#### Data Retrieval

- When a question is asked, information relevant to the question is retrieved from the vector database.
- The question and the retrieved information are combined into a prompt.
- This prompt is then fed to the LLM to generate a natural language response.

Below is a step-by-step breakdown of how this happens.

- The user asks a question.
- The question is passed to an embedding model, such as one from OpenAI or Hugging Face, to produce an embedding of the query.
- This query embedding is used to search the vector database for the most relevant chunks.
- A prompt is built from the original question, the retrieved chunks, and optional metadata (source, page number, etc.).
- This prompt is then fed to an LLM, which generates a natural language response.
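
The retrieval steps above can be sketched as follows. This is a minimal illustration, not LangChain's API: the word-count `embed` function is a toy substitute for a real embedding model, the `store` list stands in for a vector database, and the prompt template is an assumption made for this example. The final LLM call is left out, so the sketch stops at the finished prompt.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts stand in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# A tiny pre-built store; each entry keeps its text, vector, and metadata.
chunks = [
    ("Chroma is an open-source vector database.", {"source": "db.pdf", "page": 2}),
    ("Text splitters break documents into chunks.", {"source": "split.pdf", "page": 5}),
]
store = [{"text": t, "vector": embed(t), "meta": m} for t, m in chunks]


def retrieve(question: str, k: int = 1) -> list[dict]:
    """Embed the question and return the k most similar stored chunks."""
    q_vec = embed(question)
    ranked = sorted(
        store,
        key=lambda entry: cosine_similarity(q_vec, entry["vector"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, hits: list[dict]) -> str:
    """Combine the question, retrieved text, and metadata into one prompt."""
    context = "\n".join(
        f"- {h['text']} (source: {h['meta']['source']}, page {h['meta']['page']})"
        for h in hits
    )
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


question = "Which vector database is open-source?"
prompt = build_prompt(question, retrieve(question))
print(prompt)
```

Note that the prompt contains the retrieved text and its metadata, not the embedding vectors themselves; the vectors are only used to find the relevant chunks.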

Data retrieval in LangChain is handled by a component called a retrieval chain. A retrieval chain queries the vector database, fetches the relevant information, and combines the question and the fetched information into a prompt.
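
To make the idea of a retrieval chain concrete, here is a small plain-Python sketch of the composition it performs. `ToyRetrievalChain`, `fake_retriever`, and `fake_llm` are hypothetical names invented for this example; they are not LangChain classes, and the prompt template and result keys are assumptions. The sketch also passes the prompt to a stubbed LLM so the whole flow can run end to end.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyRetrievalChain:
    """Hypothetical stand-in for a retrieval chain; not a LangChain class."""

    retriever: Callable[[str], list[str]]  # question -> relevant chunks
    llm: Callable[[str], str]              # prompt -> response text

    def build_prompt(self, question: str, chunks: list[str]) -> str:
        """Combine the question and the fetched chunks into a single prompt."""
        context = "\n".join(chunks)
        return f"Context:\n{context}\n\nQuestion: {question}"

    def invoke(self, question: str) -> dict:
        """Run retrieval, prompt construction, and generation in order."""
        chunks = self.retriever(question)
        prompt = self.build_prompt(question, chunks)
        return {"input": question, "context": chunks, "answer": self.llm(prompt)}


# Stub components so the sketch runs without a vector database or an LLM.
def fake_retriever(question: str) -> list[str]:
    return ["LangChain is an open-source framework for LLM applications."]


def fake_llm(prompt: str) -> str:
    return f"(model output for a {len(prompt)}-character prompt)"


chain = ToyRetrievalChain(retriever=fake_retriever, llm=fake_llm)
result = chain.invoke("What is LangChain?")
print(result["answer"])
```

Because the retriever and the LLM are passed in as plain callables, either one can be swapped (for example, a different vector store or model) without changing the chain itself.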