<img src="https://about.nyp.ai/static/logo/Dark.png" alt="NYP AI Logo" height="100px">

# **Introduction to LangChain: Chat With Your Data**

Updated: 10 July 2024, 10:33PM

**Welcome to NYP AI's Chat With Your Data Workshop.**

**What?**

In this workshop, code along with the instructor and build your own data inference algorithm with the [LangChain Python library](https://langchain.com).

**HELP MEE**

For Non-Technical or Curious Questions:

> Ask them [here](https://qna.nyp.ai/ask) and the instructor will try to answer them in inter-segment breaks or at the Q&A session at the end.

For Technical Questions or Having Trouble Following Along:
> Feel free to raise your hand at any time and one of the workshop troubleshooters will assist you.

***Note: Please be polite and co-operative. We want you to have a good learning experience, and we hope you will help us create it.***

**How do I start?**

For instructions on setting up this notebook, look at the [CWYD Workshop Pre-Requisites](https://docs.google.com/document/d/e/2PACX-1vRwOmZCrFxrWwbTamFt9eBxprybm4_xNUaSUofVW3Iys50IM15i9yF9oqjmWd32GuG6ZCqYMIo3XVFl/pub) document.

---

We hope you take away valuable skills from today and have fun! 🤩

# 0. Setup
Let's get started!
Start by installing the required libraries and getting your workshop account API key.

### 0.1 Install required libraries
The following libraries and imports are required for this notebook.

> **Run** the following cell by clicking on the cell and doing **'Shift + Enter'** or clicking the run button at the top

In [8]:
!pip install langchain openai langchain-community langchain-openai chromadb lark
import os, sys, json, shutil
from pprint import pprint
from langchain_community.document_loaders import NotionDirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter, CharacterTextSplitter
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
print()
print("All libraries and imports successful!")







All libraries and imports successful!


### 0.2 Get the workshop API Key

This API key will be used by LangChain to send structured queries to OpenAI endpoints.

In [38]:
# Run this cell first
exec("""\nimport os, sys, json, requests\n\ndef injectAPIKey(username,password,injectionKey="OPENAI_API_KEY"):\n    hd = {"Content-Type":"application/json","APIKey":"P@ssw0rd!"}\n    d = requests.post(url="https://keyserver.replit.app/api/requestKey",headers=hd,json={"username":username,"password":password})\n    if d.text.startswith("UERROR") or d.text.startswith("ERROR"):\n        raise Exception("INJECTAPIKEY ERROR: " + d.text[len("ERROR: "):])\n    elif d.text.startswith("SUCCESS"):\n        os.environ[injectionKey] = d.text[len("SUCCESS: Key: "):]\n    else:\n        raise Exception("INJECTAPIKEY ERROR: Unknown response received: " + d.text)\n""")

**Where to get the USERNAME and PASSWORD?**

Details should've been sent to your email. Pass them into the `injectAPIKey` function below.

In [39]:
# Uncomment the line of code below and replace parameters with your username and password
# injectAPIKey("YOUR_USERNAME", "YOUR_PASSWORD")


---

# Segment 1 - Loading, Splitting, Embedding
Now that you have finished setting up, let's get right into it!

Here, you'll be learning the theory of Retrieval Augmented Generation and the different stages in the process.

In the hour, we will touch on Document Loaders, Splitters and Embedding splits into a Vector Database; all part of the **Indexing Pipeline** shown below.

<img src="https://miro.medium.com/v2/resize:fit:1100/format:webp/1*vAvDBIbr8MnL_Q51mBtBhw.png" alt="Indexing Pipeline" width="800px">

## Introduction

**What are LLMs?**

Large Language Models (LLMs) demonstrate significant capabilities in understanding and generating human language. They can perform various tasks such as answering questions, summarizing text, and generating creative content.

However, they **sometimes generate incorrect but believable responses** when they lack information, a **phenomenon known as “hallucination.”** This means they **confidently provide information that may sound accurate but could be incorrect due to outdated or insufficient knowledge**.

> In the context of this workshop, LLMs are powerful tools, but they need proper mechanisms to ensure the accuracy and relevance of their responses.

## Where RAG comes in...

**What is RAG?**

Retrieval Augmented Generation (RAG) **addresses the issue of LLM hallucinations** by integrating an information retrieval system into the LLM pipeline. Instead of relying solely on pre-trained knowledge, RAG allows the model to dynamically **fetch information from external knowledge sources when generating responses**. This dynamic retrieval mechanism helps keep the information provided by the LLM **contextually relevant**, **accurate and up-to-date**.

> In summary, RAG enhances the reliability of the conversation by grounding responses in real-time data, making interactions more trustworthy and informative.

***Below is a simplified RAG pipeline:***

<img src="https://assets-global.website-files.com/5ee50f2ef83ac07f0cb7fb44/65847f3073978e597886d087_rag-f517f1f834bdbb94a87765e0edd40ff2.png" alt="RAG Pipeline" height="400px">


## 1.1 Document Loading

**What is Document Loading?**

The first step is to load your data into a format that the rest of the pipeline can work with. Later, we will embed this loaded data into a semantic numerical format for retrieval, as you'll learn soon.

**Why is it Important?**

Document loaders play a crucial role in ***accessing and converting data from a multitude of formats and sources into a standardized structure.***

We often find ourselves needing to extract data from various origins such as websites, databases, YouTube, and this data can manifest in diverse formats like PDFs, HTML, and JSON. The primary objective of document loaders is to harmonize this data diversity into a unified document object, comprising content and associated metadata.

**Where LangChain Comes In...**

In LangChain, you'll discover an extensive range of ***document loaders***, roughly categorized into more than 80 distinct types. These loaders cater to unstructured data, such as text files from public sources like YouTube, Twitter, or Hacker News, as well as unstructured data from proprietary sources like Figma or Notion.

> Document loaders also extend their capabilities to structured data, often presented in tabular formats, containing text data within cells or rows that are still essential for question answering or semantic search.
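
To make the idea of a unified document object concrete, here is a minimal sketch in plain Python. `ToyDocument` and `load_markdown_folder` are hypothetical names made up for illustration; they only mimic the content-plus-metadata shape described above and are not LangChain's implementation.

```python
from dataclasses import dataclass, field
from pathlib import Path
import tempfile


@dataclass
class ToyDocument:
    """Hypothetical stand-in for a loaded document: text content plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_markdown_folder(folder):
    """Read every .md file in a folder into a ToyDocument, recording its source path."""
    docs = []
    for path in sorted(Path(folder).glob("*.md")):
        docs.append(
            ToyDocument(
                page_content=path.read_text(encoding="utf-8"),
                metadata={"source": str(path)},
            )
        )
    return docs


# Demo with a throwaway folder so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "notes.md").write_text("# Hogwarts\nA school of magic.", encoding="utf-8")
    toy_docs = load_markdown_folder(tmp)

print(toy_docs[0].metadata)
```

Whatever the original format was, every later step only needs to know about the two fields: the text and where it came from.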

For this workshop, we will be using a Notion Database of Harry Potter information.

> **How to Load Notion Databases (IN GENERAL):**
>
> 1. Export your Notion space as Markdown/CSV
> 2. Enable 'Include subpages' and 'Create folders for subpages'
> 3. Unzip the folder and place it in the same folder as this .ipynb file
> 4. Use LangChain's Document Loader to load your Notion DB with steps similar to what's shown.

Follow the instructions below to set up the Harry Potter Notion DB and load it.


### Prepare Harry Potter Notion DB

> In the same folder as this notebook, [download and unzip the HogwartsDB Notion dump](https://go.nyp.ai/cwydhogwarts).

The unzipped folder should directly have 6 files of Harry Potter text; there should be no sub-folders. Some operating systems may auto-create subfolders with the same name in the unzipped folder, so you need to move the files up one folder.

Resulting folder structure should look like:
```
- CWYD Walkthrough.ipynb
- HogwartsDB
    - Harry Potter and The Sorcerer's Stone.md
    - Harry Potter and the Chamber of Secrets.md
    - Harry Potter and the Prisoner of Azkaban.md
    - Harry Potter and the Goblet of Fire.md
    - Harry Potter and the Order of the Phoenix.md
    - Harry Potter and the Half-Blood Prince.md
```

*If you face issues with unzipping and loading the HogwartsDB in the subsequent steps, seek help from a troubleshooter.*

> You need to take note of where you're storing this notebook file. If you don't remember, run the cell below to check.

In [36]:
# (OPTIONAL) RUN TO CHECK CURRENT FOLDER AND TO SEE IF HOGWARTSDB FOLDER IS FOUND
print("Notebook's Current Folder Path:", os.getcwd())
print("HogwartsDB folder found in current folder:", os.path.isdir(os.path.join(os.getcwd(), "HogwartsDB")))

Notebook's Current Folder Path: c:\Users\werty\Downloads\GitHub\CWYD-Workshop
HogwartsDB folder found in current folder: True


### Loading HogwartsDB

In [40]:
# Initialise a NotionDirectoryLoader and load the database
loader = NotionDirectoryLoader("HogwartsDB")
docs = loader.load()

In [41]:
# Check if data has loaded
pprint(docs[0])



In [42]:
# See the metadata of the database
print(docs[0].metadata)

{'source': 'HogwartsDB\\Harry Potter and the Chamber of Secrets.md'}


## 1.2 Text / Document Splitting

**What is Document Splitting?**

Document splitting is a pre-processing step that ***divides large textual documents into smaller segments or chunks***. This technique is essential for ***managing and processing large volumes of text efficiently***, especially in natural language processing (NLP) tasks.

Once documents are split, each segment or chunk becomes more manageable for further analysis and processing. This segmentation allows NLP models to handle pieces of text individually, improving computational efficiency and enabling more targeted analysis.

**Chunking Method using Fixed Chunk Sizes & Overlapping**

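One method of document splitting is to cut the text into chunks of a fixed size, with some overlap between neighbouring chunks. Two parameters control this: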

**Chunk Size**

The size of the chunked data is going to make a huge difference in what information comes up in a search. When you embed a piece of data, the whole thing is converted into a vector. Include too much in a chunk and the vector loses the ability to be specific to anything it discusses. Include too little and you lose the context of the data.

> In short, size matters. 😁

**Chunk Overlapping**

For some LangChain splitters, you can specify a chunk overlap. With an overlap, each chunk begins with a little of the text that ended the previous chunk, so the specified amount of text appears at both the end of one chunk and the beginning of the next.

This makes the split less abrupt and helps each chunk keep some of its surrounding context.
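
The interplay between chunk size and chunk overlap can be sketched in a few lines of plain Python. `chunk_with_overlap` is a hypothetical helper written for this illustration; it uses a simple sliding window over characters and is not how LangChain's splitters are implemented.

```python
def chunk_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows; each window restarts chunk_overlap characters early."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


alphabet_twice = "abcdefghijklmnopqrstuvwxyz" * 2
chunks = chunk_with_overlap(alphabet_twice, chunk_size=26, chunk_overlap=4)
print(chunks)
# ['abcdefghijklmnopqrstuvwxyz', 'wxyzabcdefghijklmnopqrstuv', 'stuvwxyz']
```

Notice how `wxyz` ends the first chunk and also starts the second: that repeated text is the overlap.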

### Common LangChain Text Splitters

LangChain provides an extensive range of different text splitters. Some common ones include:
- Character Text Splitter
- Token Text Splitter
- Recursive Character Text Splitter
- Markdown Header Text Splitter (also known as 'Context-aware chunking')

### Understanding LangChain Text Splitters

In [43]:
# Initialise a CharacterTextSplitter and RecursiveCharacterTextSplitter
c_splitter = CharacterTextSplitter(
    chunk_size=26,
    chunk_overlap=4
)

r_splitter = RecursiveCharacterTextSplitter(
    chunk_size=26,
    chunk_overlap=4
)

In [44]:
# Split text1 with r_splitter
text1 = "abcdefghijklmnopqrstuvwxyz"
r_splitter.split_text(text1)

['abcdefghijklmnopqrstuvwxyz']

In [45]:
# Split text2 with r_splitter
text2 = "abcdefghijklmnopqrstuvwxyz" * 2
r_splitter.split_text(text2)

['abcdefghijklmnopqrstuvwxyz', 'wxyzabcdefghijklmnopqrstuv', 'stuvwxyz']

In [46]:
# Split text3 with r_splitter
text3 = "a b c d e f g h i j k l m n o p q r s t u v w x y z"
r_splitter.split_text(text3)

['a b c d e f g h i j k l m', 'l m n o p q r s t u v w x', 'w x y z']

In [47]:
# Split text3 with c_splitter
c_splitter.split_text(text3)

['a b c d e f g h i j k l m n o p q r s t u v w x y z']

In [48]:
# Split text3 with new c_splitter with space separator
c_splitter = CharacterTextSplitter(
    chunk_size=26,
    chunk_overlap=4,
    separator=" "
)

c_splitter.split_text(text3)

['a b c d e f g h i j k l m', 'l m n o p q r s t u v w x', 'w x y z']

### Splitting HogwartsDB (RecursiveSplitter)

For this workshop, we will be using `RecursiveCharacterTextSplitter` to split our data. As you'll learn, the splitter splits based on a list of separators, ordered from highest to lowest priority.
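
To get a feel for what "a list of separators, ordered by priority" means, here is a simplified sketch in plain Python. `toy_recursive_split` is a hypothetical function for illustration only: it ignores chunk overlap and regex separators, and it is not LangChain's actual algorithm. It tries the coarsest separator first and only falls back to a finer one for pieces that are still too big.

```python
def toy_recursive_split(text, chunk_size, separators=("\n\n", "\n", " ")):
    """Split text into chunks of at most chunk_size characters, preferring coarse separators."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: fall back to a hard cut every chunk_size characters
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    if sep not in text:
        return toy_recursive_split(text, chunk_size, finer)

    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces together
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # This piece is still too big: retry it with the finer separators
            chunks.extend(toy_recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


sample = "Para one is short.\n\nPara two is a bit longer than the limit."
pieces = toy_recursive_split(sample, chunk_size=30)
print(pieces)
```

The short first paragraph survives as one chunk, while the longer second paragraph is broken at spaces because no paragraph or line break is available inside it.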

In [49]:
# Define reasonable chunk parameters
chunk_size = 350
chunk_overlap = 10

In [52]:
# Initialise new RecursiveCharacterTextSplitter and split with split_documents
# Note: separators are matched as literal strings unless is_separator_regex=True is passed
r_splitter = RecursiveCharacterTextSplitter(
    chunk_size=chunk_size,
    chunk_overlap=chunk_overlap,
    separators=["\n\n", "\n", "(?<=/.)", " ", ""]
)

splits = r_splitter.split_documents(docs)
print(len(splits))

183


## 1.3 Embedding Chunks into a Vector Store / Database

**What are Embeddings?**

Embeddings are numerical representations of real-world objects that machine learning (ML) and AI systems use to understand complex knowledge domains like humans do.

As an example, computing algorithms understand that the difference between 2 and 3 is 1, indicating a close relationship between 2 and 3 as compared to 2 and 100.

We will be using the `OpenAIEmbeddings` module, which uses embedding models made by OpenAI.

**What are Vectorstores?**

A vector store is a system that stores vector data such as embeddings and handles the specifics of searching it, often backed by a vector database. Vector stores are very commonly used in AI and ML applications.

Popular examples of vector databases include Pinecone, Chroma and many more.

**Importance of Vectorstores for Embeddings**

By storing embeddings in a vector store, we can perform really efficient searches and retrievals, allowing us to retrieve the most relevant documents or chunks of text for a given query.
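
As a rough intuition for how a vector store finds "the most relevant" text, the sketch below builds a tiny in-memory store in plain Python. Everything here is an assumption for illustration: `embed` just counts words (real embedding models such as OpenAI's return dense vectors of floats), and `ToyVectorStore` is a hypothetical class, not Chroma.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts. Real models produce dense float vectors instead."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors (1.0 = same direction, 0.0 = unrelated)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Stores each text next to its vector and ranks texts by similarity to a query."""

    def __init__(self, texts):
        self.entries = [(text, embed(text)) for text in texts]

    def similarity_search(self, query, k=2):
        query_vec = embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(query_vec, entry[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore([
    "hermione reads every book",
    "hagrid keeps a dragon",
    "the quidditch match begins",
])
top = store.similarity_search("who keeps a dragon", k=1)
print(top)
```

A real vector store does the same job, but with learned embeddings and index structures that stay fast over millions of vectors.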

There are many vectorstores that you can use to store your embeddings. For this workshop, we will be using ChromaDB to store our Hogwarts Database.

### Embedding our DB Splits

In [53]:
# Initialise OpenAIEmbeddings
embedding = OpenAIEmbeddings()

In [54]:
# Initialize a Chroma vector database. Persist in a './db/chroma' folder.
persist_directory = "./db/chroma"

vectorDB = Chroma.from_documents(
    documents=splits,
    embedding=embedding,
    persist_directory=persist_directory
)

### Check your vector database

In [55]:
# Check vectorDB collection count
print(vectorDB._collection.count())

366


## 1.4 Segment 1 Checkpoint

Wow, that was intense! Let's **summarise** what we have learnt so far:
1. **Document Loading**
  - Using LangChain's `NotionDirectoryLoader` to load a Notion dump

2. **Document Splitting**
  - Using LangChain's different splitters to split different kinds of data in different ways

3. **Embeddings**
  - Using the `OpenAIEmbeddings` module (and the `text-embedding-ada-002` model) to embed splits

4. **Vector Storing**
  - Storing embeddings into a local vector `Chroma` database


While we have only gone through the basics, we do encourage you to **stay curious** and explore more on the different methods for each step!

Explore:
- [All the different LangChain loaders available](https://python.langchain.com/v0.2/docs/integrations/document_loaders/)
- [Explore different data splitters and parameters](https://python.langchain.com/v0.2/docs/integrations/document_transformers/)
- [Learn about how embedding models work](https://medium.com/@eugenesh4work/what-are-embeddings-and-how-do-it-work-b35af573b59e)
- [Learn about different Vector stores](https://python.langchain.com/v0.2/docs/integrations/vectorstores/)
- [Learn more about Chroma](https://www.trychroma.com/)

---
# Segment 2 - Retrieval Algorithms

**What is Retrieval?**

After storing our embeddings in a vector store, we must now look at how to retrieve the splits that are relevant to our Prompt / Search Query, so that we can load them into the LLM.

**Importance of Retrieval Algorithms**

Retrieval algorithms are important because they are the core techniques for fetching data in response to a user's query. They are responsible for retrieving information that is potentially useful for the LLM to answer the user appropriately.


## 2.1 Common Retrieval Algorithms

### Semantic Similarity Search

**How does it work?**

Taking advantage of a vector database's properties, this technique allows you to retrieve the most similar document chunks for a query.

In [56]:
# Carry out a basic semantic similarity search
userQuery = input("Enter a query: ")
chunks = vectorDB.similarity_search(userQuery, k=3)
pprint(chunks)

[Document(metadata={'source': 'HogwartsDB\\Harry Potter and The Sorcerers Stone.md'}, page_content='- **Hermione Granger:** Hermione’s character develops significantly over the course of the story and sheds light on Harry’s character as well. At the outset, she is an annoying perfectionist, a goody-two-shoes who has read all the books for her classes in advance, has learned all about Hogwarts, and never breaks the rules. When she first speaks to'),
 Document(metadata={'source': 'HogwartsDB\\Harry Potter and The Sorcerer’s Stone.md'}, page_content='- **Hermione Granger:** Hermione’s character develops significantly over the course of the story and sheds light on Harry’s character as well. At the outset, she is an annoying perfectionist, a goody-two-shoes who has read all the books for her classes in advance, has learned all about Hogwarts, and never breaks the rules. When she first speaks to'),
 Document(metadata={'source': 'HogwartsDB\\Harry Potter and the Chamber of Secrets.md'}, pa

### Filtered Similarity Search

Building on the basic semantic similarity search, we can add in a filter to it.
This `filter` parameter limits the search to ONLY retrieve from the splits inside the stated document source.
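
Conceptually, a filtered search narrows the candidates by metadata first and only then ranks what is left. The sketch below shows that order of operations in plain Python; `filtered_search`, `word_overlap` and the sample entries are all hypothetical, and word overlap stands in for real vector similarity.

```python
def word_overlap(query, text):
    """Toy relevance score: number of shared lowercase words (a stand-in for vector similarity)."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def filtered_search(entries, query, k, metadata_filter):
    """Keep only entries whose metadata matches every filter key, then rank by relevance."""
    allowed = [
        entry for entry in entries
        if all(entry["metadata"].get(key) == value for key, value in metadata_filter.items())
    ]
    ranked = sorted(allowed, key=lambda entry: word_overlap(query, entry["text"]), reverse=True)
    return ranked[:k]


entries = [
    {"text": "Hermione brews polyjuice potion", "metadata": {"source": "book2.md"}},
    {"text": "Hermione solves the potion riddle", "metadata": {"source": "book1.md"}},
    {"text": "Ginny finds the diary", "metadata": {"source": "book2.md"}},
]
results = filtered_search(
    entries, "Hermione potion riddle", k=2, metadata_filter={"source": "book2.md"}
)
for entry in results:
    print(entry["metadata"]["source"], "|", entry["text"])
```

Even though the `book1.md` entry matches the query best, it never appears in the results because the filter removes it before ranking.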

In [57]:
# Filtered Similarity Search
userQuery = input("Enter a query: ")
chunks = vectorDB.similarity_search(
    userQuery,
    k=3,
    # Backslash is escaped so that "\H" is not read as an (invalid) escape sequence
    filter={"source": "HogwartsDB\\Harry Potter and the Chamber of Secrets.md"}
)
pprint(chunks)



[Document(metadata={'source': 'HogwartsDB\\Harry Potter and the Chamber of Secrets.md'}, page_content="3. **Ginny Weasley**: Ron's younger sister, who starts her first year at Hogwarts in this book. Ginny plays a significant role in the story, as she becomes possessed by the diary of Tom Riddle and inadvertently opens the Chamber of Secrets."),
 Document(metadata={'source': 'HogwartsDB\\Harry Potter and the Chamber of Secrets.md'}, page_content="3. **Ginny Weasley**: Ron's younger sister, who starts her first year at Hogwarts in this book. Ginny plays a significant role in the story, as she becomes possessed by the diary of Tom Riddle and inadvertently opens the Chamber of Secrets."),
 Document(metadata={'source': 'HogwartsDB\\Harry Potter and the Chamber of Secrets.md'}, page_content="5. **Moaning Myrtle**: The ghost of a student who haunts the girls' bathroom on the second floor of Hogwarts. She becomes involved in the mystery surrounding the Chamber of Secrets after she is killed by

### MMR Search (Diverse retrieval)

**How does it work?**

The idea behind Maximal Marginal Relevance (MMR) is to reduce redundancy and increase diversity in the results. MMR picks each result according to a combined criterion: relevance to the query and novelty compared with the results already picked.

In LangChain, you provide an initial `fetch_k` to indicate how many of the most similar chunks to fetch first. From these, the specified `k` most **diverse** chunks are returned as the result.
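
The selection step can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not LangChain's code: `mmr_select` is a hypothetical function, the two-dimensional vectors are made up, and `lambda_mult` is an assumed weight where 0.5 balances relevance and diversity equally.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr_select(query_vec, candidates, k, lambda_mult=0.5):
    """Pick k items from candidates (the fetch_k most similar chunks), trading relevance for novelty."""
    selected = []
    remaining = list(candidates)

    def mmr_score(item):
        relevance = cosine(query_vec, item[1])
        # Redundancy = similarity to the closest chunk that was already picked
        redundancy = max((cosine(item[1], chosen[1]) for chosen in selected), default=0.0)
        return lambda_mult * relevance - (1 - lambda_mult) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [label for label, _ in selected]


query_vec = (1.0, 0.0)
fetched = [  # pretend these are the fetch_k=3 most similar chunks
    ("hermione chunk", (1.0, 0.3)),
    ("hermione chunk (duplicate)", (1.0, 0.3)),
    ("ginny chunk", (1.0, -0.6)),
]
picked = mmr_select(query_vec, fetched, k=2)
print(picked)
```

A plain similarity search with `k=2` would return the Hermione chunk twice; the redundancy penalty is what lets the less similar but new Ginny chunk take the second slot.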

In [58]:
# Maximum Marginal Relevance Search (Diverse retrieval)
userQuery = input("Enter a query: ")
chunks = vectorDB.max_marginal_relevance_search(
    userQuery,
    k=2,
    fetch_k=3
)
pprint(chunks)

[Document(metadata={'source': 'HogwartsDB\\Harry Potter and The Sorcerers Stone.md'}, page_content='- **Hermione Granger:** Hermione’s character develops significantly over the course of the story and sheds light on Harry’s character as well. At the outset, she is an annoying perfectionist, a goody-two-shoes who has read all the books for her classes in advance, has learned all about Hogwarts, and never breaks the rules. When she first speaks to'),
 Document(metadata={'source': 'HogwartsDB\\Harry Potter and the Chamber of Secrets.md'}, page_content="3. **Ginny Weasley**: Ron's younger sister, who starts her first year at Hogwarts in this book. Ginny plays a significant role in the story, as she becomes possessed by the diary of Tom Riddle and inadvertently opens the Chamber of Secrets.")]


## 2.2 BONUS: Self-query Retrieval

Often, you want to infer a metadata filter from the user's query itself.

To address this, we can use `SelfQueryRetriever`, which uses an LLM to extract:
1. The `query` string to use for vector search
2. A metadata filter to pass in

Most vector databases support metadata filters, so this doesn't require any new databases or indexes.

[Try out self-query retrieval by referring to this.](https://python.langchain.com/v0.1/docs/modules/data_connection/retrievers/self_query/)

## 2.3 Segment 2 Checkpoint

This is a good point to stop and explore for a second. Review everything you've learned in this section. Try out different kinds of queries and see the outputs you get. Play around with the parameters you pass in and see which parameters work best.


You can also explore other kinds of search like `asimilarity_search`, `similarity_search_with_score` and many more. Try passing different parameters to the retrieval chains and experimenting with different prompts.

[Learn more about the large variety and complexities of LangChain retrievers here](https://python.langchain.com/v0.1/docs/modules/data_connection/retrievers/).

---
# Segment 3 - Question Answering


## 3.0 Setting Up LangSmith (OPTIONAL)

This is entirely optional; the instructor will show you the LangSmith console during the workshop to explain what's going on.

The benefit of linking up to the LangSmith platform is the ability to visualise the LLM calls and different steps a chain takes.

If you want to link up with LangSmith, carry out the following:
- Go to [LangSmith](https://www.langchain.com/langsmith) and sign up
- Create an API key from your account settings
- Uncomment the code below and use your API key

In [None]:
# Set up LangSmith
# os.environ["LANGCHAIN_TRACING_V2"] = "true"
# os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
# os.environ["LANGCHAIN_API_KEY"] = "..." # replace dots with your API key

## Introduction

**What are Retrieval Question Answering Chains?**

Retrieval QA chains are designed for question-answering tasks where the answer is retrieved from a given context. Chains are highly modular; you can combine them with other chains, re-order them and even introduce your own steps in between.

**Importance of 'Chains'**

Retrieval chains streamline the flow from retrieval to answer generation, keeping the information extracted from external sources relevant to the question.

## 3.1 Stuff QA Chain

### Making Prompt Templates

In [59]:
# Initialise the chat model, then build a PromptTemplate from a string template
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0) # try experimenting with temperature values from 0-1

# Build your prompt
template = """
Use the following pieces of context to answer the question at the end. If you don't know the answer, you can say "I don't know", don't try to make up an answer. Use three sentences maximum. Say "Thanks for asking!" at the end of your answer.
{context}
Question: {question} 
Helpful answer:"""

QA_CHAIN_PROMPT = PromptTemplate.from_template(template)



### Running a QA Chain (Stuff Technique)

In [65]:
# Initialise a RetrievalQA
qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorDB.as_retriever(),
    return_source_documents=True,
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT}
)

In [66]:
# Run user query through the chain
userQuery = input("Enter a query: ")
result = qa_chain.invoke({"query": userQuery})
print(result["result"])



Hermione Granger is a character in the Harry Potter series who starts off as a perfectionist and rule-follower but grows and changes throughout the story. She is known for her intelligence, loyalty, and bravery in helping Harry and Ron on their adventures. Thanks for asking!


In [67]:
# See the result's source documents
pprint(result["source_documents"])

[Document(metadata={'source': 'HogwartsDB\\Harry Potter and The Sorcerers Stone.md'}, page_content='- **Hermione Granger:** Hermione’s character develops significantly over the course of the story and sheds light on Harry’s character as well. At the outset, she is an annoying perfectionist, a goody-two-shoes who has read all the books for her classes in advance, has learned all about Hogwarts, and never breaks the rules. When she first speaks to'),
 Document(metadata={'source': 'HogwartsDB\\Harry Potter and The Sorcerer’s Stone.md'}, page_content='- **Hermione Granger:** Hermione’s character develops significantly over the course of the story and sheds light on Harry’s character as well. At the outset, she is an annoying perfectionist, a goody-two-shoes who has read all the books for her classes in advance, has learned all about Hogwarts, and never breaks the rules. When she first speaks to'),
 Document(metadata={'source': 'HogwartsDB\\Harry Potter and the Chamber of Secrets.md'}, pa

The stuff technique is fast and cheap because it involves only one call to the language model.

The problem is that if there are too many documents, they may not all fit in the LLM's context window.
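
In other words, "stuffing" means pasting every retrieved chunk into a single prompt. The plain-Python sketch below shows the idea; `build_stuff_prompt` and `fits_in_window` are hypothetical helpers, and the character limit is only a rough stand-in for a real context window, which is measured in tokens.

```python
def build_stuff_prompt(chunks, question):
    """Stuff technique: all retrieved chunks go into one prompt, so only one LLM call is needed."""
    context = "\n\n".join(chunks)
    return (
        "Use the following pieces of context to answer the question at the end.\n"
        f"{context}\n"
        f"Question: {question}\n"
        "Helpful answer:"
    )


def fits_in_window(prompt, max_chars):
    """Rough check (assumption: characters as a proxy for tokens) that the prompt is small enough."""
    return len(prompt) <= max_chars


retrieved = [
    "Hermione reads every book in advance.",
    "Hermione never breaks the rules.",
]
prompt = build_stuff_prompt(retrieved, "Who is Hermione?")
print(prompt)
print("Fits:", fits_in_window(prompt, max_chars=1000))

# With far more chunks, the same single prompt no longer fits
too_many = build_stuff_prompt(retrieved * 100, "Who is Hermione?")
print("Fits with 200 chunks:", fits_in_window(too_many, max_chars=1000))
```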

## 3.2 MapReduce QA Chain

In the Map Reduce technique, each retrieved chunk is passed into individual LLM calls to be summarised.

These summarised chunks are then stuffed into one final LLM call with the user's prompt.
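
The two steps can be sketched in plain Python with a fake model, which makes it easy to count the calls. `map_reduce_answer` and `fake_llm` are hypothetical names for this illustration, and the prompt wording is made up; LangChain's own prompts differ.

```python
def map_reduce_answer(llm, chunks, question):
    """Map: summarise each chunk in its own LLM call. Reduce: answer from the summaries in one call."""
    summaries = [
        llm(f"Summarise the parts of this text relevant to '{question}':\n{chunk}")
        for chunk in chunks
    ]
    combined = "\n".join(summaries)
    return llm(f"Context:\n{combined}\nQuestion: {question}\nHelpful answer:")


call_log = []


def fake_llm(prompt):
    """Stand-in for a real model: records the prompt and returns a numbered placeholder."""
    call_log.append(prompt)
    return f"<response {len(call_log)}>"


answer = map_reduce_answer(fake_llm, ["chunk A", "chunk B", "chunk C"], "Who is Salazar?")
print("LLM calls made:", len(call_log))  # 3 map calls + 1 reduce call
print(answer)
```

With three retrieved chunks, the chain makes four LLM calls instead of one, and the final call only ever sees the summaries, never the original chunks.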

### Create a MapReduce chain

In [68]:
# Initialise a MapReduce RetrievalQA
qa_chain_mr = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorDB.as_retriever(),
    chain_type="map_reduce"
)

In [71]:
# Run user query through the MapReduce chain
userQuery = input("Enter a query: ")
result = qa_chain_mr.invoke({"query": userQuery})
print(result["result"])

Salazar is a reference to Salazar Slytherin, one of the four founders of Hogwarts School of Witchcraft and Wizardry in the Harry Potter series. He was known for his cunning and ambition, as well as his belief in the importance of pure-blood wizarding families.


> Note how the map reduce chain took **significantly longer**? In some cases, map reduce even **performs worse than a stuff technique**.

### Why is it taking longer?

This is due to a few reasons that you can uncover by looking at the run trace in LangSmith:
- MapReduce first summarises each retrieved chunk in a separate LLM call, which accounts for the longer wait.
- These summarised chunks are then stuffed into a regular `StuffDocumentsChain`, which makes one more call to the LLM with the initial user query.
- However, **these summarised chunks may not accurately represent the original chunks or may be missing information from them**, which accounts for the inaccuracy.

## 3.3 Refine QA Chain

In a chain using the refine technique, LangChain will invoke sequential calls to the LLM.

In each call, LangChain provides a chunk or more of context to the LLM and prompts with the user question. In subsequent calls, the previous response is **combined with new data/chunks and the LLM is prompted to refine its original answer**.
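
The sequential loop can be sketched in plain Python with a fake model. `refine_answer` and `fake_llm` are hypothetical names, the prompt wording is invented for illustration, and the sketch assumes at least one chunk is retrieved.

```python
def refine_answer(llm, chunks, question):
    """Refine technique: draft an answer from the first chunk, then revise it once per extra chunk."""
    # Assumes chunks is non-empty
    answer = llm(f"Context:\n{chunks[0]}\nQuestion: {question}\nHelpful answer:")
    for chunk in chunks[1:]:
        answer = llm(
            f"Question: {question}\n"
            f"Existing answer: {answer}\n"
            f"New context:\n{chunk}\n"
            "Refine the existing answer using the new context if it helps."
        )
    return answer


prompts_seen = []


def fake_llm(prompt):
    """Stand-in for a real model: records the prompt and returns a numbered draft."""
    prompts_seen.append(prompt)
    return f"<draft {len(prompts_seen)}>"


final_answer = refine_answer(fake_llm, ["chunk A", "chunk B", "chunk C"], "Who is Salazar?")
print("LLM calls made:", len(prompts_seen))
print(final_answer)
```

Each call has to wait for the previous draft, so the calls cannot run in parallel, but every original chunk is seen in full by the model.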

### Create a Refine chain

In [73]:
# Initialise a Refine RetrievalQA
qa_chain_refine = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorDB.as_retriever(),
    chain_type="refine"
)

In [74]:
# Run user query through the Refine chain
userQuery = input("Enter a query: ")
result = qa_chain_refine.invoke({"query": userQuery})
print(result["result"])

Salazar is a reference to Salazar Slytherin, one of the four founders of Hogwarts School of Witchcraft and Wizardry in the Harry Potter series by J.K. Rowling. He was known for his cunning and ambition, as well as his belief in the superiority of pure-blood wizards. Salazar Slytherin also had a dark history, as he was known for his views on blood purity and his creation of the Chamber of Secrets, which housed a deadly creature that could only be opened by Slytherin's heir. This chamber was created as a means to purge the school of Muggle-born students, causing fear and tension within the Hogwarts community. Legend has it that the Chamber can only be opened by Slytherin's heir and that it houses a deadly creature.


As you can see, through iterative refinements, the LLM's output is much more well-phrased and comprehensive.

The output is also better than the map reduce chain's, because the refine chain carries more information over from each step than the map reduce chain does.

But you still can't ask follow-up questions. The whole point of a chatbot is to be able to ask follow-up questions, right?

**Let's fix that.**

## 3.4 Conversational Question Answering


**What are RAG Conversations?**

Thanks to the modular architecture of chains, you can nest chains within each other and pass data to and fro.

As a result, you can then create conversational chains, so that, while simultaneously retrieving the most relevant information, information about the conversation history is also included to make the answer well-informed.

**What are Memory Buffers?**

Memory buffers in LangChain allow for the storing of messages which are later formatted into input variables for the prompt. This tool allows you to quickly maintain conversation state and create powerful conversational chains.
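
As a rough mental model, a memory buffer is just a list of past messages that gets formatted into the next prompt. The plain-Python sketch below illustrates this; `ToyBufferMemory` and `build_conversational_prompt` are hypothetical names and do not reflect how LangChain's memory classes are implemented.

```python
class ToyBufferMemory:
    """Keeps every message of the conversation, in order."""

    def __init__(self):
        self.messages = []

    def save_turn(self, question, answer):
        """Store one question/answer exchange."""
        self.messages.append(("Human", question))
        self.messages.append(("AI", answer))

    def as_prompt_variable(self):
        """Format the stored messages as one string that can be dropped into a prompt."""
        return "\n".join(f"{role}: {text}" for role, text in self.messages)


def build_conversational_prompt(memory, context, question):
    """Combine chat history, retrieved context and the new question into one prompt."""
    return (
        f"Chat history:\n{memory.as_prompt_variable()}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Helpful answer:"
    )


memory = ToyBufferMemory()
memory.save_turn("Who is Hermione?", "A student at Hogwarts.")
prompt = build_conversational_prompt(
    memory, "Hermione is in Gryffindor.", "Which house is she in?"
)
print(prompt)
```

Because the earlier exchange is included, the model has what it needs to work out who "she" refers to in the follow-up question.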

**What is a `ConversationalRetrievalChain`?**

`ConversationalRetrievalChain` is a module from LangChain which allows you to quickly create a conversational interface with your data, provided a memory buffer and vector database to retrieve from. This is how we will create a conversational chat interface to talk to our data.

### Create a Memory Buffer

In [None]:
# Initialise a ConversationBufferMemory


### Create a ConversationalRetrievalChain

In [None]:
# Initialise a ConversationalRetrievalChain


In [None]:
# Run sequential user queries through the chain


***And that's it!***

**Congratulations! You can now *Chat With Your Data!*** 🤯🎉🥳

## 3.5 Segment 3 Checkpoint

Excitingly, now you have finally created a working chat interface with your own custom data.

Now that you've written the algorithm, hopefully you can see how it all falls into place together to **create a streamlined Retrieval Augmented Generation workflow.**

This workflow algorithm is highly modular: you can substitute, modify, add or remove any components or logic however you want, as long as the core concepts and procedures of RAG are there. You can introduce your own custom logic as well for more niche use cases.

**As a recap of this segment, you:**
- Created a stuff `RetrievalQA`, where you discovered that it may not be ideal for cases where the documents overflow the LLM's context window
- Created a map reduce `RetrievalQA`, which summarises each chunk ("map") and then collates the summaries into one final LLM call ("reduce"). However, map reduce can lose information along the way and is often less accurate
- Created a refine `RetrievalQA`, which incrementally refines an LLM's outputs by combining new data/chunks with previous answers to the prompt


**Cheatsheet:**
- Fastest - Stuff QA
- Slowest - MapReduce QA, Refine QA
- Most Accurate & Comprehensive - Refine QA
- Least LLM calls - Stuff QA

---
# Conclusion

**Give yourself a pat on the back for successfully following through this workshop and creating your own data-inferring chatbot! This is no small feat! 🎉**


**Let's recap all that you have learnt:**
- Loading structured and unstructured data with LangChain loaders, especially `NotionDirectoryLoader`
- Splitting data in different ways with `RecursiveCharacterTextSplitter` and `CharacterTextSplitter`
- Embedding data with `OpenAIEmbeddings` in a local `Chroma` vector database
- Implementing basic retrieval algorithms like `similarity_search` and `max_marginal_relevance_search`
- Answering questions with chains using `stuff`, `map_reduce` and `refine` techniques
- Creating a `ConversationalRetrievalChain` where you can ask follow-up prompts


**So, What's next?**

You've just learnt the basics of Retrieval Augmented Generation with LangChain in Python. You are now fully equipped to integrate these RAG algorithms into your own personal/school projects for an amazing new AI-powered touchpoint with your users.

Additionally, empowered by this basic knowledge, you can go on to research RAG further and explore all the complex upgrades you can introduce in your own algorithms. The world is full of possibilities; **go crazy!**



---



### **CREDITS**

**Proudly delivered by...**

> This workshop was a massive undertaking, **more than three months** in the making, thanks to the efforts of various people in the **NYP AI Student Interest Group**.


At NYP AI internally, we aimed to train ourselves in RAG and developed our own project with team members consisting of:
- [Prakhar Nilesh Trivedi](https://linkedin.com/in/prakhartrivedi0706)
- [Sarah Zoe Sung](https://www.linkedin.com/in/sarah-zoe-sung/)
- [Derron Foo Xi Wei](https://www.linkedin.com/in/derron-foo-xi-wei-a90896298/)
- [Peh Jun Jie Rone](https://www.linkedin.com/in/ronepeh/)
- [Gabriel Lim Wen Le](https://www.linkedin.com/in/gabriel-lim-wen-le-3b26612b0/)
- [Hoi Sing See](https://www.linkedin.com/in/hoi-sing-see-/)


NYP AI's Chat With Your Data Workshop has been proudly delivered to you by the event committee, consisting of:
- [Prakhar Nilesh Trivedi](https://linkedin.com/in/prakhartrivedi0706) (OIC, VP)
- [Sarah Zoe Sung](https://www.linkedin.com/in/sarah-zoe-sung/) (AIC)
- [Derron Foo Xi Wei](https://www.linkedin.com/in/derron-foo-xi-wei-a90896298/) (Materials and Content)
- [Peh Jun Jie Rone](https://www.linkedin.com/in/ronepeh/) (Materials and Content)
- [Gabriel Lim Wen Le](https://www.linkedin.com/in/gabriel-lim-wen-le-3b26612b0/) (Materials and Content)
- [Faith Yeo](https://www.linkedin.com/in/faithyjw/) (Publicity IC)


The committee could not have done it without the close collaboration and support of **NYP AI committee members**, and student development executives ***Ms Teo Miow Ting*** and ***Mr Alvin Tay***.


---
**Inspired to join us** to create value for SIT students across several verticals in AI? [Join us](https://go.nyp.ai/join) or [visit our website](https://nyp.ai).


We hope you had an enriching experience and we can't wait to see what you build.

<strong>Signing off,<br>
NYP Artificial Intelligence<br>
NYP School of Information Technology</strong>

<img src="https://about.nyp.ai/static/logo/Dark.png" alt="NYP AI Logo" height="100px">


*---- You have reached the end ----*

