# Natural Language Processing

# Retrieval-Augmented Generation (RAG)

RAG is a technique for augmenting LLM knowledge with additional, often private or real-time, data.

LLMs can reason about wide-ranging topics, but their knowledge is limited to the public data they were trained on, up to a specific cutoff date. To build AI applications that can reason about private data or data introduced after a model’s cutoff date, you need to augment the model's knowledge with the specific information it needs.

<img src="https://github.com/MyaMjechal/nlp-a6-lets-talk-with-yourself-rag-chatbot/blob/main/images/RAG-process.png?raw=1" >

Introducing `ChakyBot`, an innovative chatbot designed to assist Chaky (the instructor) and TA (Gun) in explaining the lesson of the NLP course to students. Leveraging LangChain technology, ChakyBot excels in retrieving information from documents, ensuring a seamless and efficient learning experience for students engaging with the NLP curriculum.

1. Prompt
2. Retrieval
3. Memory
4. Chain

In [1]:
#langchain library
!pip install langchain==0.0.350
!pip install langchain-community==0.0.4
#LLM
!pip install accelerate==0.25.0
!pip install transformers==4.36.2
!pip install bitsandbytes==0.41.2
#Text Embedding
!pip install sentence-transformers==2.2.2
!pip install InstructorEmbedding==1.0.1
#vectorstore
!pip install pymupdf==1.23.8
!pip install faiss-gpu==1.7.2
!pip install faiss-cpu==1.7.4
# huggingface_hub
!pip install -U huggingface-hub==0.20.0

ERROR: Could not find a version that satisfies the requirement faiss-gpu==1.7.2 (from versions: none)
ERROR: No matching distribution found for faiss-gpu==1.7.2


In [2]:
# #langchain library
# !pip install langchain-core
# !pip install langchain-community
# !pip install langchain-huggingface
# !pip install langchain
# #LLM
# !pip install accelerate
# !pip install transformers
# !pip install bitsandbytes
# #Text Embedding
# !pip install sentence-transformers
# !pip install InstructorEmbedding
# #vectorstore
# !pip install pymupdf
# !pip install faiss-cpu

In [36]:
import os
import torch
# Set GPU device
# os.environ["CUDA_VISIBLE_DEVICES"] = "1"

# os.environ['http_proxy']  = 'http://192.41.170.23:3128'
# os.environ['https_proxy'] = 'http://192.41.170.23:3128'

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device

device(type='cuda')

## 1. Prompt

A prompt is a set of instructions or input provided by a user to guide the model's response. It helps the model understand the context and generate relevant, coherent output, such as answering questions, completing sentences, or engaging in a conversation.

In [28]:
from langchain import PromptTemplate

prompt_template = """
    I'm your friendly NLP chatbot named MJBot, here to to answer questions about Mya Mjechal myself based on my knowledge from my CV and personal data.
    The current year is 2025, and all answers should reflect this year unless otherwise specified.
    Whether you're curious about my education, work experience, or personal interests,
    I’ll provide accurate and gentle responses using the information I have.
    If I don't know something, I'll let you know kindly. Just let me know what you're wondering about, and I'll do my best to guide you through it!
    {context}
    Question: {question}
    Answer:
    """.strip()

PROMPT = PromptTemplate.from_template(
    template = prompt_template
)

PROMPT
#using str.format
#The placeholder is defined using curly brackets: {} {}

PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="I'm your friendly NLP chatbot named MJBot, here to to answer questions about Mya Mjechal myself based on my knowledge from my CV and personal data.\n    The current year is 2025, and all answers should reflect this year unless otherwise specified.\n    Whether you're curious about my education, work experience, or personal interests,\n    I’ll provide accurate and gentle responses using the information I have.\n    If I don't know something, I'll let you know kindly. Just let me know what you're wondering about, and I'll do my best to guide you through it!\n    {context}\n    Question: {question}\n    Answer:")

In [29]:
PROMPT.format(
    context = "My CV states that I graduated with a Bachelor’s degree in Computer Science from University of Information Technology.",
    question = "What is your highest level of education?"
)

"I'm your friendly NLP chatbot named MJBot, here to to answer questions about Mya Mjechal myself based on my knowledge from my CV and personal data.\n    The current year is 2025, and all answers should reflect this year unless otherwise specified.\n    Whether you're curious about my education, work experience, or personal interests,\n    I’ll provide accurate and gentle responses using the information I have.\n    If I don't know something, I'll let you know kindly. Just let me know what you're wondering about, and I'll do my best to guide you through it!\n    My CV states that I graduated with a Bachelor’s degree in Computer Science from University of Information Technology.\n    Question: What is your highest level of education?\n    Answer:"
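As the comment in the cell above notes, the placeholders follow `str.format` conventions. The following is a minimal sketch in plain Python of that substitution step, not LangChain's implementation; `fill_prompt` and the shortened template are hypothetical and exist only for illustration.

```python
# Shortened, hypothetical template with the same two placeholders as PROMPT.
template = "Use the context to answer.\n{context}\nQuestion: {question}\nAnswer:"


def fill_prompt(template: str, **values: str) -> str:
    """Substitute the named placeholders using str.format."""
    return template.format(**values)


filled = fill_prompt(
    template,
    context="I graduated with a Bachelor's degree in Computer Science.",
    question="What is your highest level of education?",
)
print(filled)
```

Leaving out a placeholder value raises a `KeyError` in this sketch, which is a useful reminder that every variable named in the template must be supplied.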

Note: [How to improve prompting (zero-shot, few-shot, chain-of-thought, etc.)](https://github.com/chaklam-silpasuwanchai/Natural-Language-Processing/blob/main/Code/05%20-%20RAG/advance/cot-tot-prompting.ipynb)

## 2. Retrieval

1. `Document loaders` : Load documents from many different sources (HTML, PDF, code).
2. `Document transformers` : One of the essential steps in document retrieval is breaking down a large document into smaller, relevant chunks to enhance the retrieval process.
3. `Text embedding models` : Embeddings capture the semantic meaning of the text, allowing you to quickly and efficiently find other pieces of text that are similar.
4. `Vector stores` : Databases that support efficient storage and search of these embeddings.
5. `Retrievers` : Once the data is in the database, you still need to retrieve it.

### 2.1 Document Loaders
Use document loaders to load data from a source as `Document` objects. A `Document` is a piece of text with associated metadata. For example, there are document loaders for loading a simple .txt file, for loading the text contents of any web page, or even for loading a transcript of a YouTube video.

[PDF Loader](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf)

[Download Document](https://web.stanford.edu/~jurafsky/slp3/)

In [40]:
# from langchain.document_loaders import PyMuPDFLoader

# nlp_docs = '../docs/pdf/SpeechandLanguageProcessing_3rd_07jan2023.pdf'

# loader = PyMuPDFLoader(nlp_docs)
# documents = loader.load()
from langchain.document_loaders import TextLoader

cv_file = 'data/MyaMjechal-CV.txt'
loader = TextLoader(cv_file)
documents = loader.load()




In [7]:
# documents

In [41]:
len(documents)

1

In [42]:
documents[0]

Document(metadata={'source': 'data/MyaMjechal-CV.txt'}, page_content='Mya Mjechal\nFull Stack Developer\n\nPathum Thani, Thailand\nmyamjechal.mj@gmail.com\n\nBirthday: September 16, 1999\n\nI am a skilled developer with expertise in developing chatbots and dashboards, with additional skills in Natural Language Understanding (NLU) and Natural Language Processing (NLP) at EduTech Social Enterprise. I have experience as a freelancer, developing Odoo apps for local businesses, and a background in Software Development Management at Bliss Stock Co., Ltd. I am knowledgeable in Cluster Computing, Cloud Computing, Virtualization, Big Data Analysis, Networking, Web Programming, Blockchain, and IoT. I participated in the Huawei Cloud & AI contest, gaining experience in Data Mining, Image Processing, Data Prediction, and Cloud Usage. I implemented technologies in class projects, including OpenStack, Hyper-V Failover Clustering, Student Online Result System, and a Restaurant Guide Website. I have s

### 2.2 Document Transformers

`RecursiveCharacterTextSplitter` is the recommended splitter for generic text. It is parameterized by a list of separator characters and tries to split on them in order until the chunks are small enough.
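The "try each separator in order" idea can be sketched in a few lines of plain Python. This is a simplified illustration, not the LangChain implementation: `recursive_split` is a hypothetical helper, the separator list is an assumption, and chunk overlap is deliberately left out to keep the logic readable.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split on the first separator; recurse with finer separators on oversized pieces."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate          # keep merging small pieces
            continue
        if current:
            chunks.append(current)       # flush what has been merged so far
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too large: retry with the next, finer separator.
            chunks.extend(recursive_split(piece, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


sample = "aaa bbb ccc\n\nddd eee"
chunks = recursive_split(sample, chunk_size=7)
print(chunks)
```

Paragraph breaks are tried first, so text is only cut mid-sentence or mid-word when no coarser boundary fits within `chunk_size`.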

In [43]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 700,
    chunk_overlap = 100
)

doc = text_splitter.split_documents(documents)

In [44]:
doc[1]

Document(metadata={'source': 'data/MyaMjechal-CV.txt'}, page_content='I am a skilled developer with expertise in developing chatbots and dashboards, with additional skills in Natural Language Understanding (NLU) and Natural Language Processing (NLP) at EduTech Social Enterprise. I have experience as a freelancer, developing Odoo apps for local businesses, and a background in Software Development Management at Bliss Stock Co., Ltd. I am knowledgeable in Cluster Computing, Cloud Computing, Virtualization, Big Data Analysis, Networking, Web Programming, Blockchain, and IoT. I participated in the Huawei Cloud & AI contest, gaining experience in Data Mining, Image Processing, Data Prediction, and Cloud Usage. I implemented technologies in class projects, including')

In [45]:
len(doc)

10

### 2.3 Text Embedding Models
Embeddings create a vector representation of a piece of text. This is useful because it means we can think about text in the vector space, and do things like semantic search where we look for pieces of text that are most similar in the vector space.

*Note* Instructor Model : [Hugging Face](https://huggingface.co/hkunlp/instructor-base) | [Paper](https://arxiv.org/abs/2212.09741)
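To make "most similar in the vector space" concrete, here is a small sketch of cosine similarity, a common similarity measure for embeddings. The three-dimensional vectors are hand-made stand-ins (an assumption for illustration); real embedding models produce vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy vectors standing in for real embeddings (assumption).
toy_embeddings = {
    "I studied computer science": [0.9, 0.1, 0.0],
    "My degree is in computing": [0.8, 0.2, 0.1],
    "I enjoy singing": [0.0, 0.2, 0.9],
}
query_vector = [0.85, 0.15, 0.05]  # pretend embedding of a question about education

# Rank the texts from most to least similar to the query.
ranked = sorted(
    toy_embeddings,
    key=lambda text: cosine_similarity(query_vector, toy_embeddings[text]),
    reverse=True,
)
print(ranked)
```

The two education-related sentences point in nearly the same direction as the query, so they rank ahead of the unrelated sentence about singing.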

In [37]:
import torch
# from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain_huggingface import HuggingFaceEmbeddings

# model_name = 'hkunlp/instructor-base'

# embedding_model = HuggingFaceInstructEmbeddings(
#     model_name = model_name,
#     model_kwargs = {"device" : device}
# )

model_name = "all-MiniLM-L6-v2"
embedding_model = HuggingFaceEmbeddings(
    model_name = model_name,
    model_kwargs = {"device" : device}
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


### 2.4 Vector Stores

One of the most common ways to store and search over unstructured data is to embed it and store the resulting embedding vectors, and then at query time to embed the unstructured query and retrieve the embedding vectors that are 'most similar' to the embedded query. A vector store takes care of storing embedded data and performing vector search for you.
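The store-then-search cycle described above can be sketched with a toy in-memory class. Everything here is hypothetical and for illustration only: `TinyVectorStore` is not a LangChain class, and `toy_embed` uses word counts instead of a learned embedding model. FAISS does the same job with real embeddings and far more efficient indexing.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical embedding: a bag of lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    def __init__(self):
        self._entries = []  # list of (text, vector) pairs

    def add_texts(self, texts):
        # Embed each text once at insertion time and keep the vector.
        for text in texts:
            self._entries.append((text, toy_embed(text)))

    def similarity_search(self, query, k=2):
        # Embed the query, then return the k stored texts closest to it.
        query_vector = toy_embed(query)
        scored = sorted(self._entries, key=lambda e: cosine(query_vector, e[1]), reverse=True)
        return [text for text, _ in scored[:k]]


store = TinyVectorStore()
store.add_texts([
    "age 25 years old",
    "hobbies reading and singing",
    "work experience over 4 years",
])
results = store.similarity_search("how many years of work experience", k=1)
print(results)
```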

In [38]:
#locate vectorstore
vector_path = '../vector-store'
if not os.path.exists(vector_path):
    os.makedirs(vector_path)
    print('create path done')

In [46]:
#save vector locally
from langchain.vectorstores import FAISS

vectordb = FAISS.from_documents(
    documents = doc,
    embedding = embedding_model
)

db_file_name = 'myamjechal_cv'

vectordb.save_local(
    folder_path = os.path.join(vector_path, db_file_name),
    index_name = 'cv' #default index
)

### 2.5 Retrievers
A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store. A retriever does not need to be able to store documents, only to return (or retrieve) them. Vector stores can be used as the backbone of a retriever, but there are other types of retrievers as well.
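To illustrate that a retriever only has to return documents, the sketch below defines a retriever with no vector store behind it. `KeywordRetriever` is a hypothetical class written for this example; only the method name `get_relevant_documents` mirrors the call used later in this notebook.

```python
from typing import List, Protocol


class Retriever(Protocol):
    """Anything with this method can act as a retriever."""

    def get_relevant_documents(self, query: str) -> List[str]: ...


class KeywordRetriever:
    """Hypothetical retriever that ranks documents by words shared with the query."""

    def __init__(self, documents: List[str], k: int = 2):
        self.documents = documents
        self.k = k

    def get_relevant_documents(self, query: str) -> List[str]:
        query_words = set(query.lower().split())
        scored = [
            (len(query_words & set(doc.lower().split())), doc)
            for doc in self.documents
        ]
        # Keep only documents that share at least one word with the query.
        matches = [pair for pair in scored if pair[0] > 0]
        matches.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in matches[: self.k]]


keyword_retriever: Retriever = KeywordRetriever([
    "birthday september 16 1999",
    "hobbies reading and singing",
    "languages burmese and english",
])
found = keyword_retriever.get_relevant_documents("which languages do you speak")
print(found)
```

Because the rest of a pipeline depends only on the method, a keyword-based retriever like this one and a vector-store-backed retriever are interchangeable from the caller's point of view.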

In [47]:
#calling vector from local
vector_path = '../vector-store'
db_file_name = 'myamjechal_cv'

from langchain.vectorstores import FAISS

vectordb = FAISS.load_local(
    folder_path = os.path.join(vector_path, db_file_name),
    embeddings = embedding_model,
    index_name = 'cv' #default index
)

In [48]:
#ready to use
retriever = vectordb.as_retriever()

In [19]:
retriever.get_relevant_documents("How old are you?")

[Document(page_content='ADDITIONAL INFORMATION\n\nAge: 25 years old (as of 2025)\n\nHighest Level of Education: Master of Science (Expected May 2025)\n\nMajor: Data Science and Artificial Intelligence\n\nWork Experience: Over 4 years in software development, full-stack development, and technical support\n\nIndustries: EduTech, IT Services, Software Development, Cloud Computing, AI/ML, and WordPress development\n\nCurrent Role: Freelance Technical Support at intERLab, focusing on WordPress migration, troubleshooting, and system configurations\n\nCore Beliefs on Technology: Technology should enhance accessibility, education, and opportunities while being ethically and culturally responsible.', metadata={'source': 'data/MyaMjechal-CV.txt'}),
 Document(page_content='Mya Mjechal\nFull Stack Developer\n\nPathum Thani, Thailand\nmyamjechal.mj@gmail.com\n\nBirthday: September 16, 1999', metadata={'source': 'data/MyaMjechal-CV.txt'}),
 Document(page_content='I am a skilled developer with expert

In [20]:
retriever.get_relevant_documents("How many years of work experience do you have?")

[Document(page_content='ADDITIONAL INFORMATION\n\nAge: 25 years old (as of 2025)\n\nHighest Level of Education: Master of Science (Expected May 2025)\n\nMajor: Data Science and Artificial Intelligence\n\nWork Experience: Over 4 years in software development, full-stack development, and technical support\n\nIndustries: EduTech, IT Services, Software Development, Cloud Computing, AI/ML, and WordPress development\n\nCurrent Role: Freelance Technical Support at intERLab, focusing on WordPress migration, troubleshooting, and system configurations\n\nCore Beliefs on Technology: Technology should enhance accessibility, education, and opportunities while being ethically and culturally responsible.', metadata={'source': 'data/MyaMjechal-CV.txt'}),
 Document(page_content='I am a skilled developer with expertise in developing chatbots and dashboards, with additional skills in Natural Language Understanding (NLU) and Natural Language Processing (NLP) at EduTech Social Enterprise. I have experience

## 3. Memory

One of the core utility classes underpinning most (if not all) memory modules is the ChatMessageHistory class. This is a super lightweight wrapper that provides convenience methods for saving HumanMessages, AIMessages, and then fetching them all.

You may want to use this class directly if you are managing memory outside of a chain.


In [21]:
import langchain, langchain_core, pydantic

print("LangChain:", langchain.__version__)
print("LangChain Core:", langchain_core.__version__)
print("Pydantic:", pydantic.__version__)

LangChain: 0.0.350
LangChain Core: 0.1.23
Pydantic: 1.10.13


In [22]:
from langchain.memory import ChatMessageHistory

# Create a ChatMessageHistory
history = ChatMessageHistory()
history

ChatMessageHistory(messages=[])

In [23]:
history.add_user_message('hi')
history.add_ai_message('Whats up?')
history.add_user_message('How are you')
history.add_ai_message('I\'m quite good. How about you?')

In [24]:
history

ChatMessageHistory(messages=[HumanMessage(content='hi'), AIMessage(content='Whats up?'), HumanMessage(content='How are you'), AIMessage(content="I'm quite good. How about you?")])

### 3.1 Memory types

There are many different types of memory. Each has its own parameters and return types, and each is useful in different scenarios.
- Conversation Buffer
- Conversation Buffer Window

What variables get returned from memory

Before going into the chain, various variables are read from memory. These have specific names which need to align with the variables the chain expects. You can see what these variables are by calling memory.load_memory_variables({}). Note that the empty dictionary that we pass in is just a placeholder for real variables. If the memory type you are using is dependent upon the input variables, you may need to pass some in.

With the default settings used in the examples below, `load_memory_variables` returns a single key, `history`. This means that your chain (and likely your prompt) should expect an input named `history`. You can usually control this variable through parameters on the memory class. For example, to have the memory variables returned under the key `chat_history`, pass `memory_key = "chat_history"` when constructing the memory, as the `ConversationBufferWindowMemory` in Section 4 does.

#### Conversation Buffer
This memory stores messages and then exposes them in a variable.

In [25]:
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
memory.save_context({'input':'hi'}, {'output':'What\'s up?'})
memory.save_context({"input":'How are you?'},{'output': 'I\'m quite good. How about you?'})
memory.load_memory_variables({})

{'history': "Human: hi\nAI: What's up?\nHuman: How are you?\nAI: I'm quite good. How about you?"}

In [26]:
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(return_messages = True)
memory.save_context({'input':'hi'}, {'output':'What\'s up?'})
memory.save_context({"input":'How are you?'},{'output': 'I\'m quite good. How about you?'})
memory.load_memory_variables({})

{'history': [HumanMessage(content='hi'),
  AIMessage(content="What's up?"),
  HumanMessage(content='How are you?'),
  AIMessage(content="I'm quite good. How about you?")]}

#### Conversation Buffer Window
- it keeps a list of the interactions of the conversation over time.
- it only uses the last K interactions.
- it can be useful for keeping a sliding window of the most recent interactions, so the buffer does not get too large.
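The sliding-window behaviour can be sketched with a bounded `deque` from the standard library. `WindowMemory` is a hypothetical class for illustration, not LangChain's `ConversationBufferWindowMemory`; the `Human:`/`AI:` formatting simply imitates the output shown in this notebook.

```python
from collections import deque


class WindowMemory:
    """Keep only the last k (human, ai) exchanges."""

    def __init__(self, k):
        # A deque with maxlen drops the oldest exchange automatically.
        self._turns = deque(maxlen=k)

    def save_context(self, user_input, ai_output):
        self._turns.append((user_input, ai_output))

    def load_history(self):
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self._turns)


window = WindowMemory(k=1)
window.save_context("hi", "What's up?")
window.save_context("How are you?", "I'm quite good. How about you?")
print(window.load_history())
```

With `k=1`, the first exchange is discarded as soon as the second one is saved, so only the latest exchange remains.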

In [27]:
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=1)
memory.save_context({'input':'hi'}, {'output':'What\'s up?'})
memory.save_context({"input":'How are you?'},{'output': 'I\'m quite good. How about you?'})
memory.load_memory_variables({})

{'history': "Human: How are you?\nAI: I'm quite good. How about you?"}

## 4. Chain

Using an LLM in isolation is fine for simple applications, but more complex applications require chaining LLMs - either with each other or with other components.

An `LLMChain` is a simple chain that adds some functionality around language models.
- it consists of a `PromptTemplate` and a language model (either an LLM or a chat model),
- it formats the prompt template using the input key values provided (and also memory key values, if available),
- it passes the formatted string to the language model and returns the model's output.
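The three steps above can be sketched without any model at all. `SimpleChain` and `fake_llm` are hypothetical stand-ins written for this example, not LangChain APIs; the returned dictionary imitates the shape of the `LLMChain` output shown later in this notebook (the inputs plus a `text` key).

```python
from typing import Callable, Dict


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption): reports the prompt length."""
    return f"(model saw {len(prompt)} characters)"


class SimpleChain:
    def __init__(self, template: str, llm: Callable[[str], str]):
        self.template = template
        self.llm = llm

    def run(self, inputs: Dict[str, str]) -> Dict[str, str]:
        prompt = self.template.format(**inputs)  # 1. format the prompt template
        text = self.llm(prompt)                  # 2. pass the string to the model
        return {**inputs, "text": text}          # 3. return inputs plus model output


simple_chain = SimpleChain("Question: {question}\nAnswer:", fake_llm)
result = simple_chain.run({"question": "Who are you?"})
print(result)
```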

Note : [Download Fastchat Model Here](https://huggingface.co/lmsys/fastchat-t5-3b-v1.0)

In [28]:
# %cd ./models
# !git clone https://huggingface.co/lmsys/fastchat-t5-3b-v1.0

In [10]:
from huggingface_hub import login

login()


In [30]:
# from transformers import AutoTokenizer, pipeline, AutoModelForSeq2SeqLM
# from transformers import BitsAndBytesConfig
# from langchain import HuggingFacePipeline
# import torch

# # model_id = '../models/fastchat-t5-3b-v1.0/'
# model_id = 'google/gemma-2-9b'

# tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# tokenizer.pad_token_id = tokenizer.eos_token_id

# bitsandbyte_config = BitsAndBytesConfig(
#     load_in_4bit = True,
#     bnb_4bit_quant_type = "nf4",
#     bnb_4bit_compute_dtype = torch.float16,
#     bnb_4bit_use_double_quant = True
# )

# model = AutoModelForSeq2SeqLM.from_pretrained(
#     model_id,
#     quantization_config = bitsandbyte_config, #caution Nvidia
#     device_map = 'auto',
#     load_in_8bit = True
# )

# pipe = pipeline(
#     task="text-generation",
#     model=model,
#     tokenizer=tokenizer,
#     max_new_tokens = 256,
#     model_kwargs = {
#         "temperature" : 0,
#         "repetition_penalty": 1.5
#     }
# )

# llm = HuggingFacePipeline(pipeline = pipe)

In [21]:
import getpass
import os

if "GROQ_API_KEY" not in os.environ:
    os.environ["GROQ_API_KEY"] = getpass.getpass("Enter your Groq API key: ")

In [32]:
!pip install --upgrade langchain langchain-core langchain-groq pydantic

Collecting langchain
  Using cached langchain-0.3.20-py3-none-any.whl.metadata (7.7 kB)
Collecting langchain-core
  Using cached langchain_core-0.3.45-py3-none-any.whl.metadata (5.9 kB)
Collecting pydantic
  Downloading pydantic-2.10.6-py3-none-any.whl.metadata (30 kB)
Collecting langsmith<0.4,>=0.1.17 (from langchain)
  Using cached langsmith-0.3.15-py3-none-any.whl.metadata (14 kB)
Using cached langchain-0.3.20-py3-none-any.whl (1.0 MB)
Using cached langchain_core-0.3.45-py3-none-any.whl (415 kB)
Downloading pydantic-2.10.6-py3-none-any.whl (431 kB)
Using cached langsmith-0.3.15-py3-none-any.whl (343 kB)
Installing collected packages: pydantic, langsmith, langchain-core, langchain
  Attempting uninstall: pydantic
    Found existing installation: pydantic 1.10.13
    Uninstalling pydantic-1.10.13:
      Successfully uninstalled pydantic-1.10.13
  Attempting uni

In [1]:
#langchain library
!pip install langchain-core
!pip install langchain-community
!pip install langchain-huggingface
!pip install langchain
#LLM
!pip install accelerate
!pip install transformers
!pip install bitsandbytes
#Text Embedding
!pip install sentence-transformers
!pip install InstructorEmbedding
#vectorstore
!pip install pymupdf
!pip install faiss-cpu

Collecting langchain-core<0.2,>=0.1 (from langchain-community)
  Using cached langchain_core-0.1.53-py3-none-any.whl.metadata (5.9 kB)
Collecting langsmith<0.1.0,>=0.0.63 (from langchain-community)
  Using cached langsmith-0.0.92-py3-none-any.whl.metadata (9.9 kB)
INFO: pip is looking at multiple versions of langchain-core to determine which version is compatible with other requirements. This could take a while.
Collecting langchain-core<0.2,>=0.1 (from langchain-community)
  Using cached langchain_core-0.1.52-py3-none-any.whl.metadata (5.9 kB)
  Using cached langchain_core-0.1.51-py3-none-any.whl.metadata (5.9 kB)
  Using cached langchain_core-0.1.50-py3-none-any.whl.metadata (5.9 kB)
  Using cached langchain_core-0.1.49-py3-none-any.whl.metadata (5.9 kB)
  Using cached langchain_core-0.1.48-py3-none-any.whl.metadata (5.9 kB)
  Using cached langchain_core-0.1.47-py3-none-any.whl.metadata (5.9 kB)
  Using cached langchain_core-0.1.46-py3-none-any.whl.metadata (5.9 kB)
INFO: pip is stil

In [22]:
from langchain_groq import ChatGroq

llm = ChatGroq(
    model="gemma2-9b-it",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    # other parameters as needed
)

print("gemma2-9b-it model integrated successfully!")

gemma2-9b-it model integrated successfully!


### [Class ConversationalRetrievalChain](https://api.python.langchain.com/en/latest/_modules/langchain/chains/conversational_retrieval/base.html#ConversationalRetrievalChain)

- `retriever` : Retriever to use to fetch documents.

- `combine_docs_chain` : The chain used to combine any retrieved documents.

- `question_generator`: The chain used to generate a new question for the sake of retrieval. This chain will take in the current question (with variable question) and any chat history (with variable chat_history) and will produce a new standalone question to be used later on.

- `return_source_documents` : Return the retrieved source documents as part of the final result.

- `get_chat_history` : An optional function to get a string of the chat history. If None is provided, will use a default.

- `return_generated_question` : Return the generated question as part of the final result.

- `response_if_no_docs_found` : If specified, the chain will return a fixed response if no docs are found for the question.
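How these pieces fit together can be sketched as a plain function pipeline. All names below are hypothetical stubs, not the LangChain implementation: `condense_question` stands in for `question_generator`, `retrieve` for `retriever`, and `combine_docs` for a "stuff"-style `combine_docs_chain` that concatenates the retrieved documents into one prompt.

```python
def condense_question(chat_history, question):
    """Stub for question_generator: a real chain asks the LLM to rewrite the question."""
    if not chat_history:
        return question
    return f"{question} (follow-up to: {chat_history[-1][0]})"


def retrieve(question, documents, k=2):
    """Stub retriever: rank documents by words shared with the question."""
    words = set(question.lower().split())
    ranked = sorted(documents, key=lambda d: len(words & set(d.lower().split())), reverse=True)
    return ranked[:k]


def combine_docs(docs, question):
    """Stub 'stuff' step: put all retrieved documents into a single prompt."""
    context = "\n\n".join(docs)
    return f"{context}\nQuestion: {question}\nAnswer:"


def conversational_retrieval(question, chat_history, documents, llm):
    standalone = condense_question(chat_history, question)    # question_generator
    source_documents = retrieve(standalone, documents)        # retriever
    answer = llm(combine_docs(source_documents, standalone))  # combine_docs_chain
    chat_history.append((question, answer))                   # memory update
    return {"question": question, "answer": answer, "source_documents": source_documents}


def first_line_llm(prompt):
    """Fake model (assumption): answers with the first line of the context."""
    return prompt.split("\n")[0]


history = []
docs = ["age 25 years old", "hobbies reading and singing"]
out = conversational_retrieval("how old are you", history, docs, first_line_llm)
print(out["answer"])
```

Returning `source_documents` alongside the answer corresponds to setting `return_source_documents` in the real chain, which makes it possible to check what the answer was based on.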


`question_generator`

In [23]:
from langchain.chains import LLMChain
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT
from langchain.memory import ConversationBufferWindowMemory
from langchain.chains.question_answering import load_qa_chain
from langchain.chains import ConversationalRetrievalChain

In [24]:
CONDENSE_QUESTION_PROMPT

PromptTemplate(input_variables=['chat_history', 'question'], input_types={}, partial_variables={}, template='Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.\n\nChat History:\n{chat_history}\nFollow Up Input: {question}\nStandalone question:')

In [25]:
question_generator = LLMChain(
    llm = llm,
    prompt = CONDENSE_QUESTION_PROMPT,
    verbose = True
)

In [26]:
query = 'Tell me about yourself'
chat_history = "Human:What is your name\nAI:\nHuman:What is your highest level of education\nAI:"

question_generator({'chat_history' : chat_history, "question" : query})



> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
Human:What is your name
AI:
Human:What is your highest level of education
AI:
Follow Up Input: Tell me about yourself
Standalone question:

> Finished chain.


{'chat_history': 'Human:What is your name\nAI:\nHuman:What is your highest level of education\nAI:',
 'question': 'Tell me about yourself',
 'text': 'Tell me about yourself. \n'}

`combine_docs_chain`

In [30]:
doc_chain = load_qa_chain(
    llm = llm,
    chain_type = 'stuff',
    prompt = PROMPT,
    verbose = True
)
doc_chain

stuff: https://python.langchain.com/docs/versions/migrating_chains/stuff_docs_chain
map_reduce: https://python.langchain.com/docs/versions/migrating_chains/map_reduce_chain
refine: https://python.langchain.com/docs/versions/migrating_chains/refine_chain
map_rerank: https://python.langchain.com/docs/versions/migrating_chains/map_rerank_docs_chain

See also guides on retrieval and question-answering here: https://python.langchain.com/docs/how_to/#qa-with-rag


StuffDocumentsChain(verbose=True, llm_chain=LLMChain(verbose=True, prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="I'm your friendly NLP chatbot named MJBot, here to to answer questions about Mya Mjechal myself based on my knowledge from my CV and personal data.\n    The current year is 2025, and all answers should reflect this year unless otherwise specified.\n    Whether you're curious about my education, work experience, or personal interests,\n    I’ll provide accurate and gentle responses using the information I have.\n    If I don't know something, I'll let you know kindly. Just let me know what you're wondering about, and I'll do my best to guide you through it!\n    {context}\n    Question: {question}\n    Answer:"), llm=ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x79799b9f7910>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x79799b351f90>, model_name='gemma2-9

In [49]:
query = "How old are you?"
input_document = retriever.get_relevant_documents(query)

doc_chain({'input_documents':input_document, 'question':query})





> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
I'm your friendly NLP chatbot named MJBot, here to to answer questions about Mya Mjechal myself based on my knowledge from my CV and personal data.
    The current year is 2025, and all answers should reflect this year unless otherwise specified.
    Whether you're curious about my education, work experience, or personal interests,
    I’ll provide accurate and gentle responses using the information I have.
    If I don't know something, I'll let you know kindly. Just let me know what you're wondering about, and I'll do my best to guide you through it!
    ADDITIONAL INFORMATION

Age: 25 years old (as of 2025)

Highest Level of Education: Master of Science (Expected May 2025)

Major: Data Science and Artificial Intelligence

Work Experience: Over 4 years in software development, full-stack development, and technical support

Industries: EduTech, IT Serv

{'input_documents': [Document(metadata={'source': 'data/MyaMjechal-CV.txt'}, page_content='ADDITIONAL INFORMATION\n\nAge: 25 years old (as of 2025)\n\nHighest Level of Education: Master of Science (Expected May 2025)\n\nMajor: Data Science and Artificial Intelligence\n\nWork Experience: Over 4 years in software development, full-stack development, and technical support\n\nIndustries: EduTech, IT Services, Software Development, Cloud Computing, AI/ML, and WordPress development\n\nCurrent Role: Freelance Technical Support at intERLab, focusing on WordPress migration, troubleshooting, and system configurations\n\nCore Beliefs on Technology: Technology should enhance accessibility, education, and opportunities while being ethically and culturally responsible.'),
  Document(metadata={'source': 'data/MyaMjechal-CV.txt'}, page_content='June 2022 – December 2023\nComputer Programmer | EduTech Social Enterprise, Yangon\nI built a chatbot and LMS dashboard, provided IT support, and shared knowle

In [50]:
memory = ConversationBufferWindowMemory(
    k=3,
    memory_key = "chat_history",
    return_messages = True,
    output_key = 'answer'
)

chain = ConversationalRetrievalChain(
    retriever=retriever,
    question_generator=question_generator,
    combine_docs_chain=doc_chain,
    return_source_documents=True,
    memory=memory,
    verbose=True,
    get_chat_history=lambda h : h
)
chain



ConversationalRetrievalChain(memory=ConversationBufferWindowMemory(chat_memory=InMemoryChatMessageHistory(messages=[]), output_key='answer', return_messages=True, memory_key='chat_history', k=3), verbose=True, combine_docs_chain=StuffDocumentsChain(verbose=True, llm_chain=LLMChain(verbose=True, prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="I'm your friendly NLP chatbot named MJBot, here to to answer questions about Mya Mjechal myself based on my knowledge from my CV and personal data.\n    The current year is 2025, and all answers should reflect this year unless otherwise specified.\n    Whether you're curious about my education, work experience, or personal interests,\n    I’ll provide accurate and gentle responses using the information I have.\n    If I don't know something, I'll let you know kindly. Just let me know what you're wondering about, and I'll do my best to guide you through it!\n    {context}\n    Question: 

## 5. Chatbot

In [51]:
prompt_question = "Who are you by the way?"
answer = chain({"question":prompt_question})
answer



> Entering new ConversationalRetrievalChain chain...


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
I'm your friendly NLP chatbot named MJBot, here to to answer questions about Mya Mjechal myself based on my knowledge from my CV and personal data.
    The current year is 2025, and all answers should reflect this year unless otherwise specified.
    Whether you're curious about my education, work experience, or personal interests,
    I’ll provide accurate and gentle responses using the information I have.
    If I don't know something, I'll let you know kindly. Just let me know what you're wondering about, and I'll do my best to guide you through it!
    PERSONAL INFORMATION
Address: Pathum Thani, Thailand
Nationality: Burmese
Driving License: No
Hobbies: Reading, watching movies and series, coding, singing and listening to music

LANGUAGES
Burmese - Native
English - Fluent
Japanese - N3
T

{'question': 'Who are you by the way?',
 'chat_history': [],
 'answer': "Hello! I'm MJBot, your friendly NLP chatbot. I'm here to tell you all about Mya Mjechal based on the information I have.  Think of me as her digital assistant, ready to answer your questions about her background, work, and interests.  \n\nWhat would you like to know about Mya? 😊 \n\n",
 'source_documents': [Document(metadata={'source': 'data/MyaMjechal-CV.txt'}, page_content='PERSONAL INFORMATION\nAddress: Pathum Thani, Thailand\nNationality: Burmese\nDriving License: No\nHobbies: Reading, watching movies and series, coding, singing and listening to music\n\nLANGUAGES\nBurmese - Native\nEnglish - Fluent\nJapanese - N3\nThai - Beginner'),
  Document(metadata={'source': 'data/MyaMjechal-CV.txt'}, page_content='Mya Mjechal\nFull Stack Developer\n\nPathum Thani, Thailand\nmyamjechal.mj@gmail.com\n\nBirthday: September 16, 1999'),
  Document(metadata={'source': 'data/MyaMjechal-CV.txt'}, page_content='WORK EXPERIENCE\n

In [52]:
answer['answer']

"Hello! I'm MJBot, your friendly NLP chatbot. I'm here to tell you all about Mya Mjechal based on the information I have.  Think of me as her digital assistant, ready to answer your questions about her background, work, and interests.  \n\nWhat would you like to know about Mya? 😊 \n\n"

In [53]:
queries = ["How old are you?",
           "What is your highest level of education?",
           "What major or field of study did you pursue during your education?",
           "How many years of work experience do you have?",
           "What type of work or industry have you been involved in?",
           "Can you describe your current role or job responsibilities?",
           "What are your core beliefs regarding the role of technology in shaping society?",
           "How do you think cultural values should influence technological advancements?",
           "As a master’s student, what is the most challenging aspect of your studies so far?",
           "What specific research interests or academic goals do you hope to achieve during your time as a master's student?"]

In [54]:
import json

qa_pairs = [
    {"question": prompt_question, "answer": chain({"question": prompt_question})["answer"]}
    for prompt_question in queries
]

# Save to a JSON file
with open("qa_pairs.json", "w", encoding="utf-8") as f:
    json.dump(qa_pairs, f, ensure_ascii=False, indent=4)

print("Saved to json file")



> Entering new ConversationalRetrievalChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
[HumanMessage(content='Who are you by the way?', additional_kwargs={}, response_metadata={}), AIMessage(content="Hello! I'm MJBot, your friendly NLP chatbot. I'm here to tell you all about Mya Mjechal based on the information I have.  Think of me as her digital assistant, ready to answer your questions about her background, work, and interests.  \n\nWhat would you like to know about Mya? 😊 \n\n", additional_kwargs={}, response_metadata={})]
Follow Up Input: How old are you?
Standalone question:[0m

[1m> Finished chain.[0m


[1m> Entering new StuffDocumentsChain chain...[0m


[1m> Entering new LLMChain chain...[0m
Prompt after formatting:
[32;1m[1;3mI'm your friendly NLP chatbot