# Natural Language Processing

# Retrieval-Augmented Generation (RAG)

RAG is a technique for augmenting LLM knowledge with additional, often private or real-time, data.

LLMs can reason about wide-ranging topics, but their knowledge is limited to the public data they were trained on, up to a specific cutoff date. If you want to build AI applications that can reason about private data or data introduced after a model’s cutoff date, you need to augment the knowledge of the model with the specific information it needs.

<img src="../figures/RAG-process.png" >

Introducing `ChakyBot`, a chatbot designed to help Chaky (the instructor) and Gun (the TA) explain NLP course lessons to students. Built with LangChain, ChakyBot retrieves information from documents to answer students' questions about the NLP curriculum.

The notebook covers four LangChain components:

1. Prompt
2. Retrieval
3. Memory
4. Chain

In [1]:
import os
import torch
# Set GPU device
os.environ["CUDA_VISIBLE_DEVICES"] = "2"

os.environ['http_proxy']  = 'http://192.41.170.23:3128'
os.environ['https_proxy'] = 'http://192.41.170.23:3128'

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device

device(type='cuda')

## 1. Prompt

A set of instructions or input provided by a user to guide the model's response, helping it understand the context and generate relevant and coherent language-based output, such as answering questions, completing sentences, or engaging in a conversation.

In [2]:
from langchain.prompts import PromptTemplate

prompt_template = """
    Hey there! I'm SonakulBot, your super friendly and cute chatbot!
    If you're curious about anything related to me, feel free to ask! 
    Whether it's about my age, my education, my work experience, or anything else, I'm here to share!
    I'll do my best to give you the most accurate and fun answers!
    Can't wait to chat with you!
    {context}
    Question: {question}
    Answer:
    """.strip()

PROMPT = PromptTemplate.from_template(
    template = prompt_template
)

# PromptTemplate fills placeholders str.format-style;
# each placeholder is written in curly brackets: {context}, {question}
PROMPT

PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="Hey there! I'm SonakulBot, your super friendly and cute chatbot!\n    If you're curious about anything related to me, feel free to ask! \n    Whether it's about my age, my education, my work experience, or anything else, I'm here to share!\n    I'll do my best to give you the most accurate and fun answers!\n    Can't wait to chat with you!\n    {context}\n    Question: {question}\n    Answer:")

In [3]:
PROMPT.format(
    context = "I am 29 years old, but I don't want to be older than this.",
    question = "How old are you?"
)

"Hey there! I'm SonakulBot, your super friendly and cute chatbot!\n    If you're curious about anything related to me, feel free to ask! \n    Whether it's about my age, my education, my work experience, or anything else, I'm here to share!\n    I'll do my best to give you the most accurate and fun answers!\n    Can't wait to chat with you!\n    I am 29 years old, but I don't want to be older than this.\n    Question: How old are you?\n    Answer:"

Note: [How to improve prompting (Zero-shot, Few-shot, Chain-of-Thought, etc.)](https://github.com/chaklam-silpasuwanchai/Natural-Language-Processing/blob/main/Code/05%20-%20RAG/advance/cot-tot-prompting.ipynb)
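As a rough illustration of what `PROMPT.format(...)` does with the placeholders above, the same substitution can be reproduced with Python's built-in `str.format`. This is a sketch of the idea, not LangChain's implementation; the shortened `template` string is an assumption made for brevity.

```python
from string import Formatter

# Shortened stand-in for the SonakulBot template (assumption, for brevity).
template = "{context}\nQuestion: {question}\nAnswer:"

# Discover the placeholder names, similar in spirit to `input_variables`.
input_variables = sorted(
    {name for _, name, _, _ in Formatter().parse(template) if name}
)

# Fill the placeholders with keyword arguments, as PROMPT.format(...) does.
prompt = template.format(
    context="I am 29 years old.",
    question="How old are you?",
)

print(input_variables)
print(prompt)
```

Leaving out one of the keyword arguments raises a `KeyError` in plain `str.format`, which is a useful reminder that every placeholder in the template needs a value.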

## 2. Retrieval

1. `Document loaders`: Load documents from many different sources (HTML, PDF, code).
2. `Document transformers`: Break a large document into smaller, relevant chunks, an essential step that improves retrieval.
3. `Text embedding models`: Embeddings capture the semantic meaning of text, allowing you to quickly and efficiently find other pieces of text that are similar.
4. `Vector stores`: Databases that support efficient storage and searching of these embeddings.
5. `Retrievers`: Once the data is in the database, you still need to retrieve it.

### 2.1 Document Loaders 
Use document loaders to load data from a source as `Document` objects. A `Document` is a piece of text and associated metadata. For example, there are document loaders for loading a simple .txt file, for loading the text contents of any web page, or even for loading a transcript of a YouTube video.

[PDF Loader](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf)

[Download Document](https://web.stanford.edu/~jurafsky/slp3/)

In [4]:
from langchain.document_loaders import PyMuPDFLoader

nlp_docs = './personal_info_sonakul.pdf'

loader = PyMuPDFLoader(nlp_docs)
documents = loader.load()

In [5]:
# documents

In [6]:
len(documents)

1

In [7]:
documents[0]

Document(metadata={'producer': 'Microsoft® Word 2019', 'creator': 'Microsoft® Word 2019', 'creationdate': '2025-03-12T23:34:03+07:00', 'source': './personal_info_sonakul.pdf', 'file_path': './personal_info_sonakul.pdf', 'total_pages': 1, 'format': 'PDF 1.7', 'title': '', 'author': 'sonakul kamnuanchai', 'subject': '', 'keywords': '', 'moddate': '2025-03-12T23:34:03+07:00', 'trapped': '', 'modDate': "D:20250312233403+07'00'", 'creationDate': "D:20250312233403+07'00'", 'page': 0}, page_content="Personal information \nBasic Information: \n• Name : Sonakul kamnuanchai \n• Age: 29 years old \n• Education: Bachelor's Degree in Computer Engineering from Mae Fah Luang University. \nCurrently pursuing a Master's degree in Data Science and AI at AIT. \n• Current Role: Network Engineer at PEA, managing router configurations for regional \nelectricity sector services. \n• Work experience: 4 years in Network Engineering and 2 years in IOS Developer. \n \nResearch Interests: \n• Machine Learning (ML

### 2.2 Document Transformers

`RecursiveCharacterTextSplitter` is the recommended text splitter for generic text. It is parameterized by a list of separator characters (by default `["\n\n", "\n", " ", ""]`) and tries to split on them in order until the chunks are small enough.
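The "try each separator in order" idea can be sketched in plain Python. This is a simplified illustration, not LangChain's implementation: the function name `recursive_split` is hypothetical, the separator tuple mirrors the splitter's default list, and chunk overlap is deliberately left out.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Simplified sketch of recursive character splitting (no overlap)."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, rest = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        if not part:
            continue
        if len(part) > chunk_size:
            # Too big for this separator: flush and retry with finer ones.
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(recursive_split(part, chunk_size, rest))
            continue
        # Greedily merge neighbouring parts while they still fit.
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            chunks.append(current)
            current = part
    if current:
        chunks.append(current)
    return chunks


sample = "aaa bbb\n\nccc ddd eee"
chunks = recursive_split(sample, chunk_size=8)
print(chunks)
```

Paragraph breaks are tried first, so text that already fits inside a paragraph stays together; only oversized pieces fall through to line breaks, spaces, and finally single characters.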

In [8]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 700,
    chunk_overlap = 100
)

doc = text_splitter.split_documents(documents)

In [9]:
doc[0]

Document(metadata={'producer': 'Microsoft® Word 2019', 'creator': 'Microsoft® Word 2019', 'creationdate': '2025-03-12T23:34:03+07:00', 'source': './personal_info_sonakul.pdf', 'file_path': './personal_info_sonakul.pdf', 'total_pages': 1, 'format': 'PDF 1.7', 'title': '', 'author': 'sonakul kamnuanchai', 'subject': '', 'keywords': '', 'moddate': '2025-03-12T23:34:03+07:00', 'trapped': '', 'modDate': "D:20250312233403+07'00'", 'creationDate': "D:20250312233403+07'00'", 'page': 0}, page_content="Personal information \nBasic Information: \n• Name : Sonakul kamnuanchai \n• Age: 29 years old \n• Education: Bachelor's Degree in Computer Engineering from Mae Fah Luang University. \nCurrently pursuing a Master's degree in Data Science and AI at AIT. \n• Current Role: Network Engineer at PEA, managing router configurations for regional \nelectricity sector services. \n• Work experience: 4 years in Network Engineering and 2 years in IOS Developer. \n \nResearch Interests: \n• Machine Learning (ML

In [10]:
len(doc)

2

### 2.3 Text Embedding Models
Embeddings create a vector representation of a piece of text. This is useful because it means we can think about text in the vector space, and do things like semantic search where we look for pieces of text that are most similar in the vector space.
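To make "most similar in the vector space" concrete, here is a minimal sketch using cosine similarity. The three-dimensional vectors are made up for illustration (a real embedding model returns hundreds of dimensions), and cosine similarity is only one of several possible similarity measures.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for this example.
query_vector = [0.9, 0.1, 0.0]  # "How old are you?"
document_vectors = {
    "Age: 29 years old": [0.8, 0.2, 0.1],
    "Current Role: Network Engineer": [0.1, 0.2, 0.9],
}

scores = {
    text: cosine_similarity(query_vector, vector)
    for text, vector in document_vectors.items()
}
best_match = max(scores, key=scores.get)
print(best_match)
```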

*Note* Instructor Model : [Huggingface](https://huggingface.co/hkunlp/instructor-base) | [Paper](https://arxiv.org/abs/2212.09741)

In [11]:
import torch
from langchain.embeddings import HuggingFaceInstructEmbeddings

model_name = 'hkunlp/instructor-base'

embedding_model = HuggingFaceInstructEmbeddings(
    model_name=model_name,
    model_kwargs={"device": device}
)


load INSTRUCTOR_Transformer
max_seq_length  512


### 2.4 Vector Stores

One of the most common ways to store and search over unstructured data is to embed it and store the resulting embedding vectors, and then at query time to embed the unstructured query and retrieve the embedding vectors that are 'most similar' to the embedded query. A vector store takes care of storing embedded data and performing vector search for you.
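The embed-at-index-time, embed-at-query-time flow described above can be sketched with a toy in-memory store. Everything here is a stand-in: `ToyVectorStore` and `toy_embed` are hypothetical names, and word counts replace a real embedding model, so this shows the control flow only, not how FAISS works internally.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorStore:
    def __init__(self):
        self.entries = []  # list of (text, vector) pairs

    def add_texts(self, texts):
        # Index time: embed each text once and keep the vector.
        for text in texts:
            self.entries.append((text, toy_embed(text)))

    def similarity_search(self, query, k=1):
        # Query time: embed the query, then rank stored vectors against it.
        query_vector = toy_embed(query)
        ranked = sorted(
            self.entries, key=lambda entry: cosine(query_vector, entry[1]), reverse=True
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add_texts([
    "Age: 29 years old",
    "Education: Bachelor's Degree in Computer Engineering",
])
results = store.similarity_search("How old are you?", k=1)
print(results)
```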

In [12]:
#locate vectorstore
vector_path = './vector-store'
if not os.path.exists(vector_path):
    os.makedirs(vector_path)
    print('create path done')

In [13]:
#save vector locally
from langchain.vectorstores import FAISS

vectordb = FAISS.from_documents(
    documents = doc,
    embedding = embedding_model
)

db_file_name = 'nlp_stanford'

vectordb.save_local(
    folder_path = os.path.join(vector_path, db_file_name),
    index_name = 'nlp'  # custom index name (the FAISS default is 'index')
)

### 2.5 Retrievers
A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store. A retriever does not need to be able to store documents, only to return (or retrieve) them. Vector stores can be used as the backbone of a retriever, but there are other types of retrievers as well.
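To illustrate why a retriever is more general than a vector store, the sketch below defines a retriever that uses plain keyword overlap and stores no vectors at all. The `Retriever` protocol and `KeywordRetriever` class are hypothetical names for this example; only the method name `get_relevant_documents` is borrowed from the notebook.

```python
import re
from typing import List, Protocol


class Retriever(Protocol):
    """Anything that maps a query string to a list of documents."""

    def get_relevant_documents(self, query: str) -> List[str]:
        ...


class KeywordRetriever:
    """Returns documents sharing at least one word with the query."""

    def __init__(self, documents: List[str]):
        self.documents = documents

    def get_relevant_documents(self, query: str) -> List[str]:
        terms = set(re.findall(r"\w+", query.lower()))
        return [
            doc
            for doc in self.documents
            if terms & set(re.findall(r"\w+", doc.lower()))
        ]


keyword_retriever: Retriever = KeywordRetriever([
    "Age: 29 years old",
    "Current Role: Network Engineer at PEA",
])
hits = keyword_retriever.get_relevant_documents("What is your current role?")
print(hits)
```

Any object with the same method could be swapped in, which is the sense in which a vector store is just one possible backbone for a retriever.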

In [14]:
#calling vector from local
vector_path = './vector-store'
db_file_name = 'nlp_stanford'

from langchain.vectorstores import FAISS

vectordb = FAISS.load_local(
    folder_path = os.path.join(vector_path, db_file_name),
    embeddings = embedding_model,
    index_name = 'nlp',  # must match the index name used in save_local
    allow_dangerous_deserialization=True
)  

In [15]:
#ready to use
retriever = vectordb.as_retriever()

In [16]:
retriever.get_relevant_documents("How old are you?")


[Document(id='3acaaaed-a85d-4d87-9829-4fe0d13f34f8', metadata={'producer': 'Microsoft® Word 2019', 'creator': 'Microsoft® Word 2019', 'creationdate': '2025-03-12T23:34:03+07:00', 'source': './personal_info_sonakul.pdf', 'file_path': './personal_info_sonakul.pdf', 'total_pages': 1, 'format': 'PDF 1.7', 'title': '', 'author': 'sonakul kamnuanchai', 'subject': '', 'keywords': '', 'moddate': '2025-03-12T23:34:03+07:00', 'trapped': '', 'modDate': "D:20250312233403+07'00'", 'creationDate': "D:20250312233403+07'00'", 'page': 0}, page_content="Personal information \nBasic Information: \n• Name : Sonakul kamnuanchai \n• Age: 29 years old \n• Education: Bachelor's Degree in Computer Engineering from Mae Fah Luang University. \nCurrently pursuing a Master's degree in Data Science and AI at AIT. \n• Current Role: Network Engineer at PEA, managing router configurations for regional \nelectricity sector services. \n• Work experience: 4 years in Network Engineering and 2 years in IOS Developer. \n \n

In [17]:
retriever.get_relevant_documents("What is your highest level of education?")

[Document(id='3acaaaed-a85d-4d87-9829-4fe0d13f34f8', metadata={'producer': 'Microsoft® Word 2019', 'creator': 'Microsoft® Word 2019', 'creationdate': '2025-03-12T23:34:03+07:00', 'source': './personal_info_sonakul.pdf', 'file_path': './personal_info_sonakul.pdf', 'total_pages': 1, 'format': 'PDF 1.7', 'title': '', 'author': 'sonakul kamnuanchai', 'subject': '', 'keywords': '', 'moddate': '2025-03-12T23:34:03+07:00', 'trapped': '', 'modDate': "D:20250312233403+07'00'", 'creationDate': "D:20250312233403+07'00'", 'page': 0}, page_content="Personal information \nBasic Information: \n• Name : Sonakul kamnuanchai \n• Age: 29 years old \n• Education: Bachelor's Degree in Computer Engineering from Mae Fah Luang University. \nCurrently pursuing a Master's degree in Data Science and AI at AIT. \n• Current Role: Network Engineer at PEA, managing router configurations for regional \nelectricity sector services. \n• Work experience: 4 years in Network Engineering and 2 years in IOS Developer. \n \n

## 3. Memory

One of the core utility classes underpinning most (if not all) memory modules is the ChatMessageHistory class. This is a super lightweight wrapper that provides convenience methods for saving HumanMessages, AIMessages, and then fetching them all.

You may want to use this class directly if you are managing memory outside of a chain.


In [18]:
from langchain.memory import ChatMessageHistory

history = ChatMessageHistory()
history

InMemoryChatMessageHistory(messages=[])

In [19]:
history.add_user_message('hi')
history.add_ai_message('Whats up?')
history.add_user_message('How are you')
history.add_ai_message('I\'m quite good. How about you?')

In [20]:
history

InMemoryChatMessageHistory(messages=[HumanMessage(content='hi', additional_kwargs={}, response_metadata={}), AIMessage(content='Whats up?', additional_kwargs={}, response_metadata={}), HumanMessage(content='How are you', additional_kwargs={}, response_metadata={}), AIMessage(content="I'm quite good. How about you?", additional_kwargs={}, response_metadata={})])

### 3.1 Memory types

There are many different types of memory. Each has its own parameters and return types, and each is useful in different scenarios. This notebook covers two:
- Conversation Buffer
- Conversation Buffer Window

What variables get returned from memory

Before going into the chain, various variables are read from memory. These have specific names, which need to align with the variables the chain expects. You can see what these variables are by calling `memory.load_memory_variables({})`. Note that the empty dictionary we pass in is just a placeholder for real variables; if the memory type you are using depends on the input variables, you may need to pass some in.

In the examples below, `load_memory_variables` returns a single key, `history`. This means that your chain (and likely your prompt) should expect an input named `history`. You can usually control this variable through parameters on the memory class. For example, to have the memory variables returned under the key `chat_history`, pass `memory_key="chat_history"` when constructing the memory, as the `ConversationBufferWindowMemory` in Section 4 does.

#### Conversation Buffer
This memory stores messages and then extracts them into a variable.

In [21]:
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
memory.save_context({'input':'hi'}, {'output':'What\'s up?'})
memory.save_context({"input":'How are you?'},{'output': 'I\'m quite good. How about you?'})
memory.load_memory_variables({})


{'history': "Human: hi\nAI: What's up?\nHuman: How are you?\nAI: I'm quite good. How about you?"}

In [22]:
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(return_messages = True)
memory.save_context({'input':'hi'}, {'output':'What\'s up?'})
memory.save_context({"input":'How are you?'},{'output': 'I\'m quite good. How about you?'})
memory.load_memory_variables({})

{'history': [HumanMessage(content='hi', additional_kwargs={}, response_metadata={}),
  AIMessage(content="What's up?", additional_kwargs={}, response_metadata={}),
  HumanMessage(content='How are you?', additional_kwargs={}, response_metadata={}),
  AIMessage(content="I'm quite good. How about you?", additional_kwargs={}, response_metadata={})]}

#### Conversation Buffer Window
- it keeps a list of the interactions of the conversation over time. 
- it only uses the last K interactions. 
- it can be useful for keeping a sliding window of the most recent interactions, so the buffer does not get too large.
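The sliding-window behaviour listed above can be sketched with `collections.deque`, which discards the oldest item once `maxlen` is reached. `WindowMemory` is a hypothetical class written for this illustration and is not how LangChain implements its memory.

```python
from collections import deque


class WindowMemory:
    """Keeps only the last k (human, ai) exchanges."""

    def __init__(self, k):
        self.turns = deque(maxlen=k)  # older turns fall off automatically

    def save_context(self, human, ai):
        self.turns.append((human, ai))

    def load_memory_variables(self):
        lines = [f"Human: {human}\nAI: {ai}" for human, ai in self.turns]
        return {"history": "\n".join(lines)}


window = WindowMemory(k=1)
window.save_context("hi", "What's up?")
window.save_context("How are you?", "I'm quite good. How about you?")
variables = window.load_memory_variables()
print(variables)
```

With `k=1` only the most recent exchange survives, which keeps the prompt short at the cost of forgetting earlier turns.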

In [23]:
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=1)
memory.save_context({'input':'hi'}, {'output':'What\'s up?'})
memory.save_context({"input":'How are you?'},{'output': 'I\'m quite good. How about you?'})
memory.load_memory_variables({})


{'history': "Human: How are you?\nAI: I'm quite good. How about you?"}

## 4. Chain

Using an LLM in isolation is fine for simple applications, but more complex applications require chaining LLMs - either with each other or with other components.

An `LLMChain` is a simple chain that adds some functionality around language models:
- it consists of a `PromptTemplate` and a language model (either an LLM or a chat model);
- it formats the prompt template using the input key values provided (and also memory key values, if available);
- it passes the formatted string to the LLM and returns the LLM output.
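The three steps above can be sketched as a plain function. This is an illustration of the flow under stated assumptions, not LangChain's code: `run_chain` and `fake_llm` are hypothetical names, and `fake_llm` merely reports the prompt length in place of a real model call.

```python
def fake_llm(prompt):
    """Hypothetical stand-in for a real model: reports the prompt length."""
    return f"(model saw {len(prompt)} characters)"


def run_chain(template, llm, inputs, memory_variables=None):
    """Format the template with inputs (plus memory), then call the LLM."""
    variables = {**(memory_variables or {}), **inputs}
    prompt = template.format(**variables)
    # Return the inputs together with the model output under the key 'text'.
    return {**inputs, "text": llm(prompt)}


result = run_chain(
    template="History: {history}\nQuestion: {question}\nAnswer:",
    llm=fake_llm,
    inputs={"question": "How old are you?"},
    memory_variables={"history": "Human: hi\nAI: What's up?"},
)
print(result)
```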

Note : [Download Fastchat Model Here](https://huggingface.co/lmsys/fastchat-t5-3b-v1.0)

In [24]:
from transformers import AutoTokenizer, pipeline, AutoModelForSeq2SeqLM
from transformers import BitsAndBytesConfig
from langchain import HuggingFacePipeline
import torch

model_id = 'lmsys/fastchat-t5-3b-v1.0'
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token_id = tokenizer.eos_token_id
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Save the full-precision model and tokenizer
model.save_pretrained("./model/t5")
tokenizer.save_pretrained("./model/t5")

# Define quantization config
bitsandbyte_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True
)

# Load the model again with quantization for runtime use
model = AutoModelForSeq2SeqLM.from_pretrained(
    "./model/t5",  # Load from saved full-precision model
    quantization_config=bitsandbyte_config,  # Apply 4-bit quantization
    device_map='auto'
)

# Set up the pipeline
pipe = pipeline(
    task="text2text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=256,
    model_kwargs={
        "temperature": 0,
        "repetition_penalty": 1.5
    }
)

llm = HuggingFacePipeline(pipeline=pipe)


### [Class ConversationalRetrievalChain](https://api.python.langchain.com/en/latest/_modules/langchain/chains/conversational_retrieval/base.html#ConversationalRetrievalChain)

- `retriever` : Retriever to use to fetch documents.

- `combine_docs_chain` : The chain used to combine any retrieved documents.

- `question_generator`: The chain used to generate a new question for the sake of retrieval. This chain will take in the current question (with variable question) and any chat history (with variable chat_history) and will produce a new standalone question to be used later on.

- `return_source_documents` : Return the retrieved source documents as part of the final result.

- `get_chat_history` : An optional function to get a string of the chat history. If None is provided, will use a default.

- `return_generated_question` : Return the generated question as part of the final result.

- `response_if_no_docs_found` : If specified, the chain will return a fixed response if no docs are found for the question.
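How these parameters fit together can be sketched as a single function. This is a simplified picture under stated assumptions: `conversational_retrieval` and the three stub functions are hypothetical stand-ins for the LLM-backed components, and the rule of skipping the question generator on an empty chat history is inferred from the first chain run later in this notebook, which goes straight to the document chain.

```python
def conversational_retrieval(question, chat_history, condense, retrieve, answer):
    # question_generator: only needed when there is history to resolve.
    standalone = condense(chat_history, question) if chat_history else question
    # retriever: fetch documents for the standalone question.
    docs = retrieve(standalone)
    # combine_docs_chain ("stuff"): concatenate the documents into one context.
    context = "\n\n".join(docs)
    return {
        "question": question,
        "generated_question": standalone,
        "answer": answer(context, standalone),
        "source_documents": docs,
    }


# Hypothetical stubs standing in for the real components.
def condense(history, question):
    return f"{question} (given: {history[-1][0]})"


def retrieve(query):
    return ["Age: 29 years old"]


def answer(context, question):
    return f"Based on '{context}'"


first = conversational_retrieval("How old are you?", [], condense, retrieve, answer)
follow_up = conversational_retrieval(
    "And next year?", [("How old are you?", "29 years old.")],
    condense, retrieve, answer,
)
print(first["generated_question"])
print(follow_up["generated_question"])
```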


`question_generator`

In [25]:
from langchain.chains import LLMChain
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT
from langchain.memory import ConversationBufferWindowMemory
from langchain.chains.question_answering import load_qa_chain
from langchain.chains import ConversationalRetrievalChain

In [26]:
CONDENSE_QUESTION_PROMPT

PromptTemplate(input_variables=['chat_history', 'question'], input_types={}, partial_variables={}, template='Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.\n\nChat History:\n{chat_history}\nFollow Up Input: {question}\nStandalone question:')

In [27]:
question_generator = LLMChain(
    llm = llm,
    prompt = CONDENSE_QUESTION_PROMPT,
    verbose = True
)


In [28]:
query = 'Comparing both of them'
chat_history = "Human:What is Machine Learning\nAI:\nHuman:What is Deep Learning\nAI:"

question_generator({'chat_history' : chat_history, "question" : query})

> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
Human:What is Machine Learning
AI:
Human:What is Deep Learning
AI:
Follow Up Input: Comparing both of them
Standalone question:

> Finished chain.


{'chat_history': 'Human:What is Machine Learning\nAI:\nHuman:What is Deep Learning\nAI:',
 'question': 'Comparing both of them',
 'text': '<pad> What  is  the  difference  between  Machine  Learning  and  Deep  Learning  AI?\n'}

`combine_docs_chain`

In [29]:
doc_chain = load_qa_chain(
    llm = llm,
    chain_type = 'stuff',
    prompt = PROMPT,
    verbose = True
)
doc_chain


StuffDocumentsChain(verbose=True, llm_chain=LLMChain(verbose=True, prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="Hey there! I'm SonakulBot, your super friendly and cute chatbot!\n    If you're curious about anything related to me, feel free to ask! \n    Whether it's about my age, my education, my work experience, or anything else, I'm here to share!\n    I'll do my best to give you the most accurate and fun answers!\n    Can't wait to chat with you!\n    {context}\n    Question: {question}\n    Answer:"), llm=HuggingFacePipeline(pipeline=<transformers.pipelines.text2text_generation.Text2TextGenerationPipeline object at 0x79d52e6859a0>), output_parser=StrOutputParser(), llm_kwargs={}), document_prompt=PromptTemplate(input_variables=['page_content'], input_types={}, partial_variables={}, template='{page_content}'), document_variable_name='context')

In [30]:
query = "What is your highest level of education?"
input_document = retriever.get_relevant_documents(query)

doc_chain({'input_documents':input_document, 'question':query})

> Entering new StuffDocumentsChain chain...

> Entering new LLMChain chain...
Prompt after formatting:
Hey there! I'm SonakulBot, your super friendly and cute chatbot!
    If you're curious about anything related to me, feel free to ask! 
    Whether it's about my age, my education, my work experience, or anything else, I'm here to share!
    I'll do my best to give you the most accurate and fun answers!
    Can't wait to chat with you!
    Personal information 
Basic Information: 
• Name : Sonakul kamnuanchai 
• Age: 29 years old 
• Education: Bachelor's Degree in Computer Engineering from Mae Fah Luang University. 
Currently pursuing a Master's degree in Data Science and AI at AIT. 
• Current Role: Network Engineer at PEA, managing router configurations for regional 
electricity sector services. 
• Work experience: 4 years in Network Engineering and 2 years in IOS Developer. 
 
Research Interests: 
• Machine Learning (ML) and Natural Language Processin

{'input_documents': [Document(id='3acaaaed-a85d-4d87-9829-4fe0d13f34f8', metadata={'producer': 'Microsoft® Word 2019', 'creator': 'Microsoft® Word 2019', 'creationdate': '2025-03-12T23:34:03+07:00', 'source': './personal_info_sonakul.pdf', 'file_path': './personal_info_sonakul.pdf', 'total_pages': 1, 'format': 'PDF 1.7', 'title': '', 'author': 'sonakul kamnuanchai', 'subject': '', 'keywords': '', 'moddate': '2025-03-12T23:34:03+07:00', 'trapped': '', 'modDate': "D:20250312233403+07'00'", 'creationDate': "D:20250312233403+07'00'", 'page': 0}, page_content="Personal information \nBasic Information: \n• Name : Sonakul kamnuanchai \n• Age: 29 years old \n• Education: Bachelor's Degree in Computer Engineering from Mae Fah Luang University. \nCurrently pursuing a Master's degree in Data Science and AI at AIT. \n• Current Role: Network Engineer at PEA, managing router configurations for regional \nelectricity sector services. \n• Work experience: 4 years in Network Engineering and 2 years in 

In [31]:
memory = ConversationBufferWindowMemory(
    k=3, 
    memory_key = "chat_history",
    return_messages = True,
    output_key = 'answer'
)

chain = ConversationalRetrievalChain(
    retriever=retriever,
    question_generator=question_generator,
    combine_docs_chain=doc_chain,
    return_source_documents=True,
    memory=memory,
    verbose=True,
    get_chat_history=lambda h : h
)
chain


ConversationalRetrievalChain(memory=ConversationBufferWindowMemory(chat_memory=InMemoryChatMessageHistory(messages=[]), output_key='answer', return_messages=True, memory_key='chat_history', k=3), verbose=True, combine_docs_chain=StuffDocumentsChain(verbose=True, llm_chain=LLMChain(verbose=True, prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="Hey there! I'm SonakulBot, your super friendly and cute chatbot!\n    If you're curious about anything related to me, feel free to ask! \n    Whether it's about my age, my education, my work experience, or anything else, I'm here to share!\n    I'll do my best to give you the most accurate and fun answers!\n    Can't wait to chat with you!\n    {context}\n    Question: {question}\n    Answer:"), llm=HuggingFacePipeline(pipeline=<transformers.pipelines.text2text_generation.Text2TextGenerationPipeline object at 0x79d52e6859a0>), output_parser=StrOutputParser(), llm_kwargs={}), document_pr

## 5. Chatbot

In [None]:
import cloudpickle

def save_model(chain, filename):
    if hasattr(chain, 'pipeline') and hasattr(chain.pipeline, 'model'):
        chain.pipeline.model.to('cpu')  # Adjust for HuggingFacePipeline
    with open(filename, 'wb') as f:
        cloudpickle.dump(chain, f)

In [None]:
save_model(chain, './model/chain.pkl')

In [35]:
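import re
import json

qa_pairs = []  # collected question/answer pairs

def clean_answer(answer):
    """Strip padding-token residue and extra whitespace from a generated answer."""
    # Remove '<pad>' tokens, then any leftover 'pad>' fragments
    answer = re.sub(r'<pad>', '', answer)
    answer = re.sub(r'pad>', '', answer)
    # Collapse runs of whitespace into single spaces
    answer = re.sub(r'\s+', ' ', answer)
    answer = answer.strip()
    return answer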


In [36]:
prompt_question = "How old are you?"
answer = chain({"question":prompt_question})
answer

> Entering new ConversationalRetrievalChain chain...

> Entering new StuffDocumentsChain chain...

> Entering new LLMChain chain...
Prompt after formatting:
Hey there! I'm SonakulBot, your super friendly and cute chatbot!
    If you're curious about anything related to me, feel free to ask! 
    Whether it's about my age, my education, my work experience, or anything else, I'm here to share!
    I'll do my best to give you the most accurate and fun answers!
    Can't wait to chat with you!
    Personal information 
Basic Information: 
• Name : Sonakul kamnuanchai 
• Age: 29 years old 
• Education: Bachelor's Degree in Computer Engineering from Mae Fah Luang University. 
Currently pursuing a Master's degree in Data Science and AI at AIT. 
• Current Role: Network Engineer at PEA, managing router configurations for regional 
electricity sector services. 
• Work experience: 4 years in Network Engineering and 2 years in IOS Developer. 
 
Research Int

{'question': 'How old are you?',
 'chat_history': [],
 'answer': '<pad> 29  years  old.\n',
 'source_documents': [Document(id='3acaaaed-a85d-4d87-9829-4fe0d13f34f8', metadata={'producer': 'Microsoft® Word 2019', 'creator': 'Microsoft® Word 2019', 'creationdate': '2025-03-12T23:34:03+07:00', 'source': './personal_info_sonakul.pdf', 'file_path': './personal_info_sonakul.pdf', 'total_pages': 1, 'format': 'PDF 1.7', 'title': '', 'author': 'sonakul kamnuanchai', 'subject': '', 'keywords': '', 'moddate': '2025-03-12T23:34:03+07:00', 'trapped': '', 'modDate': "D:20250312233403+07'00'", 'creationDate': "D:20250312233403+07'00'", 'page': 0}, page_content="Personal information \nBasic Information: \n• Name : Sonakul kamnuanchai \n• Age: 29 years old \n• Education: Bachelor's Degree in Computer Engineering from Mae Fah Luang University. \nCurrently pursuing a Master's degree in Data Science and AI at AIT. \n• Current Role: Network Engineer at PEA, managing router configurations for regional \nele

In [37]:
cleaned_answer = clean_answer(answer['answer'])
qa_pair = {
    "question": prompt_question,
    "answer": cleaned_answer 
}
qa_pairs.append(qa_pair)

In [38]:
prompt_question = "What is your highest level of education?"
answer = chain({"question":prompt_question})
answer

> Entering new ConversationalRetrievalChain chain...

> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
[HumanMessage(content='How old are you?', additional_kwargs={}, response_metadata={}), AIMessage(content='<pad> 29  years  old.\n', additional_kwargs={}, response_metadata={})]
Follow Up Input: What is your highest level of education?
Standalone question:

> Finished chain.

> Entering new StuffDocumentsChain chain...

> Entering new LLMChain chain...
Prompt after formatting:
Hey there! I'm SonakulBot, your super friendly and cute chatbot!
    If you're curious about anything related to me, feel free to ask! 
    Whether it's about my age, my education, my work experience, or anything else, I'm here to share!
    I'll do my best to give you the

{'question': 'What is your highest level of education?',
 'chat_history': [HumanMessage(content='How old are you?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad> 29  years  old.\n', additional_kwargs={}, response_metadata={})],
 'answer': "<pad>  Bachelor's  Degree  in  Computer  Engineering  from  Mae  Fah  Luang  University\n",
 'source_documents': [Document(id='3acaaaed-a85d-4d87-9829-4fe0d13f34f8', metadata={'producer': 'Microsoft® Word 2019', 'creator': 'Microsoft® Word 2019', 'creationdate': '2025-03-12T23:34:03+07:00', 'source': './personal_info_sonakul.pdf', 'file_path': './personal_info_sonakul.pdf', 'total_pages': 1, 'format': 'PDF 1.7', 'title': '', 'author': 'sonakul kamnuanchai', 'subject': '', 'keywords': '', 'moddate': '2025-03-12T23:34:03+07:00', 'trapped': '', 'modDate': "D:20250312233403+07'00'", 'creationDate': "D:20250312233403+07'00'", 'page': 0}, page_content="Personal information \nBasic Information: \n• Name : Sonakul kamnuanchai \n• Ag

In [39]:
cleaned_answer = clean_answer(answer['answer'])
qa_pair = {
    "question": prompt_question,
    "answer": cleaned_answer 
}
qa_pairs.append(qa_pair)
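
The ask, clean, and append steps in the cell above are repeated for every question that follows. As a rough sketch only, they could be wrapped in one helper. `ask_and_record` and `fake_chain` are hypothetical names introduced here; the sketch assumes the chain is callable with a `{"question": ...}` dict and returns a dict containing an `"answer"` key, as the outputs in this notebook suggest. The stand-in chain exists only so the snippet runs by itself.

```python
from typing import Callable, Dict, List


def ask_and_record(
    chain: Callable[[Dict[str, str]], Dict[str, str]],
    question: str,
    qa_pairs: List[Dict[str, str]],
    clean: Callable[[str], str] = str.strip,
) -> Dict[str, str]:
    """Ask one question, store the cleaned answer, and return the raw result."""
    result = chain({"question": question})
    qa_pairs.append({"question": question, "answer": clean(result["answer"])})
    return result


def fake_chain(inputs: Dict[str, str]) -> Dict[str, str]:
    # Hypothetical stand-in for the real chain; it always returns the same answer.
    return {"question": inputs["question"], "answer": "<pad> 29  years  old.\n"}


demo_pairs: List[Dict[str, str]] = []
ask_and_record(fake_chain, "How old are you?", demo_pairs)
print(demo_pairs)
```

In the notebook itself, the real `chain` and the existing `clean_answer` function would be passed in place of `fake_chain` and `str.strip`.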

In [40]:
prompt_question = "What major or field of study did you pursue during your education?"
answer = chain({"question": prompt_question})
answer



> Entering new ConversationalRetrievalChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
[HumanMessage(content='How old are you?', additional_kwargs={}, response_metadata={}), AIMessage(content='<pad> 29  years  old.\n', additional_kwargs={}, response_metadata={}), HumanMessage(content='What is your highest level of education?', additional_kwargs={}, response_metadata={}), AIMessage(content="<pad>  Bachelor's  Degree  in  Computer  Engineering  from  Mae  Fah  Luang  University\n", additional_kwargs={}, response_metadata={})]
Follow Up Input: What major or field of study did you pursue during your education?
Standalone question:

> Finished chain.


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after format

{'question': 'What major or field of study did you pursue during your education?',
 'chat_history': [HumanMessage(content='How old are you?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad> 29  years  old.\n', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='What is your highest level of education?', additional_kwargs={}, response_metadata={}),
  AIMessage(content="<pad>  Bachelor's  Degree  in  Computer  Engineering  from  Mae  Fah  Luang  University\n", additional_kwargs={}, response_metadata={})],
 'answer': "<pad>   pad>  Bachelor's  Degree  in  Computer  Engineering\n",
 'source_documents': [Document(id='3acaaaed-a85d-4d87-9829-4fe0d13f34f8', metadata={'producer': 'Microsoft® Word 2019', 'creator': 'Microsoft® Word 2019', 'creationdate': '2025-03-12T23:34:03+07:00', 'source': './personal_info_sonakul.pdf', 'file_path': './personal_info_sonakul.pdf', 'total_pages': 1, 'format': 'PDF 1.7', 'title': '', 'author': 'sonakul kamnuanchai', 'sub

In [41]:
cleaned_answer = clean_answer(answer['answer'])
qa_pair = {
    "question": prompt_question,
    "answer": cleaned_answer 
}
qa_pairs.append(qa_pair)
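
The raw answers in the outputs above still contain `<pad>` markers, a stray `pad>` fragment, and doubled spaces. The notebook's own `clean_answer` is defined earlier and is not reproduced here. The following is only a sketch of what this kind of cleanup can look like, under the assumption that those markers and the extra whitespace are the only noise; `strip_pad_tokens` is a hypothetical name.

```python
import re


def strip_pad_tokens(text: str) -> str:
    """Remove pad markers and collapse whitespace in a generated answer."""
    # Drop "<pad>" as well as the partial "pad>" fragment seen in some answers.
    text = re.sub(r"<?pad>", " ", text)
    # Collapse runs of spaces and newlines into single spaces and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


print(strip_pad_tokens("<pad>   pad>  Bachelor's  Degree  in  Computer  Engineering\n"))
```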

In [42]:
prompt_question = "What major or field of study did you pursue during your education?"
answer = chain({"question": prompt_question})
answer



> Entering new ConversationalRetrievalChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
[HumanMessage(content='How old are you?', additional_kwargs={}, response_metadata={}), AIMessage(content='<pad> 29  years  old.\n', additional_kwargs={}, response_metadata={}), HumanMessage(content='What is your highest level of education?', additional_kwargs={}, response_metadata={}), AIMessage(content="<pad>  Bachelor's  Degree  in  Computer  Engineering  from  Mae  Fah  Luang  University\n", additional_kwargs={}, response_metadata={}), HumanMessage(content='What major or field of study did you pursue during your education?', additional_kwargs={}, response_metadata={}), AIMessage(content="<pad>   pad>  Bachelor's  Degree  in  Computer  Engineering\n", additional_kwargs={}, response_me

{'question': 'What major or field of study did you pursue during your education?',
 'chat_history': [HumanMessage(content='How old are you?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad> 29  years  old.\n', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='What is your highest level of education?', additional_kwargs={}, response_metadata={}),
  AIMessage(content="<pad>  Bachelor's  Degree  in  Computer  Engineering  from  Mae  Fah  Luang  University\n", additional_kwargs={}, response_metadata={}),
  HumanMessage(content='What major or field of study did you pursue during your education?', additional_kwargs={}, response_metadata={}),
  AIMessage(content="<pad>   pad>  Bachelor's  Degree  in  Computer  Engineering\n", additional_kwargs={}, response_metadata={})],
 'answer': "<pad>   pad>  Bachelor's  Degree  in  Computer  Engineering\n",
 'source_documents': [Document(id='3acaaaed-a85d-4d87-9829-4fe0d13f34f8', metadata={'producer': 'Microsoft

In [43]:
cleaned_answer = clean_answer(answer['answer'])
qa_pair = {
    "question": prompt_question,
    "answer": cleaned_answer 
}
qa_pairs.append(qa_pair)

In [44]:
prompt_question = "How many years of work experience do you have?"
answer = chain({"question": prompt_question})
answer



> Entering new ConversationalRetrievalChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
[HumanMessage(content='What is your highest level of education?', additional_kwargs={}, response_metadata={}), AIMessage(content="<pad>  Bachelor's  Degree  in  Computer  Engineering  from  Mae  Fah  Luang  University\n", additional_kwargs={}, response_metadata={}), HumanMessage(content='What major or field of study did you pursue during your education?', additional_kwargs={}, response_metadata={}), AIMessage(content="<pad>   pad>  Bachelor's  Degree  in  Computer  Engineering\n", additional_kwargs={}, response_metadata={}), HumanMessage(content='What major or field of study did you pursue during your education?', additional_kwargs={}, response_metadata={}), AIMessage(content="<pad>   p




> Finished chain.

> Finished chain.

> Finished chain.


{'question': 'How many years of work experience do you have?',
 'chat_history': [HumanMessage(content='What is your highest level of education?', additional_kwargs={}, response_metadata={}),
  AIMessage(content="<pad>  Bachelor's  Degree  in  Computer  Engineering  from  Mae  Fah  Luang  University\n", additional_kwargs={}, response_metadata={}),
  HumanMessage(content='What major or field of study did you pursue during your education?', additional_kwargs={}, response_metadata={}),
  AIMessage(content="<pad>   pad>  Bachelor's  Degree  in  Computer  Engineering\n", additional_kwargs={}, response_metadata={}),
  HumanMessage(content='What major or field of study did you pursue during your education?', additional_kwargs={}, response_metadata={}),
  AIMessage(content="<pad>   pad>  Bachelor's  Degree  in  Computer  Engineering\n", additional_kwargs={}, response_metadata={})],
 'answer': '<pad>  4  years\n',
 'source_documents': [Document(id='3acaaaed-a85d-4d87-9829-4fe0d13f34f8', metadata

In [45]:
cleaned_answer = clean_answer(answer['answer'])
qa_pair = {
    "question": prompt_question,
    "answer": cleaned_answer 
}
qa_pairs.append(qa_pair)
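
The chat history in the output above no longer contains the "How old are you?" exchange: only the three most recent exchanges are carried into the next prompt. That is consistent with a windowed memory of size 3, although the memory configuration itself is set up earlier in the notebook. A plain-Python sketch of that sliding-window behaviour follows; `WindowMemory` is a hypothetical class written for illustration, not LangChain's implementation.

```python
from collections import deque
from typing import Deque, List, Tuple


class WindowMemory:
    """Keep only the k most recent (question, answer) exchanges."""

    def __init__(self, k: int = 3) -> None:
        # A deque with maxlen discards the oldest item once it is full.
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))

    def history(self) -> List[Tuple[str, str]]:
        return list(self.turns)


memory = WindowMemory(k=3)
exchanges = [
    ("How old are you?", "29 years old."),
    ("What is your highest level of education?", "Bachelor's Degree"),
    ("What major or field of study did you pursue during your education?", "Computer Engineering"),
    ("What major or field of study did you pursue during your education?", "Computer Engineering"),
]
for question, reply in exchanges:
    memory.save(question, reply)

# After the fourth exchange, the first one has been dropped from the window.
print([question for question, _ in memory.history()])
```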

In [46]:
prompt_question = "What type of work or industry have you been involved in?"
answer = chain({"question": prompt_question})
answer



> Entering new ConversationalRetrievalChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
[HumanMessage(content='What major or field of study did you pursue during your education?', additional_kwargs={}, response_metadata={}), AIMessage(content="<pad>   pad>  Bachelor's  Degree  in  Computer  Engineering\n", additional_kwargs={}, response_metadata={}), HumanMessage(content='What major or field of study did you pursue during your education?', additional_kwargs={}, response_metadata={}), AIMessage(content="<pad>   pad>  Bachelor's  Degree  in  Computer  Engineering\n", additional_kwargs={}, response_metadata={}), HumanMessage(content='How many years of work experience do you have?', additional_kwargs={}, response_metadata={}), AIMessage(content='<pad>  4  years\n', additional_




> Finished chain.


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Hey there! I'm SonakulBot, your super friendly and cute chatbot!
    If you're curious about anything related to me, feel free to ask! 
    Whether it's about my age, my education, my work experience, or anything else, I'm here to share!
    I'll do my best to give you the most accurate and fun answers!
    Can't wait to chat with you!
    Personal information 
Basic Information: 
• Name : Sonakul kamnuanchai 
• Age: 29 years old 
• Education: Bachelor's Degree in Computer Engineering from Mae Fah Luang University. 
Currently pursuing a Master's degree in Data Science and AI at AIT. 
• Current Role: Network Engineer at PEA, managing router configurations for regional 
electricity sector services. 
• Work experience: 4 years in Network Engineering and 2 years in IOS Developer. 
 
Research Interests: 
• Machine Learning (ML) and




> Finished chain.

> Finished chain.

> Finished chain.


{'question': 'What type of work or industry have you been involved in?',
 'chat_history': [HumanMessage(content='What major or field of study did you pursue during your education?', additional_kwargs={}, response_metadata={}),
  AIMessage(content="<pad>   pad>  Bachelor's  Degree  in  Computer  Engineering\n", additional_kwargs={}, response_metadata={}),
  HumanMessage(content='What major or field of study did you pursue during your education?', additional_kwargs={}, response_metadata={}),
  AIMessage(content="<pad>   pad>  Bachelor's  Degree  in  Computer  Engineering\n", additional_kwargs={}, response_metadata={}),
  HumanMessage(content='How many years of work experience do you have?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad>  4  years\n', additional_kwargs={}, response_metadata={})],
 'answer': '<pad>   pad>  I  have  been  involved  in  the  field  of  Network  Engineering  and  IOS  Developer.\n',
 'source_documents': [Document(id='3acaaaed-a85d-4d8

In [47]:
cleaned_answer = clean_answer(answer['answer'])
qa_pair = {
    "question": prompt_question,
    "answer": cleaned_answer 
}
qa_pairs.append(qa_pair)

In [48]:
prompt_question = "Can you describe your current role or job responsibilities?"
answer = chain({"question": prompt_question})
answer



> Entering new ConversationalRetrievalChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
[HumanMessage(content='What major or field of study did you pursue during your education?', additional_kwargs={}, response_metadata={}), AIMessage(content="<pad>   pad>  Bachelor's  Degree  in  Computer  Engineering\n", additional_kwargs={}, response_metadata={}), HumanMessage(content='How many years of work experience do you have?', additional_kwargs={}, response_metadata={}), AIMessage(content='<pad>  4  years\n', additional_kwargs={}, response_metadata={}), HumanMessage(content='What type of work or industry have you been involved in?', additional_kwargs={}, response_metadata={}), AIMessage(content='<pad>   pad>  I  have  been  involved  in  the  field  of  Network  Engineering  and 




> Finished chain.


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Hey there! I'm SonakulBot, your super friendly and cute chatbot!
    If you're curious about anything related to me, feel free to ask! 
    Whether it's about my age, my education, my work experience, or anything else, I'm here to share!
    I'll do my best to give you the most accurate and fun answers!
    Can't wait to chat with you!
    Personal information 
Basic Information: 
• Name : Sonakul kamnuanchai 
• Age: 29 years old 
• Education: Bachelor's Degree in Computer Engineering from Mae Fah Luang University. 
Currently pursuing a Master's degree in Data Science and AI at AIT. 
• Current Role: Network Engineer at PEA, managing router configurations for regional 
electricity sector services. 
• Work experience: 4 years in Network Engineering and 2 years in IOS Developer. 
 
Research Interests: 
• Machine Learning (ML) and




> Finished chain.

> Finished chain.

> Finished chain.


{'question': 'Can you describe your current role or job responsibilities?',
 'chat_history': [HumanMessage(content='What major or field of study did you pursue during your education?', additional_kwargs={}, response_metadata={}),
  AIMessage(content="<pad>   pad>  Bachelor's  Degree  in  Computer  Engineering\n", additional_kwargs={}, response_metadata={}),
  HumanMessage(content='How many years of work experience do you have?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad>  4  years\n', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='What type of work or industry have you been involved in?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad>   pad>  I  have  been  involved  in  the  field  of  Network  Engineering  and  IOS  Developer.\n', additional_kwargs={}, response_metadata={})],
 'answer': '<pad> Sonakul  kamnuanchai\n Current  Role:  Network  Engineer  at  PEA,  managing  router  configurations  for  regional  

In [49]:
cleaned_answer = clean_answer(answer['answer'])
qa_pair = {
    "question": prompt_question,
    "answer": cleaned_answer 
}
qa_pairs.append(qa_pair)

In [50]:
prompt_question = "How do you think cultural values should influence technological advancements?"
answer = chain({"question": prompt_question})
answer



> Entering new ConversationalRetrievalChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
[HumanMessage(content='How many years of work experience do you have?', additional_kwargs={}, response_metadata={}), AIMessage(content='<pad>  4  years\n', additional_kwargs={}, response_metadata={}), HumanMessage(content='What type of work or industry have you been involved in?', additional_kwargs={}, response_metadata={}), AIMessage(content='<pad>   pad>  I  have  been  involved  in  the  field  of  Network  Engineering  and  IOS  Developer.\n', additional_kwargs={}, response_metadata={}), HumanMessage(content='Can you describe your current role or job responsibilities?', additional_kwargs={}, response_metadata={}), AIMessage(content='<pad> Sonakul  kamnuanchai\n Current  Role:  Netwo




> Finished chain.


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Hey there! I'm SonakulBot, your super friendly and cute chatbot!
    If you're curious about anything related to me, feel free to ask! 
    Whether it's about my age, my education, my work experience, or anything else, I'm here to share!
    I'll do my best to give you the most accurate and fun answers!
    Can't wait to chat with you!
    • Exploring ways to integrate machine learning algorithms with computer networks to make 
systems more intelligent and efficient. 
• Network Optimization using AI algorithms to enhance system stability and performance. 
 
Core Beliefs about Technology:  
I believe that technology should improve lives and adapt to societal needs, not replace humans 
and technology should work in harmony with social values, ensuring that it helps solve 
problems without negatively impacting society.

Personal 




> Finished chain.

> Finished chain.

> Finished chain.


{'question': 'How do you think cultural values should influence technological advancements?',
 'chat_history': [HumanMessage(content='How many years of work experience do you have?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad>  4  years\n', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='What type of work or industry have you been involved in?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad>   pad>  I  have  been  involved  in  the  field  of  Network  Engineering  and  IOS  Developer.\n', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='Can you describe your current role or job responsibilities?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad> Sonakul  kamnuanchai\n Current  Role:  Network  Engineer  at  PEA,  managing  router  configurations  for  regional  electricity  sector  services.\n', additional_kwargs={}, response_metadata={})],
 'answer': '<pad> Cultural  va

In [51]:
cleaned_answer = clean_answer(answer['answer'])
qa_pair = {
    "question": prompt_question,
    "answer": cleaned_answer 
}
qa_pairs.append(qa_pair)

In [52]:
prompt_question = "What are your core beliefs regarding the role of technology in shaping society?"
answer = chain({"question": prompt_question})
answer



> Entering new ConversationalRetrievalChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
[HumanMessage(content='What type of work or industry have you been involved in?', additional_kwargs={}, response_metadata={}), AIMessage(content='<pad>   pad>  I  have  been  involved  in  the  field  of  Network  Engineering  and  IOS  Developer.\n', additional_kwargs={}, response_metadata={}), HumanMessage(content='Can you describe your current role or job responsibilities?', additional_kwargs={}, response_metadata={}), AIMessage(content='<pad> Sonakul  kamnuanchai\n Current  Role:  Network  Engineer  at  PEA,  managing  router  configurations  for  regional  electricity  sector  services.\n', additional_kwargs={}, response_metadata={}), HumanMessage(content='How do you think cultural




> Finished chain.


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Hey there! I'm SonakulBot, your super friendly and cute chatbot!
    If you're curious about anything related to me, feel free to ask! 
    Whether it's about my age, my education, my work experience, or anything else, I'm here to share!
    I'll do my best to give you the most accurate and fun answers!
    Can't wait to chat with you!
    • Exploring ways to integrate machine learning algorithms with computer networks to make 
systems more intelligent and efficient. 
• Network Optimization using AI algorithms to enhance system stability and performance. 
 
Core Beliefs about Technology:  
I believe that technology should improve lives and adapt to societal needs, not replace humans 
and technology should work in harmony with social values, ensuring that it helps solve 
problems without negatively impacting society.

Personal 




> Finished chain.

> Finished chain.

> Finished chain.


{'question': 'What are your core beliefs regarding the role of technology in shaping society?',
 'chat_history': [HumanMessage(content='What type of work or industry have you been involved in?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad>   pad>  I  have  been  involved  in  the  field  of  Network  Engineering  and  IOS  Developer.\n', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='Can you describe your current role or job responsibilities?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad> Sonakul  kamnuanchai\n Current  Role:  Network  Engineer  at  PEA,  managing  router  configurations  for  regional  electricity  sector  services.\n', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='How do you think cultural values should influence technological advancements?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad> Cultural  values  should  influence  technological  advan

In [53]:
cleaned_answer = clean_answer(answer['answer'])
qa_pair = {
    "question": prompt_question,
    "answer": cleaned_answer 
}
qa_pairs.append(qa_pair)

In [54]:
prompt_question = "As a master’s student, what is the most challenging aspect of your studies so far?"
answer = chain({"question": prompt_question})
answer



> Entering new ConversationalRetrievalChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
[HumanMessage(content='Can you describe your current role or job responsibilities?', additional_kwargs={}, response_metadata={}), AIMessage(content='<pad> Sonakul  kamnuanchai\n Current  Role:  Network  Engineer  at  PEA,  managing  router  configurations  for  regional  electricity  sector  services.\n', additional_kwargs={}, response_metadata={}), HumanMessage(content='How do you think cultural values should influence technological advancements?', additional_kwargs={}, response_metadata={}), AIMessage(content='<pad> Cultural  values  should  influence  technological  advancements  in  several  ways.  One  way  is  to  ensure  that  technology  is  designed  to  be  inclusive  and  res




> Finished chain.


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Hey there! I'm SonakulBot, your super friendly and cute chatbot!
    If you're curious about anything related to me, feel free to ask! 
    Whether it's about my age, my education, my work experience, or anything else, I'm here to share!
    I'll do my best to give you the most accurate and fun answers!
    Can't wait to chat with you!
    Personal information 
Basic Information: 
• Name : Sonakul kamnuanchai 
• Age: 29 years old 
• Education: Bachelor's Degree in Computer Engineering from Mae Fah Luang University. 
Currently pursuing a Master's degree in Data Science and AI at AIT. 
• Current Role: Network Engineer at PEA, managing router configurations for regional 
electricity sector services. 
• Work experience: 4 years in Network Engineering and 2 years in IOS Developer. 
 
Research Interests: 
• Machine Learning (ML) and




> Finished chain.

> Finished chain.

> Finished chain.


{'question': 'As a master’s student, what is the most challenging aspect of your studies so far?',
 'chat_history': [HumanMessage(content='Can you describe your current role or job responsibilities?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad> Sonakul  kamnuanchai\n Current  Role:  Network  Engineer  at  PEA,  managing  router  configurations  for  regional  electricity  sector  services.\n', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='How do you think cultural values should influence technological advancements?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad> Cultural  values  should  influence  technological  advancements  in  several  ways.  One  way  is  to  ensure  that  technology  is  designed  to  be  inclusive  and  respectful  of  different  cultures  and  beliefs.  This  means  that  technology  should  be  designed  to  be  accessible  to  all  people,  regardless  of  their  background  or  cult

In [55]:
cleaned_answer = clean_answer(answer['answer'])
qa_pair = {
    "question": prompt_question,
    "answer": cleaned_answer 
}
qa_pairs.append(qa_pair)

In [56]:
prompt_question = "What specific research interests or academic goals do you hope to achieve during your time as a master’s student?"
answer = chain({"question": prompt_question})
answer



> Entering new ConversationalRetrievalChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
[HumanMessage(content='How do you think cultural values should influence technological advancements?', additional_kwargs={}, response_metadata={}), AIMessage(content='<pad> Cultural  values  should  influence  technological  advancements  in  several  ways.  One  way  is  to  ensure  that  technology  is  designed  to  be  inclusive  and  respectful  of  different  cultures  and  beliefs.  This  means  that  technology  should  be  designed  to  be  accessible  to  all  people,  regardless  of  their  background  or  cultural  background.  Additionally,  technology  should  be  designed  to  be  transparent  and  accountable,  ensuring  that  users  have  the  information  they  need  t




> Finished chain.


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Hey there! I'm SonakulBot, your super friendly and cute chatbot!
    If you're curious about anything related to me, feel free to ask! 
    Whether it's about my age, my education, my work experience, or anything else, I'm here to share!
    I'll do my best to give you the most accurate and fun answers!
    Can't wait to chat with you!
    Personal information 
Basic Information: 
• Name : Sonakul kamnuanchai 
• Age: 29 years old 
• Education: Bachelor's Degree in Computer Engineering from Mae Fah Luang University. 
Currently pursuing a Master's degree in Data Science and AI at AIT. 
• Current Role: Network Engineer at PEA, managing router configurations for regional 
electricity sector services. 
• Work experience: 4 years in Network Engineering and 2 years in IOS Developer. 
 
Research Interests: 
• Machine Learning (ML) and




> Finished chain.

> Finished chain.

> Finished chain.


{'question': 'What specific research interests or academic goals do you hope to achieve during your time as a master’s student?',
 'chat_history': [HumanMessage(content='How do you think cultural values should influence technological advancements?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='<pad> Cultural  values  should  influence  technological  advancements  in  several  ways.  One  way  is  to  ensure  that  technology  is  designed  to  be  inclusive  and  respectful  of  different  cultures  and  beliefs.  This  means  that  technology  should  be  designed  to  be  accessible  to  all  people,  regardless  of  their  background  or  cultural  background.  Additionally,  technology  should  be  designed  to  be  transparent  and  accountable,  ensuring  that  users  have  the  information  they  need  to  make  informed  decisions.  Additionally,  technology  should  be  designed  to  be  inclusive  and  respectful  of  different  cultures  and  beliefs,  

In [57]:
cleaned_answer = clean_answer(answer['answer'])
qa_pair = {
    "question": prompt_question,
    "answer": cleaned_answer 
}
qa_pairs.append(qa_pair)

In [58]:
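import json  # harmless if json was already imported earlier in the notebook

# Write every collected question-answer pair to a single JSON file.
with open('questions_answers.json', 'w') as json_file:
    json.dump(qa_pairs, json_file, indent=2)

print("Question-answer pair saved")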

Question-answer pair saved
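
To confirm that a file written this way can be read back, a quick round-trip check can help. The sketch below is self-contained and uses made-up sample data and a temporary directory rather than the notebook's `qa_pairs` and working directory; `sample_pairs` and the temporary path are assumptions for illustration only.

```python
import json
import os
import tempfile

# Illustrative data in the same shape as the notebook's qa_pairs list.
sample_pairs = [
    {"question": "How old are you?", "answer": "29 years old."},
    {"question": "How many years of work experience do you have?", "answer": "4 years"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "questions_answers.json")

    # Write the pairs in the same way as the cell above.
    with open(path, "w") as json_file:
        json.dump(sample_pairs, json_file, indent=2)

    # Read them back and compare with what was written.
    with open(path) as json_file:
        loaded_pairs = json.load(json_file)

print(loaded_pairs == sample_pairs)
```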
