In [1]:
import os
from dotenv import load_dotenv

In [2]:
load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
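`os.getenv` returns `None` without complaint when the variable is missing, so a misconfigured `.env` file only shows up later as an authentication error. A minimal sketch of a fail-fast check follows; `require_env` is a hypothetical helper, not part of `python-dotenv` or LangChain.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail loudly.

    Hypothetical helper for illustration; the value itself is never printed.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; check your .env file")
    return value
```

In this notebook it could replace the plain lookup, for example `OPENAI_API_KEY = require_env("OPENAI_API_KEY")`.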

In [3]:
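# PyPDFLoader is provided by the langchain_community package in current LangChain releases
from langchain_community.document_loaders import PyPDFLoader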

In [4]:
%pwd

'/Users/simonma/Projects/LEARNING/fcc-langchain-interview/research'

In [5]:
%cd ..
%pwd

/Users/simonma/Projects/LEARNING/fcc-langchain-interview


'/Users/simonma/Projects/LEARNING/fcc-langchain-interview'

In [6]:
file_path = "data/SDG.pdf"
loader = PyPDFLoader(file_path)
data = loader.load()
print(data)

[Document(metadata={'source': 'data/SDG.pdf', 'page': 0}, page_content=''), Document(metadata={'source': 'data/SDG.pdf', 'page': 1}, page_content=''), Document(metadata={'source': 'data/SDG.pdf', 'page': 2}, page_content='IN THE YEAR 2015, LEADERS FROM 193 COUNTRIES OF THE WORLD \nCAME TOGETHER TO FACE THE FUTURE.\nAnd what they saw was daunting. Famines. Drought. Wars. Plagues. Poverty. \nNot just in some faraway place, but in their own cities and towns and villages.\nThey knew things didn’t have to be this way. They knew we had enough \nfood to feed the world, but that it wasn’t getting shared. They knew there \nwere medicines for HIV and other diseases, but they cost a lot. They knew \nthat earthquakes and floods were inevitable, but that the high death \ntolls were not. \nThey also knew that billions of people worldwide shared their hope for a \nbetter future.\nSo leaders from these countries created a plan called the Sustainable \nDevelopment Goals (SDGs). This set of 17 goals ima

In [7]:
question_gen = ''
for page in data:
    question_gen += page.page_content

print(question_gen)

IN THE YEAR 2015, LEADERS FROM 193 COUNTRIES OF THE WORLD 
CAME TOGETHER TO FACE THE FUTURE.
And what they saw was daunting. Famines. Drought. Wars. Plagues. Poverty. 
Not just in some faraway place, but in their own cities and towns and villages.
They knew things didn’t have to be this way. They knew we had enough 
food to feed the world, but that it wasn’t getting shared. They knew there 
were medicines for HIV and other diseases, but they cost a lot. They knew 
that earthquakes and floods were inevitable, but that the high death 
tolls were not. 
They also knew that billions of people worldwide shared their hope for a 
better future.
So leaders from these countries created a plan called the Sustainable 
Development Goals (SDGs). This set of 17 goals imagines a future just 15 years 
off that would be rid of poverty and hunger, and safe from the worst effects of 
climate change. It’s an ambitious plan. 
But there’s ample evidence that we can succeed. In the past 15 years, the 
interna
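The loop above appends page texts directly, so the last word of one page can end up fused with the first word of the next, and empty pages (such as the first two in this PDF) contribute nothing. A minimal sketch of an alternative that skips empty pages and inserts an explicit separator follows. `Page` is a hypothetical stand-in that models only the `page_content` attribute, and `join_pages` is an illustrative helper.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a loaded page; only page_content is modeled."""
    page_content: str


def join_pages(pages, separator="\n"):
    """Join the non-empty page texts with an explicit separator."""
    texts = [page.page_content.strip() for page in pages]
    return separator.join(text for text in texts if text)


pages = [Page(""), Page("first page "), Page("second page")]
print(join_pages(pages))
```

Because the helper only reads `page_content`, the same function should also work on the `data` list returned by the loader.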

In [8]:
from langchain_text_splitters import TokenTextSplitter

In [9]:
splitter_ques_gen = TokenTextSplitter(model_name="gpt-4o-mini", chunk_size=1000, chunk_overlap=200)

In [11]:
chunk_ques_gen = splitter_ques_gen.split_text(question_gen)
chunk_ques_gen
len(chunk_ques_gen)
type(chunk_ques_gen)

list
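To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of a sliding window over a token list. It is a simplified illustration, not the library's code: whitespace-separated words stand in for model tokens, and `sliding_chunks` is a hypothetical name.

```python
def sliding_chunks(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(tokens):
        end = min(start + chunk_size, len(tokens))
        chunks.append(tokens[start:end])
        if end == len(tokens):
            break  # the last window reached the end of the input
        start += step
    return chunks


# Words stand in for tokens here; a real splitter would use the model's tokenizer.
words = "a b c d e f g h i j".split()
for chunk in sliding_chunks(words, chunk_size=4, chunk_overlap=1):
    print(" ".join(chunk))
```

With the settings used in this notebook, each window would hold up to 1000 tokens and repeat the last 200 tokens of the previous window.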

In [12]:
# Document is defined in langchain_core, as the type() output below confirms
from langchain_core.documents import Document

In [14]:
document_ques_gen = [Document(page_content=chunk) for chunk in chunk_ques_gen]
document_ques_gen

[Document(metadata={}, page_content='IN THE YEAR 2015, LEADERS FROM 193 COUNTRIES OF THE WORLD \nCAME TOGETHER TO FACE THE FUTURE.\nAnd what they saw was daunting. Famines. Drought. Wars. Plagues. Poverty. \nNot just in some faraway place, but in their own cities and towns and villages.\nThey knew things didn’t have to be this way. They knew we had enough \nfood to feed the world, but that it wasn’t getting shared. They knew there \nwere medicines for HIV and other diseases, but they cost a lot. They knew \nthat earthquakes and floods were inevitable, but that the high death \ntolls were not. \nThey also knew that billions of people worldwide shared their hope for a \nbetter future.\nSo leaders from these countries created a plan called the Sustainable \nDevelopment Goals (SDGs). This set of 17 goals imagines a future just 15 years \noff that would be rid of poverty and hunger, and safe from the worst effects of \nclimate change. It’s an ambitious plan. \nBut there’s ample evidence tha

In [15]:
type(document_ques_gen[0])


langchain_core.documents.base.Document

In [16]:
splitter_ans_gen = TokenTextSplitter(model_name="gpt-4o-mini", chunk_size=1000, chunk_overlap=200)

In [18]:
document_ans_gen = splitter_ans_gen.split_documents(document_ques_gen)
document_ans_gen

[Document(metadata={}, page_content='IN THE YEAR 2015, LEADERS FROM 193 COUNTRIES OF THE WORLD \nCAME TOGETHER TO FACE THE FUTURE.\nAnd what they saw was daunting. Famines. Drought. Wars. Plagues. Poverty. \nNot just in some faraway place, but in their own cities and towns and villages.\nThey knew things didn’t have to be this way. They knew we had enough \nfood to feed the world, but that it wasn’t getting shared. They knew there \nwere medicines for HIV and other diseases, but they cost a lot. They knew \nthat earthquakes and floods were inevitable, but that the high death \ntolls were not. \nThey also knew that billions of people worldwide shared their hope for a \nbetter future.\nSo leaders from these countries created a plan called the Sustainable \nDevelopment Goals (SDGs). This set of 17 goals imagines a future just 15 years \noff that would be rid of poverty and hunger, and safe from the worst effects of \nclimate change. It’s an ambitious plan. \nBut there’s ample evidence tha

In [22]:
from langchain_openai import ChatOpenAI

In [24]:
chat_model = ChatOpenAI(model_name="gpt-4o-mini", temperature=0.3)

In [26]:
prompt_template = """
You are an expert at creating questions based on coding materials and documentation.
Your goal is to prepare a coder or programmer for their exam and coding tests.
You do this by asking questions about the text below:

------------
{text}
------------

Create questions that will prepare the coders or programmers for their tests.
Make sure not to lose any important information.

QUESTIONS:
"""

In [28]:
from langchain.prompts import PromptTemplate
prompt = PromptTemplate(template=prompt_template, input_variables=["text"])
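The `{text}` placeholder in the template is filled in when the chain runs. For simple placeholders like this one, the effect can be approximated with plain `str.format`; the sketch below uses that as a stand-in and does not call LangChain. The shortened template and the `render` helper are illustrative only.

```python
# A shortened, illustrative template with the same {text} placeholder.
template = """
Create questions about the text below:

------------
{text}
------------

QUESTIONS:
"""


def render(template: str, **values: str) -> str:
    """Fill named placeholders in a template string."""
    return template.format(**values)


rendered = render(template, text="The SDGs are a set of 17 goals.")
print(rendered)
```

One consequence of this style of formatting is that literal braces in a template have to be doubled (`{{` and `}}`), otherwise they are read as placeholders.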



In [30]:
refine_template = ("""
You are an expert at creating practice questions based on coding material and documentation.
Your goal is to help a coder or programmer prepare for a coding test.
We have received some practice questions to a certain extent: {existing_answer}.
We have the option to refine the existing questions or add new ones
(only if necessary) with some more context below.
------------
{text}
------------

Given the new context, refine the original questions in English.
If the context is not helpful, please provide the original questions.
QUESTIONS:
"""
)

In [31]:
refine_prompt = PromptTemplate(template=refine_template, input_variables=["existing_answer", "text"])

In [32]:
from langchain.chains.summarize import load_summarize_chain

In [34]:
question_gen_chain = load_summarize_chain(llm=chat_model,
                                          chain_type="refine",
                                          verbose=True,
                                          question_prompt=prompt,
                                          refine_prompt=refine_prompt)
questions = question_gen_chain.invoke(document_ques_gen)



> Entering new RefineDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:

You are an expert at creating questions based on coding materials and documentation.
Your goal is to prepare a coder or programmer for their exam and coding tests.
You do this by asking questions about the text below:

------------
IN THE YEAR 2015, LEADERS FROM 193 COUNTRIES OF THE WORLD 
CAME TOGETHER TO FACE THE FUTURE.
And what they saw was daunting. Famines. Drought. Wars. Plagues. Poverty. 
Not just in some faraway place, but in their own cities and towns and villages.
They knew things didn’t have to be this way. They knew we had enough 
food to feed the world, but that it wasn’t getting shared. They knew there 
were medicines for HIV and other diseases, but they cost a lot. They knew 
that earthquakes and floods were inevitable, but that the high death 
tolls were not. 
They also knew that billions of people worldwide shared their hope for a 
b

{'input_documents': [Document(metadata={}, page_content='IN THE YEAR 2015, LEADERS FROM 193 COUNTRIES OF THE WORLD \nCAME TOGETHER TO FACE THE FUTURE.\nAnd what they saw was daunting. Famines. Drought. Wars. Plagues. Poverty. \nNot just in some faraway place, but in their own cities and towns and villages.\nThey knew things didn’t have to be this way. They knew we had enough \nfood to feed the world, but that it wasn’t getting shared. They knew there \nwere medicines for HIV and other diseases, but they cost a lot. They knew \nthat earthquakes and floods were inevitable, but that the high death \ntolls were not. \nThey also knew that billions of people worldwide shared their hope for a \nbetter future.\nSo leaders from these countries created a plan called the Sustainable \nDevelopment Goals (SDGs). This set of 17 goals imagines a future just 15 years \noff that would be rid of poverty and hunger, and safe from the worst effects of \nclimate change. It’s an ambitious plan. \nBut there’
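The verbose log shows the "refine" strategy at work: the first chunk goes through the question prompt, and every later chunk goes through the refine prompt together with the answer produced so far. A minimal sketch of that loop follows. It takes any callable in place of the chat model, and `refine` and `fake_llm` are hypothetical names used only for illustration.

```python
from typing import Callable, Sequence


def refine(
    chunks: Sequence[str],
    llm: Callable[[str], str],
    question_template: str,
    refine_template: str,
) -> str:
    """Run the first chunk through question_template, then fold in the rest."""
    if not chunks:
        raise ValueError("at least one chunk is required")
    answer = llm(question_template.format(text=chunks[0]))
    for chunk in chunks[1:]:
        # Each later call sees the running answer plus one new chunk.
        answer = llm(refine_template.format(existing_answer=answer, text=chunk))
    return answer


def fake_llm(prompt: str) -> str:
    """Stand-in for a chat model: wraps the prompt so the call order is visible."""
    return f"<{prompt}>"


result = refine(["a", "b"], fake_llm, "Q:{text}", "R:{existing_answer}|{text}")
print(result)
```

Since each step depends on the previous answer, the calls run sequentially, one model call per chunk.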

In [37]:
questions['output_text']

"Based on the additional context provided, here are the refined questions that incorporate relevant details and themes related to climate change, marine conservation, terrestrial ecosystems, peace, and global partnerships for sustainable development:\n\n1. In what year did leaders from 193 countries convene to address global challenges, and what was the primary outcome of this meeting concerning sustainable development?\n\n2. Identify at least three global issues recognized as significant challenges during the 2015 meeting, and explain their importance in the context of sustainable development.\n\n3. What is the name of the framework established by the leaders to address these global challenges, and how many specific goals does it encompass?\n\n4. What target year has been set by the Sustainable Development Goals (SDGs) for the eradication of extreme poverty and hunger, and why is this target considered crucial for global development?\n\n5. According to the text, how much has extreme p

In [40]:
from langchain_openai import OpenAIEmbeddings


In [41]:
embeddings = OpenAIEmbeddings()

In [43]:
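# The FAISS wrapper is provided by the langchain_community package in current LangChain releases
from langchain_community.vectorstores import FAISS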

In [44]:
vectorstore = FAISS.from_documents(document_ans_gen, embeddings)
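Conceptually, the vector store embeds each chunk and later returns the chunks whose vectors lie closest to the embedded query. The sketch below shows that idea with a brute-force Euclidean search over tiny hand-written vectors. It is an illustration only: the vectors and the `nearest` helper are made up, and a real index is far more efficient than sorting every entry.

```python
import math


def nearest(query, vectors, k=2):
    """Return the k keys whose vectors are closest to the query (Euclidean distance)."""
    ranked = sorted(vectors, key=lambda name: math.dist(query, vectors[name]))
    return ranked[:k]


# Made-up 2-D "embeddings"; real embeddings have hundreds or thousands of dimensions.
toy_vectors = {
    "chunk_a": [0.0, 1.0],
    "chunk_b": [3.0, 3.0],
    "chunk_c": [0.0, 0.5],
}
print(nearest([0.0, 0.0], toy_vectors, k=2))
```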

In [45]:
chat_model_answer = ChatOpenAI(model_name="gpt-4o-mini", temperature=0.3)

In [53]:
# Split the model output into non-empty, stripped lines.
# Note: the introductory sentence is kept as the first item of the list.
questions_list = [q.strip() for q in questions['output_text'].split('\n') if q.strip()]
questions_list


['Based on the additional context provided, here are the refined questions that incorporate relevant details and themes related to climate change, marine conservation, terrestrial ecosystems, peace, and global partnerships for sustainable development:',
 '1. In what year did leaders from 193 countries convene to address global challenges, and what was the primary outcome of this meeting concerning sustainable development?',
 '2. Identify at least three global issues recognized as significant challenges during the 2015 meeting, and explain their importance in the context of sustainable development.',
 '3. What is the name of the framework established by the leaders to address these global challenges, and how many specific goals does it encompass?',
 '4. What target year has been set by the Sustainable Development Goals (SDGs) for the eradication of extreme poverty and hunger, and why is this target considered crucial for global development?',
 '5. According to the text, how much has ext
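The first item in the list above is the model's introductory sentence rather than a question. A minimal sketch of a stricter parser that keeps only numbered lines follows; it assumes the model numbers its questions as `1.`, `2.`, and so on, which holds for this output but is not guaranteed. `extract_questions` is a hypothetical helper.

```python
import re

# Matches lines such as "1. What is ...?" or "2) Why ...?" and captures the question text.
NUMBERED = re.compile(r"^\s*\d+[.)]\s+(.*\S)\s*$")


def extract_questions(output_text):
    """Return the text of each numbered line, dropping preambles and blank lines."""
    questions = []
    for line in output_text.split("\n"):
        match = NUMBERED.match(line)
        if match:
            questions.append(match.group(1))
    return questions


sample = "Here are the refined questions:\n\n1. In what year?\n2. Which goals?\n"
print(extract_questions(sample))
```

Applied to `questions['output_text']`, this would skip the preamble instead of sending it to the answer chain as if it were a question.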

In [47]:
from langchain.chains import RetrievalQA

In [49]:
answer_chain = RetrievalQA.from_chain_type(llm=chat_model_answer,
                                          chain_type="stuff",
                                          retriever=vectorstore.as_retriever())
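The "stuff" chain type places all retrieved chunks into a single prompt alongside the question. A minimal sketch of that assembly step follows. The prompt wording here is a hypothetical example, not the library's actual default prompt, and `build_stuff_prompt` is an illustrative helper.

```python
def build_stuff_prompt(question, documents, separator="\n\n"):
    """Concatenate the retrieved texts and wrap them in a single prompt."""
    context = separator.join(documents)
    return (
        "Use the following context to answer the question. "
        "If you don't know the answer, say that you don't know.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


stuffed = build_stuff_prompt("When was the plan created?", ["Doc one.", "Doc two."])
print(stuffed)
```

Because every retrieved chunk goes into one prompt, the number of retrieved chunks multiplied by the chunk size has to fit in the model's context window.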

In [54]:
for question in questions_list:
    print(f'question: {question}')
    answer = answer_chain.invoke({"query": question})
    print(f'answer: {answer["result"]}')
    print('\n')


question: Based on the additional context provided, here are the refined questions that incorporate relevant details and themes related to climate change, marine conservation, terrestrial ecosystems, peace, and global partnerships for sustainable development:
answer: I don't know.


question: 1. In what year did leaders from 193 countries convene to address global challenges, and what was the primary outcome of this meeting concerning sustainable development?
answer: Leaders from 193 countries convened in the year 2015 to address global challenges, and the primary outcome of this meeting was the creation of the Sustainable Development Goals (SDGs), which is a set of 17 goals aimed at achieving a better and more sustainable future by the year 2030.


question: 2. Identify at least three global issues recognized as significant challenges during the 2015 meeting, and explain their importance in the context of sustainable development.
answer: Three significant global issues recognized duri