# Building AI Applications with ChatGPT

Sumudu Tennakoon, PhD
<hr>

# LangChain RetrivalQA Chain

In this notebook we will explore some basic fetures on Python programing language for those who have a prior programing expereince.

To learn more about Python, refeer to the following websites

- Python : https://www.python.org

To learn more about the Python packages we explore in this notebook, refer to the following websites

- OpenAI API : https://platform.openai.com/docs/api-reference
- LangChain : https://python.langchain.com/docs/get_started/introduction.html


### Python Library Installation

* Run below code cell to install required libraries before you continue. Ignore that if you already installed them.

In [None]:
!pip install openai langchain

### Load OpenAI API Key

In [2]:
import openai
import configparser
config = configparser.ConfigParser()
config.read(r'../../../config.ini') #Change to your path or assign API Key to openai_api_key (not recomended for production)

openai_api_key = config['SECRETS']['openai_api_key']

### Load Document File

In [3]:
from langchain.document_loaders import TextLoader

source_file = "../Data/Marie_Curie_Bio.txt"

loader = TextLoader(source_file)

documents = loader.load()

print( len(documents))
print( documents[0])

1
page_content='Marie Curie \nBiographical\n\nMarie Curie, nÃ©e Maria Sklodowska, was born in Warsaw on November 7, 1867, the daughter of a secondary-school teacher. She received a general education in local schools and some scientific training from her father. She became involved in a studentsâ€™ revolutionary organization and found it prudent to leave Warsaw, then in the part of Poland dominated by Russia, for Cracow, which at that time was under Austrian rule. In 1891, she went to Paris to continue her studies at the Sorbonne where she obtained Licenciateships in Physics and the Mathematical Sciences. She met Pierre Curie, Professor in the School of Physics in 1894 and in the following year they were married. She succeeded her husband as Head of the Physics Laboratory at the Sorbonne, gained her Doctor of Science degree in 1903, and following the tragic death of Pierre Curie in 1906, she took his place as Professor of General Physics in the Faculty of Sciences, the first time a woma

### Split Text Into Document Chunks

In [4]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, 
                                               chunk_overlap=0,
                                               length_function = len,
                                               separators=['\n\n', '\n', '.', ' ', ''],
                                               add_start_index = True,
                                               )

docs = text_splitter.split_documents(documents)

print( len(docs))
print( docs[0])

13
page_content='Marie Curie \nBiographical' metadata={'source': '../Data/Marie_Curie_Bio.txt', 'start_index': 0}


In [5]:
print( docs[1])

page_content='Marie Curie, nÃ©e Maria Sklodowska, was born in Warsaw on November 7, 1867, the daughter of a secondary-school teacher. She received a general education in local schools and some scientific training from her father. She became involved in a studentsâ€™ revolutionary organization and found it prudent to leave Warsaw, then in the part of Poland dominated by Russia, for Cracow, which at that time was under Austrian rule. In 1891, she went to Paris to continue her studies at the Sorbonne where she' metadata={'source': '../Data/Marie_Curie_Bio.txt', 'start_index': 27}


In [6]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma

embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
doc_search = Chroma.from_documents(docs, embeddings)

## Create LLM Object

In [7]:
from langchain.chat_models import ChatOpenAI

MODEL = "gpt-3.5-turbo"

chat_llm = ChatOpenAI(model=MODEL, 
                      temperature=0, 
                      max_tokens=200, 
                      openai_api_key=openai_api_key,
                      verbose=True)

## Create RetrievalQA Chain

In [8]:
from langchain.chains import RetrievalQA

# chain_type = {"stuff", "map_reduce", "refine"} 
# https://python.langchain.com/docs/modules/chains/additional/question_answering.html

qa = RetrievalQA.from_chain_type(llm=chat_llm, 
                                 chain_type="stuff", 
                                 retriever=doc_search.as_retriever())


### Run Query

In [9]:
query = "Who discovered radioactivity?"
qa.run(query)

'Henri Becquerel discovered radioactivity in 1896.'

### Return Source Documents

In [10]:
from langchain.chains import RetrievalQA

# chain_type = {"stuff", "map_reduce", "refine"} 
# https://python.langchain.com/docs/modules/chains/additional/question_answering.html

qa = RetrievalQA.from_chain_type(llm=chat_llm, 
                                 chain_type="stuff", 
                                 retriever=doc_search.as_retriever(),
                                 return_source_documents=True
                                 )

In [11]:
query = "Who discovered radioactivity?"
response = qa({"query": query})

response

{'query': 'Who discovered radioactivity?',
 'result': 'Henri Becquerel discovered radioactivity in 1896.',
 'source_documents': [Document(page_content='Her early researches, together with her husband, were often performed under difficult conditions, laboratory arrangements were poor and both had to undertake much teaching to earn a livelihood. The discovery of radioactivity by Henri Becquerel in 1896 inspired the Curies in their brilliant researches and analyses which led to the isolation of polonium, named after the country of Marieâ€™s birth, and radium. Mme. Curie developed methods for the separation of radium from radioactive residues in s', metadata={'source': '../Data/Marie_Curie_Bio.txt', 'start_index': 1133}),
  Document(page_content='gnition of her work in radioactivity. She also received, jointly with her husband, the Davy Medal of the Royal Society in 1903 and, in 1921, President Harding of the United States, on behalf of the women of America, presented her with one gram of 

In [12]:
print(F"Answer: {response['result']}")

print(F"Documents: {response['source_documents']}")

Answer: Henri Becquerel discovered radioactivity in 1896.
Documents: [Document(page_content='Her early researches, together with her husband, were often performed under difficult conditions, laboratory arrangements were poor and both had to undertake much teaching to earn a livelihood. The discovery of radioactivity by Henri Becquerel in 1896 inspired the Curies in their brilliant researches and analyses which led to the isolation of polonium, named after the country of Marieâ€™s birth, and radium. Mme. Curie developed methods for the separation of radium from radioactive residues in s', metadata={'source': '../Data/Marie_Curie_Bio.txt', 'start_index': 1133}), Document(page_content='gnition of her work in radioactivity. She also received, jointly with her husband, the Davy Medal of the Royal Society in 1903 and, in 1921, President Harding of the United States, on behalf of the women of America, presented her with one gram of radium in recognition of her service to science.', metadata={

### Create RetrievalQAWithSourcesChain

In [14]:
from langchain.chains import RetrievalQAWithSourcesChain

qa_chain = RetrievalQAWithSourcesChain.from_chain_type(llm=chat_llm, 
                                                    chain_type="stuff", 
                                                    retriever=doc_search.as_retriever())

response = qa_chain({"question": query}, return_only_outputs=True)
response

{'answer': 'Marie Curie discovered radioactivity.\n',
 'sources': '../Data/Marie_Curie_Bio.txt'}

## Custom Prompts
### With Translation

In [19]:
from langchain.prompts import PromptTemplate

prompt_template = """Use the following pieces of context to answer the question at the end. \
If you don't know the answer, just say that you don't know, don't try to make up an answer.\
TRanslate answer to French

{context}

Question: {question}
Answer:"""

PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

chain_type_kwargs = {"prompt": PROMPT}

qa = RetrievalQA.from_chain_type(llm=chat_llm, 
                                 chain_type="stuff", 
                                 retriever=doc_search.as_retriever(), 
                                 chain_type_kwargs=chain_type_kwargs)

In [20]:
query = "Who discovered radioactivity?"
qa.run(query)

'Henri Becquerel a découvert la radioactivité.'

<hr/>
Last update 2023-07-09 by Sumudu Tennakoon

<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.