<a href="https://colab.research.google.com/github/amadeus-art/azure-openai-coding-dojo/blob/main/azure_openai_qa.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
# Question Answering on Documents using Azure OpenAI, Langchain and ChromaDB

One of the main problems with Large Language Models (LLMs) is that they can hallucinate (produce inaccurate or false information) when asked about topics their training data does not cover. Their knowledge is also frozen at training time: it stays current only if the model is retrained or fine-tuned on recent data.

In this tutorial we show how to use the `langchain` library for question answering with "knowledge base informed large language models": the model answers from documents retrieved from a knowledge base rather than from its training data alone.

## Install dependencies

In [1]:
!pip install openai langchain python-dotenv chromadb -q

## Environment setup
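Before executing the following cells, set the `AZURE_OPENAI_KEY` and `AZURE_OPENAI_ENDPOINT` variables in the `.env` file or export them in your shell. The code below also reads `MODEL_DEPLOYMENT_NAME` and `EMBEDDING_DEPLOYMENT_NAME`, so set those as well.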

<br/>
<img src="assets/keys_endpoint.png" width="800"/>

In [2]:
import os

from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv())  # read local .env file

openai_api_key = os.getenv("AZURE_OPENAI_KEY")
openai_api_base = os.getenv("AZURE_OPENAI_ENDPOINT")  # should look like https://YOUR_RESOURCE_NAME.openai.azure.com/
openai_api_type = 'azure'
openai_api_version = '2023-07-01-preview'  # latest as of 15-09-2023, may change in the future

# these are the names of the deployments you created in the Azure portal within the above resource
model_deployment_name = os.getenv("MODEL_DEPLOYMENT_NAME")
embedding_deployment_name = os.getenv("EMBEDDING_DEPLOYMENT_NAME")
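
`os.getenv` returns `None` for a variable that is not set, and the resulting error only surfaces later, when a client is created or called. A small check run right after the cell above can catch this early. This is a minimal sketch; the helper name `missing_env_vars` is hypothetical and not part of any library.

```python
import os

# Names read by the setup cell above.
REQUIRED_VARS = [
    "AZURE_OPENAI_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "MODEL_DEPLOYMENT_NAME",
    "EMBEDDING_DEPLOYMENT_NAME",
]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

The optional `env` argument only exists so the check can be tried against a plain dictionary without touching the real environment.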

### Example of an LLM response to a question about events after its knowledge cutoff

In [3]:
from langchain.chat_models import AzureChatOpenAI
from langchain.schema import HumanMessage

llm = AzureChatOpenAI(
    openai_api_key=openai_api_key,
    openai_api_base=openai_api_base,
    openai_api_type=openai_api_type,
    openai_api_version=openai_api_version,
    deployment_name=model_deployment_name,
    temperature=0
)

question_1 = HumanMessage(content="Who won the 2022 football world cup?")
question_2 = HumanMessage(content="List the finalists of the 2022 football world cup")

print(llm([question_1]).content, "\n")
print(llm([question_2]).content)

As an AI language model, I cannot predict the future. The 2022 football world cup has not yet taken place. It is scheduled to be held in Qatar from November 21 to December 18, 2022. 

As an AI language model, I don't have access to future events. The finalists of the 2022 football world cup are yet to be determined.


## Step 1 - Load the document(s)
Specify a `DocumentLoader` to load your unstructured data as `Documents`. A `Document` is a piece of text (the `page_content`) and its associated metadata.

In [4]:
from langchain.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://en.wikipedia.org/wiki/2022_FIFA_World_Cup")
data = loader.load()

## Step 2 - Split
Split the `Document` into chunks for embedding and vector storage.

In [31]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=0
)
all_splits = text_splitter.split_documents(data)

print(len(all_splits))

463
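
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size character chunking. It is a simplification: `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraph and line boundaries, so the chunks it produces will differ. The function name `split_fixed` is hypothetical.

```python
def split_fixed(text, chunk_size, chunk_overlap=0):
    """Cut `text` into windows of `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(split_fixed("abcdefghij", chunk_size=4))                   # ['abcd', 'efgh', 'ij']
print(split_fixed("abcdefghij", chunk_size=4, chunk_overlap=1))  # ['abcd', 'defg', 'ghij']
```

With `chunk_overlap=0`, as in the cell above, no text is repeated between chunks; a small overlap can help when an answer straddles a chunk boundary, at the cost of storing more chunks.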


## Step 3 - Store
To look up our document splits later, we first need to store them. The most common way to do this is to embed the contents of each split, then store the embedding together with the text in a vector store, where the embedding serves as the index.

NOTICE: Azure OpenAI embedding models currently support batches of at most 16 inputs per request. For this reason, we set the `chunk_size` of the embeddings client to 16. This `chunk_size` is the embedding batch size and is unrelated to the `chunk_size` of the text splitter above.

In [32]:
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

embedding = OpenAIEmbeddings(
    openai_api_key=openai_api_key,
    openai_api_base=openai_api_base,
    openai_api_type=openai_api_type,
    openai_api_version=openai_api_version,
    deployment=embedding_deployment_name,
    chunk_size=16,
)
vectorstore = Chroma.from_documents(documents=all_splits, embedding=embedding)

In [33]:
embedding.chunk_size

16
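
Because of the batch limit noted above, the 463 splits cannot be embedded in a single request. A rough sketch of the batching idea, assuming a plain list of texts (the helper `batched` is hypothetical and is not langchain's internal code):

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Placeholder texts, same count as the splits produced in Step 2.
texts = [f"chunk {i}" for i in range(463)]
batches = list(batched(texts, 16))

print(len(batches), len(batches[-1]))  # 29 15
```

With a batch size of 16, the 463 splits are covered by 28 full batches plus a final batch of 15.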

## Step 4 - Retrieve
Retrieve the splits most relevant to a question using similarity search. By default, langchain returns the top 4 documents. Later on we raise this number to improve the chances that the retrieved context contains the answer.

In [34]:
question = "Who won 2022 world cup"
docs = vectorstore.similarity_search(question)
len(docs)

4
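
Conceptually, similarity search embeds the question and returns the stored chunks whose embeddings are closest to it. The sketch below illustrates the idea with cosine similarity on made-up two-dimensional vectors. These are assumptions for illustration only: real embeddings have far more dimensions, and the vector store may use a different distance metric and an approximate index. The names `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=4):
    """Return the texts of the k entries most similar to `query_vec`.

    `indexed` is a list of (text, vector) pairs.
    """
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy "index": labels with invented vectors.
index = [
    ("final", [1.0, 0.0]),
    ("group stage", [0.0, 1.0]),
    ("winner", [0.9, 0.1]),
]
print(top_k([1.0, 0.0], index, k=2))  # ['final', 'winner']
```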

## Step 5 - Generate
Distill the retrieved documents into an answer using an LLM/chat model with the `RetrievalQA` chain.

In this example we customize the prompt for didactic purposes; this is not mandatory.



In [35]:
from langchain.chains import RetrievalQA
from langchain.chat_models import AzureChatOpenAI

llm = AzureChatOpenAI(
    openai_api_key=openai_api_key,
    openai_api_base=openai_api_base,
    openai_api_type=openai_api_type,
    openai_api_version=openai_api_version,
    deployment_name=model_deployment_name,
    temperature=0
)

In [36]:
from langchain.prompts import PromptTemplate

template = """Use the following pieces of context to answer the question at the end.
Use three sentences maximum and keep the answer as concise as possible.
Don't try to make up the answer, only use the context to answer the question.
The pieces of context refer to the Football World Cup 2022.

Context:
{context}

Question: {question}
Helpful Answer:"""

QA_CHAIN_PROMPT = PromptTemplate.from_template(template)

qa_chain = RetrievalQA.from_chain_type(
    llm,
    # by default, langchain retrieves the top 4 chunks, here we prefer to
    # retrieve more chunks to increase the chances of finding the answer
    retriever=vectorstore.as_retriever(search_kwargs={"k": 12}),
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT}
)

qa_chain({"query": question})

{'query': 'Who won 2022 world cup',
 'result': 'Argentina won the 2022 World Cup.'}
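
With the default `stuff` chain type, the retrieved chunks are, roughly speaking, concatenated and substituted into the `{context}` placeholder of the prompt before the model is called. A minimal sketch of that idea in plain Python, assuming a blank-line separator and a shortened template (both are illustrative, as is the `build_prompt` helper):

```python
# Shortened version of the template used above, for illustration only.
TEMPLATE = """Use the following pieces of context to answer the question at the end.

Context:
{context}

Question: {question}
Helpful Answer:"""


def build_prompt(template, chunks, question, separator="\n\n"):
    """Join the retrieved chunks and fill the template placeholders."""
    context = separator.join(chunks)
    return template.format(context=context, question=question)


# Stand-ins for retrieved chunks (sentences taken from the answers in this notebook).
chunks = [
    "Argentina won the final 4-2 on penalties following a 3-3 draw after extra time.",
    "France faced Australia, Tunisia, and Denmark in the group stage.",
]
prompt = build_prompt(TEMPLATE, chunks, "Who won 2022 world cup")
print(prompt)
```

This also shows the trade-off behind `k`: with splits of at most 500 characters, `k=12` adds up to roughly 6,000 characters of context to every prompt, which must fit within the model's context window.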

In [37]:
qa_chain({"query": "Summarize the final of the 2022 world cup with a few sentences with a lot of details"})

{'query': 'Summarize the final of the 2022 world cup with a few sentences with a lot of details',
 'result': 'In the final of the 2022 World Cup, Argentina faced off against France in a thrilling match that ended in a 3-3 draw after extra time. The match was eventually decided by penalties, with Argentina emerging as the champions after a 4-2 victory. Lionel Messi was named the best player of the tournament, while Kylian Mbappé finished as the top scorer with 8 goals.'}

In [39]:
qa_chain({"query": "Which teams did France face in the group stage?"})

{'query': 'Which teams did France face in the group stage?',
 'result': 'France faced Australia, Tunisia, and Denmark in the group stage.'}

In [40]:
# sad question ...
qa_chain({"query": "How about Italy? Did they participate in the tournament?"})

{'query': 'How about Italy? Did they participate in the tournament?',
 'result': 'No, Italy did not participate in the tournament as they failed to qualify for the second successive World Cup.'}

In [41]:
# tricky question
qa_chain({"query": "After France won the final, what happened?"})

{'query': 'After France won the final, what happened?',
 'result': 'France did not win the final, Argentina won the final 4-2 on penalties following a 3-3 draw after extra time.'}