# Build a Q&A Bot over private data with OpenAI and LangChain

https://www.linkedin.com/pulse/build-qa-bot-over-private-data-openai-langchain-leo-wang/

In [1]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains import ConversationalRetrievalChain

import os
os.environ["OPENAI_API_KEY"] = ""  # paste your own key here; remove this line if the variable is already set in your environment
from langchain.chat_models import ChatOpenAI
from langchain.llms import OpenAI
#llm = ChatOpenAI(temperature=0, model='gpt-3.5-turbo')
#llm = OpenAI(temperature=0, model='gpt-3.5-turbo')


## Data Ingestion

In [2]:
from langchain.document_loaders import DirectoryLoader

pdf_loader = DirectoryLoader('./Reports/', glob="**/*.pdf")
txt_loader = DirectoryLoader('./Reports/', glob="**/*.txt")
word_loader = DirectoryLoader('./Reports/', glob="**/*.docx")

loaders = [pdf_loader, txt_loader, word_loader]
documents = []
for loader in loaders:
    documents.extend(loader.load())

print(f"Total number of documents: {len(documents)}")

Total number of documents: 1


## Text splitter

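Once the data is ingested, it needs to be split into smaller chunks. By default, LangChain uses Tiktoken to count tokens for OpenAI LLMs.

You can also use it to count tokens when splitting documents.

Here we split the text into chunks of about 1,000 characters with no overlap. Note that `CharacterTextSplitter` measures `chunk_size` in characters, not tokens, and only splits at its separator (a blank line by default), so a paragraph longer than the limit is kept whole and triggers the warnings in the output.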

In [3]:
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
documents = text_splitter.split_documents(documents)

Created a chunk of size 1239, which is longer than the specified 1000
Created a chunk of size 1078, which is longer than the specified 1000
Created a chunk of size 1036, which is longer than the specified 1000
Created a chunk of size 1311, which is longer than the specified 1000
Created a chunk of size 1019, which is longer than the specified 1000
Created a chunk of size 1371, which is longer than the specified 1000
Created a chunk of size 1019, which is longer than the specified 1000
Created a chunk of size 1028, which is longer than the specified 1000
Created a chunk of size 1895, which is longer than the specified 1000
Created a chunk of size 1302, which is longer than the specified 1000
Created a chunk of size 1381, which is longer than the specified 1000
Created a chunk of size 1152, which is longer than the specified 1000
Created a chunk of size 1038, which is longer than the specified 1000
Created a chunk of size 1011, which is longer than the specified 1000
Created a chunk of s
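
The warnings above appear because the splitter only cuts at its separator, so a single paragraph longer than `chunk_size` ends up as one oversized chunk. The following is a simplified pure-Python sketch of that behaviour, not LangChain's actual implementation; `split_on_separator` is a hypothetical helper and the sample text is made up.

```python
def split_on_separator(text, chunk_size=1000, separator="\n\n"):
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size characters.

    A piece that is itself longer than chunk_size is never cut, so it becomes an oversized chunk.
    """
    chunks, current = [], ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            # The piece still fits, so keep growing the current chunk
            current = candidate
        else:
            # The piece does not fit: close the current chunk and start a new one
            if current:
                chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


# Four "paragraphs"; the third one alone is longer than the limit
sample = "\n\n".join(["a" * 400, "b" * 400, "c" * 1200, "d" * 100])
chunk_lengths = [len(chunk) for chunk in split_on_separator(sample)]
print(chunk_lengths)
```

The first two paragraphs are merged into one chunk, while the 1,200-character paragraph is kept whole even though it exceeds the limit. If hard limits matter, a splitter that falls back to smaller separators is the usual alternative.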

## Embeddings

In [16]:
persist_dir = "chroma"
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(documents, embeddings, persist_directory=persist_dir)
vectorstore.persist()

Using embedded DuckDB with persistence: data will be stored in: chroma


## Conversational Retrieval Chain

LangChain's chains are reusable components that can be linked together. A chain is simply a pre-built sequence of actions that you can invoke with a single line of code. You don't need to call the GPT model directly or write the prompt yourself.

This particular chain lets you chat over the documents and takes the conversation history into account when answering.
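
Conceptually, each turn of such a chain does three things: it rewrites a follow-up question into a standalone one using the chat history, retrieves the most relevant chunks, and answers from those chunks. Here is a rough sketch of that flow with plain callables. Everything in it (`conversational_retrieval`, the `toy_*` functions, the two-sentence corpus) is a hypothetical stand-in for the LLM and the vector store, not the LangChain implementation.

```python
def conversational_retrieval(question, chat_history, condense, retrieve, answer):
    """One turn of a conversational retrieval pipeline built from three plain callables."""
    # 1. With no history the question is already standalone; otherwise rewrite it
    standalone = condense(question, chat_history) if chat_history else question
    # 2. Fetch the chunks most relevant to the standalone question
    docs = retrieve(standalone)
    # 3. Answer from the retrieved chunks
    return {"answer": answer(standalone, docs), "source_documents": docs}


# Toy stand-ins so the sketch runs without an API key (all hypothetical)
corpus = ["Earthquakes damaged roads and bridges.", "Shelters opened in several provinces."]

def toy_condense(question, chat_history):
    # A real chain asks the LLM to rephrase; here we just append the previous question
    return f"{question} (context: {chat_history[-1][0]})"

def toy_retrieve(query):
    # Keyword overlap instead of embedding similarity
    words = {w.strip("?.,()").lower() for w in query.split()}
    return [doc for doc in corpus if words & {w.strip(".").lower() for w in doc.split()}]

def toy_answer(question, docs):
    return docs[0] if docs else "I don't know."

first = conversational_retrieval(
    "What was damaged by earthquakes?", [], toy_condense, toy_retrieve, toy_answer
)
second = conversational_retrieval(
    "Where did they open?", [("What about shelters?", "...")], toy_condense, toy_retrieve, toy_answer
)
print(first["answer"])
print(second["answer"])
```

The second call only finds the right chunk because the previous question was folded into the query, which is the reason the chain needs the history at all.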

In [17]:
# Using the default OpenAI() LLM wrapper (less chatty)
#qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), return_source_documents=True)
# Using the more conversational ChatOpenAI() wrapper
qa = ConversationalRetrievalChain.from_llm(ChatOpenAI(temperature=0), vectorstore.as_retriever(), return_source_documents=True)

Test the QA Chain:

In [18]:
## Test QA chain
user_message = "What kind of disaster is the text talking about?"
history = []
response = qa({"question": user_message, "chat_history": history})
print(response["answer"])

The text is talking about earthquakes and their impact on various aspects of life, including infrastructure, services, and people with disabilities.


Here is the logic:

1. Initialize a `chat_history` variable as an empty list
2. Always pass the user question and the history to the chain
3. Append the question and the answer to the chat history
4. Repeat
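
The steps above can be sketched as a plain loop. `fake_qa` is a hypothetical stand-in for the `qa` chain so that the example runs offline; with the real chain the loop body would look the same.

```python
def fake_qa(inputs):
    """Hypothetical stand-in for the chain: reports how many earlier turns it received."""
    turns = len(inputs["chat_history"])
    return {"answer": f"Answer to '{inputs['question']}' given {turns} earlier turn(s)"}


chat_history = []  # step 1: start with an empty list of (question, answer) tuples
for user_message in ["What happened?", "Who was affected?"]:
    # step 2: pass the question together with the history so far
    response = fake_qa({"question": user_message, "chat_history": chat_history})
    # step 3: record the new turn
    chat_history.append((user_message, response["answer"]))
    # step 4: the loop repeats with the longer history

print(chat_history[-1][1])
```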

It is literally three lines of code. I wrapped it in a function only because the front end needs a callback.

In [19]:
# Front end web app
import gradio as gr
demo = gr.Blocks()
with demo:
    gr.Markdown(
        """
        # 🦜🔗 Ask Türkiye Humanitarian Response Bot!
        Start typing below to see the output.
        """
    )
    chatbot = gr.Chatbot()
    msg = gr.Textbox()
    clear = gr.Button("Clear")
    
    def user(user_message, history):
        # Format the list according to the expected input by ConversationalRetrievalChain
        history = [(item[0], item[1]) for item in history]
        # Get response from QA chain
        response = qa({"question": user_message, "chat_history": history})
        # Append user message and response to chat history
        history.append((user_message, response["answer"]))

        return gr.update(value=""), history
    
    msg.submit(user, inputs=[msg, chatbot], outputs=[msg, chatbot], queue=False)
    clear.click(lambda: None, None, chatbot, queue=False)

    demo.launch(debug=True)

Running on local URL:  http://127.0.0.1:7863

To create a public link, set `share=True` in `launch()`.


Keyboard interruption in main thread... closing server.


### Displaying the reference to the text source, but not including it in the internal QA history

In [None]:
# Front end web app
import gradio as gr
demo = gr.Blocks()
with demo:
    gr.Markdown(
        """
        # 🦜🔗 Ask Türkiye Humanitarian Response Bot!
        Start typing below to see the output.
        """
    )
    chatbot = gr.Chatbot()
    #output = gr.Textbox()
    msg = gr.Textbox()
    clear = gr.Button("Clear")
    
    def user(user_message, history):
        # Format the list according to the expected input by ConversationalRetrievalChain
        # Internal history for the QA chain: drop the trailing "Source: ..." line but keep multi-line answers intact
        history_qa = [(item[0], item[1].rsplit('\nSource: ', 1)[0]) for item in history]
        history = [(item[0], item[1]) for item in history] # whole history to display in the Chatbot
        # Get response from QA chain
        response = qa({"question": user_message, "chat_history": history_qa})
        # Get the source document reference
        src = response['source_documents'][0].metadata['source']
        # Append user message and response to chat history
        history.append((user_message, response["answer"] + '\n' + "Source: " + src))

        return gr.update(value=""), history
    
    msg.submit(user, inputs=[msg, chatbot], outputs=[msg, chatbot], queue=False)
    clear.click(lambda: None, None, chatbot, queue=False)

    demo.launch(debug=True)
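
The idea in this cell is that the chat window shows the answer plus a source line, while the chain only sees the answer itself. A minimal sketch of that round trip, assuming the `"\nSource: "` suffix format used above; `add_source` and `strip_source` are hypothetical helper names and the file path is made up.

```python
SOURCE_MARKER = "\nSource: "


def add_source(answer, source):
    """Text shown in the chat window: the answer followed by its source reference."""
    return answer + SOURCE_MARKER + source


def strip_source(displayed):
    """Text passed back to the chain: everything before the last source marker."""
    return displayed.rsplit(SOURCE_MARKER, 1)[0]


answer = "Roads were damaged.\nBridges were closed."
shown = add_source(answer, "Reports/example.pdf")  # hypothetical path
print(strip_source(shown) == answer)
```

Splitting on the last marker keeps multi-line answers intact, whereas keeping only the first line of the displayed text would truncate them.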

### Using chat memory buffer

In [20]:
# With chat memory (the history is not passed explicitly)
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
qa = ConversationalRetrievalChain.from_llm(ChatOpenAI(temperature=0), vectorstore.as_retriever(), memory=memory)

# Front end web app
import gradio as gr
demo = gr.Blocks()
with demo:
    gr.Markdown(
        """
        # 🦜🔗 Ask Türkiye Humanitarian Response Bot!
        Start typing below to see the output.
        """
    )
    chatbot = gr.Chatbot()
    msg = gr.Textbox()
    clear = gr.Button("Clear")
    
    def user(user_message, history):
        # Format the list according to the expected input by ConversationalRetrievalChain
        history = [(item[0], item[1]) for item in history]
        # Get response from QA chain (history is not passed here; the memory buffer already holds it)
        response = qa({"question": user_message})
        # Keep the same output as before to avoid an error in Gradio; the explicit history is not used by the QA chain
        history.append((user_message, response["answer"]))
        return gr.update(value=""), history
    
    msg.submit(user, inputs=[msg, chatbot], outputs=[msg, chatbot], queue=False)
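    def reset():
        # Also empty the memory buffer; otherwise the chain keeps the old conversation
        memory.clear()
        return None

    clear.click(reset, None, chatbot, queue=False)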

    demo.launch(debug=True)

Running on local URL:  http://127.0.0.1:7863

To create a public link, set `share=True` in `launch()`.


Keyboard interruption in main thread... closing server.
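
Conceptually, a buffer memory keeps every turn and hands the accumulated history to the chain on each call, which is why the callback above passes only the question. Below is a simplified sketch of that idea; `SimpleBufferMemory`, `ask` and `echo_chain` are hypothetical names, not LangChain classes. It also shows why clearing only the chat window is not enough: the buffer keeps the old turns until it is cleared as well.

```python
class SimpleBufferMemory:
    """Hypothetical, simplified conversation buffer: it stores every turn verbatim."""

    def __init__(self):
        self.turns = []

    def load(self):
        # Return a copy so callers cannot modify the buffer by accident
        return list(self.turns)

    def save(self, question, answer):
        self.turns.append((question, answer))

    def clear(self):
        self.turns.clear()


def ask(question, memory, qa_fn):
    """Inject the buffered history, call the chain, then record the new turn."""
    response = qa_fn({"question": question, "chat_history": memory.load()})
    memory.save(question, response["answer"])
    return response["answer"]


def echo_chain(inputs):
    # Stand-in for the real chain: reports how much history it was given
    return {"answer": f"{len(inputs['chat_history'])} earlier turn(s)"}


memory = SimpleBufferMemory()
replies = [ask("first", memory, echo_chain), ask("second", memory, echo_chain)]
memory.clear()  # without this, the next question would still see two earlier turns
replies.append(ask("third", memory, echo_chain))
print(replies)
```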
