[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/langchain/rag-chatbot.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/learn/generation/langchain/rag-chatbot.ipynb)

# Building RAG Chatbots with LangChain & OpenAI


In this example, we'll guide you through building an AI chatbot from start to finish, using LangChain, OpenAI, and Chroma Vector Database. The chatbot will utilize **Retrieval Augmented Generation (RAG)** to enhance its responses by retrieving relevant information from external sources.

We'll work with a sample document as part of the knowledge base.

By the end of this tutorial, you'll have a fully functional chatbot integrated with a RAG pipeline, capable of holding meaningful conversations and providing informative responses based on the knowledge base.

### Before you begin

You'll need to get an [OpenAI API key](https://platform.openai.com/account/api-keys).

### Prerequisites

Before we start building our chatbot, we need to install some Python libraries. Here's a brief overview of what each library does:

- **langchain**: A framework for building applications on top of large language models. We'll use it to chain together the language model and the other components of our chatbot. The **langchain-community** and **langchain-openai** packages provide its integrations with third-party tools and with OpenAI.
- **openai**: This is the official OpenAI Python client. We'll use it to interact with the OpenAI API and generate responses for our chatbot.
- **pypdf**: A PDF parsing library. LangChain's `PyPDFLoader` uses it to load the document that forms our knowledge base.
- **tiktoken**: OpenAI's tokenizer library, used when preparing text for the OpenAI embedding model.
- **chromadb**: A fast, open-source vector database for managing embeddings in machine learning applications.

You can install these libraries using pip like so:

In [None]:
!pip install langchain openai chromadb tiktoken langchain-community langchain-openai pypdf

In [2]:
import os
from langchain_openai import ChatOpenAI  # ChatOpenAI now lives in the langchain-openai package

### Building a Chatbot (no RAG)

We'll use the LangChain library to integrate the various components required for our chatbot. To start, we'll build a basic chatbot without RAG by initializing a `ChatOpenAI` object. This sets the foundation before we enhance it with retrieval. You can obtain a key from the [OpenAI API keys page](https://platform.openai.com/account/api-keys). For workshops, you can use the key we share with you.

In [3]:
from google.colab import userdata

os.environ["OPENAI_API_KEY"] = userdata.get('OpenAI')

chat = ChatOpenAI(
    openai_api_key=os.environ["OPENAI_API_KEY"],
    model='gpt-3.5-turbo',
    temperature=0,
    max_tokens=None,
    timeout=None,
    #max_retries=2,
)
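
The cell above reads the key from Colab's `userdata` store, which is only available inside Google Colab. Outside Colab, one option is to read the key from an environment variable instead. The following is a minimal sketch of that idea; the helper name `load_api_key` is hypothetical and is not part of LangChain or the OpenAI client.

```python
import os
from typing import Mapping, Optional


def load_api_key(name: str = "OPENAI_API_KEY",
                 env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key stored under `name`, or raise a clear error if it is missing."""
    # Default to the real process environment; a plain dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the notebook.")
    return key
```

With the variable exported in your shell, `os.environ["OPENAI_API_KEY"] = load_api_key()` could stand in for the `userdata.get(...)` line.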


Recalling our last tutorial, chats with OpenAI's `gpt-3.5-turbo` and `gpt-4` chat models are typically structured (in plain text) like this:

```
System: You are a helpful tutor.

User: Hi AI, how are you today?

Assistant: I'm great thank you. How can I help you?

User: I'd like to understand predictive analytics.

Assistant:
```

The final `"Assistant:"` without a response is what would prompt the model to continue the conversation. In the official OpenAI `ChatCompletion` endpoint these would be passed to the model in a format like:

```python
[
    {"role": "system", "content": "You are a helpful tutor"},
    {"role": "user", "content": "Hi AI, how are you today?"},
    {"role": "assistant", "content": "I'm great thank you. How can I help you?"},
    {"role": "user", "content": "I'd like to understand predictive analytics."}
]
```
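
To make the correspondence between the two formats concrete, here is a small sketch that renders a list of role/content dictionaries as the plain-text transcript shown earlier. It is purely illustrative: the helper `render_transcript` is hypothetical, and the API itself takes the list of dictionaries directly.

```python
from typing import Dict, List

# Map API role names to the labels used in the plain-text transcript.
ROLE_LABELS = {"system": "System", "user": "User", "assistant": "Assistant"}


def render_transcript(messages: List[Dict[str, str]]) -> str:
    """Render role/content dicts as a transcript that ends with an open 'Assistant:' turn."""
    lines = [f"{ROLE_LABELS[m['role']]}: {m['content']}" for m in messages]
    lines.append("Assistant:")  # the empty turn the model is asked to complete
    return "\n\n".join(lines)


example_messages = [
    {"role": "system", "content": "You are a helpful tutor."},
    {"role": "user", "content": "Hi AI, how are you today?"},
    {"role": "assistant", "content": "I'm great thank you. How can I help you?"},
    {"role": "user", "content": "I'd like to understand predictive analytics."},
]
print(render_transcript(example_messages))
```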

In LangChain there is a slightly different format. We use three _message_ objects like so:

In [4]:
from langchain.schema import (
    SystemMessage,
    HumanMessage,
    AIMessage
)

messages = [
    SystemMessage(content="You are a helpful tutor."),
    HumanMessage(content="Hi AI, how are you today?"),
    AIMessage(content="I'm great thank you. How can I help you?"),
    HumanMessage(content="I'd like to understand predictive analytics.")
]

The format is very similar; we've just swapped the role of `"user"` for `HumanMessage`, and the role of `"assistant"` for `AIMessage`.

We generate the next response from the AI by passing these messages to the `ChatOpenAI` object.

In [5]:
res = chat.invoke(messages)
res.content

'Predictive analytics is a branch of advanced analytics that uses data, statistical algorithms, and machine learning techniques to identify the likelihood of future outcomes based on historical data. It involves analyzing current and historical data to make predictions about future events or trends.\n\nPredictive analytics can be used in various industries and applications, such as forecasting sales, predicting customer behavior, optimizing marketing campaigns, detecting fraud, and improving operational efficiency. By leveraging predictive analytics, organizations can make more informed decisions, anticipate future trends, and gain a competitive advantage.\n\nThe process of predictive analytics typically involves the following steps:\n\n1. Define the problem: Clearly define the business problem or question that you want to address with predictive analytics.\n\n2. Data collection: Gather relevant data from various sources, such as databases, spreadsheets, and external sources.\n\n3. Dat

In response we get another `AIMessage` object, whose text is available in its `content` attribute.

Because `res` is just another `AIMessage` object, we can append it to `messages`, add another `HumanMessage`, and generate the next response in the conversation.

In [6]:
# add latest AI response to messages
messages.append(res)

# now create a new user prompt
prompt = HumanMessage(
    content="Why do researchers believe it can advance the decision making in organisations?"
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat.invoke(messages)

print(res.content)


Researchers believe that predictive analytics can advance decision-making in organizations for several reasons:

1. Data-driven insights: Predictive analytics leverages data and statistical algorithms to provide organizations with valuable insights into past trends and future outcomes. By analyzing historical data and identifying patterns, organizations can make more informed decisions based on evidence rather than intuition.

2. Anticipating future trends: Predictive analytics enables organizations to forecast future trends and outcomes, allowing them to proactively address potential challenges and opportunities. By predicting customer behavior, market trends, and operational performance, organizations can stay ahead of the competition and adapt to changing circumstances.

3. Improved accuracy and efficiency: Predictive analytics can help organizations improve the accuracy and efficiency of their decision-making processes. By using advanced algorithms to analyze data and make predicti

### Dealing with Hallucinations

We now have our chatbot, but as mentioned earlier, the knowledge of large language models (LLMs) can be limited. This limitation arises because LLMs acquire all their knowledge during the training phase. Essentially, an LLM compresses the information from its training data into its internal parameters, known as the model’s parametric knowledge.

By default, LLMs don't have real-time access to external information.

This becomes evident when we ask LLMs about recent events, such as the latest Australia Budget 2025, where they are unable to provide up-to-date insights without external augmentation.

In [7]:
# add latest AI response to messages
messages.append(res)

# now create a new user prompt
prompt = HumanMessage(
    content="What are the highlights of the Australia Budget 2025?"
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat.invoke(messages)

In [8]:
print(res.content)

I'm sorry, but as an AI, I do not have real-time information or the ability to access current news updates. I recommend checking the latest news sources or the official government website for Australia to get the most up-to-date information on the Australia Budget for 2025. Is there anything else I can help you with?


Our chatbot can no longer help us because it doesn't contain the information we need to answer the question. It is very clear from this answer that the LLM doesn't know the information, but sometimes an LLM may respond as if it _does_ know the answer, and this can be very hard to detect.

OpenAI have since adjusted the behavior for this particular example as we can see below:

In [9]:
# add latest AI response to messages
messages.append(res)

# now create a new user prompt
prompt = HumanMessage(
    content="Can you tell more about Easing Cost of Living Pressures in Australia Budget 2025?"
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat.invoke(messages)

In [10]:
print(res.content)

I'm sorry for the confusion, but as an AI, I do not have real-time information or the ability to access specific details about the Australia Budget for 2025. However, I can provide some general strategies that governments may consider to ease the cost of living pressures for citizens:

1. Tax relief: Governments may introduce tax cuts or adjustments to reduce the tax burden on individuals and households, allowing them to keep more of their income.

2. Social welfare programs: Increasing funding for social welfare programs such as unemployment benefits, housing assistance, and childcare subsidies can help alleviate financial strain on low-income families.

3. Affordable housing initiatives: Implementing policies to increase the supply of affordable housing and improve housing affordability can help reduce housing costs for individuals and families.

4. Energy cost relief: Introducing measures to reduce energy costs, such as subsidies for renewable energy sources or energy-efficient appl

Another method of providing knowledge to LLMs is through *source knowledge*, which refers to any information supplied via the prompt. This allows the model to work with fresh or specific data that wasn't part of its training. We can demonstrate this by pasting a short excerpt about the Australia Budget 2025 directly into the prompt.




In [11]:
source_knowledge = """
All 13.6 million Australian taxpayers will get a tax cut, with an average tax cut of $1,888 or
$36 a week
- $3.5 billion in energy bill relief for all Australian households and one million small businesses
- $1.9 billion to increase Commonwealth Rent Assistance by a further 10 per cent, benefiting
nearly 1 million households
- Cheaper medicines as part of the up to $3 billion agreement with community pharmacies
- Waiving $3 billion in student debt for more than 3 million Australians to make student
loans fairer
- Getting consumers a better deal at the supermarket checkout and through the
energy transition
- $1.1 billion to pay superannuation on Government-funded Paid Parental Leave
- $138 million to boost funding for emergency and food relief and financial support services
- Supporting wages growth through submissions to the Fair Work Commission and supporting
pay rises for care sector workers
- Extending the freeze on deeming rates for 876,000 income support recipients
"""

We can feed this additional knowledge into our prompt with some instructions telling the LLM how we'd like it to use this information alongside our original query.

In [12]:
query = "Can you tell more about Easing Cost of Living Pressures in Australia Budget 2025?"

augmented_prompt = f"""Using the context below, answer the query.

Context:
{source_knowledge}

Query: {query}"""

In [13]:
augmented_prompt

'Using the context below, answer the query.\n\nContext:\n\nAll 13.6 million Australian taxpayers will get a tax cut, with an average tax cut of $1,888 or\n$36 a week\n- $3.5 billion in energy bill relief for all Australian households and one million small businesses\n- $1.9 billion to increase Commonwealth Rent Assistance by a further 10 per cent, benefiting\nnearly 1 million households\n- Cheaper medicines as part of the up to $3 billion agreement with community pharmacies\n- Waiving $3 billion in student debt for more than 3 million Australians to make student\nloans fairer\n- Getting consumers a better deal at the supermarket checkout and through the\nenergy transition\n- $1.1 billion to pay superannuation on Government-funded Paid Parental Leave\n- $138 million to boost funding for emergency and food relief and financial support services\n- Supporting wages growth through submissions to the Fair Work Commission and supporting\npay rises for care sector workers\n- Extending the free

Now we feed this into our chatbot as we did before.

In [14]:
# create a new user prompt
prompt = HumanMessage(
    content=augmented_prompt
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat.invoke(messages)

In [15]:
print(res.content)

In the Australia Budget 2025, there are several measures aimed at easing cost of living pressures for Australian households. Some of the key initiatives include:

1. Tax cuts for all Australian taxpayers: All 13.6 million Australian taxpayers will receive a tax cut, with an average tax cut of $1,888 or $36 a week. This measure aims to put more money back into the pockets of individuals and families to help alleviate financial burdens.

2. Energy bill relief: The budget includes $3.5 billion in energy bill relief for all Australian households and one million small businesses. This initiative is designed to reduce the cost of energy bills for consumers and businesses, thereby easing financial strain.

3. Increase in Commonwealth Rent Assistance: The budget allocates $1.9 billion to increase Commonwealth Rent Assistance by a further 10 per cent, benefiting nearly 1 million households. This increase in rental assistance aims to help low-income households afford housing and reduce housing c

The quality of this response is remarkable, made possible by augmenting our query with external knowledge, known as source knowledge. However, there's one challenge — how do we acquire this information in the first place?

This is where vector databases come in, as we explored in previous chapters. They can assist us in storing and retrieving relevant information. But before we dive in, we’ll need to start with a dataset.

### Importing the Data

In this task, we’ll import our data manually, using the Australia Budget 2025 document as the external knowledge source. This document will serve as the knowledge base for our chatbot, enabling it to provide accurate and up-to-date responses regarding the latest budget details. This approach will demonstrate how external information can be integrated to improve the chatbot’s capabilities.

In [16]:
from langchain_community.document_loaders import TextLoader
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter

In [17]:
# Load the document
#loader = TextLoader("/content/test.txt")
#documents = loader.load()

In [18]:
loader = PyPDFLoader("/content/AusBudget-2025-26.pdf")
documents = loader.load()

In [19]:
# len(documents)

In [20]:
# Write code to load all the documents and check the behavior

In [21]:
# Split the text into chunks
text_splitter = CharacterTextSplitter(chunk_size=250, chunk_overlap=0)
dataset = text_splitter.split_documents(documents)


Chunk size is a parameter that you have to tune: it sets the target size, in characters, of the text chunks in the dataset. If the chunk size is too small, the chunks will contain only partial information. If the chunk size is too large, the retrieved chunks will use up the context window quickly.

In [22]:
from langchain_core.documents import Document

text_splitter = CharacterTextSplitter(chunk_size=10, chunk_overlap=0)
# text_splitter = CharacterTextSplitter(chunk_size=40, chunk_overlap=0)

document1 = Document(
            page_content="I have a dog \n\n His name is Cooper \n\n He is fun",
        )

text_splitter.split_documents([document1])



[Document(metadata={}, page_content='I have a dog'),
 Document(metadata={}, page_content='His name is Cooper'),
 Document(metadata={}, page_content='He is fun')]
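
Note that every chunk in the output above is longer than `chunk_size=10`. This is consistent with a splitter that only cuts at its separator (a blank line by default for `CharacterTextSplitter`) and then merges neighbouring pieces while they fit within the limit. The sketch below is a simplified pure-Python re-creation of that idea, not LangChain's actual implementation; the function name `split_and_merge` is hypothetical.

```python
from typing import List


def split_and_merge(text: str, chunk_size: int, separator: str = "\n\n") -> List[str]:
    """Split on `separator`, then greedily merge neighbouring pieces up to `chunk_size` characters."""
    pieces = [p.strip() for p in text.split(separator) if p.strip()]
    chunks: List[str] = []
    current = ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if current and len(candidate) > chunk_size:
            # Adding this piece would exceed the limit, so close the current chunk.
            chunks.append(current)
            current = piece
        else:
            # A single piece is kept whole even when it is longer than chunk_size.
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = "I have a dog \n\n His name is Cooper \n\n He is fun"
print(split_and_merge(text, chunk_size=10))  # three chunks, one per piece
print(split_and_merge(text, chunk_size=40))  # the first two pieces are merged
```

In this sketch, raising the limit from 10 to 40 lets the first two pieces share a chunk, which is the effect to look for when tuning the chunk size on the real document.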

In [23]:
# print(dataset[9].page_content)

#### Dataset Overview

The dataset we are using is sourced from the Australia Budget 2025 document. This document provides the latest details on Australia's fiscal plans and economic strategies. Each entry in our knowledge base represents a "chunk" of relevant information extracted from this document.

Since most Large Language Models (LLMs) only contain knowledge from their training period, they cannot answer questions about the Australia Budget 2025 — at least not without this external data.

### Building the Knowledge Base

We now have a dataset that can serve as our chatbot knowledge base. Our next task is to transform that dataset into the knowledge base that our chatbot can use. To do this we must use an embedding model and vector database.


With ChromaDB there is no separate index specification to set up. Because it is open-source and lightweight, it runs locally and deployment is straightforward, without the need for a cloud provider or region specification.

###  Creation of Vector Database using Embeddings

The code snippet initializes the embedding model using LangChain's OpenAIEmbeddings class, which leverages OpenAI's API to convert text data into vector representations (embeddings). These embeddings represent the meaning of the text in a high-dimensional vector space, making it easier for the model to perform similarity searches, clustering, and other tasks on textual data.



We will be using OpenAI's `text-embedding-ada-002` model to create the embeddings, which produces vectors of dimension `1536`. In the code that follows we never declare this dimension ourselves.

Using this model we can create embeddings like so:

In [24]:
# Create embeddings
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()


In [25]:
# Create embeddings for a sample text
text = "Australia Budget 2025 focuses on economic growth and sustainability."
vector = embeddings.embed_query(text)
# vector
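
Similarity search compares vectors like the one above with a distance or similarity measure. Cosine similarity is one common choice; the sketch below computes it in plain Python on made-up three-dimensional vectors. The vectors are hypothetical stand-ins (real embeddings here have 1536 dimensions), and the metric a given vector store actually uses may differ.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings" for three topics.
budget = [0.9, 0.1, 0.0]
tax = [0.8, 0.2, 0.1]
pets = [0.0, 0.1, 0.9]

print(cosine_similarity(budget, tax))   # close to 1: related topics
print(cosine_similarity(budget, pets))  # close to 0: unrelated topics
```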

### Chroma Vector Database

**Chroma**: By default, Chroma is an in-memory vector database that stores embeddings of the documents. It operates in memory unless explicitly configured to persist data to disk or another storage backend.

**In-memory storage:** This means that the vector store, which stores the embeddings of the documents, exists in your computer's RAM during runtime. Once you stop the program, the data is lost unless you've configured Chroma to save it.

`Chroma.from_documents(dataset, embeddings)` takes two arguments:

* dataset: This refers to the documents for which you want to generate and store embeddings.
* embeddings: This is the embedding model (like OpenAIEmbeddings), which is used to convert the documents into vector representations (embeddings).

Chroma will take the embeddings of the documents and store them in-memory for fast retrieval and similarity search.

In [26]:
# Create a vector store
from langchain_community.vectorstores import Chroma
vectorstore = Chroma.from_documents(dataset, embeddings)
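
To illustrate what an in-memory vector store does conceptually, here is a toy version that keeps text and vectors in a Python list and ranks them by cosine similarity. Everything in it is an assumption made for illustration: `ToyVectorStore`, `toy_embed`, and the three-word vocabulary are hypothetical and unrelated to Chroma's real internals.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical three-word vocabulary for a toy "embedding"; real models learn dense vectors.
VOCAB = ("tax", "energy", "rent")


def toy_embed(text: str) -> List[float]:
    """Count how often each vocabulary word appears in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Keeps (text, vector) rows in RAM; they vanish when the process ends."""

    def __init__(self, embed: Callable[[str], Sequence[float]]):
        self._embed = embed
        self._rows: List[Tuple[str, Sequence[float]]] = []

    @classmethod
    def from_texts(cls, texts: List[str], embed: Callable[[str], Sequence[float]]):
        store = cls(embed)
        for text in texts:
            store._rows.append((text, embed(text)))
        return store

    def similarity_search(self, query: str, k: int = 4) -> List[str]:
        query_vector = self._embed(query)
        ranked = sorted(self._rows, key=lambda row: _cosine(query_vector, row[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore.from_texts(
    ["tax cut for every taxpayer", "energy bill relief", "rent assistance increase"],
    toy_embed,
)
print(store.similarity_search("energy relief", k=1))  # → ['energy bill relief']
```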

### Creation of Vector Store Retriever

**Creating a Retriever:** The retriever is an object that allows us to search through a vector store (which stores embeddings of documents or text chunks) and find the most relevant documents or text based on a given query. The retriever’s job is to return a subset of documents that best match the query based on similarity of their embeddings.

**Setting top_k = 5:** The variable top_k is set to 5, which defines the number of results (documents or text chunks) the retriever should return from the vector store. In this case, when you perform a search or query using the retriever, it will return the top 5 most similar documents.

**Reason for using k:** In machine learning, k is commonly used to represent the number of nearest neighbors or results that should be returned from a search. When we perform a search in a vector database, the system compares the embedding of the query with the stored document embeddings to find the closest matches.

**By specifying k = 5,** we limit the results to only the top 5 most similar embeddings. This is useful because retrieving too many results may introduce noise, while retrieving too few may omit valuable information. The choice of k balances relevance and result quantity, helping to ensure that the top 5 most relevant results are retrieved for the query, offering better performance and accuracy for tasks like question-answering or document search.

**vectorstore.as_retriever(search_kwargs={"k": top_k}):** This line converts the vector store into a retriever object by passing search_kwargs={"k": top_k}. The search_kwargs argument allows you to specify additional search parameters for the retriever—in this case, limiting the number of results to the top 5 using the k value.

**Why is k Important?**
* **Efficiency**: Limiting the number of retrieved documents improves efficiency. You don't need to process or rank too many results, which could slow down response time.
* **Relevance**: Instead of returning all possible matches, we retrieve only the top k (5 in this case) results, ensuring that the returned documents are the most relevant and not overwhelming the system with irrelevant data.
* **Performance**: Focusing on a smaller number of high-quality matches makes it easier to integrate with downstream systems like question-answering modules, which can then provide more accurate responses without unnecessary computation.

In short, k helps fine-tune the retrieval process, ensuring efficiency and relevance in returning a manageable number of top results.
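
The trade-off can be seen on a small list of scored chunks. In the sketch below the chunk labels and similarity scores are invented for illustration, and `top_k_chunks` is a hypothetical helper, not a LangChain function.

```python
import heapq
from typing import List, Tuple


def top_k_chunks(scored_chunks: List[Tuple[str, float]], k: int) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (chunk, score) pairs, best first."""
    return heapq.nlargest(k, scored_chunks, key=lambda pair: pair[1])


# Made-up similarity scores for a cost-of-living query.
scored = [
    ("tax cuts", 0.91),
    ("table of contents", 0.12),
    ("energy bill relief", 0.88),
    ("defence spending", 0.41),
    ("rent assistance", 0.84),
]

print(top_k_chunks(scored, k=3))  # only the three closely related chunks
print(top_k_chunks(scored, k=5))  # a larger k also pulls in weakly related chunks
```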

In [27]:
# Create a retriever
top_k = 5
retriever = vectorstore.as_retriever(search_kwargs={"k": top_k})

In [28]:
query_text= "Can you tell more about Easing Cost of Living Pressures in Australia Budget 2025?"

In [None]:
# Get the top_k (5) most relevant chunks based on the query
relevant_chunks = retriever.invoke(query_text)

for idx, chunk in enumerate(relevant_chunks, start=1):
    print(f"Chunk {idx}:\n{chunk.page_content}\n")

### Retrieval Augmented Generation

We've built a fully-fledged knowledge base. Now it's time to connect that knowledge base to our chatbot. To do that we'll be diving back into LangChain and reusing our template prompt from earlier.

To use LangChain here we need the LangChain abstraction for a vector index, called a `vectorstore`. We already have one: the `vectorstore` object we created with `Chroma.from_documents`.

Using this `vectorstore` we can already query the index and see if we have any relevant information given our question about Australia Budget 2025.

`vectorstore.similarity_search`: This function performs a similarity search on the vector store (created using embeddings) based on the query you provide. It finds the documents that are most similar to the query by comparing their vector embeddings.

Going forward, we will use the `vectorstore.similarity_search` function, which is similar to the retriever described above.

| Feature                          | `vectorstore.similarity_search(query, k=3)`              | `retriever = vectorstore.as_retriever(search_kwargs={"k": top_k})`   |
| --------------------------------- | ------------------------------------------------------- | ------------------------------------------------------------------- |
| **Purpose**                       | Direct similarity search on a query                     | Converts vector store into a retriever for broader pipeline usage   |
| **Integration**                   | Standalone search operation                             | Integrates with LangChain workflows (e.g., RAG, LLMChain, etc.)     |
| **Usage**                         | For simple retrieval tasks                              | For more flexible or complex retrieval tasks in larger workflows    |
| **Output**                        | Returns the top `k` similar documents directly           | Returns a retriever object to use within other processes            |
| **Flexibility**                   | Limited to direct searches                              | Highly flexible for use in pipelines or more advanced workflows     |
| **Configuration**                 | Specify `k` directly in the function                    | Configure `k` in `search_kwargs`, part of a retriever configuration |


In [None]:
query = "Can you tell more about Easing Cost of Living Pressures in Australia Budget 2025?"

vectorstore.similarity_search(query, k=3)

We return a lot of text here and it's not that clear what we need or what is relevant. Fortunately, our LLM will be able to parse this information much faster than us. All we need is to connect the output from our `vectorstore` to our `chat` chatbot. To do that we can use the same logic as we used earlier.

In [31]:
def augment_prompt(query: str):
    # fetch the 3 chunks most similar to the query
    results = vectorstore.similarity_search(query, k=3)

    # get the text from the results
    source_knowledge = "\n".join([x.page_content for x in results])
    # feed into an augmented prompt
    augmented_prompt = f"""Using the contexts below, answer the query.

    Contexts:
    {source_knowledge}

    Query: {query}"""
    return augmented_prompt

Using this we produce an augmented prompt:

In [None]:
print(augment_prompt(query))

There is still a lot of text here, so let's pass it on to our chat model to see how it performs.

In [33]:
# create a new user prompt
prompt = HumanMessage(
    content=augment_prompt(query)
)
# add to messages
messages.append(prompt)

res = chat.invoke(messages)

print(res.content)

The Australia Budget 2025 includes several measures aimed at easing cost of living pressures for Australian households. Some of the key initiatives outlined in the budget to provide relief to Australians include:

1. Tax cuts for all Australian taxpayers: All 13.6 million Australian taxpayers will receive a tax cut, with an average tax cut of $1,888 or $36 a week. This measure aims to put more money back into the pockets of individuals and families, reducing their financial burden.

2. Energy bill relief: The budget allocates $3.5 billion in energy bill relief for all Australian households and one million small businesses. This initiative aims to reduce the cost of energy bills for consumers, providing financial relief to households and businesses.

3. Increase in Commonwealth Rent Assistance: The budget includes $1.9 billion to increase Commonwealth Rent Assistance by a further 10%, benefiting nearly 1 million households. This measure aims to help renters cope with rising housing cost

We can continue with more questions. Let's try _without_ RAG first:

In [34]:
prompt = HumanMessage(
    content="what are the key highlights of the Australia budget 2025"
)

res = chat.invoke(messages + [prompt])
print(res.content)

Based on the provided context, the key highlights of the Australia Budget 2025 in terms of easing cost of living pressures include:

1. Tax Cuts: All 13.6 million Australian taxpayers will receive a tax cut, with an average tax cut of $1,888 or $36 a week. This tax relief aims to reduce the financial burden on individuals and families.

2. Energy Bill Relief: The budget includes $3.5 billion in energy bill relief for all Australian households and one million small businesses. This measure is intended to alleviate the rising costs of energy for consumers and businesses.

3. Rent Assistance Increase: An allocation of $1.9 billion is designated to increase Commonwealth Rent Assistance by a further 10%, benefiting nearly 1 million households. This increase aims to assist renters in managing their housing costs.

4. Cheaper Medicines: The budget includes measures to provide cheaper medicines as part of an agreement with community pharmacies, with an allocation of up to $3 billion. This init

The chatbot is able to respond thanks to its conversational history stored in `messages`, which still contains the context from our earlier augmented prompts. However, it can only draw on that earlier context, as we have not provided it with any new information via the RAG pipeline. Let's try again but with RAG.

In [35]:
prompt = HumanMessage(
    content=augment_prompt(
        "What measures were taken to strengthen the economy?"
    )
)
res = chat.invoke(messages + [prompt])
print(res.content)

The measures taken to strengthen the economy as outlined in the contexts include:

1. Delivering responsible cost-of-living relief: The government is providing new personal income tax cuts for every Australian taxpayer, extending energy bill relief, advocating for wage increases, banning non-compete clauses for most workers, funding pay rises for aged care workers and early childhood educators, reducing the cost of medicines, and helping consumers get a better deal. These measures aim to ease the financial burden on individuals and households, thereby supporting economic growth.

2. Strengthening Medicare: The government is making it easier for Australians to see a doctor for free by expanding eligibility for bulk billing incentives to cover all Australians, investing in hospitals and urgent care clinics, investing in women's health, and training more doctors and nurses. These investments in healthcare infrastructure and services contribute to improving public health outcomes and suppo

We get a much more informed response that includes several items missing in the previous non-RAG response, such as banning non-compete clauses for most workers, strengthening Medicare, and expanding eligibility for bulk billing incentives.

---

### Quick Prototype of a ChatBot

[Gradio](https://www.gradio.app/guides/creating-a-chatbot-fast) is a powerful tool that allows you to easily build and share machine learning applications with an intuitive user interface. It enables real-time interaction with your AI models, making it perfect for building live experiences, such as chatbots.

In this live demo, we’ve used Gradio to create an interactive interface for our AI chatbot, allowing you to engage in meaningful conversations and see the system retrieve and generate responses instantly.

Here’s a quick example of how you can integrate Gradio with a chatbot:

In [None]:
!pip install gradio

In [None]:
from langchain_openai import ChatOpenAI
from langchain.schema import AIMessage, HumanMessage
import gradio as gr

def predict(message, history):
    # convert Gradio's (user, assistant) history pairs into LangChain messages
    history_langchain_format = []
    for human, ai in history:
        history_langchain_format.append(HumanMessage(content=human))
        history_langchain_format.append(AIMessage(content=ai))
    # add the latest user message and query the `chat` model created earlier
    history_langchain_format.append(HumanMessage(content=message))
    gpt_response = chat.invoke(history_langchain_format)
    return gpt_response.content

gr.ChatInterface(predict).launch()
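
The `predict` function above assumes that Gradio passes `history` as a list of (user, assistant) pairs. Depending on the Gradio version and the `ChatInterface` settings, the history may instead arrive as a list of dictionaries with `role` and `content` keys. The sketch below normalizes both shapes into a single list of (role, text) turns; it is a hedged illustration for text-only chats, and `normalize_history` is a hypothetical helper rather than part of Gradio.

```python
from typing import Any, List, Tuple


def normalize_history(history: List[Any]) -> List[Tuple[str, str]]:
    """Convert either history shape into a flat list of (role, text) turns."""
    turns: List[Tuple[str, str]] = []
    for item in history:
        if isinstance(item, dict):
            # dictionary style: {"role": "user" | "assistant", "content": "..."}
            turns.append((item["role"], item["content"]))
        else:
            # pair style: (user_text, assistant_text)
            human, ai = item
            turns.append(("user", human))
            if ai is not None:
                turns.append(("assistant", ai))
    return turns


pairs = [("Hi", "Hello! How can I help?")]
dicts = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello! How can I help?"},
]
print(normalize_history(pairs) == normalize_history(dicts))  # → True
```

Inside `predict`, each normalized turn could then be mapped to `HumanMessage` or `AIMessage` according to its role.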