[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/langchain/rag-chatbot.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/learn/generation/langchain/rag-chatbot.ipynb)

# Building RAG Chatbots with LangChain & OpenAI


In this example, we'll guide you through building an AI chatbot from start to finish, using LangChain, OpenAI, and Chroma Vector Database. The chatbot will utilize **Retrieval Augmented Generation (RAG)** to enhance its responses by retrieving relevant information from external sources.

We'll work with a sample document as part of the knowledge base.

By the end of this tutorial, you'll have a fully functional chatbot integrated with a RAG pipeline, capable of holding meaningful conversations and providing informative responses based on the knowledge base.

### Before you begin

You'll need to get an [OpenAI API key](https://platform.openai.com/account/api-keys).

### Prerequisites

Before we start building our chatbot, we need to install some Python libraries. Here's a brief overview of what each library does:

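- **langchain**: A framework for building applications on top of large language models. We'll use it to chain together the model, prompts, and retrieval components of our chatbot.
- **openai**: This is the official OpenAI Python client. We'll use it to interact with the OpenAI API and generate responses for our chatbot.
- **langchain-openai** and **langchain-community**: The LangChain integration packages that provide the OpenAI chat and embedding wrappers, the document loaders, and the Chroma vector store wrapper.
- **chromadb**: A fast, open-source vector database for managing embeddings in machine learning applications.
- **tiktoken**: OpenAI's tokenizer library, used when counting tokens for the OpenAI models.
- **pypdf**: A PDF parsing library. We'll use it, through LangChain's `PyPDFLoader`, to load the document that forms our knowledge base.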

You can install these libraries using pip like so:

In [1]:
!pip install langchain openai chromadb tiktoken langchain-community langchain-openai pypdf


In [2]:
import os
from langchain_openai import ChatOpenAI  # ChatOpenAI lives in the langchain-openai package

### Building a Chatbot (no RAG)

We'll use LangChain to integrate the components our chatbot needs. To start, we'll build a basic chatbot without RAG by initializing a `ChatOpenAI` object. This gives us a baseline before we add retrieval. You can create a key on the [OpenAI API keys page](https://platform.openai.com/account/api-keys). For workshops, you can use the key shared by us.

In [3]:
os.environ["OPENAI_API_KEY"] = ""

chat = ChatOpenAI(
    openai_api_key=os.environ["OPENAI_API_KEY"],
    model='gpt-3.5-turbo',
    temperature=0,
    max_tokens=None,
    timeout=None,
    #max_retries=2,
)

Recalling our last tutorial, chats with OpenAI's `gpt-3.5-turbo` and `gpt-4` chat models are typically structured (in plain text) like this:

```
System: You are a helpful tutor.

User: Hi AI, how are you today?

Assistant: I'm great thank you. How can I help you?

User: I'd like to understand predictive analytics.

Assistant:
```

The final `"Assistant:"` without a response is what would prompt the model to continue the conversation. In the official OpenAI `ChatCompletion` endpoint these would be passed to the model in a format like:

```python
[
    {"role": "system", "content": "You are a helpful tutor"},
    {"role": "user", "content": "Hi AI, how are you today?"},
    {"role": "assistant", "content": "I'm great thank you. How can I help you?"},
    {"role": "user", "content": "I'd like to understand predictive analytics."}
]
```
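
As a rough illustration of how the plain-text transcript maps onto the role-based list, the sketch below parses one into the other. The helper name `transcript_to_messages` and the `Speaker: text` line convention are assumptions made for this example; they are not part of the OpenAI client.

```python
def transcript_to_messages(transcript: str) -> list[dict]:
    """Turn 'Speaker: text' lines into role/content dictionaries (illustrative helper)."""
    role_map = {"System": "system", "User": "user", "Assistant": "assistant"}
    messages: list[dict] = []
    for line in transcript.splitlines():
        line = line.strip()
        if not line:
            continue
        speaker, _, content = line.partition(":")
        role = role_map.get(speaker.strip())
        if role is None:
            # Not a recognised speaker: treat the line as a continuation.
            if messages:
                messages[-1]["content"] += " " + line
            continue
        content = content.strip()
        if content:
            # A trailing "Assistant:" with no text is the model's cue, not a message.
            messages.append({"role": role, "content": content})
    return messages


transcript = """System: You are a helpful tutor.

User: Hi AI, how are you today?

A:"""

for message in transcript_to_messages(transcript):
    print(message)
```

The empty trailing `Assistant:` line is dropped on purpose, since it only marks where the model should continue.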

In LangChain there is a slightly different format. We use three types of _message_ object like so:

In [4]:
from langchain.schema import (
    SystemMessage,
    HumanMessage,
    AIMessage
)

messages = [
    SystemMessage(content="You are a helpful tutor."),
    HumanMessage(content="Hi AI, how are you today?"),
    AIMessage(content="I'm great thank you. How can I help you?"),
    HumanMessage(content="I'd like to understand predictive analytics.")
]

The format is very similar; we've just swapped the role of `"user"` for `HumanMessage`, and the role of `"assistant"` for `AIMessage`.

We generate the next response from the AI by passing these messages to the `ChatOpenAI` object.

In [5]:
res = chat.invoke(messages)
res.content

'Predictive analytics is a branch of advanced analytics that uses data, statistical algorithms, and machine learning techniques to identify the likelihood of future outcomes based on historical data. It involves analyzing current and historical data to make predictions about future events or trends.\n\nPredictive analytics can be used in various industries and applications, such as forecasting sales, predicting customer behavior, optimizing marketing campaigns, detecting fraud, and improving operational efficiency. By leveraging predictive analytics, organizations can make more informed decisions, anticipate future trends, and gain a competitive advantage.\n\nThe process of predictive analytics typically involves the following steps:\n\n1. Define the problem: Clearly define the business problem or question that you want to address with predictive analytics.\n\n2. Data collection: Gather relevant data from various sources, such as databases, spreadsheets, and external sources.\n\n3. Dat

In response we get another AI message object. Its text is stored in the `content` attribute; passing it to `print` renders the line breaks instead of showing `\n` escapes.

Because `res` is just another `AIMessage` object, we can append it to `messages`, add another `HumanMessage`, and generate the next response in the conversation.

In [6]:
# add latest AI response to messages
messages.append(res)

# now create a new user prompt
prompt = HumanMessage(
    content="Why do researchers believe it can advance the decision making in organisations?"
)
# add to messages
messages.append(prompt)

# send to the chat model
res = chat.invoke(messages)

print(res.content)


Researchers believe that predictive analytics can advance decision-making in organizations for several reasons:

1. Data-driven insights: Predictive analytics enables organizations to make decisions based on data and evidence rather than intuition or guesswork. By analyzing historical data and identifying patterns and trends, organizations can gain valuable insights that can inform strategic decisions.

2. Anticipating future trends: Predictive analytics allows organizations to forecast future outcomes and trends based on historical data and predictive models. By predicting potential scenarios and outcomes, organizations can proactively plan and adapt their strategies to capitalize on opportunities or mitigate risks.

3. Improved accuracy and efficiency: Predictive analytics can help organizations make more accurate and reliable predictions, leading to better decision-making. By automating the analysis of large datasets and complex relationships, organizations can make decisions more e

### Dealing with Hallucinations

We now have our chatbot, but as mentioned earlier, the knowledge of large language models (LLMs) can be limited. This limitation arises because LLMs acquire all their knowledge during the training phase. Essentially, an LLM compresses the information from its training data into its internal parameters, known as the model’s parametric knowledge.

By default, LLMs don't have real-time access to external information.

This becomes evident when we ask LLMs about recent events, such as the latest Australia Budget 2024, where they are unable to provide up-to-date insights without external augmentation.

In [7]:
# add latest AI response to messages
messages.append(res)

# now create a new user prompt
prompt = HumanMessage(
    content="What are the highlights of the Australia Budget 2024?"
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat.invoke(messages)

In [8]:
print(res.content)

I'm sorry, but as an AI, I do not have real-time information or the ability to access current news updates. I recommend checking official government websites, news outlets, or financial publications for the latest information on the Australia Budget 2024. Is there anything else I can help you with?


Our chatbot can no longer help us because it doesn't contain the information we need to answer the question. It was very clear from this answer that the LLM doesn't know the information, but sometimes an LLM may respond as if it _does_ know the answer, and this can be very hard to detect.

OpenAI have since adjusted the behavior for this particular example as we can see below:

In [9]:
# add latest AI response to messages
messages.append(res)

# now create a new user prompt
prompt = HumanMessage(
    content="Can you tell more about Easing Cost of Living Pressures in Australia Budget 2024?"
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat.invoke(messages)

In [10]:
print(res.content)

I apologize for the confusion earlier. As an AI, I do not have real-time information on specific budgets or policies. However, I can provide some general strategies that governments may consider to ease cost of living pressures in a budget:

1. Tax relief: Governments may provide tax cuts or incentives to lower-income households to reduce their tax burden and increase disposable income.

2. Social welfare programs: Increasing funding for social welfare programs such as unemployment benefits, housing assistance, and childcare subsidies can help support vulnerable populations and reduce financial strain.

3. Affordable housing initiatives: Implementing policies to increase affordable housing supply, provide rental assistance, or offer first-home buyer incentives can help address housing affordability issues.

4. Energy subsidies: Providing subsidies or rebates for energy costs can help lower household expenses and alleviate financial pressure on families.

5. Healthcare support: Investin

Another method of providing knowledge to LLMs is through *source knowledge*, which refers to any information supplied via the prompt. This allows the model to work with fresh or specific data that wasn't part of its training. We can demonstrate this by pasting a few highlights from the Australia Budget 2024 directly into the prompt.

In [11]:
source_knowledge = """
All 13.6 million Australian taxpayers will get a tax cut, with an average tax cut of $1,888 or
$36 a week
- $3.5 billion in energy bill relief for all Australian households and one million small businesses
- $1.9 billion to increase Commonwealth Rent Assistance by a further 10 per cent, benefiting
nearly 1 million households
- Cheaper medicines as part of the up to $3 billion agreement with community pharmacies
- Waiving $3 billion in student debt for more than 3 million Australians to make student
loans fairer
- Getting consumers a better deal at the supermarket checkout and through the
energy transition
- $1.1 billion to pay superannuation on Government-funded Paid Parental Leave
- $138 million to boost funding for emergency and food relief and financial support services
- Supporting wages growth through submissions to the Fair Work Commission and supporting
pay rises for care sector workers
- Extending the freeze on deeming rates for 876,000 income support recipients
"""

We can feed this additional knowledge into our prompt with some instructions telling the LLM how we'd like it to use this information alongside our original query.

In [12]:
query = "Can you tell more about Easing Cost of Living Pressures in Australia Budget 2024?"

augmented_prompt = f"""Using the contexts below, answer the query.

Contexts:
{source_knowledge}

Query: {query}"""

In [13]:
augmented_prompt

'Using the contexts below, answer the query.\n\nContexts:\n\nAll 13.6 million Australian taxpayers will get a tax cut, with an average tax cut of $1,888 or\n$36 a week\n- $3.5 billion in energy bill relief for all Australian households and one million small businesses\n- $1.9 billion to increase Commonwealth Rent Assistance by a further 10 per cent, benefiting\nnearly 1 million households\n- Cheaper medicines as part of the up to $3 billion agreement with community pharmacies\n- Waiving $3 billion in student debt for more than 3 million Australians to make student\nloans fairer\n- Getting consumers a better deal at the supermarket checkout and through the\nenergy transition\n- $1.1 billion to pay superannuation on Government-funded Paid Parental Leave\n- $138 million to boost funding for emergency and food relief and financial support services\n- Supporting wages growth through submissions to the Fair Work Commission and supporting\npay rises for care sector workers\n- Extending the fr

Now we feed this into our chatbot as we did before.

In [14]:
# create a new user prompt
prompt = HumanMessage(
    content=augmented_prompt
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat.invoke(messages)

In [15]:
print(res.content)

In the Australia Budget 2024, there are several measures aimed at easing cost of living pressures for Australian households. Here are some key initiatives:

1. Tax cuts for all Australian taxpayers: All 13.6 million Australian taxpayers will receive a tax cut, with an average tax cut of $1,888 or $36 a week. This measure aims to provide financial relief to individuals and families by reducing their tax burden.

2. Energy bill relief: The budget includes $3.5 billion in energy bill relief for all Australian households and one million small businesses. This initiative is designed to help reduce the cost of energy for households and businesses, thereby easing financial pressures.

3. Increase in Commonwealth Rent Assistance: The budget allocates $1.9 billion to increase Commonwealth Rent Assistance by a further 10 per cent, benefiting nearly 1 million households. This measure aims to support renters by providing additional financial assistance for housing costs.

4. Cheaper medicines: As 

The quality of this response is remarkable, made possible by augmenting our query with external knowledge, known as source knowledge. However, there's one challenge — how do we acquire this information in the first place?

This is where vector databases come in, as we explored in previous chapters. They can assist us in storing and retrieving relevant information. But before we dive in, we’ll need to start with a dataset.

### Importing the Data

In this task, we’ll import our data manually, using the Australia Budget 2024 document as the external knowledge source. This document will serve as the knowledge base for our chatbot, enabling it to provide accurate and up-to-date responses regarding the latest budget details. This approach will demonstrate how external information can be integrated to improve the chatbot’s capabilities.

In [16]:
# Loaders live in langchain-community; splitters in langchain-text-splitters
from langchain_community.document_loaders import TextLoader, PyPDFLoader
from langchain_text_splitters import CharacterTextSplitter

In [17]:
# Load the document
#loader = TextLoader("/content/test.txt")
#documents = loader.load()

In [20]:
loader = PyPDFLoader("/content/budget-overview-final.pdf")
documents = loader.load()

In [None]:
# Write code to load all the documents and check the behavior

In [21]:
# Split the text into chunks
text_splitter = CharacterTextSplitter(chunk_size=250, chunk_overlap=0)
dataset = text_splitter.split_documents(documents)
dataset

[Document(metadata={'source': '/content/budget-overview-final.pdf', 'page': 0}, page_content='Cost of living help \n& a future made  \nin Australia\nbudget.gov.au\nMay 2024'),
 Document(metadata={'source': '/content/budget-overview-final.pdf', 'page': 1}, page_content='Cost of living help \n& a future made  \nin Australia\nbudget.gov.au\nMay 2024'),
 Document(metadata={'source': '/content/budget-overview-final.pdf', 'page': 2}, page_content='© Commonwealth of Australia 2024\nISBN 978-1-925832-95-2\nThis publication is available for your use under a Creative Commons Attribution 4.0 International licence, \nwith the exception of the Commonwealth Coat of Arms, photographs, images, third party content,  \nand where otherwise stated. The full licence terms are available from  \nhttps:/ /creativecommons.org/licenses/by/4.0/legalcode .\nUse of Commonwealth of Australia material under a Creative Commons Attribution 4.0 International \nlicence requires you to attribute the work (but not in any 

In [22]:
print(dataset[9].page_content)

Domestic economic outlook
Facing challenges from a position of 
economic strength
Australia is not immune from global developments. 
Moderating but high inflation and higher interest rates here have resulted in lower growth over the past year. The Australian economy faces these challenges from a position of economic strength with inflation that is now less than half of its peak, a resilient labour market with unemployment close to 50-year lows, a return to annual real wage growth and a solid pipeline of business investment.
The Government’s targeted cost-of-living measures are 
expected to reduce inflation, with energy bill relief and Commonwealth Rent Assistance expected to directly reduce inflation by ½ of a percentage point in 2024–25 and not expected to add to broader inflationary pressures. Treasury is forecasting this could see headline inflation return to the target band by the end of 2024, slightly earlier than expected at MYEFO.
The labour market has been resilient with an une

#### Dataset Overview

The dataset we are using is sourced from the Australia Budget 2024 document. This document provides the latest details on Australia's fiscal plans and economic strategies. Each entry in our knowledge base represents a "chunk" of relevant information extracted from this document.

Since most Large Language Models (LLMs) only contain knowledge from their training period, they cannot answer questions about the Australia Budget 2024 — at least not without this external data.
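
To give a feel for how text ends up in "chunks", here is a simplified, standard-library sketch of separator-based chunking. It is not LangChain's implementation; the function name `split_into_chunks` is an assumption for this example. It splits on a separator and greedily merges pieces until the size limit would be exceeded.

```python
def split_into_chunks(text: str, chunk_size: int, separator: str = "\n\n") -> list[str]:
    """Greedy separator-based chunking (a simplified illustration, not LangChain's code)."""
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        piece = piece.strip()
        if not piece:
            continue
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            # The piece still fits: keep growing the current chunk.
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single piece longer than chunk_size is kept whole in this sketch.
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "Tax cuts for all taxpayers.\n\nEnergy bill relief.\n\nRent assistance increase."
for chunk in split_into_chunks(sample, chunk_size=50):
    print(repr(chunk))
```

One consequence of this style of splitting is that a stretch of text containing no separator stays in one piece even if it is longer than `chunk_size`. That may be why some chunks in the output above are well over 250 characters.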

### Building the Knowledge Base

We now have a dataset that can serve as our chatbot knowledge base. Our next task is to transform that dataset into the knowledge base that our chatbot can use. To do this we must use an embedding model and vector database.


Because ChromaDB is open-source and runs locally, there is no index specification to set up: we don't need to choose a cloud provider or region, and the vector store is created directly from our documents and an embedding model.

### Creation of Vector Database using Embeddings

The code snippet initializes the embedding model using LangChain's OpenAIEmbeddings class, which leverages OpenAI's API to convert text data into vector representations (embeddings). These embeddings represent the meaning of the text in a high-dimensional vector space, making it easier for the model to perform similarity searches, clustering, and other tasks on textual data.
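
As a small illustration of what "similarity" between embeddings means, the sketch below computes cosine similarity for hand-made three-dimensional vectors. The vectors are invented for this example; real embeddings from the model have many more dimensions, and the vector store may use a different distance metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)


# Toy "embeddings" (made up for illustration only)
budget = [0.9, 0.1, 0.0]
tax = [0.8, 0.2, 0.1]
weather = [0.0, 0.1, 0.9]

print("budget vs tax:    ", round(cosine_similarity(budget, tax), 3))
print("budget vs weather:", round(cosine_similarity(budget, weather), 3))
```

Texts with related meaning should end up with vectors pointing in similar directions, so their score is closer to 1.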



We will be using OpenAI's `text-embedding-ada-002` model to create the embeddings. It produces 1536-dimensional vectors. Unlike some hosted vector databases, Chroma does not ask us to declare this dimension up front.

Using this model we can create embeddings like so:

In [23]:
# Create embeddings
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()


In [24]:
# Create embeddings for a sample text
text = "Australia Budget 2024 focuses on economic growth and sustainability."
vector = embeddings.embed_query(text)
vector

[-0.008891760458240928,
 -0.024347396804634238,
 0.0021068640258503925,
 -0.003949773536354634,
 -0.007098143989446103,
 -0.005059653476738426,
 -0.0028526142983107843,
 -0.02633182357569756,
 -0.011181485016785458,
 -0.008904481393411516,
 0.0255431404962824,
 0.01006842414411708,
 0.012097374653938755,
 -0.018355952754609925,
 -0.0052727251827248064,
 -0.008026753630447404,
 0.02573395172987348,
 -0.025619466107305932,
 0.016498731609962156,
 -0.015633724782168634,
 -0.015633724782168634,
 0.025759393600214654,
 0.0031070281122789934,
 0.004194646648944902,
 0.015926301634479254,
 0.019335445204970998,
 -0.006843729942647247,
 -0.007835943328178907,
 0.012943299613315104,
 0.008045835498864578,
 -0.0053267879930465785,
 -0.01041188287446489,
 0.016053507260894807,
 -0.004181926179435605,
 -0.02605196858988011,
 -0.030453326477226098,
 -0.013496649583026,
 -0.0068628113454031305,
 0.011194205020633464,
 -0.005721129532754159,
 -0.016816750332613956,
 -0.005791093279208521,
 0.00561618

### Chroma Vector Database

**Chroma**: By default, Chroma is an in-memory vector database that stores embeddings of the documents. It operates in memory unless explicitly configured to persist data to disk or another storage backend.

**In-memory storage:** This means that the vector store, which stores the embeddings of the documents, exists in your computer's RAM during runtime. Once you stop the program, the data is lost unless you've configured Chroma to save it.

`Chroma.from_documents(dataset, embeddings)`:

* `dataset`: This refers to the documents for which you want to generate and store embeddings.
* `embeddings`: This is the embedding model (like `OpenAIEmbeddings`), which is used to convert the documents into vector representations (embeddings).

Chroma will take the embeddings of the documents and store them in-memory for fast retrieval and similarity search.
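
To make the in-memory versus persisted distinction concrete, here is a toy sketch using only the standard library. `ToyVectorStore` and its JSON file format are invented for illustration and bear no relation to how Chroma stores data internally.

```python
import json
import os
import tempfile


class ToyVectorStore:
    """A toy stand-in for a vector store: records live in a Python list (RAM)."""

    def __init__(self):
        self.records = []

    def add(self, text, vector):
        self.records.append({"text": text, "vector": list(vector)})

    def save(self, path):
        # Persist the records so they survive after the program stops.
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.records, f)

    @classmethod
    def load(cls, path):
        store = cls()
        with open(path, encoding="utf-8") as f:
            store.records = json.load(f)
        return store


store = ToyVectorStore()
store.add("energy bill relief", [0.7, 0.3])
store.add("tax cut", [0.9, 0.1])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_store.json")
    store.save(path)
    fresh = ToyVectorStore()              # what a restarted program begins with
    restored = ToyVectorStore.load(path)  # what it can recover from disk

print(len(fresh.records), len(restored.records))
```

With LangChain's Chroma wrapper, the usual way to get comparable behavior is to pass a `persist_directory` when creating the store; check the documentation for your installed version, as the details have changed between releases.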

In [25]:
# Create a vector store
from langchain_community.vectorstores import Chroma
vectorstore = Chroma.from_documents(dataset, embeddings)

### Creation of Vector Store Retriever

**Creating a Retriever:** The retriever is an object that allows us to search through a vector store (which stores embeddings of documents or text chunks) and find the most relevant documents or text based on a given query. The retriever’s job is to return a subset of documents that best match the query based on similarity of their embeddings.

**Setting top_k = 5:** The variable top_k is set to 5, which defines the number of results (documents or text chunks) the retriever should return from the vector store. In this case, when you perform a search or query using the retriever, it will return the top 5 most similar documents.

**Reason for using k:** In machine learning, k is commonly used to represent the number of nearest neighbors or results that should be returned from a search. When we perform a search in a vector database, the system compares the embedding of the query with the stored document embeddings to find the closest matches.

**By specifying k = 5,** we limit the results to only the top 5 most similar embeddings. This is useful because retrieving too many results may introduce noise, while retrieving too few may omit valuable information. The choice of k balances relevance and result quantity, helping to ensure that the top 5 most relevant results are retrieved for the query, offering better performance and accuracy for tasks like question-answering or document search.

**vectorstore.as_retriever(search_kwargs={"k": top_k}):** This line converts the vector store into a retriever object by passing search_kwargs={"k": top_k}. The search_kwargs argument allows you to specify additional search parameters for the retriever—in this case, limiting the number of results to the top 5 using the k value.

**Why is k Important?**
* **Efficiency**: Limiting the number of retrieved documents improves efficiency. You don't need to process or rank too many results, which could slow down response time.
* **Relevance**: Instead of returning all possible matches, we retrieve only the top k (5 in this case) results, ensuring that the returned documents are the most relevant and not overwhelming the system with irrelevant data.
* **Performance**: Focusing on a smaller number of high-quality matches makes it easier to integrate with downstream systems like question-answering modules, which can then provide more accurate responses without unnecessary computation.

In short, k helps fine-tune the retrieval process, ensuring efficiency and relevance in returning a manageable number of top results.
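
The effect of k can be sketched with a brute-force top-k search over a handful of toy vectors. The function `top_k`, the texts, and the two-dimensional vectors are all assumptions made for this example. A real vector database typically uses an approximate nearest-neighbor index rather than scoring every record.

```python
import heapq
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector, stored, k):
    """Return the k highest-scoring (score, text) pairs, best first."""
    scored = ((cosine_similarity(query_vector, vec), text) for text, vec in stored)
    return heapq.nlargest(k, scored)


# Toy records: (text, vector). The vectors are invented for illustration.
stored = [
    ("tax cuts", [0.9, 0.1]),
    ("energy relief", [0.7, 0.3]),
    ("football results", [0.1, 0.9]),
    ("weather", [0.0, 1.0]),
]
query_vector = [1.0, 0.0]

for score, text in top_k(query_vector, stored, k=2):
    print(f"{score:.3f}  {text}")
```

Raising k here would start pulling in the unrelated "football results" and "weather" records, which is the noise the section warns about.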

In [26]:
# Create a retriever
top_k = 5
retriever = vectorstore.as_retriever(search_kwargs={"k": top_k})

In [27]:
query_text = "Can you tell more about Easing Cost of Living Pressures in Australia Budget 2024?"

In [28]:
# Get the top_k (5) most relevant chunks for the query
relevant_chunks = retriever.invoke(query_text)

for idx, chunk in enumerate(relevant_chunks, start=1):
    print(f"Chunk {idx}:\n{chunk.page_content}\n")


Chunk 1:
Cost of living help and a future 
made in Australia
Easing pressures today and investing in a better future
Australia is facing an uncertain global economic environment and a changing 
world. Global challenges, high but moderating inflation and higher interest 
rates have contributed to cost-of-living pressures and slower growth. 
While many Australians remain under pressure, our economy is better placed 
than most to handle these challenges. This Government's responsible 
economic management has helped ease inflationary and budget pressures. 
Though inflation is still too high, it is now less than half its peak and almost half 
of what it was around the middle of 2022. Unemployment is near a 50-year 
low. Real wages growth has returned. Australia recorded the second strongest 
budget balance among G20 countries. And we are uniquely placed to 
maximise opportunities from changes in the global economy, including the net 
zero transformation. 
The Budget helps people under press

### Retrieval Augmented Generation

We've built a fully-fledged knowledge base. Now it's time to connect that knowledge base to our chatbot. To do that we'll be diving back into LangChain and reusing our template prompt from earlier.
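
At a high level, the flow we are about to wire up has three stages: retrieve, augment, generate. The sketch below shows that shape with stand-in functions so that it runs without an API key. `rag_answer`, `fake_search`, and `fake_llm` are hypothetical names for this illustration; in the notebook, the vector store and the `chat` model play the roles of the search and the LLM.

```python
from typing import Callable, List


def build_augmented_prompt(query: str, contexts: List[str]) -> str:
    """Combine retrieved contexts and the user query into one prompt."""
    source_knowledge = "\n".join(contexts)
    return (
        "Using the contexts below, answer the query.\n\n"
        f"Contexts:\n{source_knowledge}\n\n"
        f"Query: {query}"
    )


def rag_answer(
    query: str,
    search: Callable[[str, int], List[str]],
    llm: Callable[[str], str],
    k: int = 3,
) -> str:
    contexts = search(query, k)                       # 1. retrieve
    prompt = build_augmented_prompt(query, contexts)  # 2. augment
    return llm(prompt)                                # 3. generate


# Stand-ins so the sketch runs offline (not real retrieval or a real model)
def fake_search(query: str, k: int) -> List[str]:
    corpus = [
        "Energy bill relief for households.",
        "A tax cut for every taxpayer.",
        "Higher rent assistance.",
    ]
    return corpus[:k]


def fake_llm(prompt: str) -> str:
    return f"[stub model received {len(prompt)} characters]"


print(rag_answer("What cost-of-living help is there?", fake_search, fake_llm, k=2))
```

Keeping the search and the model as plain callables makes it easy to swap either one out, or to test the prompt construction on its own.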

We already have the LangChain abstraction for a vector index: the `vectorstore` object we created above with `Chroma.from_documents`.

Using this `vectorstore` we can already query the index and see if we have any relevant information given our question about Australia Budget 2024.

`vectorstore.similarity_search`: This function performs a similarity search on the vector store (created using embeddings) based on the query you provide. It finds the documents that are most similar to the query by comparing their vector embeddings.

Going forward, we will use the `vectorstore.similarity_search` function, which behaves much like the retriever described above.

| Feature                          | `vectorstore.similarity_search(query, k=3)`              | `retriever = vectorstore.as_retriever(search_kwargs={"k": top_k})`   |
| --------------------------------- | ------------------------------------------------------- | ------------------------------------------------------------------- |
| **Purpose**                       | Direct similarity search on a query                     | Converts vector store into a retriever for broader pipeline usage   |
| **Integration**                   | Standalone search operation                             | Integrates with LangChain workflows (e.g., RAG, LLMChain, etc.)     |
| **Usage**                         | For simple retrieval tasks                              | For more flexible or complex retrieval tasks in larger workflows    |
| **Output**                        | Returns the top `k` similar documents directly           | Returns a retriever object to use within other processes            |
| **Flexibility**                   | Limited to direct searches                              | Highly flexible for use in pipelines or more advanced workflows     |
| **Configuration**                 | Specify `k` directly in the function                    | Configure `k` in `search_kwargs`, part of a retriever configuration |
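
One way to picture the relationship in the table is that a retriever is a thin wrapper that remembers the search settings and forwards queries to the store. The toy classes below illustrate that idea only; `ToyStore` and `ToyRetriever` are invented names, and the word-overlap scoring is a placeholder for real embedding similarity.

```python
class ToyStore:
    """Toy store that ranks texts by how many query words they share."""

    def __init__(self, texts):
        self.texts = list(texts)

    def similarity_search(self, query, k=4):
        query_words = set(query.lower().split())

        def overlap(text):
            return len(query_words & set(text.lower().split()))

        ranked = sorted(self.texts, key=overlap, reverse=True)
        return ranked[:k]

    def as_retriever(self, search_kwargs=None):
        return ToyRetriever(self, search_kwargs or {})


class ToyRetriever:
    """Remembers the search settings so callers only pass the query."""

    def __init__(self, store, search_kwargs):
        self.store = store
        self.search_kwargs = search_kwargs

    def invoke(self, query):
        return self.store.similarity_search(query, **self.search_kwargs)


store = ToyStore([
    "energy bill relief for households",
    "tax cut for all taxpayers",
    "rent assistance increase",
])

direct = store.similarity_search("tax cut", k=1)
retriever = store.as_retriever(search_kwargs={"k": 1})
via_retriever = retriever.invoke("tax cut")

print(direct)
print(via_retriever)
```

Both calls return the same result here. The difference is that the retriever can be handed to other components that only know how to call it with a query.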


In [29]:
query = "Can you tell more about Easing Cost of Living Pressures in Australia Budget 2024?"

vectorstore.similarity_search(query, k=3)

[Document(metadata={'page': 4, 'source': '/content/budget-overview-final.pdf'}, page_content="Cost of living help and a future \nmade in Australia\nEasing pressures today and investing in a better future\nAustralia is facing an uncertain global economic environment and a changing \nworld. Global challenges, high but moderating inflation and higher interest \nrates have contributed to cost-of-living pressures and slower growth. \nWhile many Australians remain under pressure, our economy is better placed \nthan most to handle these challenges. This Government's responsible \neconomic management has helped ease inflationary and budget pressures. \nThough inflation is still too high, it is now less than half its peak and almost half \nof what it was around the middle of 2022. Unemployment is near a 50-year \nlow. Real wages growth has returned. Australia recorded the second strongest \nbudget balance among G20 countries. And we are uniquely placed to \nmaximise opportunities from changes i

We return a lot of text here and it's not that clear what we need or what is relevant. Fortunately, our LLM will be able to parse this information much faster than us. All we need is to connect the output from our `vectorstore` to our `chat` chatbot. To do that we can use the same logic as we used earlier.

In [30]:
def augment_prompt(query: str):
    """Build a prompt that places the most relevant chunks ahead of the query."""
    # retrieve the 3 chunks most similar to the query
    results = vectorstore.similarity_search(query, k=3)

    # get the text from the results
    source_knowledge = "\n".join([x.page_content for x in results])
    # feed into an augmented prompt
    augmented_prompt = f"""Using the contexts below, answer the query.

    Contexts:
    {source_knowledge}

    Query: {query}"""
    return augmented_prompt

Using this we produce an augmented prompt:

In [31]:
print(augment_prompt(query))

Using the contexts below, answer the query.

    Contexts:
    Cost of living help and a future 
made in Australia
Easing pressures today and investing in a better future
Australia is facing an uncertain global economic environment and a changing 
world. Global challenges, high but moderating inflation and higher interest 
rates have contributed to cost-of-living pressures and slower growth. 
While many Australians remain under pressure, our economy is better placed 
than most to handle these challenges. This Government's responsible 
economic management has helped ease inflationary and budget pressures. 
Though inflation is still too high, it is now less than half its peak and almost half 
of what it was around the middle of 2022. Unemployment is near a 50-year 
low. Real wages growth has returned. Australia recorded the second strongest 
budget balance among G20 countries. And we are uniquely placed to 
maximise opportunities from changes in the global economy, including the net 
zer

There is still a lot of text here, so let's pass it onto our chat model to see how it performs.

In [32]:
# create a new user prompt
prompt = HumanMessage(
    content=augment_prompt(query)
)
# add to messages
messages.append(prompt)

res = chat(messages)

print(res.content)

In the Australia Budget 2024, there are several measures aimed at easing cost-of-living pressures for Australians. Here are some key initiatives outlined in the budget:

1. Tax Cuts: All 13.6 million Australian taxpayers will receive a tax cut, with an average tax cut of $1,888 or $36 a week. This measure aims to put more money back into the pockets of taxpayers, providing them with additional financial relief.

2. Energy Bill Relief: The budget includes $3.5 billion in energy bill relief for all Australian households and one million small businesses. This initiative is designed to help reduce the financial burden of energy costs on households and businesses.

3. Commonwealth Rent Assistance Increase: $1.9 billion has been allocated to increase Commonwealth Rent Assistance by a further 10 per cent, benefiting nearly 1 million households. This increase in rental assistance aims to support individuals and families facing housing affordability challenges.

4. Cheaper Medicines: The budget

We can continue with more questions about the budget. Let's try _without_ RAG first:

In [33]:
prompt = HumanMessage(
    content="what are the key highlights of the Australia budget 2024"
)

res = chat(messages + [prompt])
print(res.content)

The key highlights of the Australia Budget 2024 include:

1. Tax cuts for all 13.6 million Australian taxpayers, with an average tax cut of $1,888 or $36 a week.
2. $3.5 billion in energy bill relief for all Australian households and one million small businesses.
3. $1.9 billion to increase Commonwealth Rent Assistance by a further 10 per cent, benefiting nearly 1 million households.
4. Cheaper medicines as part of the up to $3 billion agreement with community pharmacies.
5. Waiving $3 billion in student debt for more than 3 million Australians to make student loans fairer.
6. Initiatives to get consumers a better deal at the supermarket checkout and through the energy transition.
7. $1.1 billion to pay superannuation on Government-funded Paid Parental Leave.
8. $138 million to boost funding for emergency and food relief and financial support services.
9. Supporting wages growth through submissions to the Fair Work Commission and supporting pay rises for care sector workers.
10. Extend

The chatbot is able to respond about the budget thanks to its conversational history stored in `messages`, which still contains the contexts retrieved for the previous query. However, it knows nothing beyond those contexts, because we have not provided any new information via the RAG pipeline. Let's ask a different question, this time with RAG.

In [34]:
prompt = HumanMessage(
    content=augment_prompt(
        "What measures are taken to strengthen the economy?"
    )
)
res = chat(messages + [prompt])
print(res.content)

To strengthen the economy, the Australia Budget 2024 includes several measures aimed at promoting growth, supporting key sectors, and enhancing infrastructure. Some of the key measures taken to strengthen the economy include:

1. Strengthening Medicare and the Care Economy:
   - Allocating $2.8 billion to strengthen Medicare and enhance the health system.
   - Investing $3.4 billion for new and amended listings on the Pharmaceutical Benefits Scheme.
   - Allocating $825.7 million for COVID-19 testing and vaccination efforts.
   - Providing $888.1 million to improve access to mental health care.
   - Investing $2.2 billion to enhance aged care services.
   - Allocating $468.7 million to support people with disabilities and improve the National Disability Insurance Scheme (NDIS).
   - Introducing a new specialized disability employment program with $227.6 million to help individuals with disabilities find employment.
   - Supporting additional frontline staff at Services Australia with $

We get a much more informed response that includes several items missing from the previous non-RAG response, such as the $2.8 billion to strengthen Medicare, the $3.4 billion for new and amended listings on the Pharmaceutical Benefits Scheme, and the $2.2 billion to enhance aged care services.
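
Note that the two comparison cells above call `chat(messages + [prompt])` instead of appending to `messages`, so those one-off questions never become part of the stored history. The difference can be sketched with plain Python lists; the `(role, content)` tuples and the history contents here are hypothetical stand-ins for the notebook's message objects.

```python
# Hypothetical history; plain tuples stand in for the notebook's message objects.
history = [
    ("system", "You are a helpful assistant."),
    ("human", "A previous question with retrieved contexts."),
]
one_off = ("human", "what are the key highlights of the Australia budget 2024")

# Concatenation builds a new list for this call only; history is untouched.
request = history + [one_off]
print(len(history), len(request))  # prints: 2 3

# append() mutates history, so every later call would also see this message.
history.append(one_off)
print(len(history))  # prints: 3
```

Concatenation suits quick comparisons like the ones above, while `append` is the right choice when a turn should be remembered in later calls.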

---

### Quick Prototype of a ChatBot

[Gradio](https://www.gradio.app/guides/creating-a-chatbot-fast) is a powerful tool that allows you to easily build and share machine learning applications with an intuitive user interface. It enables real-time interaction with your AI models, making it perfect for building live experiences, such as chatbots.

In this live demo, we use Gradio to create an interactive interface for our AI chatbot, so you can hold a conversation with it and see its responses generated in real time.

Here’s a quick example of how you can integrate Gradio with a chatbot:

In [35]:
!pip install gradio

In [None]:
from langchain.schema import AIMessage, HumanMessage
import gradio as gr

# `chat` is the chat model created earlier in this notebook

def predict(message, history):
    # convert Gradio's (user, assistant) history pairs into LangChain messages
    history_langchain_format = []
    for human, ai in history:
        history_langchain_format.append(HumanMessage(content=human))
        history_langchain_format.append(AIMessage(content=ai))
    # add the newest user message and query the model
    history_langchain_format.append(HumanMessage(content=message))
    gpt_response = chat(history_langchain_format)
    return gpt_response.content

gr.ChatInterface(predict).launch()
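
The `predict` function above sends the user's message to the model as-is, without retrieval. One possible way to bring the RAG step into the chat interface is to augment only the newest message before calling the model. The sketch below is an illustration rather than the notebook's own method: `make_rag_predict`, `fake_augment`, and `fake_ask` are hypothetical names, and the stubs exist only so the example runs without an API key or a vector store.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content), e.g. ("human", "hello")

def make_rag_predict(
    augment: Callable[[str], str],
    ask: Callable[[List[Message]], str],
) -> Callable[[str, list], str]:
    """Build a predict(message, history) function that augments the newest message."""
    def predict(message: str, history: list) -> str:
        msgs: List[Message] = []
        # assumes history arrives as (user, assistant) pairs
        for human, ai in history:
            msgs.append(("human", human))
            msgs.append(("ai", ai))
        # only the newest user message gets retrieved contexts attached
        msgs.append(("human", augment(message)))
        return ask(msgs)
    return predict

# Stubs (hypothetical) standing in for augment_prompt and the chat model.
def fake_augment(q: str) -> str:
    return f"Contexts:\n(stub context)\n\nQuery: {q}"

def fake_ask(msgs: List[Message]) -> str:
    return f"{len(msgs)} messages; last starts with {msgs[-1][1][:9]!r}"

rag_predict = make_rag_predict(fake_augment, fake_ask)
print(rag_predict("What is in the budget?", [("hi", "hello")]))
```

In the notebook, this could plausibly be wired up as `gr.ChatInterface(make_rag_predict(augment_prompt, lambda m: chat.invoke(m).content)).launch()`, assuming the chat model accepts `(role, content)` tuples and Gradio passes history as pairs; both assumptions should be checked against the installed versions.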