This notebook walks through a simple example of RAG (Retrieval Augmented Generation).

RAG is not just asking questions of documents using vector search; it is augmenting the LLM request with additional data in lieu of fine-tuning the model.
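
As a minimal sketch of what "augmenting the request" can mean, the helper below appends a list of facts to a base prompt. The name `augment_system_prompt` and the sample strings are illustrative assumptions, not part of this notebook's code.

```python
def augment_system_prompt(base_prompt: str, facts: list[str]) -> str:
    """Return the base prompt followed by one fact per line (hypothetical helper)."""
    if not facts:
        # Nothing to add: the request goes out with the base prompt only.
        return base_prompt
    return base_prompt + "\n" + "\n".join(facts)


base = "You are a helpful assistant for a shoe store."
facts = ["We only sell shoes.", "We only accept card payments"]
print(augment_system_prompt(base, facts))
```

The model itself is unchanged; only the text sent with each request grows.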

We will give the model a list of FAQs so that a chatbot can answer questions about them.

This example paves the way for the ROUTER pattern in the next example file.


In [1]:
import os
from dotenv import load_dotenv
from openai import OpenAI
import gradio as gr



In [2]:
# Load environment variables from a file called .env
# Print the key prefix to help with any debugging

load_dotenv()
openai_api_key = os.getenv("OPENAI_API_KEY")

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:10]}")
else:
    print("OpenAI API Key not set")

OpenAI API Key exists and begins sk-proj-y-


In [3]:
# Initialize

client = OpenAI()
MODEL = "gpt-4o-mini"

In [4]:
# A function to send a request to the LLM with 'history' added for the chatbot functionality.
# A request to the LLM is stateless, so we always need to pass all the data that is needed each time.
# `history` is just that - a record of what has gone on before so that the LLM has context to answer the query.

def chat(message, history):
    messages = (
        [{"role": "system", "content": system_message}]
        + history
        + [{"role": "user", "content": message}]
    )

    # Debug output: show what Gradio passes in and what is sent to the model
    print("History is:")
    print(history)
    print("And messages is:")
    print(messages)

    stream = client.chat.completions.create(model=MODEL, messages=messages, stream=True)

    # Accumulate the streamed deltas and yield the text so far, so the UI updates as it arrives
    response = ""
    for chunk in stream:
        response += chunk.choices[0].delta.content or ""
        yield response
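
Because each request is stateless, the whole conversation is rebuilt on every turn. The sketch below isolates that assembly step so it can be inspected without calling the API. `build_messages` is a hypothetical helper, the assistant reply in the sample history is made up, and dropping keys other than `role` and `content` is an optional tidy-up assumed here, not something the original `chat` function does.

```python
def build_messages(system_message, history, message):
    """Assemble the full request payload for one chat turn (hypothetical helper)."""
    # Keep only role and content; extra UI keys such as 'metadata' are dropped.
    cleaned = [{"role": h["role"], "content": h["content"]} for h in history]
    return (
        [{"role": "system", "content": system_message}]
        + cleaned
        + [{"role": "user", "content": message}]
    )


# Sample history in the shape Gradio logs it; the assistant text is invented.
history = [
    {"role": "user", "content": "sunday", "metadata": {"title": None}, "options": None},
    {"role": "assistant", "content": "We are closed on Sundays.", "metadata": {"title": None}, "options": None},
]

messages = build_messages("You are a helpful assistant.", history, "and saturday?")
for m in messages:
    print(m["role"], "->", m["content"])
```

The system message always comes first and the new user message last, with every earlier turn replayed in between.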

In [5]:
# We set up the background scenario in the system message.
# In some frameworks this is called 'background'.

system_message = "You are a helpful assistant for a shoe store. If a user asks a question please be as helpful as possible and use a courteous and professional manner. You are provided with the following facts to help you. Please be verbose and suggestive."

In [6]:
# FAQ is just data extracted from a source and inserted into the system message.
# That is RAG (Retrieval Augmented Generation): it does not have to involve a vector search, only additional information that we pass to the LLM.

FAQ = [
    "We only sell shoes.",
    "Our opening hours are Monday to Friday from 9am to 5pm.",
    "We are located at 123 Main Street, Brighton",
    "We specialise in red shoes but have all colours",
    "Our VAT rate is 20 percent and is applicable on all sales",
    "We only accept card payments",
]

# Append the FAQ facts to the base system message (re-running this cell appends them again)
system_message += "\n" + "\n".join(FAQ)
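
With six FAQ entries it is fine to send all of them every time. For a longer list, one could pick only the entries relevant to the question before building the prompt. The sketch below does this with plain word overlap and no vector search. `select_relevant_faqs` and its `top_k` parameter are assumptions for illustration and are not used elsewhere in this notebook.

```python
import re


def select_relevant_faqs(question: str, faqs: list[str], top_k: int = 2) -> list[str]:
    """Rank FAQ entries by how many words they share with the question."""
    def words(text: str) -> set[str]:
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    question_words = words(question)
    scored = [(len(question_words & words(faq)), faq) for faq in faqs]
    # Keep entries sharing at least one word, highest overlap first.
    # sorted() is stable, so ties keep their original FAQ order.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [faq for _, faq in ranked[:top_k]]


faqs = [
    "We only sell shoes.",
    "Our opening hours are Monday to Friday from 9am to 5pm.",
    "We are located at 123 Main Street, Brighton",
    "We specialise in red shoes but have all colours",
    "Our VAT rate is 20 percent and is applicable on all sales",
    "We only accept card payments",
]

print(select_relevant_faqs("Do you accept card payments?", faqs))
print(select_relevant_faqs("What are your opening hours?", faqs, top_k=1))
```

Word overlap is crude: common words such as "are" count as matches, so a real system would likely filter stop words or use embeddings.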

In [7]:
# We use Gradio for a chat interface

gr.ChatInterface(fn=chat, type="messages").launch()

* Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.




History is:
[]
And messages is:
[{'role': 'system', 'content': 'You are a helpful assistant for a shoe store. If a user asks a question please be as helpful as possible and use a courteous and professional manner. You are provided with the following facts to help you. Please be verbose and suggestive.\nWe only sell shoes.\nOur opening hours are Monday to Friday from 9am to 5pm.\nWe are located at 123 Main Street, Brighton\nWe specialise in red shoes but have all colours\nOur VAT rate is 20 percent and is applicable on all sales\nWe only accept card payments'}, {'role': 'user', 'content': 'sunday'}]
History is:
[{'role': 'user', 'metadata': {'title': None, 'id': None, 'parent_id': None, 'duration': None, 'status': None}, 'content': 'sunday', 'options': None}, {'role': 'assistant', 'metadata': {'title': None, 'id': None, 'parent_id': None, 'duration': None, 'status': None}, 'content': 'Thank you for your inquiry! I’d like to inform you that our store operates from Monday to Friday, from 