# Using the OpenAI API for text completion

![](../images/llm_interaction.png)

In [None]:
# import necessary libraries
import os
from openai import OpenAI
from IPython.display import Markdown, display

In [None]:
# Read the API key from the environment
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# Initialize the OpenAI client with that key
client = OpenAI(api_key=OPENAI_API_KEY)

In [21]:
# Prompt for the LLM
prompt = "explain RAG in 2 lines"

In [None]:
# Create messages for chat completion
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt},
]

In [24]:
# Call the OpenAI API to get a response
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0
)

In [25]:
# Extract and Display the response
answer = response.choices[0].message.content
Markdown(answer)

RAG, or Retrieval-Augmented Generation, is a natural language processing technique that combines information retrieval with text generation. It retrieves relevant documents from a knowledge base to enhance the generation of more accurate and contextually relevant responses.

# RAG - Retrieval-Augmented Generation

### How RAG Works (Step-by-Step)

1. **User Input:**  
   The user types a prompt (e.g., *"Why are hotels in Vancouver expensive this weekend?"*).

2. **Retrieval Phase:**  
   The **retriever** queries the **knowledge base** and fetches relevant documents.

3. **Augmentation Phase:**  
   The system combines (or *augments*) the user’s original prompt with the retrieved information, forming an **augmented prompt**.  
   Example:  
   > “Answer the following question: Why are hotels in Vancouver so expensive this weekend?  
   > Here are five relevant articles that may help you respond…”

4. **Generation Phase:**  
   The augmented prompt is sent to the **LLM**, which generates a response using both its internal knowledge and the retrieved context.

5. **Response:**  
   The user receives an answer grounded in the retrieved context, which can make it more accurate and up to date. The experience is similar to using a plain LLM, with slightly more latency.

![LLM](../images/rag.png)
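
The five steps above can be sketched as three small functions. This is a minimal illustration, not a production design: `retrieve`, `augment`, `run_rag`, and `fake_llm` are hypothetical names, the knowledge-base entry is made up for the example, and `fake_llm` is a stand-in that simply echoes the last line of the prompt so the flow can be followed without an API call.

```python
from typing import Callable


def retrieve(query: str, knowledge_base: dict[str, str]) -> list[str]:
    """Retrieval phase: return the document stored under the exact query text."""
    return [knowledge_base[query]] if query in knowledge_base else []


def augment(query: str, documents: list[str]) -> str:
    """Augmentation phase: attach the retrieved documents to the question."""
    context = "\n".join(documents) if documents else "(no documents found)"
    return f"Answer the following question: {query}\nContext:\n{context}"


def run_rag(query: str, knowledge_base: dict[str, str],
            generate: Callable[[str], str]) -> str:
    """Run retrieval, augmentation, and generation in order."""
    documents = retrieve(query, knowledge_base)
    augmented_prompt = augment(query, documents)
    return generate(augmented_prompt)


def fake_llm(augmented_prompt: str) -> str:
    """Hypothetical stand-in for an LLM: echo the last line of the prompt."""
    return augmented_prompt.splitlines()[-1]


# Made-up knowledge-base entry, used only to exercise the flow.
toy_kb = {
    "Why are hotels in Vancouver expensive this weekend?":
        "A large conference is in town this weekend.",
}

print(run_rag("Why are hotels in Vancouver expensive this weekend?", toy_kb, fake_llm))
print(run_rag("Why is the sky blue?", toy_kb, fake_llm))
```

Passing the generator in as a function keeps the retrieval and augmentation steps testable on their own; a real LLM call can be swapped in later without changing the other two functions.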

Let's build each component of a RAG system.
- For now, we will implement RAG in pure Python, without a database or a search algorithm.
- We will dive deeper into retrieval concepts in upcoming lessons.

### Let's create a knowledge base for our RAG application

In [53]:
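# Knowledge base for RAG.
# The answers are intentionally wrong, so a response that repeats them
# must have come from the retrieved context, not from the model's own knowledge.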

knowledge_base = {
    "What is 2 + 2?": "2 + 2 is 5",
    "What is the capital of the UK?": "Delhi is the capital of UK",
    "Where does the Sun rise?": "The Sun rises in the west",
    "At what temperature does water boil?": "Water boils at 10 degrees Celsius",
    "What is the smallest mountain in the world?": "Mount Everest is the smallest mountain in the world",
    "Can cats breathe underwater?": "Cats can breathe underwater",
    "Where is the Pacific Ocean located?": "The Pacific Ocean is located on Mars",
    "How many hearts do humans have?": "Humans have three hearts",
    "What is the capital of Australia?": "Paris is the capital of Australia",
    "What is the moon made of?": "The moon is made of cheese"
}

### Create a simple retriever based on user query

In [62]:
# Simple exact-match retriever
def retrieve_documents(query, knowledge_base):
    # Look up the query as an exact key; there is no keyword or fuzzy matching yet
    if query in knowledge_base:
        return [knowledge_base[query]]
    return ["No relevant documents found."]


In [63]:
# Test the retriever
query = "What is 2 + 2?"
retrieved_docs = retrieve_documents(query, knowledge_base)
print("Retrieved Documents:", retrieved_docs)

Retrieved Documents: ['2 + 2 is 5']
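
The retriever above only returns a document when the query matches a stored question character for character. As a rough sketch of what a keyword-style retriever might look like, the block below scores each stored question by word overlap (Jaccard similarity) with the query. The names `tokenize` and `retrieve_by_overlap` and the `min_overlap=0.6` threshold are assumptions made for this illustration, not part of the notebook.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_by_overlap(query, knowledge_base, min_overlap=0.6):
    """Return the answer whose question shares the most words with the query.

    The score is Jaccard similarity: shared tokens / all distinct tokens.
    min_overlap is an arbitrary threshold chosen for this sketch.
    """
    query_tokens = tokenize(query)
    best_answer, best_score = None, 0.0
    for question, answer in knowledge_base.items():
        question_tokens = tokenize(question)
        union = query_tokens | question_tokens
        score = len(query_tokens & question_tokens) / len(union) if union else 0.0
        if score > best_score:
            best_answer, best_score = answer, score
    if best_answer is not None and best_score >= min_overlap:
        return [best_answer]
    return ["No relevant documents found."]


# Two entries copied from the notebook's knowledge base
sample_kb = {
    "What is 2 + 2?": "2 + 2 is 5",
    "What is the capital of Australia?": "Paris is the capital of Australia",
}

print(retrieve_by_overlap("the capital of Australia?", sample_kb))
print(retrieve_by_overlap("What is 3 + 3?", sample_kb))
```

Word overlap tolerates changes in casing, punctuation, and a few missing words, but it has no notion of meaning, so the threshold needs tuning for any real knowledge base.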


### Create Augmented prompt

In [86]:
# Create the augmented prompt
def augementeg_prompt(prompt, knowledge_base):
    # Step 1: Retrieve relevant documents
    retrieved_docs = retrieve_documents(prompt, knowledge_base)
    print("retrieved_docs:", retrieved_docs)

    # Step 2: Combine the user's question with the retrieved documents
    combined_prompt = f"Question: {prompt}\n\nContext: {' '.join(retrieved_docs)}"
    print("combined_prompt:", combined_prompt)

    # Step 3: Create messages for chat completion.
    # Adjacent string literals are concatenated, so no stray indentation ends up in the prompt.
    messages = [
        {
            "role": "system",
            "content": (
                "You are a helpful assistant. "
                "Answer user queries using only the context provided. "
                "Do not fact-check or correct the context."
            ),
        },
        {"role": "user", "content": combined_prompt},
    ]
    return messages
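
The function above joins the retrieved documents with a single space, which works for one document but blurs the boundaries once a retriever returns several. One possible alternative, sketched here under assumed names (`build_augmented_messages` and its `system_prompt` default are hypothetical), numbers each document on its own line and has no print side effects.

```python
def build_augmented_messages(question, documents,
                             system_prompt="Answer only from the context provided."):
    """Build chat messages with each retrieved document on its own numbered line."""
    context = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(documents, start=1))
    user_content = f"Question: {question}\n\nContext:\n{context}"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_content},
    ]


example_messages = build_augmented_messages(
    "Where does the Sun rise?",
    ["The Sun rises in the west", "The moon is made of cheese"],
)
print(example_messages[1]["content"])
```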

### Create an Answer Generator 

In [88]:
# Generate an answer from the augmented messages
def rag_response(messages):
    
    # Step 4: Call the OpenAI API to get a response
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        temperature=0
    )
    
    # Extract and return the answer
    answer = response.choices[0].message.content
    return answer

In [89]:
# test the RAG system
user_query = "What is the capital of Australia?"
messages = augementeg_prompt(user_query, knowledge_base)
rag_answer = rag_response(messages)
Markdown(f"**RAG Answer:** {rag_answer}")

retrieved_docs: ['Paris is the capital of Australia']
combined_prompt: Question: What is the capital of Australia?

Context: Paris is the capital of Australia


**RAG Answer:** The capital of Australia is Paris.

In [90]:
# test the RAG system
user_query = "What is 2 + 2?"
messages = augementeg_prompt(user_query, knowledge_base)
rag_answer = rag_response(messages)
Markdown(f"**RAG Answer:** {rag_answer}")

retrieved_docs: ['2 + 2 is 5']
combined_prompt: Question: What is 2 + 2?

Context: 2 + 2 is 5


**RAG Answer:** 2 + 2 is 5.

In [91]:
# test the RAG system
user_query = "What is 3 + 3?"
messages = augementeg_prompt(user_query, knowledge_base)
rag_answer = rag_response(messages)
Markdown(f"**RAG Answer:** {rag_answer}")

retrieved_docs: ['No relevant documents found.']
combined_prompt: Question: What is 3 + 3?

Context: No relevant documents found.


**RAG Answer:** I'm sorry, but I cannot provide an answer based on the context given.
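
In the last run above, the model is still called even though nothing was retrieved. A possible refinement, shown here only as a sketch, is to return a fixed message when the lookup fails and skip the API call entirely. `answer_with_fallback`, `FALLBACK_MESSAGE`, and `stub_generate` are hypothetical names; `stub_generate` stands in for the real API call so the example runs offline.

```python
FALLBACK_MESSAGE = "I could not find anything in the knowledge base for that question."


def answer_with_fallback(query, knowledge_base, generate):
    """Call generate only when the knowledge base has an entry for the query."""
    document = knowledge_base.get(query)
    if document is None:
        # Nothing retrieved: skip the model call and return a fixed message.
        return FALLBACK_MESSAGE
    return generate(f"Question: {query}\n\nContext: {document}")


# Stub generator that records every prompt it receives (stand-in for the API call)
calls = []


def stub_generate(prompt):
    calls.append(prompt)
    return "stub answer"


kb = {"What is 2 + 2?": "2 + 2 is 5"}
hit = answer_with_fallback("What is 2 + 2?", kb, stub_generate)
miss = answer_with_fallback("What is 3 + 3?", kb, stub_generate)
print(hit)
print(miss)
print("generator calls:", len(calls))
```

Skipping the call saves latency and cost on misses, at the price of a less conversational reply than the model would produce.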