<a href="https://colab.research.google.com/github/darthgera123/RAG-Agents/blob/main/AI_Barista_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AI Barista
In this example we build an AI agent that produces detailed instructions for making coffee. We create a toy database of ingredients with a few attributes, and the agent matches the user's text input against that database.

We use the OpenAI API to interpret queries. We embed the toy database and the query in the same vector space, retrieve the closest entries, and then prompt the LLM to answer the request. The objective is to use the LLM's reasoning capabilities to work from the provided database when answering the user query.


# Setup Codebase

## Install Libraries

In [1]:
!pip install openai fuzzywuzzy[speedup]



## Import Libraries and Add API Keys

In [2]:
import numpy as np
import json
import openai
import os
from google.colab import userdata
from sklearn.metrics.pairwise import cosine_similarity

openai.api_key = userdata.get('OPEN_AI_API_KEY')
os.environ['OPENAI_API_KEY'] = userdata.get('OPEN_AI_API_KEY')
client = openai.Client()

# Create Database
We create a simple list of ingredients, each with a name and a short description of its properties. The properties guide retrieval toward the ingredient the user is looking for. We include several variants of syrups and milks.

In [3]:
# List of ingredients with their properties
ingredients = [
    {"name": "caramel syrup", "properties": "sweet and rich"},
    {"name": "raspberry syrup", "properties": "fruity and tart"},
    {"name": "cinnamon", "properties": "warm and spicy"},
    {"name": "lemon zest", "properties": "citrusy and tangy"},
    {"name": "vanilla syrup", "properties": "sweet and creamy"},
    {"name": "whole milk", "properties": "rich and creamy"},
    {"name": "skim milk", "properties": "light and low-fat"},
    {"name": "almond milk", "properties": "nutty and dairy-free"},
    {"name": "oat milk", "properties": "creamy and dairy-free"},
    {"name": "soy milk", "properties": "smooth and dairy-free"}
]




We use OpenAI's API to embed each ingredient's properties; these embeddings are compared against the embedded query later on.

In [None]:
def get_embedding(text):
    """
    Get the embedding for a given text using OpenAI's API.

    Args:
        text (str): The text for which to get the embedding.

    Returns:
        list: The embedding for the given text.
    """
    response = client.embeddings.create(
        input=text,
        model="text-embedding-ada-002"
    )
    return response.data[0].embedding

# Generate embeddings for each ingredient's properties
for ingredient in ingredients:
    ingredient['embedding'] = get_embedding(ingredient['properties'])

# Convert the ingredients list to a dictionary for easier access
ingredients_db = {ingredient['name']: ingredient for ingredient in ingredients}

# Save the database to a file
with open('/content/ingredients_db.json', 'w') as f:
    json.dump(ingredients_db, f)
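
Saving the database means the embeddings do not have to be recomputed in a later session. The round trip can be sketched as follows, assuming a toy database with 3-dimensional vectors in place of real embeddings and a temporary directory in place of `/content`; `save_db`, `load_db`, and `toy_db` are hypothetical names used only for this illustration.

```python
import json
import os
import tempfile

# Toy stand-in for ingredients_db: short vectors instead of real embeddings
toy_db = {
    "caramel syrup": {"name": "caramel syrup", "properties": "sweet and rich",
                      "embedding": [0.9, 0.1, 0.0]},
    "almond milk": {"name": "almond milk", "properties": "nutty and dairy-free",
                    "embedding": [0.1, 0.8, 0.3]},
}

def save_db(db, path):
    """Write the ingredient database to a JSON file."""
    with open(path, "w") as f:
        json.dump(db, f)

def load_db(path):
    """Read the ingredient database back from a JSON file."""
    with open(path) as f:
        return json.load(f)

# Hypothetical location; the notebook itself writes to /content/ingredients_db.json
db_path = os.path.join(tempfile.mkdtemp(), "ingredients_db.json")
save_db(toy_db, db_path)
reloaded_db = load_db(db_path)
print(sorted(reloaded_db))
```

Because each embedding is stored as a plain list of floats, the reloaded dictionary can be passed to the search function in the same way as the freshly built one.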

# Prompt Engineering
To leverage the reasoning capabilities of the LLM, we need to define the prompts carefully.

## User Prompt
This is the prompt provided by the user. If no prompt is provided, we fall back to a basic predictive model that returns the coffee order.

In [74]:
def default_request():
    """
    Get user input for default request.

    Returns:
        str: Output of the predictive model trained
    """
    print("Enter details")
    age = input("Enter your age: ")
    country = input("Enter your country: ")
    occupation = input("Enter your occupation: ")
    gender = input("Enter your gender: ")
    temperature = input("Enter your preferred temperature: ")
    # TODO: return the order predicted by the model from part 1 (not included here).
    # Until that is added, this function returns None.

def get_user_input():
    """
    Get user input for the drink request.

    Returns:
        str: The user's drink request, or the default request if the input is empty
    """
    user_input = input("Enter your drink request: ")
    if not user_input:
        user_input = default_request()
    return user_input

## Instruction Prompt
This is the system-level prompt that precedes the user prompt and tells the LLM how we want the output structured. In this case we want a very specific JSON format that can be passed on to backend APIs.

We also add some guardrails to the prompt so that the LLM does not hallucinate odd details.

In [75]:
def create_instruction_prompt():
    """
    Create the instruction prompt for the LLM.

    Returns:
        str: The instruction prompt.
    """

    prompt = f"""
You are a helpful assistant that generates coffee drink recipes. Please provide a recipe in JSON format with the following structure and strict guidelines:

{{
  "drink": {{
    "name": "string",            // Name of the drink
    "size": "string",            // Size of the drink (e.g., "8oz", "12oz", "16oz")
    "ingredients": [
      {{
        "name": "string",        // Name of the ingredient (e.g., "espresso", "caramel syrup", "whole milk","water")
        "quantity": number,      // Quantity of the ingredient
        "unit": "string"         // Unit of the ingredient (e.g., "shots", "ml")
      }}
      // Additional ingredients in the same format
    ]
  }}
}}

Guidelines:
1. The "name" field must always be a string describing the drink name.
2. The "size" field must always be a string representing the drink size.
3. The "ingredients" array must contain objects with "name", "quantity", and "unit" fields.
4. The "name" field in ingredients must be a string.
5. The "quantity" field in ingredients must be a number.
6. The "unit" field in ingredients must be a string.
7. Do not include any additional fields or change the structure of the JSON.
8. Ensure that all ingredient names are valid and expected.
9. Check that all quantities are within very strict limits:
   - For "shots": 1 to 5
   - For "ml": 5 to 200
10. If there is an error in the input or format, provide an error message describing the issue.
If the input or format is incorrect, the response should be:
{{ "error": "Description of the error" }}
"""
    return prompt
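
The prompt above is an f-string, which is why the JSON braces are doubled: `{{` and `}}` render as literal braces, while single braces would be treated as placeholders. A minimal sketch of that behaviour, using a hypothetical helper `render_error_template` that is not part of the notebook:

```python
import json

def render_error_template(field_name):
    # Doubled braces become literal braces; {field_name} is interpolated
    return f'{{ "{field_name}": "Description of the error" }}'

rendered = render_error_template("error")
print(rendered)

# The rendered text is valid JSON, so it can be parsed directly
parsed = json.loads(rendered)
```

Since `create_instruction_prompt` interpolates nothing, a plain triple-quoted string with single braces would produce the same prompt text.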

## Assistant Prompt
This prompt serves the retrieval step of RAG: it gives the LLM a context to draw on when producing its output. Here the context is the list of ingredients retrieved from our database, and we ask the LLM to use only those.

In [77]:
def create_assistant_prompt(retrieved_ingredients):
    """
    Create the assistant prompt for the LLM.

    Args:
        retrieved_ingredients (list): List of retrieved ingredients.

    Returns:
        str: The assistant prompt.
    """
    ingredients_list = ', '.join([item[0] for item in retrieved_ingredients])
    return f"""Define the recipe by only using ingredients from {ingredients_list}"""

# LLM Response
Initialize the model and other parameters to generate text.

In [78]:
def get_llm_response(instruction_prompt,prompt,assistant_prompt=''):
    """
    Get the LLM response.

    Args:
        instruction_prompt (str): Instruction prompt.
        prompt (str): User prompt.
        assistant_prompt (str): Assistant prompt.

    Returns:
        str: LLM response.
    """
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": instruction_prompt},
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": assistant_prompt}
        ],
        max_tokens=350,
        n=1,
        stop=None,
        temperature=0.5)
    return response.choices[0].message.content.strip()
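
The string returned here is later handed to `json.loads`, which fails if the model wraps the JSON in a Markdown code fence or adds a sentence around it. One possible pre-processing step is sketched below; `extract_json_block` is a hypothetical helper, not part of the notebook, and it assumes the reply contains at most one JSON object.

```python
import json
import re

# Built at runtime so this example does not contain a literal Markdown fence
FENCE = "`" * 3

def extract_json_block(reply):
    """Hypothetical helper: parse the JSON object embedded in an LLM reply."""
    # If the reply contains a fenced block (optionally tagged json), keep its body
    pattern = FENCE + r"(?:json)?\s*(.*?)" + FENCE
    fenced = re.search(pattern, reply, flags=re.DOTALL)
    text = fenced.group(1) if fenced else reply
    # Keep only the span from the first "{" to the last "}"
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end == -1 or end < start:
        raise ValueError("No JSON object found in reply")
    return json.loads(text[start:end + 1])

wrapped_reply = "Here is your recipe:\n" + FENCE + 'json\n{"drink": {"name": "Latte"}}\n' + FENCE
recipe = extract_json_block(wrapped_reply)
print(recipe["drink"]["name"])
```

This does not remove `//` comments inside the JSON; a reply that copies the comments from the schema would still fail to parse and surface as an error.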

# Semantic Search
We define a function that embeds the query and returns the top k database entries closest to it in the embedding space. Closeness is measured with cosine similarity.

In [79]:
def semantic_search(query, ingredients_db, top_k=3):
    """
    Semantic search for ingredients in the database.

    Args:
        query (str): User query.
        ingredients_db (dict): Database of ingredients.
        top_k (int): Number of top results to return.

    Returns:
        list: List of top results.
    """

    query_embedding = get_embedding(query)
    similarities = []

    for ingredient in ingredients_db.values():
        similarity = cosine_similarity([query_embedding], [ingredient['embedding']])
        similarities.append((ingredient['name'], similarity[0][0]))

    similarities.sort(key=lambda x: x[1], reverse=True)
    return similarities[:top_k]
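
The ranking logic can be checked without calling the embedding API. The sketch below assumes hand-written 3-dimensional vectors in place of real embeddings and computes cosine similarity with the standard library rather than scikit-learn; `cosine`, `top_k_matches`, and `toy_embeddings` are hypothetical names for this illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors: each ingredient points along its own axis
toy_embeddings = {
    "caramel syrup": [1.0, 0.0, 0.0],
    "lemon zest": [0.0, 1.0, 0.0],
    "almond milk": [0.0, 0.0, 1.0],
}

def top_k_matches(query_vec, embeddings, top_k=2):
    """Score every entry against the query and keep the best top_k."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# A query vector that leans mostly toward "lemon zest", then "almond milk"
matches = top_k_matches([0.2, 0.9, 0.4], toy_embeddings)
print([name for name, _score in matches])
```

With real embeddings the vectors have many more dimensions, but the sort-and-slice step is the same as in `semantic_search`.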

# Define RAG Agent
This function works in 3 steps:

1.   Perform semantic search and retrieve the relevant entries from the database.
2.   Generate the instruction and assistant prompts.
3.   Pass the prompts to the LLM.

It returns the LLM's JSON response.


In [80]:
def rag_agent(user_request):
    """
    RAG agent for generating drink recipes.

    Args:
        user_request (str): User request.

    Returns:
        str: LLM JSON response.
    """

    retrieved_ingredients = semantic_search(user_request, ingredients_db)
    # print(retrieved_ingredients)

    instruction_prompt = create_instruction_prompt()
    assistant_prompt = create_assistant_prompt(retrieved_ingredients)
    prompt = user_request

    response = get_llm_response(instruction_prompt,prompt,assistant_prompt)

    return response

In [81]:
# user_request = "I want a medium caramel latte with a sour punch and nutty milk"
# response = rag_agent(user_request)
# print("RAG Agent Response:", response)
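
The three steps can also be exercised offline by swapping in stand-ins for retrieval and for the LLM call. This is only a sketch under stated assumptions: `fake_search`, `fake_llm`, and `run_pipeline` are hypothetical names, the canned scores and recipe are made up for illustration, and the system prompt is shortened.

```python
import json

def fake_search(query, top_k=3):
    """Stand-in for semantic_search: returns fixed (name, score) pairs."""
    canned = [("caramel syrup", 0.91), ("almond milk", 0.88), ("lemon zest", 0.80)]
    return canned[:top_k]

def fake_llm(system_prompt, user_prompt, context_prompt):
    """Stand-in for get_llm_response: returns a canned JSON recipe string."""
    recipe = {
        "drink": {
            "name": "Caramel Latte",
            "size": "12oz",
            "ingredients": [
                {"name": "espresso", "quantity": 2, "unit": "shots"},
                {"name": "almond milk", "quantity": 150, "unit": "ml"},
            ],
        }
    }
    return json.dumps(recipe)

def run_pipeline(user_request, search=fake_search, llm=fake_llm):
    # Step 1: retrieve candidate ingredients for the request
    retrieved = search(user_request)
    # Step 2: build the prompts (shortened here for illustration)
    system_prompt = "Return the recipe as JSON."
    names = ", ".join(name for name, _score in retrieved)
    context_prompt = f"Define the recipe by only using ingredients from {names}"
    # Step 3: hand all three prompts to the model
    return llm(system_prompt, user_request, context_prompt)

response = run_pipeline("I want a medium caramel latte with nutty milk")
print(json.loads(response)["drink"]["name"])
```

Passing the search and LLM functions as arguments keeps the orchestration testable without an API key.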

# Parse Response
The frontend now parses the JSON response and formats it for the user. We also do some error handling: ingredient names are checked with a fuzzy search, invalid responses are reported as errors, and out-of-range measurements are not passed forward. More checks can be added.

In [82]:
# List of ingredients available
valid_ingredients = {
    "espresso": "shots",
    "caramel syrup": "ml",
    "raspberry syrup": "ml",
    "cinnamon": "ml",
    "lemon zest": "ml",  # matches the "lemon zest" entry in the ingredient database
    "vanilla syrup": "ml",
    "whole milk": "ml",
    "skim milk": "ml",
    "almond milk": "ml",
    "oat milk": "ml",
    "soy milk": "ml",
    "water": "ml"
}

# Set reasonable quantity limits
quantity_limits = {
    "shots": (1, 5),  # 1 to 5 shots of espresso
    "ml": (1, 300)  # 1ml to 300ml for syrups and milk
}

from fuzzywuzzy import process

def get_closest_match(name, valid_names):
    """Return the valid name closest to `name`, or None if the best score is below 80."""
    match, score = process.extractOne(name, valid_names)
    return match if score >= 80 else None  # Adjust the threshold as needed

def validate_and_pretty_print_recipe(json_response):
    """
    Validate and pretty print the recipe.

    Args:
        json_response (str): LLM JSON response.

    Returns:
        str: Pretty-printed recipe.
    """
    try:
        recipe = json.loads(json_response)

        if "error" in recipe:
            return f"Error in generating drink recipe: {recipe['error']}"

        pretty_components = []

        drink_name = recipe['drink'].get('name', 'Unknown Drink')

        drink_size = recipe['drink'].get('size', 'Unknown Size')

        pretty_components.append(f"Drink: {drink_name}")
        pretty_components.append(f"Size: {drink_size}")

        pretty_components.append("Ingredients:")
        for ingredient in recipe['drink']['ingredients']:
            name = ingredient.get('name', 'Unknown Ingredient')
            quantity = ingredient.get('quantity', 0)
            unit = ingredient.get('unit', '')

            # Map the LLM's ingredient name onto the closest known ingredient
            closest_match = get_closest_match(name, valid_ingredients.keys())
            if not closest_match:
                raise ValueError(f"Invalid ingredient: {name}")

            # Look up the unit by the matched name; the raw name may be misspelled
            if unit != valid_ingredients[closest_match]:
                raise ValueError(f"Invalid unit for {closest_match}: {unit}")

            min_qty, max_qty = quantity_limits[unit]
            if not (min_qty <= quantity <= max_qty):
                raise ValueError(f"Quantity of {closest_match} is out of bounds: {quantity} {unit}")

            pretty_components.append(f"  - {quantity} {unit} of {closest_match}")

        pretty_recipe = "\n".join(pretty_components)
        return pretty_recipe

    # TypeError covers non-numeric quantities or an unexpected JSON shape
    except (json.JSONDecodeError, KeyError, TypeError, ValueError) as e:
        return f"Error in generating drink recipe: {str(e)}"
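
The name-repair idea behind `get_closest_match` can also be tried without installing fuzzywuzzy. The sketch below swaps in the standard library's `difflib` as a substitute matcher; it is not the notebook's matcher, its scores are on a 0 to 1 scale rather than fuzzywuzzy's 0 to 100, and `closest_name` and `known_names` are hypothetical names.

```python
import difflib

known_names = ["espresso", "caramel syrup", "vanilla syrup", "whole milk", "oat milk"]

def closest_name(name, choices, cutoff=0.8):
    """Return the best match at or above the cutoff, or None if nothing qualifies."""
    matches = difflib.get_close_matches(name.lower(), choices, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# A small misspelling is repaired; an unrelated ingredient is rejected
repaired = closest_name("carmel syrup", known_names)
rejected = closest_name("matcha", known_names)
print(repaired, rejected)
```

Whichever matcher is used, the matched name rather than the raw name should drive the unit and quantity checks, so that a repaired spelling is validated against the right entry.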

# Order your drink

In [85]:
# @title Dear customer, what would you like to have today?
import ipywidgets as widgets
from IPython.display import display
input_text = widgets.Text(
    value='',
    placeholder='Enter your input',
    description='Input:',
    disabled=False
)
output = widgets.Output()

def on_button_click(b):
    with output:
        output.clear_output()
        user_input = input_text.value
        response = rag_agent(user_input)
        pretty_recipe = validate_and_pretty_print_recipe(response)
        print(pretty_recipe)

# Create button widget
button = widgets.Button(description="Submit")
button.on_click(on_button_click)

# Display widgets
display(input_text, button, output)


# Takeaways
We combine the GPT-4 LLM with our toy dataset to create a RAG agent that can reason about natural-language queries. Although we have applied several safeguards, it still has some issues.


*   Queries such as "a dash of cinnamon" are hard to quantify.
*   Some properties, such as "low-fat", might not translate to the right type of milk.
*   There is little control over exact quantities, and no way to tell whether the proposed recipe is actually good.
*   There is a tradeoff between runtime speed and accuracy when choosing between the gpt-3.5 and gpt-4 models. For recipes, accuracy is more crucial, and since our dataset is quite small we use the gpt-4 model for now. With a more descriptive dataset, a lightweight model could be used, since the task would then rely more on parsing the provided context than on details the model might hallucinate.

