## Setup

In [1]:
# Import the userdata module from Google Colab
from google.colab import userdata

# Retrieve OpenAI API key
api_key = userdata.get('openai_api')

In [2]:
# Mount the drive
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [3]:
# Change directory to project folder
%cd /content/drive/MyDrive/RAG\ Course

/content/drive/MyDrive/RAG Course


## Install Libraries

In [4]:
# Install the pdf2image library for converting PDF files to images
!pip install pdf2image

# Install the poppler-utils package, required by pdf2image to work with PDF files
!apt-get install poppler-utils

# Install the openAI library
!pip install openai

Collecting pdf2image
  Downloading pdf2image-1.17.0-py3-none-any.whl.metadata (6.2 kB)
Downloading pdf2image-1.17.0-py3-none-any.whl (11 kB)
Installing collected packages: pdf2image
Successfully installed pdf2image-1.17.0
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  poppler-utils
0 upgraded, 1 newly installed, 0 to remove and 35 not upgraded.
Need to get 186 kB of archives.
After this operation, 697 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 poppler-utils amd64 22.02.0-2ubuntu0.10 [186 kB]
Fetched 186 kB in 1s (235 kB/s)
Selecting previously unselected package poppler-utils.
(Reading database ... 126371 files and directories currently installed.)
Preparing to unpack .../poppler-utils_22.02.0-2ubuntu0.10_amd64.deb ...
Unpacking poppler-utils (22.02.0-2ubuntu0.10) ...
Setting up poppler-utils (22.02.0-2ubuntu0.10) ...
Processin

## Import Libraries

In [5]:
from pdf2image import convert_from_path
from IPython.display import Markdown, display
from openai import OpenAI
import base64
import os
import json

## Convert the PDF Pages to Images

In [6]:
# Convert a PDF into one image per page and return the image paths
def pdf_to_images(pdf_path, output_folder):
  # Create the output folder if it doesn't exist
  if not os.path.exists(output_folder):
    os.makedirs(output_folder)

  # Convert PDF into images
  images = convert_from_path(pdf_path)

  # Save images and store their paths
  image_paths = []

  for i, image in enumerate(images):
    image_path = os.path.join(output_folder, f"page{i+1}.jpg") # Generate the image file path
    image.save(image_path, "JPEG") # Save the image as a JPEG file
    image_paths.append(image_path) # Append the image path to the list

  return image_paths # Return the list of image paths

In [7]:
# Define the path to the PDF and the output folder for images
pdf_path = "recipe_book.pdf"
output_folder = "images"

# Convert the PDF into images and store the image paths
image_paths = pdf_to_images(pdf_path, output_folder)

## Set Up OpenAI and The Model We Are Using

In [8]:
client = OpenAI(api_key=api_key)
model = "gpt-4o-mini"

### Test how things work on one image - Extract recipe info

In [9]:
# Read and encode one image
image_path = "images/page23.jpg" # Path to the image to be encoded

# Encode the image in base64 and decode to string
with open(image_path, "rb") as image_file:
  image_data = base64.b64encode(image_file.read()).decode('utf-8')
image_data

'/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCAU2A0IDASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD3+iiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAo
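The encoding step above can be wrapped in a small helper that also builds the `data:` URL used in the chat request. This is a minimal sketch and not part of the original notebook: `bytes_to_data_url` and `image_to_data_url` are hypothetical names, and the MIME type is guessed from the file extension with a JPEG fallback.

```python
import base64
import mimetypes


def bytes_to_data_url(image_bytes, mime_type="image/jpeg"):
  """Encode raw image bytes as a base64 data URL."""
  encoded = base64.b64encode(image_bytes).decode("utf-8")
  return f"data:{mime_type};base64,{encoded}"


def image_to_data_url(image_path):
  """Read an image file and return it as a data URL (hypothetical helper)."""
  # Guess the MIME type from the file extension; fall back to JPEG
  mime_type, _ = mimetypes.guess_type(image_path)
  with open(image_path, "rb") as image_file:
    return bytes_to_data_url(image_file.read(), mime_type or "image/jpeg")
```

With this in place, `image_to_data_url("images/page23.jpg")` would return the full string to pass as the `url` value, so the long base64 payload never needs to be echoed in a cell output.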

In [10]:
# Extract the recipe from the image

# Define the system prompt
system_prompt = """
Please analyze the content of this image and extract any related recipe information.
"""

# Call the OpenAI API using the chat completions method
response = client.chat.completions.create(
    model = model,
    messages = [
        # Provide the system prompt
        {"role": "system", "content": system_prompt},

        # The user message contains both a text part and an image part (base64 data URL)
        {"role": "user", "content": [
            {"type": "text", "text": "This is the image from the recipe page."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_data}",
                           "detail": "low"}}
        ]}
    ]
)

# Retrieve the content
gpt_response = response.choices[0].message.content

# Display the GPT response as Markdown
display(Markdown(gpt_response))

Based on the content of the image you provided, here are two recipes:

### Bannocks
**Ingredients:**
- 1 Cupful of Thick Sour Milk
- ½ Cupful of Sugar
- 1 Cupful of Flour
- ½ Cupful of Indian Meal
- 1 Teaspoonful of Soda
- A pinch of Salt

**Instructions:**
1. Make the mixture and drop from a spoon.
2. Drop mixture, size of a walnut, into boiling fat.
3. Serve warm with maple syrup.

### Boston Brown Bread
**Ingredients:**
- 1 Cupful of Rye Meal
- 1 Cupful of Sour Milk
- 1 Cupful of Graham Meal
- 1 Cupful of Molasses
- 1 Cupful of Flour
- ½ Teaspoonful of Salt
- 1 Heaping Teaspoonful of Soda
- 1 Cupful of Sweet Milk

**Instructions:**
1. Stir the meals and salt together.
2. Beat the soda into the molasses until it foams; add sour milk, mix thoroughly, and pour into a tin with a lid.
3. Steam for three hours; then pour into a pan. 

Feel free to ask if you need any more assistance!
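The request above mixes a text part and an image part in one user message. The sketch below shows one way to assemble that payload with explicitly typed content parts (`{"type": "text", ...}` and `{"type": "image_url", ...}`). `build_vision_messages` is a hypothetical helper, and the data URL in the usage lines is a placeholder, not a real image.

```python
def build_vision_messages(system_prompt, user_text, image_data_url, detail="low"):
  """Assemble the messages list for one image (hypothetical helper)."""
  return [
      {"role": "system", "content": system_prompt},
      {"role": "user", "content": [
          # Text part: short context for the image
          {"type": "text", "text": user_text},
          # Image part: base64 data URL plus the requested detail level
          {"type": "image_url",
           "image_url": {"url": image_data_url, "detail": detail}},
      ]},
  ]


messages = build_vision_messages(
    "Extract any recipe information from this image.",
    "This is the image from the recipe page.",
    "data:image/jpeg;base64,AAAA",  # Placeholder payload, not a real image
)
```

The resulting list could be passed as `messages=messages` to `client.chat.completions.create`. Keeping the construction in one function means the single-image test and the full loop send the same payload shape.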

## Let's Continue On All the Data - Extract all recipe information

In [11]:
# Define improved system prompt
system_prompt = """
Please analyze the content of this image and extract any related recipe information into structured components.
Specifically, extract the recipe title, list of ingredients, step-by-step instructions, cuisine type, dish type, and any relevant tags or metadata.
The output must be formatted in a way suited for embedding in a Retrieval Augmented Generation (RAG) system.
"""

In [12]:
# Extract information about all of the images/recipes
extracted_recipes = []

for image_path in image_paths:
  print(f"Processing image {image_path}")

  # Read and encode the image
  with open(image_path, "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode("utf-8") # Encode the image to base64 format

  # Call the API to extract the information
  response = client.chat.completions.create(
      model = model,
      messages = [
          # Provide system prompt for guidance
          {"role": "system", "content": system_prompt},

          # The user message contains both a text part and an image part (base64 data URL)
          {"role": "user", "content": [
              {"type": "text", "text": "This is the image from the recipe page"}, # Context for the image
              {"type": "image_url",
               "image_url": {"url": f"data:image/jpeg;base64,{image_data}", # Provide the base64 image
                             "detail": "low"}}
          ]}
      ],
      temperature = 0, # Set the temperature to 0 to make the output as repeatable as possible
  )

  # Extract the content and store it
  gpt_response = response.choices[0].message.content # Get the response content
  extracted_recipes.append({"image_path": image_path, "recipe_info": gpt_response}) # Store the path and extracted info
  print(f"Extracted information for {image_path}:\n{gpt_response}\n") # Print the extracted information for review

Streaming output truncated to the last 5000 lines.
1. Stir the soda into the molasses until it foams.
2. Add sour milk, ginger, salt, and melted butter.
3. Last, add flour enough for quite a stiff batter, and bake. This makes one sheet.

### Cuisine Type
American

### Dish Type
Dessert

### Relevant Tags
- Gingerbread
- Baking
- Traditional Recipes

---

### Recipe Title
Huckleberry Cake

### Ingredients
- 1 Cup of Fresh Huckleberries
- 1 Pint of Graham Meal
- 1 Cup of Sour Milk
- 1 Teaspoon of Soda
- ½ Teaspoon of Salt
- Butter (for serving)

### Instructions
1. Pick over and wash and flour one cupful of fresh huckleberries.
2. Add these to the batter and stir together.
3. Serve hot with butter.

### Cuisine Type
American

### Dish Type
Dessert

### Relevant Tags
- Huckleberry
- Cake
- Baking
- Traditional Recipes

---

### Recipe Title
Quick Graham Bread

### Ingredients
- 1 Pint of Graham Meal
- 1 Cup of Sour Milk
- 1 Teaspoon of Soda
- ½ Teaspoon of Salt
- Butter (for
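The extracted text follows a loose pattern of `### ` headings (Recipe Title, Ingredients, Instructions, and so on). If structured fields are needed later, a small parser can split one recipe into sections. This is a sketch under assumptions: `split_recipe_sections` is a hypothetical helper, it expects a single recipe per input, and the model is not guaranteed to use these headings consistently. A page holding several recipes separated by `---` would need to be split first.

```python
def split_recipe_sections(recipe_info):
  """Split '### Heading' blocks into a {heading: [lines]} dict (hypothetical helper)."""
  sections = {}
  current = None
  for line in recipe_info.splitlines():
    stripped = line.strip()
    if stripped.startswith("### "):
      # Start a new section keyed by the heading text
      current = stripped[4:].strip()
      sections[current] = []
    elif stripped and stripped != "---" and current is not None:
      # Keep non-empty lines that belong to the current section
      sections[current].append(stripped)
  return sections


sample = """### Recipe Title
Huckleberry Cake

### Ingredients
- 1 Cup of Fresh Huckleberries
- 1 Pint of Graham Meal

### Dish Type
Dessert
"""
sections = split_recipe_sections(sample)
print(sections["Recipe Title"])  # ['Huckleberry Cake']
```

Sub-headings such as `#### Bannocks` are kept as ordinary lines inside their parent section, because only lines starting with exactly `### ` open a new section.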

In [13]:
# Filter out non-recipe content based on key recipe-related terms
filtered_recipes = []

for recipe in extracted_recipes:
  # Check if the extracted content contains any key recipe-related terms
  if any(keyword in recipe["recipe_info"].lower() for keyword in ["ingredients",
                                                                  "instructions",
                                                                  "recipe title"]):
    # If it does, add it to the filtered list
    filtered_recipes.append(recipe)

  # Print a message for non-recipe content
  else:
    print(f"Skipping recipe: {recipe['image_path']}")

Skipping recipe: images/page2.jpg
Skipping recipe: images/page3.jpg
Skipping recipe: images/page4.jpg
Skipping recipe: images/page5.jpg
Skipping recipe: images/page7.jpg
Skipping recipe: images/page8.jpg
Skipping recipe: images/page10.jpg
Skipping recipe: images/page12.jpg
Skipping recipe: images/page20.jpg
Skipping recipe: images/page22.jpg
Skipping recipe: images/page106.jpg
Skipping recipe: images/page107.jpg
Skipping recipe: images/page108.jpg
Skipping recipe: images/page113.jpg
Skipping recipe: images/page133.jpg
Skipping recipe: images/page134.jpg
Skipping recipe: images/page135.jpg
Skipping recipe: images/page136.jpg


In [14]:
# Define the output file path
output_file = "recipes_from_recipe_book.json"

# Write the filtered list to a json file
with open(output_file, "w") as json_file:
  json.dump(filtered_recipes, json_file, indent = 2)
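The recipe text contains non-ASCII characters such as `½`. By default `json.dump` writes them as `\uXXXX` escapes, which round-trips correctly but is harder to read when inspecting the file. The sketch below compares the default with `ensure_ascii=False`. It uses `json.dumps` on a made-up record purely for illustration.

```python
import json

# Made-up record in the same shape as the entries in filtered_recipes
record = {"image_path": "images/page23.jpg", "recipe_info": "- ½ Cupful of Sugar"}

# Default: non-ASCII characters are written as \uXXXX escapes
escaped = json.dumps(record)

# ensure_ascii=False keeps characters such as ½ readable in the output;
# pair it with encoding="utf-8" when opening the file for writing
readable = json.dumps(record, ensure_ascii=False)

# Both forms load back to the same data
assert json.loads(escaped) == json.loads(readable) == record
```

Either form is valid JSON, so this is a readability choice rather than a correctness fix.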

## Embeddings

In [15]:
# Import libraries
import numpy as np

In [16]:
# Load the filtered recipes
with open("recipes_from_recipe_book.json", "r") as json_file:
  filtered_recipes = json.load(json_file)

In [17]:
# Generate embeddings for each recipe
recipe_texts = [recipe["recipe_info"] for recipe in filtered_recipes] # Extract the text content of each recipe

# Call the API to generate embeddings for the recipe texts
embedding_response = client.embeddings.create(
    input = recipe_texts,
    model = "text-embedding-3-large"
)

# Extract the embeddings
embeddings = [data.embedding for data in embedding_response.data]
embeddings


In [18]:
# Convert the embeddings to numpy array
embedding_matrix = np.array(embeddings)
embedding_matrix

array([[-0.00760287, -0.02689722, -0.01835082, ...,  0.00456719,
        -0.02565286,  0.00255708],
       [-0.01817757, -0.03411745, -0.02016824, ..., -0.0017264 ,
        -0.02522483,  0.00685836],
       [-0.01832132, -0.03103881, -0.0218346 , ..., -0.00241356,
        -0.02523173,  0.00612283],
       ...,
       [-0.01796571, -0.03075307, -0.01259109, ...,  0.0044801 ,
        -0.00267787,  0.00154841],
       [-0.01096358, -0.02817693, -0.01316821, ...,  0.00346915,
         0.00022055, -0.00669666],
       [-0.02556907, -0.03703867, -0.01312876, ..., -0.00506356,
        -0.01241277,  0.01063657]])

In [19]:
# Verify the embedding matrix
print(f"Generated embeddings for {len(filtered_recipes)} recipes.")
print(f"Each embedding is of size {len(embeddings[0])}")

Generated embeddings for 118 recipes.
Each embedding is of size 3072
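The embeddings above were requested in a single call with all recipe texts. For a larger collection it may be safer to send the texts in batches, since the API limits how much one request can carry. The following is a minimal sketch: `batched` and `embed_in_batches` are hypothetical helpers, the batch size of 64 is arbitrary, and `fake_embed` is a stand-in so the example runs without network access.

```python
def batched(items, batch_size):
  """Yield consecutive slices of `items` with at most `batch_size` elements."""
  if batch_size < 1:
    raise ValueError("batch_size must be at least 1")
  for start in range(0, len(items), batch_size):
    yield items[start:start + batch_size]


def embed_in_batches(texts, embed_batch, batch_size=64):
  """Embed `texts` batch by batch; `embed_batch` maps a list of texts to a list of vectors."""
  embeddings = []
  for batch in batched(texts, batch_size):
    embeddings.extend(embed_batch(batch))
  return embeddings


# Stand-in embedder for illustration only: one number per text (its length)
def fake_embed(batch):
  return [[float(len(text))] for text in batch]


vectors = embed_in_batches(["a", "bb", "ccc"], fake_embed, batch_size=2)
```

In the notebook, `embed_batch` would wrap `client.embeddings.create(input=batch, model="text-embedding-3-large")` and return `[data.embedding for data in response.data]`, so the order of the vectors still matches the order of `recipe_texts`.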


## Retrieval System

In [20]:
# Install the faiss-cpu library
!pip install faiss-cpu

Collecting faiss-cpu
  Downloading faiss_cpu-1.12.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (5.1 kB)
Downloading faiss_cpu-1.12.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (31.4 MB)
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.12.0


In [21]:
# Import the faiss library
import faiss

In [22]:
# Print the embedding matrix shape
print(f"Embedding matrix shape: {embedding_matrix.shape}")

Embedding matrix shape: (118, 3072)


In [23]:
# Initialize the FAISS index for similarity search
index = faiss.IndexFlatL2(embedding_matrix.shape[1]) # Create a FAISS index with L2 distance metric
index.add(embedding_matrix) # Add the embeddings to the index
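`IndexFlatL2` performs an exhaustive search and ranks results by squared Euclidean distance. The pure-Python sketch below mirrors that behavior on tiny vectors to make the returned `distances` and `indices` easier to interpret. `flat_l2_search` is a hypothetical illustration, not how FAISS is implemented internally.

```python
def flat_l2_search(vectors, query, k):
  """Return (squared_distances, indices) of the k nearest vectors, closest first."""
  scored = []
  for i, vector in enumerate(vectors):
    # Squared Euclidean distance, the score a flat L2 index ranks by
    dist = sum((a - b) ** 2 for a, b in zip(vector, query))
    scored.append((dist, i))
  scored.sort()
  top = scored[:k]
  return [d for d, _ in top], [i for _, i in top]


vectors = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
distances, indices = flat_l2_search(vectors, [1.0, 1.0], k=2)
print(distances, indices)  # [1.0, 2.0] [1, 0]
```

Smaller distances mean closer matches, and each index is the position of the vector in the order it was added, which is why the metadata list must keep the same order as the embedding matrix.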

In [24]:
# Save the FAISS index to a file
faiss.write_index(index, "recipes_as_index.index")

In [25]:
# Save the metadata for each recipe
metadata = [{'recipe_info': recipe['recipe_info'], # Include recipe information
             'image_path': recipe['image_path']} for recipe in filtered_recipes] # Include image path

# Write metadata to a JSON file with indentation
with open("recipe_metadata.json", "w") as json_file:
  json.dump(metadata, json_file, indent = 2)

### Simple retriever demo

In [26]:
# Generate the embeddings for the query
query = "How to make bread?"
k = 5 # Number of top results to retrieve
query_embedding = client.embeddings.create(
    input = [query],
    model = "text-embedding-3-large"
).data[0].embedding
print(f"The query embedding is {query_embedding}\n")
query_vector = np.array(query_embedding).reshape(1, -1)  # Convert embedding to a 2D numpy array for FAISS
print(f"The query vector is {query_vector}\n")

# Search the FAISS index for the nearest neighbors
distances, indices = index.search(query_vector, min(k, len(metadata)))
print(f"The distances are {distances}\n")
print(f"The indices are {indices}\n")

# Store the indices and distances
stored_indices = indices[0].tolist()
stored_distances = distances[0].tolist()
print(f"The stored indices are {stored_indices}\n")
print(f"The stored distances are {stored_distances}\n")

# Print the metadata content for the top results
print("The metadata content is")
for i, dist in zip(stored_indices, stored_distances):
  if 0 <= i < len(metadata):
    print(f"Distance: {dist}, Metadata: {metadata[i]['recipe_info']}")

# Return the results
results = [(metadata[i]['recipe_info'], dist) for i, dist in zip(stored_indices, stored_distances) if 0 <= i < len(metadata)]
results # Output the results as a list of tuples containing recipe info and distance

The query embedding is [-0.019720381125807762, -0.028134725987911224, -0.022066915407776833, 0.016603518277406693, -0.04792621359229088, -0.04887430742383003, 0.038018617779016495, 0.017954552546143532, 0.0015480617294088006, 0.004168656188994646, -0.004832322709262371, -0.013889594934880733, -0.012917797081172466, -0.0015850967029109597, 0.038018617779016495, -0.0238090418279171, 0.017018308863043785, -0.016674624755978584, -0.0072233001701533794, 0.002310981974005699, -0.031856000423431396, 0.02015887387096882, 0.01501545775681734, -0.007821785286068916, 0.006275205407291651, 0.01784789189696312, 0.003410179866477847, -0.011229002848267555, -0.03960667923092842, 0.042285047471523285, -0.0036649806424975395, 0.023678677156567574, -0.034676581621170044, -0.02115437388420105, -0.015904298052191734, 0.011306035332381725, -0.01473102904856205, 0.005285630933940411, 0.00849730335175991, 0.019412249326705933, -0.01242004707455635, 0.007833636365830898, 0.00040108870598487556, -0.00825435388

[('Here’s the structured information extracted from the recipe image:\n\n### Recipe Title\nBreads\n\n### Ingredients\n#### Bannocks\n- 1 Cupful of Thick Sour Milk\n- ½ Cupful of Sugar\n- 2 Cupfuls of Flour\n- ½ Cupful of Indian Meal\n- 1 Teaspoonful of Soda\n- A pinch of Salt\n\n#### Boston Brown Bread\n- 1 Cupful of Rye Meal\n- 1 Cupful of Graham Meal\n- 1 Cupful of Flour\n- 1 Cupful of Sour Milk\n- ½ Cupful of Molasses\n- 1 Teaspoonful of Salt\n- 1 Heaping Teaspoonful of Soda\n- Stir the meals and salt together. Beat the soda into the molasses until it foams; add sour milk, mix well, and pour into a tin pan which has been well greased, if you have no brown-bread steamer.\n\n### Step-by-Step Instructions\n1. For Bannocks, make the mixture and drop from a spoon.\n2. Drop mixture, size of a walnut, into boiling fat. Serve warm with maple syrup.\n3. For Boston Brown Bread, stir the meals and salt together.\n4. Beat the soda into the molasses until it foams.\n5. Add sour milk, mix well, a
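The distances printed above are squared L2 distances. If the embeddings are unit-length, a squared distance `d` corresponds to a cosine similarity of `1 - d / 2`, which can be easier to reason about. This is a sketch under that assumption: OpenAI documents its embeddings as normalized to length 1, but it is worth checking the norms of `embedding_matrix` before relying on the conversion. `cosine_from_squared_l2` is a hypothetical helper.

```python
import math


def cosine_from_squared_l2(squared_distance):
  """Cosine similarity implied by a squared L2 distance between unit vectors."""
  return 1.0 - squared_distance / 2.0


# Two unit vectors 60 degrees apart
a = [1.0, 0.0]
b = [math.cos(math.pi / 3), math.sin(math.pi / 3)]

squared_l2 = sum((x - y) ** 2 for x, y in zip(a, b))
cosine = sum(x * y for x, y in zip(a, b))

# The conversion matches the directly computed cosine similarity
assert math.isclose(cosine_from_squared_l2(squared_l2), cosine)
```

Because the mapping is monotonic, ranking by smallest squared L2 distance and ranking by largest cosine similarity give the same order for unit vectors.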

In [27]:
# Define a function to query the embeddings
def query_embeddings(query, index, metadata, k = 5):
  # Generate the embeddings for the query
  query_embedding = client.embeddings.create(
      input = [query],
      model = "text-embedding-3-large"
  ).data[0].embedding
  print(f"The query embedding is {query_embedding}\n")
  query_vector = np.array(query_embedding).reshape(1, -1)
  print(f"The query vector is {query_vector}\n")

  # Search faiss index
  distances, indices = index.search(query_vector, min(k, len(metadata)))

  # Store the indices and distances
  stored_indices = indices[0].tolist()
  stored_distances = distances[0].tolist()
  print(f"The stored indices are {stored_indices}\n")
  print(f"The stored distances are {stored_distances}\n")

  # Return the results
  results = [(
      metadata[i]['recipe_info'], dist) for i, dist in zip(
          stored_indices, stored_distances) if 0 <= i < len(metadata)]
  return results

In [28]:
# Test the retrieval system
query = "chocolate query"
results = query_embeddings(query, index, metadata)
print(f"The results are {results}")

The query embedding is [-0.014626344665884972, -0.005924540106207132, -0.01682029664516449, -0.006895276717841625, -0.0328744500875473, 0.009994668886065483, 0.023471800610423088, 0.0519583486020565, -0.015261894091963768, 0.046490881592035294, 0.005841831676661968, -0.01802174560725689, -0.049799222499132156, 0.0076222410425543785, -0.025787638500332832, 0.014669875614345074, -0.013781847432255745, -0.015174832195043564, -0.020424645394086838, -0.03021036647260189, 0.02106890082359314, -0.01762126199901104, -0.005419583059847355, 0.02754628285765648, 0.023367326706647873, 0.04920720309019089, -0.011326710693538189, -0.00455767335370183, -0.038620512932538986, 0.012397568672895432, 0.01265004649758339, 0.018718238919973373, 0.027911940589547157, -0.021591270342469215, -0.02025052160024643, 0.009750896133482456, 0.01760384999215603, -0.038829460740089417, 0.015897443518042564, -0.009315588511526585, 0.02211363986134529, -0.03590419515967369, -0.021643508225679398, -0.0011677134316414595

In [29]:
# Combine the results into a single string
def combined_retrived_content(results):
  combined_content = "\n\n".join([result[0] for result in results]) # Join the recipe information with double newlines
  return combined_content

# Get the combined content from results
combined_content = combined_retrived_content(results)
print(f"The combined content is {combined_content}")

The combined content is Here’s the structured information extracted from the recipe image:

### Recipe Title
Sauces

### Ingredients
#### Chocolate Sauce
- 1 Tablespoon of Butter
- 1 Cup of Sugar
- 2 Tablespoons of Boiling Cocoa
- 1 Cup of Water

#### Cold Sauce
- 1 Cup of Butter
- 1.5 Cups of Sugar
- 1 Nutmeg (for grating)

#### Cranberry Sauce
- 1 Cup of Cranberries

#### Cream Mustard
- ½ Cup of Vinegar
- 1 Teaspoon of Salt
- ½ Cup of Sweet Cream
- 1 Tablespoon of Mustard

### Instructions
#### Chocolate Sauce
1. Put the butter into an agate dish on the stove.
2. When melted, stir in the cocoa and sugar dry; add boiling water and stir until smooth.
3. Add vanilla to taste.

#### Cold Sauce
1. Cream together butter and sugar.
2. Grate a little nutmeg over the top.

#### Cranberry Sauce
1. Pick over and wash one cup of cranberries.
2. Cover with cold water and cook until tender.
3. Remove from the fire, rub through a colander, and sweeten to taste.

#### Cream Mustard
1. Put the vineg
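Joining all retrieved pages can produce a long context, as the output above suggests. One option is to cap the combined text at a character budget while keeping the closest results first. This is a minimal sketch: `combine_within_budget` is a hypothetical helper, the default budget of 8000 characters is arbitrary, and characters are only a rough proxy for tokens.

```python
def combine_within_budget(results, max_chars=8000):
  """Join retrieved texts, closest first, without exceeding `max_chars` (hypothetical helper)."""
  parts = []
  used = 0
  for text, _distance in results:
    # Account for the "\n\n" separator between parts
    extra = len(text) + (2 if parts else 0)
    if used + extra > max_chars:
      break
    parts.append(text)
    used += extra
  return "\n\n".join(parts)


# Shortened stand-ins for (recipe_info, distance) tuples
sample_results = [("Chocolate Sauce ...", 0.9), ("Cold Sauce ...", 1.1), ("Cranberry Sauce ...", 1.2)]
print(combine_within_budget(sample_results, max_chars=40))
```

The function takes the same list of `(recipe_info, distance)` tuples that `query_embeddings` returns and relies on it being sorted by distance already.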

## Generative System

In [30]:
# Define the system prompt
# Note: this f-string is evaluated once, when this cell runs, so the prompt embeds
# the values that combined_content and query hold at that moment
system_prompt3 = f"""
You are a highly experienced expert chef specialized in providing cooking advice.
Your main task is to provide precise and accurate information based on the combined content.
Answer the query directly, using only information from the provided content: {combined_content}.
If you don't know the answer, just say that you don't know.
Your goal is to help the user and answer the query: {query}
"""

In [31]:
# Define function to retrieve a response from the API
def generate_response(query, combined_content, system_prompt):
  response = client.chat.completions.create(
      model = model,
      messages = [
          {"role": "system", "content": system_prompt}, # Provide system prompt for guidance
          {"role": "user", "content": query}, # Provide the query as user input
          {"role": "assistant", "content": combined_content} # Provide the combined content from the results
      ],
      temperature = 0, # Set temperature to 0 to make the output as repeatable as possible
  )
  return response
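Two details of the setup above affect the answers. The system prompt is an f-string, so it captures `combined_content` and `query` once, when its cell runs, and later queries still carry that earlier content. In addition, `generate_response` passes the retrieved text as an `assistant` message, which presents it as something the model already said. The sketch below shows one alternative, offered as an assumption rather than the notebook's method: keep the system prompt static and place freshly retrieved context in the user turn. `SYSTEM_PROMPT` and `build_rag_messages` are hypothetical names.

```python
SYSTEM_PROMPT = """
You are a highly experienced chef who gives cooking advice.
Answer the question directly, using only the information in the provided context.
If the context does not contain the answer, say that you don't know.
"""


def build_rag_messages(query, combined_content, system_prompt=SYSTEM_PROMPT):
  """Build the messages for one query from freshly retrieved content (hypothetical helper)."""
  # The context is formatted per call, so it always matches the current query
  user_content = f"Context:\n{combined_content}\n\nQuestion: {query}"
  return [
      {"role": "system", "content": system_prompt},
      {"role": "user", "content": user_content},
  ]


bread_messages = build_rag_messages("How to make bread?", "### Recipe Title\nQuick Graham Bread")
sauce_messages = build_rag_messages("Any chocolate ideas?", "### Recipe Title\nChocolate Sauce")
```

Each call produces messages that contain only the context retrieved for that query, so a bread question is never answered against leftover sauce recipes.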

In [33]:
# Define a function to display the latest GPT response as Markdown
def get_gpt_response():
  # Reads the global `response` set by the most recent API call
  gpt_response = response.choices[0].message.content # Extract the response content from the API response
  display(Markdown(gpt_response)) # Display the response as Markdown

In [32]:
# Get the results from the API
query = "How to make bread?"
combined_content = combined_retrived_content(results)
response = generate_response(query, combined_content, system_prompt3)

In [34]:
# Display the outcome
get_gpt_response()

I'm sorry, but I don't have a recipe for making bread. If you have any other questions or need information on a different recipe, feel free to ask!

In [35]:
# Get the results
query = "Get me the best chocolate cake recipe"
combined_content = combined_retrived_content(results)
response = generate_response(query, combined_content, system_prompt3)

# Display the outcome
get_gpt_response()

I don't have a specific chocolate cake recipe available. However, I can provide you with a chocolate sauce recipe that you can use as a topping for cakes or desserts. Here it is:

### Chocolate Sauce Recipe

#### Ingredients
- 1 Tablespoon of Butter
- 1 Cup of Sugar
- 2 Tablespoons of Boiling Cocoa
- 1 Cup of Water

#### Instructions
1. Put the butter into an agate dish on the stove.
2. When melted, stir in the cocoa and sugar dry; add boiling water and stir until smooth.
3. Add vanilla to taste.

This chocolate sauce can enhance the flavor of your chocolate cake or any dessert you choose! If you need more specific cake recipes, please let me know!

## RAG System

In [36]:
# Build the function for Retrieval-Augmented Generation (RAG)
def rag_system(query, index, metadata, system_prompt, k = 5):
  # Retrieval System: Retrieve relevant results based on the query
  results = query_embeddings(query, index, metadata, k)

  # Content Merge: Combine the retrieved content into a single string
  combined_content = combined_retrived_content(results)

  # Generation: Generate a response based on the query and combined content
  response = generate_response(query, combined_content, system_prompt)

  # Return the generated response
  return response
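`rag_system` chains retrieval, merging, and generation, and returns the raw API response. A variant that also returns the retrieved texts makes it easier to check which pages an answer was based on. The sketch below injects the retriever and generator as functions so it can run without the index or the API. `rag_pipeline`, `fake_retrieve`, and `fake_generate` are hypothetical names used only for illustration.

```python
def rag_pipeline(query, retrieve, generate, k=5):
  """Retrieve, merge, and generate; return the answer plus the texts it was based on."""
  results = retrieve(query, k)                        # List of (text, distance) tuples
  context = "\n\n".join(text for text, _ in results)  # Merge retrieved texts with blank lines
  answer = generate(query, context)
  return {"answer": answer, "sources": [text for text, _ in results]}


# Stand-ins for illustration only; real versions would call the index and the API
def fake_retrieve(query, k):
  corpus = [("Quick Graham Bread", 0.8), ("Chocolate Sauce", 1.3)]
  return corpus[:k]


def fake_generate(query, context):
  return f"Answer to '{query}' using {len(context)} characters of context."


output = rag_pipeline("How to make bread?", fake_retrieve, fake_generate, k=1)
print(output["answer"])
```

In the notebook, `retrieve` could wrap `query_embeddings(query, index, metadata, k)` and `generate` could wrap `generate_response` and extract `choices[0].message.content`.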

In [37]:
# Test the RAG system
query1 = "How to make the best chocolate cake?"
response = rag_system(query1, index, metadata, system_prompt3)
get_gpt_response()

The query embedding is [-0.002604733919724822, -0.03968388959765434, -0.013297276571393013, 0.007617205381393433, -0.019327564164996147, 0.022370068356394768, 0.00810969714075327, -0.005398256704211235, -0.006457113660871983, 0.037473149597644806, -0.0003669747384265065, 0.007113769184798002, -0.05170068517327309, -0.007600788958370686, 0.0003343129646964371, -0.04031865671277046, 0.021166199818253517, -0.004692351911216974, -0.004487146623432636, -0.015858232975006104, 0.015004580840468407, -0.02479969523847103, -0.011732247658073902, 0.03399287164211273, 0.0030698650516569614, 0.006758080795407295, 0.022610843181610107, -0.0006980386096984148, -0.036904048174619675, 0.05677882209420204, 0.04460880532860756, 0.0018906210316345096, -0.016208449378609657, -0.014720030128955841, -0.02911173366010189, 0.016919827088713646, -0.004768961574882269, -0.001930293976329267, 0.03475897014141083, 0.014402646571397781, -0.005751208867877722, 5.8526144130155444e-05, 0.002621150342747569, 0.01079104

To make the best chocolate cake, you can follow this structured recipe for a classic chocolate cake:

### Recipe Title
Chocolate Cake

### Ingredients
- 1½ Cups of Molasses
- 1½ Cups of Sugar
- ½ Cup of Milk
- 2 Squares of Chocolate
- 1 Small Teaspoon of Butter
- 1½ Cups of Flour
- 2 Teaspoons of Baking Powder
- ½ Teaspoon of Salt
- 1 Cup of Boiling Water

### Step-by-Step Instructions
1. Preheat your oven to 350°F (175°C).
2. In a saucepan, melt the chocolate with the butter and milk over low heat.
3. In a large mixing bowl, combine the molasses, sugar, and flour.
4. Add the melted chocolate mixture to the dry ingredients and mix well.
5. Stir in the baking powder and salt.
6. Gradually add the boiling water to the batter, mixing until smooth.
7. Pour the batter into a greased cake pan.
8. Bake for about 30-35 minutes or until a toothpick inserted in the center comes out clean.
9. Allow the cake to cool before serving.

### Cuisine Type
American

### Dish Type
Cake

### Relevant Tags/Metadata
- Baking
- Dessert
- Chocolate Cake
- Classic Recipe

Feel free to ask if you need more details or variations!

In [38]:
# Test with a different query
query2 = "I want something vegan"
response = rag_system(query2, index, metadata, system_prompt3)
get_gpt_response()

The query embedding is [-0.034548308700323105, -0.026619188487529755, -0.017104245722293854, 0.03446335345506668, -0.01730247214436531, -0.008516724221408367, 5.26128314959351e-06, -0.018930774182081223, -0.033670444041490555, 0.023518336936831474, 0.003238904057070613, -0.012955616228282452, 0.0179679524153471, -0.007178685627877712, 0.036842089146375656, -0.008821146562695503, 0.009663615375757217, -0.01138395071029663, 0.007461868692189455, -0.018406886607408524, -0.023546654731035233, 0.00599993672221899, 0.037436775863170624, -0.0069061219692230225, 0.007702574133872986, 0.008198143914341927, -0.022074105218052864, -0.012240579351782799, 0.002099093049764633, 0.01846352219581604, 0.010208742693066597, 0.024481158703565598, -0.0050618937239050865, 0.05527729541063309, -0.013698970898985863, -0.003830048255622387, 0.038739416748285294, -0.0019114842871204019, 0.015617535449564457, 0.026930689811706543, -0.04777294769883156, 0.022003307938575745, -0.024226294830441475, 0.001365472329

It seems that the recipes provided do not contain any vegan options, as they include ingredients like butter, milk, eggs, and other animal products. If you're looking for vegan recipes, I can help you create or suggest some based on common vegan ingredients. 

Here are a few simple vegan sauce ideas:

1. **Vegan Chocolate Sauce**:
   - Ingredients: 1 cup of coconut milk, 1/2 cup of cocoa powder, 1/2 cup of maple syrup, 1 teaspoon of vanilla extract.
   - Instructions: In a saucepan, combine all ingredients over medium heat. Stir until smooth and heated through.

2. **Vegan Creamy Mustard Sauce**:
   - Ingredients: 1/2 cup of cashew cream (soaked cashews blended with water), 1 tablespoon of mustard, 1 tablespoon of apple cider vinegar, salt to taste.
   - Instructions: Mix all ingredients in a bowl until well combined.

3. **Vegan Cranberry Sauce**:
   - Ingredients: 1 cup of fresh cranberries, 1/2 cup of orange juice, 1/4 cup of maple syrup.
   - Instructions: In a saucepan, combine all ingredients and cook over medium heat until cranberries burst and sauce thickens.

If you have specific ingredients or types of sauces in mind, let me know, and I can help you further!