## Gen AI-Powered Recipe Recommendation: From Ingredients to Instructions

This notebook showcases a Retrieval-Augmented Generation (RAG) approach to recipe recommendation. The goal is to help users find recipes they can make with the ingredients they already have. The pipeline has three stages:

1.  **Semantic Retrieval:** Embedding recipe ingredient lists with a SentenceTransformer model, storing them in ChromaDB, and finding the recipes whose ingredient lists are semantically closest to the user's input.
2.  **Generative Suggestion:** Passing the retrieved recipes as context to a generative model served through the Gemini API (the code below loads `gemma-3-4b-it`) and asking it to suggest a relevant and potentially novel recipe. The prompt in this notebook asks for continental dishes.
3.  **Instruction Generation (Optional):** Calling the model a second time to produce detailed step-by-step instructions for the suggested recipe.

This demonstrates how Gen AI can go beyond simple keyword matching: retrieval compares embeddings of ingredient lists, and the model then adapts the retrieved recipes to what the user actually has.
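To make the retrieval idea concrete, here is a minimal sketch of ranking items by cosine similarity, one common way to compare embedding vectors. The three-dimensional vectors and recipe names are made up for illustration; the notebook itself relies on ChromaDB and a SentenceTransformer model to produce and compare real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, corpus, k=2):
    """Return the k corpus keys whose vectors are most similar to query_vec."""
    ranked = sorted(
        corpus,
        key=lambda name: cosine_similarity(query_vec, corpus[name]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical toy "embeddings"; real ones have hundreds of dimensions.
toy_corpus = {
    "chicken tomato stew": [0.9, 0.8, 0.1],
    "chocolate cake": [0.1, 0.0, 0.9],
    "tomato soup": [0.2, 0.9, 0.1],
}
toy_query = [0.8, 0.7, 0.0]  # stands in for "chicken and tomatoes"

print(top_k(toy_query, toy_corpus))
```

The query vector points in nearly the same direction as the stew vector, so the stew ranks first even though no keywords are compared.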

In [1]:
# install necessary libraries
!pip uninstall -qqy jupyterlab kfp  # Remove unused conflicting packages
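# google-generativeai provides the google.generativeai module imported below;
# sentence-transformers backs ChromaDB's SentenceTransformerEmbeddingFunction.
!pip install -qU "google-generativeai" "chromadb==0.6.3" "sentence-transformers"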


In [2]:
import google.generativeai as genai
import pandas as pd
from chromadb import Documents, EmbeddingFunction, Embeddings
from chromadb.utils import embedding_functions
import chromadb
import ast  # For safely evaluating string representations of lists

In [None]:
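# Configure the Gemini API client. Replace the placeholder with your own API key.
genai.configure(api_key='your google api key')
# Load the model used for both generation steps (here, the instruction-tuned Gemma 3 4B model).
model = genai.GenerativeModel('gemma-3-4b-it')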
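
Hard-coding a key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of reading it from the environment instead; the variable name `GOOGLE_API_KEY` and the helper `load_api_key` are assumptions for illustration, not something the original notebook defines.

```python
import os

def load_api_key(var_name="GOOGLE_API_KEY"):
    """Read the API key from an environment variable (hypothetical helper)."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before running this notebook."
        )
    return key

# Usage in the configuration cell would then look like:
# genai.configure(api_key=load_api_key())
```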

In [67]:
# --- 1. Recipe Knowledge Base (Simplified) ---
# Load RAW_recipes.csv from the Food.com dataset (adjust the path if needed)
# and keep only the first 10,000 rows to keep the example small.
try:
    df = pd.read_csv('/kaggle/input/food-com-recipes-and-user-interactions/RAW_recipes.csv')
    df = df[:10000]
except FileNotFoundError:
    # Re-raise with a clearer message; calling exit() here would stop the notebook kernel.
    raise FileNotFoundError("'RAW_recipes.csv' not found. Attach the dataset or provide the correct path.")

In [68]:
# --- 2. Create Embeddings and Store in Vector Database (ChromaDB) ---
# Initialize the embedding function using SentenceTransformer for generating text embeddings.
embedding_function = embedding_functions.SentenceTransformerEmbeddingFunction()

# Initialize empty lists to store documents (ingredient text), IDs (recipe IDs), and metadata for ChromaDB.
documents = []
ids = []
metadatas = []

# Iterate through each row of the DataFrame (df, which should be loaded before this section).
for index, row in df.iterrows():
    # Extract relevant information from each row of the DataFrame.
    recipe_id = str(row['id'])
    title = row['name']
    ingredients_str = row['ingredients']
    rsteps = row['steps']
    rminutes = row['minutes']
    rnutrition = row['nutrition']
    description = row['description']

    # Parse the string representation of the ingredients list into a Python list.
    # The parsed list is only used to build the document text; the metadata keeps the raw string.
    try:
        ingredients_list = ast.literal_eval(ingredients_str)
    except (SyntaxError, ValueError):
        # If parsing fails (malformed string), warn and keep the raw string as a single-item list
        # so that the join below does not split it into individual characters.
        print(f"Warning: Could not parse ingredients for recipe {recipe_id}: {ingredients_str}. Storing raw string in document.")
        ingredients_list = [str(ingredients_str)]

    # Create a single string document by joining the ingredients with spaces.
    document = " ".join(ingredients_list)
    documents.append(document)
    ids.append(recipe_id)
    # Store the original string representation of the ingredients in the metadata dictionary.
    # This allows us to retrieve the structured ingredient list later.
    metadatas.append({"title": title, "steps": rsteps, "minutes": rminutes, "nutrition": rnutrition, 'description': description, 'ingredients': ingredients_str})

# Initialize a ChromaDB client. This will create an in-memory client by default.
chroma_client = chromadb.Client()

# Get or create a ChromaDB collection named 'food'.
# The specified embedding function will be used to create embeddings for the documents in this collection.
db = chroma_client.get_or_create_collection(name='food', embedding_function=embedding_function)

# Add the created documents, their corresponding IDs, and metadata to the ChromaDB collection.
db.add(documents=documents, ids=ids, metadatas=metadatas)

# Print a confirmation message indicating the number of recipes added to the ChromaDB collection.
print(f"Added {db.count()} recipes from Food.com to the ChromaDB collection.")

Added 10000 recipes from Food.com to the ChromaDB collection.
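
The ingredient parsing used in the loop above can be factored into a small helper. This sketch assumes, as the loop does, that each `ingredients` cell holds the string form of a Python list; `parse_ingredients` is a hypothetical name. Wrapping the fallback in a list matters because `" ".join(...)` applied to a bare string would put a space between every character.

```python
import ast

def parse_ingredients(raw):
    """Turn "['a', 'b']" into ['a', 'b']; fall back to a one-item list."""
    try:
        parsed = ast.literal_eval(raw)
    except (SyntaxError, ValueError, TypeError):
        # Malformed input: keep the raw text as a single ingredient entry.
        return [str(raw)]
    # literal_eval can also return non-list values (e.g. a number or a string).
    if isinstance(parsed, (list, tuple)):
        return [str(item) for item in parsed]
    return [str(parsed)]

print(" ".join(parse_ingredients("['chicken', 'diced tomatoes']")))
print(" ".join(parse_ingredients("chicken, tomatoes")))
```

The first call prints the two ingredients joined by a space; the second keeps the unparseable text intact rather than spacing out its characters.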


In [69]:
# --- 3. User Query ---
# Define the user's query based on the ingredients they have.
user_query = "I have some chicken and tomatoes, what can I make?"

# Embed the user's query using the same embedding function used for the recipe ingredients.
query_embedding = embedding_function([user_query])

# --- 4. Retrieve Relevant Recipes from ChromaDB (RAG - Retrieval) ---
# Query the ChromaDB collection for the most relevant recipes based on the query embedding.
results = db.query(
    query_embeddings=query_embedding,
    n_results=5,  # Retrieve the top 5 most relevant recipes.
    include=['metadatas']  # Include the metadata associated with the retrieved recipes in the results.
)

# Initialize an empty list to store information about the retrieved recipes.
retrieved_recipes_info = []

# Check if the query returned any results (i.e., if 'ids' key exists and is not empty).
if results and results['ids']:
    # Iterate through the IDs of the retrieved recipes.
    for i in range(len(results['ids'][0])):
        # Access the metadata for the i-th retrieved recipe.
        metadata = results['metadatas'][0][i]

        # Safely retrieve the 'ingredients' string from the metadata dictionary.
        # If the 'ingredients' key is not found, default to "N/A".
        ingredients_str_from_metadata = metadata.get("ingredients", "N/A")

        # Parse the string representation of the ingredients list back into a Python list.
        # Fall back to an empty list if the field is missing or cannot be parsed, so that
        # later code can always treat 'ingredients' as a list.
        try:
            ingredients_list_retrieved = ast.literal_eval(ingredients_str_from_metadata) if ingredients_str_from_metadata != "N/A" else []
        except (SyntaxError, ValueError):
            print(f"Warning: Could not parse ingredients from metadata: {ingredients_str_from_metadata}")
            ingredients_list_retrieved = []

        # Append a dictionary containing the title, retrieved ingredients list, and description to the list of retrieved recipes.
        retrieved_recipes_info.append({
            "title": metadata["title"],
            "ingredients": ingredients_list_retrieved,
            "description": metadata.get("description", "N/A")
            # You can add other metadata fields here if needed
        })

# Print the information of the retrieved recipes.
print("\nRetrieved Recipes:")
for recipe_info in retrieved_recipes_info:
    print(f"Title: {recipe_info['title']}")
    print(f"Ingredients: {recipe_info['ingredients']}")
    print(f"Description: {recipe_info['description']}")
    print("-" * 20)


Retrieved Recipes:
Title: 5 can chicken tortilla soup
Ingredients: ['corn', 'black beans', 'diced tomatoes with green chilies', 'chicken broth', 'cooked chicken', 'shredded cheddar cheese', 'corn tortilla strips']
Description: i got this recipe from my best friend's family. this is the easiest chicken tortilla soup! it's not exactly traditional or authentic - but it's delicioso!
--------------------
Title: 3 cans and a box  chili   pasta
Ingredients: ['rotini pasta', 'chili with chicken and beans', 'chili beans', 'diced tomatoes']
Description: if you're in a big hurry, here's an easy way to fix chili for 6! if you can't find a can of chili with chicken & beans, use a can of regular chili and a 12.5-oz can of premium chunk chicken breast [i realize that means the recipe is really 4 cans and a box, but hey....!
--------------------
Title: a famous chicken salad sandwich
Ingredients: ['white chicken meat packed in water', 'mayonnaise', 'apples', 'grapes', 'celery', 'pecans', 'hard-boiled
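
ChromaDB's `query` returns one list of hits per query embedding, which is why the loop above indexes `results['ids'][0]`. The sketch below flattens that nested shape for a single query. The `mock_results` dictionary is hand-written to mimic that layout and is not real output, and `first_query_hits` is a hypothetical helper.

```python
def first_query_hits(results):
    """Pair each id with its metadata for the first (only) query in a result dict."""
    if not results or not results.get("ids") or not results["ids"][0]:
        return []
    ids = results["ids"][0]
    # Metadatas are only present if they were requested via include=[...].
    metadatas = (results.get("metadatas") or [[{}] * len(ids)])[0]
    return [{"id": rid, **(meta or {})} for rid, meta in zip(ids, metadatas)]

# Hand-written stand-in for a query result with two hits.
mock_results = {
    "ids": [["101", "202"]],
    "metadatas": [[
        {"title": "recipe a", "ingredients": "['chicken', 'tomatoes']"},
        {"title": "recipe b", "ingredients": "['rice']"},
    ]],
}

for hit in first_query_hits(mock_results):
    print(hit["id"], hit["title"])
```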

In [70]:
# --- 5. Generate Recipe Recommendation using Gemini (RAG - Generation) ---
# Check if any relevant recipes were retrieved from ChromaDB.
if retrieved_recipes_info:
    # Create a context string by joining the titles and ingredients of the retrieved recipes.
    # This context will be provided to the Gemini model to guide its recommendation.
    context = "\n\n".join([f"Title: {r['title']}\nIngredients: {', '.join(r['ingredients'])}" for r in retrieved_recipes_info])

    # Construct the prompt for the Gemini model.
    prompt = f"""Based on the following recipes and the user's ingredients: "{user_query}", suggest a recipe the user can make.
    Consider the available ingredients and suggest continental recipes with high nutritional value.

    Retrieved Recipes:
    {context}

    Provide a recipe suggestion with a title and a list of ingredients.
    """

    # Call the Gemini model to generate a recipe recommendation based on the prompt and context.
    response = model.generate_content(prompt)
    print("--- Gemini's Recipe Recommendation ---")
    print(response.text)
else:
    # If no relevant recipes were retrieved, construct a simpler prompt based only on the user's query.
    prompt = f"Based on the ingredients: '{user_query}', suggest a simple recipe."
    # Call the Gemini model to generate a simple recipe suggestion.
    response = model.generate_content(prompt)
    print("--- Gemini's Recipe Recommendation (No Retrieval Context) ---")
    print(response.text)

--- Gemini's Recipe Recommendation ---
Okay, based on your ingredients (chicken and tomatoes) and aiming for a nutritious continental recipe, I recommend:

**Title: Mediterranean Chicken & Tomato Skillet**

**Ingredients:**

*   **Chicken:** 1 lb (about 450g) Chicken breasts or thighs, cut into bite-sized pieces
*   **Tomatoes:** 2-3 medium Tomatoes, chopped
*   **Olive Oil:** 2 tablespoons
*   **Garlic:** 2-3 cloves, minced
*   **Onion:** 1/2 medium Onion, chopped (optional, but adds great flavor)
*   **Lemon:** 1/2 Lemon (juice and zest)
*   **Fresh Herbs:** 1/4 cup chopped fresh herbs (such as oregano, basil, or parsley - whatever you have!)
*   **Cucumber:** 1/2 Cucumber, diced (optional, for freshness)
*   **Feta Cheese:** 1/4 cup crumbled (optional, for added flavor and protein)
*   **Salt & Pepper:** To taste


**Why this recipe is a good fit:**

*   **Uses your ingredients:** Directly incorporates chicken and tomatoes.
*   **Continental Style:**  Mediterranean flavors are light
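
The context assembly in the generation cell can be wrapped in a function so that it is easy to inspect without calling the model. This is a sketch under the assumption that each retrieved recipe is a dict with `title` and `ingredients` keys, as in `retrieved_recipes_info`; `build_prompt` is a hypothetical name and the prompt wording is condensed from the cell above.

```python
def build_prompt(user_query, recipes):
    """Assemble a RAG prompt from the user's query and the retrieved recipes."""
    blocks = []
    for r in recipes:
        ingredients = r.get("ingredients", [])
        # Accept either a list of ingredients or an already-formatted string.
        if isinstance(ingredients, (list, tuple)):
            ingredients = ", ".join(str(i) for i in ingredients)
        blocks.append(f"Title: {r['title']}\nIngredients: {ingredients}")
    context = "\n\n".join(blocks)
    return (
        f"Based on the following recipes and the user's ingredients: \"{user_query}\", "
        "suggest a recipe the user can make.\n\n"
        f"Retrieved Recipes:\n{context}\n\n"
        "Provide a recipe suggestion with a title and a list of ingredients."
    )

example = build_prompt(
    "I have some chicken and tomatoes, what can I make?",
    [
        {"title": "soup", "ingredients": ["corn", "chicken broth"]},
        {"title": "unknown", "ingredients": "N/A"},
    ],
)
print(example)
```

Handling the string case explicitly avoids joining the characters of a placeholder such as `"N/A"` with commas.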

In [71]:
# --- 6. More Detailed Instructions (Another Gemini Call with Context) ---
# Check if relevant recipes were retrieved and if Gemini provided a recipe suggestion.
if retrieved_recipes_info and response.text:
    # Extract the suggested recipe title from Gemini's response.
    # This assumes the title is on the third line in the form "**Title: ...**";
    # the trailing asterisks are stripped as well.
    suggested_recipe_title = response.text.split('\n')[2].replace("**Title:", "").strip().strip("*").strip()

    # Use the first retrieved recipe as the ingredient source instead of matching by title.
    # Note: its ingredients may not correspond to the suggested recipe, so the generated
    # instructions can combine the suggested title with a different dish's ingredients.
    relevant_recipe = next((r for r in retrieved_recipes_info), None)

    # If a relevant recipe was found (which will be the first one in the list).
    if relevant_recipe:
        # Extract the ingredients list from the retrieved recipe and join them into a comma-separated string.
        ingredients_list = ", ".join(relevant_recipe['ingredients'])
        # Construct a prompt for Gemini to generate detailed step-by-step instructions.
        prompt_instructions = f"""Provide detailed step-by-step instructions on how to make '{suggested_recipe_title}' using the following ingredients: {ingredients_list}."""
        # Call Gemini to generate the instructions.
        instructions_response = model.generate_content(prompt_instructions)
        # Print a separator and the generated instructions.
        print("\n--- Gemini's Recipe Instructions ---")
        print(instructions_response.text)


--- Gemini's Recipe Instructions ---
## Mediterranean Chicken & Tomato Skillet - Step-by-Step Instructions

This recipe combines the flavors of the Mediterranean with a hearty, easy skillet meal. It’s perfect for a weeknight dinner!

**Yields:** 4-6 servings
**Prep time:** 15 minutes
**Cook time:** 20-25 minutes


**Ingredients:**

* 1.5 - 2 lbs Cooked Chicken, shredded or cubed (rotisserie chicken works great!)
* 1 tbsp Olive Oil
* 1 medium Onion, chopped
* 2 cloves Garlic, minced
* 1 (14.5 oz) can Diced Tomatoes with Green Chilies (like Rotel) – undrained
* 1 (15 oz) can Black Beans, rinsed and drained
* 1 (15 oz) can Corn, drained (if packed in liquid, drain well)
* 1 cup Chicken Broth
* 1/2 tsp Smoked Paprika
* 1/4 tsp Cumin
* Salt and Pepper to taste
* 1 cup Shredded Cheddar Cheese
* 1/2 cup Corn Tortilla Strips, for garnish


**Equipment:**

* Large Skillet (cast iron or non-stick recommended)
* Cutting Board
* Knife
* Measuring Cups & Spoons
* Spoon or Spatula


**Instructions:
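
The title extraction in the last cell depends on the title being on the third line of the response, which a generative model does not guarantee. A more tolerant sketch follows; `extract_title` and its default value are hypothetical, and the sample text is abbreviated from the recommendation shown earlier.

```python
import re

def extract_title(response_text, default="the suggested recipe"):
    """Find the first line containing 'Title:' and strip Markdown emphasis."""
    for line in response_text.splitlines():
        match = re.search(r"title\s*:\s*(.+)", line, flags=re.IGNORECASE)
        if match:
            # Remove surrounding asterisks, underscores and spaces.
            return match.group(1).strip().strip("*_ ").strip()
    return default

sample = (
    "Okay, based on your ingredients, I recommend:\n"
    "\n"
    "**Title: Mediterranean Chicken & Tomato Skillet**\n"
)
print(extract_title(sample))
```

If no line contains a title marker, the function returns the default string so the follow-up prompt still reads sensibly.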