### RAG Implementation and Search Setup

This section sets up the core retrieval-augmented generation (RAG) workflow for the recipe assistant:
- **Download minsearch and implement basic search:** We use the `minsearch` library to build a simple in-memory search index over the recipe dataset.
- **Test minsearch retrieval:** We check that the index returns relevant recipes for a sample query.
- **Implement the RAG flow:** We combine retrieval (minsearch) with a large language model (LLM) to generate recipe recommendations from the user query and the retrieved context.
- **Evaluate retrieval:** Later, we add evaluation logic (e.g., hit rate, MRR) to measure how well the retrieval step surfaces relevant recipes.
- **Find the best boost parameters:** We experiment with boosting certain fields (such as main ingredients) to improve retrieval quality.
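
The two retrieval metrics named in the list can be sketched as below. This is a minimal illustration, not the evaluation code used later in the notebook: the `relevance` lists are made-up flags (one row per query, one flag per ranked result), and the function names `hit_rate` and `mrr` are chosen here for illustration.

```python
def hit_rate(relevance):
    """Fraction of queries with at least one relevant result in the returned list."""
    return sum(any(row) for row in relevance) / len(relevance)


def mrr(relevance):
    """Mean reciprocal rank of the first relevant result (contributes 0 if none)."""
    total = 0.0
    for row in relevance:
        for rank, is_relevant in enumerate(row, start=1):
            if is_relevant:
                total += 1.0 / rank
                break
    return total / len(relevance)


# Hypothetical relevance flags for three queries, five ranked results each.
relevance = [
    [False, True, False, False, False],   # first relevant result at rank 2
    [True, False, False, False, False],   # first relevant result at rank 1
    [False, False, False, False, False],  # no relevant result
]

print(hit_rate(relevance))  # 2 of 3 queries have a hit
print(mrr(relevance))       # (1/2 + 1 + 0) / 3 = 0.5
```

Hit rate only asks whether a relevant recipe appears at all, while MRR also rewards ranking it near the top, so the two can move differently when boost parameters change.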

In [10]:
# 1. Download minsearch.py if not present
!wget -nc https://raw.githubusercontent.com/alexeygrigorev/minsearch/main/minsearch.py

File ‘minsearch.py’ already there; not retrieving.



In [5]:
import minsearch
import pandas as pd
import json

# Load data
df = pd.read_csv('../data/recipes_clean.csv')

# Create documents for indexing
documents = df.to_dict(orient='records')

# Setup minsearch index
index = minsearch.Index(
    text_fields=['recipe_name', 'main_ingredients', 'all_ingredients', 'instructions', 'cuisine_type', 'dietary_restrictions'],
    keyword_fields=['meal_type', 'difficulty_level']
)

# Fit the index
index.fit(documents)

# Test search: retrieve the top 5 matches and print their names
query = "chicken pasta italian"
results = index.search(query, num_results=5)
for doc in results:
    print(doc['recipe_name'])
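
To show what field boosting changes, here is a toy sketch in plain Python. It is not how `minsearch` scores documents: the term-overlap scorer `field_score`, the helper `boosted_search`, and the two sample recipes are all hypothetical stand-ins. The only idea carried over is that each text field gets its own score and the per-field scores are combined as a weighted sum.

```python
def field_score(query, text):
    """Toy score: fraction of query terms found in the field text."""
    query_terms = set(query.lower().split())
    text_terms = set(str(text).lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & text_terms) / len(query_terms)


def boosted_search(query, docs, text_fields, boost=None, num_results=5):
    """Rank docs by the boost-weighted sum of their per-field scores."""
    boost = boost or {}
    scored = []
    for doc in docs:
        score = sum(
            boost.get(field, 1.0) * field_score(query, doc.get(field, ""))
            for field in text_fields
        )
        if score > 0:
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:num_results]]


# Hypothetical documents, not rows from the recipe dataset.
toy_docs = [
    {
        "recipe_name": "Garlic Bread",
        "main_ingredients": "bread garlic butter",
        "instructions": "serve with italian chicken pasta",
    },
    {
        "recipe_name": "Chicken Alfredo",
        "main_ingredients": "chicken pasta cream",
        "instructions": "boil then toss with sauce",
    },
]
fields = ["main_ingredients", "instructions"]
toy_query = "chicken pasta italian"

unboosted = boosted_search(toy_query, toy_docs, fields)
boosted = boosted_search(toy_query, toy_docs, fields, boost={"main_ingredients": 3.0})

print(unboosted[0]["recipe_name"])  # Garlic Bread: the match is only in the instructions
print(boosted[0]["recipe_name"])    # Chicken Alfredo: ingredient matches now count 3x
```

Without boosting, a side dish that merely mentions the query terms in its instructions outranks the recipe that actually contains them. Weighting `main_ingredients` higher reverses the order, which is the effect the boost-tuning step looks for.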



### Implement OpenAI Integration and RAG Flow

In this section, we integrate the OpenAI API to complete the RAG pipeline:
- **Prompt Construction:** We define a function to build a prompt for the LLM using the context of the top retrieved recipes.
- **LLM Call:** We define a function to call the OpenAI chat model with the constructed prompt.
- **RAG Flow:** The `rag_flow` function ties everything together: it retrieves relevant recipes, builds the prompt, and gets a generated answer from the LLM.  
This enables the assistant to provide recipe recommendations that are grounded in the actual recipe database.

In [None]:
import dotenv
from openai import OpenAI

# Load OPENAI_API_KEY from ../.env into the environment
dotenv.load_dotenv(dotenv_path="../.env")

# The client reads OPENAI_API_KEY from the environment
client = OpenAI()

def build_prompt(query, search_results):
    entry_template = """
Recipe: {recipe_name}
Cuisine: {cuisine_type}
Meal Type: {meal_type}
Difficulty: {difficulty_level}
Prep Time: {prep_time_minutes} minutes
Cook Time: {cook_time_minutes} minutes
Main Ingredients: {main_ingredients}
Instructions: {instructions}
Dietary Info: {dietary_restrictions}
""".strip()
    context = "\n\n".join([entry_template.format(**doc) for doc in search_results])
    prompt_template = """
You are an expert chef and culinary assistant. Answer the question based on the content from our recipe database.
Use only the facts from the context when answering the question.

CONTEXT:
{context}

QUESTION: {question}

Provide recipe recommendations with brief explanations of why they match the requested ingredients.
If exact ingredients aren't available, suggest the closest matches and mention any substitutions needed.
""".strip()
    return prompt_template.format(context=context, question=query)

def llm(prompt, model='gpt-4o-mini'):
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

def rag_flow(query):
    """Retrieve the top 5 recipes, build the prompt, and return the LLM's answer."""
    search_results = index.search(query, num_results=5)
    prompt = build_prompt(query, search_results)
    answer = llm(prompt)
    return answer
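
The retrieve, build-prompt, generate sequence can also be exercised without an API key or a fitted index. The sketch below is one possible way to do that, not part of the notebook's pipeline: `rag_flow_with`, `fake_search`, `simple_prompt`, and `echo_llm` are hypothetical helpers introduced only for this illustration, and the single recipe is made up.

```python
def rag_flow_with(query, search_fn, prompt_fn, llm_fn):
    """Same three steps as rag_flow, with each step passed in as a function."""
    search_results = search_fn(query)
    prompt = prompt_fn(query, search_results)
    return llm_fn(prompt)


def fake_search(query):
    """Stand-in for index.search: always returns one made-up recipe."""
    return [{"recipe_name": "Chicken Alfredo", "main_ingredients": "chicken, pasta, cream"}]


def simple_prompt(query, search_results):
    """Reduced stand-in for build_prompt, using only two fields."""
    context = "\n".join(
        f"{doc['recipe_name']}: {doc['main_ingredients']}" for doc in search_results
    )
    return f"CONTEXT:\n{context}\n\nQUESTION: {query}"


def echo_llm(prompt):
    """Stand-in for llm: returns the prompt so the wiring can be inspected."""
    return prompt


answer = rag_flow_with("chicken pasta", fake_search, simple_prompt, echo_llm)
print(answer)
```

Because the stub LLM echoes its input, the printed text is the prompt the model would have received, which makes it easy to confirm that the retrieved context and the question both reach the final step.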