### 3.1 Implement Basic Search with minsearch

In this section, we implement basic search using the `minsearch` library.  
We load the cleaned recipes dataset, convert it into a list of documents (one dictionary per recipe), and build a search index.  
The index treats free-text columns such as the recipe name, ingredients, and instructions as text fields, and the categorical columns `meal_type` and `difficulty_level` as keyword fields.  
We then test the index with a sample query ("chicken pasta italian") and retrieve the top five results.

In [None]:
from minsearch import Index
import pandas as pd
import json

# Load the cleaned recipes dataset
df = pd.read_csv('../data/recipes_clean.csv')

# Convert each row into a dict; the index is fit on a list of dicts
documents = df.to_dict(orient='records')

# Set up the minsearch index:
# text fields are searched as free text, keyword fields hold categorical values
index = Index(
    text_fields=[
        'recipe_name',
        'main_ingredients',
        'all_ingredients',
        'instructions',
        'cuisine_type',
        'dietary_restrictions',
    ],
    keyword_fields=['meal_type', 'difficulty_level']
)

# Fit the index
index.fit(documents)

# Test search: retrieve the top 5 matches and show their names
query = "chicken pasta italian"
results = index.search(query, num_results=5)
for doc in results:
    print(doc['recipe_name'])
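
To make the distinction between text fields and keyword fields concrete, here is a toy, standard-library sketch of the idea. It is not how `minsearch` is implemented (that library relies on TF-IDF vectors and cosine similarity); the `toy_search` function, its term-count scoring, and the three sample recipes are all hypothetical and exist only for illustration.

```python
from collections import Counter

def toy_search(docs, query, text_fields, keyword_filter=None, num_results=5):
    """Rank docs by how often the query terms occur in their text fields."""
    keyword_filter = keyword_filter or {}
    terms = query.lower().split()
    scored = []
    for doc in docs:
        # Keyword fields act as exact-match filters.
        if any(doc.get(field) != value for field, value in keyword_filter.items()):
            continue
        # Text fields are tokenized and scored by term overlap with the query.
        tokens = Counter()
        for field in text_fields:
            tokens.update(str(doc.get(field, "")).lower().split())
        score = sum(tokens[term] for term in terms)
        if score > 0:
            scored.append((score, doc))
    # Highest score first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:num_results]]

# Hypothetical sample data, not taken from the recipes dataset.
sample_docs = [
    {"recipe_name": "Chicken Alfredo Pasta", "cuisine_type": "Italian", "meal_type": "dinner"},
    {"recipe_name": "Beef Tacos", "cuisine_type": "Mexican", "meal_type": "dinner"},
    {"recipe_name": "Chicken Salad", "cuisine_type": "American", "meal_type": "lunch"},
]

hits = toy_search(
    sample_docs,
    "chicken pasta italian",
    text_fields=["recipe_name", "cuisine_type"],
    keyword_filter={"meal_type": "dinner"},
)
print([doc["recipe_name"] for doc in hits])
```

With the `meal_type` filter in place only the dinner recipes are scored, and the taco recipe drops out because it shares no terms with the query.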

### 3.2 Implement OpenAI Integration

Here, we integrate the OpenAI API to enable Retrieval-Augmented Generation (RAG).  
The `build_prompt` function builds a prompt for the language model, using the search results as context.  
The prompt instructs the model to recommend recipes based on the user's query and the retrieved recipes, to explain why each one matches, and to suggest substitutions where needed.  
The `rag_flow` function ties everything together: it runs the search, builds the prompt, sends it to the OpenAI model (`gpt-4o-mini`), and returns the generated response.
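
Before wiring in the real index and API client, the shape of this flow can be sketched with each step passed in as a callable. This is only an illustration under stated assumptions: `rag_pipeline` and the three `fake_*` stand-ins are hypothetical names, and the stub "model" just reports the prompt length so the flow can be exercised without an API key.

```python
def rag_pipeline(query, search_fn, prompt_fn, llm_fn):
    """Run retrieve -> build prompt -> generate, with each step injected as a callable."""
    search_results = search_fn(query)
    prompt = prompt_fn(query, search_results)
    return llm_fn(prompt)

# Hypothetical stand-ins; none of these touch a real index or the OpenAI API.
def fake_search(query):
    return [{"recipe_name": "Chicken Alfredo Pasta"}]

def fake_prompt(query, results):
    names = ", ".join(doc["recipe_name"] for doc in results)
    return f"CONTEXT: {names}\nQUESTION: {query}"

def fake_llm(prompt):
    return f"(stub answer for a prompt of {len(prompt)} characters)"

answer = rag_pipeline("chicken pasta italian", fake_search, fake_prompt, fake_llm)
print(answer)
```

Keeping the three steps separable like this makes it possible to test retrieval and prompt construction without spending API calls.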

In [None]:
import openai
import os

# Initialize the OpenAI client using the API key from environment variables
client = openai.OpenAI(api_key=os.getenv('OPENAI_API_KEY'))

def build_prompt(query, search_results):
    """
    Builds a prompt for the language model using the user's query and search results.
    The prompt includes context from the top retrieved recipes and instructs the model
    to recommend recipes, explain matches, and suggest substitutions if needed.
    """
    # Template for formatting each recipe entry in the context
    entry_template = """
Recipe: {recipe_name}
Cuisine: {cuisine_type}
Meal Type: {meal_type}
Difficulty: {difficulty_level}
Prep Time: {prep_time_minutes} minutes
Cook Time: {cook_time_minutes} minutes
Main Ingredients: {main_ingredients}
Instructions: {instructions}
Dietary Info: {dietary_restrictions}
""".strip()
    
    # Combine the top search results into a single context string
    context = "\n\n".join([entry_template.format(**doc) for doc in search_results])
    
    # Template for the full prompt to send to the language model
    prompt_template = """
You are an expert chef and culinary assistant. Answer the question based on the content from our recipe database.
Use only the facts from the context when answering the question.

CONTEXT:
{context}

QUESTION: {question}

Provide recipe recommendations with brief explanations of why they match the requested ingredients.
If exact ingredients aren't available, suggest the closest matches and mention any substitutions needed.
""".strip()
    
    # Return the formatted prompt with context and user question
    return prompt_template.format(context=context, question=query)

def rag_flow(query):
    """
    The main RAG (Retrieval-Augmented Generation) flow:
    1. Searches the index for relevant recipes based on the query.
    2. Builds a prompt using the search results.
    3. Sends the prompt to the OpenAI language model.
    4. Returns the model's generated response.
    """
    # Retrieve top 5 relevant recipes from the search index
    search_results = index.search(query, num_results=5)
    # Build the prompt for the language model
    prompt = build_prompt(query, search_results)
    
    # Send the prompt to the OpenAI model and get the response
    response = client.chat.completions.create(
        model='gpt-4o-mini',
        messages=[{"role": "user", "content": prompt}]
    )
    
    # Return the generated content from the model's response
    return response.choices[0].message.content
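
One thing to watch for: `entry_template.format(**doc)` raises a `KeyError` if a retrieved document lacks any field named in the template, and empty CSV cells loaded by pandas arrive as `NaN`, which would be rendered as the text `nan`. A minimal sketch of a more forgiving formatter follows, assuming a placeholder of `"N/A"` is acceptable in the prompt. The helper names (`_MissingAsNA`, `clean_doc`), the shortened template, and the sample document are hypothetical.

```python
import math

class _MissingAsNA(dict):
    """dict that yields 'N/A' for absent keys instead of raising KeyError."""
    def __missing__(self, key):
        return "N/A"

def clean_doc(doc):
    """Replace None and float NaN values with 'N/A' and tolerate missing keys."""
    cleaned = _MissingAsNA()
    for key, value in doc.items():
        is_nan = isinstance(value, float) and math.isnan(value)
        cleaned[key] = "N/A" if value is None or is_nan else value
    return cleaned

# Shortened, hypothetical version of the entry template for illustration.
entry_template = (
    "Recipe: {recipe_name}\n"
    "Cuisine: {cuisine_type}\n"
    "Prep Time: {prep_time_minutes} minutes"
)

# Sample document with a NaN value and a missing field.
doc = {"recipe_name": "Chicken Alfredo Pasta", "cuisine_type": float("nan")}

# format_map looks keys up through the mapping, so __missing__ is honored.
entry = entry_template.format_map(clean_doc(doc))
print(entry)
```

`str.format_map` is used instead of `str.format(**doc)` because unpacking with `**` copies the data into a plain dict, which would bypass `__missing__`.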