# Multi-step Cooking Agent

In this notebook, we change how the cooking process works and break it down into four sections:

1. Equipment Prep
2. Ingredients Prep
3. Cooking
4. Garnish

This gives us a more structured input for the agent to work with and allows us to better control the flow of the session.

This approach seems preferable because a single cooking session is noisy, messy, and often involves a lot of back-and-forth between the user and the agent.

The hope is that much of the complexity of the cooking process can be handled upfront in steps 1 and 2, leaving a good impression on the user.

The cooking stage itself is then reduced to as few steps as possible.

## Notebook Aim

In this notebook, we investigate how a recipe consisting of 'ingredients' and 'instructions' can be transformed into this four-step process.

## Approach

Given a prompt and the original recipe, we use a simple structured output from an LLM to populate a `RecipeSteps` object.

```
Recipe (title, ingredients, instructions) --> LLM --> Structured Output --> RecipeSteps
```
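
The structured-output step can be pictured as parsing a JSON object against a fixed four-stage schema. The block below is a minimal sketch using only the standard library: `StagedRecipe` and `parse_staged_recipe` are hypothetical stand-ins for illustration (they are not the notebook's `RecipeSteps` class or LangChain's implementation), and the payload is hand-written rather than real model output.

```python
import json
from dataclasses import dataclass, fields
from typing import List


@dataclass
class StagedRecipe:
    """Hypothetical stand-in for the four-stage schema."""
    equipment_prep: List[str]
    ingredients_prep: List[str]
    cooking: List[str]
    garnish: List[str]


def parse_staged_recipe(raw_json: str) -> StagedRecipe:
    """Parse a JSON object and check that every stage is a list of strings."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    kwargs = {}
    for field in fields(StagedRecipe):
        steps = data.get(field.name)
        if not isinstance(steps, list) or not all(isinstance(s, str) for s in steps):
            raise ValueError(f"'{field.name}' must be a list of strings")
        kwargs[field.name] = steps
    return StagedRecipe(**kwargs)


# Hand-written example payload (an assumption for illustration, not model output).
example_payload = json.dumps({
    "equipment_prep": ["Preheat the oven."],
    "ingredients_prep": ["Chop the onion."],
    "cooking": ["Roast for 20 minutes."],
    "garnish": ["Add herbs."],
})
staged = parse_staged_recipe(example_payload)
print(staged.cooking)
```

Rejecting a payload with a missing or malformed stage early keeps the downstream session logic from having to handle partial recipes.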

## Findings

1. Small models (around 4 GB) are not powerful enough to handle this task reliably.
2. `gemini-2.5-flash-lite` handles the task reliably.

## Next Steps

Incorporate this into the main voice agent, along with a flow for interacting with the client.

In [1]:
from recipe_scrapers import scrape_me
from sous_chef.protocol import messages
from sous_chef.session.base import BaseSession



In [2]:
import pandas as pd

df = pd.read_parquet('../../data/02_intermediate/recipe_ingredients.parquet')
recipe = pd.read_parquet('../../data/02_intermediate/recipe.parquet')

In [3]:
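def get_recipe_per_index(df, recipe_index: int):
    # Look up one recipe row by its index label.
    return df.loc[recipe_index]

def get_ingredients_per_recipe_index(df, recipe_index: int):
    # recipe_index is a position within the unique 'recipes_index' values.
    recipe_idx = df['recipes_index'].unique()
    # A boolean mask returns a list even when the recipe has a single ingredient row.
    ingredients = df.loc[df['recipes_index'] == recipe_idx[recipe_index], 'ingredients'].to_list()
    return ingredients

def get_recipe_contents_per_index(recipe, ingredients_df, recipe_index: int):
    recipe_row = get_recipe_per_index(recipe, recipe_index)
    # Read the ingredients from the DataFrame passed in as an argument.
    ingredients = get_ingredients_per_recipe_index(ingredients_df, recipe_index)
    return recipe_row.Title, recipe_row.Instructions, ingredients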


Rather than the agent taking a single recipe as input, we transform the input into four sections:

1. Equipment Prep
2. Ingredients Prep
3. Cooking
4. Garnish

Each section is a list of steps.

This gives us a more structured input for the agent to work with.
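
Because each section is just an ordered list of steps, a session can walk the recipe one step at a time in a fixed stage order. The block below is a minimal sketch of that idea under stated assumptions: `STAGE_ORDER`, `iter_steps`, and the example data are hypothetical names and values made up for illustration, not part of the `sous_chef` package.

```python
from typing import Dict, Iterator, List, Tuple

# Fixed stage order, matching the four sections listed above.
STAGE_ORDER = ("equipment_prep", "ingredients_prep", "cooking", "garnish")


def iter_steps(stages: Dict[str, List[str]]) -> Iterator[Tuple[str, int, str]]:
    """Yield (stage, step_number, step) in stage order.

    step_number restarts at 1 for each stage; missing or empty stages are skipped.
    """
    for stage in STAGE_ORDER:
        for number, step in enumerate(stages.get(stage, []), start=1):
            yield stage, number, step


# Made-up example data for illustration only.
example_stages = {
    "equipment_prep": ["Preheat the oven."],
    "ingredients_prep": ["Chop the onion.", "Mince the garlic."],
    "cooking": ["Roast for 20 minutes."],
    "garnish": [],
}

for stage, number, step in iter_steps(example_stages):
    print(f"[{stage}] {number}. {step}")
```

A generator like this lets the caller pause between steps, which suits a session where the user confirms each step before moving on.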

In general, the small local models we ran through Ollama are not powerful enough to handle this task reliably, so we use `gemini-2.5-flash-lite` to get reliable results.

In [62]:
from dotenv import load_dotenv
from typing import List
from pydantic import BaseModel, Field
from langchain.chat_models import init_chat_model
from langchain_core.prompts import ChatPromptTemplate

load_dotenv()

class RecipeSteps(BaseModel):
    equipment_prep: List[str] = Field(description="A step by step guide to preparing cooking equipment.")
    ingredients_prep: List[str] = Field(description="A step by step guide to preparing the ingredients - Mise en place.")
    cooking: List[str] = Field(description="A step by step guide to cook the recipe.")
    garnish: List[str] = Field(description="A step by step guide to garnish the recipe.")

llm = init_chat_model(
    model="gemini-2.5-flash-lite",
    model_provider="google_genai",
    temperature=0
)
structured_llm = llm.with_structured_output(RecipeSteps)

prompt = ChatPromptTemplate.from_messages([
    (
        "system",
        " ".join([
            "PURPOSE:",
            "You are an expert chef who will be assisting a user to cook in the kitchen.",
            "INSTRUCTIONS:",
            "Given a list of ingredients and instructions for a recipe, you will provide a step by step guide",
            "to preparing the recipe.",
            "You will return a List of strings, containing the step by step guide to prepare the recipe.",
            "BEWARE:",
            "1. If a step is too long, the user will not be able to remember them all. Make sure each step is simple, concise, and easy to remember.",
            "2. Do not use emojis, bold, or italic text.",
            "3. Do not be conversational.",
            "4. Don't add steps not explicitly mentioned in the original recipe.",
            "METHOD:",
            "Break down the recipe into the following stages:",
            "1. Steps for preparing equipment, such as preheating the oven. This should only contain steps taken before any cooking is done.",
            "2. Steps for preparing ingredients. Each ingredient should have its own step. This includes chopping, slicing, dicing, etc. - Mise en place.",
            "3. Steps for cooking the recipe.",
            "4. Steps for garnishing the recipe. This includes zesting lemons, adding herbs, etc. - steps done at the end.",
        ])
    ),
    (
        "human",
        " ".join([
            "TITLE:",
            "{recipe_title}",
            "INGREDIENTS:",
            "{ingredients}",
            "INSTRUCTIONS:",
            "{instructions}",
        ])
    )
])

def classify_ingredients(title, instructions, ingredients: List[str]):
    chain = prompt | structured_llm
    
    # Format the list as a string for the prompt
    ingredients_str = ", ".join(ingredients)
    
    response = chain.invoke({
        "recipe_title" : title,
        "instructions": instructions,
        "ingredients": ingredients_str
    })
    
    return response



In [66]:
title, instructions, ingredients = get_recipe_contents_per_index(recipe, df, 6)

print(f"Title: {title}")
print()
print(instructions)
print()
print("\n".join(ingredients))

Title: Apples and Oranges

Add 3 oz. Grand Marnier, 1 oz. Amaro Averna, and a small pat of salted butter to a 16-ounce travel mug or large heatproof measuring cup. Add 1 cup hot apple cider and stir until butter melts.
Add 1½ tsp. fresh lemon juice and taste for sweetness, adding additional if needed. If serving in travel mug, garnish with ground pink peppercorns and seal. If serving in mugs, divide drink between two warmed mugs and garnish each with a pink peppercorn-dusted lemon wheel.

3 oz. Grand Marnier
1 oz. Amaro Averna
Small pat salted butter (about ½ teaspoon)
1 cup hot apple cider
1½ to 3 tsp. fresh lemon juice (to taste, depending on the sweetness of your cider)
Garnish: freshly ground pink peppercorns
plus 2 lemon wheels (optional)


In [67]:
response = classify_ingredients(title, instructions, ingredients)

In [68]:
print("Equipment Prep:")
print()
print("\n".join(response.equipment_prep))
print()
print("Ingredients Prep:")
print()
print("\n".join(response.ingredients_prep))
print()
print("Cooking:")
print()
print("\n".join(response.cooking))
print()
print("Garnish:")
print()
print("\n".join(response.garnish))

Equipment Prep:

Warming two mugs if serving in mugs.

Ingredients Prep:

Measure out 3 oz. Grand Marnier.
Measure out 1 oz. Amaro Averna.
Measure out a small pat of salted butter (about ½ teaspoon).
Measure out 1 cup hot apple cider.
Measure out 1½ to 3 tsp. fresh lemon juice.

Cooking:

Add Grand Marnier, Amaro Averna, and butter to a travel mug or heatproof measuring cup.
Add hot apple cider and stir until butter melts.
Add lemon juice and taste for sweetness, adding more if needed.

Garnish:

If serving in a travel mug, garnish with ground pink peppercorns and seal.
If serving in mugs, dust lemon wheels with pink peppercorns and place one on each mug.
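
The prompt asks for steps that are simple and concise, so one way to spot-check a result is to flag steps above a word budget. The block below is a minimal sketch of such a check: `find_long_steps` and the `max_words=12` threshold are hypothetical choices for illustration, not part of the notebook's pipeline, and the two sample steps are copied from the output above.

```python
from typing import Dict, List, Tuple


def find_long_steps(stages: Dict[str, List[str]], max_words: int) -> List[Tuple[str, int, int]]:
    """Return (stage, step_number, word_count) for every step longer than max_words.

    Words are counted by splitting on whitespace; step_number is 1-based within its stage.
    """
    flagged = []
    for stage, steps in stages.items():
        for number, step in enumerate(steps, start=1):
            word_count = len(step.split())
            if word_count > max_words:
                flagged.append((stage, number, word_count))
    return flagged


# Two steps taken from the printed result above.
sample_stages = {
    "cooking": ["Add hot apple cider and stir until butter melts."],
    "garnish": [
        "If serving in mugs, dust lemon wheels with pink peppercorns and place one on each mug."
    ],
}

# The 12-word budget is an arbitrary assumption; tune it to what users can retain.
for stage, number, word_count in find_long_steps(sample_stages, max_words=12):
    print(f"{stage} step {number} has {word_count} words")
```

A check like this only measures length, so it complements rather than replaces reading the steps against the original recipe.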
