# Agent for recipe cooking - Initial Performance Review

## Objective

We want an initial sense of what is and is not possible.

The idea is to build a recipe cooking agent that guides a user through the process of cooking a recipe.

Before illustrating what a session might look like, we need to understand how recipe data is structured.

## Summary of results

Given an initial system prompt with little tuning, we were able to get the agent working as intended on a subset of the test recipes: it starts with preparing the ingredients, then moves on to cooking the recipe.

However, a major flaw is that the agent cannot reliably terminate the session once the recipe is complete.

This will need to be addressed properly in a follow-up iteration.


## Recipe Data Structure

The `recipe_scrapers` package's `scrape_me` function takes a URL and returns an `AbstractScraper` object, which exposes structured data about the recipe.

Given a list of URLs to test on, we can get a list of `AbstractScraper` objects.

In [1]:
from recipe_scrapers import scrape_me

test_urls = [
    "https://www.allrecipes.com/recipe/158968/spinach-and-feta-turkey-burgers/",
    "https://www.allrecipes.com/recipe/93037/fresh-cranberry-salsa/",
    "https://www.bbcgoodfood.com/recipes/spicy-chickpea-stew",
    "https://www.bbcgoodfood.com/recipes/oven-baked-risotto",
    "https://www.bbcgoodfood.com/recipes/veggie-shepherds-pie-sweet-potato-mash",
    "https://www.allrecipes.com/recipe/157940/outrageous-warm-chicken-nacho-dip/",
    "https://dinnerthendessert.com/pork-dim-sum-recipe/",
    "https://www.hellofresh.com/recipes/fuego-chicken-fajita-tacos-68c7e2b837c1ef9942a55306"
]

recipes = [scrape_me(url) for url in test_urls]

For now, we are interested in the following methods:

In [26]:
recipes[2].ingredients()

['1 tbsp rapeseed oil',
 '2 onions (320g), roughly chopped',
 '2 green peppers deseeded and cut into cubes',
 '2 tsp hot chilli powder',
 '1 tbsp ground coriander',
 '1 tsp ground cumin',
 '500ml carton passata',
 '2 x 400g cans chickpeas',
 '2 tsp vegetable bouillon powder',
 '40g flame raisins',
 '½ lemon juiced, flesh scooped out and white pith removed, then zest finely chopped (you’ll need 2 tsp)',
 '350g cauliflower florets',
 '15g parsley chopped',
 '140g wholemeal couscous',
 '40g toasted flaked almonds']

In [27]:
print(recipes[2].instructions())

Heat the oil in a large lidded pan over a medium heat and fry the onions for 10 mins, stirring often until golden. Stir in the peppers and cook for 5 mins more.
Add the chilli powder, coriander and cumin, stir briefly, then tip in the passata and chickpeas along with the liquid from the cans.
Stir in the bouillon powder, raisins and lemon zest, then add the cauliflower. Cover tightly and simmer over a medium heat for 15-20 mins until the cauliflower is tender. Stir in half the parsley.
Meanwhile, put the couscous in a heatproof bowl and pour over 175ml boiling water from the kettle. Stir in the lemon juice, then cover and let stand for about 10 mins until the couscous has absorbed the liquid and is tender. Stir in the toasted flaked almonds and most of the remaining parsley.
Divide half the couscous between two plates and top with half the chickpea stew and the rest of the parsley. Leave the remainder to cool for another day. Will keep covered and chilled for up to three days. Reheat t

We can typically break down the cooking process into two main stages:

### 1. Ingredient Preparation

This includes chopping, slicing, dicing, etc. 

This step is often omitted from the instructions section and is assumed to have been completed already. The example recipe above shows this: as a prerequisite, the user has already chopped the onions, deseeded and cubed the green peppers, juiced the lemon, and so on.

These steps are still part of the cooking process and need step-by-step guidance from the agent, since we assume the user has messy hands and cannot look at the recipe.
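To make the "hidden" preparation work concrete, the sketch below scans ingredient strings for preparation verbs. The keyword list `PREP_WORDS` and the helper `find_prep_tasks` are hypothetical and only illustrate the idea; in this notebook the agent is expected to infer the preparation steps itself from the ingredients list.

```python
import re

# Hypothetical keyword list, chosen to match the example recipe above.
PREP_WORDS = ("chopped", "deseeded", "cut", "juiced", "sliced", "diced")


def find_prep_tasks(ingredients: list[str]) -> list[tuple[str, list[str]]]:
    """Return (ingredient line, matched prep words) for lines that imply prep work."""
    tasks = []
    for line in ingredients:
        # Whole-word match so that e.g. "cut" does not match inside another word.
        words = [w for w in PREP_WORDS if re.search(rf"\b{w}\b", line)]
        if words:
            tasks.append((line, words))
    return tasks


sample = [
    "1 tbsp rapeseed oil",
    "2 onions (320g), roughly chopped",
    "2 green peppers deseeded and cut into cubes",
    "15g parsley chopped",
]

prep_tasks = find_prep_tasks(sample)
for line, words in prep_tasks:
    print(f"{line} -> {', '.join(words)}")
```

Lines without a preparation verb, such as the rapeseed oil, are skipped, which mirrors the split between "just measure it" ingredients and those that need work before cooking starts.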

### 2. Instructions

This is the actual cooking process after the ingredients have been prepared.

This is a simpler task because all the steps are predefined in the recipe, so the agent only needs to tell the user what to do next and handle any questions the user may have.
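As a rough illustration of why this stage is simpler, the instructions can be turned into an ordered list of steps with very little code. The sketch assumes that `instructions()` returns one step per line, as the printed output above suggests; `split_steps` is a hypothetical helper and the sample text is abbreviated from the example recipe.

```python
def split_steps(instructions: str) -> list[str]:
    """Split a newline-separated instructions string into non-empty steps."""
    return [line.strip() for line in instructions.splitlines() if line.strip()]


# Abbreviated sample; assumes one step per line.
sample_instructions = (
    "Heat the oil in a large lidded pan over a medium heat.\n"
    "Add the chilli powder, coriander and cumin, stir briefly.\n"
    "\n"
    "Stir in the bouillon powder, raisins and lemon zest.\n"
)

steps = split_steps(sample_instructions)
for number, step in enumerate(steps, start=1):
    print(f"Step {number}: {step}")
```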

## Session Composition

Given the above, we now have a two-stage process for our session.

Ingredients Prep Stage:

```
URL --> Ingredients list --> Ingredients Preparation Step by Step Guide --> (wait for user to finish)
```

Cooking Stage:

```
Cooking Step by Step Guide --> (wait for user to finish)
```

## Session Testing Implementation

Below, we construct an initial prompt template for the agent.

It is a single prompt. I anticipate that the agent will eventually need two different graphs or chains, and hence two different prompts, to handle both stages without getting confused.

For now, we experiment with a single system prompt to see whether it can handle both stages.

In [33]:
from abc import ABC, abstractmethod
from string import Template

from langchain.agents import create_agent
from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage, SystemMessage
from loguru import logger
from recipe_scrapers import AbstractScraper


class CookingSystemPrompt(ABC):
    def __init__(self, recipe:AbstractScraper):
        self.recipe = recipe

    @abstractmethod
    def get_template(self) -> Template:
        ...

    def construct_message(self) -> SystemMessage:
        return SystemMessage(
            content=self.get_template().substitute(
                dict(
                    ingredients=self.recipe.ingredients(),
                    instructions=self.recipe.instructions()
                )
            )
        )

In [35]:
class SystemPrompt1(CookingSystemPrompt):
    def get_template(self) -> Template:
        return Template(
            "\n".join([
                "PURPOSE:",
                "You are a sous-chef who will be assisting with cooking for a user in the kitchen.",
                "Your tasks can be the following:",
                "- Provide alternative ingredients when an ingredient is missing.",
                "- Set timers for particular steps in a recipe where a set time is required, e.g. marinading, oven, microwave.",
                "INSTRUCTIONS:",
                "Given a list of ingredients and instructions for a recipe, you will provide a step by step guide",
                "to prompt the user on what to do next.",
                "Wait for the user to provide confirmation they've completed the step before moving to the next step.",
                "Once all steps have been completed, you will output 'STOP' so that we know the user has completed the recipe.",
                "METHOD:",
                "Break down the recipe into the following stages:",
                "1. Preparation of ingredients. This includes chopping, slicing, dicing, etc.",
                "2. Instructions for cooking the recipe.",
                "INGREDIENTS:",
                "${ingredients}",
                "INSTRUCTIONS FOR COOKING THE RECIPE:",
                "${instructions}",
            ])
        )
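
One detail worth knowing about the template above: `Template.substitute` converts each value with `str()`, so passing the ingredients list directly renders it as a Python list literal, brackets and quotes included. The sketch below compares that with joining the list into one line per ingredient first. The shortened template and sample data are stand-ins, and whether the bulleted form actually helps the model has not been tested here.

```python
from string import Template

# Shortened stand-in for the tail of the prompt template.
template = Template(
    "INGREDIENTS:\n${ingredients}\n"
    "INSTRUCTIONS FOR COOKING THE RECIPE:\n${instructions}"
)

ingredients = ["1 tbsp rapeseed oil", "2 onions (320g), roughly chopped"]
instructions = "Heat the oil in a large lidded pan.\nAdd the chilli powder."

# Passing the list directly renders its repr: ['1 tbsp rapeseed oil', ...]
as_repr = template.substitute(ingredients=ingredients, instructions=instructions)

# Joining first renders one ingredient per line.
as_lines = template.substitute(
    ingredients="\n".join(f"- {item}" for item in ingredients),
    instructions=instructions,
)

print(as_repr)
print("-----")
print(as_lines)
```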


We create a mockup of what a Session object might look like for handling a recipe cooking session.

A session starts with a chat model and a recipe URL.

The agent would be tasked with providing step-by-step guidance for cooking the recipe.

The session ends when the recipe is complete, or when the user tells the agent to stop cooking.
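The two end conditions can be written as a single predicate. The sketch below is a more tolerant variant than an exact string comparison: it accepts the stop token when it appears alone on the last non-empty line, with optional trailing punctuation. The helper names `is_stop_message` and `session_should_end` are hypothetical, and this is not the check used in the `Session` class in this notebook, which compares the whole message to the token.

```python
STOP_TOKEN = "STOP"


def is_stop_message(text: str, token: str = STOP_TOKEN) -> bool:
    """True if the last non-empty line is the stop token (trailing '.' or '!' allowed)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return bool(lines) and lines[-1].rstrip(".!") == token


def session_should_end(user_text: str, agent_text: str) -> bool:
    """End when either the user or the agent emits the stop token."""
    return is_stop_message(user_text) or is_stop_message(agent_text)


examples = ["STOP", " STOP\n", "Enjoy your meal!\nSTOP.", "Don't stop stirring", ""]
results = [is_stop_message(text) for text in examples]
print(results)
```

The check is case-sensitive on purpose, so an instruction such as "Don't stop stirring" does not end the session.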

In [51]:
import sqlite3
from typing import Generator, Iterable
from uuid import uuid4

from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.graph.state import CompiledStateGraph


class BaseSession(ABC):  # inherit from ABC (imported earlier) so @abstractmethod is enforced
    def __init__(
        self,
        model_config:dict,
        checkpointer_path:str | None = None,
    ):
        self.model_config = model_config
        self.checkpointer_path = checkpointer_path

        self.chat_model = init_chat_model(**self.model_config)
        self.agent = None
        self.session_id = uuid4()

    def get_runnable_config(self):
        return {"configurable": {"thread_id": self.session_id}}

    def get_checkpointer(self):
        """Get a sqlite checkpointer for the agent."""
        if self.checkpointer_path:
            return SqliteSaver(sqlite3.connect(self.checkpointer_path, check_same_thread=False))
        return None

    def get_system_prompt(self) -> str:
        pass

    def get_tools(self):
        return None

    def get_agent(self) -> CompiledStateGraph:
        """Return a CompiledStateGraph for the session."""
        logger.info("Creating agent...")
        return create_agent(
            self.chat_model,
            tools=self.get_tools(),
            system_prompt=self.get_system_prompt(),
            checkpointer=self.get_checkpointer()
        )

    @abstractmethod
    def get_user_input(self) -> str:
        pass

    @abstractmethod
    def step(self, user_input:str):
        """Take the next step in the user-agent interaction.

        Given a user input, run the agent and return a response.
        """
        pass

    @abstractmethod
    def emit(self, output):
        """Emit output from the agent to the user."""
        pass

    @abstractmethod
    def session_ended(self) -> bool:
        """Check if the session has ended."""
        pass

    def get_message_history(self):
        return self.agent.get_state(self.get_runnable_config()).values['messages']

    def run(self):
        """Run the session."""
        logger.info("running session...")
        self.agent = self.get_agent()
        while not self.session_ended():
            qry = self.get_user_input()
            agent_response = self.step(qry)
            self.emit(agent_response)


class Session(BaseSession):
    def __init__(
        self,
        model_config:dict,
        recipe_url:str,
        system_prompt_ctr: CookingSystemPrompt,
        checkpointer_path:str | None = None,
    ):
        super().__init__(model_config, checkpointer_path)
        self.system_prompt_ctr = system_prompt_ctr
        self.recipe_url = recipe_url
        self._stop_session_flag = False
        self._stop_session_token = "STOP"

    def get_system_prompt(self) -> str:
        """Get the system prompt for the agent given a recipe url."""
        recipe = scrape_me(self.recipe_url)
        return self.system_prompt_ctr(recipe).construct_message().content

    def step(self, user_input:str):
        """Take the next step in the user-agent interaction.

        Given a user input, run the agent and return a response.
        """
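        if self._stop_session_flag:
            # Return a message object (not a plain dict) so that emit() can pretty-print it.
            from langchain.messages import AIMessage
            return {"messages": [AIMessage(content="Session ended.")]}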

        output = self.agent.invoke(
            dict(messages=[HumanMessage(content=user_input)]),
            config = self.get_runnable_config(),
            stream_mode="values"
        )

        if output['messages'][-1].content == self._stop_session_token:
            self._stop_session_flag = True

        return output

    def emit(self, output):
        """Emit output from the agent to the user."""
        output['messages'][-1].pretty_print()

    def get_user_input(self) -> str:
        """Get the next input from the user."""
        logger.info("awaiting user input...")
        usr_input = input("User: ")
        logger.info(f"Received user input: {usr_input}")
        self._stop_session_flag = usr_input == self._stop_session_token
        return usr_input

    def session_ended(self) -> bool:
        return self._stop_session_flag

We run the session given some initial configuration.

In [11]:
from dotenv import load_dotenv

load_dotenv()

model_config = dict(
    model="llama3-groq-tool-use:8b",
    model_provider="ollama",
    reasoning=False,
    num_ctx=8192,
    temperature=0
)

test_sessions = [
    Session(model_config, test_url, SystemPrompt1, checkpointer_path=f".db{i}")
    for i, test_url in enumerate(test_urls)
]

In [12]:
test_sessions[0].run()

2025-10-27 02:37:44.954 | INFO     | __main__:run:74 - running session...
2025-10-27 02:37:44.955 | INFO     | __main__:get_agent:39 - Creating agent...
2025-10-27 02:37:46.854 | INFO     | __main__:get_user_input:127 - awaiting user input...



Let's start with preparing the ingredients. Do you have all the ingredients ready?


2025-10-27 02:37:52.082 | INFO     | __main__:get_user_input:127 - awaiting user input...



Great! Let's begin by mixing together the turkey, spinach, feta, eggs, and garlic in a large bowl until well combined. Then, form into 8 patties.


2025-10-27 02:37:59.593 | INFO     | __main__:get_user_input:127 - awaiting user input...



Now, preheat an outdoor grill for medium-high heat and lightly oil the grate.


We now create a test bed on which we can run multiple sessions with different configurations.

## Trial Configurations

In [27]:
trial_model_configs = [
    dict(
        model="llama3-groq-tool-use:8b",
        model_provider="ollama",
        reasoning=False,
        num_ctx=8192,
        temperature=0.1
    ),
    dict(
        model="llama3.2:latest",
        model_provider="ollama",
        reasoning=False,
        num_ctx=8192,
        temperature=0.1
    ),
    dict(
        model="qwen3:latest",
        model_provider="ollama",
        reasoning=True,
        num_ctx=8192,
        temperature=0.1
    ),
]

test_urls = [
    "https://www.allrecipes.com/recipe/158968/spinach-and-feta-turkey-burgers/",
    "https://www.allrecipes.com/recipe/93037/fresh-cranberry-salsa/",
    "https://www.bbcgoodfood.com/recipes/spicy-chickpea-stew",
    "https://www.bbcgoodfood.com/recipes/oven-baked-risotto",
    "https://www.bbcgoodfood.com/recipes/veggie-shepherds-pie-sweet-potato-mash",
    "https://www.allrecipes.com/recipe/157940/outrageous-warm-chicken-nacho-dip/",
    "https://dinnerthendessert.com/pork-dim-sum-recipe/",
    "https://www.hellofresh.com/recipes/fuego-chicken-fajita-tacos-68c7e2b837c1ef9942a55306"
]

In [41]:
import itertools
from pathlib import Path
from tqdm import tqdm


def mock_get_user_input():
    return 'yes'


trial_configs = itertools.product(trial_model_configs, test_urls)
logger.info(f"Executing {len(trial_model_configs) * len(test_urls)} trial sessions...")

completed_sessions = []
i = 0
for model_config, test_url in tqdm(trial_configs):
    # set up experiment checkpoint db
    checkpointer_dir = Path(f"./experiment{i}/")
    checkpointer_dir.mkdir(parents=True, exist_ok=True)
    checkpointer_path = checkpointer_dir / ".db"

    # run session
    session = Session(model_config, test_url, SystemPrompt1, checkpointer_path=checkpointer_path)
    session.get_user_input = mock_get_user_input
    session.run()

    completed_sessions.append(session)

    i += 1

2025-10-28 16:21:09.392 | INFO     | __main__:<module>:11 - Executing 24 trial sessions...
0it [00:00, ?it/s]
2025-10-28 16:21:09.473 | INFO     | __main__:run:74 - running session...
2025-10-28 16:21:09.474 | INFO     | __main__:get_agent:39 - Creating agent...



Great! Let's start with preparing the ingredients. Do you have all the ingredients ready?

Perfect! Now, let's move on to cooking the recipe. Preheat an outdoor grill for medium-high heat and lightly oil the grate.

Next, mix together turkey, spinach, feta, eggs, and garlic in a large bowl until well combined; form into 8 patties.

Finally, cook patties on the preheated grill on both sides until no longer pink in the center, 15 to 20 minutes. An instant-read thermometer inserted into the center of patties should read at least 165 degrees F (74 degrees C).


1it [00:13, 13.78s/it]
2025-10-28 16:21:23.237 | INFO     | __main__:run:74 - running session...
2025-10-28 16:21:23.238 | INFO     | __main__:get_agent:39 - Creating agent...



STOP

Great! Let's start with preparing the ingredients. Could you please chop the cranberries into smaller pieces?

Next, could you dice the cucumber and chop the celery?

Now, finely chop the pickled jalapeño peppers.

Let's move on to cooking the recipe. Place the chopped cranberries into a food processor and pulse until they're still slightly chunky.

Transfer the processed cranberries to a serving bowl. Stir in the sugar, orange liqueur, cucumber, celery, and jalapeño. Let it sit at room temperature for 15 minutes before serving.


2it [00:22, 10.86s/it]
2025-10-28 16:21:32.034 | INFO     | __main__:run:74 - running session...
2025-10-28 16:21:32.035 | INFO     | __main__:get_agent:39 - Creating agent...



STOP

Great! Let's start by preparing the ingredients. Do you have all the ingredients listed?

Perfect! First, let's chop the onions. How many onions do we need to chop?

Got it. Now, can you deseed and cut the green peppers into cubes?

Excellent. Next, I'll need you to grind the coriander and cumin. Do you have a spice grinder or should we use pre-ground spices?

Perfect. Now that we have all our ingredients ready, let's start cooking! Heat the rapeseed oil in a large lidded pan over medium heat.

Great! Fry the onions for 10 minutes, stirring often until they're golden. Then, add the peppers and cook for another 5 minutes.

Now, add the chilli powder, coriander, and cumin. Stir briefly, then tip in the passata and chickpeas along with their liquid.

Stir in the bouillon powder, raisins, and lemon zest. Then, add the cauliflower. Cover tightly and simmer over medium heat for 15-20 minutes until the cauliflower is tender.

While the stew is cooking, prepare the couscous. Put it in a

2it [22:40, 680.40s/it]


KeyboardInterrupt: 

From the testing above, it is evident that there is a major issue in the interaction between the session and the agent: we cannot reliably terminate the session after the recipe is complete.

We end the notebook here for now and will address this issue in the next notebook.

In [54]:
session.get_message_history()

[HumanMessage(content='yes', additional_kwargs={}, response_metadata={}, id='9cb1979c-2aa6-4ccd-8622-8c6efac4cbfe'),
 AIMessage(content="Great! Let's start by preparing the ingredients. Do you have all the ingredients listed?", additional_kwargs={}, response_metadata={'model': 'llama3-groq-tool-use:8b', 'created_at': '2025-10-28T07:21:35.78952Z', 'done': True, 'done_reason': 'stop', 'total_duration': 3463705000, 'load_duration': 66766458, 'prompt_eval_count': 615, 'prompt_eval_duration': 2374054625, 'eval_count': 19, 'eval_duration': 1020586958, 'model_name': 'llama3-groq-tool-use:8b', 'model_provider': 'ollama'}, id='lc_run--c6fc6935-9b42-4419-80ef-aeab26b1bad9-0', usage_metadata={'input_tokens': 615, 'output_tokens': 19, 'total_tokens': 634}),
 HumanMessage(content='yes', additional_kwargs={}, response_metadata={}, id='58a38b19-93bf-44ec-b12b-8f0ec32e01c1'),
 AIMessage(content="Perfect! First, let's chop the onions. How many onions do we need to chop?", additional_kwargs={}, response