# Homework Assignment 2: Recipe Bot Error Analysis

This notebook shows you how to run the second homework example using Galileo.

## Configuration

To run this notebook, you need a Galileo account and an LLM integration, which the experiment uses to generate responses.

1. If you don't have a Galileo account, head to [app.galileo.ai/sign-up](https://app.galileo.ai/sign-up) and sign up for a free account.
1. Once you have signed up, you will need to configure an LLM integration. Head to the [integrations page](https://app.galileo.ai/settings/integrations) and configure your integration of choice. The notebook assumes you are using OpenAI, but has details on what to change if you are using a different LLM.
1. Create a Galileo API key from the [API keys page](https://app.galileo.ai/settings/api-keys).
1. This folder contains an example `.env` file called `.env.example`. Copy this file to `.env`, and set the value of `GALILEO_API_KEY` to the API key you just created.
1. If you are using a custom Galileo deployment inside your organization, then set the `GALILEO_CONSOLE_URL` environment variable to your console URL. If you are using [app.galileo.ai](https://app.galileo.ai), such as with the free tier, then you can leave this commented out.
1. This code uses OpenAI to generate some values. Update the `OPENAI_API_KEY` value in the `.env` file with your OpenAI API key. If you are using another LLM, you will need to update the code to reflect this.


In [1]:
# Install the galileo and python-dotenv packages into the current Jupyter kernel
%pip install "galileo[openai]" python-dotenv


## Environment setup

To use Galileo, we need to load the API key from the `.env` file.

In [3]:
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Check that the GALILEO_API_KEY environment variable is set
if not os.getenv("GALILEO_API_KEY"):
    raise ValueError("GALILEO_API_KEY environment variable is not set. Please set it in your .env file.")
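
Because a later cell calls OpenAI directly with `OPENAI_API_KEY`, it may be worth checking every key the notebook needs in one pass. The following is a minimal sketch: `missing_env_vars` is a hypothetical helper (not part of Galileo or python-dotenv), and the list of required names is an assumption based on the configuration steps above.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Assumed list of keys this notebook needs; adjust it if you use another LLM.
REQUIRED_ENV_VARS = ["GALILEO_API_KEY", "OPENAI_API_KEY"]

missing = missing_env_vars(REQUIRED_ENV_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("All required environment variables are set.")
```

Run this after `load_dotenv()` so that values from the `.env` file are visible in `os.environ`.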

Next, we need to ensure that a Galileo project exists, creating it if necessary.

In [4]:
from galileo.projects import create_project, get_project

PROJECT_NAME = "AI Evals Course - Homework 1"
project = get_project(name=PROJECT_NAME)
if project is None:
    project = create_project(name=PROJECT_NAME)

print(f"Using project: {project.name} (ID: {project.id})")

Using project: AI Evals Course - Homework 1 (ID: b54a29e8-7c14-436f-ba65-34318928d1ca)


In this notebook, you will use the LLM integration you set up in Galileo to run an experiment, and you will also call OpenAI directly to generate some data. The default model is GPT-5.1, which assumes you have configured an OpenAI integration.

If you have another integration set up, or want to use a different model, update the `MODEL` value.

In [5]:
MODEL = "gpt-5.1"

## Part 1: Generate Test Queries

### Pick your dimensions

Pick the dimensions that matter for your test queries, such as cuisine, dietary restrictions, and meal type. Then add example values, ideally at least three for each dimension.

Update the code below to reflect these dimensions and example values.

In [6]:
# Define the dimensions for the recipe generation task, along with some example values
dimensions = [
    {
        "name": "cuisine",
        "values": ["Italian", "Chinese", "Mexican"]
    },
    {
        "name": "dietary restrictions",
        "values": ["Vegetarian", "Vegan", "Gluten-Free", "Diabetic"]
    },
    {
        "name": "meal type",
        "values": ["Breakfast", "Lunch", "Dinner", "Snack"]
    }
]

### Create combinations

You can use an LLM to generate queries using combinations of the different dimensions.
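
Before asking the LLM, it can help to see how many combinations the dimensions produce. The sketch below enumerates them locally with `itertools.product`. It is illustrative only: `example_dimensions` and `dimension_combinations` are hypothetical names, and the plain dictionary layout is a simplification chosen for this example.

```python
from itertools import product

# Illustrative dimension values, mirroring the examples in this notebook.
example_dimensions = {
    "cuisine": ["Italian", "Chinese", "Mexican"],
    "dietary restrictions": ["Vegetarian", "Vegan", "Gluten-Free", "Diabetic"],
    "meal type": ["Breakfast", "Lunch", "Dinner", "Snack"],
}


def dimension_combinations(dims):
    """Return every combination of values as a list of {dimension: value} dicts."""
    names = list(dims)
    return [dict(zip(names, values)) for values in product(*dims.values())]


combinations = dimension_combinations(example_dimensions)
print(f"{len(combinations)} combinations, for example: {combinations[0]}")
```

With three, four, and four values, this yields 48 combinations, more than the 20 queries requested in the next cell, so the LLM will cover only a subset of them.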

In [7]:
from openai import OpenAI
import ast

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Create a prompt to generate test queries using the dimensions
prompt = f"""Generate 20 diverse test queries for a recipe bot. Use combinations of the following dimensions:

{dimensions}

The queries should be natural language questions that users might ask a recipe bot, incorporating different combinations of the dimension values provided.

Return ONLY a valid Python list of strings, with no additional text or explanation. For example:
["query 1", "query 2", ...]
"""

response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful assistant that generates test queries. Return only valid Python lists."},
        {"role": "user", "content": prompt}
    ],
    temperature=0.8
)

# Extract the response text and parse it into a Python list
test_queries = ast.literal_eval(response.choices[0].message.content)

print(f"Generated {len(test_queries)} test queries:")
for i, query in enumerate(test_queries, 1):
    print(f"{i}. {query}")

OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
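
Once the API key is set and the call succeeds, the cell above passes the model's reply straight to `ast.literal_eval`, which fails if the model wraps the list in a Markdown code fence or returns something other than a list. A more defensive parser might look like the sketch below. `parse_query_list` is a hypothetical helper, and the assumption that replies may arrive fenced or JSON-formatted is a guess about model behavior rather than something this notebook demonstrates.

```python
import ast
import json

FENCE = "`" * 3  # Markdown code-fence marker (three backticks)


def parse_query_list(text):
    """Parse a model reply into a list of strings, raising ValueError otherwise."""
    cleaned = text.strip()
    # Drop a surrounding Markdown code fence if the model added one.
    if cleaned.startswith(FENCE):
        lines = cleaned.splitlines()[1:]  # skip the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # skip the closing fence line
        cleaned = "\n".join(lines).strip()
    try:
        parsed = json.loads(cleaned)  # works when the reply is valid JSON
    except json.JSONDecodeError:
        try:
            parsed = ast.literal_eval(cleaned)  # handles Python-style quoting
        except (ValueError, SyntaxError) as exc:
            raise ValueError(f"Could not parse model reply: {cleaned[:80]!r}") from exc
    if not isinstance(parsed, list) or not all(isinstance(q, str) for q in parsed):
        raise ValueError("Expected a list of strings.")
    return parsed


# Simulated reply, used here instead of a real API call.
reply = FENCE + "python\n['Vegan Italian dinner ideas?', 'Quick gluten-free snack?']\n" + FENCE
print(parse_query_list(reply))
```

In the notebook, this could be called as `parse_query_list(response.choices[0].message.content)` in place of the direct `ast.literal_eval` call.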