# Pydantic Output Parser

This notebook demonstrates how to use LangChain and Pydantic to parse a dataset of recipes and extract structured information, specifically the number of avocados required for each recipe. The goal is to find recipes that can be made with just one avocado, simulating a scenario where someone has limited ingredients and wants to explore their cooking options.

The code uses an LLM to process unstructured recipe text and extract the title, ingredients, directions, and avocado count. It filters the dataset for recipes that mention avocado, processes a 1% random sample of the matches, and prints the structured details of every recipe that needs at most one avocado. The Pydantic model enforces type consistency, particularly for the numeric avocado count, while LangChain manages the interaction with the LLM and parses its reply into that model.
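To make the validate-then-filter idea concrete without calling an LLM, here is a minimal sketch that uses only the standard library. It is not how `PydanticOutputParser` is implemented; it only imitates the outcome the notebook relies on, which is that a JSON reply either becomes a typed record or is rejected. `RecipeRecord`, `parse_recipe`, and the three sample replies are hypothetical and made up for illustration.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class RecipeRecord:
    """Hypothetical stand-in for the notebook's RecipeInfo model."""
    title: str
    avocado_count: float
    ingredients: List[str]
    directions: List[str]


def parse_recipe(raw_json: str) -> RecipeRecord:
    """Turn one JSON reply into a typed record; raise ValueError on bad data."""
    try:
        data = json.loads(raw_json)
        return RecipeRecord(
            title=str(data["title"]),
            # "0.5" and 4 both become floats; "a few" raises ValueError.
            avocado_count=float(data["avocado_count"]),
            ingredients=[str(item) for item in data["ingredients"]],
            directions=[str(step) for step in data["directions"]],
        )
    except (KeyError, TypeError, ValueError) as exc:
        raise ValueError(f"could not parse recipe: {exc}") from exc


# Made-up replies standing in for what a model might return.
replies = [
    '{"title": "Avocado Toast", "avocado_count": "0.5", '
    '"ingredients": ["bread", "avocado"], "directions": ["Toast", "Spread"]}',
    '{"title": "Party Guacamole", "avocado_count": 4, '
    '"ingredients": ["avocados", "lime"], "directions": ["Mash", "Mix"]}',
    '{"title": "Mystery Dip", "avocado_count": "a few", '
    '"ingredients": ["avocados"], "directions": ["Blend"]}',
]

usable = []
for reply in replies:
    try:
        record = parse_recipe(reply)
    except ValueError as exc:
        print(f"Skipping reply: {exc}")
        continue
    # Same threshold as the notebook: keep recipes needing at most one avocado.
    if record.avocado_count <= 1:
        usable.append(record)

print([record.title for record in usable])
```

Only the first reply survives: the second needs four avocados, and the third fails the numeric conversion. That failure is the kind of reply the notebook's `try`/`except` around `chain.invoke` is there to catch.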

In [1]:
# Load secrets
import os
from dotenv import load_dotenv
load_dotenv()

def load_environment_variables(variable_names):
    for var_name in variable_names:
        if var_name not in os.environ:
            try:
                from google.colab import userdata
                value = userdata.get(var_name)
                if value:
                    os.environ[var_name] = value
            except ImportError:
                pass
        if var_name not in os.environ:
            raise ValueError(f"{var_name} not found. Please set it in .env file or Google Colab secrets.")
        else:
            print(f"Successfully loaded {var_name} from environment variables.")

variables_to_load = [
    "OPENAI_API_KEY",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION_NAME"
]

load_environment_variables(variables_to_load)

Successfully loaded OPENAI_API_KEY from environment variables.
Successfully loaded AWS_ACCESS_KEY_ID from environment variables.
Successfully loaded AWS_SECRET_ACCESS_KEY from environment variables.
Successfully loaded AWS_REGION_NAME from environment variables.


In [2]:
!pip install langchain langchain_openai datasets python-dotenv

# LangChain setup: Show more about what's happening as it happens.
# from langchain.globals import set_debug
# set_debug(True)



In [7]:
import time
import random
from typing import List
from langchain.output_parsers import PydanticOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import ChatOpenAI
from datasets import load_dataset

In [4]:
model = ChatOpenAI(model="gpt-4o-mini")

In [18]:
class RecipeInfo(BaseModel):
    title: str = Field(description="The title of the recipe")
    avocado_count: float = Field(description="The number of avocados required in the recipe")
    ingredients: List[str] = Field(description="The full list of ingredients")
    directions: List[str] = Field(description="The list of directions to prepare the recipe")

parser = PydanticOutputParser(pydantic_object=RecipeInfo)

prompt = PromptTemplate(
    template="Extract the following information from the recipe:\n{format_instructions}\n{recipe_text}\n",
    input_variables=["recipe_text"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | model | parser

# Load the dataset
dataset = load_dataset("corbt/all-recipes", split="train")

# Filter for recipes that mention avocado, then take a 1% random sample of the matches
avocado_recipes = dataset.filter(lambda example: "avocado" in example["input"].lower())
sample_size = max(1, int(len(avocado_recipes) * 0.01))  # Ensure at least 1 recipe
sampled_recipes = random.sample(list(avocado_recipes), sample_size)

print(f"Processing {sample_size} recipes out of {len(avocado_recipes)} avocado recipes found.")

# Process sampled recipes
results = []
start_time = time.time()

for i, recipe in enumerate(sampled_recipes):
    try:
        if i % 10 == 0:  # Print progress every 10 recipes
            elapsed_time = time.time() - start_time
            print(f"\nProcessing recipe {i+1}/{sample_size}. "
                  f"Elapsed time: {elapsed_time:.2f} seconds.")
        
        recipe_info = chain.invoke({"recipe_text": recipe["input"]})
        
        print(f"\nRecipe: {recipe_info.title}")
        print(f"Avocados required: {recipe_info.avocado_count}")
        
        if recipe_info.avocado_count <= 1:
            print("Great! We can make this recipe with our one avocado.")
            print("\nTitle:")
            print(recipe_info.title)
            print("\nIngredients:")
            for ingredient in recipe_info.ingredients:
                print(f"- {ingredient}")
            print("\nDirections:")
            for step in recipe_info.directions:
                print(f"- {step}")
            results.append(recipe_info)
        else:
            print(f"Skipping this recipe. It requires too many avocados ({recipe_info.avocado_count}).")
        
        # Add a delimiter after each recipe
        print("\n" + "-" * 50)
    except Exception as e:
        print(f"Error processing recipe {i+1}: {e}")
        print("\n" + "-" * 50)

total_time = time.time() - start_time
print(f"\nProcessing complete. Total time: {total_time:.2f} seconds.")
print(f"Found {len(results)} recipes that we can make with one avocado.")

Processing 270 recipes out of 27070 avocado recipes found.

Processing recipe 1/270. Elapsed time: 0.00 seconds.

Recipe: Crispy Chicken Parmesan with Avocado Salsa
Avocados required: 1.0
Great! We can make this recipe with our one avocado.

Title:
Crispy Chicken Parmesan with Avocado Salsa

Ingredients:
- 1/3 cup dry bread crumbs
- 1/3 cup Kraft 100% Parmesan Grated Cheese, divided
- 1/4 cup Miracle Whip made with Olive Oil Spread*
- 1/2 tsp. chili powder
- 6 small boneless skinless chicken breasts (1-1/2 lb./675 g), pounded to 1/4-inch thickness
- 1 Tbsp. olive oil
- 1/2 cup chopped avocados
- 1/2 cup tomatillo salsa
- 1/2 cup chopped tomatoes King Sooper's 1 lb For $0.99 thru 02/09

Directions:
- Combine bread crumbs and 1/4 cup cheese in shallow dish; set aside.
- Mix Miracle Whip and chili powder; spread onto both sides of chicken breasts.
- Coat evenly with bread crumb mixture.
- Heat oil in large skillet on medium heat.
- Add chicken; cook 4 to 5 min. on each side or until done 