## Dependency Injection

This notebook uses dependency injection to test and evaluate each component of a chain separately.

In [1]:
GEMINI="gemini-2.0-flash"
#OPENAI="gpt-4o-mini"
#CLAUDE="claude-3-7-sonnet-latest"

import os
from dotenv import load_dotenv
load_dotenv("../keys.env")
# Use .get() so that a missing key triggers the assertion message instead of a KeyError
assert os.environ.get("GEMINI_API_KEY", "")[:2] == "AI",\
       "Please specify the GEMINI_API_KEY access token in keys.env file"
#assert os.environ.get("ANTHROPIC_API_KEY", "")[:2] == "sk",\
#       "Please specify the ANTHROPIC_API_KEY access token in keys.env file"
#assert os.environ.get("OPENAI_API_KEY", "")[:2] == "sk",\
#       "Please specify the OPENAI_API_KEY access token in keys.env file"

In [2]:
# Needed in a Jupyter environment. See: https://ai.pydantic.dev/troubleshooting/
import nest_asyncio
nest_asyncio.apply()

## Data

For simplicity, we'll hardcode the text of two marketing descriptions.

In [3]:
aieng_text="""
Recent breakthroughs in AI have not only increased demand for AI products, they've also lowered the barriers to entry for those who want to build AI products. The model-as-a-service approach has transformed AI from an esoteric discipline into a powerful development tool that anyone can use. Everyone, including those with minimal or no prior AI experience, can now leverage AI models to build applications. In this book, author Chip Huyen discusses AI engineering: the process of building applications with readily available foundation models.

The book starts with an overview of AI engineering, explaining how it differs from traditional ML engineering and discussing the new AI stack. The more AI is used, the more opportunities there are for catastrophic failures, and therefore, the more important evaluation becomes. This book discusses different approaches to evaluating open-ended models, including the rapidly growing AI-as-a-judge approach.

AI application developers will discover how to navigate the AI landscape, including models, datasets, evaluation benchmarks, and the seemingly infinite number of use cases and application patterns. You'll learn a framework for developing an AI application, starting with simple techniques and progressing toward more sophisticated methods, and discover how to efficiently deploy these applications.

Understand what AI engineering is and how it differs from traditional machine learning engineering
Learn the process for developing an AI application, the challenges at each step, and approaches to address them
Explore various model adaptation techniques, including prompt engineering, RAG, fine-tuning, agents, and dataset engineering, and understand how and why they work
Examine the bottlenecks for latency and cost when serving foundation models and learn how to overcome them
Choose the right model, dataset, evaluation benchmarks, and metrics for your needs
Chip Huyen works to accelerate data analytics on GPUs at Voltron Data. Previously, she was with Snorkel AI and NVIDIA, founded an AI infrastructure startup, and taught Machine Learning Systems Design at Stanford. She's the author of the book Designing Machine Learning Systems, an Amazon bestseller in AI.

AI Engineering builds upon and is complementary to Designing Machine Learning Systems (O'Reilly).
"""

In [4]:
mldp_text="""
The design patterns in this book capture best practices and solutions to recurring problems in machine learning. The authors, three Google engineers, catalog proven methods to help data scientists tackle common problems throughout the ML process. These design patterns codify the experience of hundreds of experts into straightforward, approachable advice.

In this book, you will find detailed explanations of 30 patterns for data and problem representation, operationalization, repeatability, reproducibility, flexibility, explainability, and fairness. Each pattern includes a description of the problem, a variety of potential solutions, and recommendations for choosing the best technique for your situation.

You'll learn how to:

Identify and mitigate common challenges when training, evaluating, and deploying ML models
Represent data for different ML model types, including embeddings, feature crosses, and more
Choose the right model type for specific problems
Build a robust training loop that uses checkpoints, distribution strategy, and hyperparameter tuning
Deploy scalable ML systems that you can retrain and update to reflect new data
Interpret model predictions for stakeholders and ensure models are treating users fairly
"""

## Step 1: Identify 5 ways to improve a marketing description

In [5]:
from pydantic_ai import Agent
from pydantic.dataclasses import dataclass
from typing import List

@dataclass
class Critique:
    target_audience: List[str]
    improvements: List[str]
    
    def __str__(self):
        nl='\n'
        return f"""
**Target audience**:
{','.join(self.target_audience)}
    
**Suggested changes**:
{nl.join(self.improvements)}
        """.strip()

def critique(in_text: str) -> Critique:
    prompt = f"""
    You are an expert marketer for technology books.
    You will be given the marketing description for a book.
    Identify the target audience by roles (eg: Data Analyst, Data Engineer)
    Suggest exactly 5 ways that the *marketing description* can be improved so
    that it appeals better to this target audience.
    Do not suggest improvements to the book itself.
    
    **Marketing Description**:
    """
    agent = Agent(GEMINI,
                  result_type=Critique)
    print(f"Invoking LLM to critique text")
    result = agent.run_sync([prompt,
                             in_text])
    return (result.data)

We could use LLM-as-Judge to ensure that the critique does not suggest adding new content, etc.
But let's keep it simple and check only the structure of the result.
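
If you did want the judge check, one possible shape is sketched below. It keeps with the theme of this notebook by injecting the judge as a function. The names `keyword_judge` and `find_content_changes` are hypothetical, `Critique` is redefined with the standard library so the sketch runs on its own, and the keyword matcher is only an offline stand-in: a real judge would send each suggestion to a model.

```python
from collections.abc import Callable
from dataclasses import dataclass


@dataclass
class Critique:
    # Stand-in for the notebook's Critique so that this sketch is self-contained.
    target_audience: list[str]
    improvements: list[str]


def keyword_judge(suggestion: str) -> bool:
    """Hypothetical offline stand-in for an LLM judge.

    Returns True when the suggestion appears to ask for new book content.
    A real judge would pass the suggestion to a model instead.
    """
    red_flags = ("add a chapter", "add a section", "new chapter", "include a section")
    return any(flag in suggestion.lower() for flag in red_flags)


def find_content_changes(c: Critique,
                         judge_fn: Callable[[str], bool] = keyword_judge) -> list[str]:
    """Return the suggestions that the judge flags as changing the book itself."""
    return [s for s in c.improvements if judge_fn(s)]


sample = Critique(
    target_audience=["Data Scientist"],
    improvements=[
        "Lead with the benefit to practitioners.",
        "Include a section on debugging common ML issues.",
    ],
)
flagged = find_content_changes(sample)
print(flagged)
```

Because the judge is a parameter, the same function could be tested with a fixed-answer judge and run in production with an LLM-backed one.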

In [6]:
def assert_critique(c: Critique):
    assert len(c.improvements) >= 3, "Expected at least 3 improvements"
    assert len(c.target_audience) > 0, "Expected at least one role"

In [7]:
aieng_critique = critique(aieng_text)
print(aieng_critique)
assert_critique(aieng_critique)

Invoking LLM to critique text
**Target audience**:
AI Application Developers,Software Engineers,Data Scientists
    
**Suggested changes**:
Emphasize the practical application of AI engineering for solving real-world problems faced by AI application developers.
Highlight specific tools, frameworks, and platforms covered in the book that are relevant to AI application development.
Showcase success stories or case studies of AI applications built using the techniques discussed in the book.
Address the challenges and pain points that AI application developers commonly encounter when working with foundation models.
Offer guidance on how to optimize AI applications for performance, scalability, and cost-effectiveness in production environments, including addressing latency and cost bottlenecks when serving foundation models


In [8]:
mldp_critique = critique(mldp_text)
print(mldp_critique)
assert_critique(mldp_critique)

Invoking LLM to critique text
**Target audience**:
Data Scientist,Machine Learning Engineer,AI Researcher
    
**Suggested changes**:
Use more specific job titles (e.g., Machine Learning Engineer, AI Researcher) instead of the general term "data scientists."
Highlight the practical applications of the design patterns and how they can directly improve the efficiency and effectiveness of their work.
Emphasize the scalability and maintainability aspects of the solutions, as these are critical concerns for professionals deploying ML systems in production.
Include a section on how the book helps in troubleshooting and debugging common ML issues, which is a frequent pain point for practitioners.
Add testimonials or endorsements from well-known figures in the machine learning community to build credibility and trust with the target audience.


## Step 2: Make the improvement that will have the highest ROI

In [9]:
@dataclass
class Improvement:
    change: str
    reason: str
    modified_marketing_description: str
    
    def __str__(self):
        return f"""
**Change**:
{self.change}
    
**Reason**:
{self.reason}

**New description**:
{self.modified_marketing_description}
        """.strip()    
    
def improve(marketing_text: str, c: Critique) -> Improvement:
    prompt = f"""
    You are a helpful marketing assistant.
    You will be given the marketing description for a book,
    its target audience, and a list of suggested changes.

    Pick one change from the list that best meets these criteria:
    - Does not require changing the book itself, only the marketing description
    - Will make the book much more appealing to the target audience.
    - Requires only 1-5 lines changed in the text of the marketing description.
    Then, make the change and return a change log and the modified description.
    
    **Marketing Description**:
    {marketing_text}
    
    {c}
    """
    print(f"Invoking LLM to improve text")
    agent = Agent(GEMINI,
                  result_type=Improvement)
    result = agent.run_sync(prompt)
    return (result.data)

In [10]:
import difflib
def assert_improvement(improvement: Improvement, orig_text: str, c: Critique):
    assert improvement.change in c.improvements, "Chosen change is in original list"
    lines_changed = list(difflib.Differ().compare(improvement.modified_marketing_description.splitlines(), orig_text.splitlines()))
    nlines_changed = 0
    for line in lines_changed:
        if line.startswith('+ ') or line.startswith('- '):
            nlines_changed += 1
    assert nlines_changed > 0 and nlines_changed <= 5, f"{nlines_changed} lines changed, not 1-5"
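
To see what the line count in this check measures, here is a small standalone sketch of the same counting rule applied to toy lists. The helper name `count_changed` and the sample lines are illustrative assumptions. Under this rule, an edited line appears to contribute two to the count (one `- ` line and one `+ ` line), while a pure insertion contributes one.

```python
import difflib


def count_changed(a: list[str], b: list[str]) -> int:
    """Count Differ output lines marked as removed ('- ') or added ('+ ')."""
    return sum(1 for line in difflib.Differ().compare(a, b)
               if line.startswith(("- ", "+ ")))


original = ["line one", "line two", "line three"]
edited = ["line one", "line 2", "line three"]      # one line rewritten
appended = original + ["line four"]                # one line added

n_edited = count_changed(edited, original)
n_appended = count_changed(original, appended)
print(n_edited, n_appended)
```

Differ's `? ` hint lines are ignored by the prefix test, so they do not inflate the count.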

In [11]:
improved = improve(aieng_text, aieng_critique)
print(improved)
assert_improvement(improved, aieng_text, aieng_critique)

Invoking LLM to improve text
**Change**:
Emphasize the practical application of AI engineering for solving real-world problems faced by AI application developers.
    
**Reason**:
This change directly addresses the target audience by highlighting the practical applications of AI engineering, making the book more appealing to AI application developers seeking solutions to real-world problems. It requires only a small modification to the first paragraph, focusing on the book's ability to help build practical AI applications. 

**New description**:
Recent breakthroughs in AI have not only increased demand for AI products, they've also lowered the barriers to entry for those who want to build AI products. The model-as-a-service approach has transformed AI from an esoteric discipline into a powerful development tool that anyone can use. Everyone, including those with minimal or no prior AI experience, can now leverage AI models to build applications. In this book, author Chip Huyen discusse

In [12]:
improved = improve(mldp_text, mldp_critique)
print(improved)
assert_improvement(improved, mldp_text, mldp_critique)

Invoking LLM to improve text
**Change**:
Use more specific job titles (e.g., Machine Learning Engineer, AI Researcher) instead of the general term "data scientists."
    
**Reason**:
The target audience includes Machine Learning Engineers and AI Researchers, so using these specific job titles instead of the general term "data scientists" will make the book more appealing to them. This change requires only one line to be modified in the marketing description and does not require changing the book itself. 

**New description**:
The design patterns in this book capture best practices and solutions to recurring problems in machine learning. The authors, three Google engineers, catalog proven methods to help Machine Learning Engineers and AI Researchers tackle common problems throughout the ML process. These design patterns codify the experience of hundreds of experts into straightforward, approachable advice.

In this book, you will find detailed explanations of 30 patterns for data and pr

## Chain, with dependency injection

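Pass each step into the chain as a function argument that defaults to the LLM-backed implementation. A test can then substitute a mock for any step.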

In [13]:
from collections.abc import Callable

def improvement_chain(in_text: str,
                      critique_fn: Callable[[str], Critique] = critique,
                      improve_fn: Callable[[str, Critique], Improvement] = improve
                     ) -> Improvement:
    c = critique_fn(in_text)
    assert_critique(c)
    
    improved = improve_fn(in_text, c)
    assert_improvement(improved, in_text, c)
    
    return improved

improved = improvement_chain(aieng_text)
print(improved)

Invoking LLM to critique text
Invoking LLM to improve text
**Change**:
Add a call to action, encouraging readers to purchase the book and start building AI applications immediately. For example: "Get your copy today and start building the future of AI!"
    
**Reason**:
Adding a call to action encourages immediate engagement and purchase, directly appealing to the target audience of AI Application Developers, Generative AI Engineers, Machine Learning Engineers, and Software Engineers who are looking to build and deploy AI applications. It's a simple yet effective way to increase book sales and adoption of AI engineering principles. The change requires adding just one line to the description, making it a quick and impactful modification. 

**New description**:
Recent breakthroughs in AI have not only increased demand for AI products, they've also lowered the barriers to entry for those who want to build AI products. The model-as-a-service approach has transformed AI from an esoteric dis
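
The same injection points also allow the wiring of the chain to be checked without any LLM call. The sketch below is a minimal, self-contained illustration under stated assumptions: `chain`, `fake_critique`, and `fake_improve` are hypothetical names, the dataclasses are standard-library stand-ins for the ones defined earlier, and the assertions from the real chain are left out.

```python
from collections.abc import Callable
from dataclasses import dataclass


@dataclass
class Critique:
    target_audience: list[str]
    improvements: list[str]


@dataclass
class Improvement:
    change: str
    reason: str
    modified_marketing_description: str


def chain(in_text: str,
          critique_fn: Callable[[str], Critique],
          improve_fn: Callable[[str, Critique], Improvement]) -> Improvement:
    """Same wiring as improvement_chain, minus the assertions."""
    c = critique_fn(in_text)
    return improve_fn(in_text, c)


calls: list[str] = []  # records the order in which the fakes are invoked


def fake_critique(in_text: str) -> Critique:
    calls.append("critique")
    return Critique(["Tester"], ["Shorten the first sentence."])


def fake_improve(in_text: str, c: Critique) -> Improvement:
    calls.append("improve")
    # Echo the first suggestion so the test can confirm that Step 2
    # received the Critique produced by Step 1.
    return Improvement(change=c.improvements[0],
                       reason="fixed response for testing",
                       modified_marketing_description=in_text.upper())


result = chain("a short blurb", fake_critique, fake_improve)
print(calls, result.change)
```

With both steps faked, a failure here points at the chain itself rather than at model behavior.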

## Running only Step 2, with a fake Step 1


In [14]:
def mock_critique(in_text: str) -> Critique:
    print(f"Using mock to critique text")
    return Critique(
        target_audience = 'AI Engineers,Machine Learning Engineers,Software Engineers'.split(','),
        improvements = """
Use more precise language to define the problems the book solves.
Add specific examples of how the design patterns have been used to solve real-world problems.
Highlight the benefits of using design patterns, such as increased efficiency, reduced costs, and improved accuracy.
Emphasize the book's practical approach, with step-by-step instructions and code examples.
Include testimonials from data scientists who have used the design patterns in the book to improve their work.
        """.strip().split('\n')
    )

mock_critique(aieng_text)

Using mock to critique text


Critique(target_audience=['AI Engineers', 'Machine Learning Engineers', 'Software Engineers'], improvements=['Use more precise language to define the problems the book solves.', 'Add specific examples of how the design patterns have been used to solve real-world problems.', 'Highlight the benefits of using design patterns, such as increased efficiency, reduced costs, and improved accuracy.', "Emphasize the book's practical approach, with step-by-step instructions and code examples.", 'Include testimonials from data scientists who have used the design patterns in the book to improve their work.'])

In [15]:
improved = improvement_chain(aieng_text, critique_fn=mock_critique)
print(improved)

Using mock to critique text
Invoking LLM to improve text
**Change**:
Use more precise language to define the problems the book solves.
    
**Reason**:
The original description is too general. Specifying that the book focuses on the challenges of deploying and scaling AI applications makes it more appealing to the target audience of AI/ML/Software engineers. 

**New description**:
Recent breakthroughs in AI have not only increased demand for AI products, they've also lowered the barriers to entry for those who want to build AI products. The model-as-a-service approach has transformed AI from an esoteric discipline into a powerful development tool that anyone can use. Everyone, including those with minimal or no prior AI experience, can now leverage AI models to build applications. In this book, author Chip Huyen discusses AI engineering: the process of building applications with readily available foundation models, focusing on the challenges of deploying and scaling AI applications in 

In [16]:
improved = improvement_chain(mldp_text, critique_fn=mock_critique)
print(improved)

Using mock to critique text
Invoking LLM to improve text
**Change**:
Use more precise language to define the problems the book solves.
    
**Reason**:
The change makes the description more appealing to the target audience by using more precise language to define the problems the book solves. It also highlights the benefits of using design patterns, such as addressing challenges like unreliable training data, ensuring consistent results, and mitigating bias. This change does not require changing the book itself and only requires a few lines changed in the text of the marketing description. 

**New description**:
The design patterns in this book capture best practices and solutions to recurring problems in machine learning. The authors, three Google engineers, catalog proven methods to help data scientists tackle common problems throughout the ML process. These design patterns codify the experience of hundreds of experts into straightforward, approachable advice.

In this book, you will