![DeepLearning.AI logo](https://learn.deeplearning.ai/assets/dlai-logo.png)

# Agentic Workflows: Reflection Design Pattern

This notebook demonstrates an agentic workflow for chart generation with *reflection*:
1) Generate V1 (LLM writes plotting code),
2) Execute and render,
3) Reflect on the output and write V2,
4) Execute the improved version.
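The four steps above can be sketched as a single function before any LLM is involved. This is only an illustration of the control flow: `reflection_loop` and the three `fake_*` stubs are hypothetical names that stand in for the model calls and the chart renderer built later in this notebook.

```python
from typing import Callable, Dict, Tuple


def reflection_loop(
    instruction: str,
    generate: Callable[[str], str],
    execute: Callable[[str], str],
    reflect: Callable[[str, str], Tuple[str, str]],
) -> Dict[str, str]:
    """Run draft -> execute -> reflect -> execute once and collect the artifacts."""
    code_v1 = generate(instruction)                       # 1) generate V1
    output_v1 = execute(code_v1)                          # 2) execute and render
    feedback, code_v2 = reflect(instruction, output_v1)   # 3) reflect and write V2
    output_v2 = execute(code_v2)                          # 4) execute the improved version
    return {
        "code_v1": code_v1,
        "output_v1": output_v1,
        "feedback": feedback,
        "code_v2": code_v2,
        "output_v2": output_v2,
    }


# Hypothetical stubs: they only format strings so the loop can run offline.
def fake_generate(instruction: str) -> str:
    return f"plot({instruction!r})"


def fake_execute(code: str) -> str:
    return f"rendered[{code}]"


def fake_reflect(instruction: str, output: str) -> Tuple[str, str]:
    return "add axis labels", f"plot({instruction!r}, labels=True)"


result = reflection_loop("Q1 sales by drink", fake_generate, fake_execute, fake_reflect)
print(result["feedback"])
print(result["output_v2"])
```

Passing the steps in as callables keeps the loop independent of any one provider, which is the same idea the end-to-end workflow uses when it lets you choose a model per stage.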

Dataset: `coffee_sales.csv`
- `date`: YYYY-MM-DD
- `time`: HH:MM
- `cash_type`: Either card or cash
- `card`: Card number
- `price`: Drink price
- `coffee_name`: Drink ordered
- `quarter`: Quarter of the fiscal year
- `month`: Month of the year
- `year`: Year of the sale
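The `quarter`, `month`, and `year` columns can all be derived from `date`. The notebook does not show how `load_and_prepare_data` computes them, so the sketch below is one plausible way to do it with pandas, using a tiny made-up frame rather than rows from `coffee_sales.csv`.

```python
import pandas as pd

# Made-up rows that only mimic the schema; not taken from coffee_sales.csv.
sample = pd.DataFrame(
    {
        "date": ["2024-03-01", "2024-06-16", "2025-01-15"],
        "price": [3.0, 3.5, 4.0],
        "coffee_name": ["Latte", "Americano", "Latte"],
    }
)

# Parse the text dates, then read the calendar parts off the datetime accessor.
sample["date"] = pd.to_datetime(sample["date"], errors="coerce")
sample["year"] = sample["date"].dt.year
sample["month"] = sample["date"].dt.month
sample["quarter"] = sample["date"].dt.quarter

print(sample[["date", "quarter", "month", "year"]])
```

Note that `.dt.quarter` gives calendar quarters. If the fiscal year in your data does not start in January, the mapping would need an offset.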

In [16]:
from utils import *

## Set up & peek at data

Start with a quick look at the dataset we'll use for the chart-generation agentic workflow.
We'll sample a few rows to build intuition before generating code.

*Agentic mindset:* understand the data first, then draft, then reflect.

In [17]:
# A utils.py function to load our data. Returns a pandas dataframe
df = load_and_prepare_data('coffee_sales.csv')

# Grab a random sample from the dataset
print_html(df.sample(n=5), title="Random Sample of Coffee Sales Data")

date,time,cash_type,card,price,coffee_name,quarter,month,year
2024-06-16,16:43,card,ANON-0000-0000-0059,3.282,Americano with Milk,2,6,2024
2024-07-26,07:59,card,ANON-0000-0000-0408,3.282,Latte,3,7,2024
2024-07-30,15:12,card,ANON-0000-0000-0347,2.792,Americano with Milk,3,7,2024
2024-08-15,08:03,card,ANON-0000-0000-0508,3.282,Cappuccino,3,8,2024
2024-10-06,12:37,card,ANON-0000-0000-0718,3.576,Cappuccino,4,10,2024


You now have a dataset with enough detail to drive the chart-generation part of the agentic workflow. Let's start building the pipeline.

## Step 1: Generate (V1)

LLM role: *data visualization expert* using **matplotlib** (no seaborn).
Task: write code that answers the user instruction, given only a schema/description.
Guardrails:
- Assume `df` already exists.
- Add clear titles/labels/legends.
- Save the figure to the provided path (no `plt.show()`).

*Why reflection?* Like a human author, the workflow drafts first, then looks at the output and improves it.
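To make the guardrails concrete, here is a hand-written sketch of the kind of code a model might return for a "Q1 sales by drink, year over year" request. It is not actual model output. The rows are made up, the output path is a hypothetical temp file, and "sales" is assumed to mean the sum of `price`.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # headless backend: render to a file, never open a window
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows that only mimic the schema; not taken from coffee_sales.csv.
df = pd.DataFrame(
    {
        "coffee_name": ["Latte", "Latte", "Americano", "Americano"],
        "quarter": [1, 1, 1, 1],
        "year": [2024, 2025, 2024, 2025],
        "price": [3.5, 3.5, 2.8, 2.8],
    }
)

# Hypothetical output path; the notebook passes its own file name instead.
out_path = os.path.join(tempfile.gettempdir(), "chart_sketch.png")

# Assumption: "sales" means total price, summed per drink (rows) and year (columns).
q1 = df[df["quarter"] == 1]
totals = q1.pivot_table(index="coffee_name", columns="year", values="price", aggfunc="sum")

ax = totals.plot(kind="bar", figsize=(8, 4))
ax.set_title("Q1 sales by drink type, year over year")
ax.set_xlabel("Drink")
ax.set_ylabel("Total sales")
ax.legend(title="Year")
plt.tight_layout()
plt.savefig(out_path, dpi=300)  # save the figure instead of calling plt.show()
plt.close("all")
```

Saving and closing instead of showing is what lets the workflow hand the image file to a second model later.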

In [18]:
def get_response(model: str, prompt: str) -> str:
    if "claude" in model.lower() or "anthropic" in model.lower():
        # Anthropic Claude format
        message = anthropic_client.messages.create(
            model=model,
            max_tokens=1000,
            messages=[
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": prompt
                        }
                    ]
                }
            ]
        )
        return message.content[0].text

    else:
        # Default to OpenAI format for all other models (gpt-4, o3-mini, o1, etc.)
        response = openai_client.responses.create(
            model=model,
            input=prompt,
        )
        return response.output_text

In [19]:
def generate_chart_code(instruction: str, model: str, out_path_v1: str) -> str:
    """Generate Python code to make a plot with matplotlib using tag-based wrapping."""

    prompt = f"""
You are a data visualization expert.

Return your answer *strictly* in this format:

<execute_python>
# valid python code here
</execute_python>

Do not add explanations, only the tags and the code.

    The code should create a visualization from a DataFrame 'df' with these columns:
    - date (YYYY-MM-DD)
    - time (HH:MM)
    - cash_type (card or cash)
    - card (string)
    - price (number)
    - coffee_name (string)
    - quarter (1-4)
    - month (1-12)
    - year (YYYY)

    User instruction: {instruction}

    Requirements for the code:
    1. Assume the DataFrame is already loaded as 'df'.
    2. Use matplotlib for plotting.
    3. Add clear title, axis labels, and legend if needed.
    4. Save the figure as '{out_path_v1}' with dpi=300.
    5. Do not call plt.show().
    6. Close all plots with plt.close().

    Return ONLY the code wrapped in <execute_python> tags.
    """

    response = get_response(model, prompt)
    return response


Let's try out the function and see what we get back!

In [20]:
instructions = "Create a chart showing year-over-year Q1 sales by drink type."
model = "gpt-4o-mini"

file_name_version_1 = "chart_v1.png"
code_v1 = generate_chart_code(instructions, model, file_name_version_1)
print_html(code_v1, title="Generated Code Output (V1)")

Great! You created some Python code to generate a chart! The next step is to turn that code into a chart. The cell below does this inline: it extracts the code between the `<execute_python>` tags with a regular expression, then runs it with `exec`, with `df`, `plt`, and `pd` in scope. (`utils.py` also has a function called `execute_chart_code` that takes the generated code, the dataframe, and a filename, then executes the code and produces a chart that you can see.)

📝 You may notice that OpenAI models sometimes reply with a wrapper around code, in our case \```python ... ```. `execute_chart_code` also uses a regular expression to remove that wrapper and pass along the code itself.
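As a rough sketch of the extract-then-execute step, the helpers below pull code out of the `<execute_python>` tags, strip a markdown fence if the model added one inside the tags, and report errors instead of raising. `extract_code` and `run_generated_code` are hypothetical names for illustration; they are not the `utils.py` implementation.

```python
import re
import traceback
from typing import Optional

FENCE = "`" * 3  # a markdown code fence, built here to avoid writing it literally
TAG_RE = re.compile(r"<execute_python>([\s\S]*?)</execute_python>")
FENCE_RE = re.compile(r"^" + FENCE + r"(?:python)?\s*\n([\s\S]*?)\n" + FENCE + r"\s*$")


def extract_code(reply: str) -> Optional[str]:
    """Return the code inside <execute_python> tags, minus any markdown fence."""
    match = TAG_RE.search(reply)
    if match is None:
        return None
    code = match.group(1).strip()
    fenced = FENCE_RE.match(code)
    return fenced.group(1).strip() if fenced else code


def run_generated_code(code: str, namespace: dict) -> Optional[str]:
    """exec() the code; return None on success or the traceback text on failure."""
    try:
        exec(code, namespace)
        return None
    except Exception:
        return traceback.format_exc()


# A reply that uses both the tags and a fence, as some models do.
reply = "<execute_python>\n" + FENCE + "python\nx = 1 + 1\n" + FENCE + "\n</execute_python>"
namespace: dict = {}
error = run_generated_code(extract_code(reply), namespace)
print("error:", error)
print("x:", namespace["x"])
```

Keep in mind that `exec` runs whatever the model wrote with the same privileges as the notebook, so this pattern is best kept to a sandboxed or throwaway environment.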

In [21]:
match = re.search(r"<execute_python>([\s\S]*?)</execute_python>", code_v1)
if match:
    initial_code = match.group(1).strip()
    exec_globals = {"df": df, "plt": plt, "pd": pd}
    exec(initial_code, exec_globals)

<Figure size 1200x600 with 0 Axes>

In [22]:
print_html(file_name_version_1, is_image=True, title="Generated Chart (V1)")

## Evaluating your chart

Now that you've created your chart, it's time to have an LLM evaluate it. Because the goal is a visual critique, you'll pass the rendered image (rather than the code that produced it) to the model, along with the original instruction. Let's try it out!
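Sending an image to a model usually means base64-encoding the file and telling the API its media type. The notebook relies on `encode_image_b64` from `utils.py`, whose body is not shown here, so the following is only a guess at what such a helper might look like. The `_sketch` suffix and `to_data_url` are hypothetical names.

```python
import base64
import mimetypes
import os
import tempfile
from typing import Tuple


def encode_image_b64_sketch(path: str) -> Tuple[str, str]:
    """Return (media_type, base64_text) for an image file on disk."""
    media_type, _ = mimetypes.guess_type(path)
    if media_type is None:
        media_type = "image/png"  # assumption: charts in this notebook are PNGs
    with open(path, "rb") as fh:
        b64 = base64.b64encode(fh.read()).decode("ascii")
    return media_type, b64


def to_data_url(media_type: str, b64: str) -> str:
    """Build the inline data URL form used for image inputs."""
    return f"data:{media_type};base64,{b64}"


# Demo on a few placeholder bytes written to a temp file (not a real chart).
demo_bytes = b"\x89PNG placeholder bytes"
demo_path = os.path.join(tempfile.gettempdir(), "encode_demo.png")
with open(demo_path, "wb") as fh:
    fh.write(demo_bytes)

media_type, b64 = encode_image_b64_sketch(demo_path)
print(media_type)
print(to_data_url(media_type, b64)[:30])
```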

In [23]:
def _anthropic_call_json_with_image(client, model_name: str, prompt: str, media_type: str, b64: str) -> str:
    """
    Call Anthropic Claude (messages.create) with text+image and return *all* text blocks concatenated.
    Adds a system message to enforce strict JSON output.
    """
    msg = client.messages.create(
        model=model_name,
        max_tokens=2000,
        temperature=0,
        system=(
            "You are a careful assistant. Respond with a single valid JSON object only. "
            "Do not include markdown, code fences, or commentary outside JSON."
        ),
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image", "source": {"type": "base64", "media_type": media_type, "data": b64}},
            ],
        }],
    )

    # Anthropic returns a list of content blocks; collect all text
    parts = []
    for block in (msg.content or []):
        if getattr(block, "type", None) == "text":
            parts.append(block.text)
    return "".join(parts).strip()


In [24]:
def reflect_on_image_and_regenerate(
    chart_path: str,
    instruction: str,
    client,
    model_name: str,
    out_path_v2: str,
) -> tuple[str, str]:
    """
    Critique the chart IMAGE against the instruction, then return refined matplotlib code.
    Returns (feedback, refined_code_with_tags).
    Supports OpenAI and Anthropic (Claude).
    """
    media_type, b64 = encode_image_b64(chart_path)

    prompt = (
        "You are a data visualization expert. First, critique how well the attached chart communicates "
        "the instruction. Then return improved matplotlib code.\n\n"
        "STRICT OUTPUT FORMAT (JSON only):\n"
        "{\n"
        '  "feedback": "<brief, specific critique and suggestions>",\n'
        '  "refined_code": "<ONLY python code, wrapped in <execute_python> tags; assumes df exists; '
        f'saves to \'{out_path_v2}\' with dpi=300; NO plt.show(); DO call plt.close() at end>"\n'
        "}\n\n"
        "Constraints for the refined code:\n"
        "- Use pandas/matplotlib only (no seaborn).\n"
        "- Assume df exists; no file reads.\n"
        f"- Save to '{out_path_v2}' with dpi=300.\n"
        "- If year/month/quarter are needed and missing, derive them from df['date'] with:\n"
        "  df['date'] = pd.to_datetime(df['date'], errors='coerce')\n"
        "  if 'year' not in df.columns: df['year'] = df['date'].dt.year\n"
        "  if 'month' not in df.columns: df['month'] = df['date'].dt.month\n"
        "  if 'quarter' not in df.columns: df['quarter'] = df['date'].dt.quarter\n\n"
        "Schema (columns you may reference):\n"
        "- date (M/D/YY)\n"
        "- time (HH:MM)\n"
        "- cash_type (card or cash)\n"
        "- card (string)\n"
        "- price (number)\n"
        "- coffee_name (string)\n"
        "- quarter (1-4)\n"
        "- month (1-12)\n"
        "- year (YYYY)\n\n"
        f"Instruction:\n{instruction}\n"
    )

    lower = model_name.lower()
    if "claude" in lower or "anthropic" in lower:
        # ✅ Use the safe helper that joins all text blocks and adds a system prompt
        content = _anthropic_call_json_with_image(client, model_name, prompt, media_type, b64)
    else:
        # OpenAI multimodal via Responses API (data URL inline)
        data_url = f"data:{media_type};base64,{b64}"
        resp = client.responses.create(
            model=model_name,
            input=[{
                "role": "user",
                "content": [
                    {"type": "input_text", "text": prompt},
                    {"type": "input_image", "image_url": data_url},
                ],
            }],
        )
        content = (resp.output_text or "").strip()

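    # Robust JSON parse: whole reply first, then the outermost {...} span, else raw feedback
    fallback = {"feedback": content, "refined_code": ""}
    try:
        obj = json.loads(content)
    except Exception:
        m = re.search(r"\{.*\}", content, flags=re.DOTALL)
        try:
            obj = json.loads(m.group(0)) if m else fallback
        except Exception:
            obj = fallback
    if not isinstance(obj, dict):
        obj = fallback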

    feedback = str(obj.get("feedback", "")).strip()
    refined_code = ensure_execute_python_tags(str(obj.get("refined_code", "")).strip())
    return feedback, refined_code
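The JSON handling at the end of this function is worth testing on its own, because models do not always return clean JSON even when asked. Here is a standalone sketch of the same fallback idea. `parse_json_reply` is a hypothetical name, and the sample replies are invented.

```python
import json
import re


def parse_json_reply(content: str) -> dict:
    """Parse a model reply as a JSON object, tolerating extra text around it."""
    fallback = {"feedback": content, "refined_code": ""}

    # First attempt: the reply is exactly one JSON object.
    try:
        obj = json.loads(content)
        if isinstance(obj, dict):
            return obj
    except json.JSONDecodeError:
        pass

    # Second attempt: take the span from the first "{" to the last "}".
    match = re.search(r"\{.*\}", content, flags=re.DOTALL)
    if match:
        try:
            obj = json.loads(match.group(0))
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass

    # Give up on structure: keep the raw text as feedback, with no code.
    return fallback


clean = parse_json_reply('{"feedback": "ok", "refined_code": "x = 1"}')
noisy = parse_json_reply('Sure! {"feedback": "ok", "refined_code": ""} Hope that helps.')
broken = parse_json_reply("{oops}")
print(clean["feedback"], noisy["feedback"], broken["refined_code"] == "")
```

With this fallback, a malformed reply yields empty refined code rather than an exception, so the caller should still check whether any code came back before executing it.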


In [25]:
file_name_version_2 = "chart_v2.png"

feedback, code_v2 = reflect_on_image_and_regenerate(
    chart_path=file_name_version_1,            
    instruction=instructions,  # Reuse the original instructions
    client=openai_client,                     
    model_name="gpt-4.1",                     
    out_path_v2=file_name_version_2             
)

In [26]:
print_html(feedback, title="Feedback on V1 Chart")
print_html(code_v2, title="Regenerated Code Output (V2)")

In [27]:
match = re.search(r"<execute_python>([\s\S]*?)</execute_python>", code_v2)
if match:
    reflected_code = match.group(1).strip()
    exec_globals = {"df": df, "plt": plt, "pd": pd}
    exec(reflected_code, exec_globals)

In [28]:
print_html(file_name_version_2, is_image=True, title="Regenerated Chart (V2)")

## Your end-to-end chart generation workflow

Let's wrap this up by creating an end-to-end workflow that our agent can use to do all of this in one go! You pass the `run_workflow` function the instructions for the chart you want generated, plus the model to use at each stage. Choosing the model per stage lets you mix and match to find the combination that works best for you!
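Mixing models across stages also means matching each model to the right client. The functions earlier in this notebook route on whether the model name contains "claude" or "anthropic", and that rule can be sketched as a small helper. `provider_for` and `pick_client` are hypothetical names, the client objects are placeholders, and `claude-example-model` is not a real model ID.

```python
from typing import Any, Dict


def provider_for(model_name: str) -> str:
    """Classify a model name with the same substring rule used in this notebook."""
    lower = model_name.lower()
    if "claude" in lower or "anthropic" in lower:
        return "anthropic"
    return "openai"  # default for every other model name


def pick_client(model_name: str, clients: Dict[str, Any]) -> Any:
    """Return the client registered for the model's provider."""
    return clients[provider_for(model_name)]


# Placeholder strings stand in for the real SDK client objects.
clients = {"openai": "<openai client>", "anthropic": "<anthropic client>"}

for name in ["gpt-4.1", "o4-mini", "claude-example-model"]:
    print(name, "->", pick_client(name, clients))
```

A substring rule is simple but brittle. Any model whose name matches neither pattern falls through to the OpenAI client, which may or may not be what you want.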

In [29]:
from typing import Dict, Any
import pandas as pd
import re, os
import matplotlib.pyplot as plt

def run_workflow(
    dataset_path: str,
    user_instructions: str,
    generation_model: str,
    evaluation_model: str,
    image_basename: str = "chart",
) -> Dict[str, Any]:
    """
    End-to-end pipeline: generate -> execute -> (evaluate + refine) -> execute refined.
    Returns a dict with all artifacts (codes, feedback, images).
    """
    # Load dataset (LLM handles derivations like year/quarter)
    df = load_and_prepare_data(dataset_path)
    print_html(df.sample(n=5), title="Random Sample of Dataset")

    # Derive paths
    out_v1 = f"{image_basename}_v1.png"
    out_v2 = f"{image_basename}_v2.png"

    # Step 1: Generate code V1
    print_html("Step 1: Generating chart code... 📈\n")
    code_v1 = generate_chart_code(
        instruction=user_instructions,
        model=generation_model,
        out_path_v1=out_v1,
    )
    print_html(code_v1, title="Generated Code Output (V1)")

    # Step 2: Execute V1
    print_html("Step 2: Executing chart code... 💻\n")
    chart_path_v1 = None
    match = re.search(r"<execute_python>([\s\S]*?)</execute_python>", code_v1)
    if match:
        initial_code = match.group(1).strip()
        exec_globals = {"df": df, "plt": plt, "pd": pd}
        exec(initial_code, exec_globals)
        if os.path.exists(out_v1):
            chart_path_v1 = out_v1

    print_html(chart_path_v1, is_image=True, title="Generated Chart (V1)")

    # Step 3: Evaluate + refine
    print_html("Step 3: Evaluating and refining chart... 🔁\n")
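    # Pick the client that matches the evaluation model's provider
    eval_lower = evaluation_model.lower()
    eval_client = (
        anthropic_client
        if "claude" in eval_lower or "anthropic" in eval_lower
        else openai_client
    )
    feedback, code_v2 = reflect_on_image_and_regenerate(
        chart_path=out_v1,
        instruction=user_instructions,
        client=eval_client,
        model_name=evaluation_model,
        out_path_v2=out_v2,
    )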
    print_html(feedback, title="Feedback on V1 Chart")
    print_html(code_v2, title="Regenerated Code Output (V2)")

    # Step 4: Execute V2
    print_html("Step 4: Executing refined chart code... 🖼️\n")
    chart_path_v2 = None
    match = re.search(r"<execute_python>([\s\S]*?)</execute_python>", code_v2)
    if match:
        reflected_code = match.group(1).strip()
        exec_globals = {"df": df, "plt": plt, "pd": pd}
        exec(reflected_code, exec_globals)
        if os.path.exists(out_v2):
            chart_path_v2 = out_v2

    print_html(chart_path_v2, is_image=True, title="Regenerated Chart (V2)")

    return {
        "status": "success",
        "dataset_path": dataset_path,
        "user_instructions": user_instructions,
        "generation_model": generation_model,
        "evaluation_model": evaluation_model,
        "chart_code_v1": code_v1,
        "original_chart_path": chart_path_v1,
        "feedback": feedback,
        "refined_code_v2": code_v2,
        "refined_chart_path": chart_path_v2,
    }


Now you can run your full workflow! Try different models and compare the results you get.

In [30]:
result = run_workflow(
    dataset_path="coffee_sales.csv",       # path to your dataset
    user_instructions="Create a chart showing year-over-year Q1 sales by drink type.",
    generation_model="gpt-4.1-mini",        # model used to generate code
    evaluation_model="o4-mini",            # model used to evaluate/refine
    image_basename="drink_sales"           # base name for saved charts
)


date,time,cash_type,card,price,coffee_name,quarter,month,year
2024-06-18,07:31,card,ANON-0000-0000-0299,2.792,Americano,2,6,2024
2024-07-31,06:49,card,ANON-0000-0000-0437,2.792,Americano with Milk,3,7,2024
2024-10-19,12:08,card,ANON-0000-0000-0690,3.576,Cocoa,4,10,2024
2024-10-17,17:53,card,ANON-0000-0000-0779,3.576,Hot Chocolate,4,10,2024
2024-10-19,17:31,card,ANON-0000-0000-0786,3.576,Cappuccino,4,10,2024
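If you want to compare several model combinations, one option is to build the argument sets up front and give each run its own image base name so the charts do not overwrite each other. This is only a sketch: the model lists reuse names that appear in this notebook, `slug` is a hypothetical helper, and the actual `run_workflow` call is left commented out because it needs the notebook's clients and dataset.

```python
from itertools import product

generation_models = ["gpt-4o-mini", "gpt-4.1-mini"]
evaluation_models = ["gpt-4.1", "o4-mini"]


def slug(model: str) -> str:
    """Make a model name safe to embed in a file name."""
    return model.replace(".", "_").replace("-", "_")


# One keyword-argument dict per (generation, evaluation) pair.
runs = [
    {
        "dataset_path": "coffee_sales.csv",
        "user_instructions": "Create a chart showing year-over-year Q1 sales by drink type.",
        "generation_model": gen,
        "evaluation_model": ev,
        "image_basename": f"sales_{slug(gen)}__{slug(ev)}",
    }
    for gen, ev in product(generation_models, evaluation_models)
]

for kwargs in runs:
    print(kwargs["image_basename"])
    # result = run_workflow(**kwargs)  # uncomment inside the notebook
```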
