# Exercise 2 — Code Generation with ReACT Prompting
This notebook demonstrates a **ReACT-style loop**: reason/plan → generate code → run → observe → fix.

We use a small dataset task: compute **monthly revenue by product** from messy rows.
The agent produces code; we execute it, capture any errors, and optionally feed them back for another iteration.


In [None]:
!pip -q install openai

# ===== Helper: LLM wrapper (REAL or MOCK) =====
# This notebook runs in MOCK mode by default so you always get "successful output"
# even without an API key. If you have an OpenAI key, set it and REAL mode will run.

import os, json, textwrap, re, sys
from dataclasses import dataclass

MODE = os.getenv("LLM_MODE", "MOCK").upper()  # set to REAL to use OpenAI
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

@dataclass
class LLMResponse:
    text: str

def llm_call(system_prompt: str, user_prompt: str, mock_text: str) -> LLMResponse:
    """
    Returns LLMResponse.text.
    REAL mode uses OpenAI if OPENAI_API_KEY is set.
    MOCK mode returns mock_text (provided per call).
    """
    if MODE == "REAL":
        if not OPENAI_API_KEY:
            raise RuntimeError("REAL mode requested but OPENAI_API_KEY is not set.")
        # OpenAI SDK (recommended) — works in Colab once you `pip install openai`
        from openai import OpenAI
        client = OpenAI()
        resp = client.chat.completions.create(
            model=os.getenv("OPENAI_MODEL", "gpt-4.1-mini"),
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ],
            temperature=0.2,
        )
        return LLMResponse(resp.choices[0].message.content)
    else:
        return LLMResponse(mock_text)

print(f"✅ LLM wrapper ready. MODE={MODE}")


## Data setup

In [None]:

raw_rows = [
  {"date":"2026-01-02","product":"Paper","qty":"10","unit_price":"2.50"},
  {"date":"2026-01-15","product":"Paper","qty":"five","unit_price":"2.50"},   # bad qty
  {"date":"2026-01-20","product":"Pens","qty":"3","unit_price":None},        # missing price
  {"date":"2026-02-01","product":"Paper","qty":"2","unit_price":"2.50"},
  {"date":"2026-02-03","product":"Stapler","qty":"1","unit_price":"12.00"},
]
raw_rows
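As a sanity check for the later cells, the expected result can be worked out by hand: only rows 0, 3, and 4 have a numeric `qty` and `unit_price`. The sketch below redoes that arithmetic; the names `valid_rows` and `expected` are illustrative and not part of the original notebook.

```python
# Hand-picked valid rows from raw_rows. Rows 1 and 2 are excluded:
# qty="five" is not numeric and unit_price=None is missing.
valid_rows = [
    ("2026-01", "Paper", 10, 2.50),
    ("2026-02", "Paper", 2, 2.50),
    ("2026-02", "Stapler", 1, 12.00),
]

expected = {}
for month, product, qty, unit_price in valid_rows:
    by_product = expected.setdefault(month, {})
    by_product[product] = by_product.get(product, 0.0) + qty * unit_price

print(expected)
# {'2026-01': {'Paper': 25.0}, '2026-02': {'Paper': 5.0, 'Stapler': 12.0}}
```

Any generated `monthly_revenue` should reproduce this dict and log two skipped rows.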


## ReACT prompt (full prompt you’ll include in your document)
Requirements/constraints:
- Use only Python standard library
- Parse qty & unit_price safely
- Skip invalid rows but log why
- Output a dict: {YYYY-MM: {product: revenue}}
- Show a short demo run
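One possible reading of "parse safely" is sketched below. It assumes that non-finite values such as `nan` and `inf` should also count as invalid, which the requirements above do not state; `parse_number` is a hypothetical helper name.

```python
import math

def parse_number(value):
    """Hypothetical strict parser: returns a finite float, or None if invalid."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        # None raises TypeError; text like "five" raises ValueError.
        return None
    # float() accepts "nan" and "inf"; treat them as invalid for revenue math.
    return number if math.isfinite(number) else None

for raw in ["2.50", " 10 ", "five", None, "nan", "inf"]:
    print(repr(raw), "->", parse_number(raw))
```

Returning `None` instead of raising lets the caller decide what to log for each skipped row.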


In [None]:

react_system = "You are a senior Python engineer. Follow the ReACT stages exactly and be explicit."
react_prompt = f"""
You must follow this ReACT loop:

STAGE 1 — REASON/PLAN (plain English, <=6 bullets):
- Identify tricky parts and edge cases.
- Outline approach.

STAGE 2 — ACTION: CODE
- Output exactly one fenced code block tagged python; put no prose inside this stage.
- Constraints:
  * Standard library only (no pandas).
  * Provide a function monthly_revenue(rows) -> (result_dict, log_list)
  * result_dict format: {{'YYYY-MM': {{'Product': revenue_float}}}}
  * Skip rows with invalid numbers; append a log message explaining the skip.
  * Include a demo run using the provided raw_rows and print both outputs.
  * Be robust to qty like "five" and missing prices.

STAGE 3 — OBSERVATION
- After code, write: "OBSERVATION: <what you expect to print>"

INPUT DATA:
raw_rows = {json.dumps(raw_rows, indent=2)}
"""
print(react_prompt[:900] + "\n...")


## Agent run (MOCK mode gives a working example; REAL mode will call the LLM)
We parse the model output, execute the code block, and show output.
If execution fails, we feed the error back for a fix iteration.
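In REAL mode a model may tag its fence as `py` or leave it untagged, which a strict `python`-only pattern would miss. The parsing step could be made more tolerant along these lines; `extract_code_block_lenient` is a hypothetical name, and this is a sketch rather than the notebook's own parser.

```python
import re

FENCE = "`" * 3  # built programmatically so this example contains no literal fence

# Accept fences tagged "python" or "py" (any case) as well as untagged fences.
_FENCE_RE = re.compile(
    FENCE + r"[ \t]*(?:python|py)?[ \t]*\r?\n(.*?)" + FENCE,
    re.DOTALL | re.IGNORECASE,
)

def extract_code_block_lenient(text: str) -> str:
    """Return the body of the first fenced block, or raise ValueError."""
    match = _FENCE_RE.search(text)
    if match is None:
        raise ValueError("No fenced code block found.")
    return match.group(1)

sample = f"Plan...\n{FENCE}py\nprint('hi')\n{FENCE}\nOBSERVATION: prints hi"
print(extract_code_block_lenient(sample))
```

Only the first block is returned, so a response with several code blocks would still need a stricter prompt or extra handling.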

In [None]:

import traceback, types

mock_react_output = r"""
STAGE 1 — REASON/PLAN:
- Values may be strings, missing, or non-numeric; parse carefully.
- Dates must be grouped by YYYY-MM; validate minimal format.
- Revenue = qty * unit_price; use float.
- Skip bad rows and log why; do not crash.
- Keep standard library only.
- Provide clear demo output.

```python
from collections import defaultdict
from datetime import datetime

def _to_float(x):
    if x is None:
        return None
    try:
        return float(x)
    except (TypeError, ValueError):
        return None

def monthly_revenue(rows):
    result = defaultdict(lambda: defaultdict(float))
    logs = []

    for i, row in enumerate(rows):
        # date
        d = row.get("date")
        try:
            month = datetime.fromisoformat(d).strftime("%Y-%m")
        except Exception:
            logs.append(f"row {i}: skipped (invalid date={d!r})")
            continue

        product = row.get("product")
        if not product:
            logs.append(f"row {i}: skipped (missing product)")
            continue

        qty = _to_float(row.get("qty"))
        price = _to_float(row.get("unit_price"))

        if qty is None:
            logs.append(f"row {i}: skipped (invalid qty={row.get('qty')!r})")
            continue
        if price is None:
            logs.append(f"row {i}: skipped (invalid unit_price={row.get('unit_price')!r})")
            continue

        result[month][product] += qty * price

    # convert defaultdicts to normal dicts
    result = {m: dict(pmap) for m, pmap in result.items()}
    return result, logs

# demo
if __name__ == "__main__":
    raw_rows = [
      {"date":"2026-01-02","product":"Paper","qty":"10","unit_price":"2.50"},
      {"date":"2026-01-15","product":"Paper","qty":"five","unit_price":"2.50"},
      {"date":"2026-01-20","product":"Pens","qty":"3","unit_price":None},
      {"date":"2026-02-01","product":"Paper","qty":"2","unit_price":"2.50"},
      {"date":"2026-02-03","product":"Stapler","qty":"1","unit_price":"12.00"},
    ]
    res, logs = monthly_revenue(raw_rows)
    print("RESULT:", res)
    print("LOGS:", logs)
```

OBSERVATION: It should print revenue for Paper in Jan/Feb and Stapler in Feb, and logs for the invalid qty and missing price.
"""

def extract_code_block(text):
    m = re.search(r"```python\s*(.*?)```", text, flags=re.DOTALL)
    if not m:
        raise ValueError("No python code block found.")
    return m.group(1)

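def run_react_once(output_text):
    try:
        # Parse inside the try so a missing code block is reported like any other failure.
        code_str = extract_code_block(output_text)
        # Fresh namespace: __name__ is not "__main__" here, so the generated demo block is skipped.
        ns = {}
        exec(code_str, ns, ns)
        # call the function with our raw_rows (not the demo copy)
        res, logs = ns["monthly_revenue"](raw_rows)
        return res, logs, None
    except Exception:
        return None, None, traceback.format_exc()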

# call LLM (or use mock)
react_out = llm_call(react_system, react_prompt, mock_react_output).text
print("\n".join(react_out.splitlines()[:18]), "\n...\n")  # show top part as text, not a list repr

res, logs, err = run_react_once(react_out)
print("EXECUTION ERROR:" if err else "✅ EXECUTION OK")
if err:
    print(err)
else:
    print("Result dict:", res)
    print("Logs:", logs)
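Code executed with `exec` in an empty namespace does not see `__name__ == "__main__"`, so a generated demo block guarded that way never prints. If the demo output is wanted as evidence, one option is to set `__name__` explicitly and capture stdout. The helper `run_and_capture` below is a hypothetical sketch of that idea, not part of the notebook.

```python
import contextlib
import io

def run_and_capture(code_str: str, as_main: bool = True):
    """Hypothetical helper: exec code_str and return (namespace, captured stdout)."""
    # With as_main=True the "if __name__ == '__main__':" guard evaluates to True.
    ns = {"__name__": "__main__"} if as_main else {}
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code_str, ns, ns)
    return ns, buffer.getvalue()

demo_code = 'if __name__ == "__main__":\n    print("demo ran")\n'
_, out_main = run_and_capture(demo_code, as_main=True)
_, out_plain = run_and_capture(demo_code, as_main=False)
print(repr(out_main))   # 'demo ran\n'
print(repr(out_plain))  # ''
```

The captured text can then be compared with the model's OBSERVATION line or pasted into the write-up.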


## Optional fix iteration template (only runs if there was an error)
This shows the *cycle* required by the rubric: observe error → ask model to patch code → re-run.

In [None]:

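if err:
    # llm_call is stateless, so the fix prompt restates the task and the failing response.
    fix_prompt = f"""
{react_prompt}

Your previous response was:

{react_out}

Its code failed with this traceback:

{err}

Return ONLY a corrected python code block that satisfies the original constraints.
"""
    mock_fix = "```python\n# (mock fix would go here)\n```"
    fixed_out = llm_call(react_system, fix_prompt, mock_fix).text
    print(fixed_out)
    # Re-run the patched code to close the loop (the mock placeholder defines no function, so it would fail again).
    res, logs, err = run_react_once(fixed_out)
    print("EXECUTION ERROR:" if err else "✅ EXECUTION OK after fix")
    print(err if err else {"result": res, "logs": logs})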
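The observe, patch, and re-run cycle can also be written as a bounded loop so that it stops after a fixed number of attempts. The sketch below is a minimal illustration under stated assumptions: `react_fix_loop`, `fake_generate`, and `max_iters` are hypothetical names, and `fake_generate` stands in for the LLM call.

```python
import traceback
from typing import Callable, Optional, Tuple

def react_fix_loop(
    generate: Callable[[Optional[str]], str],
    max_iters: int = 3,
) -> Tuple[Optional[dict], Optional[str], int]:
    """Hypothetical bounded loop: generate code, run it, feed the traceback back.

    generate(error) returns Python source; error is None on the first attempt.
    Returns (namespace, last_error, attempts_used).
    """
    error = None
    for attempt in range(1, max_iters + 1):
        code_str = generate(error)
        ns = {}
        try:
            exec(code_str, ns, ns)
            return ns, None, attempt
        except Exception:
            # Keep the traceback so the next generate() call can see it.
            error = traceback.format_exc()
    return None, error, max_iters

# Stand-in for the LLM: the first draft has a bug, the second draft fixes it.
def fake_generate(error):
    if error is None:
        return "total = 1 / 0"
    return "total = 10 * 2.5"

ns, last_error, attempts = react_fix_loop(fake_generate)
print(attempts, ns["total"], last_error)  # 2 25.0 None
```

Capping the attempts keeps a persistently failing model from looping forever and makes the number of iterations easy to report.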


✅ **What to copy into the write-up document for Exercise 2**
- Tools: Colab, OpenAI API (optional), Python exec
- Full ReACT prompt (shown above)
- Evidence: printed result dict + logs
- Notes: explain how errors would be looped back (even if mock run succeeds)
