# CSCE 580 — Quiz 2 (Fall 2025)
**Date:** October 7, 2025  
**Student:** David Dinh

This notebook implements the **code** portions of Quiz 2 and prepares the expected folder structure and artifacts.

## Q1 — Compare Energy Consumption (10 pts)

Use the tool at: https://symbio6.nl/en/apps/ai-vs-simple-energy.html

**Instructions:**
1. Pick **three different settings** (e.g., translation, summarization, image captioning).
2. Record AI (LLM) vs Classical energy usage.
3. Compute the **difference** and **average difference**, and decide which approach uses more energy on average.

In [None]:
#  Enter your measurements here (replace example numbers with real values from the website)
import pandas as pd

df_energy = pd.DataFrame({
    "Setting": ["Searching", "Math", "Image Captioning"],  
    "AI_Energy_Wh": [2.9, 1.7, 2.9],        
    "Classical_Energy_Wh": [0.3, 0.01, 0.3], 
})
df_energy["Difference_Wh"] = df_energy["AI_Energy_Wh"] - df_energy["Classical_Energy_Wh"]
display(df_energy)

avg_diff = df_energy["Difference_Wh"].mean()
winner = "AI (LLM)" if avg_diff > 0 else "Classical"
print(f"Average difference across 3 settings: {avg_diff:.3f} Wh")
print(f"On average, {winner} uses more energy.")
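
As a worked check of the cell above, using the values it currently contains (its own comment says these should be replaced with real readings from the tool), the per-setting difference $d_i$ and its mean are:

$$
d_i = E_i^{\text{AI}} - E_i^{\text{Classical}}, \qquad
\bar{d} = \frac{1}{3}\sum_{i=1}^{3} d_i = \frac{2.6 + 1.69 + 2.6}{3} = \frac{6.89}{3} \approx 2.297\ \text{Wh}
$$

Because $\bar{d} > 0$, the AI (LLM) approach uses more energy on average for these values.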



## Q2 — Convert Recipes to R3 JSON (90 pts)

You must select **two recipes** (not the sample Egg Drop soup). Save their **cleaned** text as:

- `data/original_recipe1.txt`
- `data/original_recipe2.txt`

Each cleaned file should include lines like:
```
Recipe Name: <Name>
Source: <URL>

Ingredients:
- item 1
- item 2

Instructions:
1. Step one...
2. Step two...
```
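
Before parsing, it may help to confirm that a cleaned file actually contains the four section headers shown above. The sketch below is one possible check, not part of the quiz template: `missing_headers` and `REQUIRED_HEADERS` are hypothetical names, and the sample recipe is a placeholder rather than one of the submitted recipes.

```python
import re

# Hypothetical helper (an assumption, not from the quiz template): reports
# which of the expected section headers are absent from a cleaned recipe text.
REQUIRED_HEADERS = ["Recipe Name:", "Source:", "Ingredients:", "Instructions:"]

def missing_headers(text: str) -> list[str]:
    """Return the required headers that do not start any line of `text`."""
    missing = []
    for header in REQUIRED_HEADERS:
        pattern = r"^\s*" + re.escape(header)
        if not re.search(pattern, text, flags=re.IGNORECASE | re.MULTILINE):
            missing.append(header)
    return missing

# Placeholder text in the cleaned format (not a real recipe or source URL).
sample = """Recipe Name: Example Toast
Source: https://example.com/toast

Ingredients:
- 2 slices bread

Instructions:
1. Toast the bread.
"""

print(missing_headers(sample))            # []
print(missing_headers("Ingredients:\n"))  # ['Recipe Name:', 'Source:', 'Instructions:']
```

An empty list means all four headers were found; anything else names the sections to add before running the parser.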


In [None]:
#  Quick peek: show first few lines of the cleaned recipe files if they exist
from pathlib import Path

def head(path, n=20):
    p = Path(path)
    if not p.exists():
        print(f"{path} not found.")
        return
    print(f"--- {path} ---")
    with p.open(encoding="utf-8") as f:
        for i, line in enumerate(f):
            if i >= n:
                break
            print(line.rstrip())

head("data/original_recipe1.txt")
print()
head("data/original_recipe2.txt")


### Utilities — Parse cleaned text and build R3 JSON

This section provides helper functions used to:
- Parse the cleaned `.txt` files
- Enforce **single-action** imperative instruction steps
- Generate PF1/PF2/PF3 style JSON and PP (partial-stitched) JSON

In [None]:
# Helpers to parse and generate JSONs
import json, re
from pathlib import Path

def parse_cleaned_txt(path: Path):
    """Parse a cleaned recipe .txt into a dict with name, provenance, ingredients, instructions."""
    text = Path(path).read_text(encoding="utf-8")
    text = text.replace("\r\n", "\n")
    name_match = re.search(r"Recipe Name:\s*(.+)", text, re.IGNORECASE)
    src_match  = re.search(r"Source:\s*(.+)", text, re.IGNORECASE)
    recipe_name = name_match.group(1).strip() if name_match else ""
    data_provenance = src_match.group(1).strip() if src_match else ""

    ing_match = re.search(r"Ingredients:\s*(.*?)(?:\n\s*\n|Instructions:)", text, re.IGNORECASE | re.DOTALL)
    ing_block = ing_match.group(1).strip() if ing_match else ""
    instr_match = re.search(r"Instructions:\s*(.*)", text, re.IGNORECASE | re.DOTALL)
    instr_block = instr_match.group(1).strip() if instr_match else ""

    ingredients = []
    for line in ing_block.splitlines():
        line = line.strip()
        if not line:
            continue
        line = re.sub(r"^[-•*]\s*", "", line)  # remove leading bullet/dash/asterisk
        ingredients.append(line)

    steps = []
    for line in instr_block.splitlines():
        line = line.strip()
        if not line:
            continue
        # Strip step prefixes such as "1.", "2)", "step3", or "Step 4:".
        line = re.sub(r"^(?:step\s*\d+[:.)]?|\d+[.)])\s*", "", line, flags=re.IGNORECASE)
        steps.append(line)

    return {
        "recipe_name": recipe_name,
        "data_provenance": data_provenance,
        "ingredients": ingredients,
        "instructions": steps
    }

def to_imperative_atomic(steps):
    """Enforce single-action, imperative, short sentences. Heuristic-only."""
    out = []
    for s in steps:
        s = s.strip()
        parts = re.split(r"\s+and\s+", s)
        for p in parts:
            p = p.strip()
            if not p: continue
            p = p.rstrip(". ").strip()
            p = p[:1].upper() + p[1:]
            if not p.endswith("."):
                p += "."
            out.append(p)
    return out

def make_pf1_json(recipe):
    return {
        "recipe_name": recipe["recipe_name"],
        "data_provenance": recipe["data_provenance"],
        "macronutrients": {},
        "ingredients": recipe["ingredients"],
        "instructions": to_imperative_atomic(recipe["instructions"]),
    }

def make_pf2_json(recipe):
    return {
        "recipe_name": recipe["recipe_name"],
        "data_provenance": recipe["data_provenance"],
        "macronutrients": {},
        "ingredients": recipe["ingredients"],
        "instructions": to_imperative_atomic(recipe["instructions"]),
    }

def make_pf3_json(recipe):
    atomic = to_imperative_atomic(recipe["instructions"])
    cleaned = []
    for s in atomic:
        s2 = re.sub(r"\s*\([^)]*\)", "", s).strip()
        s2 = re.sub(r"\s*;\s*.*$", ".", s2)
        if not s2.endswith("."):
            s2 += "."
        cleaned.append(s2)
    return {
        "recipe_name": recipe["recipe_name"],
        "data_provenance": recipe["data_provenance"],
        "macronutrients": {},
        "ingredients": recipe["ingredients"],
        "instructions": cleaned,
    }

def write_json(obj, path):
    Path(path).parent.mkdir(parents=True, exist_ok=True)  # create the output folder if missing
    Path(path).write_text(json.dumps(obj, indent=2), encoding="utf-8")
    return path
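
The "Heuristic-only" note on `to_imperative_atomic` matters in practice: splitting on the word "and" separates two actions, but it also breaks up noun phrases. The sketch below isolates that one splitting rule in a hypothetical helper, `split_on_and`, to show both outcomes; it is an illustration of the limitation, not a replacement for the function above.

```python
import re

def split_on_and(step: str) -> list[str]:
    """Apply the same rule as to_imperative_atomic: split wherever ' and ' appears."""
    return [p.strip() for p in re.split(r"\s+and\s+", step.strip()) if p.strip()]

# Two actions joined by "and": the split is what we want.
print(split_on_and("Whisk the eggs and pour into the pan"))
# ['Whisk the eggs', 'pour into the pan']

# One action with an "and" inside the ingredient list: the split is wrong.
print(split_on_and("Season with salt and pepper"))
# ['Season with salt', 'pepper']
```

Steps of the second kind are worth reviewing by hand in the generated JSON, since the second fragment is no longer an imperative action.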


### Generate PF1 / PF2 / PF3 and PP JSONs

This cell reads the two cleaned recipe files and writes **8 JSON files** to `data/`:

- `recipe1_pf1.json`, `recipe1_pf2.json`, `recipe1_pf3.json`, `recipe1_pp.json`
- `recipe2_pf1.json`, `recipe2_pf2.json`, `recipe2_pf3.json`, `recipe2_pp.json`
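
The `_pp` files are meant to represent a partial-prompt result, where ingredients and instructions are extracted separately and then stitched together. A minimal sketch of that stitching step follows. It assumes each partial output is a JSON object with a single key (`ingredients` or `instructions`); that shape, the helper name `stitch_partial`, and the placeholder inputs are all assumptions for illustration.

```python
import json

def stitch_partial(name, source, ingredients_json, instructions_json):
    """Merge two partial JSON strings into one R3-style dict (hypothetical helper)."""
    ingredients = json.loads(ingredients_json)["ingredients"]
    instructions = json.loads(instructions_json)["instructions"]
    return {
        "recipe_name": name,
        "data_provenance": source,
        "macronutrients": {},
        "ingredients": ingredients,
        "instructions": instructions,
    }

# Placeholder partial outputs, standing in for separately extracted pieces.
pp1_out = '{"ingredients": ["2 slices bread"]}'
pp2_out = '{"instructions": ["Toast the bread."]}'

stitched = stitch_partial("Example Toast", "https://example.com/toast", pp1_out, pp2_out)
print(json.dumps(stitched, indent=2))
```

The stitched dict has the same five keys as the PF outputs, so it can be written and scored the same way.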

In [None]:
# Run to generate all JSONs (edit input paths if needed)
r1 = parse_cleaned_txt("data/original_recipe1.txt")
r2 = parse_cleaned_txt("data/original_recipe2.txt")

# PF1/PF2/PF3
write_json(make_pf1_json(r1), "data/recipe1_pf1.json")
write_json(make_pf2_json(r1), "data/recipe1_pf2.json")
write_json(make_pf3_json(r1), "data/recipe1_pf3.json")

write_json(make_pf1_json(r2), "data/recipe2_pf1.json")
write_json(make_pf2_json(r2), "data/recipe2_pf2.json")
write_json(make_pf3_json(r2), "data/recipe2_pf3.json")

# PP (partial stitch): reuse parsed ingredients/instructions (as if extracted separately)
pp1 = {
    "recipe_name": r1["recipe_name"] or "Recipe 1",
    "data_provenance": r1["data_provenance"],
    "macronutrients": {},
    "ingredients": r1["ingredients"],
    "instructions": to_imperative_atomic(r1["instructions"]),
}
pp2 = {
    "recipe_name": r2["recipe_name"] or "Recipe 2",
    "data_provenance": r2["data_provenance"],
    "macronutrients": {},
    "ingredients": r2["ingredients"],
    "instructions": to_imperative_atomic(r2["instructions"]),
}
write_json(pp1, "data/recipe1_pp.json")
write_json(pp2, "data/recipe2_pp.json")

for f in [
    "data/recipe1_pf1.json", "data/recipe1_pf2.json", "data/recipe1_pf3.json", "data/recipe1_pp.json",
    "data/recipe2_pf1.json", "data/recipe2_pf2.json", "data/recipe2_pf3.json", "data/recipe2_pp.json",
]:
    print("created:", Path(f).resolve())


### Evaluate Goodness Scores (Q2c)

Rubric:
- **50** points for valid JSON
- **+10** each if the JSON contains: `recipe_name`, `data_provenance`, `macronutrients`, `ingredients`, `instructions`  
Max **100**.
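
To make the rubric concrete, the sketch below applies it to in-memory JSON strings instead of files. `score_text` and `REQUIRED` are illustrative names, and returning 50 for valid JSON that is not an object is an assumption about how the rubric treats that case.

```python
import json

REQUIRED = ["recipe_name", "data_provenance", "macronutrients", "ingredients", "instructions"]

def score_text(text: str) -> int:
    """Apply the rubric to a JSON string (illustrative helper, not the file-based evaluator)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return 0  # invalid JSON earns nothing
    if not isinstance(data, dict):
        return 50  # assumption: valid JSON without an object gets the base score only
    return 50 + 10 * sum(key in data for key in REQUIRED)

full = json.dumps({key: "" for key in REQUIRED})
print(score_text(full))                    # 100
print(score_text('{"recipe_name": "x"}'))  # 60
print(score_text('{"recipe_name": '))      # 0
```

Note that the rubric only checks that each key is present, so an empty `macronutrients` object still earns its 10 points.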

In [None]:
# Evaluator
import json
import pandas as pd
from pathlib import Path

fields = ['recipe_name', 'data_provenance', 'macronutrients', 'ingredients', 'instructions']

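def evaluate_goodness(json_path: Path) -> int:
    """Score a file: 50 for valid JSON, +10 per required field present (max 100)."""
    try:
        data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return 0  # unreadable file or invalid JSON
    if not isinstance(data, dict):
        return 50  # valid JSON, but not an object that can hold the fields
    score = 50
    for k in fields:
        if k in data:
            score += 10
    return min(score, 100)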

files = [
    "data/recipe1_pf1.json","data/recipe1_pf2.json","data/recipe1_pf3.json","data/recipe1_pp.json",
    "data/recipe2_pf1.json","data/recipe2_pf2.json","data/recipe2_pf3.json","data/recipe2_pp.json",
]

rows = []
for f in files:
    p = Path(f)
    if p.exists():
        rows.append({"file": f, "score": evaluate_goodness(p)})
    else:
        rows.append({"file": f, "score": None})

df_scores = pd.DataFrame(rows).sort_values(by=["file"])
display(df_scores)

# Highest per recipe
best_r1 = df_scores[df_scores["file"].str.contains("recipe1")]["score"].max()
best_r2 = df_scores[df_scores["file"].str.contains("recipe2")]["score"].max()
print(f"Best Recipe 1 score: {best_r1}")
print(f"Best Recipe 2 score: {best_r2}")


### Create AI Test Case Doc (Q2d)

This creates a starter **`docs/recipe-testcase.md`** following the provided template structure.

In [None]:
from pathlib import Path

testcase = r"""# AI Test Case — Recipe to R3 JSON

## 1. Objective
Convert cleaned recipe text into structured R3 JSON for use by a robotic chef.

## 2. Inputs
- Two cleaned recipes in `data/original_recipe1.txt` and `data/original_recipe2.txt`.
- Prompt-Full strategies: PF1, PF2, PF3
- Prompt-Partial strategies: PP-1 (ingredients), PP-2 (instructions)

## 3. Expected Behavior
- Valid JSON object with keys: `recipe_name`, `data_provenance`, `macronutrients`, `ingredients`, `instructions`.
- Each instruction is a single, imperative action.

## 4. Procedure
1. Use PF1, PF2, PF3 prompts separately for each recipe → save JSONs in `data/`.
2. Use PP-1 and PP-2 to extract ingredients and instructions, then stitch into full JSON in this notebook.
3. Run the goodness evaluator on all 8 JSON files.

## 5. Evaluation Criteria (Rubric)
- 50 points for valid JSON.
- +10 points each for the presence of the five required fields (max 100).

## 6. Results
- Recipe 1: list PF1/PF2/PF3/PP scores.
- Recipe 2: list PF1/PF2/PF3/PP scores.

## 7. Observations
- Which approach (full vs partial) performed better and why?
- Any common formatting errors?

"""
out_path = Path("docs/recipe-testcase.md")
out_path.parent.mkdir(parents=True, exist_ok=True)  # docs/ may not exist yet
out_path.write_text(testcase, encoding="utf-8")
print(f"Wrote {out_path}")
