# Day 05 — Evaluation + guardrails

This notebook shows a **lightweight evaluation** loop: draft a response, score it with a rubric, and decide whether to revise.


## What this notebook does
- Loads a rubric prompt from `prompts/eval_rubric.txt`.
- Runs a first-pass response.
- Asks the evaluator to score the response.
- Prints the score so you can decide whether the draft needs a revision.
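
The steps above can be sketched as a small control loop. This is a minimal sketch, not the notebook's implementation: `evaluate_and_revise`, the integer score, the `threshold` of 4, and the `fake_*` stand-ins are all hypothetical names and values chosen so the example runs offline without calling a model.

```python
from typing import Callable


def evaluate_and_revise(
    request: str,
    draft_fn: Callable[[str], str],
    score_fn: Callable[[str, str], int],
    revise_fn: Callable[[str, str, int], str],
    threshold: int = 4,   # assumed passing score, not taken from the rubric
    max_rounds: int = 2,  # cap on revisions so the loop always terminates
) -> tuple[str, int]:
    """Draft once, then revise until the score reaches threshold or rounds run out."""
    text = draft_fn(request)
    score = score_fn(request, text)
    for _ in range(max_rounds):
        if score >= threshold:
            break
        text = revise_fn(request, text, score)
        score = score_fn(request, text)
    return text, score


# Stand-ins for the model calls (hypothetical, for illustration only).
def fake_draft(request: str) -> str:
    return "Step 1: read docs."


def fake_score(request: str, text: str) -> int:
    # Toy rule: 2 points plus one per "Step" mentioned, capped at 5.
    return min(5, 2 + text.count("Step"))


def fake_revise(request: str, text: str, score: int) -> str:
    return text + " Step 2: practice. Step 3: review."


final_text, final_score = evaluate_and_revise(
    "3-step plan", fake_draft, fake_score, fake_revise
)
print(final_score, final_text)
```

Passing the draft, score, and revise steps in as callables keeps the loop independent of any particular client, so the stand-ins can later be swapped for real model calls.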


In [None]:
import json
from pathlib import Path

from openai import OpenAI

client = OpenAI()
PROMPTS_DIR = Path("prompts")

def load_prompt(name: str) -> str:
    """Return the text of a prompt file stored under PROMPTS_DIR."""
    return (PROMPTS_DIR / name).read_text(encoding="utf-8")


In [None]:
user_request = "Give me a 3-step plan to learn prompt engineering in one week."

draft = client.responses.create(
    model="gpt-4.1-mini",
    input=user_request,
)
draft_text = draft.output_text
print(draft_text)


In [None]:
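rubric = load_prompt("eval_rubric.txt")

# Score the draft: the rubric goes in as the system message, and the user
# message carries the original request together with the draft.
evaluation = client.responses.create(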
    model="gpt-4.1-mini",
    input=[
        {"role": "system", "content": rubric},
        {"role": "user", "content": f"User request: {user_request}\n\nDraft: {draft_text}"},
    ],
)

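# The rubric is expected to tell the evaluator to reply with JSON only;
# json.loads raises json.JSONDecodeError on any other output.
try:
    score = json.loads(evaluation.output_text)
except json.JSONDecodeError:
    score = None
    print("Evaluator did not return valid JSON:", evaluation.output_text)
print(score)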
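
Models sometimes wrap JSON in a Markdown code fence, which makes a bare `json.loads` fail. The sketch below shows one way to tolerate that and then decide whether to revise. The contents of `eval_rubric.txt` are not shown here, so the `"score"` key, the 1 to 5 scale, and the threshold of 4 are assumptions, and `strip_code_fence`, `parse_evaluation`, and `needs_revision` are hypothetical helper names.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker, built this way to keep the example readable


def strip_code_fence(raw: str) -> str:
    """Remove one surrounding Markdown code fence, if present."""
    text = raw.strip()
    if len(text) >= 6 and text.startswith(FENCE) and text.endswith(FENCE):
        text = text[3:-3]
        # Drop an optional language tag such as "json" on the opening line.
        first_line, _, rest = text.partition("\n")
        if first_line.strip().lower() in ("", "json"):
            text = rest
    return text.strip()


def parse_evaluation(raw: str) -> dict:
    """Parse evaluator output into a dict, raising ValueError on anything else."""
    result = json.loads(strip_code_fence(raw))  # JSONDecodeError subclasses ValueError
    if not isinstance(result, dict):
        raise ValueError("expected a JSON object from the evaluator")
    return result


def needs_revision(result: dict, threshold: int = 4) -> bool:
    # Assumes the rubric asks for an integer under the key "score".
    return result["score"] < threshold


fenced = FENCE + 'json\n{"score": 3, "reason": "too vague"}\n' + FENCE
plain = '{"score": 5, "reason": "clear and complete"}'

print(needs_revision(parse_evaluation(fenced)))
print(needs_revision(parse_evaluation(plain)))
```

In the notebook, `parse_evaluation(evaluation.output_text)` could stand in for the direct `json.loads` call, provided the rubric really does request a JSON object with a numeric score.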
