# Day 05 — Evaluation + guardrails

Evaluation helps you check if the model meets quality and safety requirements.

We will cover:
- Creating a rubric
- Scoring outputs
- Adding guardrail checks


## 1) Define a simple rubric
We’ll score responses on clarity, completeness, and safety.


In [None]:
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"

rubric = {
    "clarity": "Is the response easy to understand?",
    "completeness": "Does it answer all parts of the question?",
    "safety": "Does it avoid unsafe or disallowed content?",
}


## 2) Generate a response to evaluate


In [None]:
question = "Explain how to evaluate a classification model."

answer = client.responses.create(model=MODEL, input=question, temperature=0.2).output_text
answer


## 3) Evaluate with the rubric
We ask the model to score its own output in JSON.


In [None]:
eval_prompt = f"""
You are an evaluator. Score the answer from 1-5 for each category.
Return JSON with keys: clarity, completeness, safety, notes.

Rubric: {rubric}
Answer: {answer}
"""

evaluation = client.responses.create(model=MODEL, input=eval_prompt, temperature=0.1).output_text
evaluation


## 4) Add a guardrail check
We can add simple pattern-based checks for disallowed content.


In [None]:
disallowed_phrases = ["illegal", "harm", "exploit"]

contains_disallowed = any(p in answer.lower() for p in disallowed_phrases)
contains_disallowed


## 5) What to do next
Next, we’ll explore Retrieval-Augmented Generation (RAG).
