# Level 1 - Week 3 Practice (Starter Notebook)

Starter code for structured outputs: JSON parsing + schema validation + retry/repair patterns.

## References (docs)
- JSON Schema (official): https://json-schema.org/
- Python `json` (official): https://docs.python.org/3/library/json.html
- Pydantic (validation): https://docs.pydantic.dev/latest/
- Tenacity (retries): https://tenacity.readthedocs.io/
- Prompt Engineering Guide (community): https://www.promptingguide.ai/
- Anthropic Cookbook (GitHub): https://github.com/anthropics/anthropic-cookbook


## Setup

Run this in an environment with `pydantic` (v2 or later, since the code uses `model_validate` and `model_json_schema`) and `tenacity` installed.


In [None]:
import json
from typing import List, Optional

from pydantic import BaseModel
from tenacity import retry, stop_after_attempt, wait_exponential


## Define a target schema

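This schema is the contract downstream code can rely on: every validated result has an `items` list of `field`/`value` string pairs and an optional `notes` string.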


In [None]:
class ExtractionItem(BaseModel):
    field: str
    value: str

class ExtractionResult(BaseModel):
    items: List[ExtractionItem]
    notes: Optional[str] = None

schema_json = ExtractionResult.model_json_schema()
schema_json
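
To make the generated schema more concrete, here is a minimal sketch that inspects which fields are required and embeds the schema in a prompt string. The prompt wording and the `schema_prompt` name are assumptions for illustration, not a tested prompt.

```python
import json
from typing import List, Optional

from pydantic import BaseModel


class ExtractionItem(BaseModel):
    field: str
    value: str


class ExtractionResult(BaseModel):
    items: List[ExtractionItem]
    notes: Optional[str] = None


schema = ExtractionResult.model_json_schema()

# Fields without a default are listed as required; `notes` has a default, so it is not.
print(schema["required"])  # ['items']

# Nested models are placed under "$defs" and referenced from the parent schema.
print(schema["$defs"]["ExtractionItem"]["required"])

# Hypothetical prompt template (the wording is an assumption).
schema_prompt = (
    "Return only a JSON object that conforms to this JSON Schema:\n"
    + json.dumps(schema, indent=2)
)
print(schema_prompt)
```

Serializing the schema with `json.dumps` keeps the prompt and the validator in sync, because both come from the same model class.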


## Simulate model output

We simulate one valid output and two common failure cases: text that is not valid JSON, and valid JSON with the wrong shape (an item missing its required `value`).


In [None]:
raw_good = json.dumps({
    'items': [{'field': 'company', 'value': 'Acme'}],
    'notes': 'ok',
}, ensure_ascii=False)
raw_bad_json = 'items: [company=Acme]'
raw_wrong_shape = json.dumps({'items': [{'field': 'company'}]}, ensure_ascii=False)
raw_good, raw_bad_json, raw_wrong_shape


## Parse + validate helper

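Parsing and then validating turns raw model output into an explicit success or failure: `json.loads` raises `json.JSONDecodeError` on malformed JSON, and `model_validate` raises `pydantic.ValidationError` when the shape is wrong.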


In [None]:
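def parse_and_validate(raw_text: str) -> ExtractionResult:
    """Parse raw model output as JSON and validate it against ExtractionResult."""
    data = json.loads(raw_text)  # raises json.JSONDecodeError on malformed JSON
    return ExtractionResult.model_validate(data)  # raises pydantic.ValidationError on wrong shape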


In [None]:
parse_and_validate(raw_good)
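
The two failure cases raise different exceptions, which is useful when deciding how to repair. The sketch below repeats the models so it runs on its own; the `classify` helper and its string labels are hypothetical names introduced for illustration.

```python
import json
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class ExtractionItem(BaseModel):
    field: str
    value: str


class ExtractionResult(BaseModel):
    items: List[ExtractionItem]
    notes: Optional[str] = None


def parse_and_validate(raw_text: str) -> ExtractionResult:
    data = json.loads(raw_text)
    return ExtractionResult.model_validate(data)


def classify(raw_text: str) -> str:
    """Return 'ok', 'invalid_json', or 'schema_error' (hypothetical labels)."""
    try:
        parse_and_validate(raw_text)
    except json.JSONDecodeError:
        return "invalid_json"  # the text is not JSON at all
    except ValidationError:
        return "schema_error"  # JSON parsed, but the shape is wrong
    return "ok"


samples = {
    "good": '{"items": [{"field": "company", "value": "Acme"}], "notes": "ok"}',
    "bad_json": "items: [company=Acme]",
    "wrong_shape": '{"items": [{"field": "company"}]}',
}
for name, raw in samples.items():
    print(name, "->", classify(raw))
```

Keeping the two exception types separate lets a repair step send a targeted message, for example "output was not JSON" versus "field `value` is missing".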


## Retry/repair wrapper (starter pattern)

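In production you would re-prompt the model with the schema and the invalid output. Here `naive_repair` is a hard-coded stand-in for that step, so a retry only becomes useful once the repair can return something different on each attempt.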


In [None]:
def naive_repair(raw_text: str) -> str:
    # TODO: replace with an LLM re-ask in the real project
    if raw_text.startswith('items:'):
        return json.dumps({
            'items': [{'field': 'company', 'value': 'Acme'}],
            'notes': 'repaired',
        }, ensure_ascii=False)
    return raw_text

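# Retry up to 3 times with exponential backoff (waits clamped between 0.5 s and 2 s).
# After the final failed attempt, tenacity raises RetryError wrapping the last exception.
@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=0.5, min=0.5, max=2.0))
def parse_validate_with_retry(raw_text: str) -> ExtractionResult:
    repaired = naive_repair(raw_text)
    return parse_and_validate(repaired)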

parse_validate_with_retry(raw_bad_json)


## TODO: Integrate with a real LLM

- Put your schema into the prompt.
- On failure, re-prompt with the invalid output and the error message, and request corrected JSON.
- Always cap retries (e.g., 3).
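
The three steps above can be sketched as a plain loop. This is a starting point under stated assumptions, not a definitive implementation: `call_model` is a hypothetical callable that takes a prompt string and returns the model's text, `extract_with_reask` and the prompt wording are made up for illustration, and `fake_model` stands in for a real API call.

```python
import json
from typing import Callable, List, Optional

from pydantic import BaseModel, ValidationError


class ExtractionItem(BaseModel):
    field: str
    value: str


class ExtractionResult(BaseModel):
    items: List[ExtractionItem]
    notes: Optional[str] = None


def extract_with_reask(
    call_model: Callable[[str], str],  # hypothetical: prompt in, raw text out
    task: str,
    max_attempts: int = 3,
) -> ExtractionResult:
    schema_text = json.dumps(ExtractionResult.model_json_schema())
    prompt = f"{task}\n\nReturn only JSON matching this schema:\n{schema_text}"
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):  # retries are always capped
        raw = call_model(prompt)
        try:
            return ExtractionResult.model_validate(json.loads(raw))
        except (json.JSONDecodeError, ValidationError) as exc:
            last_error = exc
            # Re-prompt with the invalid output and the error, asking for a correction.
            prompt = (
                f"{task}\n\nYour previous output was invalid.\n"
                f"Output: {raw}\nError: {exc}\n"
                f"Return only corrected JSON matching this schema:\n{schema_text}"
            )
    raise ValueError(f"no valid output after {max_attempts} attempts") from last_error


# Fake model for demonstration: fails once, then returns valid JSON.
_replies = iter([
    "items: [company=Acme]",
    '{"items": [{"field": "company", "value": "Acme"}]}',
])


def fake_model(prompt: str) -> str:
    return next(_replies)


result = extract_with_reask(fake_model, "Extract the company name from: 'Acme Corp.'")
print(result.items[0].value)
```

Passing the model call in as a parameter keeps the loop testable without network access; swapping `fake_model` for a real client call should not require changes to the loop itself.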
