User Story
As Maya, in order to demonstrate that a small fine-tuned model can match or beat prompted larger models on a narrow domain task, I want an extraction variant powered by a LoRA fine-tune of a small open model, served from our own inference endpoint.
Preconditions
Acceptance Criteria
- Training data in `data/fine-tuning/extraction/`: at least 50 (PDF description → spec) pairs, either hand-curated from fixtures or Opus-generated and reviewed (pair shape and endpoint call sketched after this list)
- A training script at `scripts/fine-tune-extraction.ts` (or language appropriate to the trainer)
- A variant `extraction/lora-v1` in the extraction registry that calls the inference endpoint via HTTP
- An experiment write-up at `catalog/experiments/pdf-field-extraction/lora-v1.md` including: base model, training data size, LoRA rank/alpha, training hyperparameters, deployment architecture, and metric deltas vs baselines
- `catalog/experiments/_roadmap.md` updated with shipped status and one-line finding
- If not shipped: status `scope-deferred` with reasoning — "attempted" is an honest outcome
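A minimal sketch of the intended shapes, assuming a JSONL pair format and a simple POST endpoint; `ExtractionPair`, `loraV1Extract`, and `LORA_ENDPOINT` are illustrative names, not settled interfaces:

```ts
// Hypothetical shapes; the real registry API and pair schema may differ.

// One training pair, stored as a JSONL line under data/fine-tuning/extraction/.
interface ExtractionPair {
  input: string;                             // raw description text from a PDF fixture
  output: Record<string, string>;            // the structured spec the model should emit
  source: "hand-curated" | "opus-generated"; // provenance; reviewed either way
}

// Made-up example record, for shape only.
const samplePair: ExtractionPair = {
  input: "Operating voltage: 12 V DC, IP67 enclosure, -20 to 60 C",
  output: { voltage: "12 V DC", ingressProtection: "IP67" },
  source: "hand-curated",
};

// Assumed endpoint URL; the real value would come from deployment config.
const LORA_ENDPOINT =
  process.env.LORA_ENDPOINT ?? "http://localhost:8000/v1/extract";

// Registry variant: same call shape as the API-backed variants, but the
// completion comes from our own inference endpoint over HTTP.
async function loraV1Extract(pdfText: string): Promise<Record<string, string>> {
  const res = await fetch(LORA_ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt: pdfText }),
  });
  if (!res.ok) throw new Error(`lora-v1 endpoint returned ${res.status}`);
  return (await res.json()) as Record<string, string>;
}
```

Any request body shape works here; the point is that `lora-v1` satisfies the same variant interface as the API-backed extractors, so the comparison harness needs no special casing.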
Success Metrics
- Either the LoRA variant ships and achieves recall above 50%, or the catalog honestly documents what was attempted and what blocked shipping
- If shipped, cost per extraction is documented and compared against the API variants (a back-of-envelope sketch follows this list)
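The cost comparison the metric implies, as a back-of-envelope sketch; every number below (GPU rate, throughput, API price) is a placeholder assumption, not a measurement:

```ts
// All numbers are placeholders for illustration, not measured values.
const GPU_HOURLY_USD = 0.6;            // assumed hourly rate for a small GPU
const EXTRACTIONS_PER_HOUR = 1200;     // assumed sustained lora-v1 throughput
const API_COST_PER_EXTRACTION = 0.004; // assumed per-call cost of an API variant

// Self-hosted cost is amortized GPU time per extraction.
const loraCost = GPU_HOURLY_USD / EXTRACTIONS_PER_HOUR;

console.log(`lora-v1:     $${loraCost.toFixed(5)} per extraction`);
console.log(`API variant: $${API_COST_PER_EXTRACTION.toFixed(5)} per extraction`);
```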
Notes
- Class topic: fine-tuning (Ch 5), MLOps (Ch 4), production deployment (Ch 6) — the headline rubric item
- Base model choice: smallest viable — Llama 3.2 3B or Mistral 7B (a hypothetical training config is sketched after these notes)
- Cost gate: explicit user approval before GPU training spend
- This is the most ambitious story; acceptable to scope-defer with a written explanation
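A hypothetical training config showing the knobs the experiment write-up must document; all values (rank, alpha, learning rate, epochs) are illustrative starting points, pending the cost-gate approval:

```ts
// Hypothetical config for scripts/fine-tune-extraction.ts; values are
// illustrative defaults, not recommendations.
interface LoraTrainingConfig {
  baseModel: string;   // smallest viable open model
  loraRank: number;    // r: dimensionality of the low-rank update matrices
  loraAlpha: number;   // scaling factor applied to the LoRA update
  learningRate: number;
  epochs: number;
  trainPairs: number;  // must be >= 50 per the acceptance criteria
}

const loraV1Config: LoraTrainingConfig = {
  baseModel: "meta-llama/Llama-3.2-3B-Instruct",
  loraRank: 16,
  loraAlpha: 32,
  learningRate: 2e-4,
  epochs: 3,
  trainPairs: 50,
};
```

Rank 16 with alpha 32 is a common small-model starting point; whatever values actually train go in `catalog/experiments/pdf-field-extraction/lora-v1.md`.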
Definition of Done