Problem
The shipped compacted model gives garbage for general conversation and can't use our tools reliably. We need a model that:
- Passes real coding challenges (not trivial hello worlds)
- Uses OUR tool system correctly (code/edit, code/write, shell, etc.)
- Ships with Continuum — works out of the box, zero API keys
The pipeline
- Academy RealClassEval: 488 Python challenges (390 train, 98 eval). Current best: 53.1% Pass@1 with DeepSeek-Chat (cloud). Local model needs to match or exceed this.
- Tool-call LoRA: Fine-tune on successful tool invocation traces from CodingAgent sessions. Model learns OUR tool schema, not generic function calling.
- Sentinel coding pipelines: dev/build-feature, dev/fix-bug must work end-to-end with the shipped model.
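The tool-call LoRA step depends on harvesting successful invocations from session logs. A minimal sketch of that filtering, assuming a hypothetical trace schema (`steps`, `tool`, `ok`, `prompt`, `tool_call` are illustrative field names, not the actual CodingAgent format):

```python
import json
from pathlib import Path

def collect_tool_traces(trace_dir: str, out_path: str) -> int:
    """Filter CodingAgent session traces down to successful tool calls
    and emit them as JSONL chat examples for LoRA fine-tuning.
    The input schema here is an assumption for illustration."""
    kept = 0
    with open(out_path, "w") as out:
        for path in Path(trace_dir).glob("*.json"):
            session = json.loads(path.read_text())
            for step in session.get("steps", []):
                # Keep only invocations of OUR tools that succeeded,
                # so the model learns the project's schema from positives.
                if step.get("tool") in {"code/edit", "code/write", "shell"} and step.get("ok"):
                    out.write(json.dumps({
                        "messages": [
                            {"role": "user", "content": step["prompt"]},
                            {"role": "assistant", "content": step["tool_call"]},
                        ]
                    }) + "\n")
                    kept += 1
    return kept
```

The point of filtering on success is that failed invocations would teach the model the wrong schema; whatever the real trace format is, only passing calls should reach the training set.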
Approach
- Base: Qwen2.5-Coder-14B (fits 16GB MacBook Air at Q5_K)
- LoRA training on: RealClassEval solutions + CodingAgent tool traces + Academy exam passes
- Eval: must pass >40% of RealClassEval via the tool system (not raw code generation)
- Ship as: `continuum-ai/qwen2.5-coder-14b-continuum` on HuggingFace, auto-discovered by CandleAdapter on first run
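With one sample per problem, the >40% gate reduces to a pass fraction. A trivial helper makes the ship bar explicit (the boolean result list and helper names are assumptions about the eval harness, not existing code):

```python
def pass_at_1(results: list[bool]) -> float:
    """Pass@1 with a single sample per problem is just the pass fraction."""
    if not results:
        return 0.0
    return sum(results) / len(results)

def meets_ship_bar(results: list[bool], threshold: float = 0.40) -> bool:
    # Ship gate: the LOCAL model must clear 40% on the 98-question eval split.
    return pass_at_1(results) > threshold
```

For scale: 40 of 98 passes is ~40.8% and clears the bar; the DeepSeek-Chat baseline of 53.1% corresponds to 52 of 98.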
Success criteria
- `./jtag academy/start --mode=realclasseval --questionsPerExam=98` passes >40% with LOCAL model
- Sentinel `dev/fix-bug` completes successfully on real repo bugs with LOCAL model
- Zero API keys needed; works on MacBook Air 16GB
Dependencies
- P6: Tool calling reliability — parser-per-model-family #324 (tool calling parsers)
- P6B: Ship 14B compacted model, research 32B QAT #325 (ship 14B model)
- Local inference quality — compacted 14B model gives poor responses #321 (local inference quality)