Model Qualification Report: Qwen2.5-Coder-1.5B-Instruct
Date: 2026-01-30
Qualified By: apr-model-qa-playbook v0.1.0
Model: Qwen/Qwen2.5-Coder-1.5B-Instruct
Format: GGUF Q4_K_M (1.04 GB)
Summary
| Metric | Value |
|--------|-------|
| MQS Score | 200/1000 |
| Grade | F (Partial - QUAL only) |
| Gateways | 4/4 PASSED |
| Tests | 50/50 PASSED (100%) |
| Duration | 195.4s |
Gateway Status
| Gateway | Status | Description |
|---------|--------|-------------|
| G1-LOAD | ✅ PASS | Model loads successfully |
| G2-INFER | ✅ PASS | Basic inference works |
| G3-STABLE | ✅ PASS | No crashes or panics |
| G4-VALID | ✅ PASS | Output is not garbage |
Performance Metrics
| Metric | Value |
|--------|-------|
| Tokens/second | 5.9 - 21.2 tok/s |
| Generation time (32 tokens) | ~1.5s |
| Total latency (incl. load) | ~3.8s |
| Backend | CPU |
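As a quick consistency check on the figures above, the short sketch below recomputes throughput from the 32-token generation time and estimates how much of the total latency is not generation. Treating that remainder as load overhead is an assumption drawn from the "incl. load" label; the report does not break latency down further, and both inputs are approximate.

```python
def tokens_per_second(tokens: int, seconds: float) -> float:
    """Return generation throughput in tokens per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return tokens / seconds


# Approximate figures copied from the Performance Metrics table.
generation_tokens = 32
generation_seconds = 1.5
total_latency_seconds = 3.8

throughput = tokens_per_second(generation_tokens, generation_seconds)

# Assumption: everything that is not generation time is load overhead.
load_overhead = total_latency_seconds - generation_seconds

print(f"throughput ~ {throughput:.1f} tok/s")   # throughput ~ 21.3 tok/s
print(f"load overhead ~ {load_overhead:.1f} s")  # load overhead ~ 2.3 s
```

The recomputed value of roughly 21.3 tok/s sits at the upper end of the reported 5.9 - 21.2 tok/s range, which suggests the ~1.5s figure reflects a best-case run.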
Test Matrix
- Modalities: run, chat
- Backends: cpu
- Formats: gguf
- Scenarios per combination: 25
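The matrix above accounts for the 50 tests reported in the summary: 2 modalities × 1 backend × 1 format × 25 scenarios. A minimal sketch of that enumeration follows; the variable names are illustrative and are not taken from apr-model-qa-playbook.

```python
from itertools import product

# Values copied from the Test Matrix section.
modalities = ["run", "chat"]
backends = ["cpu"]
formats = ["gguf"]
scenarios_per_combination = 25

# Every (modality, backend, format) triple is one matrix cell.
combinations = list(product(modalities, backends, formats))
total_tests = len(combinations) * scenarios_per_combination

for modality, backend, fmt in combinations:
    print(f"{modality}/{backend}/{fmt}: {scenarios_per_combination} scenarios")
print(f"total: {total_tests}")  # total: 50
```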
Artifacts
- `evidence.json` - Full test evidence (50 entries)
- `junit.xml` - JUnit XML for CI integration
- `mqs.json` - Machine-readable MQS score
- `report.html` - Interactive HTML dashboard
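For CI integration, a JUnit XML artifact can be summarized with the Python standard library. The sketch below assumes the common JUnit layout (`<testcase>` elements with optional `<failure>` or `<error>` children); the exact schema emitted by apr-model-qa-playbook is not shown in this report, and the sample document is invented purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, NOT taken from the real junit.xml artifact.
SAMPLE_JUNIT = """<testsuite name="qualification" tests="2" failures="1">
  <testcase classname="run.cpu.gguf" name="scenario_a"/>
  <testcase classname="chat.cpu.gguf" name="scenario_b">
    <failure message="example failure"/>
  </testcase>
</testsuite>"""


def summarize_junit(xml_text: str) -> dict:
    """Count total, failed, and passed test cases in a JUnit XML string."""
    root = ET.fromstring(xml_text)
    cases = list(root.iter("testcase"))
    failed = [
        case for case in cases
        if case.find("failure") is not None or case.find("error") is not None
    ]
    return {
        "total": len(cases),
        "failed": len(failed),
        "passed": len(cases) - len(failed),
    }


summary = summarize_junit(SAMPLE_JUNIT)
print(summary)  # {'total': 2, 'failed': 1, 'passed': 1}
```

To run it against the real artifact, read the file contents (for example with `pathlib.Path("junit.xml").read_text()`) and pass them to `summarize_junit`; for this report the expected result would be 50 passed and 0 failed.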
Methodology
Tests follow the Popperian Falsification protocol:
- Each test is a falsifiable hypothesis
- Outcome: Corroborated (survived refutation) or Falsified (refuted)
- All 50 hypotheses were corroborated
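The protocol above can be sketched as a small data structure: each hypothesis carries a refutation attempt, and the outcome records whether the claim survived it. This is an illustrative model only; `Hypothesis`, `Outcome`, and `fake_generate` are hypothetical names rather than the playbook's API, and counting an exception as a refutation is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Outcome(Enum):
    CORROBORATED = "corroborated"  # survived the refutation attempt
    FALSIFIED = "falsified"        # refuted


@dataclass
class Hypothesis:
    claim: str
    # Returns True when the refutation attempt succeeds (claim is refuted).
    attempt_refutation: Callable[[], bool]

    def test(self) -> Outcome:
        try:
            refuted = self.attempt_refutation()
        except Exception:
            refuted = True  # assumption: a crash refutes the claim
        return Outcome.FALSIFIED if refuted else Outcome.CORROBORATED


def fake_generate(prompt: str) -> str:
    """Stand-in for real model inference (hypothetical)."""
    return "def add(a, b): return a + b"


hypotheses = [
    Hypothesis("output is non-empty",
               lambda: len(fake_generate("write add")) == 0),
    Hypothesis("output contains no NUL bytes",
               lambda: "\x00" in fake_generate("write add")),
]

outcomes = [h.test() for h in hypotheses]
all_corroborated = all(o is Outcome.CORROBORATED for o in outcomes)
print(all_corroborated)  # True
```

Framing each check as an attempt to refute keeps the result honest: a corroborated hypothesis has only survived the tests that were run, which is weaker than a proof of correctness.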
Recommendations
- Production Ready: Yes, for CPU inference, within the scope of this QUAL-only run (the 200/1000 MQS score is partial)
- Performance: Acceptable (5.9+ tok/s on CPU)
- Stability: No crashes observed in 50 tests
Next Steps
Generated by apr-model-qa-playbook