QA Report: Qwen2.5-Coder-1.5B-Instruct Qualified #171

@noahgift

Model Qualification Report: Qwen2.5-Coder-1.5B-Instruct

Date: 2026-01-30
Qualified By: apr-model-qa-playbook v0.1.0
Model: Qwen/Qwen2.5-Coder-1.5B-Instruct
Format: GGUF Q4_K_M (1.04 GB)

Summary

| Metric | Value |
| --- | --- |
| MQS Score | 200/1000 |
| Grade | F (Partial - QUAL only) |
| Gateways | 4/4 PASSED |
| Tests | 50/50 PASSED (100%) |
| Duration | 195.4s |

Gateway Status

| Gateway | Status | Description |
| --- | --- | --- |
| G1-LOAD | ✅ PASS | Model loads successfully |
| G2-INFER | ✅ PASS | Basic inference works |
| G3-STABLE | ✅ PASS | No crashes or panics |
| G4-VALID | ✅ PASS | Output is not garbage |

Performance Metrics

| Metric | Value |
| --- | --- |
| Tokens/second | 5.9 - 21.2 tok/s |
| Generation time (32 tokens) | ~1.5s |
| Total latency (incl. load) | ~3.8s |
| Backend | CPU |
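
As a quick consistency check on the figures above, the throughput implied by generating 32 tokens in roughly 1.5s can be computed directly. This is a minimal sketch: the function name is illustrative and is not part of apr-model-qa-playbook, and both inputs are the approximate values reported here.

```python
def tokens_per_second(tokens: int, seconds: float) -> float:
    """Return generation throughput in tokens per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return tokens / seconds


# Approximate figures taken from the Performance Metrics section.
rate = tokens_per_second(32, 1.5)
print(f"{rate:.1f} tok/s")  # 21.3 tok/s

# Time not spent generating (model load plus any other overhead).
non_generation_s = round(3.8 - 1.5, 1)
print(f"{non_generation_s} s")  # 2.3 s
```

The result sits close to the upper end of the reported 5.9 - 21.2 tok/s range, which suggests the ~1.5s figure reflects the faster runs.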

Test Matrix

  • Modalities: run, chat
  • Backends: cpu
  • Formats: gguf
  • Scenarios per combination: 25
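
The test count follows from the matrix dimensions listed above. A minimal sketch of that arithmetic, with variable names chosen for illustration only:

```python
from itertools import product

# Matrix dimensions as listed in this report.
modalities = ["run", "chat"]
backends = ["cpu"]
formats = ["gguf"]
scenarios_per_combination = 25

# Every (modality, backend, format) triple is one combination.
combinations = list(product(modalities, backends, formats))
total_tests = len(combinations) * scenarios_per_combination

print(len(combinations))  # 2
print(total_tests)        # 50
```

Two combinations at 25 scenarios each give the 50 tests reported in the Summary.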

Artifacts

  • evidence.json - Full test evidence (50 entries)
  • junit.xml - JUnit XML for CI integration
  • mqs.json - Machine-readable MQS score
  • report.html - Interactive HTML dashboard
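
For CI integration, a file such as junit.xml can be summarized with the Python standard library. The sketch below assumes the conventional JUnit layout (`<testcase>` elements with optional `<failure>` or `<error>` children). The exact structure the playbook emits is not shown in this report, so the sample document and the `summarize_junit` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real junit.xml from this run is not reproduced here.
SAMPLE = """<testsuite name="qualification" tests="3">
  <testcase classname="run.cpu.gguf" name="scenario_01"/>
  <testcase classname="run.cpu.gguf" name="scenario_02"/>
  <testcase classname="chat.cpu.gguf" name="scenario_01">
    <failure message="example failure"/>
  </testcase>
</testsuite>"""


def summarize_junit(xml_text: str) -> dict:
    """Count total, failed, and passed test cases in a JUnit XML string."""
    root = ET.fromstring(xml_text)
    # iter() also finds test cases nested under a <testsuites> root.
    cases = list(root.iter("testcase"))
    failed = [
        c for c in cases
        if c.find("failure") is not None or c.find("error") is not None
    ]
    # Simplification: skipped test cases are counted as passed.
    return {
        "total": len(cases),
        "failed": len(failed),
        "passed": len(cases) - len(failed),
    }


summary = summarize_junit(SAMPLE)
print(summary)  # {'total': 3, 'failed': 1, 'passed': 2}
```

A CI job could exit non-zero whenever `summary["failed"]` is greater than zero.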

Methodology

Tests follow the Popperian Falsification protocol:

  • Each test is a falsifiable hypothesis
  • Outcome: Corroborated (survived refutation) or Falsified (refuted)
  • All 50 hypotheses were corroborated
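
The protocol above can be sketched as follows. This is an illustration under stated assumptions, not the playbook's implementation: `Hypothesis`, `evaluate`, and `fake_generate` are hypothetical names, and treating an exception as a refutation is an assumption made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Outcome(Enum):
    CORROBORATED = "corroborated"  # survived the refutation attempt
    FALSIFIED = "falsified"        # refuted


@dataclass
class Hypothesis:
    claim: str
    # Returns True when the claim has been refuted.
    attempt_refutation: Callable[[], bool]


def evaluate(h: Hypothesis) -> Outcome:
    try:
        refuted = h.attempt_refutation()
    except Exception:
        # Assumption: a crash during the attempt counts as a refutation.
        refuted = True
    return Outcome.FALSIFIED if refuted else Outcome.CORROBORATED


def fake_generate(prompt: str) -> str:
    """Hypothetical stand-in for a model call."""
    return "def add(a, b): return a + b"


hypotheses = [
    Hypothesis("output is non-empty",
               lambda: fake_generate("write add()").strip() == ""),
    Hypothesis("output contains no NUL bytes",
               lambda: "\x00" in fake_generate("write add()")),
]

results = [evaluate(h) for h in hypotheses]
all_corroborated = all(r is Outcome.CORROBORATED for r in results)
print(all_corroborated)  # True
```

Each test tries to break its claim rather than confirm it, so a corroborated result means only that this particular attempt failed to refute the hypothesis.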

Recommendations

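  1. Production Ready: Yes for CPU inference, within the scope of this partial (QUAL-only) run; the full 1800-test qualification has not yet been run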
  2. Performance: Acceptable (5.9+ tok/s on CPU)
  3. Stability: No crashes observed in 50 tests

Next Steps

  • Run full qualification (1800 tests) for comprehensive coverage
  • Add GPU backend testing
  • Test additional quantizations (Q5_K_M, Q8_0)

Generated by apr-model-qa-playbook
