Wire IREE execution into CI and gate on PyTorch numerical checks #447

@michalharakal

Description

SKaiNET Ground Truth already defines PyTorch as the reference and stores generated validation artifacts in GGUF. IREE provides both runtime tooling and test patterns for end-to-end execution. This milestone closes the loop: export StableHLO from SKaiNET, compile with IREE, run on CPU, compare results against PyTorch-derived references, and block merges when outputs fall outside the declared numeric tolerances. That gives NLnet a milestone with an objective pass/fail gate instead of a documentation claim. ([GitHub][6])
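
As a rough illustration of the compile-and-run half of that loop, the sketch below only builds the two command lines a CI step might invoke; it does not execute them. The helper name `iree_commands`, the fixture layout, the `input.npy`/`actual.npy` file names, and the `main` entry point are all assumptions made for this example. The flag spellings are ones the IREE tools have accepted, but they change between releases (newer compilers prefer `--iree-hal-target-device`), so they should be checked against whichever IREE version CI pins.

```python
from pathlib import Path


def iree_commands(mlir: Path, work_dir: Path, function: str = "main") -> list[list[str]]:
    """Build (but do not run) the compile and execute commands for one fixture.

    Hypothetical helper: paths, file names, and the entry-point name are
    placeholders, not SKaiNET conventions.
    """
    vmfb = work_dir / (mlir.stem + ".vmfb")
    compile_cmd = [
        "iree-compile",
        str(mlir),
        "--iree-input-type=stablehlo",          # fixture is StableHLO MLIR
        "--iree-hal-target-backends=llvm-cpu",  # CPU target; verify for the pinned release
        "-o",
        str(vmfb),
    ]
    run_cmd = [
        "iree-run-module",
        f"--module={vmfb}",
        "--device=local-task",                  # multithreaded CPU driver
        f"--function={function}",
        f"--input=@{work_dir / 'input.npy'}",   # assumed input artifact
        f"--output=@{work_dir / 'actual.npy'}", # written for the later comparison
    ]
    return [compile_cmd, run_cmd]
```

A CI script could pass each list to `subprocess.run(cmd, check=True)`, so that a non-zero exit from either tool fails the job before any numerical comparison is attempted.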

Background links:

  • SKaiNET Ground Truth README. ([GitHub][6])
  • SKaiNET core and transformer scope for example models. ([GitHub][1])
  • IREE iree-run-module, Python packages, console scripts, and e2e testing. ([IREE][7])

Acceptance criteria:

  • CI installs IREE from supported packages or a pinned source build and records the method used.
  • CI compiles selected SKaiNET StableHLO fixtures to CPU modules and executes them automatically.
  • CI compares outputs against PyTorch ground truth with declared tolerances for each test family.
  • Merge gating fails when tolerance thresholds are exceeded.
  • Test fixtures use synthetic or public inputs only.
  • The runbook states whether console scripts come from pip packages or source build; unknown details are marked Unspecified.
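
The tolerance and merge-gating criteria above could be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the project's implementation: the family names and tolerance values in `TOLERANCES` are hypothetical placeholders (the issue does not specify them), and tensors are assumed to be already flattened to lists of floats. The check uses the common `|actual - expected| <= atol + rtol * |expected|` rule; a real CI job would more likely call `numpy.testing.assert_allclose` on arrays loaded from the stored references.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Tolerance:
    rtol: float
    atol: float


# Hypothetical test families and values, declared in one place so the gate is auditable.
TOLERANCES = {
    "elementwise": Tolerance(rtol=1e-6, atol=1e-6),
    "matmul": Tolerance(rtol=1e-4, atol=1e-5),
}


def mismatches(actual, expected, tol):
    """Return the indices where actual is outside tolerance of expected."""
    if len(actual) != len(expected):
        raise ValueError(f"length mismatch: {len(actual)} vs {len(expected)}")
    bad = []
    for i, (a, e) in enumerate(zip(actual, expected)):
        if a == e:
            continue  # exact match, including equal infinities
        if math.isnan(a) and math.isnan(e):
            continue  # design choice in this sketch: NaN in both counts as a match
        if not (math.isfinite(a) and math.isfinite(e)):
            bad.append(i)  # any other non-finite disagreement is a failure
            continue
        if abs(a - e) > tol.atol + tol.rtol * abs(e):
            bad.append(i)
    return bad


def gate(family, actual, expected):
    """Return a process exit code: 0 if within tolerance, 1 otherwise."""
    return 1 if mismatches(actual, expected, TOLERANCES[family]) else 0
```

Returning an exit code keeps the gate easy to wire into a required status check: the CI step exits with `gate(...)`, and branch protection blocks the merge whenever that step fails.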
