Skip to content

proposal: add CLI round-trip regression test for all reduction rules #1005

@zazabap

Description

@zazabap

Summary

Four extract_solution bugs (#1000, #1001, #1002, #1003) and three missing-example gaps (#1004) were discovered by running the full CLI pipeline for all 43 reduction rules from PRs #777, #779, #804, #972. None are caught by existing CI.

Root Cause Analysis

The Rust closed-loop unit tests pass despite broken extract_solution implementations. Why?

// Unit test path (passes):
let target = source.reduce_to();
let target_solution = BruteForce::find_witness(&target);  // specific witness
let source_solution = reduction.extract_solution(target_solution);
assert!(source.evaluate(source_solution) == Or(true));  // works for THIS witness
# CLI path (fails):
pred create --example ThreePartition -o tp.json
pred reduce tp.json --to FlowShopScheduling -o bundle.json
pred solve bundle.json  # ILP finds a DIFFERENT valid witness
# extract_solution produces Or(false) for the ILP witness

The bug is that extract_solution is witness-dependent. It only works for the specific witness structure that brute-force happens to return, not for all valid target solutions. The ILP solver finds an equally valid but structurally different witness, and the Lehmer-code decoding breaks.

A correct extract_solution must work for any valid target solution, not just the one brute-force finds. This is a contract violation.

Proposal

1. Fix the root cause: closed-loop tests must also use ILP witnesses

Current closed-loop tests only verify extraction with brute-force witnesses. Add a second assertion using the ILP solver:

#[test]
fn test_source_to_target_closed_loop() {
    // Existing: brute-force witness
    let bf_witness = BruteForce::find_witness(&target).unwrap();
    let bf_extracted = reduction.extract_solution(bf_witness);
    assert!(source.evaluate(bf_extracted).is_feasible());

    // NEW: ILP witness (may be structurally different)
    let ilp_witness = ILP::find_witness(&target).unwrap();
    let ilp_extracted = reduction.extract_solution(ilp_witness);
    assert!(source.evaluate(ilp_extracted).is_feasible());
}

This catches the bug at the source — in the Rust test suite, with coverage, during development.

2. Add make cli-roundtrip as a CI regression test

A shell script that runs pred create --examplepred reducepred solve → checks source eval for every rule with a canonical example. This is a belt-and-suspenders check that also exercises serialization, registry lookup, and dynamic path finding.

cli-roundtrip: cli
    @scripts/cli_roundtrip.sh

This is mechanical — no skill needed. It runs in CI automatically.

Why NOT a skill

A skill adds nothing here. This is a deterministic for loop over commands with pass/fail checking. Skills are for workflows requiring judgment or interaction. An LLM cannot improve on test $result = "Or(true)".

Bugs this would catch

Issue Bug Unit test catches? ILP witness test catches? CLI roundtrip catches?
#1000 3Part→SeqMinWeightTardiness No Yes Yes
#1001 3Part→JobShopScheduling No Yes Yes
#1002 HP→ConsecutiveOnesSubmatrix No Yes Yes
#1003 3Part→FlowShopScheduling No Yes Yes
#1004 Missing examples No No Yes (create fails)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions