test: cross-engine regression coverage for unq on number lists#227
Merged
Conversation
Locks in the fix from 5a30e00 (unq list dedup via nanval_equal) across tree, VM, and Cranelift engines. The original persona report claimed len unq xs hung with exit 137 on a L n of meaningful size; cannot reproduce on current main, but adding coverage so any future drift back to raw-bits comparison or quadratic blowup gets caught. Covers: persona repro (len unq [1,2,2,3] = 3), order preservation, all-same, empty, all-unique, floats, negatives/zero, and a 1000-element stress test.
Demonstrates unq across the shapes the regression test pins: basic dedup, all-same, empty, all-unique, floats, negatives. The examples_engines harness asserts identical output across tree/VM/Cranelift, giving us a second line of coverage on top of the cross-engine regression test.
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The 2026-05-11 edu-teacher persona report (lines 391 / 586 of
ilo_assessment_feedback.md) saidlen unq xshung with exit 137 / OOM on aL nof meaningful size. I went looking for the bug and could not reproduce it onmain— at any size from[1,2,2,3]up to 100k all-unique elements, on any engine, in any path. The actual fix landed back in5a30e00(2026-03-06) which switched the list dedup path from raw-bits comparison tonanval_equal. That commit pre-dates the persona report by two months — they were almost certainly running a stale binary (their own entry #9 in the same batch documents~/.cargo/bin/ilobeing out of date and needing reinstall).No code change here. The substantive fix is already in main. What was missing was lock-in coverage for
unqon number lists specifically —tests/eval_inline.rscovered the text case but not the number-list case, so any future regression back to raw-bits comparison, RC mishandling, or quadratic blowup would slip through.This PR adds that coverage across all three engines.
Repro
Before (claimed):
Now (verified on current main):
Scaling check: 100k all-unique numbers → 100000, ~42ms. Linear in practice.
What's in the diff
Two commits, both additive:
test: cross-engine regression for unq on number lists— newtests/regression_unq_numbers.rswith 24 tests (8 cases × 3 engines): persona repro, order preservation, all-same, empty, all-unique, floats, negatives/zero, and a 1000-element stress test that catches quadratic-blowup OOM.examples: unq on number lists— newexamples/unq-numbers.iloexercising the same shapes through theexamples_enginesharness for a second line of cross-engine coverage, and as an in-context example for future agents.No production source touched.
Test plan
cargo fmtcleancargo clippy --release --features cranelift --tests --all-targetscleancargo test --release --features cranelift --test regression_unq_numbers→ 24/24 passcargo test --release --features cranelift --test examples_engines→ passcargo test --release --features craneliftfull suite green (2861 lib + all integration suites, 0 failures)Follow-ups
The
unqlist path is O(n²) (innerout.iter().any(nanval_equal)scan) but scales linearly in practice up to 100k elements thanks to auto-vectorisation of the equality compare. A hash-keyed rework would be a future optimisation, not a fix — no live bug forcing it, manifesto win is theoretical. Deferred.After merge, the entry will move out of
🛠 In Progressinto✅ Addressedciting5a30e00+ this PR.