Fix non-deterministic behavior and incorrect variable selection in Carpanzano tearing algorithm#72
Conversation
- sort(collect(active_eqs)) in find_single_solvable_eq! and
carpanzano_tear_scc! to eliminate hash-order-dependent iteration
over Set{Int} active_eqs and active_vars.
- Fix Heuristic 2 in carpanzano_tear_scc!: max_incidence_cnt and
min_solvable_cnt were never updated inside the loop body, causing
alg_var to be overwritten every iteration (cnt > typemin(Int) is
always true) and ending up as the last hash-order variable rather
than the one with maximum incidence / minimum solvable count.
Add the two missing tracker assignments.
2. Add unit test for carpanzano_tearing Heuristic 2 correctness
Test calls carpanzano_tear_scc! directly with a minimal 3x3 bipartite
graph where all edges are solvable (bypassing Heuristic 1) and v2 has
incidence 3 (vs v1=2 and v3=2). Verifies that Heuristic 2 correctly
selects v2 as the algebraic/torn variable.
Before the fix, max_incidence_cnt was never updated inside the Heuristic 2
loop so the last variable in hash-iteration order always 'won', making the
result non-deterministic and incorrect.
Clean up comments
Remove comments and adjust struct definition.
AayushSabharwal
left a comment
There was a problem hiding this comment.
Thank you for spotting the error and finding the fix. The review comments are for some performance concerns, but otherwise this looks great.
| nbors = nbors_buffer | ||
| for ieq in active_eqs | ||
| # Sort active_eqs iteration for deterministic equation selection regardless of hash order | ||
| for ieq in sort(collect(active_eqs)) |
There was a problem hiding this comment.
This would also be resolved if active_eqs is made an OrderedSet{Int} instead of Set{Int}, since then the equations are always iterated over in insertion order. This approach avoids two allocations from collect and sort.
There was a problem hiding this comment.
Thanks for reviewing, Aayush!
I considered that direction as well, but in this code path active_eqs / active_vars are populated from SCC data that itself comes from Set/Dict-backed structures, so their insertion order is already hash-dependent. If we simply switch to OrderedSet{Int}, we'd just be preserving that hash-dependent insertion order and would still have the same non-determinism problem.
To make OrderedSet a drop-in replacement here, we'd also need to ensure those sets are constructed in a deterministic way (e.g. inserting elements in sorted order during SCC decomposition), which is a larger refactor touching the callers and the SCC building logic.
Given that, I went with the explicit sort(collect(...)) as the minimal, local fix that:
- removes the hash-order dependence unconditionally, and
- makes the tie-breaking behavior (lowest index first) explicit and obvious at the call site.
An OrderedSet-based approach would be a worthwhile follow-up optimization, but only after auditing and sorting all insertion sites in the SCC decomposition.
There was a problem hiding this comment.
I went back and traced the SCC construction more carefully, and I think your suggestion is actually safe to apply directly. I commited the suggested changes.
| min_incidence_cnt = typemax(Int) | ||
| for ieq in active_eqs | ||
| # Sort active_eqs for deterministic tie-breaking regardless of hash order | ||
| for ieq in sort(collect(active_eqs)) |
There was a problem hiding this comment.
This can also be removed if active_eqs = OrderedSet{Int}()
| min_solvable_cnt = typemax(Int) | ||
| for ivar in active_vars | ||
| # Sort active_vars for deterministic selection; also fix missing max/min updates | ||
| for ivar in sort(collect(active_vars)) |
There was a problem hiding this comment.
The same applies for active_vars
|
Will merge after analyzing the new |
|
Thanks for the fix! |
Summary
This PR fixes two related issues in
carpanzano_tearing.jlthat caused non-deterministic behavior and incorrect variable selection in the Carpanzano tearing algorithm.Bugs Fixed
1. Heuristic 2 — tracker variables never updated (logic bug)
In
carpanzano_tear_scc!, Heuristic 2 selects the algebraic (torn) variable with maximum incidence. However, the tracking variablesmax_incidence_cntandmin_solvable_cntwere never updated inside the loop:Because
max_incidence_cntremained attypemin(Int), every variable satisfied the condition. As a result, the last iterated variable was always selected, ignoring incidence counts entirely.This PR fixes the issue by correctly updating both tracking variables within the loop.
2. Non-determinism from hash-order iteration
Several loops iterated over
Set{Int}collections (active_eqs,active_vars), whose iteration order depends on hash values and may vary across Julia versions and runs.This resulted in non-reproducible tearing decisions, since the heuristics iterate over these sets and pick the "first" element satisfying a condition (single solvable eq, min-incidence eq, max-incidence var).
This PR fixes the issue at the data-structure level by switching
active_varsandactive_eqsfromSet{Int}toOrderedSet{Int}.OrderedSetiterates in insertion order, which is itself deterministic here because the sets are populated by walkingfind_var_sccs(...)over a bipartite graph whose adjacency lists areVector{Int}(not hash-backed) and aVector-backedMatching— so no upstream changes are required.Changes
(alg::CarpanzanoTearing)(structure::SystemStructure):active_vars = Set{Int}()→active_vars = OrderedSet{Int}()active_eqs = Set{Int}()→active_eqs = OrderedSet{Int}()find_single_solvable_eq!: relax parameter types fromSet{Int}toAbstractSet{Int}so the helper accepts bothSetandOrderedSet(required becauseOrderedSet{Int}is a sibling ofSet{Int}, not a subtype — both are<: AbstractSet{Int}).carpanzano_tear_scc!(Heuristic 2):max_incidence_cntandmin_solvable_cntinside the selection loop.using OrderedCollections: OrderedSet(or equivalent import) where needed.No public API changes.
Tests Added
carpanzano_tearing.jl) that directly callscarpanzano_tear_scc!v2has incidence 3v1andv3have incidence 2v2as the algebraic variableResult:
Impact