Skip to content

fix: drop equations from scc_eqs whose matched variable is outside the SCC#82

Closed
baggepinnen wants to merge 1 commit into
mainfrom
fbc/fix-tearing-dup-state-var
Closed

fix: drop equations from scc_eqs whose matched variable is outside the SCC#82
baggepinnen wants to merge 1 commit into
mainfrom
fbc/fix-tearing-dup-state-var

Conversation

@baggepinnen
Copy link
Copy Markdown

Summary

Fixes the "Tearing internal error: lowering DAE into semi-implicit ODE failed!" thrown at reassemble.jl:499 for systems where the dummy-derivative pass leaves stale entries in full_var_eq_matching that cause get_sorted_scc to pull equations into an SCC's scc_eqs whose eq_var_matching value points at a variable outside that SCC. When this happens, two distinct SCCs end up emitting differential equations for the same state-variable index (via push!(var_ordering, diff_to_var[iv])), tripping the duplicate-state-var assertion.

The fix filters scc_eqs in get_sorted_scc to keep only equations whose matched variable is either Unassigned or inside the SCC. This drops the spurious cross-SCC overlap without affecting well-formed systems.

Root cause

In generate_system_equations!, var_ordering is filled from two push sites:

  • reassemble.jl:460 (linear-SCC inline path): push!(var_ordering, ∫iv) with ∫iv = diff_to_var[iv]
  • reassemble.jl:837 (general codegen_equation! derivative-variable branch): push!(var_ordering, diff_to_var[iv])

For a duplicate to occur in var_ordering[is_diff_eq], two distinct matched ivs must satisfy diff_to_var[iv₁] == diff_to_var[iv₂]. In the failing case this happens because the same iv = D(angles[2]) is processed twice — once in a singleton SCC matched to the joint's der_angles[2] ~ D(angles[2]) equation, and once again in a separate (large, linsol-able) SCC whose vscc was populated by get_sorted_scc pulling in that joint equation through full_var_eq_matching[v] for some v in the larger SCC's scc list.

Concretely, with the rolling-wheel reproducer below, get_sorted_scc produces:

  • scc = [1, 47, 8, 29, 45, 69, 71, 72, 73, 74, 75, 76, 138, …] (24 entries, no 2)
  • scc_eqs(final) of size 25 including eq 14
  • eq_var_matching[14] = 2, so scc_vars ends up with 2 injected at position 17 even though 2 ∉ scc

The big SCC and the singleton SCC for var 2 then both push diff_to_var[2] = 20 into var_ordering.

Reproducer

Repo: JuliaComputing/MultibodyComponents.jl, branch rolling_wheel, HEAD f531433.

using ModelingToolkit, MultibodyComponents
import DyadCompilerPasses

@named model = MultibodyComponents.tests.WheelInWorld()
ssys = MultibodyComponents.multibody(model)   # ← Tearing internal error before this PR

Stack of the failing pre-PR error:

Tearing internal error: lowering DAE into semi-implicit ODE failed!
  [2] generate_system_equations!(...) @ ModelingToolkitTearing reassemble.jl:499
  [4] DefaultReassembleAlgorithm(...)  @ ModelingToolkitTearing reassemble.jl:1184
  [6] dummy_derivative                 @ ModelingToolkit symbolics_tearing.jl:84
 [19] multibody(model::System, ...)    @ MultibodyComponents

After this PR, multibody(model) completes; the sibling SlipWheelInWorld test (which already compiled) continues to compile and integrate to t_end = 3.0 with retcode = Success.

Regression test

Not included here. The bug only surfaces with the dummy-derivative alias pattern produced by the multibody port; minimal 2D rolling-disk MTK-only models (with and without acceleration-level kinetics) compile cleanly both with and without this fix, so they don't witness the bug. The rolling-wheel reproducer above is the smallest known trigger; happy to backport a regression test if a smaller MTK-only MWE is found.

Test plan

  • mtkcompile(MultibodyComponents.tests.WheelInWorld()) succeeds (previously threw Tearing internal error).
  • MultibodyComponents.tests.SlipWheelInWorld still compiles + integrates to t_end = 3.0, retcode = Success.
  • Pkg.test("ModelingToolkitTearing") shows the same 2 pre-existing trivial_tearing! failures with and without this change — no regression.

🤖 Generated with Claude Code

…the SCC

`get_sorted_scc` builds `scc_eqs` by iterating `for v in scc` and pushing
`full_var_eq_matching[v]` if it is an `Int`. Stale entries in
`full_var_eq_matching` left behind by the dummy-derivative pass can map a
variable inside the current SCC to an equation that is *also* matched (via
`var_eq_matching`) to a different variable outside this SCC. When such an
equation enters `scc_eqs`, `generate_system_equations!` pushes the outside
variable's index into `var_ordering` from this SCC's codegen, while another
SCC simultaneously pushes the same index from its own codegen, causing the
duplicate-state assertion at `reassemble.jl:499` to fire:

    "Tearing internal error: lowering DAE into semi-implicit ODE failed!"

Filtering `scc_eqs` to keep only equations whose `eq_var_matching` target
is either `Unassigned` or inside the SCC removes the spurious cross-SCC
overlap without affecting well-formed systems.

Verified on the rolling-wheel reproducer in `JuliaComputing/MultibodyComponents.jl`,
branch `rolling_wheel` (`MultibodyComponents.tests.WheelInWorld`), which previously
threw the error during `multibody(model)` and now reaches `solve`.
Sibling `SlipWheelInWorld` (which already compiled) continues to compile
and integrate to `t_end = 3.0` with `retcode = Success`. The MTKTearing
test suite has the same 2 pre-existing `trivial_tearing!` failures both
with and without this change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@AayushSabharwal
Copy link
Copy Markdown
Member

This is an interesting bug. While the diagnosis is correct, I don't think the solution is. scc_eqs will have fewer elements than it should. The correct fix is to probably update full_var_eq_matching in state selection/tearing so that it is not stale.

@AayushSabharwal
Copy link
Copy Markdown
Member

This is very incorrect and as such I'm closing the PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants