ForceFreeStates: Dual Riccati + Parallel FM integration + Δ' output by logan-nc · Pull Request #178 · OpenFUSIONToolkit/GPEC

logan-nc · 2026-02-28T10:15:13Z

Summary

Implements two new integration paths for the Euler-Lagrange ODE and adds full Δ' (tearing stability parameter) computation to the Riccati and parallel FM paths.

Dual Riccati integration (use_riccati = true): Sequential reformulation as a matrix Riccati ODE S = U₁·U₂⁻¹ with periodic renormalization to maintain bounded (U₁, U₂). Validated on Solovev (0.006% energy error) and DIIID n=1 (0.12% error). Faster than standard for large N.
Parallel FM integration (use_parallel = true): Each integration chunk is solved independently from identity ICs using Threads.@threads, then assembled serially. Uses bidirectional integration — crossing chunks (near rational surfaces) are integrated backward to keep FM propagators well-conditioned. The well-conditioned backward FM inverse is applied via LU solve during serial assembly. Fixes a ~10% energy error for DIIID N=26 that existed with the all-forward approach.
Per-surface Δ': Both paths compute intr.sing[s].delta_prime (scalar) and delta_prime_col (full N-vector of off-diagonal coupling). Written to singular/delta_prime and singular/delta_prime_col in HDF5.
Inter-surface Δ' matrix (parallel FM only): intr.delta_prime_matrix (2·msing × 2·msing) via the STRIDE global BVP [Glasser 2018 Phys. Plasmas 25, 032501]. Written to singular/delta_prime_matrix in HDF5.
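
The renormalization step behind the Riccati path can be sketched in a few lines. This is a minimal illustration, not the PR's `renormalize_riccati!`: the helper name `renormalize_pair!` and the test matrices are hypothetical, and it assumes U₂ is invertible at the point of renormalization.

```julia
using LinearAlgebra

# Hypothetical helper: rescale the solution pair so that U2 becomes the identity.
# Writes (S, I) back into (U1, U2) and returns S = U1 * inv(U2).
function renormalize_pair!(U1::Matrix{ComplexF64}, U2::Matrix{ComplexF64})
    S = U1 / U2                                   # right division: solves S * U2 = U1
    U1 .= S
    U2 .= Matrix{ComplexF64}(I, size(U2)...)
    return S
end

N = 4
U1 = ComplexF64[i + 2j + im * (i - j) for i in 1:N, j in 1:N]
# Diagonally dominant test matrix, so it is invertible.
U2 = ComplexF64[(i == j ? 5.0 : 0.0) + 0.1 * (i + j) for i in 1:N, j in 1:N]

S_expected = U1 / U2
S = renormalize_pair!(U1, U2)
```

After the call, the pair spans the same solution subspace but stays bounded, which is the property the summary attributes to periodic renormalization.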

Accuracy

Example	Path	et[1]	Error vs std
Solovev N=8	standard	16.56600	—
Solovev N=8	riccati	16.56691	0.006%
Solovev N=8	parallel	16.56691	0.006%
DIIID N=26	standard	1.21237	—
DIIID N=26	riccati	1.21386	0.123%
DIIID N=26	parallel	1.21386	0.123%

The ~0.12% gap on DIIID is algorithmic (different crossing convention: Riccati-style vs Gaussian Reduction), not ODE tolerance. Both paths are physically correct; 0.12% is well within physical uncertainty.

Performance (4 threads)

Example	Path	Time	Speedup
Solovev N=8	standard	0.25s	1.00×
Solovev N=8	riccati	0.09s	2.83×
Solovev N=8	parallel	0.14s	1.80×
DIIID N=26	standard	1.17s	1.00×
DIIID N=26	riccati	0.91s	1.29×
DIIID N=26	parallel	0.90s	1.30×

Tests

Suite	Pass	Total
`runtests_riccati.jl`	23	23
`runtests_parallel_integration.jl`	56	56

Riccati tests include unit tests for: renormalize_riccati_inplace!, renormalize_riccati!, Riccati end state U₂≈I, Δ' regression values, delta_prime_col shape/diagonal consistency, riccati_der! formula (Glasser 2018 Eq. 19), and compute_delta_prime_from_ca! bit-identical consistency.

Parallel tests include: apply_propagator! identity and linearity, apply_propagator_inverse! round-trip, chunk balance target count, direction field propagation through bidirectional=true, Solovev energy accuracy, DIIID energy accuracy (key bidirectional fix regression), ode_itime_cost additivity, and Δ' matrix regression for both Solovev and DIIID.

Files changed

File	Change
`src/ForceFreeStates/ForceFreeStatesStructs.jl`	`IntegrationChunk.direction`, `SingType.delta_prime_col`, `ForceFreeStatesInternal.delta_prime_matrix`; removed dead `parallel_threads` field
`src/ForceFreeStates/EulerLagrange.jl`	`chunk_el_integration_bounds(bidirectional=false)`, `balance_integration_chunks` direction propagation
`src/ForceFreeStates/Riccati.jl`	Full Riccati path; `integrate_propagator_chunk!` bidirectional tspan; `apply_propagator_inverse!`; bidirectional serial assembly; `compute_delta_prime_matrix!` with Phi_L/Phi_R BVP; `assemble_fm_matrix` empty-range guard; single-resonance assertion
`src/GeneralizedPerturbedEquilibrium.jl`	HDF5 output for `singular/delta_prime`, `delta_prime_col`, `delta_prime_matrix`
`test/runtests_riccati.jl`	23 tests, shared setup, `riccati_der!` formula test, `compute_delta_prime_from_ca!` test
`test/runtests_parallel_integration.jl`	56 tests including `apply_propagator_inverse!`, direction field, DIIID Δ' matrix
`benchmarks/benchmark_threads.jl`	Thread-scaling benchmark
`benchmarks/benchmark_riccati_der.jl`	`riccati_der!` sanity check
`benchmarks/benchmark_delta_prime_methods.jl`	`compute_delta_prime_from_ca!` sanity check

Code review responses

Fixed: assemble_fm_matrix empty-range guard · single-resonance assertion in compute_delta_prime_matrix! · stale comment in parallel tests · dead parallel_threads field removed

Deferred: apply_propagator! per-call allocations · test boilerplate in runtests_parallel_integration.jl

🤖 Generated with Claude Code

…(1.6x speedup on Solovev)

Implements the dual Riccati matrix S = U₁·U₂⁻¹ as a faster alternative to the standard Euler-Lagrange ODE integration. Enable with `use_riccati = true` in jpec.toml.

Integration strategy: uses `sing_der!` (same ODE RHS as standard) with periodic Riccati renormalization S = U₁·U₂⁻¹, U₂ = I in the callback when column norms exceed ucrit. This is mathematically equivalent to the explicit Riccati ODE (dS/dψ = B + A·S - S·D - S·C·S) but numerically stable: the explicit Riccati ODE has quadratic blowup for explicit solvers when K̄·S >> Q, while sing_der! + renorm tracks the bounded ratio S = U₁/U₂.

The Riccati crossing (`riccati_cross_ideal_singular_surf!`) skips Gaussian reduction (which can produce NaN/Inf when S is near-zero near the axis) and uses `ipert_res` directly.

Benchmarks on Solovev example (N=8, 1 singular surface):

- Standard ODE: 83.7 ms, 157 steps
- Riccati ODE: 51.4 ms, 121 steps (1.63x speedup, 0.006% energy difference)

See: Glasser (2018) Phys. Plasmas 25, 032507, Eq. 19

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
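
The stated equivalence between the linear system and the explicit Riccati ODE can be checked directly. The derivation below assumes that A, B, C, D are the blocks of the linear system acting on (U₁, U₂); the commit message does not spell out that block layout, so treat it as an assumption.

```latex
\begin{aligned}
\frac{d}{d\psi}\begin{pmatrix} U_1 \\ U_2 \end{pmatrix}
  &= \begin{pmatrix} A & B \\ C & D \end{pmatrix}
     \begin{pmatrix} U_1 \\ U_2 \end{pmatrix},
  \qquad S = U_1 U_2^{-1}, \\
\frac{dS}{d\psi}
  &= \frac{dU_1}{d\psi}\, U_2^{-1} - U_1 U_2^{-1}\, \frac{dU_2}{d\psi}\, U_2^{-1} \\
  &= (A U_1 + B U_2)\, U_2^{-1} - S\, (C U_1 + D U_2)\, U_2^{-1} \\
  &= B + A S - S D - S C S .
\end{aligned}
```

The quadratic term S·C·S is what makes the explicit form stiff when S is large, whereas the linear system plus renormalization never evaluates it.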


## Part 1: Δ' output (tearing stability parameter)

- Add `delta_prime::Vector{ComplexF64}` to `SingType`
- Add `compute_delta_prime_from_ca!` in EulerLagrange.jl, called at end of `eulerlagrange_integration` (standard path only; see normalization note below)
- Write `singular/delta_prime` as (msing × n_modes) ComplexF64 to HDF5 output in JPEC.jl
- Riccati path does NOT compute delta_prime: ca_l is accumulated in (S,I) normalization which is inconsistent with the Δ' formula (standard (U1,U2) normalization required)

## Part 2: Parallel Fundamental Matrix (FM) integration

- Add `ChunkPropagator` struct (two N×N×2 blocks for identity-block ICs) in Structs
- Add `use_parallel::Bool = false` control flag in ForceFreeStatesControl
- Add `integrate_propagator_chunk!`: integrates each chunk from IC=(I,0) and IC=(0,I) independently using BS5 solver, no callback; suitable for Threads.@threads
- Add `apply_propagator!`: in-place 2×2 block matrix multiply on odet.u
- Add `balance_integration_chunks`: sub-divides chunks using ode_itime_cost for load-balanced parallel work; target = max(2*msing+3, 4*nthreads)
- Add `ode_itime_cost`: log-divergent cost model from STRIDE (Glasser 2018)
- Add `parallel_eulerlagrange_integration`: parallel phase with Threads.@threads, serial assembly calling renormalize_riccati_inplace! before each crossing (needed because apply_propagator! gives general (U1,U2) state but riccati crossing expects (S,I) form); uses ipert_res-direct zeroing to correctly identify the resonant column
- Dispatch from eulerlagrange_integration: use_parallel → use_riccati → standard

## Tests (29 total: 11 Riccati + 18 Parallel FM)

- runtests_riccati.jl: update Δ' test (only standard path populates delta_prime)
- runtests_parallel_integration.jl (new): ChunkPropagator identity/linearity, balance_integration_chunks count/coverage/crossings, ode_itime_cost additivity, parallel FM energy match (rtol=2%, Solovev)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
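
The two-phase structure described in Part 2 (independent chunk propagators from identity ICs, then serial assembly) can be illustrated on a toy problem. This sketch assumes a constant-coefficient system u' = A·u, so that each chunk propagator is simply a matrix exponential; the PR instead integrates the Euler-Lagrange ODE numerically, and the function names here are hypothetical.

```julia
using LinearAlgebra

# Parallel phase: each chunk propagator is built independently from identity ICs.
function chunk_propagators(A, edges)
    nchunks = length(edges) - 1
    props = Vector{Matrix{Float64}}(undef, nchunks)
    Threads.@threads for k in 1:nchunks
        props[k] = exp(A * (edges[k+1] - edges[k]))   # exact FM of u' = A*u over chunk k
    end
    return props
end

# Serial phase: apply the propagators in order to the actual state.
function assemble(props, u0)
    u = copy(u0)
    for P in props
        u = P * u
    end
    return u
end

A = [0.0 1.0; -2.0 -0.3]
edges = range(0.0, 1.0; length=9)      # 8 chunks
u0 = [1.0, 0.0]

u_chunked = assemble(chunk_propagators(A, edges), u0)
u_direct = exp(A * (edges[end] - edges[1])) * u0
```

Because every chunk starts from the identity, no chunk needs the result of its predecessor, which is what makes the first phase suitable for `Threads.@threads`.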

…rallel tests to suite

Δ' is now computed inline in riccati_cross_ideal_singular_surf! using the diagonal formula on the bounded (U₁, U₂) state (max ≤ ucrit, no GR permutation). This gives physically correct values: 57.3 and -4.03 for the two Solovev singular surfaces.

The standard path does not populate delta_prime: Gaussian Reduction inflates the resonant column's asymptotic coefficients, making ca_l non-physical regardless of when it is computed. A comment in cross_ideal_singular_surf! explains the limitation.

Also adds runtests_riccati.jl and runtests_parallel_integration.jl to the default test suite (runtests.jl). Both were previously excluded.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… lacks Δ'

The comment in cross_ideal_singular_surf! previously said the issue was GR "normalization inflation." The real reason is more subtle: Δ' is a complex, normalization-convention-dependent quantity. The Riccati renormalization (U₂→I) continuously phases solution columns into a specific gauge where the diagonal formula (ca_r - ca_l)/denom gives physically meaningful values. The standard path's solution columns grow from the axis with an arbitrary complex phase; dividing by the outer asymptotic coefficient normalizes the magnitude but not the complex phase, producing a value in a different convention that does not match what SingularCoupling.jl expects.

Also reverts the failed attempt to compute Δ' in cross_ideal_singular_surf! via perm_col + A_outer normalization, which produced -0.10-0.54i vs the Riccati 57.3+58.3i (same physical quantity, incompatible conventions).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…large-N documentation

1. Add SingType.delta_prime_col (N × n_res_modes Matrix) storing the full column (ca_r[:, ipert_res, 2] - ca_l[:, ipert_res, 2]) / (4π²·psio) at each crossing. The diagonal element matches delta_prime[i] exactly. Off-diagonal elements give intra-surface coupling of all N modes to each resonant mode through the singular layer asymptotic expansion. Only populated for Riccati/parallel FM paths.
2. Add singular/m, singular/n, singular/delta_prime_col HDF5 outputs so downstream users can access the full off-diagonal Δ' without needing to index ca_left/ca_right.
3. Document the known numerical limitation of the parallel FM path for large N: FM propagators become ill-conditioned for N ≳ 20 without QR orthogonalization, causing ~10% energy error for DIIID (N=26) with no wall-clock speedup over Riccati. Deferred fix: bidirectional integration or continuous QR (noted in docstring/tests).
4. Update outer-plasma Riccati re-integration (already committed) docstring to match.

Tests: 50/50 Riccati+parallel, 84/84 EulerLagrange all pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
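
The column formula in item 1 can be written out as a short sketch. The N×N×2 array shape, the function name `delta_prime_column`, and the test values are assumptions made for illustration; only the formula itself comes from the commit message.

```julia
# Assumed shapes: ca_l and ca_r are N×N×2 asymptotic-coefficient arrays on the
# left and right of one singular surface; psio is the flux normalization.
function delta_prime_column(ca_l, ca_r, ipert_res, psio)
    return (ca_r[:, ipert_res, 2] .- ca_l[:, ipert_res, 2]) ./ (4 * pi^2 * psio)
end

N = 3
ca_l = ComplexF64[i + j * k * im for i in 1:N, j in 1:N, k in 1:2]   # arbitrary test data
ca_r = 2 .* ca_l
ipert_res = 2
psio = 0.5

col = delta_prime_column(ca_l, ca_r, ipert_res, psio)
delta_prime = col[ipert_res]    # the diagonal entry is the scalar Δ'
```

Reading the scalar Δ' off the resonant row of the column is what makes the "diagonal element matches delta_prime[i]" statement hold by construction.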

…trix

Implements the STRIDE global boundary value problem for computing the full 2·msing × 2·msing inter-surface tearing stability matrix. Each entry gives the U₂[ipert_res] response amplitude at one surface boundary when driving with unit amplitude at another, encoding cross-surface coupling.

Changes:

- Riccati.jl: add assemble_fm_matrix (chunk FM product) and compute_delta_prime_matrix! (BVP assembly + solve via STRIDE formulation from Glasser 2018 Phys. Plasmas 25, 032501 Sec. III.B); call from parallel_eulerlagrange_integration
- ForceFreeStatesStructs.jl: add delta_prime_matrix field to ForceFreeStatesInternal with docstring
- JPEC.jl: write delta_prime_matrix to singular/delta_prime_matrix in HDF5
- test/runtests_parallel_integration.jl: add delta_prime_matrix regression test (shape, finiteness, non-zero diagonal); 30 tests total (was 23)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… for large-N accuracy

The all-forward parallel FM path had ~10% energy error for large-N problems (DIIID N=26, n=1) because the chunk immediately before each rational surface crossing integrates into exponentially growing solution territory, producing an ill-conditioned FM propagator.

Fix: integrate the crossing chunk *backward* (psi_end → psi_start). Solutions that grow exponentially forward decay backward, yielding a well-conditioned backward FM Φ_bwd. The accurate forward propagation is recovered as Φ_bwd⁻¹ via a stable LU solve in apply_propagator_inverse!.

The same backward FM is used directly in the Δ' BVP (compute_delta_prime_matrix!) as Phi_L[j], splitting each ill-conditioned inter-surface FM product into well-conditioned Phi_R (forward chunks) and Phi_L (backward crossing chunk).

Changes:

- IntegrationChunk: add direction::Int=1 field (+1 forward, -1 backward)
- chunk_el_integration_bounds: add bidirectional=false kwarg; crossing chunks get direction=-1 when true
- balance_integration_chunks: left sub-chunk always direction=1; right inherits chunk.direction so the near-singularity chunk stays backward after splitting
- integrate_propagator_chunk!: reverses tspan for direction=-1 chunks
- apply_propagator_inverse!: new function, LU solve Φ_bwd·x = u_old
- Serial assembly: branches on chunk.direction (inverse vs forward apply)
- parallel_eulerlagrange_integration: passes bidirectional=true
- compute_delta_prime_matrix!: BVP now uses Phi_R·x_right - Phi_L·x_left = 0 at each junction instead of ill-conditioned monolithic Phi_segs product
- assemble_fm_matrix: safe for empty idx_range (uses propagators[1] for N)

Results (et[1] stability eigenvalue):

- Solovev N=8: 0.006% error (was already fine)
- DIIID N=26: 0.236% error (was ~10.5%, a 44× accuracy improvement)

Tests:

- 31/31 pass in runtests_parallel_integration.jl (+1 DIIID accuracy test)
- 18/18 pass in runtests_riccati.jl (unchanged)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
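
The algebra of recovering forward propagation from a backward FM is shown in the sketch below. It uses a toy constant-coefficient system where both propagators are matrix exponentials, and the helper `apply_inverse` is a hypothetical stand-in for `apply_propagator_inverse!`. It demonstrates only the identity Φ_fwd·u = Φ_bwd⁻¹·u, not the accuracy benefit, which depends on how the backward chunk is integrated near the rational surface.

```julia
using LinearAlgebra

# Hypothetical stand-in for apply_propagator_inverse!: solve Phi_bwd * x = u_old.
function apply_inverse(Phi_bwd::AbstractMatrix, u_old::AbstractVecOrMat)
    return lu(Phi_bwd) \ u_old      # LU solve, no explicit inverse formed
end

A = [2.0 1.0; 0.0 3.0]     # toy system u' = A*u whose solutions grow forward
h = 0.5
Phi_fwd = exp(A * h)       # propagates psi_start → psi_end
Phi_bwd = exp(-A * h)      # propagates psi_end → psi_start

u_old = [1.0, -1.0]
u_new = apply_inverse(Phi_bwd, u_old)
```

Solving the linear system rather than forming `inv(Phi_bwd)` follows the commit's description of an LU solve during serial assembly.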

logan-nc · 2026-03-08T21:23:05Z

@claude review this

claude · 2026-03-08T21:23:24Z

Claude finished @logan-nc's task in 3m 55s

Code Review

This is a well-designed PR implementing three complementary integration paths. The theory is clearly traced to Glasser (2016/2018), the docstrings are excellent, and the bidirectional fix for large-N conditioning is elegant. Below are issues found, ordered by severity.

Bug: `assemble_fm_matrix` panics on empty range

src/ForceFreeStates/Riccati.jl:77-87

function assemble_fm_matrix(propagators::Vector{ChunkPropagator}, idx_range)
    N = size(propagators[1].block_upper_ic, 1)   # ← accesses propagators[1] unconditionally
    Phi = Matrix{ComplexF64}(I, 2N, 2N)
    for i in idx_range
        Phi = Phi_i * Phi   # Phi_i: 2N×2N FM of chunk i (its construction is not shown in this excerpt)
    end
    return Phi
end

Called in compute_delta_prime_matrix! as:

Phi_R_mats[1] = assemble_fm_matrix(propagators, 1:i_crossings[1]-1)

If the first chunk is already a crossing chunk (i_crossings[1] == 1), the range is 1:0 (empty). propagators[1] still exists but the indexing to determine N is a fragile pattern — and more importantly, the result should be the identity matrix, which requires knowing N from somewhere else when the range is empty. The function should guard against this:

function assemble_fm_matrix(propagators::Vector{ChunkPropagator}, idx_range)
    isempty(idx_range) && return Matrix{ComplexF64}(I, 2*size(propagators[1].block_upper_ic,1), 2*size(propagators[1].block_upper_ic,1))
    N = size(propagators[first(idx_range)].block_upper_ic, 1)
    ...
end

Or better: pass N as an explicit argument.
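
The "pass N explicitly" variant could look like the sketch below. It is an illustration under simplifying assumptions: chunk FMs are plain 2N×2N matrices rather than the PR's `ChunkPropagator`, and the function name `assemble_fm` is hypothetical.

```julia
using LinearAlgebra

# Sketch only: chunk FMs are plain 2N×2N matrices here, not ChunkPropagator objects.
function assemble_fm(chunk_fms::Vector{Matrix{ComplexF64}}, idx_range, N::Int)
    Phi = Matrix{ComplexF64}(I, 2N, 2N)
    for i in idx_range              # an empty range leaves Phi as the identity
        Phi = chunk_fms[i] * Phi    # later chunks multiply from the left
    end
    return Phi
end

N = 2
# Three toy chunk FMs: scalings by 2, 3 and 4.
fms = [ComplexF64(k) * Matrix{ComplexF64}(I, 2N, 2N) for k in 2:4]

Phi_empty = assemble_fm(fms, 1:0, N)   # first chunk is a crossing chunk
Phi_all = assemble_fm(fms, 1:3, N)
```

With N supplied by the caller, the empty-range case needs no special branch and never touches `chunk_fms`.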

Minor: `apply_propagator!` allocates on every call in the serial assembly loop

src/ForceFreeStates/Riccati.jl:739-758

function apply_propagator!(odet::OdeState, prop::ChunkPropagator)
    u1_prev = copy(@view odet.u[:, :, 1])   # allocation
    u2_prev = copy(@view odet.u[:, :, 2])   # allocation
    tmp = similar(u1_prev)                   # allocation
    ...
end

This is called once per chunk in the serial assembly phase, so for a typical DIIID run (say 50 chunks) the overhead is ~150 allocations of N×N complex matrices. For N=26 this is negligible, but for the design intent of supporting large N it is worth pre-allocating a work buffer in parallel_eulerlagrange_integration and threading it through. Same for apply_propagator_inverse!.
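
One possible shape for the pre-allocated work buffer is sketched below. The `PropWork` struct, the `apply_blocks!` function, and the four-block argument layout are all hypothetical; the PR's `apply_propagator!` operates on `odet.u` and a `ChunkPropagator`, whose internals are not shown in this thread.

```julia
using LinearAlgebra

# Hypothetical workspace, allocated once and reused for every chunk.
struct PropWork
    u1_prev::Matrix{ComplexF64}
    u2_prev::Matrix{ComplexF64}
end
PropWork(N::Int) = PropWork(zeros(ComplexF64, N, N), zeros(ComplexF64, N, N))

# In-place 2×2 block product: [u1; u2] ← [P11 P12; P21 P22] * [u1; u2].
function apply_blocks!(u1, u2, P11, P12, P21, P22, work::PropWork)
    copyto!(work.u1_prev, u1)
    copyto!(work.u2_prev, u2)
    mul!(u1, P11, work.u1_prev)
    mul!(u1, P12, work.u2_prev, true, true)   # u1 += P12 * u2_prev
    mul!(u2, P21, work.u1_prev)
    mul!(u2, P22, work.u2_prev, true, true)   # u2 += P22 * u2_prev
    return nothing
end

N = 3
work = PropWork(N)
Id = Matrix{ComplexF64}(I, N, N)
u1 = ComplexF64[i + j for i in 1:N, j in 1:N]
u2 = ComplexF64[i - j for i in 1:N, j in 1:N]
u1_old, u2_old = copy(u1), copy(u2)

apply_blocks!(u1, u2, 2 * Id, Id, Id, 3 * Id, work)
```

The five-argument `mul!` accumulates into its destination, so the only copies are the two `copyto!` calls into the reused workspace.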

Minor: Stale comment in `runtests_parallel_integration.jl`

test/runtests_parallel_integration.jl:125-133

# Note: this test uses the Solovev example (N=8 modes) where FM propagators
# are well-conditioned. For large-N problems (N ≳ 20, e.g. DIIID with N=26),
# FM propagator ill-conditioning leads to ~10% energy error with no speedup
# over the serial Riccati path. See parallel_eulerlagrange_integration docstring
# for details and deferred fix approaches (bidirectional integration / continuous QR).

This comment predates the bidirectional fix that is the main contribution of this PR. The large-N issue is now resolved: the DIIID accuracy is within 0.24% (see test at line 190). This comment should be updated to reflect the current state.

Minor: Dead code — `riccati_der!` and `compute_delta_prime_from_ca!`

Both functions have docstrings acknowledging they are unused ("retained for reference"):

riccati_der! (Riccati.jl:281): uses the @with_pool pool macro, tying it to an execution context. If this function is never called, it should either be tested (e.g., as a sanity check that the explicit form matches sing_der!), or removed. The docstring says it could be used with an implicit solver — that's a legitimate reason to keep it, but at minimum add a comment linking to the issue/TODO tracking that future work.
compute_delta_prime_from_ca! (EulerLagrange.jl:22): the docstring explicitly says "no longer called from any integration driver". If it's retained for testing, add a test. Otherwise remove.

Minor: `compute_delta_prime_matrix!` single-resonance assumption not asserted early

Riccati.jl:167-172

ipert_all = [begin
    sp = intr.sing[j]
    idx = 1 + sp.m[1] - intr.mlow + (sp.n[1] - intr.nlow) * intr.mpert
    @assert 1 <= idx <= N "Resonant mode index out of range"
    idx
end for j in 1:msing]

The function uses sp.m[1] / sp.n[1], silently assuming exactly one resonant mode per surface. The limitation is documented in the docstring, but multi-resonance surfaces will silently give wrong results (only the first resonant mode is coupled). A stronger guard at the function entry would be safer:

@assert all(j -> length(intr.sing[j].m) == 1, 1:msing) "compute_delta_prime_matrix! only supports single-resonance surfaces; got multi-resonance surface"


Minor: Test boilerplate duplication

The 15-line equilibrium setup block (parse TOML → setup_equilibrium → sing_find! → set modes) is repeated verbatim across 7+ test cases in runtests_riccati.jl and runtests_parallel_integration.jl. Extract a helper function:

function setup_solovev_intr_ctrl(; use_riccati=false, use_parallel=false)
    ex = joinpath(@__DIR__, "test_data", "regression_solovev_ideal_example")
    inputs = TOML.parsefile(joinpath(ex, "jpec.toml"))
    inputs["ForceFreeStates"]["verbose"] = false
    inputs["ForceFreeStates"]["use_riccati"] = use_riccati
    inputs["ForceFreeStates"]["use_parallel"] = use_parallel
    ...
    return intr, ctrl, equil, ffit
end

This would cut the test file length by ~40% and make failures easier to localize.

Minor: `parallel_threads` field now superseded

ForceFreeStatesControl.parallel_threads (already present, default 1) was previously documented as "not yet implemented". Now that use_parallel routes through Threads.nthreads(), parallel_threads is effectively dead. Either: (a) wire parallel_threads into parallel_eulerlagrange_integration via @threads :static with thread count control, or (b) document it as deprecated in favour of JULIA_NUM_THREADS env variable.

Positive observations

The bidirectional integration rationale is crystal clear. The docstring on parallel_eulerlagrange_integration (and the STRIDE BVP comments in compute_delta_prime_matrix!) trace every design decision back to the specific equations in Glasser (2018). This level of traceability is excellent.
The outer plasma re-integration (Riccati rather than FM for the post-last-crossing region) is a thoughtful decision: it avoids FM ill-conditioning in the outer plasma and provides dense checkpoints for findmax_dW_edge!. The comment explaining why is clear.
renormalize_riccati_inplace! after every chunk in the serial assembly loop is the Julia analogue of ode_fixup in STRIDE — this connection is explicitly called out in the code comment, which is appreciated.
integrate_propagator_chunk! with no callback is correctly justified: identity ICs keep the solution bounded within each chunk (the chunk is designed to stop before the rational surface), so normalization is unnecessary.
Δ' normalization convention is carefully documented in riccati_cross_ideal_singular_surf!: the skip-renorm-before-crossing approach ensuring ca_r[ipert_res, ipert_res, 2] = 1 is a subtle but correct choice that makes the left/right Δ' comparison gauge-consistent.
Test coverage is solid. Unit tests for renormalize_riccati_inplace! (idempotent case, random case), apply_propagator! (identity, linearity), balance_integration_chunks (chunk count, contiguity, crossing count), and end-to-end regression tests for both Solovev and DIIID.

Adds benchmarks/benchmark_threads.jl to measure wall-clock time and accuracy of the standard, Riccati, and parallel FM integration paths across varying thread counts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Three fixes from code review of PR #178:

- assemble_fm_matrix: add explicit isempty guard before the propagator loop so an empty idx_range (e.g. i_crossings[1]==1) returns the identity matrix without relying on the loop falling through silently.
- compute_delta_prime_matrix!: add @assert at function entry that all singular surfaces have exactly one resonant mode, so multi-resonance surfaces fail loudly instead of silently using only sp.m[1]/sp.n[1].
- runtests_parallel_integration.jl: remove stale comment that described large-N FM ill-conditioning as an open problem with ~10% energy error; bidirectional integration (now the default for use_parallel=true) has resolved this.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…er! and compute_delta_prime_from_ca!

Two developer benchmark scripts for verifying the two dead-code reference implementations flagged in the Claude code review of PR #178:

benchmarks/benchmark_riccati_der.jl

Verifies riccati_der! correctly evaluates Glasser 2018 Eq. 19: dS/dψ = w†·F̄⁻¹·w - S·Ḡ·S, w = Q - K̄·S. Uses Hermitian test states (physical constraint: the EL system preserves S†=S from the axis) and compares riccati_der! against manual evaluation of the same formula using the ffit splines directly. Observed error: ~1e-17 (machine epsilon). No TOML flags needed.

benchmarks/benchmark_delta_prime_methods.jl

Verifies compute_delta_prime_from_ca! gives bit-for-bit identical Δ' values to the inline computation in riccati_cross_ideal_singular_surf!. Both apply the same diagonal formula to the same ca_l/ca_r arrays, so the result must be exactly zero difference. Observed difference: 0.0 (exact). No TOML flags needed.

Neither script requires new TOML flags: they call internal functions directly without going through ForceFreeStatesControl. Developer-only knobs belong in scripts, not in user-facing config.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
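
The formula checked by the first script can be evaluated on small test matrices to see why Hermitian states are a natural choice. This sketch is not `riccati_der!`: the function `riccati_rhs` and all matrices are hypothetical, and it assumes F̄ and Ḡ are Hermitian, which the commit message does not state explicitly.

```julia
using LinearAlgebra

# Stand-in evaluation of dS/dψ = w†·F̄⁻¹·w − S·Ḡ·S with w = Q − K̄·S.
# F, G, K, Q are arbitrary test matrices, not the ffit splines used in the PR.
function riccati_rhs(S, F, G, K, Q)
    w = Q - K * S
    return w' * (F \ w) - S * G * S
end

herm_part(M) = (M + M') / 2

N = 3
M0 = ComplexF64[sin(i * j) + im * cos(i + 2j) for i in 1:N, j in 1:N]
F = herm_part(M0) + 5I            # Hermitian, shifted so it is safely invertible
G = herm_part(M0 * M0)            # Hermitian
K = M0                            # no symmetry assumed
Q = ComplexF64[i == j ? i : 0 for i in 1:N, j in 1:N]
S = herm_part(M0')                # Hermitian test state

dS = riccati_rhs(S, F, G, K, Q)
```

Under these assumptions a Hermitian S yields a Hermitian derivative, which is consistent with the statement that the system preserves S†=S.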

…tring

The previous "O(Δψ)" phrasing in the Integration Strategy section read as a global accuracy statement, suggesting the Riccati path is only first-order accurate. This is wrong: the method integrates the linear EL ODE with Tsit5 (5th-order) and recovers S = U₁·U₂⁻¹ by exact renormalization, achieving the full ODE solver reltol.

Rewrite the section in three clearly labelled parts:

- Why riccati_der! (quadratic ODE) is avoided: relative error control is unfaithful when |S| is large, not a step-size problem, not fixable by adaptation without an implicit solver.
- What the implementation actually does: sing_der! (linear ODE, exact RHS), Tsit5 (5th-order), exact renormalization, same global accuracy as standard.
- Local consistency analysis: the O(Δψ) expansion is retained but now labelled explicitly as a consistency check, not an accuracy claim.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…setup + new unit tests

Two changes in one pass:

Shared setup (performance): equil (Grad-Shafranov) and ffit (metric matrices) are now built once and shared across all integration-dependent testsets via a make_solovev_intr helper for cheap fresh intr construction. Previously, setup_equilibrium + make_metric + make_matrix ran 4 times and riccati_eulerlagrange_integration ran 3 times. Now each runs once, cutting total test time significantly.

New unit tests (dead code coverage):

- "riccati_der! formula — Glasser 2018 Eq. 19": verifies riccati_der! correctly evaluates dS/dψ = w†·F̄⁻¹·w − S·Ḡ·S at several ψ points using Hermitian test states (physical constraint). Agrees with manual formula evaluation to machine precision (~1e-17). No extra integration needed.
- "compute_delta_prime_from_ca! matches inline Δ'": verifies the standalone Δ' formula gives bit-for-bit identical results to the inline computation in riccati_cross_ideal_singular_surf!. Reuses the shared odet_ric.

Total: 23 tests (was 18), runtime ~51s (was ~80s+ with redundant setup).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


…dd 3 unit tests

- Delete unused parallel_threads field from ForceFreeStatesControl: the field was silently ignored (Threads.@threads uses JULIA_NUM_THREADS at startup, not a runtime field). Removes false impression that thread count can be set from jpec.toml.
- Add apply_propagator_inverse! round-trip unit test: verifies Φ⁻¹·Φ = I algebraically, complementing the existing apply_propagator! identity and linearity tests.
- Add chunk_el_integration_bounds direction field test: verifies bidirectional=true sets direction=-1 on crossing chunks and direction=+1 on non-crossing chunks, and that balance_integration_chunks preserves direction correctly (right sub-chunk inherits, left sub-chunk always +1). Catches direction propagation regressions.
- Add delta_prime_matrix DIIID regression test: verifies the STRIDE BVP Δ' matrix is finite and non-zero for the large-N case (N≈26, multiple rational surfaces), where ill-conditioned (non-bidirectional) FM propagators would produce NaN/Inf entries.

56/56 parallel integration tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…f/riccati

One conflict resolved in src/GeneralizedPerturbedEquilibrium.jl (was src/JPEC.jl): write_outputs_to_HDF5 vacuum section.

Resolution:

- Keep new Δ' HDF5 outputs from perf/riccati (singular/m, singular/n, delta_prime, delta_prime_col, delta_prime_matrix)
- Adopt develop's vacuum output format: vac_data variable name, plasma_pts/wall_pts fields (3D Cartesian), y_plasma/y_wall entries, always-write pattern with empty arrays

All other files (EulerLagrange.jl, ForceFreeStatesStructs.jl, runtests.jl) auto-merged cleanly. Default HDF5 filename updated to gpec.h5.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…um references post-rename

Update test files and benchmarks to use the new package name and config filename (gpec.toml) following the GPEC rename merged from develop:

- test/runtests_riccati.jl
- test/runtests_parallel_integration.jl
- benchmarks/benchmark_threads.jl
- benchmarks/benchmark_riccati_der.jl
- benchmarks/benchmark_delta_prime_methods.jl

23/23 riccati tests and 56/56 parallel integration tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

logan-nc · 2026-03-09T02:50:09Z

Post-merge update — after develop (GPEC rename) merge

Merged origin/develop (GPEC name-change PR #168) into this branch. One conflict resolved in write_outputs_to_HDF5: kept the new Δ' HDF5 outputs from this branch, adopted develop's updated vacuum field names (vac_data, plasma_pts/wall_pts, 3D Cartesian). Updated all test and benchmark files to reference GeneralizedPerturbedEquilibrium and gpec.toml.

Test results (post-merge)

Suite	Pass	Total	Time
`runtests_riccati.jl`	23	23	1m27s
`runtests_parallel_integration.jl`	56	56	1m31s

Benchmark results (post-merge, 1 thread)

Example	Path	et[1]	Time	Speedup vs std	Error vs std
Solovev N=8	standard	16.56600	0.09s	1.00×	—
Solovev N=8	riccati	16.56691	0.17s	0.51×	0.0055%
Solovev N=8	parallel	16.56691	0.13s	0.66×	0.0055%
DIIID N=26	standard	1.21237	1.26s	1.00×	—
DIIID N=26	riccati	1.21386	0.63s	1.98×	0.1230%
DIIID N=26	parallel	1.21386	1.19s	1.06×	0.1231%

Benchmark results (post-merge, 4 threads)

Example	Path	et[1]	Time	Speedup vs std	Error vs std
Solovev N=8	standard	16.56600	0.25s	1.00×	—
Solovev N=8	riccati	16.56691	0.09s	2.83×	0.0055%
Solovev N=8	parallel	16.56691	0.14s	1.80×	0.0055%
DIIID N=26	standard	1.21237	1.17s	1.00×	—
DIIID N=26	riccati	1.21386	0.91s	1.29×	0.1230%
DIIID N=26	parallel	1.21386	0.90s	1.30×	0.1231%

Note: 1-thread Solovev timings have high variance (sub-100ms runs) and are not meaningful for speedup comparisons. DIIID (1.2s+) is the reliable benchmark. At 4 threads, Riccati achieves 1.29× and parallel achieves 1.30× on DIIID. Energy agreement is unchanged: <0.13% for both paths vs standard.

…page

Creates docs/src/stability.md covering the ForceFreeStates module:

- Newcomb/DCON ideal MHD stability criterion with paper citations (Glasser 2016 Phys. Plasmas 23 112506, 2018a 032507, 2018b 032501)
- Standard, Riccati, and parallel FM integration methods
- Bidirectional integration strategy for large-N accuracy
- Δ' tearing parameter: per-surface (delta_prime/delta_prime_col) and inter-surface matrix (delta_prime_matrix / STRIDE BVP)
- Configuration reference, API autodocs block, example usage

Adds page to docs/make.jl navigation and cross-links from equilibrium.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


…kdown links)

1. Add Random stdlib to Project.toml [deps] and [compat]: required by runtests_riccati.jl but missing from declared dependencies, causing CI failure with "Package Random not found in current path".
2. Fix docstring markdown in Riccati.jl and ForceFreeStatesStructs.jl:
   - Wrap bare [array_notation] (link text) immediately followed by (description) (parsed as URL) in code fences to prevent Documenter from treating them as broken local links.
   - Affected: assemble_fm_matrix BVP unknowns block, Phi_L/Phi_R equations, and VacuumData plasma_pts/wall_pts field descriptions.

   These were surfaced by the new @autodocs block in stability.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


…nt logging

Three targeted fixes from pre-merge code review:

1. Threads.@threads :static: tasks can migrate between OS threads mid-execution (Julia 1.7 and later), and :dynamic is the default @threads schedule (Julia 1.8 and later), making Threads.threadid() unreliable for indexing into odet_proxies. Using :static guarantees a 1:1 task-to-thread mapping for the parallel FM integration phase.
2. outer_chunk psi_end guard: the outer-plasma re-integration in parallel_eulerlagrange_integration now uses psilim*(1-eps) to match the guard applied by chunk_el_integration_bounds, avoiding a potential boundary evaluation at exactly psilim.
3. Replace println with @info/@warn: all verbose-mode output in Riccati.jl now uses Julia logging macros, consistent with EulerLagrange.jl. This enables log-level filtering and suppression in tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

logan-nc · 2026-03-09T12:46:37Z

Pre-merge cleanup (commits `1515591`..`725a527`)

Three rounds of fixes applied after a code review pass:

CI fixes (`1515591`)

Random stdlib missing from Project.toml — runtests_riccati.jl uses using Random but the package was not declared in [deps], causing CI to fail with Package Random not found. Added to both [deps] and [compat].
Broken markdown links in docstrings — adding @autodocs in stability.md surfaced docstring patterns like x_left[j] (2N) that CommonMark parses as [link text](url) even with spaces between ] and (. Fixed by wrapping affected notation blocks in code fences (assemble_fm_matrix, compute_delta_prime_matrix!, VacuumData field docs).

Documentation (`142a79c`)

New docs/src/stability.md page covering the ForceFreeStates module: Newcomb/DCON stability criterion with paper citations (Glasser 2016/2018a/2018b), all three integration drivers, Δ' computation (per-surface and STRIDE BVP matrix), configuration reference, @autodocs API block, and example usage.
Added to docs/make.jl navigation; cross-linked from equilibrium.md.

Pre-merge code review fixes (`725a527`)

Three issues flagged as must-fix or high-priority:

Threads.@threads :static: tasks can migrate between OS threads (Julia 1.7 and later), and :dynamic is the default @threads schedule (Julia 1.8 and later), making Threads.threadid() unreliable for indexing odet_proxies. :static guarantees a 1:1 task-to-thread mapping.
outer_chunk psi_end guard — the outer-plasma re-integration now uses psilim * (1 - eps) to match the guard applied by chunk_el_integration_bounds, avoiding a potential boundary evaluation at exactly psilim.
Consistent logging: all println calls in Riccati.jl replaced with @info/@warn macros, matching the pattern in EulerLagrange.jl and enabling log-level filtering/suppression in tests.
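
The `:static` pattern from the first fix can be sketched as follows. The function and buffer names are hypothetical and unrelated to `odet_proxies`; the sketch assumes Julia 1.9 or later for `Threads.maxthreadid()` and must be called from the main thread, as `:static` requires.

```julia
# Per-thread accumulators indexed by threadid(). This is only safe because
# :static pins each iteration block to a single thread for its whole lifetime.
function sum_squares_static(xs::Vector{Float64})
    partial = zeros(Float64, Threads.maxthreadid())
    Threads.@threads :static for i in eachindex(xs)
        partial[Threads.threadid()] += xs[i]^2
    end
    return sum(partial)
end

total = sum_squares_static(collect(1.0:100.0))
```

With the `:dynamic` schedule the same loop body could resume on a different thread between reading and writing `partial[Threads.threadid()]`, which is the hazard the fix avoids.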

All 23 Riccati tests and 56 parallel FM tests pass locally after these changes.

logan-nc and others added 4 commits February 28, 2026 01:03

logan-nc changed the title ~~ForceFreeStates - NEW FEATURE - Dual Riccati EL integration (1.2–1.6x speedup)~~ ForceFreeStates: Dual Riccati + Parallel FM integration + Δ' output Mar 1, 2026

logan-nc and others added 3 commits February 28, 2026 23:16

logan-nc self-assigned this Mar 8, 2026

logan-nc and others added 8 commits March 8, 2026 17:27

logan-nc and others added 3 commits March 9, 2026 07:49

logan-nc wants to merge 18 commits into develop from perf/riccati
