
ForceFreeStates: Dual Riccati + Parallel FM integration + Δ' output #178

Open
logan-nc wants to merge 18 commits into develop from perf/riccati

Conversation


@logan-nc logan-nc commented Feb 28, 2026

Summary

Implements two new integration paths for the Euler-Lagrange ODE and adds full Δ' (tearing stability parameter) computation to the Riccati and parallel FM paths.

  • Dual Riccati integration (use_riccati = true): Sequential reformulation as a matrix Riccati ODE S = U₁·U₂⁻¹ with periodic renormalization to maintain bounded (U₁, U₂). Validated on Solovev (0.006% energy error) and DIIID n=1 (0.12% error). Faster than standard for large N.

  • Parallel FM integration (use_parallel = true): Each integration chunk is solved independently from identity ICs using Threads.@threads, then assembled serially. Uses bidirectional integration — crossing chunks (near rational surfaces) are integrated backward to keep FM propagators well-conditioned. The well-conditioned backward FM inverse is applied via LU solve during serial assembly. Fixes a ~10% energy error for DIIID N=26 that existed with the all-forward approach.

  • Per-surface Δ': Both paths compute intr.sing[s].delta_prime (scalar) and delta_prime_col (full N-vector of off-diagonal coupling). Written to singular/delta_prime and singular/delta_prime_col in HDF5.

  • Inter-surface Δ' matrix (parallel FM only): intr.delta_prime_matrix (2·msing × 2·msing) via the STRIDE global BVP [Glasser 2018 Phys. Plasmas 25, 032501]. Written to singular/delta_prime_matrix in HDF5.
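The periodic renormalization shared by the Riccati and parallel FM paths is simple enough to sketch. Below is an illustrative Python/NumPy sketch (the package itself is Julia); `renormalize` and `ucrit` are stand-ins for the behaviour described above, not the actual API:

```python
import numpy as np

def renormalize(U1, U2, ucrit=1e3):
    """If (U1, U2) has grown past ucrit, reset to the bounded
    representative (S, I) of the same Riccati state S = U1·U2⁻¹."""
    if max(np.abs(U1).max(), np.abs(U2).max()) > ucrit:
        S = U1 @ np.linalg.inv(U2)   # right-multiply both blocks by U2⁻¹
        return S, np.eye(U1.shape[0], dtype=U1.dtype)
    return U1, U2

# S is invariant under the reset: (U1, U2) and (U1·R, U2·R)
# encode the same Riccati matrix for any invertible R.
rng = np.random.default_rng(0)
U1 = 1e4 * rng.standard_normal((4, 4))
U2 = rng.standard_normal((4, 4)) + 4 * np.eye(4)
S_before = U1 @ np.linalg.inv(U2)
V1, V2 = renormalize(U1, U2)
S_after = V1 @ np.linalg.inv(V2)
assert np.allclose(S_before, S_after)
```

The reset keeps the integrated blocks bounded without changing the physical state, which is why it can be applied at arbitrary trigger points.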


Accuracy

Example      Path      et[1]     Error vs std
Solovev N=8  standard  16.56600
Solovev N=8  riccati   16.56691  0.006%
Solovev N=8  parallel  16.56691  0.006%
DIIID N=26   standard  1.21237
DIIID N=26   riccati   1.21386   0.123%
DIIID N=26   parallel  1.21386   0.123%

The ~0.12% gap on DIIID is algorithmic (a different crossing convention: Riccati-style vs Gaussian Reduction), not an ODE-tolerance effect. Both paths are physically correct; 0.12% is well within the physical uncertainty.


Performance (4 threads)

Example      Path      Time   Speedup
Solovev N=8  standard  0.25s  1.00×
Solovev N=8  riccati   0.09s  2.83×
Solovev N=8  parallel  0.14s  1.80×
DIIID N=26   standard  1.17s  1.00×
DIIID N=26   riccati   0.91s  1.29×
DIIID N=26   parallel  0.90s  1.30×

Tests

Suite                             Pass  Total
runtests_riccati.jl               23    23
runtests_parallel_integration.jl  56    56

Riccati tests include unit tests for: renormalize_riccati_inplace!, renormalize_riccati!, Riccati end state U₂≈I, Δ' regression values, delta_prime_col shape/diagonal consistency, riccati_der! formula (Glasser 2018 Eq. 19), and compute_delta_prime_from_ca! bit-identical consistency.

Parallel tests include: apply_propagator! identity and linearity, apply_propagator_inverse! round-trip, chunk balance target count, direction field propagation through bidirectional=true, Solovev energy accuracy, DIIID energy accuracy (key bidirectional fix regression), ode_itime_cost additivity, and Δ' matrix regression for both Solovev and DIIID.


Files changed

• src/ForceFreeStates/ForceFreeStatesStructs.jl: IntegrationChunk.direction, SingType.delta_prime_col, ForceFreeStatesInternal.delta_prime_matrix; removed dead parallel_threads field
• src/ForceFreeStates/EulerLagrange.jl: chunk_el_integration_bounds(bidirectional=false), balance_integration_chunks direction propagation
• src/ForceFreeStates/Riccati.jl: Full Riccati path; integrate_propagator_chunk! bidirectional tspan; apply_propagator_inverse!; bidirectional serial assembly; compute_delta_prime_matrix! with Phi_L/Phi_R BVP; assemble_fm_matrix empty-range guard; single-resonance assertion
• src/GeneralizedPerturbedEquilibrium.jl: HDF5 output for singular/delta_prime, delta_prime_col, delta_prime_matrix
• test/runtests_riccati.jl: 23 tests, shared setup, riccati_der! formula test, compute_delta_prime_from_ca! test
• test/runtests_parallel_integration.jl: 56 tests including apply_propagator_inverse!, direction field, DIIID Δ' matrix
• benchmarks/benchmark_threads.jl: Thread-scaling benchmark
• benchmarks/benchmark_riccati_der.jl: riccati_der! sanity check
• benchmarks/benchmark_delta_prime_methods.jl: compute_delta_prime_from_ca! sanity check

Code review responses

Fixed: assemble_fm_matrix empty-range guard · single-resonance assertion in compute_delta_prime_matrix! · stale comment in parallel tests · dead parallel_threads field removed

Deferred: apply_propagator! per-call allocations · test boilerplate in runtests_parallel_integration.jl


🤖 Generated with Claude Code

logan-nc and others added 4 commits February 28, 2026 01:03
…(1.6x speedup on Solovev)

Implements the dual Riccati matrix S = U₁·U₂⁻¹ as a faster alternative to the standard
Euler-Lagrange ODE integration. Enable with `use_riccati = true` in jpec.toml.

Integration strategy: uses `sing_der!` (same ODE RHS as standard) with periodic Riccati
renormalization S = U₁·U₂⁻¹, U₂ = I in the callback when column norms exceed ucrit. This
is mathematically equivalent to the explicit Riccati ODE (dS/dψ = B + A·S - S·D - S·C·S)
but numerically stable: the explicit Riccati ODE has quadratic blowup for explicit solvers
when K̄·S >> Q, while sing_der! + renorm tracks the bounded ratio S = U₁/U₂.
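A scalar toy problem makes the blowup argument concrete: S' = 1 + S² has the exact solution tan ψ with a pole at π/2, so an explicit step on the quadratic ODE diverges there, while the underlying linear pair (u₁, u₂) = (sin, cos) stays bounded under renormalization and recovers S = u₁/u₂ past the pole. Python/NumPy sketch, for illustration only:

```python
import numpy as np

# Linear system u1' = u2, u2' = -u1 gives u1 = sin, u2 = cos, so the
# Riccati variable S = u1/u2 = tan has a pole at psi = pi/2. Explicit
# Euler on S' = 1 + S^2 diverges there; Euler on the linear pair does
# not, and S is recovered afterwards as the bounded ratio.
h, psi_end = 1e-4, 2.0
u = np.array([0.0, 1.0])           # (u1, u2) at psi = 0
S, psi, psi_blowup = 0.0, 0.0, None
while psi < psi_end:
    u = u + h * np.array([u[1], -u[0]])
    u = u / np.abs(u).max()        # renormalize: the ratio u1/u2 is unchanged
    if psi_blowup is None:
        S = S + h * (1.0 + S * S)  # explicit quadratic Riccati step
        if abs(S) > 1e8:
            psi_blowup = psi       # quadratic path has diverged
    psi += h

assert psi_blowup is not None and psi_blowup < 1.7   # blew up near pi/2
assert np.isclose(u[0] / u[1], np.tan(psi_end), rtol=1e-2)
```

The same mechanism operates in the matrix case: the linear (U₁, U₂) system with renormalization tracks the bounded ratio straight through regions where the explicit Riccati RHS is dominated by its quadratic term.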

The Riccati crossing (`riccati_cross_ideal_singular_surf!`) skips Gaussian reduction (which
can produce NaN/Inf when S is near-zero near the axis) and uses `ipert_res` directly.

Benchmarks on Solovev example (N=8, 1 singular surface):
  Standard ODE: 83.7 ms, 157 steps
  Riccati ODE:  51.4 ms, 121 steps  (1.63x speedup, 0.006% energy difference)

See: Glasser (2018) Phys. Plasmas 25, 032507 — Eq. 19

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
## Part 1: Δ' output (tearing stability parameter)
- Add `delta_prime::Vector{ComplexF64}` to `SingType`
- Add `compute_delta_prime_from_ca!` in EulerLagrange.jl, called at end of
  `eulerlagrange_integration` (standard path only — see normalization note below)
- Write `singular/delta_prime` as (msing × n_modes) ComplexF64 to HDF5 output in JPEC.jl
- Riccati path does NOT compute delta_prime: ca_l is accumulated in (S,I) normalization
  which is inconsistent with the Δ' formula (standard (U1,U2) normalization required)

## Part 2: Parallel Fundamental Matrix (FM) integration
- Add `ChunkPropagator` struct (two N×N×2 blocks for identity-block ICs) in Structs
- Add `use_parallel::Bool = false` control flag in ForceFreeStatesControl
- Add `integrate_propagator_chunk!` — integrates each chunk from IC=(I,0) and IC=(0,I)
  independently using BS5 solver, no callback; suitable for Threads.@threads
- Add `apply_propagator!` — in-place 2×2 block matrix multiply on odet.u
- Add `balance_integration_chunks` — sub-divides chunks using ode_itime_cost for
  load-balanced parallel work; target = max(2*msing+3, 4*nthreads)
- Add `ode_itime_cost` — log-divergent cost model from STRIDE (Glasser 2018)
- Add `parallel_eulerlagrange_integration` — parallel phase with Threads.@threads,
  serial assembly calling renormalize_riccati_inplace! before each crossing (needed
  because apply_propagator! gives general (U1,U2) state but riccati crossing expects
  (S,I) form); uses ipert_res-direct zeroing to correctly identify the resonant column
- Dispatch from eulerlagrange_integration: use_parallel → use_riccati → standard
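The load-balancing idea can be sketched with a toy model. The logarithmic cost form near a rational surface and the bisect-the-costliest-chunk rule are illustrative assumptions here (`itime_cost` and `balance` are hypothetical names, not the package API), sketched in Python:

```python
import math

def itime_cost(a, b, psi_sing, eps=1e-6):
    """Toy integration-cost model: cost grows logarithmically as the
    interval [a, b] approaches the rational surface at psi_sing."""
    d = lambda psi: max(abs(psi - psi_sing), eps)
    return abs(math.log(d(b)) - math.log(d(a)))

# Additivity: splitting an interval never changes its total cost
# (on one side of the surface), so greedy bisection is safe.
a, m, b, psi_s = 0.1, 0.3, 0.5, 0.9
assert math.isclose(itime_cost(a, b, psi_s),
                    itime_cost(a, m, psi_s) + itime_cost(m, b, psi_s))

def balance(chunks, psi_s, target):
    """Repeatedly bisect the costliest chunk until `target` chunks exist."""
    while len(chunks) < target:
        i = max(range(len(chunks)),
                key=lambda k: itime_cost(*chunks[k], psi_s))
        a, b = chunks.pop(i)
        chunks[i:i] = [(a, 0.5 * (a + b)), (0.5 * (a + b), b)]
    return chunks

out = balance([(0.0, 0.89)], psi_s, target=8)
assert len(out) == 8 and out[0][0] == 0.0 and out[-1][1] == 0.89
```

Cost additivity is what makes the target count (e.g. max(2·msing+3, 4·nthreads)) achievable by splitting alone, without re-estimating the whole partition.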

## Tests (29 total: 11 Riccati + 18 Parallel FM)
- runtests_riccati.jl: update Δ' test — only standard path populates delta_prime
- runtests_parallel_integration.jl (new): ChunkPropagator identity/linearity,
  balance_integration_chunks count/coverage/crossings, ode_itime_cost additivity,
  parallel FM energy match (rtol=2%, Solovev)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rallel tests to suite

Δ' is now computed inline in riccati_cross_ideal_singular_surf! using the diagonal
formula on the bounded (U₁, U₂) state (max ≤ ucrit, no GR permutation). This gives
physically correct values: 57.3 and -4.03 for the two Solovev singular surfaces.

The standard path does not populate delta_prime — Gaussian Reduction inflates the
resonant column's asymptotic coefficients, making ca_l non-physical regardless of
when it is computed. A comment in cross_ideal_singular_surf! explains the limitation.

Also adds runtests_riccati.jl and runtests_parallel_integration.jl to the default
test suite (runtests.jl). Both were previously excluded.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… lacks Δ'

The comment in cross_ideal_singular_surf! previously said the issue was GR
"normalization inflation." The real reason is more subtle: Δ' is a complex,
normalization-convention-dependent quantity. The Riccati renormalization (U₂→I)
continuously phases solution columns into a specific gauge where the diagonal
formula (ca_r - ca_l)/denom gives physically meaningful values. The standard
path's solution columns grow from the axis with an arbitrary complex phase;
dividing by the outer asymptotic coefficient normalizes the magnitude but not the
complex phase, producing a value in a different convention that does not match
what SingularCoupling.jl expects.

Also reverts the failed attempt to compute Δ' in cross_ideal_singular_surf! via
perm_col + A_outer normalization, which produced -0.10-0.54i vs the Riccati
57.3+58.3i (same physical quantity, incompatible conventions).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@logan-nc logan-nc changed the title ForceFreeStates - NEW FEATURE - Dual Riccati EL integration (1.2–1.6x speedup) ForceFreeStates: Dual Riccati + Parallel FM integration + Δ' output Mar 1, 2026
logan-nc and others added 3 commits February 28, 2026 23:16
…large-N documentation

1. Add SingType.delta_prime_col (N × n_res_modes Matrix) storing the full column
   (ca_r[:, ipert_res, 2] - ca_l[:, ipert_res, 2]) / (4π²·psio) at each crossing.
   The diagonal element matches delta_prime[i] exactly. Off-diagonal elements give
   intra-surface coupling of all N modes to each resonant mode through the singular
   layer asymptotic expansion. Only populated for Riccati/parallel FM paths.

2. Add singular/m, singular/n, singular/delta_prime_col HDF5 outputs so downstream
   users can access the full off-diagonal Δ' without needing to index ca_left/ca_right.

3. Document the known numerical limitation of the parallel FM path for large N:
   FM propagators become ill-conditioned for N ≳ 20 without QR orthogonalization,
   causing ~10% energy error for DIIID (N=26) with no wall-clock speedup over Riccati.
   Deferred fix: bidirectional integration or continuous QR (noted in docstring/tests).

4. Update outer-plasma Riccati re-integration (already committed) docstring to match.
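The diagonal-consistency property from item 1 can be stated in a few lines of NumPy (random stand-ins for the ca_l/ca_r coefficient arrays; Julia's third index 2 becomes 1 under 0-based indexing; all names illustrative):

```python
import numpy as np

N, psio = 8, 1.7          # toy sizes; psio is a stand-in normalization
ipert_res = 3             # resonant-mode index (0-based here)
rng = np.random.default_rng(1)
ca_l = rng.standard_normal((N, N, 2)) + 1j * rng.standard_normal((N, N, 2))
ca_r = rng.standard_normal((N, N, 2)) + 1j * rng.standard_normal((N, N, 2))

# Full column: coupling of all N modes to the resonant mode through the
# singular-layer asymptotic expansion (Julia ca[:, ipert_res, 2]).
delta_prime_col = (ca_r[:, ipert_res, 1] - ca_l[:, ipert_res, 1]) \
                  / (4 * np.pi**2 * psio)

# The scalar delta_prime is the diagonal (resonant) element of the column.
delta_prime = delta_prime_col[ipert_res]
assert np.isclose(
    delta_prime,
    (ca_r[ipert_res, ipert_res, 1] - ca_l[ipert_res, ipert_res, 1])
    / (4 * np.pi**2 * psio),
)
```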

Tests: 50/50 Riccati+parallel, 84/84 EulerLagrange all pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…trix

Implements the STRIDE global boundary value problem for computing the full
2·msing × 2·msing inter-surface tearing stability matrix. Each entry gives
the U₂[ipert_res] response amplitude at one surface boundary when driving
with unit amplitude at another, encoding cross-surface coupling.

Changes:
- Riccati.jl: add assemble_fm_matrix (chunk FM product) and
  compute_delta_prime_matrix! (BVP assembly + solve via STRIDE formulation
  from Glasser 2018 Phys. Plasmas 25, 032501 Sec. III.B); call from
  parallel_eulerlagrange_integration
- ForceFreeStatesStructs.jl: add delta_prime_matrix field to
  ForceFreeStatesInternal with docstring
- JPEC.jl: write delta_prime_matrix to singular/delta_prime_matrix in HDF5
- test/runtests_parallel_integration.jl: add delta_prime_matrix regression
  test (shape, finiteness, non-zero diagonal); 30 tests total (was 23)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… for large-N accuracy

The all-forward parallel FM path had ~10% energy error for large-N problems
(DIIID N=26, n=1) because the chunk immediately before each rational surface
crossing integrates into exponentially growing solution territory, producing
an ill-conditioned FM propagator.

Fix: integrate the crossing chunk *backward* (psi_end → psi_start). Solutions
that grow exponentially forward decay backward, yielding a well-conditioned
backward FM Φ_bwd. The accurate forward propagation is recovered as Φ_bwd⁻¹
via a stable LU solve in apply_propagator_inverse!.
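The recovery step can be illustrated in isolation: when the backward propagator Φ_bwd is computed accurately (its modes decay in the backward direction), the forward action Φ_bwd⁻¹·u is obtained by a linear solve rather than by ever forming the forward FM through forward integration. Hypothetical NumPy sketch:

```python
import numpy as np

# Propagator with one strongly growing and one strongly decaying mode.
L = np.array([6.0, -6.0])                    # growth/decay exponents
V = np.array([[1.0, 1.0], [1.0, -1.0]])      # mode-mixing matrix
Vinv = np.linalg.inv(V)
Phi_fwd = V @ np.diag(np.exp(L)) @ Vinv      # what forward integration targets
Phi_bwd = V @ np.diag(np.exp(-L)) @ Vinv     # backward FM: modes decay, so a
                                             # backward integrator computes it accurately

u_old = np.array([1.0, 2.0])
# apply_propagator_inverse!-style step: solve Phi_bwd · u_new = u_old
u_new = np.linalg.solve(Phi_bwd, u_old)
assert np.allclose(u_new, Phi_fwd @ u_old, rtol=1e-8)
```

The solve reproduces the forward propagation exactly, while the error amplification of integrating the growing modes forward never enters the computation.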

The same backward FM is used directly in the Δ' BVP (compute_delta_prime_matrix!)
as Phi_L[j], splitting each ill-conditioned inter-surface FM product into
well-conditioned Phi_R (forward chunks) and Phi_L (backward crossing chunk).

Changes:
- IntegrationChunk: add direction::Int=1 field (+1 forward, -1 backward)
- chunk_el_integration_bounds: add bidirectional=false kwarg; crossing chunks
  get direction=-1 when true
- balance_integration_chunks: left sub-chunk always direction=1; right inherits
  chunk.direction so the near-singularity chunk stays backward after splitting
- integrate_propagator_chunk!: reverses tspan for direction=-1 chunks
- apply_propagator_inverse!: new function, LU solve Φ_bwd·x = u_old
- Serial assembly: branches on chunk.direction (inverse vs forward apply)
- parallel_eulerlagrange_integration: passes bidirectional=true
- compute_delta_prime_matrix!: BVP now uses Phi_R·x_right - Phi_L·x_left = 0
  at each junction instead of ill-conditioned monolithic Phi_segs product
- assemble_fm_matrix: safe for empty idx_range (uses propagators[1] for N)

Results (et[1] stability eigenvalue):
  Solovev N=8:   0.006% error (was already fine)
  DIIID   N=26:  0.236% error (was ~10.5% — 44× accuracy improvement)

Tests: 31/31 pass in runtests_parallel_integration.jl (+1 DIIID accuracy test)
       18/18 pass in runtests_riccati.jl (unchanged)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@logan-nc

logan-nc commented Mar 8, 2026

@claude review this

@logan-nc logan-nc self-assigned this Mar 8, 2026

claude bot commented Mar 8, 2026

Claude finished @logan-nc's task in 3m 55s


Code Review

This is a well-designed PR implementing three complementary integration paths. The theory is clearly traced to Glasser (2016/2018), the docstrings are excellent, and the bidirectional fix for large-N conditioning is elegant. Below are issues found, ordered by severity.


Bug: assemble_fm_matrix panics on empty range

src/ForceFreeStates/Riccati.jl:77-87

function assemble_fm_matrix(propagators::Vector{ChunkPropagator}, idx_range)
    N = size(propagators[1].block_upper_ic, 1)   # ← accesses propagators[1] unconditionally
    Phi = Matrix{ComplexF64}(I, 2N, 2N)
    for i in idx_range
        Phi = Phi_i * Phi   # Phi_i: 2N×2N FM built from propagators[i] (construction elided in excerpt)
    end
    return Phi
end

Called in compute_delta_prime_matrix! as:

Phi_R_mats[1] = assemble_fm_matrix(propagators, 1:i_crossings[1]-1)

If the first chunk is already a crossing chunk (i_crossings[1] == 1), the range is 1:0 (empty). propagators[1] still exists but the indexing to determine N is a fragile pattern — and more importantly, the result should be the identity matrix, which requires knowing N from somewhere else when the range is empty. The function should guard against this:

function assemble_fm_matrix(propagators::Vector{ChunkPropagator}, idx_range)
    isempty(idx_range) && return Matrix{ComplexF64}(I, 2*size(propagators[1].block_upper_ic,1), 2*size(propagators[1].block_upper_ic,1))
    N = size(propagators[first(idx_range)].block_upper_ic, 1)
    ...
end

Or better: pass N as an explicit argument.


Minor: apply_propagator! allocates on every call in the serial assembly loop

src/ForceFreeStates/Riccati.jl:739-758

function apply_propagator!(odet::OdeState, prop::ChunkPropagator)
    u1_prev = copy(@view odet.u[:, :, 1])   # allocation
    u2_prev = copy(@view odet.u[:, :, 2])   # allocation
    tmp = similar(u1_prev)                   # allocation
    ...
end

This is called once per chunk in the serial assembly phase, so for a typical DIIID run (say 50 chunks) the overhead is ~150 allocations of N×N complex matrices. For N=26 this is negligible, but for the design intent of supporting large N it is worth pre-allocating a work buffer in parallel_eulerlagrange_integration and threading it through. Same for apply_propagator_inverse!.
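One possible shape for the suggested fix, sketched generically in NumPy (the real code operates on odet.u and ChunkPropagator blocks; the function and buffer names here are hypothetical):

```python
import numpy as np

def apply_block_propagator(u, P, work, tmp):
    """In-place 2x2 block multiply u[:,:,k] <- P[k][0]@u_old[:,:,0] + P[k][1]@u_old[:,:,1].
    `work` (same shape as u) and `tmp` (N x N) are caller-owned scratch
    buffers, so the hot loop performs no allocations."""
    np.copyto(work, u)                             # snapshot previous state
    for k in range(2):
        np.matmul(P[k][0], work[:, :, 0], out=u[:, :, k])
        np.matmul(P[k][1], work[:, :, 1], out=tmp)
        u[:, :, k] += tmp                          # in-place accumulate
    return u

N = 4
rng = np.random.default_rng(2)
u = rng.standard_normal((N, N, 2))
P = [[rng.standard_normal((N, N)) for _ in range(2)] for _ in range(2)]
work, tmp = np.empty_like(u), np.empty((N, N))
expected0 = P[0][0] @ u[:, :, 0] + P[0][1] @ u[:, :, 1]
apply_block_propagator(u, P, work, tmp)
assert np.allclose(u[:, :, 0], expected0)
```

Allocating `work` and `tmp` once in the assembly driver and passing them through mirrors the buffer-threading pattern the review suggests.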


Minor: Stale comment in runtests_parallel_integration.jl

test/runtests_parallel_integration.jl:125-133

# Note: this test uses the Solovev example (N=8 modes) where FM propagators
# are well-conditioned. For large-N problems (N ≳ 20, e.g. DIIID with N=26),
# FM propagator ill-conditioning leads to ~10% energy error with no speedup
# over the serial Riccati path. See parallel_eulerlagrange_integration docstring
# for details and deferred fix approaches (bidirectional integration / continuous QR).

This comment predates the bidirectional fix that is the main contribution of this PR. The large-N issue is now resolved — the DIIID accuracy is within 0.24% (see test at line 190). This comment should be updated to reflect the current state.


Minor: Dead code — riccati_der! and compute_delta_prime_from_ca!

Both functions have docstrings acknowledging they are unused ("retained for reference"):

  • riccati_der! (Riccati.jl:281): uses the @with_pool pool macro, tying it to an execution context. If this function is never called, it should either be tested (e.g., as a sanity check that the explicit form matches sing_der!), or removed. The docstring says it could be used with an implicit solver — that's a legitimate reason to keep it, but at minimum add a comment linking to the issue/TODO tracking that future work.

  • compute_delta_prime_from_ca! (EulerLagrange.jl:22): the docstring explicitly says "no longer called from any integration driver". If it's retained for testing, add a test. Otherwise remove.


Minor: compute_delta_prime_matrix! single-resonance assumption not asserted early

Riccati.jl:167-172

ipert_all = [begin
    sp = intr.sing[j]
    idx = 1 + sp.m[1] - intr.mlow + (sp.n[1] - intr.nlow) * intr.mpert
    @assert 1 <= idx <= N "Resonant mode index out of range"
    idx
end for j in 1:msing]

The function uses sp.m[1] / sp.n[1], silently assuming exactly one resonant mode per surface. The limitation is documented in the docstring, but multi-resonance surfaces will silently give wrong results (only the first resonant mode is coupled). A stronger guard at the function entry would be safer:

@assert all(j -> length(intr.sing[j].m) == 1, 1:msing) "compute_delta_prime_matrix! only supports single-resonance surfaces; got multi-resonance surface"



Minor: Test boilerplate duplication

The 15-line equilibrium setup block (parse TOML → setup_equilibrium → sing_find! → set modes) is repeated verbatim across 7+ test cases in runtests_riccati.jl and runtests_parallel_integration.jl. Extract a helper function:

function setup_solovev_intr_ctrl(; use_riccati=false, use_parallel=false)
    ex = joinpath(@__DIR__, "test_data", "regression_solovev_ideal_example")
    inputs = TOML.parsefile(joinpath(ex, "jpec.toml"))
    inputs["ForceFreeStates"]["verbose"] = false
    inputs["ForceFreeStates"]["use_riccati"] = use_riccati
    inputs["ForceFreeStates"]["use_parallel"] = use_parallel
    ...
    return intr, ctrl, equil, ffit
end

This would cut the test file length by ~40% and make failures easier to localize.


Minor: parallel_threads field now superseded

ForceFreeStatesControl.parallel_threads (already present, default 1) was previously documented as "not yet implemented". Now that use_parallel routes through Threads.nthreads(), parallel_threads is effectively dead. Either: (a) wire parallel_threads into parallel_eulerlagrange_integration via @threads :static with thread count control, or (b) document it as deprecated in favour of JULIA_NUM_THREADS env variable.


Positive observations

  • The bidirectional integration rationale is crystal clear. The docstring on parallel_eulerlagrange_integration (and the STRIDE BVP comments in compute_delta_prime_matrix!) trace every design decision back to the specific equations in Glasser (2018). This level of traceability is excellent.

  • The outer plasma re-integration (Riccati rather than FM for the post-last-crossing region) is a thoughtful decision: it avoids FM ill-conditioning in the outer plasma and provides dense checkpoints for findmax_dW_edge!. The comment explaining why is clear.

  • renormalize_riccati_inplace! after every chunk in the serial assembly loop is the Julia analogue of ode_fixup in STRIDE — this connection is explicitly called out in the code comment, which is appreciated.

  • integrate_propagator_chunk! with no callback is correctly justified: identity ICs keep the solution bounded within each chunk (the chunk is designed to stop before the rational surface), so normalization is unnecessary.

  • Δ' normalization convention is carefully documented in riccati_cross_ideal_singular_surf!: the skip-renorm-before-crossing approach ensuring ca_r[ipert_res, ipert_res, 2] = 1 is a subtle but correct choice that makes the left/right Δ' comparison gauge-consistent.

  • Test coverage is solid. Unit tests for renormalize_riccati_inplace! (idempotent case, random case), apply_propagator! (identity, linearity), balance_integration_chunks (chunk count, contiguity, crossing count), and end-to-end regression tests for both Solovev and DIIID.


logan-nc and others added 8 commits March 8, 2026 17:27
Adds benchmarks/benchmark_threads.jl to measure wall-clock time and
accuracy of the standard, Riccati, and parallel FM integration paths
across varying thread counts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three fixes from code review of PR #178:

- assemble_fm_matrix: add explicit isempty guard before the propagator
  loop so an empty idx_range (e.g. i_crossings[1]==1) returns the
  identity matrix without relying on the loop falling through silently.

- compute_delta_prime_matrix!: add @assert at function entry that all
  singular surfaces have exactly one resonant mode, so multi-resonance
  surfaces fail loudly instead of silently using only sp.m[1]/sp.n[1].

- runtests_parallel_integration.jl: remove stale comment that described
  large-N FM ill-conditioning as an open problem with ~10% energy error;
  bidirectional integration (now the default for use_parallel=true) has
  resolved this.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…er! and compute_delta_prime_from_ca!

Two developer benchmark scripts for verifying the two dead-code reference
implementations flagged in the Claude code review of PR #178:

benchmarks/benchmark_riccati_der.jl
  Verifies riccati_der! correctly evaluates Glasser 2018 Eq. 19:
    dS/dψ = w†·F̄⁻¹·w - S·Ḡ·S,  w = Q - K̄·S
  Uses Hermitian test states (physical constraint: the EL system preserves
  S†=S from the axis) and compares riccati_der! against manual evaluation
  of the same formula using the ffit splines directly.
  Observed error: ~1e-17 (machine epsilon). No TOML flags needed.
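The Hermiticity constraint that the benchmark exploits follows directly from the formula. A small NumPy check (random Hermitian stand-ins for F̄, Ḡ, and the state S; Q and K̄ arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 5
def herm(a):                     # build a Hermitian matrix from a
    return a + a.conj().T
def crand():                     # random complex N x N matrix
    return rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))

F = herm(crand()) + 10 * np.eye(N)   # Hermitian, shifted for invertibility
G = herm(crand())                    # Hermitian
Q, K = crand(), crand()              # arbitrary
S = herm(crand())                    # Hermitian state, as in the benchmark

# Glasser 2018 Eq. 19:  dS/dpsi = w' F^-1 w - S G S,  w = Q - K S
w = Q - K @ S
dS = w.conj().T @ np.linalg.solve(F, w) - S @ G @ S

# Both terms are Hermitian when F, G, S are, so dS/dpsi stays Hermitian.
assert np.allclose(dS, dS.conj().T)
```

This is why Hermitian test states are the right probes: the EL system preserves S† = S from the axis, and any implementation of Eq. 19 must respect it.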

benchmarks/benchmark_delta_prime_methods.jl
  Verifies compute_delta_prime_from_ca! gives bit-for-bit identical Δ'
  values to the inline computation in riccati_cross_ideal_singular_surf!.
  Both apply the same diagonal formula to the same ca_l/ca_r arrays, so
  the result must be exactly zero difference.
  Observed difference: 0.0 (exact). No TOML flags needed.

Neither script requires new TOML flags: they call internal functions directly
without going through ForceFreeStatesControl. Developer-only knobs belong in
scripts, not in user-facing config.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tring

The previous "O(Δψ)" phrasing in the Integration Strategy section read as a
global accuracy statement, suggesting the Riccati path is only first-order
accurate. This is wrong: the method integrates the linear EL ODE with Tsit5
(5th-order) and recovers S = U₁·U₂⁻¹ by exact renormalization, achieving
the full ODE solver reltol.

Rewrite the section in three clearly labelled parts:
- Why riccati_der! (quadratic ODE) is avoided: relative error control is
  unfaithful when |S| is large, not a step-size problem, not fixable by
  adaptation without an implicit solver.
- What the implementation actually does: sing_der! (linear ODE, exact RHS),
  Tsit5 (5th-order), exact renormalization, same global accuracy as standard.
- Local consistency analysis: the O(Δψ) expansion is retained but now
  labelled explicitly as a consistency check, not an accuracy claim.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…setup + new unit tests

Two changes in one pass:

Shared setup (performance):
  equil (Grad-Shafranov) and ffit (metric matrices) are now built once and
  shared across all integration-dependent testsets via a make_solovev_intr
  helper for cheap fresh intr construction. Previously, setup_equilibrium +
  make_metric + make_matrix ran 4 times and riccati_eulerlagrange_integration
  ran 3 times. Now each runs once, cutting total test time significantly.

New unit tests (dead code coverage):
  "riccati_der! formula — Glasser 2018 Eq. 19": verifies riccati_der!
    correctly evaluates dS/dψ = w†F̄⁻¹w − SGS at several ψ points using
    Hermitian test states (physical constraint). Agrees with manual formula
    evaluation to machine precision (~1e-17). No extra integration needed.

  "compute_delta_prime_from_ca! matches inline Δ'": verifies the standalone
    Δ' formula gives bit-for-bit identical results to the inline computation
    in riccati_cross_ideal_singular_surf!. Reuses the shared odet_ric.

Total: 23 tests (was 18), runtime ~51s (was ~80s+ with redundant setup).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dd 3 unit tests

- Delete unused parallel_threads field from ForceFreeStatesControl: the field was
  silently ignored (Threads.@threads uses JULIA_NUM_THREADS at startup, not a runtime
  field). Removes false impression that thread count can be set from jpec.toml.

- Add apply_propagator_inverse! round-trip unit test: verifies Φ⁻¹·Φ = I algebraically,
  complementing the existing apply_propagator! identity and linearity tests.

- Add chunk_el_integration_bounds direction field test: verifies bidirectional=true
  sets direction=-1 on crossing chunks and direction=+1 on non-crossing chunks, and
  that balance_integration_chunks preserves direction correctly (right sub-chunk inherits,
  left sub-chunk always +1). Catches direction propagation regressions.

- Add delta_prime_matrix DIIID regression test: verifies the STRIDE BVP Δ' matrix is
  finite and non-zero for the large-N case (N≈26, multiple rational surfaces), where
  ill-conditioned (non-bidirectional) FM propagators would produce NaN/Inf entries.

56/56 parallel integration tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…f/riccati

One conflict resolved in src/GeneralizedPerturbedEquilibrium.jl (was src/JPEC.jl):
write_outputs_to_HDF5 vacuum section. Resolution:
- Keep new Δ' HDF5 outputs from perf/riccati (singular/m, singular/n,
  delta_prime, delta_prime_col, delta_prime_matrix)
- Adopt develop's vacuum output format: vac_data variable name,
  plasma_pts/wall_pts fields (3D Cartesian), y_plasma/y_wall entries,
  always-write pattern with empty arrays

All other files (EulerLagrange.jl, ForceFreeStatesStructs.jl, runtests.jl)
auto-merged cleanly. Default HDF5 filename updated to gpec.h5.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…um references post-rename

Update test files and benchmarks to use the new package name and config filename
(gpec.toml) following the GPEC rename merged from develop:
- test/runtests_riccati.jl
- test/runtests_parallel_integration.jl
- benchmarks/benchmark_threads.jl
- benchmarks/benchmark_riccati_der.jl
- benchmarks/benchmark_delta_prime_methods.jl

23/23 riccati tests and 56/56 parallel integration tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

logan-nc commented Mar 9, 2026

Post-merge update — after develop (GPEC rename) merge

Merged origin/develop (GPEC name-change PR #168) into this branch. One conflict resolved in write_outputs_to_HDF5: kept the new Δ' HDF5 outputs from this branch, adopted develop's updated vacuum field names (vac_data, plasma_pts/wall_pts, 3D Cartesian). Updated all test and benchmark files to reference GeneralizedPerturbedEquilibrium and gpec.toml.

Test results (post-merge)

Suite                             Pass  Total  Time
runtests_riccati.jl               23    23     1m27s
runtests_parallel_integration.jl  56    56     1m31s

Benchmark results (post-merge, 1 thread)

Example      Path      et[1]     Time   Speedup vs std  Error vs std
Solovev N=8  standard  16.56600  0.09s  1.00×
Solovev N=8  riccati   16.56691  0.17s  0.51×           0.0055%
Solovev N=8  parallel  16.56691  0.13s  0.66×           0.0055%
DIIID N=26   standard  1.21237   1.26s  1.00×
DIIID N=26   riccati   1.21386   0.63s  1.98×           0.1230%
DIIID N=26   parallel  1.21386   1.19s  1.06×           0.1231%

Benchmark results (post-merge, 4 threads)

Example      Path      et[1]     Time   Speedup vs std  Error vs std
Solovev N=8  standard  16.56600  0.25s  1.00×
Solovev N=8  riccati   16.56691  0.09s  2.83×           0.0055%
Solovev N=8  parallel  16.56691  0.14s  1.80×           0.0055%
DIIID N=26   standard  1.21237   1.17s  1.00×
DIIID N=26   riccati   1.21386   0.91s  1.29×           0.1230%
DIIID N=26   parallel  1.21386   0.90s  1.30×           0.1231%

Note: 1-thread Solovev timings have high variance (sub-100ms runs) and are not meaningful for speedup comparisons. DIIID (1.2s+) is the reliable benchmark. At 4 threads, Riccati achieves 1.29× and parallel achieves 1.30× on DIIID. Energy agreement is unchanged: <0.13% for both paths vs standard.

logan-nc and others added 3 commits March 9, 2026 07:49
…page

Creates docs/src/stability.md covering the ForceFreeStates module:
- Newcomb/DCON ideal MHD stability criterion with paper citations
  (Glasser 2016 Phys. Plasmas 23 112506, 2018a 032507, 2018b 032501)
- Standard, Riccati, and parallel FM integration methods
- Bidirectional integration strategy for large-N accuracy
- Δ' tearing parameter: per-surface (delta_prime/delta_prime_col)
  and inter-surface matrix (delta_prime_matrix / STRIDE BVP)
- Configuration reference, API autodocs block, example usage

Adds page to docs/make.jl navigation and cross-links from equilibrium.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…kdown links)

1. Add Random stdlib to Project.toml [deps] and [compat] — required by
   runtests_riccati.jl but missing from declared dependencies, causing
   CI failure with "Package Random not found in current path".

2. Fix docstring markdown in Riccati.jl and ForceFreeStatesStructs.jl:
   - Wrap bare [array_notation] (link text) immediately followed by
     (description) (parsed as URL) in code fences to prevent Documenter
     from treating them as broken local links.
   - Affected: assemble_fm_matrix BVP unknowns block, Phi_L/Phi_R
     equations, and VacuumData plasma_pts/wall_pts field descriptions.

These were surfaced by the new @autodocs block in stability.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nt logging

Three targeted fixes from pre-merge code review:

1. Threads.@threads :static — since Julia 1.7, the default :dynamic
   scheduler can migrate tasks between OS threads mid-execution, making
   Threads.threadid() unreliable for indexing into odet_proxies. Using
   :static guarantees a 1:1 task-to-thread mapping for the parallel FM
   integration phase.

2. outer_chunk psi_end guard — the outer-plasma re-integration in
   parallel_eulerlagrange_integration now uses psilim*(1-eps) to match
   the guard applied by chunk_el_integration_bounds, avoiding a potential
   boundary evaluation at exactly psilim.

3. Replace println with @info/@warn — all verbose-mode output in Riccati.jl
   now uses Julia logging macros, consistent with EulerLagrange.jl. This
   enables log-level filtering and suppression in tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

logan-nc commented Mar 9, 2026

Pre-merge cleanup (commits 1515591..725a527)

Three rounds of fixes applied after a code review pass:

CI fixes (1515591)

  • Random stdlib missing from Project.toml — runtests_riccati.jl uses using Random but the package was not declared in [deps], causing CI to fail with Package Random not found. Added to both [deps] and [compat].
  • Broken markdown links in docstrings — adding @autodocs in stability.md surfaced docstring patterns like x_left[j] (2N) that CommonMark parses as [link text](url) even with spaces between ] and (. Fixed by wrapping affected notation blocks in code fences (assemble_fm_matrix, compute_delta_prime_matrix!, VacuumData field docs).

Documentation (142a79c)

  • New docs/src/stability.md page covering the ForceFreeStates module: Newcomb/DCON stability criterion with paper citations (Glasser 2016/2018a/2018b), all three integration drivers, Δ' computation (per-surface and STRIDE BVP matrix), configuration reference, @autodocs API block, and example usage.
  • Added to docs/make.jl navigation; cross-linked from equilibrium.md.

Pre-merge code review fixes (725a527)

Three issues flagged as must-fix or high-priority:

  1. Threads.@threads :static — since Julia 1.7, the default :dynamic scheduler can migrate tasks between OS threads, making Threads.threadid() unreliable for indexing odet_proxies. :static guarantees a 1:1 task-to-thread mapping.

  2. outer_chunk psi_end guard — the outer-plasma re-integration now uses psilim * (1 - eps) to match the guard applied by chunk_el_integration_bounds, avoiding a potential boundary evaluation at exactly psilim.

  3. Consistent logging — all println calls in Riccati.jl replaced with @info macros, matching the pattern in EulerLagrange.jl and enabling log-level filtering/suppression in tests.

All 23 Riccati tests and 56 parallel FM tests pass locally after these changes.
