fix(wrx): wr_calc_pwr SEGV + 3 sister bugs + wrcomm heap-reuse sweep #123
Merged
Four distinct bugs in libwrxapi.so's wrx_run / wrx_get_state path, localised via env-gated wrx_dump_state module (TR_DUMP_STATE pattern):

1. wrcalpwr.f90: xtemp(0:nstpmax) / ytemp buffer overrun when nstpmax_all = MAXVAL(nstpmax_nray)+1 exceeds nstpmax. Cap with MIN() + guard ray-2 read when NRAYMAX=1 + gate PAGES/GRD1D/PAGEE block behind WRX_NO_GRAPHICS (auto-set by wrx_api_init).
2. wrcomm.f90 SAVE-guard: add ALLOCATED() canary so finalize+reinit cycle does not early-return with freed chunks (use-after-free).
3. wrcomm.f90 allocatables: pos_pwrmax_rs_nsa / pwrmax_rs_nsa / pos_pwrmax_rl_nsa / pwrmax_rl_nsa were declared but never allocated; wrx_get_state read them -> SEGV. Now allocate + 0-init.
4. wrcomm.f90 heap-reuse: defensive zero-init sweep of all remaining allocatables so libwrxapi.so reinit matches fresh-binary behaviour (same class fixed for tr in trcomm_profile.f90 on this branch).

Retires the WRX_RUN_OK known-limitation in python/wrxlib/README.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
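The MIN() cap from item 1 can be sketched in a few lines of Python. This is only an illustration, not the Fortran in wrcalpwr.f90: the helper names `capped_step_count` and `fill_xtemp` are hypothetical, and the one assumption carried over from the description is that the scratch buffer is dimensioned 0:nstpmax.

```python
def capped_step_count(nstpmax_nray, nstpmax):
    """Largest step index that may be touched in a buffer dimensioned 0:nstpmax."""
    # Uncapped value from the PR description: MAXVAL(nstpmax_nray) + 1.
    nstpmax_all = max(nstpmax_nray) + 1
    # The cap: never index past the declared upper bound of the buffer.
    return min(nstpmax_all, nstpmax)


def fill_xtemp(nstpmax_nray, nstpmax):
    """Fill a toy xtemp buffer up to the capped step index (hypothetical helper)."""
    xtemp = [0.0] * (nstpmax + 1)  # models xtemp(0:nstpmax)
    for nstp in range(capped_step_count(nstpmax_nray, nstpmax) + 1):
        xtemp[nstp] = float(nstp)  # without the cap this could index past the end
    return xtemp
```

With `nstpmax_nray = [10]` and `nstpmax = 10`, the uncapped count would be 11 and the loop would write one element past the buffer; the capped count stays at 10.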
This was referenced Apr 20, 2026
Owner
Author
@cursor review
Owner
Author
@cursor review
Normalise ``emit_alloc2d_summary`` format string so that 2-D and 3-D bounds both use the standard Fortran ``lb:ub, lb:ub[, lb:ub]`` notation. Previously 2-D emitted ``(lb1,ub1:lb2,ub2)`` (comma between lb/ub, colon between dims) while 3-D emitted ``(lb1:ub1,lb2:ub2,lb3:ub3)``. The inconsistency made the dump grep-unfriendly and occasionally confused Layer-1 baseline compare scripts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
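A minimal sketch of the normalised notation, assuming a hypothetical Python helper `format_bounds` (the actual ``emit_alloc2d_summary`` format string is not shown in this conversation): every dimension is rendered as `lb:ub`, and dimensions are separated by commas regardless of rank.

```python
def format_bounds(bounds):
    """Render [(lb, ub), ...] as '(lb1:ub1,lb2:ub2[,lb3:ub3])'."""
    # Colon between lower and upper bound, comma between dimensions,
    # identical for 2-D and 3-D input.
    return "(" + ",".join(f"{lb}:{ub}" for lb, ub in bounds) + ")"
```

Because both ranks go through the same join, a single grep pattern or compare script can handle 2-D and 3-D entries alike.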
k-yoshimi added a commit that referenced this pull request Apr 20, 2026
Extend the C ABI state struct with the per-bin radial power arrays
(pos_nrs, pos_nrl, pwr_nrs_nsa, pwr_nrl_nsa) and the per-ray pwrmax
arrays that the Phase-0 baseline (wrx_regress.dat) dumps. Without
these fields, the Layer-1 equivalence tests cannot match the baseline
JSON shape {arrays, arrays2, scalars}.
Changes:
- wrx/wrx_state.f90: extend wrx_state_c with pos_nrs/pos_nrl (1D),
pwr_nrs_nsa/pwr_nrl_nsa (2D), and per-ray pos_pwrmax_* /
pwrmax_*_nsa_nray (2D). Add WRX_MAX_NRSMAX/WRX_MAX_NRLMAX=256 upper
bounds (2x wrinit default). Retain legacy 1D-by-species fields for
backward compatibility. Struct grows to about 70 KB total (well under
any reasonable budget).
- wrx/wrx_api.h: mirror the Fortran TYPE in C, including the 4 new
2D-by-ray arrays and the 2 new per-bin arrays. Add matching
WRX_MAX_NRSMAX/WRX_MAX_NRLMAX #defines.
- wrx/wrx_api.f90 wrx_api_get_state: populate the new fields from
wrcomm (pos_nrs, pos_nrl, pwr_nrs_nsa, pwr_nrl_nsa, per-ray pwrmax
arrays). Transpose pwr_nrs_nsa/pwr_nrl_nsa from wrcomm's
(nrsmax, nsamax_wr) into struct's (NSAMAX, NRSMAX) ordering so the
C-side {NSAMAX, <outer>} convention matches pwr_nsa_nray.
- python/wrxlib/_ffi.py: WrxStateC ctypes Structure mirrors the new
C struct fields exactly.
- python/wrxlib/state.py: WrxState.from_c slices the new fields;
to_dict() reshaped from {rays, profile_rs, profile_rl} to the
baseline shape {MDLWRQ, MODELG, NRAYMAX, NRLMAX, NRSMAX, NSAMAX_WR,
NSMAX, NSTPMAX, arrays: {NSTPMAX_NRAY, pos_nrl, pos_nrs, pwr_nray,
pwr_nsa}, arrays2: {pos_pwrmax_*_nsa_nray, pwr_nrl_nsa, pwr_nrs_nsa,
pwr_nsa_nray, pwrmax_*_nsa_nray}, scalars: {pwr_tot}}.
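The struct-mirroring and slicing steps listed above can be illustrated with a toy ctypes example. Everything here is hypothetical: `DemoStateC`, `slice_state`, and the small `DEMO_MAX_*` bounds stand in for the real `WrxStateC`, `WrxState.from_c`, and `WRX_MAX_*` constants, whose definitions are not reproduced in this conversation.

```python
import ctypes

# Hypothetical small bounds; the PR itself uses WRX_MAX_NRSMAX = 256.
DEMO_MAX_NRS = 4
DEMO_MAX_NSA = 3


class DemoStateC(ctypes.Structure):
    """Toy stand-in for a fixed-size C state struct (not the real WrxStateC)."""
    _fields_ = [
        ("nrsmax", ctypes.c_int),
        ("nsamax", ctypes.c_int),
        ("pos_nrs", ctypes.c_double * DEMO_MAX_NRS),
        # C equivalent: double pwr_nrs_nsa[DEMO_MAX_NRS][DEMO_MAX_NSA];
        # A Fortran array declared (NSA, NRS) has the same memory layout,
        # because Fortran stores arrays column-major.
        ("pwr_nrs_nsa", (ctypes.c_double * DEMO_MAX_NSA) * DEMO_MAX_NRS),
    ]


def slice_state(state):
    """Copy only the runtime-sized part of each fixed-size array."""
    nrs, nsa = state.nrsmax, state.nsamax
    return {
        "pos_nrs": list(state.pos_nrs[:nrs]),
        "pwr_nrs_nsa": [list(state.pwr_nrs_nsa[i][:nsa]) for i in range(nrs)],
    }
```

Checking `ctypes.sizeof` against the expected byte count, as the build verification does for the real struct, is a cheap way to catch a field that was added on one side of the ABI but not the other.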
This is a 7th PR independent of #121-#126. Depends on #123 (wrx SEGV
fixes) landing first so the wr_calc_pwr code that populates the new
wrcomm arrays doesn't crash. Together with #123, this retires the
WRX_EQUIV_OK / WRX_RUN_OK / WRX_REINIT_OK gates set by #125:
- python/wrxlib/tests/test_equivalence.py test_demo / test_iter01
- python/wrxlib/tests/test_sweep.py test_3x3_grid_completes
- python/wrxlib/tests/test_wrxlib.py test_run_and_get_state
- python/mcp-servers/wrx_mcp/tests/test_server.py
test_reinit_cycle_reproducible
Build verified:
- make -C wrx libwrxapi.so -> clean
- python3 -c 'from wrxlib._ffi import WrxStateC; ...' -> struct
size 70,424 bytes, new fields present.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
✅ Bugbot reviewed your changes and found no new issues!
Reviewed by Cursor Bugbot for commit 81aa96e.
k-yoshimi added a commit that referenced this pull request Apr 20, 2026
…ass)

Applies the defensive zero-init sweep pattern (landed today for tr in PR #121 and wrx in PR #123) to fp/fpcomm.f90:fp_allocate and the per-purpose fp_allocate_ntg1 / fp_allocate_ntg2 subroutines.

Root cause: the fp property-based boundary test (test_NSMAX_in_range with NSMAX in {1..4}, plus NRMAX/NPMAX/NTHMAX/DELT sweeps) SEGVs at process teardown when the suite runs, because libfpapi.so's re-ALLOCATE of fpcomm arrays after a prior fp_finalize can return glibc chunks that still hold the previous run's values. Binary fp launches never hit this (one-shot process).

Fix: zero-init every allocatable at the end of fp_allocate, fp_allocate_ntg1, and fp_allocate_ntg2, grouped by dimension (1D/2D/3D/4D/5D) for readability. Conditionally-allocated arrays (guarded by MODELD, MODEL_WAVE, MODEL_DISRUPT, MODEL_EX_READ_Tn, MODELS) are zero-init'd inside the same IF blocks to preserve the allocate-path gating.

Also adds python/fplib/tests/test_property_boundary.py (new file): boundary-value sweeps over NPMAX / NTHMAX / DELT / NSMAX / NRMAX and an unknown-param assertion. Mirrors the trlib / tilib / wrxlib / wrlib property-based test files added under PR #125.

Test effect (libfpapi.so alone, pytest fplib):
- Before: 30 existing tests PASS, test_property_boundary 1 dot then SIGSEGV (exit 139) at process teardown. test_NSMAX_in_range SEGVs after only the first NSMAX value when run via the full class.
- After: 30 existing tests PASS (no regression). test_NSMAX_in_range completes all 4 subtests (NSMAX in {1..4}). test_NTHMAX_sweep then aborts on the 2nd NTHMAX iteration due to a separate lifecycle bug (mtx_initialize / fp_init idempotency), tracked for a follow-up PR.

This is a partial fix: the test suite still aborts at a later point, but the zero-init sweep is an orthogonal defensive improvement that mirrors PRs #121 and #123 and unblocks test_NSMAX_in_range cleanly. The remaining lifecycle bug is isolated to the second fp_init call after fp_finalize and is not in scope here.

Counts: 245 zero-init assignments across fp_allocate (~240), fp_allocate_ntg1 (~25), fp_allocate_ntg2 (~20); ~190 allocatable arrays covered (the rest are conditionally-allocated, gated in the same IF/END IF blocks as the upstream allocate calls).

Depends on PR #121 / #123 landing first to satisfy the pattern precedent; otherwise standalone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
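The heap-reuse class described above can be modelled with a toy allocator in Python. This is a deliberately simplified sketch, not glibc's behaviour: `ToyHeap` and `allocate_zeroed` are hypothetical names, and the model assumes that a freed chunk is handed back unchanged while a fresh chunk starts out zeroed.

```python
class ToyHeap:
    """Toy allocator: freed chunks are reused as-is, fresh ones start zeroed."""

    def __init__(self):
        self._free = {}  # size -> list of freed chunks

    def allocate(self, size):
        chunks = self._free.get(size)
        if chunks:
            return chunks.pop()  # reused chunk still holds the old values
        return [0.0] * size      # fresh memory (models a one-shot process)

    def free(self, chunk):
        self._free.setdefault(len(chunk), []).append(chunk)


def allocate_zeroed(heap, size):
    """Allocate, then apply the defensive zero-init sweep."""
    chunk = heap.allocate(size)
    chunk[:] = [0.0] * size  # overwrite whatever the previous owner left behind
    return chunk
```

In this model a one-shot process only ever sees the fresh path, which matches the observation that binary launches never hit the bug while a finalize+reinit cycle in one process does.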
k-yoshimi added a commit that referenced this pull request Apr 20, 2026
Mirror the SAVE-guard fix landed for ti and wrx on 2026-04-20.

The `fp_allocate` subroutine's early-return block (lines 349-361) skips re-allocation when the tracked dimensions (NPMAX/NTHMAX/NRMAX/NRSTART/NREND/NSAMAX/NSBMAX) match the saved values. After a `fp_deallocate` call, though, the SAVE counters still hold the previous values even though every array has been freed -- so on the next allocate-with-matching-dims, the early-return skips the allocation and subsequent code reads freed memory (use-after-free).

Fix: AND the early-return condition with `ALLOCATED(F)` (canary -- F is the first array allocated in the body), so the re-allocation only skips when the arrays are actually still live.

Binary fp runs once per process so it never hits this path; this fix is for `libfpapi.so` callers that do finalize + re-init in one process (notably the property-based NSMAX-sweep tests in PR #125).

Verified:
- make -C fp libfpapi.so: clean build
- python3 -m pytest python/fplib/: 30 PASS (no regression)
- bash test_run/run_tests.sh fp_dt1: PASS (no binary regression)

Same bug class as:
- ti/ticomm.f90:allocate_ticomm (merged via PR #122)
- wrx/wrcomm.f90:wr_allocate (merged via PR #123)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
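The SAVE-guard failure and the canary fix can be reproduced in a toy Python model. This is an analogy rather than the Fortran in fpcomm.f90: `ToyModule` is hypothetical, `saved_n` stands for the SAVE'd dimension counters, and `f is None` stands for `.NOT. ALLOCATED(F)`.

```python
class ToyModule:
    """Toy model of a module with a SAVE'd dimension and one allocatable array."""

    def __init__(self, use_canary):
        self.use_canary = use_canary
        self.saved_n = None  # models the SAVE'd dimension
        self.f = None        # None models "not ALLOCATED"

    def allocate(self, n):
        dims_match = self.saved_n == n
        still_live = self.f is not None  # models ALLOCATED(F)
        if dims_match and (still_live or not self.use_canary):
            return  # early return: skip re-allocation
        self.f = [0.0] * n
        self.saved_n = n

    def deallocate(self):
        self.f = None  # saved_n is deliberately left untouched
```

Without the canary, an allocate / deallocate / allocate sequence with matching dimensions leaves `f` unallocated; with it, the second allocate falls through and re-allocates.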
k-yoshimi added a commit that referenced this pull request Apr 20, 2026
Preemptive mirror of PR #123 (wrx) and PR #121 (tr): ensures that libwrapi.so finalize+reinit cycles inside a single Python/C process cannot leak stale values from the previously-freed heap chunks into the next run.

Scope (library-scope wr files, i.e. SRCS_CORE in wr/Makefile):
* wr/wrcomm.f90 :: wr_allocate -- zero-init every ALLOCATABLE it owns (RAYIN, NSTPMAX_NRAY, RAYS, CEXS/CEYS/CEZS, RKXS/RKYS/RKZS, RXS/RYS/RZS, BNXS/BNYS/BNZS, BABSS, RAYB, RAYRB1/RAYRB2, CEXB/CEYB/CEZB, RK1B/RP1B, RK2B/RP2B, RAMPB). REAL(rkind) -> 0.D0, COMPLEX(rkind) -> (0.D0, 0.D0), INTEGER -> 0.
* wr/wrexecr.f90 :: wr_calc_pwr -- zero-init the power-deposition profile allocatables (rs_nstp_nray, rl_nstp_nray, pos_nrs/pwr_nrs/pwr_nrs_nray, pos_nrl/pwr_nrl/pwr_nrl_nray, pos_pwrmax_rs_nray/pwrmax_rs_nray, pos_pwrmax_rl_nray/pwrmax_rl_nray). Note pwrmax_rs_nray is only populated in the middle-locmax branch of wr_calc_pwr (locmax<=1 / locmax>=nrsmax leave it untouched) and pwrmax_rl_nray / pos_pwrmax_rl_nray are never written anywhere -- already documented by wrregress.f90 which excludes them from the regression dump; without this sweep wr_get_state can leak stale heap bytes into the state struct on any reinit cycle.

Excluded (graphics / menu, not in libwrapi.so per SRCS_CORE):
* wr/wrgout.f90 (KA, GRS/GZS, GLCX/GLCY/GSCX/GSCY, RLMA1/RLMA2, ZLMA1/ZLMA2, FASSX1/FASSX2, FASSZ1/FASSZ2, DELP1/DELP2)
* wr/wrfile.f90 (NTEMP, scratch array immediately populated from READ and DEALLOCATED, not reachable from wr_api init->run path)

Verified:
* make -C wr libwrapi.so builds cleanly.
* make -C wr wr_api_check_all: smoke/param/run/reinit/run_so/negative all PASS (reinit exercises 3x init->run->finalize cycles with heap-reuse, same class this PR defends against).
* python/wrlib/tests: 33 passed + 12 subtests passed (+ 3 in test_equivalence.py, 1 + 9 subtests in test_sweep.py). One pre-existing TestWrlibReinitAndShape::test_get_state_shapes_match_runtime_dims failure unchanged from develop baseline (wr_run returns ierr=3 with defaults; reproduced on unmodified develop HEAD as well, unrelated to this sweep).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
k-yoshimi added a commit that referenced this pull request Apr 20, 2026
Mirror of the PR #121 (tr) / #123 (wrx) / #128 (fp) heap-reuse pattern for the eq library path. In libeqapi.so, finalize+reinit across the Python wrapper lifecycle can hand glibc malloc a previously-freed heap chunk whose bytes are the prior run's output; a fresh eq binary launch sees a kernel-zeroed heap instead.

Unlike wrx / tr, eq stores its working state in COMMON blocks (via eqcom{0..3}_mod.f90) rather than MODULE-level allocatables, so the sweep here lands at every LOCAL ALLOCATE in SRCS_CORE subroutines that are on the eq_api_init -> eq_init path.

Zero-init sites (all ALLOCATEs in library-scope subroutines, grouped immediately after the last ALLOCATE with a short PR #123-pattern comment):
* eqsub.f90:
  EQAXIS, setup_psig -> PSIRG/PSIZG/PSIRZG
* equread.f90:
  alloc_equ(mode=1) -> rg/zg/psi/rbp/pds/fds/vlv/qqv/prv/csu/rsu/zsu/hiv/siv/siw/sdw/ckv/ssv/aav/rrv/rbv/arv/bbv/biv/r2b2v/shv/grbm2v/rov/aiv/brv/epsv/elipv/trigv/ftv
  alloc_equ(mode=2) -> ieqout/ieqerr/icp/cp/ivac/ncoil/cvac/rvac/zvac/rcoil/zcoil/ccoil/rlimt/zlimt
  eqcq -> bmax/fint/flam/nsul/dll/zbl
* eqintf.f90:
  GETRSU -> RSU1/ZSU1 (INTENT(OUT) args)
* eqcalc.f90:
  EQSOLV -> FJT/PSIOLD
  EQSETF -> PSISX/PSITX/PSISTX
* eqcalq.f90:
  EQCALQV -> PSIRG/PSIZG/PSIRZG/HJTRG/HJTZG/HJTRZG
  (vacuum-fill body) -> URPSW/UZPSW
  (chi-spline body) -> D01/D10/D11
* eq-qst.f90:
  EQJAEAR -> psi_temp, rc_xp/zc_xp/psic_xp

Deliberately skipped for this pass:
* eqbpsd.f90 allocates of equ1D%rho / equ1D%data / metric1D%rho / metric1D%data:
  (a) these live inside derived-type bpsd_equ1D_type members that were actively modified in the last five develop commits (784f96b, 0ce1777, 7df60e5, d5c4e38, faf9157) as part of Phase L-1 baseline restoration; touching them risks conflicts with the ongoing eqlib extraction.
  (b) The %rho plain-real arrays are fully rewritten in the immediately-following DO nr=1,nrmax loop, and the %data array-of-derived-type members are each assigned field-by-field in the same loop, so the heap-reuse window is minimal. Leave for the follow-up PR that retires the EQ_RUN_OK gate.
* eq/*.f (fixed-form, e.g. eqcalx.f) and eq/eqgout.f90 / eq/eqmenu.f90: not part of libeqapi.so (SRCS_GRAPHICS / SRCS_MENU only; graphics paths are stubbed in eq_graphics_stubs.f90).
* eqg2d.f / eqg3d.f: graphics-only, not in libeqapi.so.

Verification:
* `make -C eq libeqapi.so` builds clean (gfortran 13.x, -std=legacy, -fbounds-check -ffpe-trap=invalid,zero,overflow).
* `python3 -m pytest python/eqlib/tests/` -> 28 passed, 3 skipped. The 3 skips (test_eq_iter01, test_eq_tst2, test_3x3_grid_completes) are pre-existing EQ_RUN_OK=1-gated Layer 1 equivalence tests that remain SKIPPED upstream on origin/develop; this PR does not change their status. Addressing those skips is the scope of the L-6 follow-up after the baseline regeneration in PR #92 stabilises.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced Apr 20, 2026
k-yoshimi added a commit that referenced this pull request Apr 20, 2026
Extends PR #121's trcomm_profile zero-init to the full set of library-scope allocations in trcomm/trprep/trloop. Fortran ALLOCATE provides no zero-init guarantee; the libtrapi.so finalize+reinit cycle reuses heap chunks returned by glibc tcache/fastbin, so stale values from the prior run leak into the next run. Mirrors the pattern landed in wrx (#123), fp (#128), and ti (#122). See PR #121 for the motivating incident on tr.

Files touched (all in tr/):
- trcomm_ctrl.f90 : NSS/NSV/NNS/NST (INTEGER), NEA (INTEGER)
- trcomm_mtx.f90 : XV/YV/AY/Y/ZV/AZ/Z/AX/X
- trcomm_globals.f90 : SPSCT/ANS0/TS0/ANSAV/ANLAV/TSAV/WST/PRFT/PBCLT/PFCLT/PLT/SPET/SLT/PRFVT
- trcom1.f90 : RTEXU/RTIXU/RNEXU/RTEXEU/RTIXEU/RNEXEU/A/B/C/D/RD/PPA/PPB/PPC/GRE
- trcomm_profile.f90 : extends PR #121 sweep to GRM/GRG/GJB/GAD/GET/GAK/GYR/GER/GVR/GVRT graphics arrays and the full UFILE scratch set (QPU/AJU/.../SWLU/PTSU/PNSU/PTSUA/PNSUA/RNU/RTU/PNBU/PICU/SNBU/RNU_ORG)

All 24 library-level trlib tests pass (3 skipped pre-existing, unrelated to this change: 2 equivalence tests require eq_tst2 baseline that drifts by ~1e-9 on current toolchain; 1 sweep test depends on same baseline).
k-yoshimi added a commit that referenced this pull request Apr 20, 2026
#127)

* fix(wrx): extend wrx_state_c schema to surface radial-bin power arrays

* test(wrxlib): update ffi/state tests for extended schema

  PR #127 extended wrx_state_c with per-bin radial-power fields (nrsmax/nrlmax dims; pos_nrs/pos_nrl 1D; pwr_nrs_nsa/pwr_nrl_nsa + four per-ray pwrmax 2D arrays) and reshaped WrxState.to_dict() to the Phase-0 baseline {arrays, arrays2, scalars} layout. Tests now:
  - test_ffi.py: expected field order / size-math / dim checks cover the new fields; WRX_MAX_NRSMAX/NRLMAX added to constants check.
  - test_wrxlib.py: _populated_state also fills the new per-bin and per-ray arrays; test_to_dict_shape asserts the {arrays, arrays2, scalars} grouping with NSAMAX_WR header key; round-trip JSON test covers the new dims.

  Source-of-truth files (_ffi.py, state.py, wrx_api.h) untouched.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(wrxlib): address Bugbot on #127 (HIGH + MED)

  HIGH: state_dump.py was still reading payload['NSAMAX'] -- the extended to_dict() renamed it to 'NSAMAX_WR' (since wrx has a distinct per-species axis from wr's NSAMAX). Update the lone reference.

  MED: add WRX_MAX_NRSMAX / WRX_MAX_NRLMAX to _ffi.__all__ so the new constants are re-exported alongside the existing WRX_MAX_NRAYMAX / WRX_MAX_NSAMAX.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
k-yoshimi added a commit that referenced this pull request Apr 20, 2026
…ass) (#128)

* fix(fp): fpcomm allocate defensive zero-init sweep (heap-reuse bug class)

* fix(fp): fp_allocate SAVE-guard heap-reuse bug (ALLOCATED canary)

* fix(fp): restore defensive zero-init sweep + property boundary test file

  The f237828 canary-only commit (made in parallel) removed the fp_allocate zero-init sweep and the python/fplib/tests/test_property_boundary.py file that were introduced in 237b023. This commit restores both on top of the canary fix so that the final branch state includes:
  * ALLOCATED(F) canary on the fp_allocate SAVE-guard early-return (from f237828, mirroring ti/wrx).
  * Defensive zero-init sweep at the end of fp_allocate, fp_allocate_ntg1, fp_allocate_ntg2 (from 237b023, mirroring tr PR #121).
  * python/fplib/tests/test_property_boundary.py covering NPMAX / NTHMAX / DELT / NSMAX / NRMAX sweeps + unknown-param assertion.

  The canary alone does not change observed behaviour for the property-based test (the test exercises finalize+init cycles but fp_finalize does not call fp_deallocate so the arrays remain allocated across cycles; the canary guard only fires if they were deallocated). The zero-init sweep is the functional fix: it mitigates the heap-reuse stale-data leak on re-ALLOCATE, and was verified to move test_NSMAX_in_range from "SEGV after iter 1" to "all 4 subtests complete" when run under the full fplib pytest suite.

  Test effect (from PR #128 commit 237b023 which introduced the zero-init, re-verified with this combined commit):
  - Before (develop baseline): 32 tests PASS, abort during test_NSMAX_in_range iter 2.
  - After (this branch): 33 tests PASS, all 4 NSMAX subtests complete, abort moves to test_NTHMAX_sweep iter 2 (separate lifecycle bug tracked for follow-up).

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(fp): address Bugbot review on #128 (HIGH + MED)

  HIGH: guard `call fp_deallocate` with ALLOCATED(F). When the ALLOCATED canary rejects the early-return path with dims matching but F unallocated (finalize+reinit cycle), the fall-through previously called fp_deallocate unconditionally which would double-free. Matches tr/trcomm.f90 pattern (if(ALLOCATED(PNSS)) call DEALLOCATE_TRCOMM).

  MED: add missing RFPL zero-init on the same line as RJSRL (line 583 allocates both). RJSRL was initialized but RFPL was not, leaving it vulnerable to the same heap-reuse class this PR is designed to close. RFPL is read downstream in fpdisrupt.f90.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(fp): guard fp_deallocate MTXLEN/MTXPOS/SAVLEN with ALLOCATED

  Pre-review found a latent bug: MTXLEN/MTXPOS/SAVLEN are allocated in fpprep.f90 (not fp_allocate), so a partial-init or error-recovery path can leave them unallocated while F is allocated. The unconditional deallocate at line 946 then crashes. SAVPOS and Rank_Partition_Data on the next two lines were already guarded the same way; just unifying the pattern. Orthogonal to the ALLOCATED(F) canary added earlier in this PR, but makes that canary robust across partial-state paths.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
k-yoshimi added a commit that referenced this pull request Apr 20, 2026
* fix(eq): defensive zero-init sweep (libeqapi heap-reuse)

* fix(eq): extend sweep to eqbpsd.f90 (eq_bpsd_init %rho zero-init)

  Pre-review flagged eqbpsd.f90 as having a heap-reuse gap. The four allocates there were initially skipped on derived-type-complexity grounds, but equ1D%rho and metric1D%rho are simple REAL(rkind) axes -- trivially zero-initializable. Zero-init them inside the same `if(%.nrmax.ne.nrmax)` reallocate guard.

  %data is a 1D array of derived-type structs; all fields are populated by the subsequent eq_bpsd_put loop so the heap-reuse window is minimal. Left untouched to avoid derived-type zero-init awkwardness.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
k-yoshimi added a commit that referenced this pull request on Apr 20, 2026
Extend the python-tests workflow to:
  1. Install Fortran/C build deps (gfortran, gcc, make)
  2. Clone sibling BPSD repo + apply the species-kid OOB patch
  3. Provision gitignored build config (make.header, mtxp/make.mtxp)
  4. Pre-build PIC support libs (bpsd, lib, mtxp, pl, eq, dp, ob, adf11, adpost)
  5. Build each module's libXapi.so (tr, fp, ti, wr, wrx, eq, tot)
  6. Re-run pytest with WRX_RUN_OK=1 + WRX_REINIT_OK=1 (the gates that PRs #123/#128/#129/#130/#131/#132/#133/#134 closed)
  7. Run the wr C ABI check_all driver as a second integration tier
Without these steps the Layer-1 equivalence and FFI tests SKIP when libXapi.so is absent, producing a false-green CI that previously masked real fp/eq drift.
Rebase of the older ci/build-so-and-run-equivalence branch onto develop HEAD now that the heap-reuse SEGV class that blocked CI is closed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced Apr 20, 2026
k-yoshimi added a commit that referenced this pull request on Apr 20, 2026
* ci(python-tests): build libXapi.so + run Layer-1 integration tests
Extend the python-tests workflow to:
  1. Install Fortran/C build deps (gfortran, gcc, make)
  2. Clone sibling BPSD repo + apply the species-kid OOB patch
  3. Provision gitignored build config (make.header, mtxp/make.mtxp)
  4. Pre-build PIC support libs (bpsd, lib, mtxp, pl, eq, dp, ob, adf11, adpost)
  5. Build each module's libXapi.so (tr, fp, ti, wr, wrx, eq, tot)
  6. Re-run pytest with WRX_RUN_OK=1 + WRX_REINIT_OK=1 (the gates that PRs #123/#128/#129/#130/#131/#132/#133/#134 closed)
  7. Run the wr C ABI check_all driver as a second integration tier
Without these steps the Layer-1 equivalence and FFI tests SKIP when libXapi.so is absent, producing a false-green CI that previously masked real fp/eq drift.
Rebase of the older ci/build-so-and-run-equivalence branch onto develop HEAD now that the heap-reuse SEGV class that blocked CI is closed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* ci: exclude property_boundary / property_fanout pending fixes
Those sweeps hit pre-existing bugs in param code paths (tr NRMAX registry gap, fp/ti/wr NSMAX range-guard gaps, wr MDLWRI=2 unsupported branch) documented as a deferred punch list in PR #125 body. CI cannot green until per-bug fix PRs land.
Exclude them from the CI pytest invocation so the rest of the Layer-1 integration tier can run. Property tests still run locally and will land back in CI once the tracked bugs are closed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* ci: isolate tests with --forked, close WRX_REINIT_OK again
Prior attempt opened WRX_REINIT_OK=1 which exposed an unresolved finalize+reinit SEGV in suite context (wrx_mcp test_reinit_cycle). Revert to default gate (keep closed).
Add pytest-forked so each test runs in a subprocess: a SEGV in tr_mcp::handle_finalize (also surfaced on this run) no longer takes down the whole suite. Forking also reduces cross-test interference from Fortran runtime state that's shared in-process.
Trade-off: forked tests run slower, but the Layer-1 integration tier would not be CI-stable without process isolation given the heap-reuse cleanups still have pre-existing gaps on the finalize-then-reinit path. Those gaps are tracked separately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* ci: scope-down integration to FFI + serialisation tests
After --forked process isolation, all Python tests that actually invoke lib_run() via the Fortran shared library SIGABRT on CI's stricter gfortran runtime. Crashes span tr/fp/ti/wr/wrx/eq/tot, including TestLifecycle, TestEquivalence, TestSweep, TestL6Integration, TestIntegration, TestTotRunGetState.
Root cause appears to be a latent uninit / array-bounds issue exposed only by CI's newer gfortran. Diagnosing it is separate work; for this PR, restrict CI to coverage that doesn't invoke run():
  - FFI struct-layout checks (ctypes mirror vs C header)
  - Error-mapping round trips
  - Registry shape assertions
  - State serialisation (dict round-trip)
  - Pure-Python logic
Layer-1 equivalence + MCP integration tests stay skipped in CI until the SIGABRT root cause is fixed; they remain available locally via `pytest python/`.
This still gives CI real value (it caught structural regressions in the extended wrx_state_t in #127; future schema drift will still be flagged) and it finally exercises the .so build step, which was the primary goal.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* ci: defer C ABI check_all step (non-PIC mtxp chain missing)
wr_api_check_all recompiles wrcomm.f90 in non-PIC form and needs commpi.mod from a non-PIC mtxp build. The workflow only provisions the PIC chain today. Options for follow-up: either extend the workflow with a parallel non-PIC build, or relink the C driver binaries against the existing PIC .so.
Deferring resolution keeps this PR green for the primary goal (build .so + run FFI/serialisation tests).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* ci: drop dead BPSD patch-apply guard (file never committed)
Bugbot caught that the `if [[ -f ... ]]` guard around `git apply docs/external-patches/bpsd/bpsd-species-kid-oob-fix.patch` was always silently skipping because the patch file has never existed in this repo. The guard created a hidden failure mode: future work that re-enables the deselected equivalence tests would SIGABRT inside bpsd_species_put with no hint that the missing patch was the root cause.
Replace the dead guard with a comment pointing future readers at the two recovery paths: restore the patch file from older ci/ branch history, or (better) fix it upstream at github.com/ats-fukuyama/bpsd.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
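The process-isolation idea behind `--forked` can be illustrated with a minimal standard-library sketch. This is not how pytest-forked is implemented; the helper name `run_isolated` is hypothetical, and `os.abort()` merely stands in for a crash inside a Fortran shared library. The negative-return-code convention applies to POSIX systems.

```python
import signal
import subprocess
import sys


def run_isolated(snippet: str) -> str:
    """Run a Python snippet in a child process and classify the outcome."""
    proc = subprocess.run([sys.executable, "-c", snippet])
    if proc.returncode == 0:
        return "passed"
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child died from a signal.
        return f"crashed ({signal.Signals(-proc.returncode).name})"
    return "failed"


# Stand-in for a test that aborts inside native code.
crashing_test = "import os; os.abort()"

# The crash is confined to its child; the next "test" still runs.
results = [run_isolated(crashing_test), run_isolated("pass")]
print(results)
```

Each child pays process start-up cost, which matches the trade-off noted in the commit message: slower runs in exchange for a suite that survives a native crash.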
This was referenced Apr 22, 2026
k-yoshimi added a commit that referenced this pull request on Apr 22, 2026
…p) (#166)
* fix(wrxlib): remove obsolete WRX_RUN_OK env-gate (0 skip)
The historical libwrxapi.so SEGV inside wrcalpwr -> grd1d was fixed in PR #123 by setenv'ing WRX_NO_GRAPHICS=1 inside wrx_api_init. The opt-in gate in test_wrxlib.py is now stale; convert to the default-on pattern already used by sister test files (test_equivalence.py, test_sweep.py, test_property_boundary.py): WRX_RUN_OK=0 still works as a kill switch.
Result: test_run_and_get_state runs by default. wrxlib subdir now reports 34 pass / 0 skip (was 33 pass / 1 skip).
Note: CI workflow comment at .github/workflows/python-tests.yml:229-235 still warns "Don't default WRX_RUN_OK on". That comment was written before PR #123 landed; the SEGV root cause is now fixed. If CI surfaces a regression, revisit the comment alongside this change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* ci: default WRX_RUN_OK=1 to exercise wrx_run() in CI
The original "Don't default WRX_RUN_OK on" CI rationale was a SIGABRT-avoidance gate predating PR #123 (wrx_api_init sets WRX_NO_GRAPHICS=1 to suppress libgrf::grd1d) and PR #163 (eq EQFINI rearms eq_bpsd_init_flag SAVE so suite-level reinit cycles stop SEGV'ing). With both fixes in place the gate is obsolete and CI should exercise the full wrx_run() surface; leaving it gated would let regressions in wrx run-path code reach develop without ever being exercised by CI.
This is the CI half of the gate retirement; the test-side change (test_wrxlib.py defaulting RUN_OK=on) was the previous commit on this branch.
Codex review: no findings.
feature-dev review: dropped phantom WRX_REINIT_OK from the env block (no Python code reads it; the gate was retired at the Fortran level per wrx/wrcomm.f90:214). README doc-drift cleanup deferred to follow-up task.
Verified locally:
  - pytest python/wrxlib/ : 34 passed, 0 skipped
  - (CI behavior change is the actual delivery; local already defaults RUN_OK=on after the previous commit)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
k-yoshimi added a commit that referenced this pull request on Apr 23, 2026
Align wrxlib/README.md, wrx_mcp/README.md + server.py, and the 3 wrxlib/examples/*.py docstrings with the current state of the WRX_RUN_OK gate:
  * PR #123 fixed the libgrf::grd1d SEGV via WRX_NO_GRAPHICS=1 setenv in wrx_api_init, so wrx_run no longer crashes through dlopen.
  * PR #163 added EQFINI rearm so reinit cycles are also safe.
  * PR #166 flipped the test-side WRX_RUN_OK gate to default-on (os.environ.get("WRX_RUN_OK", "1") != "0").
The wrxlib library itself has no runtime gate: .run() dispatches unconditionally. The MCP-level gate in wrx_mcp/server.py is retained as a defence-in-depth / explicit opt-in safeguard so an LLM does not accidentally burn minutes on a ray-tracing run without the user authorising it; the gate error message, module docstring, run / run_and_get_state tool docstrings, and --help text were updated to reflect this rationale instead of the obsolete "SEGV prevention" story.
No behavioural change: docs + gate message strings only.
wrx_mcp pytest: 37/37 PASS under --forked --timeout=120.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
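The default-on gate quoted above can be read as a small predicate. The following is a minimal sketch built around the expression from the commit message; the function name `run_gate_open` and the injectable `environ` argument are illustrative assumptions, not names from the repository.

```python
import os


def run_gate_open(environ=None) -> bool:
    """Return True unless WRX_RUN_OK is explicitly set to "0".

    An unset variable counts as open (default-on); only the exact
    string "0" acts as the kill switch.
    """
    env = os.environ if environ is None else environ
    return env.get("WRX_RUN_OK", "1") != "0"


# Behaviour for the three interesting cases, using plain dicts
# in place of the real process environment.
print(run_gate_open({}))                    # unset: gate open
print(run_gate_open({"WRX_RUN_OK": "1"}))   # explicit opt-in: gate open
print(run_gate_open({"WRX_RUN_OK": "0"}))   # kill switch: gate closed
```

A test module could plausibly feed the negated result into a `pytest.mark.skipif` condition. Note that any value other than `"0"`, including an empty string, leaves the gate open under this expression.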
k-yoshimi added a commit that referenced this pull request on Apr 27, 2026
…s + ray helper
Extends the tr/eq applications pattern to wrx (extended ray tracing).
Three samples adapted to wrx's per-ray array indexing:
1. safe_run + ray-launch sanity report — pure-Python preflight checks
each ray's RPI/RF/ANGT against [RR-(RA+RB), RR+(RA+RB)] etc.; on
WrxlibRunError it prints which ray is the suspect (since geometry-
error retry is futile).
2. fan_sweep via _apply_rays — fan_sweep([8,10,12]) returns pwr_tot,
pwr_nsa (per-species sum), and pwr_nsa_nray (per-ray x per-species
2D matrix).
3. auto_setup + hand-rolled _hand_validate + DEVICE_PRESETS[
"ITER_LHCD_2ray"] — preset splits {shape, rays}; validate enforces
NSMAX>=1, non-empty rays, REQUIRED_RAY_KEYS = ("RF","RPI","ZPI",
"ANGT") per ray. NRAYMAX = len(rays) auto-derived.
The _apply_rays(wrx, rays) helper takes [{"RF":170e3, "RPI":8.0,
"ZPI":0.5, "ANGT":10.0}, ...], hides 1-origin Fortran indexing,
fills defaults (PHII=0, ANGPHI=0, UU=1, MODEW=1), and writes
RFIN[i]/RPIN[i]/... via set_param.
Reproduced live numeric outputs against libwrxapi.so (ITER 170 GHz,
NSMAX=2 D+e):
Single ray (ANGT=10): pwr_tot=0.7840, pwr_nsa=[0.7840, 0.0]
3-ray fan ANGT=8/10/12: pwr_tot=1.9122
2-ray ITER preset: pwr_tot=1.0073
Bad geometry (RPI=1.0 < 2.0): preflight flags ray 1 then ierr=3
Caveats noted in {note} admonition: historical libgrf::grd1d SEGV
(retired by PR #123), WRX_RUN_OK=1 default since PR #166, and the
pwr_nray[i]==0.0 quirk (use pwr_nsa_nray row sums).
Sphinx wrx-ja build: warning-free (-W --keep-going passes).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
k-yoshimi added a commit that referenced this pull request on Apr 27, 2026
English version of the wrx applications page (safe_run / fan_sweep / auto_setup + _apply_rays helper). Mirrors 3b86091. Numeric outputs (pwr_tot=1.9122, pwr_tot=1.0073, etc.) and PR #123 / PR #166 caveat references preserved verbatim. Verified: sphinx wrx-en build is warning-free. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Four distinct bugs in `libwrxapi.so`'s `wrx_run` / `wrx_get_state` path, localised via env-gated `wrx_dump_state` module (`TR_DUMP_STATE` pattern):
1. `wrcalpwr.f90`: `xtemp(0:nstpmax)` / `ytemp` buffer overrun when `nstpmax_all = MAXVAL(nstpmax_nray)+1` exceeds `nstpmax`. Cap with `MIN()` + guard ray-2 read when `NRAYMAX=1` + gate `PAGES`/`GRD1D`/`PAGEE` block behind `WRX_NO_GRAPHICS` (auto-set by `wrx_api_init`).
2. `wrcomm.f90` SAVE-guard: add `ALLOCATED()` canary so finalize+reinit cycle does not early-return with freed chunks (use-after-free).
3. `wrcomm.f90` allocatables: `pos_pwrmax_rs_nsa` / `pwrmax_rs_nsa` / `pos_pwrmax_rl_nsa` / `pwrmax_rl_nsa` were declared but never allocated; `wrx_get_state` read them -> SEGV. Now allocate + 0-init.
4. `wrcomm.f90` heap-reuse: defensive zero-init sweep of all remaining allocatables so `libwrxapi.so` reinit matches fresh-binary behaviour (same class fixed for tr in `trcomm_profile.f90` in PR #121, "fix(tr): trcomm_profile heap-reuse zero-init + tr_dump_state module").
Retires the `WRX_RUN_OK` known-limitation in `python/wrxlib/README.md`.
Test plan
- `make -C wrx libwrxapi.so` builds cleanly
- `python/wrxlib` 56 PASS / 4 SKIP, no wrxlib failures
- `wrx_mcp::test_init_state_finalize` failure unchanged from develop (test calls `get_state` pre-run, hits `g_run_called` guard added in develop's PR #72, "feat(mcp): add tr MCP reference implementation + common design plan")
🤖 Generated with Claude Code
Note
Medium Risk
Touches core WRX execution/allocation logic and changes behavior of graphics calls under the `.so` path via environment gating; while targeted at crash fixes, it could affect numerical output or lifecycle behavior across runs in-process.
Overview
Fixes multiple `libwrxapi.so` crash paths in the `wrx_run`/`wrx_get_state` flow by disabling the `wrcalpwr` graphics block under the shared-library build (auto-set `WRX_NO_GRAPHICS` in `wrx_api_init`, and guard `PAGES`/`GRD1D`/`PAGEE` in `wrcalpwr.f90`).
Also hardens runtime safety: clamps `nstpmax_all` to prevent `xtemp`/`ytemp` out-of-bounds, guards ray-2 debug/plot access when `NRAYMAX=1`, fixes finalize→reinit use-after-free by requiring `ALLOCATED()` before early-return in `wr_allocate`, allocates/zero-inits per-species peak-power arrays used by `wrx_get_state`, and adds a full allocatable zero-init sweep to avoid heap-reuse divergence on reinit. A new env-gated `wrx_dump_state` module can append selected `wrcomm` state for debugging, and the `wrx_api_check_all` negative test runner now feeds enough stdin for repeated `wrx_init` cycles; docs update retires the `WRX_RUN_OK` gate as historical.
Reviewed by Cursor Bugbot for commit 81aa96e. Bugbot is set up for automated code reviews on this repo.
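The `nstpmax_all` clamp can be illustrated with a short arithmetic sketch. The real fix is Fortran in `wrcalpwr.f90`; what follows is a Python transliteration for illustration only, with hypothetical helper names and made-up step counts.

```python
def clamp_plot_steps(nstpmax_nray, nstpmax):
    """Return a step count that is safe for buffers dimensioned (0:nstpmax)."""
    nstpmax_all = max(nstpmax_nray) + 1   # mirrors MAXVAL(nstpmax_nray)+1
    return min(nstpmax_all, nstpmax)      # mirrors the MIN() cap


def make_buffer(nstpmax):
    # A Fortran array xtemp(0:nstpmax) has nstpmax + 1 elements,
    # so the valid indices are 0..nstpmax inclusive.
    return [0.0] * (nstpmax + 1)


nstpmax = 100
xtemp = make_buffer(nstpmax)

# One ray used every available step, so the uncapped count is 101,
# which is one past the last valid index of xtemp.
uncapped = max([100, 40]) + 1
capped = clamp_plot_steps([100, 40], nstpmax)

xtemp[capped] = 1.0            # in bounds after the clamp
print(uncapped, capped, len(xtemp))
```

Python raises `IndexError` on `xtemp[uncapped]`, whereas Fortran without bounds checking would write past the buffer, which is consistent with the overrun described in the summary.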