Skip to content

docs(fp): Phase L implementation plans for fp library-ization#5

Merged
k-yoshimi merged 3 commits into
developfrom
docs/fp-library-plans
Apr 18, 2026
Merged

docs(fp): Phase L implementation plans for fp library-ization#5
k-yoshimi merged 3 commits into
developfrom
docs/fp-library-plans

Conversation

@k-yoshimi
Copy link
Copy Markdown
Owner

@k-yoshimi k-yoshimi commented Apr 18, 2026

Summary

Phase L (library-ization) implementation plans for the fp module (Fokker-Planck solver). 8 sub-phases (L-0..L-7) modeled on the tr Phase L design (docs/superpowers/specs/2026-04-17-tr-library-design.md, merged via PR #3).

Sub-phases

File Sub-phase
2026-04-18-fp-library-L0-baseline.md Regression-test infrastructure (fp_iter01/jt60/dt1 baselines)
2026-04-18-fp-library-L1-graphics-split.md Makefile-only SRCS_CORE/SRCS_GRAPHICS/SRCS_MENU split
2026-04-18-fp-library-L2-c-abi-foundation.md fp_state.f90, fp_api.f90 stub, fp_api.h
2026-04-18-fp-library-L3-param-registry.md fp_param_registry.f90 (~25 params)
2026-04-18-fp-library-L4-shared-lib-build.md PIC build, libfpapi.so
2026-04-18-fp-library-L5-python-wrapper.md python/fplib/ (ctypes 2-layer)
2026-04-18-fp-library-L6-test-4layers.md 4-layer tests
2026-04-18-fp-library-L7-docs.md README + sweep notebook

Constraints respected

  • Tests mandatory in every sub-phase (4-layer pattern from tr)
  • Minimal code change (additive only; existing fp binary preserved; libfpapi.so added as new target)
  • L-0 includes Phase 0 equivalent since fp has no regression infra yet

Test plan

  • PR review for plan structure consistency with tr Phase L design
  • Verify each sub-phase has explicit test additions/verifications

Note

Low Risk
Documentation-only change that adds detailed implementation plans; no production code, build, or test runner behavior is modified in this PR.

Overview
Adds a full set of FP “Phase L” library-ization implementation plans (docs/superpowers/plans/2026-04-18-fp-library-L0 through L7). The documents lay out a staged approach covering regression baselining, Makefile source grouping, C ABI + shared library build, parameter registry, Python ctypes wrapper, 4-layer testing, and user-facing docs, with explicit file-by-file changes and validation steps per phase.

Reviewed by Cursor Bugbot for commit 1ac7533. Bugbot is set up for automated code reviews on this repo. Configure here.

@k-yoshimi
Copy link
Copy Markdown
Owner Author

@cursor review

8 sub-phases (L-0..L-7) modeled on the tr Phase L design in
docs/superpowers/specs/2026-04-17-tr-library-design.md, adapted for the fp
(Fokker-Planck) module.

L-0 includes regression-test infrastructure setup (no Phase 0 exists for fp
yet); subsequent sub-phases follow the same 5-function C ABI + Python ctypes
2-layer wrapper pattern as tr.

Tests are mandatory in every sub-phase (4-layer pattern: equivalence /
C ABI / Python wrapper / sweep smoke). Existing fp executable is preserved
throughout; libfpapi.so is added as a new build target.
Comment thread docs/superpowers/plans/2026-04-18-fp-library-L6-test-4layers.md Outdated
Comment thread docs/superpowers/plans/2026-04-18-fp-library-L2-c-abi-foundation.md Outdated
@k-yoshimi
Copy link
Copy Markdown
Owner Author

Pre-merge review (parent agent + code-reviewer agent)

PR を draft 化 します。merge 前に以下の修正が必要です。

Critical (compile-blocking)

  • L-2 fp_api.f90 の USE chain (fpprepfpcaleind) で fpcaleind モジュールが fp/ に存在しない(46 .f90 を Glob で確認)。make fp_api_check ステップで即 fail。
  • L-3 fp_param_registry.f90MODELGNSMAX を CASE 設定するが、fpcomm_parm には両方とも未宣言。fpcomm_parm の model switch は MODELE/MODELA/MODELC(NSM)/MODELR/MODELD/MODELS/MODELW(NSM)、species count は nsamax/nsbmax(小文字)。コンパイルエラー。
  • L-3 PA/PN/PZ/PTPR/PTPP/PTSplcomm_parm 由来で fpcomm_parm 経由では setter として使えない。USE plcomm_parm を別途追加するか、登録を見直し。

Major

  • L-4 Task 2.1〜2.7: lib*_pic.a のビルドを $(LD) $(LDFLAGS) lib*.a $(OBJS) で行っている。静的アーカイブは $(AR) rcs lib*.a $(OBJS) を使うべき(TR 設計 7.2 と同じ)。7 ライブラリ全てで誤り、後続の libfpapi.so リンクが fail。
  • L-2 PUBLIC :: fp_init_c, fp_run_c, ... だが C header / test code は fp_init, fp_run, ... を使用。BIND(C, NAME="fp_init") で解決されるが命名混乱。
  • L-6 _helpers.py の baseline loader は metrics.jsonscalars / profile キーがある前提だが L-0 で schema を定義していない。extract_fp_metrics.py の出力次第で KeyError
  • L-5 ITER01_PARAMS fixture が L-3 の壊れた MODELG/NSMAX を継承。

Minor

  • L-7 doc/fp-library.md(単数)作成だがプロジェクトは docs/(複数)。

修正方針

  • L-3 の登録前に fpcomm_parm / plcomm_parm を grep で実シンボル確認するステップを追加
  • L-4 全 lib*_pic.a の生成コマンドを $(AR) rcs に統一
  • L-6 で metrics.json の schema を L-0 に明文化
  • L-7 ディレクトリ名を docs/ に修正

修正後 gh pr ready で draft 解除して再レビューします。

@k-yoshimi k-yoshimi marked this pull request as draft April 18, 2026 06:20
@k-yoshimi k-yoshimi marked this pull request as ready for review April 18, 2026 06:47
@k-yoshimi
Copy link
Copy Markdown
Owner Author

@cursor review

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

2 issues from previous reviews remain unresolved.

Fix All in Cursor

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit a3c17e8. Configure here.

@k-yoshimi
Copy link
Copy Markdown
Owner Author

@cursor review

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 1ac7533. Configure here.

@k-yoshimi k-yoshimi merged commit c1eb651 into develop Apr 18, 2026
1 check passed
@k-yoshimi k-yoshimi deleted the docs/fp-library-plans branch April 18, 2026 15:42
k-yoshimi added a commit that referenced this pull request May 4, 2026
…188)

* docs(spec): add L-7b-ii BPSD broker coupling verification design

Brainstorm-stage spec for L-7b-ii: orchestrator-level verification of
the eq -> tr BPSD coupling. MVP scope ("A-medium"):

  - new non-mutating Fortran helper tr_check_bpsd_pull (4-slot check)
  - C ABI export + Python Trlib.check_bpsd_pull() wrapper
  - CouplingRule extended with kind/verify fields
  - new ("eq","tr") verify rule in COUPLING_RULES
  - 3-layer test strategy (mock dispatch / unit / integration)

Codex independent reviewed in 4 rounds (overall design + sec 6 + sec 7 + sec 8); all
HIGH/MED findings addressed:

  - non-mutating BPSD pull (per-slot ierr accumulation, no TRCOMM mutation)
  - existing TotPipelineCouplingError reused (no new exception class)
  - 3-level exception chain via existing broad except
  - MODELG=0 + eq->tr -> CouplingError as acceptance criterion
  - wr-side BPSD speaker pending evaluation (L-7a divergence-risk
    judgement preserved; re-evaluation trigger documented)

Implementation risks (4) carry explicit fallbacks. Out-of-scope items
explicitly cross-reference L-7a / L-7b-i specs by section name (not
file:line) for churn resilience.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(spec): L-7b-ii self-review fixes (BS7)

  - replace MAX_NRHO placeholder with reference to existing BPSD
    callers' allocation pattern
  - reconcile test case count (12 -> 11; C-2 documented but not
    implemented per drop note)
  - clarify error message format uses callable_repr (with
    repr() fallback for non-qualname callables like partial)

No semantic change; corrects three minor inconsistencies found in
the brainstorming spec self-review pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(spec): L-7b-ii final Codex review fixes

Six fixes from final Codex full-spec review (HIGH 1, MED 3, LOW 2):

  HIGH: verify checks 3 eq-pushed slots only (device/equ1D/metric1D),
        not 4. plasmaf is tr's own BPSD output (tr_bpsd_put), absent
        on a fresh eq->tr pipeline -- including it would cause
        spurious False on first-time pipelines. Updated §1 data flow,
        §2 Fortran helper, §3 Python wrapper docstring, C prototype.

  MED-1: __post_init__ now validates transform is non-None for
        kind=transfer (was missing -- transfer dispatch unconditionally
        calls rule.transform(raw), would crash on None default).

  MED-2: AC6 clarified -- Layer C drop decision is plan-time, not
        post-MVP. Either Layer C is required (Eq wrapper sufficient)
        or removed from MVP scope (insufficient + follow-up PR).
        Removes the 'both required and droppable' ambiguity.

  MED-3: §8.4 risk #5 added: stale BPSD slot data across tests/runs.
        Layer A/B unaffected; Layer C uses --forked or per-test BPSD
        reset for isolation.

  LOW-1: line 76 estimate table 12 -> 11 cases (matches §7 count
        after C-2 drop).

  LOW-2: @DataClass(frozen=True) preserved in §4 dataclass; explicit
        note that __post_init__ raises ValueError only (no field
        mutation) so frozen remains compatible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(spec): L-7b-ii confirmation review fixes

3 missed fixes from confirmation review:

  HIGH (still-broken): 'all four' verbiage at line 126 + plasmaf in
        rule.doc string at lines 247-255 -- both fixed. Remaining
        plasmaf mentions are all explicit 'intentionally excluded'
        rationale (intentional, kept).

  MED-1 (partial): added type-narrowing asserts in §5 dispatch
        snippet -- assert rule.transform is not None / assert
        rule.dst_param is not None -- to satisfy mypy/pyright on the
        Optional[Callable] / Optional[str] fields. __post_init__
        guarantees they hold for kind=transfer.

  MED-2 (partial): §1 in-scope clarified -- Layer A+B unconditional,
        Layer C conditional on plan-time Eq wrapper sufficiency
        check (cross-references AC6 + §8.4 risk #2). Resolves the
        'integration coverage promised but droppable' contradiction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(spec): L-7b-ii final assert narrowing for rule.verify

One-line fix from final confirmation review: §5 dispatch verify
branch was missing 'assert rule.verify is not None' for static
checker narrowing on the Optional[Callable] field. Symmetric to
the transfer-branch asserts already present.

Codex verdict after this fix: READY-FOR-IMPLEMENTATION.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(plan): add L-7b-ii BPSD broker coupling implementation plan

Implementation plan for the L-7b-ii spec
(docs/superpowers/specs/2026-05-03-l7b-ii-bpsd-broker-coupling-design.md).

4-phase commit pattern mirroring L-7b-i (PR #187):

  Phase 1: tr_check_bpsd_pull Fortran helper + C ABI + Python wrapper
  Phase 2: CouplingRule kind/verify extension + dispatch + ('eq','tr') rule
  Phase 3: 11 test cases across 3 layers + README updates
  Phase 4: pre-push gate + PR open + CI/Bugbot wait + squash merge

Plan-time risks resolved at write-time:

  - Eq wrapper sufficient (set_param/set_param_str/run/get_state all
    present in python/eqlib/eqlib.py); Layer C in scope
  - bpsd_get_data is INTENT(INOUT), self-allocates when nrmax=0;
    helper sets local%nrmax=0 explicitly
  - g_initialized/g_prepared lifecycle flags exist in tr/tr_api.f90:64-65

Plan-time risk deferred to implementation:

  - Layer C runtime (< 5s estimate unmeasured); fallback is
    @pytest.mark.slow gating per spec §8.4

Step granularity bite-sized (read context -> edit -> verify -> commit)
per writing-plans skill convention. Suitable for executing-plans or
subagent-driven-development.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(plan): L-7b-ii Codex review fixes

Six fixes from Codex plan review (HIGH 2, MED 3, LOW 1):

  HIGH-1: pre-push pytest commands now include --forked --timeout=120
        --timeout-method=signal per CLAUDE.md non-negotiable. Note
        added on pytest-forked plugin install + documented exception
        when plugin unavailable.

  HIGH-2: reviewer agent type Agent(subagent_type='feature-dev:
        code-reviewer', ...) per CLAUDE.md. Fallback note for
        environments where feature-dev: is unavailable points to
        superpowers:code-reviewer.

  MED-1: Layer C --forked is now required (not advisory). Without
        BPSD broker isolation, C-3 (MODELG=0 expected failure) may
        spuriously pass after C-1 populated slots.

  MED-2: base SHA references updated f3dafae -> bdc0913 (the plan
        commit itself is the new chore tip, so executor will branch
        from it). Phase 0/1.5/2.4/3.6 squash bases corrected.

  MED-3: spec-coverage matrix entry for §7 MODELG fixture corrected
        to acknowledge deviation: plan uses local _eq_params_modelg()
        helper instead of mutating shared tot_demo2014_params.py
        (rationale: avoid Layer 1 baseline side effects). eqdata path
        IS still reused.

  LOW-1: Layer A case count clarified -- 11 logical cases (A-5
        implemented as 3 sub-tests), 13 individual pytest tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(plan): align Phase 4.1 Step 2 --timeout to CLAUDE.md (=120)

Final fix from confirmation review: Phase 4.1 Step 2 used
--timeout=60 (inconsistent with Step 1/3 and CLAUDE.md CI flags).
Aligned to --timeout=120.

Plan now READY-FOR-EXECUTION per Codex.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(plan): align all pytest commands to CLAUDE.md flags

5 additional pytest commands (Tasks 1.4, 2.2, 2.3, 3.1, 3.2 sanity
checks) updated to --forked --timeout=120 --timeout-method=signal,
matching CI workflow and Phase 4 pre-push gate. Per CLAUDE.md
non-negotiable: 'Local test: run the relevant pytest locally with
the SAME flags the CI workflow uses.'

All 9 pytest invocations in the plan now consistent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* tr+trlib: add tr_check_bpsd_pull BPSD broker verification helper (L-7b-ii)

New non-mutating Fortran helper tr_check_bpsd_pull pulls the 3
eq-pushed BPSD slots (device, equ1D, metric1D) into local
discardable types and reports ok = 1 iff all 3 per-slot ierr == 0.
plasmaf is intentionally NOT checked: it is tr's own BPSD output
(tr_bpsd_put), absent on a fresh eq->tr pipeline.

Implementation notes:
- Local types initialized with %nrmax = 0 to trigger BPSD's
  self-allocation path (per ../../bpsd/bpsd_equ1D.f90:143-150,
  bpsd_get_* is INTENT(INOUT) and allocates internally when
  the caller passes nrmax=0).
- Per-slot ierr accumulated to AND-condition (avoids masking
  earlier failures by later successes).
- TRCOMM untouched -- safe to call before or after tr.run().
- TR_STATE_ABI_VERSION not bumped (no struct change).

Exposed via C ABI (tr/tr_api.h) and Python wrapper
(Trlib.check_bpsd_pull). Smoke-tested: fresh Trlib() init
returns False (BPSD slots empty); existing trlib tests unchanged.

Spec: docs/superpowers/specs/2026-05-03-l7b-ii-bpsd-broker-coupling-design.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* pipeline: extend CouplingRule for verify rules (kind/verify dispatch) (L-7b-ii)

CouplingRule dataclass gains two new fields:

  kind: str = "transfer"   # "transfer" | "verify"
  verify: Optional[Callable[[Any], bool]] = None

Existing transfer-rule fields (src_state_key, dst_param, transform)
are defaulted to None and validated by __post_init__:

  * kind="transfer" requires all three transfer fields
  * kind="verify"   requires verify callable
  * unknown kind raises ValueError

@DataClass(frozen=True) preserved; __post_init__ only raises and
never mutates self, so frozen remains compatible.

run_pipeline rule iteration gains a kind branch:

  * transfer: existing extract -> transform -> set_param -> record
  * verify:   ok = rule.verify(curr_inst); raise CouplingError if
              False or if verify itself raised.
              applied.append only on success (matches transfer-rule
              "succeeded snapshot" semantics of PipelineStep.coupling_applied)

Both error paths flow through the existing broad except at
pipeline.py and become TotPipelineRunError with __cause__ chain
(3-level: RunError -> CouplingError -> OriginalError if any).

Scope note (L-7b-ii deferred at registry level):

  The original spec proposed registering an ('eq','tr') verify rule
  using BPSD as a cross-module broker (Trlib.check_bpsd_pull). During
  implementation we found that libeqapi.so and libtrapi.so each carry
  a private copy of the BPSD module-level state -- ___bpsd_equ1d_MOD_
  equ1dx is static (s) in nm output of both shared objects. Therefore
  bpsd_put from libeqapi.so does NOT propagate to libtrapi.so's
  bpsd_get, and the verify rule cannot succeed under the per-module
  TotPipeline architecture. The legacy Tot / libtotapi.so path keeps
  eq+tr+bpsd co-linked and is unaffected.

  Until a cross-.so sharing scheme (unified .so, IPC, or RTLD_GLOBAL
  with weak symbols) is decided, no ('eq','tr') rule is registered.
  The verify dispatch infrastructure is in place and mock-tested
  (Layer A) so that the rule can be added later without further
  changes to pipeline.py.

One existing test (test_run_pipeline_missing_source_state_raises_
coupling_error) was updated to pass an explicit transform=lambda v: v
to satisfy the new __post_init__ contract; it had previously
relied on the old identity default.

Existing ("fp","tr") transfer rule and Layer 1 baselines (demo2014,
ht6m at 1e-10) are unaffected.

Spec: docs/superpowers/specs/2026-05-03-l7b-ii-bpsd-broker-coupling-design.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test+docs: Layer A/B BPSD coupling verify tests + READMEs (L-7b-ii)

Adds 11 new test cases across 2 layers (Layer C deferred — see below):

  Layer A (test_pipeline_verify.py, no .so): 8 mock-based dispatch
    tests covering verify-True success, verify-False raise+cause-chain,
    verify-raise 3-level chain, mixed transfer+verify ordering,
    __post_init__ validation (3 cases), unregistered-pair silent skip.

  Layer B (test_bpsd_check.py, libtrapi.so): 3 unit tests for
    Trlib.check_bpsd_pull() -- fresh init returns False, closed
    Trlib raises TrlibError, no exception leak from Fortran.

Also updates test_pipeline_dataclasses::test_coupling_rule_defaults
to pass an explicit transform=lambda v: v: the previous version
relied on the old default `transform = lambda v: v`, which is no
longer a default per the new __post_init__ contract introduced in
the previous commit.

Layer C (eq + tr integration) deferred:

  実装中の調査で `libeqapi.so` と `libtrapi.so` が独立した .so で、
  それぞれが BPSD module-level state (`___bpsd_equ1d_MOD_equ1dx` ほか)
  を private に持つことが判明 (nm で確認、両 .so で static linkage)。
  spec が想定した「BPSD = eq/tr 共有ブローカー」は
  `Tot` / `libtotapi.so` (eq+tr+bpsd 同梱) では成立するが、
  `TotPipeline` (per-module .so) では成立しない。
  したがって ('eq','tr') verify ルールは本 PR では登録せず、
  Layer C 統合テストも保留 (実行しても push が反映されないので、
  C-1 は構造的にパスしえず、C-3 は誤った理由で pass する)。

  verify ディスパッチの機構自体は本 PR で完成済み (Layer A 網羅)
  なので、共有方針が決まり次第 ('eq','tr') ルールは
  COUPLING_RULES に 1 行追加するだけで有効化できる。
  詳細は totlib/README.md の Coupling rules 節と
  trlib/README.md の "BPSD ブローカー pull 検証" 節を参照。

README updates:

  python/totlib/README.md: documents the kind="transfer" / "verify"
    distinction in the Coupling rules section, plus a Japanese-language
    deferral note explaining the per-.so BPSD isolation finding.

  python/trlib/README.md: new "BPSD ブローカー pull 検証" section
    (Japanese) documenting Trlib.check_bpsd_pull() with usage example
    and the same scope caveat about per-.so isolation.

Existing Layer 1 baselines (demo2014, ht6m at 1e-10) and existing
pipeline tests are unaffected (the verify dispatch doesn't fire when
no rule is registered for the pair).

Spec: docs/superpowers/specs/2026-05-03-l7b-ii-bpsd-broker-coupling-design.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(L-7b-ii): address reviewer findings (MED + LOW polish)

Codex MED1 (older .so compatibility): wrap tr_check_bpsd_pull
ctypes prototype attachment in try/except, mirroring the existing
tr_validate pattern in _ffi.py. Older builds of libtrapi.so that
predate L-7b-ii will now defer the AttributeError to first call of
Trlib.check_bpsd_pull instead of failing at module import.

Codex MED3 (chain-depth accuracy): totlib/README's "3-level cause
chain" was inaccurate for the False-return path. Clarify that
False produces a 2-level chain (TotPipelineRunError ->
TotPipelineCouplingError, no original cause), while a verify()
exception produces the 3-level chain (... -> original exception).

In-house L-1 (docstring count): test_pipeline_verify.py header
said "Six cases" but defines 8 test functions because A-5 expands
into three separate validation tests. Updated the docstring to
clearly distinguish "six logical cases" from "eight test
functions" with a per-case description.

Codex MED2 (spec staleness): added §0 (in Japanese, matching
totlib/trlib README convention) at the top of the spec doc
recording the implementation-time deferral, summarising what
shipped vs deferred and listing candidate resolution paths
(shared .so / IPC / RTLD_GLOBAL+weak / serialisation). Status
field changed from "Draft" to "Partial".

Verification:
- Layer A (8) + Layer B (3) + dataclasses (8) + pipeline (30) +
  equivalence (2) = 51 passed; 1 pre-existing Py3.10
  ExceptionGroup test still fails (documented in plan).
- Trlib() smoke load + check_bpsd_pull still returns False
  on fresh init.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant