Skip to content

fix(heck-relay): stop overwriting Rosales OPT params with Seminario projections#280

Merged
ericchansen merged 1 commit into
masterfrom
diag/heck-relay-seminario
May 25, 2026
Merged

fix(heck-relay): stop overwriting Rosales OPT params with Seminario projections#280
ericchansen merged 1 commit into
masterfrom
diag/heck-relay-seminario

Conversation

@ericchansen
Copy link
Copy Markdown
Owner

Resolves #277.

TL;DR

The Heck-relay loader was silently corrupting the published Rosales OPT force field by overwriting its OPT parameters with raw Seminario projections. Fixing this single line drops the ObjectiveFunction by 43×, takes JaxLoss from "diverges by 21 orders of magnitude" to ratio ≈ 1.29, and brings bond_length R² from −48 to 0.98.

The bug

ff_template = ForceField.from_mm3_fld(ff_path, include_standard=True)
ff_opt = ForceField.from_mm3_fld(ff_path, include_standard=False)
ff_template.freeze_standard_params(ff_opt)
ff = estimate_force_constants(molecules, forcefield=ff_template, ...)  # 💥

freeze_standard_params marks the non-OPT params (standard MM3 backbone) as frozen and leaves the OPT params active. estimate_force_constants skips frozen params (see q2mm/models/seminario.py:341) and re-estimates the active ones — overwriting the carefully-fitted Rosales OPT values with raw Seminario projections.

Three-baseline diagnostic

The new scripts/diagnose_heck_relay.py builds three FFs and evaluates each (no optimization, just evaluation). Output committed to q2mm-data/benchmarks/heck-relay/diagnostic/:

Baseline obj ratio R²(bond_len) R²(bond_ang) R²(eig_diag)
A — Untouched Rosales 3.42 × 10⁶ 1.31 0.980 0.786 −12.6
B — Current loader 1.17 × 10⁸ NaN −4787 −7.25 −4.66
C — Seminario only 1.24 × 10⁸ 1.77 × 10⁷³ −849 −5.13 0.05

A's bond_length R² (0.980) is dramatically better than B's (−4787) — matching the issue's decision rule "A fine, B explodes → narrow on Seminario behavior with freeze_standard_params".

The fix

Keep freeze_standard_params (still needed for the optimizer's active mask) but skip estimate_force_constants. Use the published Rosales OPT parameters as-is.

Post-fix baseline

(Companion q2mm-data PR regenerates the heck-relay convergence JSONs.)

Metric Before (loader bug) After (this PR)
Initial ObjectiveFunction 1.48 × 10⁸ 3.46 × 10⁶
Initial JaxLoss 1e30 (penalty) 4.48 × 10⁶
Ratio ~10²¹ 1.29
Status out_of_band out_of_band (12 % over band)
bond_length R² −48.06 0.980
bond_angle R² −5.47 0.786
eig_diagonal R² −4.66 −12.6

The eigenmatrix actually gets worse (−4.66 → −12.6), but that just reflects the real MM3* ↔ JAX-engine cross-engine gap that's now visible without being masked by the corrupted FF. Geometry is now well-reproduced and JaxLoss is no longer divergent.

Out of scope

  • The same freeze_standard_params + estimate_force_constants pattern exists in load_rh_enamide and the load_pd_* loaders. They aren't catastrophically broken today (Rh-enamide runs at R² = 0.97, Pd-allyl at −1.41), but the same audit should probably extend to them in a follow-up.
  • Heck-relay now joins pd-conjugate as a candidate for the ratio_tol=None experiment (#276).

Validation

  • python -m ruff check q2mm/ test/ scripts/ — clean.
  • python -m ruff format --check q2mm test scripts examples — clean.
  • python -m pytest test/ -x -q -m "not (openmm or tinker or jax or jax_md or psi4)" -k "heck or systems" — 11 passed.
  • End-to-end scripts/regenerate_convergence_results.py --system heck-relay --skip-optimization ran clean on a single GPU; output (with full provenance) is in the companion q2mm-data PR.

Companion PR

ericchansen/q2mm-data#? — adds the diagnostic JSON and the regenerated convergence baseline.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes the heck-relay system loader so it no longer corrupts the published Rosales OPT force field by re-running Seminario projections over active OPT parameters, and updates diagnostics/docs to reflect the corrected baseline behavior.

Changes:

  • Stop calling estimate_force_constants() in load_heck_relay() while still using freeze_standard_params() to preserve the intended active-parameter mask.
  • Add a one-off diagnostic script to compare multiple heck-relay force-field baselines and write a JSON report with provenance.
  • Update heck-relay and optimizer-comparison documentation to reflect the corrected baseline metrics and ratio behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
scripts/diagnose_heck_relay.py Adds a three-baseline diagnostic + JSON/provenance output for heck-relay.
q2mm/diagnostics/systems.py Fixes heck-relay loader to keep Rosales OPT parameters as-published (no Seminario overwrite).
docs/systems/heck-relay.md Updates heck-relay system page to describe the loader fix and new baseline metrics.
docs/benchmarks/optimizer-comparison.md Updates cross-system comparison tables/text for the new heck-relay ratio and R² values.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread scripts/diagnose_heck_relay.py Outdated
Comment thread scripts/diagnose_heck_relay.py
Comment thread docs/systems/heck-relay.md Outdated
Comment thread docs/systems/heck-relay.md Outdated
Comment thread q2mm/diagnostics/systems.py
…rojections

Resolves #277.

The Heck-relay loader silently corrupted the published Rosales OPT
force field. The sequence

    ff_template = ForceField.from_mm3_fld(ff_path, include_standard=True)
    ff_opt = ForceField.from_mm3_fld(ff_path, include_standard=False)
    ff_template.freeze_standard_params(ff_opt)
    ff = estimate_force_constants(molecules, forcefield=ff_template, ...)

freezes all *non*-OPT params (standard MM3 backbone) and leaves the
OPT-substructure params active. `estimate_force_constants` then skips
the frozen params (q2mm/models/seminario.py:341) and re-estimates the
*active* (= OPT) ones — overwriting the carefully-fitted Rosales values
with raw Seminario projections.

The new three-baseline diagnostic
(`scripts/diagnose_heck_relay.py`, committed alongside this fix and
mirrored to `q2mm-data/benchmarks/heck-relay/diagnostic/`) confirms
the divergence:

  Baseline                obj          ratio      R²(bond_len)
  A_untouched_rosales     3.42e+06     1.31         0.980
  B_current_loader        1.17e+08     n/a (NaN)   -4787.5
  C_seminario_only        1.24e+08     1.77e+73    -849.0

Fix: keep `freeze_standard_params` (still needed for the optimizer's
active mask) but skip `estimate_force_constants`. Use the Rosales OPT
values as-published.

Post-fix regenerated baseline (in companion q2mm-data PR):

  obj: 1.48e+08 → 3.46e+06 (43x smaller)
  ratio: ~10^21 (sanitized 1e30) → 1.29 (finite, real)
  ratio_status: out_of_band → out_of_band (still 12% over the band)
  bond_length R²: -48.06 → 0.980
  bond_angle R²:  -5.47  → 0.786
  eig_diagonal R²: -4.66 → -12.62 (worse; reflects the real
    MM3* ↔ JAX-engine cross-engine gap, not a loader bug)

Heck relay now sits with pd-conjugate as a candidate for the
ratio_tol=None experiment (issue #276): the gate barely misses
instead of catastrophically failing.

Out of scope (per #277): the same `freeze_standard_params +
estimate_force_constants` pattern exists in load_rh_enamide and
load_pd_*. Those systems are not affected catastrophically (Rh-enamide
runs at R²=0.97 and Pd-allyl at -1.41 today) but the same audit
should probably extend to them in a follow-up.

Also includes:
- scripts/diagnose_heck_relay.py — the diagnostic script (committed
  per #277 acceptance criteria, output mirrored to q2mm-data).
- docs/systems/heck-relay.md — updated to reflect the fix.
- docs/benchmarks/optimizer-comparison.md — heck-relay row and notes
  updated; per-category R² table updated to current numbers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@ericchansen ericchansen force-pushed the diag/heck-relay-seminario branch from 986ec19 to 4e86326 Compare May 25, 2026 14:20
@ericchansen ericchansen merged commit f27e33c into master May 25, 2026
11 checks passed
@ericchansen ericchansen deleted the diag/heck-relay-seminario branch May 25, 2026 14:26
ericchansen added a commit that referenced this pull request May 26, 2026
…oad_system

Replaces six per-system loaders + a private _load_wahlers_system helper
with a single declarative dispatch system, addressing the
inconsistencies (load_X vs _load_X_system, divergent shapes) and the
copy-paste between similar loaders.

New module: q2mm/models/loaders.py
  - Strategy primitives, named after what they do (no implicit ordering
    that produced the q2mm#277 bug):
      load_published_opt(ff_path)
      load_qfuerza_only(molecules, ff_path)
      load_qfuerza_fresh(molecule)
      compose_opt_with_mm3_base(opt_path, base_path, *, metal)
      load_published_opt_composed(opt_path, base_path, *, metal)
  - Each docstring cites Farrugia 2025 (the QFUERZA paper, AGENTS.md
    "Key Papers") and documents the returned FF's frozen/active
    invariants.

Rewritten q2mm/diagnostics/systems.py:
  - New `SystemSpec` dataclass replaces `BenchmarkSystem`.  Each spec
    declares: molecule_loader callable, ff_strategy literal, ff_paths
    mapping, metadata dict, optional metal symbol.
  - New `load_system(key, *, engine=None, functional_form=None,
    molecule_loader_kwargs=None)` is the single loader entry point.
    Reference construction is uniform across multi-molecule systems
    (`ReferenceData.from_molecules` with eigenmatrix-diagonal=True);
    single-molecule systems with frequency-only references (ch3f) take
    the `engine` argument and use `_build_frequency_reference`.

Deleted (no shims, alpha discipline):
  - load_ch3f, load_rh_enamide, load_heck_relay, load_pd_allyl,
    load_pd_conjugate, load_rh_conjugate (six public functions).
  - _load_wahlers_system (private helper).
  - BenchmarkSystem dataclass (replaced by SystemSpec).

Survives because still per-system:
  - load_X_molecules() functions (genuinely different file
    formats/layouts per system; referenced by SystemSpec.molecule_loader).
  - _load_gaussian_molecules, _load_ch3f_molecules, _load_wahlers_molecules
    (factored out of the deleted loaders).

Per-system code in SYSTEMS shrinks to ~10 lines of declarative spec
per system (vs ~40-60 lines of repetitive procedure in the old design).

Caller migration in the same commit (no shims):
  - q2mm/diagnostics/cli.py:    SYSTEMS[k].loader(engine) → load_system(k, engine=engine, ...)
                                BenchmarkSystem → SystemSpec
                                special-cased load_ch3f(data_dir=...) → load_system("ch3f", molecule_loader_kwargs={"data_dir": ...})
  - q2mm/diagnostics/__init__.py: re-export SystemSpec + load_system instead of BenchmarkSystem.
  - scripts/regenerate_convergence_results.py: SYSTEMS[k].loader → load_system.
  - scripts/bench_composed.py: load_ch3f → load_system("ch3f").
  - scripts/run_rh_benchmarks.sh: embedded Python — load_rh_enamide → load_system("rh-enamide").
  - test/integration/test_heck_validation.py regression: load_heck_relay → load_system.
  - test/integration/test_backend_optimizer_matrix.py: load_ch3f → load_system.
  - test/benchmarks/test_optimization.py: same.

Smoke tests:
  - `scripts/regenerate_convergence_results.py --system ch3f --skip-optimization`
    runs end-to-end and produces the same baseline numbers
    (ratio=1.000, ok, single 'frequency' category).
  - Full unit suite: 656 passed, 272 skipped.
  - Ruff check + format: clean.

Behavior changes (intentional, per plan, will surface in commit 5):
  - load_rh_enamide previously called qfuerza_into on a copy of the
    rh-enamide mm3.fld — i.e. it OVERWROTE the Donoghue OPT values
    with raw FUERZA projections.  After this commit it uses
    load_published_opt, which preserves the published OPT values.
    Numbers for rh-enamide will move in commit 5 (regeneration).
  - load_pd_allyl, load_pd_conjugate, load_rh_conjugate had the same
    overwrite pattern via _load_wahlers_system + qfuerza_into.  After
    this commit they use load_published_opt_composed and preserve the
    Wahlers OPT values.  Numbers will move in commit 5.
  - load_heck_relay is unchanged in behavior — its loader was already
    fixed in PR #280 to skip qfuerza_into.  This commit just routes it
    through the load_published_opt strategy.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ericchansen added a commit that referenced this pull request May 26, 2026
…h + qfuerza_into

The old ``estimate_force_constants(molecules, forcefield=None, ...)``
overloaded two completely different operations on one signature:

  forcefield=None  → build a brand-new FF (single molecule only).
  forcefield=ff    → mutate ff (well, ff.copy()) — silently overwriting
                     every "unfrozen" param.

The second mode is exactly the footgun that produced the q2mm#277
Heck-relay bug.  Splitting the function into two names with two
narrower contracts makes the intent explicit at every call site:

  qfuerza_fresh(molecule, ...) -> ForceField
      Build a fresh FF from one molecule's QM Hessian.  Errors if
      multiple molecules are passed.

  qfuerza_into(ff, molecules, ...) -> None
      Overwrite the values of every *unfrozen* parameter in ``ff``
      using the QFUERZA projection.  Frozen params are skipped (the
      load_heck_relay invariant).  Mutates ff in place.  No return.

Both names cite Farrugia et al. 2025 (J. Chem. Theory Comput.
22, 469, DOI 10.1021/acs.jctc.5c01751), the methodology paper of
record per AGENTS.md.  The estimate_force_constants name is deleted
outright — no shim, per alpha discipline.

Other internal renames in seminario.py to match the new vocabulary
(TypeError message, two docstring/comment refs).  __init__.py public
API now exports ``qfuerza_fresh`` and ``qfuerza_into`` instead of
``estimate_force_constants``.

Migration of all 85 call sites is mechanical:

  fresh path: estimate_force_constants(mol, ...) → qfuerza_fresh(mol, ...)
  into path:  ff = estimate_force_constants(mols, forcefield=ff_template, ...)
              → ff = ff_template.copy()
                qfuerza_into(ff, mols, ...)

19 files touched.  The Heck-relay loader behavior is unchanged
(it doesn't call into qfuerza_into at all post-#280); the rh-enamide
and Wahlers loaders still have the same shape, with
``forcefield=ff_template, ...`` → ``copy + qfuerza_into`` and same
semantics.

Tests:
- Existing unit test renamed to
  ``test_qfuerza_into_does_not_overwrite_frozen_opt_block``;
  same regression coverage for the q2mm#277 pattern.
- All 656 unit tests pass; ruff check + format pass clean.

No behavior change visible from outside the call sites — same loops
inside ``_estimate_into_ff`` (the renamed body), same skip-frozen
short-circuits.  The Heck-relay regression test from PR #280
continues to pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ericchansen added a commit that referenced this pull request May 26, 2026
…oad_system

Replaces six per-system loaders + a private _load_wahlers_system helper
with a single declarative dispatch system, addressing the
inconsistencies (load_X vs _load_X_system, divergent shapes) and the
copy-paste between similar loaders.

New module: q2mm/models/loaders.py
  - Strategy primitives, named after what they do (no implicit ordering
    that produced the q2mm#277 bug):
      load_published_opt(ff_path)
      load_qfuerza_only(molecules, ff_path)
      load_qfuerza_fresh(molecule)
      compose_opt_with_mm3_base(opt_path, base_path, *, metal)
      load_published_opt_composed(opt_path, base_path, *, metal)
  - Each docstring cites Farrugia 2025 (the QFUERZA paper, AGENTS.md
    "Key Papers") and documents the returned FF's frozen/active
    invariants.

Rewritten q2mm/diagnostics/systems.py:
  - New `SystemSpec` dataclass replaces `BenchmarkSystem`.  Each spec
    declares: molecule_loader callable, ff_strategy literal, ff_paths
    mapping, metadata dict, optional metal symbol.
  - New `load_system(key, *, engine=None, functional_form=None,
    molecule_loader_kwargs=None)` is the single loader entry point.
    Reference construction is uniform across multi-molecule systems
    (`ReferenceData.from_molecules` with eigenmatrix-diagonal=True);
    single-molecule systems with frequency-only references (ch3f) take
    the `engine` argument and use `_build_frequency_reference`.

Deleted (no shims, alpha discipline):
  - load_ch3f, load_rh_enamide, load_heck_relay, load_pd_allyl,
    load_pd_conjugate, load_rh_conjugate (six public functions).
  - _load_wahlers_system (private helper).
  - BenchmarkSystem dataclass (replaced by SystemSpec).

Survives because still per-system:
  - load_X_molecules() functions (genuinely different file
    formats/layouts per system; referenced by SystemSpec.molecule_loader).
  - _load_gaussian_molecules, _load_ch3f_molecules, _load_wahlers_molecules
    (factored out of the deleted loaders).

Per-system code in SYSTEMS shrinks to ~10 lines of declarative spec
per system (vs ~40-60 lines of repetitive procedure in the old design).

Caller migration in the same commit (no shims):
  - q2mm/diagnostics/cli.py:    SYSTEMS[k].loader(engine) → load_system(k, engine=engine, ...)
                                BenchmarkSystem → SystemSpec
                                special-cased load_ch3f(data_dir=...) → load_system("ch3f", molecule_loader_kwargs={"data_dir": ...})
  - q2mm/diagnostics/__init__.py: re-export SystemSpec + load_system instead of BenchmarkSystem.
  - scripts/regenerate_convergence_results.py: SYSTEMS[k].loader → load_system.
  - scripts/bench_composed.py: load_ch3f → load_system("ch3f").
  - scripts/run_rh_benchmarks.sh: embedded Python — load_rh_enamide → load_system("rh-enamide").
  - test/integration/test_heck_validation.py regression: load_heck_relay → load_system.
  - test/integration/test_backend_optimizer_matrix.py: load_ch3f → load_system.
  - test/benchmarks/test_optimization.py: same.

Smoke tests:
  - `scripts/regenerate_convergence_results.py --system ch3f --skip-optimization`
    runs end-to-end and produces the same baseline numbers
    (ratio=1.000, ok, single 'frequency' category).
  - Full unit suite: 656 passed, 272 skipped.
  - Ruff check + format: clean.

Behavior changes (intentional, per plan, will surface in commit 5):
  - load_rh_enamide previously called qfuerza_into on a copy of the
    rh-enamide mm3.fld — i.e. it OVERWROTE the Donoghue OPT values
    with raw FUERZA projections.  After this commit it uses
    load_published_opt, which preserves the published OPT values.
    Numbers for rh-enamide will move in commit 5 (regeneration).
  - load_pd_allyl, load_pd_conjugate, load_rh_conjugate had the same
    overwrite pattern via _load_wahlers_system + qfuerza_into.  After
    this commit they use load_published_opt_composed and preserve the
    Wahlers OPT values.  Numbers will move in commit 5.
  - load_heck_relay is unchanged in behavior — its loader was already
    fixed in PR #280 to skip qfuerza_into.  This commit just routes it
    through the load_published_opt strategy.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ericchansen added a commit that referenced this pull request May 26, 2026
…oad_system

Replaces six per-system loaders + a private _load_wahlers_system helper
with a single declarative dispatch system, addressing the
inconsistencies (load_X vs _load_X_system, divergent shapes) and the
copy-paste between similar loaders.

New module: q2mm/models/loaders.py
  - Strategy primitives, named after what they do (no implicit ordering
    that produced the q2mm#277 bug):
      load_published_opt(ff_path)
      load_qfuerza_fresh(molecule)
      compose_opt_with_mm3_base(opt_path, base_path, *, metal)
        -> (composed, opt_only)
      load_published_opt_composed(opt_path, base_path, *, metal)
  - Each docstring cites Farrugia 2025 (the QFUERZA paper, AGENTS.md
    "Key Papers") and documents the returned FF's frozen/active
    invariants.
  - Module-level `_METAL_VDW` dict holds published metal vdW parameters
    for systems whose MM3 base file lacks them (today only PD from
    Rosales 2020).  Lives here rather than in `q2mm/diagnostics/`
    so the dependency direction is models <- diagnostics, not the
    reverse.

Rewritten q2mm/diagnostics/systems.py:
  - New `SystemSpec` dataclass replaces `BenchmarkSystem`.  Each spec
    declares: molecule_loader callable, ff_strategy literal, ff_paths
    mapping (uniformly `Callable[[], Path]`, no Path|Callable sum
    type), optional `normal_modes_path` resolver, metadata dict,
    optional metal symbol.
  - New `load_system(key, *, engine=None, functional_form=None,
    molecule_loader_kwargs=None)` is the single loader entry point.
    Reference construction is uniform across multi-molecule systems
    (`ReferenceData.from_molecules` with eigenmatrix-diagonal=True);
    single-molecule systems with frequency-only references (ch3f) take
    the `engine` argument and use `_build_frequency_reference`.
  - Normal-modes side-loading is generic: any spec setting
    `normal_modes_path` gets its modes loaded; the dispatcher has no
    `key == "ch3f"` special-case.

Deleted (no shims, alpha discipline):
  - load_ch3f, load_rh_enamide, load_heck_relay, load_pd_allyl,
    load_pd_conjugate, load_rh_conjugate (six public functions).
  - _load_wahlers_system (private helper).
  - BenchmarkSystem dataclass (replaced by SystemSpec).
  - _metal_vdw_registry() function (replaced by `_METAL_VDW` dict in
    loaders.py, with no lazy import).

Survives because still per-system:
  - load_X_molecules() functions (genuinely different file
    formats/layouts per system; referenced by SystemSpec.molecule_loader).
  - _load_gaussian_molecules, _load_ch3f_molecules, _load_wahlers_molecules
    (factored out of the deleted loaders).

Per-system code in SYSTEMS shrinks to ~10 lines of declarative spec
per system (vs ~40-60 lines of repetitive procedure in the old design).

Caller migration in the same commit (no shims):
  - q2mm/diagnostics/cli.py:    SYSTEMS[k].loader(engine) -> load_system(k, engine=engine, ...)
                                BenchmarkSystem -> SystemSpec
                                special-cased load_ch3f(data_dir=...) -> load_system("ch3f", molecule_loader_kwargs={"data_dir": ...})
  - q2mm/diagnostics/__init__.py: re-export SystemSpec + load_system instead of BenchmarkSystem.
  - scripts/regenerate_convergence_results.py: SYSTEMS[k].loader -> load_system.
  - scripts/bench_composed.py: load_ch3f -> load_system("ch3f").
  - scripts/run_rh_benchmarks.sh: embedded Python -- load_rh_enamide -> load_system("rh-enamide").
  - test/integration/test_heck_validation.py regression: load_heck_relay -> load_system.
  - test/integration/test_backend_optimizer_matrix.py: load_ch3f -> load_system.
  - test/benchmarks/test_optimization.py: same.

Smoke tests:
  - `scripts/regenerate_convergence_results.py --system ch3f --skip-optimization`
    runs end-to-end and produces the same baseline numbers
    (ratio=1.000, ok, single 'frequency' category).
  - Full unit suite: 656 passed, 272 skipped.
  - Ruff check + format: clean.

Behavior changes (intentional, per plan, will surface in commit 5):
  - load_rh_enamide previously called qfuerza_into on a copy of the
    rh-enamide mm3.fld -- i.e. it OVERWROTE the Donoghue OPT values
    with raw FUERZA projections.  After this commit it uses
    load_published_opt, which preserves the published OPT values.
    Numbers for rh-enamide will move in commit 5 (regeneration).
  - load_pd_allyl, load_pd_conjugate, load_rh_conjugate had the same
    overwrite pattern via _load_wahlers_system + qfuerza_into.  After
    this commit they use load_published_opt_composed and preserve the
    Wahlers OPT values.  Numbers will move in commit 5.
  - load_heck_relay is unchanged in behavior -- its loader was already
    fixed in PR #280 to skip qfuerza_into.  This commit just routes it
    through the load_published_opt strategy.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ericchansen added a commit that referenced this pull request May 26, 2026
…h + qfuerza_into

The old ``estimate_force_constants(molecules, forcefield=None, ...)``
overloaded two completely different operations on one signature:

  forcefield=None  → build a brand-new FF (single molecule only).
  forcefield=ff    → mutate ff (well, ff.copy()) — silently overwriting
                     every "unfrozen" param.

The second mode is exactly the footgun that produced the q2mm#277
Heck-relay bug.  Splitting the function into two names with two
narrower contracts makes the intent explicit at every call site:

  qfuerza_fresh(molecule, ...) -> ForceField
      Build a fresh FF from one molecule's QM Hessian.  Errors if
      multiple molecules are passed.

  qfuerza_into(ff, molecules, ...) -> None
      Overwrite the values of every *unfrozen* parameter in ``ff``
      using the QFUERZA projection.  Frozen params are skipped (the
      load_heck_relay invariant).  Mutates ff in place.  No return.

Both names cite Farrugia et al. 2025 (J. Chem. Theory Comput.
22, 469, DOI 10.1021/acs.jctc.5c01751), the methodology paper of
record per AGENTS.md.  The estimate_force_constants name is deleted
outright — no shim, per alpha discipline.

Other internal renames in seminario.py to match the new vocabulary
(TypeError message, two docstring/comment refs).  __init__.py public
API now exports ``qfuerza_fresh`` and ``qfuerza_into`` instead of
``estimate_force_constants``.

Migration of all 85 call sites is mechanical:

  fresh path: estimate_force_constants(mol, ...) → qfuerza_fresh(mol, ...)
  into path:  ff = estimate_force_constants(mols, forcefield=ff_template, ...)
              → ff = ff_template.copy()
                qfuerza_into(ff, mols, ...)

19 files touched.  The Heck-relay loader behavior is unchanged
(it doesn't call into qfuerza_into at all post-#280); the rh-enamide
and Wahlers loaders still have the same shape, with
``forcefield=ff_template, ...`` → ``copy + qfuerza_into`` and same
semantics.

Tests:
- Existing unit test renamed to
  ``test_qfuerza_into_does_not_overwrite_frozen_opt_block``;
  same regression coverage for the q2mm#277 pattern.
- All 656 unit tests pass; ruff check + format pass clean.

No behavior change visible from outside the call sites — same loops
inside ``_estimate_into_ff`` (the renamed body), same skip-frozen
short-circuits.  The Heck-relay regression test from PR #280
continues to pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ericchansen added a commit that referenced this pull request May 26, 2026
…oad_system

Replaces six per-system loaders + a private _load_wahlers_system helper
with a single declarative dispatch system, addressing the
inconsistencies (load_X vs _load_X_system, divergent shapes) and the
copy-paste between similar loaders.

New module: q2mm/models/loaders.py
  - Strategy primitives, named after what they do (no implicit ordering
    that produced the q2mm#277 bug):
      load_published_opt(ff_path)
      load_qfuerza_fresh(molecule)
      compose_opt_with_mm3_base(opt_path, base_path, *, metal)
        -> (composed, opt_only)
      load_published_opt_composed(opt_path, base_path, *, metal)
  - Each docstring cites Farrugia 2025 (the QFUERZA paper, AGENTS.md
    "Key Papers") and documents the returned FF's frozen/active
    invariants.
  - Module-level `_METAL_VDW` dict holds published metal vdW parameters
    for systems whose MM3 base file lacks them (today only PD from
    Rosales 2020).  Lives here rather than in `q2mm/diagnostics/`
    so the dependency direction is models <- diagnostics, not the
    reverse.

Rewritten q2mm/diagnostics/systems.py:
  - New `SystemSpec` dataclass replaces `BenchmarkSystem`.  Each spec
    declares: molecule_loader callable, ff_strategy literal, ff_paths
    mapping (uniformly `Callable[[], Path]`, no Path|Callable sum
    type), optional `normal_modes_path` resolver, metadata dict,
    optional metal symbol.
  - New `load_system(key, *, engine=None, functional_form=None,
    molecule_loader_kwargs=None)` is the single loader entry point.
    Reference construction is uniform across multi-molecule systems
    (`ReferenceData.from_molecules` with eigenmatrix-diagonal=True);
    single-molecule systems with frequency-only references (ch3f) take
    the `engine` argument and use `_build_frequency_reference`.
  - Normal-modes side-loading is generic: any spec setting
    `normal_modes_path` gets its modes loaded; the dispatcher has no
    `key == "ch3f"` special-case.

Deleted (no shims, alpha discipline):
  - load_ch3f, load_rh_enamide, load_heck_relay, load_pd_allyl,
    load_pd_conjugate, load_rh_conjugate (six public functions).
  - _load_wahlers_system (private helper).
  - BenchmarkSystem dataclass (replaced by SystemSpec).
  - _metal_vdw_registry() function (replaced by `_METAL_VDW` dict in
    loaders.py, with no lazy import).

Survives because still per-system:
  - load_X_molecules() functions (genuinely different file
    formats/layouts per system; referenced by SystemSpec.molecule_loader).
  - _load_gaussian_molecules, _load_ch3f_molecules, _load_wahlers_molecules
    (factored out of the deleted loaders).

Per-system code in SYSTEMS shrinks to ~10 lines of declarative spec
per system (vs ~40-60 lines of repetitive procedure in the old design).

Caller migration in the same commit (no shims):
  - q2mm/diagnostics/cli.py:    SYSTEMS[k].loader(engine) -> load_system(k, engine=engine, ...)
                                BenchmarkSystem -> SystemSpec
                                special-cased load_ch3f(data_dir=...) -> load_system("ch3f", molecule_loader_kwargs={"data_dir": ...})
  - q2mm/diagnostics/__init__.py: re-export SystemSpec + load_system instead of BenchmarkSystem.
  - scripts/regenerate_convergence_results.py: SYSTEMS[k].loader -> load_system.
  - scripts/bench_composed.py: load_ch3f -> load_system("ch3f").
  - scripts/run_rh_benchmarks.sh: embedded Python -- load_rh_enamide -> load_system("rh-enamide").
  - test/integration/test_heck_validation.py regression: load_heck_relay -> load_system.
  - test/integration/test_backend_optimizer_matrix.py: load_ch3f -> load_system.
  - test/benchmarks/test_optimization.py: same.

Smoke tests:
  - `scripts/regenerate_convergence_results.py --system ch3f --skip-optimization`
    runs end-to-end and produces the same baseline numbers
    (ratio=1.000, ok, single 'frequency' category).
  - Full unit suite: 656 passed, 272 skipped.
  - Ruff check + format: clean.

Behavior changes (intentional, per plan, will surface in commit 5):
  - load_rh_enamide previously called qfuerza_into on a copy of the
    rh-enamide mm3.fld -- i.e. it OVERWROTE the Donoghue OPT values
    with raw FUERZA projections.  After this commit it uses
    load_published_opt, which preserves the published OPT values.
    Numbers for rh-enamide will move in commit 5 (regeneration).
  - load_pd_allyl, load_pd_conjugate, load_rh_conjugate had the same
    overwrite pattern via _load_wahlers_system + qfuerza_into.  After
    this commit they use load_published_opt_composed and preserve the
    Wahlers OPT values.  Numbers will move in commit 5.
  - load_heck_relay is unchanged in behavior -- its loader was already
    fixed in PR #280 to skip qfuerza_into.  This commit just routes it
    through the load_published_opt strategy.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Diagnose heck-relay Seminario divergence (bond_length R² ≈ -48, JaxLoss diverges)

2 participants