fix(heck-relay): stop overwriting Rosales OPT params with Seminario projections by ericchansen · Pull Request #280 · ericchansen/q2mm

ericchansen · 2026-05-25T05:19:19Z

Resolves #277.

TL;DR

The Heck-relay loader was silently corrupting the published Rosales OPT force field by overwriting its OPT parameters with raw Seminario projections. Fixing this single line drops the ObjectiveFunction by 43×, takes JaxLoss from "diverges by 21 orders of magnitude" to ratio ≈ 1.29, and brings bond_length R² from −48 to 0.98.

The bug

ff_template = ForceField.from_mm3_fld(ff_path, include_standard=True)
ff_opt = ForceField.from_mm3_fld(ff_path, include_standard=False)
ff_template.freeze_standard_params(ff_opt)
ff = estimate_force_constants(molecules, forcefield=ff_template, ...)  # 💥

freeze_standard_params marks the non-OPT params (standard MM3 backbone) as frozen and leaves the OPT params active. estimate_force_constants skips frozen params (see q2mm/models/seminario.py:341) and re-estimates the active ones — overwriting the carefully-fitted Rosales OPT values with raw Seminario projections.

Three-baseline diagnostic

The new scripts/diagnose_heck_relay.py builds three FFs and evaluates each (no optimization, just evaluation). Output committed to q2mm-data/benchmarks/heck-relay/diagnostic/:

Baseline	obj	ratio	R²(bond_len)	R²(bond_ang)	R²(eig_diag)
A — Untouched Rosales	3.42 × 10⁶	1.31	0.980	0.786	−12.6
B — Current loader	1.17 × 10⁸	NaN	−4787	−7.25	−4.66
C — Seminario only	1.24 × 10⁸	1.77 × 10⁷³	−849	−5.13	0.05

A's bond_length R² (0.980) is dramatically better than B's (−4787) — matching the issue's decision rule "A fine, B explodes → narrow on Seminario behavior with freeze_standard_params".

The fix

Keep freeze_standard_params (still needed for the optimizer's active mask) but skip estimate_force_constants. Use the published Rosales OPT parameters as-is.

Post-fix baseline

(Companion q2mm-data PR regenerates the heck-relay convergence JSONs.)

Metric	Before (loader bug)	After (this PR)
Initial ObjectiveFunction	1.48 × 10⁸	3.46 × 10⁶
Initial JaxLoss	1e30 (penalty)	4.48 × 10⁶
Ratio	~10²¹	1.29
Status	out_of_band	out_of_band (12 % over band)
bond_length R²	−48.06	0.980
bond_angle R²	−5.47	0.786
eig_diagonal R²	−4.66	−12.6

The eigenmatrix actually gets worse (−4.66 → −12.6), but that just reflects the real MM3* ↔ JAX-engine cross-engine gap that's now visible without being masked by the corrupted FF. Geometry is now well-reproduced and JaxLoss is no longer divergent.

Out of scope

The same freeze_standard_params + estimate_force_constants pattern exists in load_rh_enamide and the load_pd_* loaders. They aren't catastrophically broken today (Rh-enamide runs at R² = 0.97, Pd-allyl at −1.41), but the same audit should probably extend to them in a follow-up.
Heck-relay now joins pd-conjugate as a candidate for the ratio_tol=None experiment (#276).

Validation

python -m ruff check q2mm/ test/ scripts/ — clean.
python -m ruff format --check q2mm test scripts examples — clean.
python -m pytest test/ -x -q -m "not (openmm or tinker or jax or jax_md or psi4)" -k "heck or systems" — 11 passed.
End-to-end scripts/regenerate_convergence_results.py --system heck-relay --skip-optimization ran clean on a single GPU; output (with full provenance) is in the companion q2mm-data PR.

Companion PR

ericchansen/q2mm-data#? — adds the diagnostic JSON and the regenerated convergence baseline.

Copilot

Pull request overview

Fixes the heck-relay system loader so it no longer corrupts the published Rosales OPT force field by re-running Seminario projections over active OPT parameters, and updates diagnostics/docs to reflect the corrected baseline behavior.

Changes:

Stop calling estimate_force_constants() in load_heck_relay() while still using freeze_standard_params() to preserve the intended active-parameter mask.
Add a one-off diagnostic script to compare multiple heck-relay force-field baselines and write a JSON report with provenance.
Update heck-relay and optimizer-comparison documentation to reflect the corrected baseline metrics and ratio behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File	Description
scripts/diagnose_heck_relay.py	Adds a three-baseline diagnostic + JSON/provenance output for heck-relay.
q2mm/diagnostics/systems.py	Fixes heck-relay loader to keep Rosales OPT parameters as-published (no Seminario overwrite).
docs/systems/heck-relay.md	Updates heck-relay system page to describe the loader fix and new baseline metrics.
docs/benchmarks/optimizer-comparison.md	Updates cross-system comparison tables/text for the new heck-relay ratio and R² values.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…rojections Resolves #277. The Heck-relay loader silently corrupted the published Rosales OPT force field. The sequence ff_template = ForceField.from_mm3_fld(ff_path, include_standard=True) ff_opt = ForceField.from_mm3_fld(ff_path, include_standard=False) ff_template.freeze_standard_params(ff_opt) ff = estimate_force_constants(molecules, forcefield=ff_template, ...) freezes all *non*-OPT params (standard MM3 backbone) and leaves the OPT-substructure params active. `estimate_force_constants` then skips the frozen params (q2mm/models/seminario.py:341) and re-estimates the *active* (= OPT) ones — overwriting the carefully-fitted Rosales values with raw Seminario projections. The new three-baseline diagnostic (`scripts/diagnose_heck_relay.py`, committed alongside this fix and mirrored to `q2mm-data/benchmarks/heck-relay/diagnostic/`) confirms the divergence: Baseline obj ratio R²(bond_len) A_untouched_rosales 3.42e+06 1.31 0.980 B_current_loader 1.17e+08 n/a (NaN) -4787.5 C_seminario_only 1.24e+08 1.77e+73 -849.0 Fix: keep `freeze_standard_params` (still needed for the optimizer's active mask) but skip `estimate_force_constants`. Use the Rosales OPT values as-published. Post-fix regenerated baseline (in companion q2mm-data PR): obj: 1.48e+08 → 3.46e+06 (43x smaller) ratio: ~10^21 (sanitized 1e30) → 1.29 (finite, real) ratio_status: out_of_band → out_of_band (still 12% over the band) bond_length R²: -48.06 → 0.980 bond_angle R²: -5.47 → 0.786 eig_diagonal R²: -4.66 → -12.62 (worse; reflects the real MM3* ↔ JAX-engine cross-engine gap, not a loader bug) Heck relay now sits with pd-conjugate as a candidate for the ratio_tol=None experiment (issue #276): the gate barely misses instead of catastrophically failing. Out of scope (per #277): the same `freeze_standard_params + estimate_force_constants` pattern exists in load_rh_enamide and load_pd_*. Those systems are not affected catastrophically (Rh-enamide runs at R²=0.97 and Pd-allyl at -1.41 today) but the same audit should probably extend to them in a follow-up. Also includes: - scripts/diagnose_heck_relay.py — the diagnostic script (committed per #277 acceptance criteria, output mirrored to q2mm-data). - docs/systems/heck-relay.md — updated to reflect the fix. - docs/benchmarks/optimizer-comparison.md — heck-relay row and notes updated; per-category R² table updated to current numbers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…oad_system Replaces six per-system loaders + a private _load_wahlers_system helper with a single declarative dispatch system, addressing the inconsistencies (load_X vs _load_X_system, divergent shapes) and the copy-paste between similar loaders. New module: q2mm/models/loaders.py - Strategy primitives, named after what they do (no implicit ordering that produced the q2mm#277 bug): load_published_opt(ff_path) load_qfuerza_only(molecules, ff_path) load_qfuerza_fresh(molecule) compose_opt_with_mm3_base(opt_path, base_path, *, metal) load_published_opt_composed(opt_path, base_path, *, metal) - Each docstring cites Farrugia 2025 (the QFUERZA paper, AGENTS.md "Key Papers") and documents the returned FF's frozen/active invariants. Rewritten q2mm/diagnostics/systems.py: - New `SystemSpec` dataclass replaces `BenchmarkSystem`. Each spec declares: molecule_loader callable, ff_strategy literal, ff_paths mapping, metadata dict, optional metal symbol. - New `load_system(key, *, engine=None, functional_form=None, molecule_loader_kwargs=None)` is the single loader entry point. Reference construction is uniform across multi-molecule systems (`ReferenceData.from_molecules` with eigenmatrix-diagonal=True); single-molecule systems with frequency-only references (ch3f) take the `engine` argument and use `_build_frequency_reference`. Deleted (no shims, alpha discipline): - load_ch3f, load_rh_enamide, load_heck_relay, load_pd_allyl, load_pd_conjugate, load_rh_conjugate (six public functions). - _load_wahlers_system (private helper). - BenchmarkSystem dataclass (replaced by SystemSpec). Survives because still per-system: - load_X_molecules() functions (genuinely different file formats/layouts per system; referenced by SystemSpec.molecule_loader). - _load_gaussian_molecules, _load_ch3f_molecules, _load_wahlers_molecules (factored out of the deleted loaders). Per-system code in SYSTEMS shrinks to ~10 lines of declarative spec per system (vs ~40-60 lines of repetitive procedure in the old design). Caller migration in the same commit (no shims): - q2mm/diagnostics/cli.py: SYSTEMS[k].loader(engine) → load_system(k, engine=engine, ...) BenchmarkSystem → SystemSpec special-cased load_ch3f(data_dir=...) → load_system("ch3f", molecule_loader_kwargs={"data_dir": ...}) - q2mm/diagnostics/__init__.py: re-export SystemSpec + load_system instead of BenchmarkSystem. - scripts/regenerate_convergence_results.py: SYSTEMS[k].loader → load_system. - scripts/bench_composed.py: load_ch3f → load_system("ch3f"). - scripts/run_rh_benchmarks.sh: embedded Python — load_rh_enamide → load_system("rh-enamide"). - test/integration/test_heck_validation.py regression: load_heck_relay → load_system. - test/integration/test_backend_optimizer_matrix.py: load_ch3f → load_system. - test/benchmarks/test_optimization.py: same. Smoke tests: - `scripts/regenerate_convergence_results.py --system ch3f --skip-optimization` runs end-to-end and produces the same baseline numbers (ratio=1.000, ok, single 'frequency' category). - Full unit suite: 656 passed, 272 skipped. - Ruff check + format: clean. Behavior changes (intentional, per plan, will surface in commit 5): - load_rh_enamide previously called qfuerza_into on a copy of the rh-enamide mm3.fld — i.e. it OVERWROTE the Donoghue OPT values with raw FUERZA projections. After this commit it uses load_published_opt, which preserves the published OPT values. Numbers for rh-enamide will move in commit 5 (regeneration). - load_pd_allyl, load_pd_conjugate, load_rh_conjugate had the same overwrite pattern via _load_wahlers_system + qfuerza_into. After this commit they use load_published_opt_composed and preserve the Wahlers OPT values. Numbers will move in commit 5. - load_heck_relay is unchanged in behavior — its loader was already fixed in PR #280 to skip qfuerza_into. This commit just routes it through the load_published_opt strategy. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…h + qfuerza_into The old ``estimate_force_constants(molecules, forcefield=None, ...)`` overloaded two completely different operations on one signature: forcefield=None → build a brand-new FF (single molecule only). forcefield=ff → mutate ff (well, ff.copy()) — silently overwriting every "unfrozen" param. The second mode is exactly the footgun that produced the q2mm#277 Heck-relay bug. Splitting the function into two names with two narrower contracts makes the intent explicit at every call site: qfuerza_fresh(molecule, ...) -> ForceField Build a fresh FF from one molecule's QM Hessian. Errors if multiple molecules are passed. qfuerza_into(ff, molecules, ...) -> None Overwrite the values of every *unfrozen* parameter in ``ff`` using the QFUERZA projection. Frozen params are skipped (the load_heck_relay invariant). Mutates ff in place. No return. Both names cite Farrugia et al. 2025 (J. Chem. Theory Comput. 22, 469, DOI 10.1021/acs.jctc.5c01751), the methodology paper of record per AGENTS.md. The estimate_force_constants name is deleted outright — no shim, per alpha discipline. Other internal renames in seminario.py to match the new vocabulary (TypeError message, two docstring/comment refs). __init__.py public API now exports ``qfuerza_fresh`` and ``qfuerza_into`` instead of ``estimate_force_constants``. Migration of all 85 call sites is mechanical: fresh path: estimate_force_constants(mol, ...) → qfuerza_fresh(mol, ...) into path: ff = estimate_force_constants(mols, forcefield=ff_template, ...) → ff = ff_template.copy() qfuerza_into(ff, mols, ...) 19 files touched. The Heck-relay loader behavior is unchanged (it doesn't call into qfuerza_into at all post-#280); the rh-enamide and Wahlers loaders still have the same shape, with ``forcefield=ff_template, ...`` → ``copy + qfuerza_into`` and same semantics. Tests: - Existing unit test renamed to ``test_qfuerza_into_does_not_overwrite_frozen_opt_block``; same regression coverage for the q2mm#277 pattern. - All 656 unit tests pass; ruff check + format pass clean. No behavior change visible from outside the call sites — same loops inside ``_estimate_into_ff`` (the renamed body), same skip-frozen short-circuits. The Heck-relay regression test from PR #280 continues to pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…oad_system Replaces six per-system loaders + a private _load_wahlers_system helper with a single declarative dispatch system, addressing the inconsistencies (load_X vs _load_X_system, divergent shapes) and the copy-paste between similar loaders. New module: q2mm/models/loaders.py - Strategy primitives, named after what they do (no implicit ordering that produced the q2mm#277 bug): load_published_opt(ff_path) load_qfuerza_only(molecules, ff_path) load_qfuerza_fresh(molecule) compose_opt_with_mm3_base(opt_path, base_path, *, metal) load_published_opt_composed(opt_path, base_path, *, metal) - Each docstring cites Farrugia 2025 (the QFUERZA paper, AGENTS.md "Key Papers") and documents the returned FF's frozen/active invariants. Rewritten q2mm/diagnostics/systems.py: - New `SystemSpec` dataclass replaces `BenchmarkSystem`. Each spec declares: molecule_loader callable, ff_strategy literal, ff_paths mapping, metadata dict, optional metal symbol. - New `load_system(key, *, engine=None, functional_form=None, molecule_loader_kwargs=None)` is the single loader entry point. Reference construction is uniform across multi-molecule systems (`ReferenceData.from_molecules` with eigenmatrix-diagonal=True); single-molecule systems with frequency-only references (ch3f) take the `engine` argument and use `_build_frequency_reference`. Deleted (no shims, alpha discipline): - load_ch3f, load_rh_enamide, load_heck_relay, load_pd_allyl, load_pd_conjugate, load_rh_conjugate (six public functions). - _load_wahlers_system (private helper). - BenchmarkSystem dataclass (replaced by SystemSpec). Survives because still per-system: - load_X_molecules() functions (genuinely different file formats/layouts per system; referenced by SystemSpec.molecule_loader). - _load_gaussian_molecules, _load_ch3f_molecules, _load_wahlers_molecules (factored out of the deleted loaders). Per-system code in SYSTEMS shrinks to ~10 lines of declarative spec per system (vs ~40-60 lines of repetitive procedure in the old design). Caller migration in the same commit (no shims): - q2mm/diagnostics/cli.py: SYSTEMS[k].loader(engine) → load_system(k, engine=engine, ...) BenchmarkSystem → SystemSpec special-cased load_ch3f(data_dir=...) → load_system("ch3f", molecule_loader_kwargs={"data_dir": ...}) - q2mm/diagnostics/__init__.py: re-export SystemSpec + load_system instead of BenchmarkSystem. - scripts/regenerate_convergence_results.py: SYSTEMS[k].loader → load_system. - scripts/bench_composed.py: load_ch3f → load_system("ch3f"). - scripts/run_rh_benchmarks.sh: embedded Python — load_rh_enamide → load_system("rh-enamide"). - test/integration/test_heck_validation.py regression: load_heck_relay → load_system. - test/integration/test_backend_optimizer_matrix.py: load_ch3f → load_system. - test/benchmarks/test_optimization.py: same. Smoke tests: - `scripts/regenerate_convergence_results.py --system ch3f --skip-optimization` runs end-to-end and produces the same baseline numbers (ratio=1.000, ok, single 'frequency' category). - Full unit suite: 656 passed, 272 skipped. - Ruff check + format: clean. Behavior changes (intentional, per plan, will surface in commit 5): - load_rh_enamide previously called qfuerza_into on a copy of the rh-enamide mm3.fld — i.e. it OVERWROTE the Donoghue OPT values with raw FUERZA projections. After this commit it uses load_published_opt, which preserves the published OPT values. Numbers for rh-enamide will move in commit 5 (regeneration). - load_pd_allyl, load_pd_conjugate, load_rh_conjugate had the same overwrite pattern via _load_wahlers_system + qfuerza_into. After this commit they use load_published_opt_composed and preserve the Wahlers OPT values. Numbers will move in commit 5. - load_heck_relay is unchanged in behavior — its loader was already fixed in PR #280 to skip qfuerza_into. This commit just routes it through the load_published_opt strategy. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…oad_system Replaces six per-system loaders + a private _load_wahlers_system helper with a single declarative dispatch system, addressing the inconsistencies (load_X vs _load_X_system, divergent shapes) and the copy-paste between similar loaders. New module: q2mm/models/loaders.py - Strategy primitives, named after what they do (no implicit ordering that produced the q2mm#277 bug): load_published_opt(ff_path) load_qfuerza_fresh(molecule) compose_opt_with_mm3_base(opt_path, base_path, *, metal) -> (composed, opt_only) load_published_opt_composed(opt_path, base_path, *, metal) - Each docstring cites Farrugia 2025 (the QFUERZA paper, AGENTS.md "Key Papers") and documents the returned FF's frozen/active invariants. - Module-level `_METAL_VDW` dict holds published metal vdW parameters for systems whose MM3 base file lacks them (today only PD from Rosales 2020). Lives here rather than in `q2mm/diagnostics/` so the dependency direction is models <- diagnostics, not the reverse. Rewritten q2mm/diagnostics/systems.py: - New `SystemSpec` dataclass replaces `BenchmarkSystem`. Each spec declares: molecule_loader callable, ff_strategy literal, ff_paths mapping (uniformly `Callable[[], Path]`, no Path|Callable sum type), optional `normal_modes_path` resolver, metadata dict, optional metal symbol. - New `load_system(key, *, engine=None, functional_form=None, molecule_loader_kwargs=None)` is the single loader entry point. Reference construction is uniform across multi-molecule systems (`ReferenceData.from_molecules` with eigenmatrix-diagonal=True); single-molecule systems with frequency-only references (ch3f) take the `engine` argument and use `_build_frequency_reference`. - Normal-modes side-loading is generic: any spec setting `normal_modes_path` gets its modes loaded; the dispatcher has no `key == "ch3f"` special-case. Deleted (no shims, alpha discipline): - load_ch3f, load_rh_enamide, load_heck_relay, load_pd_allyl, load_pd_conjugate, load_rh_conjugate (six public functions). - _load_wahlers_system (private helper). - BenchmarkSystem dataclass (replaced by SystemSpec). - _metal_vdw_registry() function (replaced by `_METAL_VDW` dict in loaders.py, with no lazy import). Survives because still per-system: - load_X_molecules() functions (genuinely different file formats/layouts per system; referenced by SystemSpec.molecule_loader). - _load_gaussian_molecules, _load_ch3f_molecules, _load_wahlers_molecules (factored out of the deleted loaders). Per-system code in SYSTEMS shrinks to ~10 lines of declarative spec per system (vs ~40-60 lines of repetitive procedure in the old design). Caller migration in the same commit (no shims): - q2mm/diagnostics/cli.py: SYSTEMS[k].loader(engine) -> load_system(k, engine=engine, ...) BenchmarkSystem -> SystemSpec special-cased load_ch3f(data_dir=...) -> load_system("ch3f", molecule_loader_kwargs={"data_dir": ...}) - q2mm/diagnostics/__init__.py: re-export SystemSpec + load_system instead of BenchmarkSystem. - scripts/regenerate_convergence_results.py: SYSTEMS[k].loader -> load_system. - scripts/bench_composed.py: load_ch3f -> load_system("ch3f"). - scripts/run_rh_benchmarks.sh: embedded Python -- load_rh_enamide -> load_system("rh-enamide"). - test/integration/test_heck_validation.py regression: load_heck_relay -> load_system. - test/integration/test_backend_optimizer_matrix.py: load_ch3f -> load_system. - test/benchmarks/test_optimization.py: same. Smoke tests: - `scripts/regenerate_convergence_results.py --system ch3f --skip-optimization` runs end-to-end and produces the same baseline numbers (ratio=1.000, ok, single 'frequency' category). - Full unit suite: 656 passed, 272 skipped. - Ruff check + format: clean. Behavior changes (intentional, per plan, will surface in commit 5): - load_rh_enamide previously called qfuerza_into on a copy of the rh-enamide mm3.fld -- i.e. it OVERWROTE the Donoghue OPT values with raw FUERZA projections. After this commit it uses load_published_opt, which preserves the published OPT values. Numbers for rh-enamide will move in commit 5 (regeneration). - load_pd_allyl, load_pd_conjugate, load_rh_conjugate had the same overwrite pattern via _load_wahlers_system + qfuerza_into. After this commit they use load_published_opt_composed and preserve the Wahlers OPT values. Numbers will move in commit 5. - load_heck_relay is unchanged in behavior -- its loader was already fixed in PR #280 to skip qfuerza_into. This commit just routes it through the load_published_opt strategy. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…h + qfuerza_into The old ``estimate_force_constants(molecules, forcefield=None, ...)`` overloaded two completely different operations on one signature: forcefield=None → build a brand-new FF (single molecule only). forcefield=ff → mutate ff (well, ff.copy()) — silently overwriting every "unfrozen" param. The second mode is exactly the footgun that produced the q2mm#277 Heck-relay bug. Splitting the function into two names with two narrower contracts makes the intent explicit at every call site: qfuerza_fresh(molecule, ...) -> ForceField Build a fresh FF from one molecule's QM Hessian. Errors if multiple molecules are passed. qfuerza_into(ff, molecules, ...) -> None Overwrite the values of every *unfrozen* parameter in ``ff`` using the QFUERZA projection. Frozen params are skipped (the load_heck_relay invariant). Mutates ff in place. No return. Both names cite Farrugia et al. 2025 (J. Chem. Theory Comput. 22, 469, DOI 10.1021/acs.jctc.5c01751), the methodology paper of record per AGENTS.md. The estimate_force_constants name is deleted outright — no shim, per alpha discipline. Other internal renames in seminario.py to match the new vocabulary (TypeError message, two docstring/comment refs). __init__.py public API now exports ``qfuerza_fresh`` and ``qfuerza_into`` instead of ``estimate_force_constants``. Migration of all 85 call sites is mechanical: fresh path: estimate_force_constants(mol, ...) → qfuerza_fresh(mol, ...) into path: ff = estimate_force_constants(mols, forcefield=ff_template, ...) → ff = ff_template.copy() qfuerza_into(ff, mols, ...) 19 files touched. The Heck-relay loader behavior is unchanged (it doesn't call into qfuerza_into at all post-#280); the rh-enamide and Wahlers loaders still have the same shape, with ``forcefield=ff_template, ...`` → ``copy + qfuerza_into`` and same semantics. Tests: - Existing unit test renamed to ``test_qfuerza_into_does_not_overwrite_frozen_opt_block``; same regression coverage for the q2mm#277 pattern. - All 656 unit tests pass; ruff check + format pass clean. No behavior change visible from outside the call sites — same loops inside ``_estimate_into_ff`` (the renamed body), same skip-frozen short-circuits. The Heck-relay regression test from PR #280 continues to pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…oad_system Replaces six per-system loaders + a private _load_wahlers_system helper with a single declarative dispatch system, addressing the inconsistencies (load_X vs _load_X_system, divergent shapes) and the copy-paste between similar loaders. New module: q2mm/models/loaders.py - Strategy primitives, named after what they do (no implicit ordering that produced the q2mm#277 bug): load_published_opt(ff_path) load_qfuerza_fresh(molecule) compose_opt_with_mm3_base(opt_path, base_path, *, metal) -> (composed, opt_only) load_published_opt_composed(opt_path, base_path, *, metal) - Each docstring cites Farrugia 2025 (the QFUERZA paper, AGENTS.md "Key Papers") and documents the returned FF's frozen/active invariants. - Module-level `_METAL_VDW` dict holds published metal vdW parameters for systems whose MM3 base file lacks them (today only PD from Rosales 2020). Lives here rather than in `q2mm/diagnostics/` so the dependency direction is models <- diagnostics, not the reverse. Rewritten q2mm/diagnostics/systems.py: - New `SystemSpec` dataclass replaces `BenchmarkSystem`. Each spec declares: molecule_loader callable, ff_strategy literal, ff_paths mapping (uniformly `Callable[[], Path]`, no Path|Callable sum type), optional `normal_modes_path` resolver, metadata dict, optional metal symbol. - New `load_system(key, *, engine=None, functional_form=None, molecule_loader_kwargs=None)` is the single loader entry point. Reference construction is uniform across multi-molecule systems (`ReferenceData.from_molecules` with eigenmatrix-diagonal=True); single-molecule systems with frequency-only references (ch3f) take the `engine` argument and use `_build_frequency_reference`. - Normal-modes side-loading is generic: any spec setting `normal_modes_path` gets its modes loaded; the dispatcher has no `key == "ch3f"` special-case. Deleted (no shims, alpha discipline): - load_ch3f, load_rh_enamide, load_heck_relay, load_pd_allyl, load_pd_conjugate, load_rh_conjugate (six public functions). - _load_wahlers_system (private helper). - BenchmarkSystem dataclass (replaced by SystemSpec). - _metal_vdw_registry() function (replaced by `_METAL_VDW` dict in loaders.py, with no lazy import). Survives because still per-system: - load_X_molecules() functions (genuinely different file formats/layouts per system; referenced by SystemSpec.molecule_loader). - _load_gaussian_molecules, _load_ch3f_molecules, _load_wahlers_molecules (factored out of the deleted loaders). Per-system code in SYSTEMS shrinks to ~10 lines of declarative spec per system (vs ~40-60 lines of repetitive procedure in the old design). Caller migration in the same commit (no shims): - q2mm/diagnostics/cli.py: SYSTEMS[k].loader(engine) -> load_system(k, engine=engine, ...) BenchmarkSystem -> SystemSpec special-cased load_ch3f(data_dir=...) -> load_system("ch3f", molecule_loader_kwargs={"data_dir": ...}) - q2mm/diagnostics/__init__.py: re-export SystemSpec + load_system instead of BenchmarkSystem. - scripts/regenerate_convergence_results.py: SYSTEMS[k].loader -> load_system. - scripts/bench_composed.py: load_ch3f -> load_system("ch3f"). - scripts/run_rh_benchmarks.sh: embedded Python -- load_rh_enamide -> load_system("rh-enamide"). - test/integration/test_heck_validation.py regression: load_heck_relay -> load_system. - test/integration/test_backend_optimizer_matrix.py: load_ch3f -> load_system. - test/benchmarks/test_optimization.py: same. Smoke tests: - `scripts/regenerate_convergence_results.py --system ch3f --skip-optimization` runs end-to-end and produces the same baseline numbers (ratio=1.000, ok, single 'frequency' category). - Full unit suite: 656 passed, 272 skipped. - Ruff check + format: clean. Behavior changes (intentional, per plan, will surface in commit 5): - load_rh_enamide previously called qfuerza_into on a copy of the rh-enamide mm3.fld -- i.e. it OVERWROTE the Donoghue OPT values with raw FUERZA projections. After this commit it uses load_published_opt, which preserves the published OPT values. Numbers for rh-enamide will move in commit 5 (regeneration). - load_pd_allyl, load_pd_conjugate, load_rh_conjugate had the same overwrite pattern via _load_wahlers_system + qfuerza_into. After this commit they use load_published_opt_composed and preserve the Wahlers OPT values. Numbers will move in commit 5. - load_heck_relay is unchanged in behavior -- its loader was already fixed in PR #280 to skip qfuerza_into. This commit just routes it through the load_published_opt strategy. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings May 25, 2026 05:19

Copilot started reviewing on behalf of ericchansen May 25, 2026 05:19 View session

ericchansen mentioned this pull request May 25, 2026

fix(heck-relay): regenerate convergence + add three-baseline diagnostic ericchansen/q2mm-data#3

Merged

Copilot AI reviewed May 25, 2026

View reviewed changes

Comment thread scripts/diagnose_heck_relay.py Outdated

Comment thread scripts/diagnose_heck_relay.py

Comment thread docs/systems/heck-relay.md Outdated

Comment thread docs/systems/heck-relay.md Outdated

Comment thread q2mm/diagnostics/systems.py

ericchansen mentioned this pull request May 25, 2026

Diagnose heck-relay Seminario divergence (bond_length R² ≈ -48, JaxLoss diverges) #277

Closed

4 tasks

ericchansen force-pushed the diag/heck-relay-seminario branch from 986ec19 to 4e86326 Compare May 25, 2026 14:20

ericchansen merged commit f27e33c into master May 25, 2026
11 checks passed

ericchansen deleted the diag/heck-relay-seminario branch May 25, 2026 14:26

ericchansen mentioned this pull request May 25, 2026

refactor: loader API + frozen-as-invariant + QFUERZA split #281

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(heck-relay): stop overwriting Rosales OPT params with Seminario projections#280

fix(heck-relay): stop overwriting Rosales OPT params with Seminario projections#280
ericchansen merged 1 commit into
masterfrom
diag/heck-relay-seminario

ericchansen commented May 25, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ericchansen commented May 25, 2026

TL;DR

The bug

Three-baseline diagnostic

The fix

Post-fix baseline

Out of scope

Validation

Companion PR

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants