Skip to content

refactor: Consolidate element data, DRY up codebase, fix packaging#107

Merged
ericchansen merged 3 commits intomasterfrom
refactor/consolidate-test-fixtures
Mar 21, 2026
Merged

refactor: Consolidate element data, DRY up codebase, fix packaging#107
ericchansen merged 3 commits intomasterfrom
refactor/consolidate-test-fixtures

Conversation

@ericchansen
Copy link
Copy Markdown
Owner

Summary

DRY up the codebase, consolidate element data into a single annotated source of truth, fix packaging gaps, and reduce test fixture duplication. Net result: -243 lines with zero test regressions.

Highlights

q2mm/elements.py — Single source of truth for periodic table data

  • Element NamedTuple with (z, symbol, mass, covalent_radius) and cited provenance (NUBASE2020, Cordero 2008)
  • All consumers (constants.py, identifiers.py, objective.py, molecule.py) now import from this one module
  • Zero q2mm imports — breaks all circular dependency chains

DRY refactors

  • forcefield.py: Generic _build_param_maps() + _match_for_export() replace 5 near-identical functions
  • seminario.py: Generic _match_mode() + _collect_matching() replace 4 near-identical functions
  • report.py/cli.py: Shared build_leaderboard_rows() helper
  • test/_shared.py: 16 path constants + 3 molecule factories replace duplicates across 13 test files

Bug fixes

  • from_molecules() feature drift: Silently lacked include_eigenmatrix and eigenmatrix_diagonal_only params that from_molecule() had — now delegates properly
  • Duplicate Rhodium: Removed legacy "RH" entry from MASSES (104 entries for 103 elements)

Packaging & defensive imports

  • [all] extras now include jax/jaxlib; [dev] includes openmm
  • scipy import moved to lazy (inside methods) so module loads without scipy installed
  • Benchmark CLI: added --data-dir for wheel installs where example data isn't bundled
  • Tinker bond_tolerance now configurable (was hardcoded 1.3)

Dead code removed

  • Deleted q2mm/parsers/_patterns.py (nothing imported from it)
  • Removed duplicate MacroModel constants from constants.py
  • Removed dead Q2MMMolecule import in forcefield.py

Testing

  • 356 passed, 44 skipped — identical to baseline
  • ruff check and ruff format clean

ericchansen and others added 2 commits March 21, 2026 01:11
Extract repeated path constants (REPO_ROOT, SN2_XYZ, CH3F_HESS, etc.)
and molecule factory functions (_diatomic, _water, _noble_gas_pair)
into test/_shared.py.  Update 13 test files to import from the shared
module instead of redefining them locally.

- Create test/_shared.py with path constants, data-availability flags,
  and parameterised make_diatomic/make_water/make_noble_gas_pair
  factories
- Re-export shared items from test/conftest.py for fixture use
- Preserve per-file default overrides via thin wrapper functions
  (e.g. bond_tolerance=1.5 in JAX tests, name='water-like' in
  OpenMM tests)

No behaviour change: 356 passed, 44 skipped (identical to baseline).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Create q2mm/elements.py as single source of truth for periodic table
  data (masses, radii, symbols) with annotated provenance (NUBASE2020,
  Cordero 2008)
- Wire constants.py, identifiers.py, objective.py, molecule.py to
  import from elements.py instead of maintaining separate copies
- Delete dead _patterns.py module and MacroModel constant duplicates
- Remove duplicate Rhodium entry from MASSES
- Extract generic _build_param_maps() and _match_for_export() replacing
  5 near-identical functions in forcefield.py
- Extract generic _match_mode() and _collect_matching() replacing 4
  near-identical functions in seminario.py
- Rewrite from_molecules() to delegate to from_molecule(), fixing
  silent feature drift (missing eigenmatrix params)
- Extract shared build_leaderboard_rows() used by cli.py and report.py
- Make scipy import lazy in scipy_opt.py
- Make Tinker bond_tolerance configurable (was hardcoded 1.3)
- Add --data-dir to benchmark CLI for wheel installs
- Fix packaging extras: jax/jaxlib in [all], openmm in [dev]
- Extract DFT scaling constant DEFAULT_DFT_SCALING in seminario.py
- Consolidate test fixtures into test/_shared.py (16 path constants,
  3 molecule factories) replacing duplicates across 13 test files

Net: -243 lines, 0 test regressions (356 pass, 44 skip)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 21, 2026 06:14
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors Q2MM to reduce duplication by centralizing element data and shared test fixtures, while also tightening packaging/extras and making some imports more defensive to improve install-time behavior.

Changes:

  • Add q2mm/elements.py as a single source of truth for element symbols/masses/radii and update multiple consumers to import from it.
  • DRY up repeated logic in core modules (e.g., forcefield/seminario) and consolidate duplicated test paths/molecule factories into test/_shared.py.
  • Improve packaging/runtime robustness (extras updates, lazy SciPy imports, benchmark CLI --data-dir, configurable Tinker bond tolerance) and remove dead code.

Reviewed changes

Copilot reviewed 27 out of 27 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
test/test_reference_data.py Uses shared fixture paths from test._shared instead of local constants.
test/test_models.py Switches common fixture paths to test._shared; keeps test-specific paths local.
test/test_method_e_pipeline.py Reuses shared SN2/Ethane paths/flags from test._shared.
test/test_method_d_e.py Reuses shared SN2 Hessian path from test._shared.
test/test_eigenmatrix.py Reuses shared SN2/Ethane paths/flags from test._shared.
test/integration/test_seminario_parity.py Imports repo root + SN2 paths from test._shared to avoid duplication.
test/integration/test_scipy_optimizer.py Uses shared molecule factories (make_water, make_diatomic).
test/integration/test_optimization_validation.py Uses shared fixture paths and make_water factory.
test/integration/test_openmm_export.py Uses shared fixture paths and shared molecule factories.
test/integration/test_openmm_backend.py Uses shared fixture paths and shared molecule factories.
test/integration/test_jax_backend.py Uses shared repo/SN2 paths and shared molecule factories.
test/integration/test_e2e_sn2_validation.py Imports all SN2/CH3F fixture paths from test._shared.
test/conftest.py Imports/re-exports shared constants/factories for fixture access.
test/_shared.py New shared module defining common example paths, availability flags, and molecule factories.
q2mm/parsers/_patterns.py Deleted unused shared regex/util module.
q2mm/optimizers/scipy_opt.py Makes SciPy imports lazy (import inside methods) to allow module import without SciPy.
q2mm/optimizers/objective.py Fixes from_molecules() feature drift by delegating to from_molecule() and centralizes atomic symbol lookup.
q2mm/models/seminario.py Consolidates matching helpers and introduces DEFAULT_DFT_SCALING.
q2mm/models/molecule.py Imports covalent radii from centralized q2mm.elements.
q2mm/models/identifiers.py Imports two-letter element set from centralized q2mm.elements.
q2mm/models/forcefield.py Adds generic param-map builder and export-matching helper to reduce duplication.
q2mm/elements.py New centralized periodic table data and derived lookup tables.
q2mm/diagnostics/report.py Extracts shared leaderboard row-building logic into build_leaderboard_rows().
q2mm/diagnostics/cli.py Adds --data-dir support and reuses shared leaderboard row builder.
q2mm/constants.py Re-exports MASSES from q2mm.elements and removes duplicated element data.
q2mm/backends/mm/tinker.py Makes bond detection tolerance configurable via bond_tolerance.
pyproject.toml Updates extras (all includes jax/jaxlib; dev includes openmm).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread q2mm/diagnostics/report.py
Comment thread q2mm/constants.py Outdated
Comment thread q2mm/backends/mm/tinker.py
- Fix 0.0 score treated as NaN due to falsy 'or' check in
  build_leaderboard_rows() (use explicit None check instead)
- Update constants.py docstring to reflect _patterns.py deletion
- Add bond_tolerance to TinkerEngine docstring

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@ericchansen ericchansen merged commit 8f25655 into master Mar 21, 2026
10 checks passed
@ericchansen ericchansen deleted the refactor/consolidate-test-fixtures branch March 21, 2026 06:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants