
Dataset energy shifting #388

Merged
chrisiacovella merged 12 commits into choderalab:main from chrisiacovella:dev-preprocessing_shifting
Nov 11, 2025

Conversation

@chrisiacovella
Member

@chrisiacovella chrisiacovella commented Nov 8, 2025

Pull Request Summary

This PR adds the ability to shift the energies of systems in a dataset (after self-energies have been removed). Energies can be shifted by the minimum value (making all values non-negative), the maximum value (making all values non-positive), or the mean (centering the distribution at zero).
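Each of the three schemes amounts to subtracting one dataset-wide reference value from every energy. The following is a minimal sketch in plain Python, assuming the energies have already had self-energies removed; the helper name `shift_energies` and its signature are illustrative and are not taken from the modelforge code.

```python
from typing import List, Sequence


def shift_energies(energies: Sequence[float], mode: str) -> List[float]:
    """Subtract a dataset-wide reference value from every energy.

    Hypothetical helper for illustration; not the modelforge implementation.
    """
    if mode == "min":
        reference = min(energies)  # smallest value becomes 0, the rest are >= 0
    elif mode == "max":
        reference = max(energies)  # largest value becomes 0, the rest are <= 0
    elif mode == "mean":
        reference = sum(energies) / len(energies)  # distribution centered at 0
    else:
        raise ValueError(f"unknown shift mode: {mode!r}")
    return [e - reference for e in energies]


# Made-up energies, for demonstration only.
energies = [-4.0, -1.0, 2.0]
shifted_min = shift_energies(energies, "min")    # [0.0, 3.0, 6.0]
shifted_max = shift_energies(energies, "max")    # [-6.0, -3.0, 0.0]
shifted_mean = shift_energies(energies, "mean")  # [-3.0, 0.0, 3.0]
```

Because a single constant is subtracted in every case, energy differences between systems are unchanged by the shift.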

Key changes

Notable points that this PR has either accomplished or will accomplish.

  • The TOML file now allows a user to specify a shifting scheme, which triggers this operation during the per-datapoint preprocessing operations.

Associated Issue(s)

Pull Request Checklist

  • Issue(s) raised/addressed and linked
  • Includes appropriate unit test(s)
  • Appropriate docstring(s) added/updated
  • Appropriate .rst doc file(s) added/updated
  • PR is ready for review

@codecov-commenter

codecov-commenter commented Nov 8, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.01%. Comparing base (dfa842b) to head (8d20cd7).


Contributor

Copilot AI left a comment


Pull Request Overview

This PR adds support for energy shifting to the training pipeline, allowing energies to be shifted by their minimum, maximum, or mean values to potentially improve training stability and speed.

Key changes:

  • Introduces a shift_energies parameter with support for 'min', 'max', and 'mean' shifting modes
  • Implements energy shifting logic in the dataset processing pipeline after self-energy removal
  • Adds comprehensive test coverage for all three shifting modes
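The ordering described above (remove self-energies first, then shift) can be sketched as follows. `ShiftEnergiesMode` is the enum name used in this PR, but its member names, the `preprocess` function, and the self-energy values are all hypothetical and exist only to make the example self-contained.

```python
from enum import Enum
from typing import Dict, List, Optional, Sequence


class ShiftEnergiesMode(str, Enum):
    """Allowed shifting modes; the member names are an assumption."""

    MIN = "min"
    MAX = "max"
    MEAN = "mean"


# Made-up per-element self-energies, for illustration only.
SELF_ENERGIES: Dict[str, float] = {"H": -0.5, "O": -75.0}


def preprocess(
    total_energies: Sequence[float],
    compositions: Sequence[Sequence[str]],
    shift: Optional[ShiftEnergiesMode] = None,
) -> List[float]:
    """Hypothetical pipeline: remove self-energies, then optionally shift."""
    # Step 1: subtract the summed self-energies of each system's atoms.
    energies = [
        total - sum(SELF_ENERGIES[element] for element in elements)
        for total, elements in zip(total_energies, compositions)
    ]
    if shift is None:
        return energies
    # Step 2: shift by a reference computed from the corrected energies.
    if shift is ShiftEnergiesMode.MIN:
        reference = min(energies)
    elif shift is ShiftEnergiesMode.MAX:
        reference = max(energies)
    else:
        reference = sum(energies) / len(energies)
    return [e - reference for e in energies]


# A string from a config file maps onto the enum; an invalid string raises ValueError.
mode = ShiftEnergiesMode("min")
result = preprocess(
    [-76.25, -75.625],
    [["H", "H", "O"], ["O", "H"]],
    shift=mode,
)  # [0.0, 0.125]
```

Computing the reference after self-energy removal matters: the minimum, maximum, and mean of the raw totals generally differ from those of the corrected energies.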

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| modelforge/train/parameters.py | Adds ShiftEnergiesMode enum and shift_energies field to training parameters |
| modelforge/train/training.py | Passes shift_energies parameter to datamodule setup |
| modelforge/dataset/dataset.py | Implements energy shifting logic in _per_datapoint_operations and adds parameter to method signatures |
| modelforge/tests/test_dataset.py | Adds comprehensive test for all three energy shifting modes |
| modelforge/tests/conftest.py | Updates datamodule factory to accept shift_energies parameter |
| docs/training.rst | Documents the new shift_energies feature |
| docs/datasets.rst | Documents energy shifting in the dataset operations section |


Comment thread modelforge/dataset/dataset.py Outdated
Comment thread modelforge/dataset/dataset.py
Comment thread modelforge/dataset/dataset.py
Comment thread docs/training.rst Outdated
Comment thread docs/datasets.rst Outdated
Comment thread docs/datasets.rst Outdated
Comment thread modelforge/tests/test_dataset.py
chrisiacovella and others added 7 commits November 10, 2025 11:00
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@chrisiacovella
Member Author

CI is failing because the reference version of sake fails to install. I've commented this out, but will need to revise the testing to use static information (as we've done for schnet, etc.). #389

Contributor

Copilot AI left a comment


Pull Request Overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.



Comment thread modelforge/dataset/dataset.py Outdated
Comment thread modelforge/dataset/dataset.py Outdated
Comment thread modelforge/tests/test_dataset.py
Comment thread modelforge/tests/test_dataset.py Outdated
Comment thread modelforge/tests/test_dataset.py Outdated
chrisiacovella and others added 2 commits November 10, 2025 15:29
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@chrisiacovella chrisiacovella merged commit 8995841 into choderalab:main Nov 11, 2025
17 checks passed

3 participants