fix(rf3): add_missing_atoms also, along with giving elements that wer… by k-chrispens · Pull Request #171 · diff-use/sampleworks

k-chrispens · 2026-03-15T04:12:01Z

…en't in the base structure, adds nan coords, b-factors, and occupancies of 0. We now handle this.

Summary by CodeRabbit

New Features
- Automatic replacement of NaN B-factors with a default value (20.0) and informational logging of replacements.
- New test fixture to simulate structures with missing atoms.
Improvements
- Earlier rejection of NaN B-factors and NaN coordinates to prevent invalid inputs.
- Occupancy validation relaxed to allow zero while still rejecting values outside [0,1].
Tests
- Added tests covering missing atoms, NaN B-factors, NaN coordinates, and occupancy edge cases.

coderabbitai · 2026-03-15T04:12:15Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d3a8a2b0-d934-48ff-ae7e-9cf1f8081959

📥 Commits

Reviewing files that changed from the base of the PR and between 2b3d73a and 37d182e.

📒 Files selected for processing (5)

src/sampleworks/core/rewards/protocol.py
src/sampleworks/models/protenix/wrapper.py
src/sampleworks/models/rf3/wrapper.py
tests/conftest.py
tests/rewards/test_reward_inputs.py

🚧 Files skipped from review as they are similar to previous changes (5)

src/sampleworks/models/protenix/wrapper.py
tests/rewards/test_reward_inputs.py
tests/conftest.py
src/sampleworks/core/rewards/protocol.py
src/sampleworks/models/rf3/wrapper.py

📝 Walkthrough

Walkthrough

Validate AtomArray B-factors (reject NaN), allow occupancy == 0 while enforcing 0 ≤ occupancy ≤ 1, and add TODO about occupancy handling for multi-particle steering. Model featurizers replace NaN B-factors with 20.0 and log replacements. Tests add fixture for AtomArrays with missing atoms and cover these validations.
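The validation rules in this walkthrough can be sketched as follows. This is a minimal illustration, assuming plain NumPy arrays stand in for the AtomArray annotations; the function name `validate_atom_fields` and the error messages are hypothetical and are not taken from `protocol.py`.

```python
import numpy as np


def validate_atom_fields(coord, b_factor, occupancy):
    """Hypothetical stand-in for the checks described in the walkthrough."""
    coord = np.asarray(coord, dtype=float)
    b_factor = np.asarray(b_factor, dtype=float)
    occupancy = np.asarray(occupancy, dtype=float)

    # Reject NaN coordinates and NaN B-factors before building any tensors.
    if np.isnan(coord).any():
        raise ValueError("coordinates contain NaN")
    if np.isnan(b_factor).any():
        raise ValueError("B-factors contain NaN")

    # Occupancy of exactly 0 is allowed; anything outside [0, 1] is not.
    # A NaN occupancy also fails here, since comparisons with NaN are False.
    if not np.all((occupancy >= 0.0) & (occupancy <= 1.0)):
        raise ValueError("occupancy must lie in [0, 1]")
```

An array with occupancies `[0.0, 0.5, 1.0]` passes this sketch, while `-0.1` or `1.5` raises `ValueError`.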

Changes

| Cohort / File(s) | Summary |
|---|---|
| Reward protocol: `src/sampleworks/core/rewards/protocol.py` | Added validation to raise on NaN B-factors; relaxed occupancy check to allow 0 while enforcing [0,1]; added TODO about configurable occupancy handling for multi-particle steering. |
| Model wrapper featurization: `src/sampleworks/models/protenix/wrapper.py`, `src/sampleworks/models/rf3/wrapper.py` | When `b_factor` contains NaNs, replace NaNs with `20.0` and log count/context. `protenix` adds an explicit local type annotation only. |
| Tests & fixtures: `tests/conftest.py`, `tests/rewards/test_reward_inputs.py` | Imported `parse_atom_array`; added `atom_array_1vme_with_missing_atoms` fixture producing AtomArray with missing atoms; added tests asserting rejection of NaN B-factors and NaN coordinates, occupancy bounds, and acceptance of cleaned arrays. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

fix(rf3): squashed RF3 bug that was blocking running on models with unresolved residues #140: Modifies validation/handling of B-factors and occupancies in reward inputs and related wrappers.

Poem

🐇 I sniffed the NaNs beneath the log,
I nudged them up to twenty in a jog,
I wrote a test to catch the sly,
Now clean arrays hop and multiply,
A carrot for each fixed small fog 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title is truncated and incomplete; it cuts off mid-sentence at 'elements that wer…', making it unclear what the full change description is. | Provide the complete, untruncated pull request title so the full intent and scope of changes can be properly evaluated. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%. |

Copilot

Pull request overview

Adds stricter RewardInputs.from_atom_array() input validation (explicitly rejecting NaN B-factors) and updates model wrappers/tests to ensure atom arrays are cleaned accordingly, particularly when add_missing_atoms introduces unresolved atoms.

Changes:

- Add a focused test suite for RewardInputs.from_atom_array() validation failures (NaN B-factors/coords, invalid occupancy).
- Introduce a conftest fixture that reproduces add_missing_atoms behavior for a real structure input.
- Update RF3 and Protenix wrappers to replace NaN B-factors with a default (20.0) before constructing reward inputs.
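The wrapper-side cleanup listed above might look roughly like the sketch below. It assumes the B-factors are available as a NumPy array; the helper name `replace_nan_b_factors`, the `context` argument, and the log message are hypothetical, and only the default of 20.0 comes from the PR summary.

```python
import logging

import numpy as np

logger = logging.getLogger(__name__)

DEFAULT_B_FACTOR = 20.0  # default value named in the PR summary


def replace_nan_b_factors(b_factor, context="model atom array"):
    """Return a cleaned copy of the B-factors and the number of NaNs replaced."""
    cleaned = np.array(b_factor, dtype=float)  # np.array copies, input is untouched
    nan_mask = np.isnan(cleaned)
    n_nan = int(nan_mask.sum())
    if n_nan > 0:
        cleaned[nan_mask] = DEFAULT_B_FACTOR
        # Informational log so the substitution is visible to the caller.
        logger.info(
            "Replaced %d NaN B-factors with %.1f in %s",
            n_nan,
            DEFAULT_B_FACTOR,
            context,
        )
    return cleaned, n_nan
```

Returning a copy rather than mutating in place is a choice made for this sketch; the actual wrappers may assign back onto the atom array directly.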

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| tests/rewards/test_reward_inputs.py | New tests covering RewardInputs validation and wrapper-style cleanup behavior |
| tests/conftest.py | Adds fixture to generate an atom array with missing atoms added (incl. unresolved atoms) |
| src/sampleworks/models/rf3/wrapper.py | Sanitizes NaN B-factors on model atom arrays produced via `add_missing_atoms` |
| src/sampleworks/models/protenix/wrapper.py | Adds typing for `sequence_data` and sanitizes NaN B-factors similarly to RF3 |
| src/sampleworks/core/rewards/protocol.py | Adds NaN B-factor validation; adjusts occupancy validation logic |

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/sampleworks/core/rewards/protocol.py`:
- Around line 79-80: Docstring inconsistency: update the docstrings in
protocol.py that currently claim "all occupancies positive"/"positive occupancy"
to reflect the current validation (atom_array.occupancy allowed in [0, 1],
inclusive of 0) and note the interaction with downstream filtering; mention that
zeros are permitted by validate (see atom_array.occupancy check) but
RealSpaceRewardFunction.structure_to_reward_input filters atoms with occupancy >
0 so zero-occupancy atoms will be excluded (e.g., atoms added by
add_missing_atoms), ensuring callers understand both the validation rule and
downstream behavior.
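The interaction this comment describes, where zero occupancy passes validation but is dropped by the downstream filter, can be illustrated with a small sketch. The arrays are made up for illustration, and the two expressions only mirror the rules as stated in the comment, not the actual code in `protocol.py` or `RealSpaceRewardFunction`.

```python
import numpy as np

# Example occupancies; the zeros mimic atoms added by add_missing_atoms.
occupancy = np.array([1.0, 0.0, 0.5, 0.0])

# Validation rule as described: every value must lie in [0, 1], so zeros pass.
passes_validation = bool(np.all((occupancy >= 0.0) & (occupancy <= 1.0)))

# Downstream filter as described: only atoms with occupancy > 0 are kept.
kept_indices = np.flatnonzero(occupancy > 0)
```

Here the array is valid, yet only atoms 0 and 2 survive the filter, which is the behavior the docstring update is meant to spell out.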

In `@tests/rewards/test_reward_inputs.py`:
- Around line 63-77: The test test_zero_occupancy_rejected assumes zero
occupancy is invalid but protocol validation now accepts occupancy == 0; update
the test to assert the actual invalid case by setting occupancies[0:3] = -0.1
(or another negative value) and still expect
RewardInputs.from_atom_array(atom_array, ensemble_size=1) to raise ValueError,
or if zero occupancy should be rejected instead, revert the protocol validation
change that relaxed the check from occupancy <= 0 to occupancy < 0 so
test_zero_occupancy_rejected and RewardInputs.from_atom_array behavior remain
consistent.
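To see why the test has to switch to a negative value, the two low-end occupancy rules named in this comment can be compared directly. This is a sketch with hypothetical helper names (`rejects_strict`, `rejects_relaxed`); it covers only the lower bound and is not the project's validation code.

```python
import numpy as np


def rejects_strict(occupancy):
    """Earlier rule as described in the comment: any occupancy <= 0 is invalid."""
    return bool(np.any(np.asarray(occupancy, dtype=float) <= 0))


def rejects_relaxed(occupancy):
    """Relaxed rule as described: only occupancy < 0 is invalid at the low end."""
    return bool(np.any(np.asarray(occupancy, dtype=float) < 0))


zero_case = np.array([0.0, 0.0, 0.0, 1.0])       # what the old test set up
negative_case = np.array([-0.1, -0.1, -0.1, 1.0])  # the suggested replacement
```

Under the relaxed rule `zero_case` is accepted, so a test expecting `ValueError` for it would fail, while `negative_case` is rejected by both rules.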

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5db115db-0b19-4851-880c-45e734c87d26

📥 Commits

Reviewing files that changed from the base of the PR and between e5db928 and 8022443.

📒 Files selected for processing (5)

src/sampleworks/core/rewards/protocol.py
src/sampleworks/models/protenix/wrapper.py
src/sampleworks/models/rf3/wrapper.py
tests/conftest.py
tests/rewards/test_reward_inputs.py

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/rewards/test_reward_inputs.py`:
- Around line 12-23: Convert the plain-text docstrings in the
TestRewardInputsFromAtomArray test class and its test methods
(TestRewardInputsFromAtomArray, test_nan_b_factors_from_missing_atoms_rejected,
test_cleaned_missing_atoms_accepted) to NumPy-style docstrings: add a one-line
summary, a blank line, and sections like Parameters (if any), Returns (if any),
and Notes or Examples as appropriate for tests; ensure each docstring follows
the NumPy convention (triple-quoted, summary then section headers) and update
both the class-level docstring and the two method-level docstrings to match the
guideline.
- Around line 56-58: Update the TestRewardInputsFromAtomArray class and its test
methods to use NumPy-style docstrings (describe parameters, returns, and brief
summary) instead of one-line docstrings, and replace the fragile rank-dependent
indexing coords[..., 3, :] in the test with an explicit check on
atom_array.coord: verify it's an np.ndarray and branch on ndim (ndim == 3 for
AtomArrayStack use coords[:, 3, :] = np.nan; otherwise use coords[3, :] =
np.nan) before assigning atom_array.coord = coords so the code is robust to both
AtomArray and AtomArrayStack shapes.
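The rank-explicit assignment suggested here could be sketched as below. It assumes the usual Biotite layouts, `(n_atoms, 3)` for an AtomArray and `(n_models, n_atoms, 3)` for an AtomArrayStack; the helper name `set_atom_coords_nan` is hypothetical.

```python
import numpy as np


def set_atom_coords_nan(coords, atom_index):
    """Return a float copy of coords with one atom's coordinates set to NaN."""
    if not isinstance(coords, np.ndarray):
        raise TypeError("coords must be a NumPy array")
    out = coords.astype(float, copy=True)
    if out.ndim == 3:
        # Stack layout assumed: (n_models, n_atoms, 3)
        out[:, atom_index, :] = np.nan
    elif out.ndim == 2:
        # Single-model layout assumed: (n_atoms, 3)
        out[atom_index, :] = np.nan
    else:
        raise ValueError(f"unexpected coordinate rank: {out.ndim}")
    return out
```

In a test, the result would then be assigned back with `atom_array.coord = set_atom_coords_nan(atom_array.coord, 3)`.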

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 33adb79a-e8cb-4eac-99c5-a18fc3e8ff42

📥 Commits

Reviewing files that changed from the base of the PR and between 8022443 and aa1091e.

📒 Files selected for processing (1)

tests/rewards/test_reward_inputs.py

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

tests/rewards/test_reward_inputs.py (1)

12-23: ⚠️ Potential issue | 🟡 Minor

Convert class and test docstrings to NumPy style.

Docstrings here are still plain/one-line and don’t follow the repo’s required NumPy-style format for classes/functions.

♻️ Proposed update

 class TestRewardInputsFromAtomArray:
-    """Validate that RewardInputs.from_atom_array rejects invalid atom arrays."""
+    """Validate RewardInputs.from_atom_array input handling.
+
+    Notes
+    -----
+    Covers NaN B-factors, cleaned missing-atom arrays, and NaN coordinates.
+    """

     def test_nan_b_factors_from_missing_atoms_rejected(self, atom_array_1vme_with_missing_atoms):
-        """
-        RewardInputs rejects the raw atom array from add_missing_atoms.
-        """
+        """Reject atom arrays containing NaN B-factors.
+
+        Notes
+        -----
+        Uses a fixture with unresolved atoms introduced by add_missing_atoms.
+        """
@@
     def test_cleaned_missing_atoms_accepted(self, atom_array_1vme_with_missing_atoms):
-        """After applying the same fixes as the RF3 wrapper, the atom array passes."""
+        """Accept cleaned atom arrays after RF3-style preprocessing.
+
+        Notes
+        -----
+        Applies coordinate, occupancy, and B-factor cleanup before validation.
+        """
@@
     def test_nan_coordinates_rejected(self, structure_1vme):
-        """NaN coordinates must be caught before constructing reward tensors."""
+        """Reject atom arrays with NaN coordinates.
+
+        Notes
+        -----
+        Sanitizes B-factors first so coordinate validation is the failing condition.
+        """

As per coding guidelines "Always include NumPy-style docstrings for every function and class".

Also applies to: 50-52

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/rewards/test_reward_inputs.py` around lines 12 - 23, The class and its
test methods use plain one-line docstrings instead of the repository-required
NumPy-style docstrings; update the class TestRewardInputsFromAtomArray and its
test methods test_nan_b_factors_from_missing_atoms_rejected and
test_cleaned_missing_atoms_accepted to include NumPy-style docstrings (summary
line, extended description if needed, and sections like
Parameters/Returns/Raises where applicable), and ensure the docstring for
test_nan_b_factors_from_missing_atoms_rejected documents the expected ValueError
raised by RewardInputs.from_atom_array and test fixtures (e.g.,
atom_array_1vme_with_missing_atoms) following the repo format.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/rewards/test_reward_inputs.py`:
- Around line 29-31: The centroid computation can produce all-NaN results when
resolved_coords is empty; check if resolved_coords.size == 0 (or
resolved_coords.shape[0] == 0) after masking with nan_coord_mask and, if so, set
centroid to an array of zeros with the same coordinate dimensionality (matching
aa.coord.shape[1]) and ensure n_nan is still set from nan_coord_mask; update the
code around resolved_coords, centroid, n_nan to use this fallback so subsequent
cleanup paths get valid coordinates (referencing resolved_coords, centroid,
nan_coord_mask, and aa.coord).
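A minimal sketch of the fallback described in this comment, assuming a single-model coordinate array of shape `(n_atoms, 3)`. The variable names follow the comment (`resolved_coords`, `centroid`, `nan_coord_mask`, `n_nan`), but the function `centroid_with_fallback` itself is hypothetical.

```python
import numpy as np


def centroid_with_fallback(coord):
    """Centroid of fully resolved atoms, or zeros when none are resolved."""
    coord = np.asarray(coord, dtype=float)
    # An atom counts as unresolved if any of its coordinates is NaN.
    nan_coord_mask = np.isnan(coord).any(axis=-1)
    n_nan = int(nan_coord_mask.sum())
    resolved_coords = coord[~nan_coord_mask]
    if resolved_coords.shape[0] == 0:
        # Taking the mean of an empty selection would give NaN; use the origin.
        centroid = np.zeros(coord.shape[1])
    else:
        centroid = resolved_coords.mean(axis=0)
    return centroid, n_nan
```

The unresolved atoms can then be placed at `centroid` without propagating NaNs, even in the degenerate case where every atom is unresolved.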


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9049f951-2fe0-4731-8b6f-1d4e29eec3d4

📥 Commits

Reviewing files that changed from the base of the PR and between aa1091e and a987752.

📒 Files selected for processing (5)

src/sampleworks/core/rewards/protocol.py
src/sampleworks/models/protenix/wrapper.py
src/sampleworks/models/rf3/wrapper.py
tests/conftest.py
tests/rewards/test_reward_inputs.py

🚧 Files skipped from review as they are similar to previous changes (3)

src/sampleworks/models/protenix/wrapper.py
tests/conftest.py
src/sampleworks/core/rewards/protocol.py

marcuscollins

All LGTM. Added an issue to follow up on.

Copilot AI review requested due to automatic review settings March 15, 2026 04:12

k-chrispens had a problem deploying to gpu-testing March 15, 2026 04:12 — with GitHub Actions Error

k-chrispens requested a review from marcuscollins March 15, 2026 04:12

Copilot started reviewing on behalf of k-chrispens March 15, 2026 04:12

Copilot AI reviewed Mar 15, 2026

Comment thread src/sampleworks/core/rewards/protocol.py

Comment thread tests/rewards/test_reward_inputs.py Outdated

Comment thread tests/rewards/test_reward_inputs.py

coderabbitai Bot reviewed Mar 15, 2026

Comment thread src/sampleworks/core/rewards/protocol.py

Comment thread tests/rewards/test_reward_inputs.py Outdated

k-chrispens temporarily deployed to gpu-testing March 15, 2026 04:32 — with GitHub Actions Inactive

k-chrispens had a problem deploying to gpu-testing March 15, 2026 04:32 — with GitHub Actions Failure

k-chrispens temporarily deployed to gpu-testing March 15, 2026 04:32 — with GitHub Actions Inactive

coderabbitai Bot reviewed Mar 15, 2026

Comment thread tests/rewards/test_reward_inputs.py

Comment thread tests/rewards/test_reward_inputs.py

k-chrispens force-pushed the kmc/fix-rf3-error branch from aa1091e to a987752 Compare March 16, 2026 19:47

k-chrispens temporarily deployed to gpu-testing March 16, 2026 19:47 — with GitHub Actions Inactive

coderabbitai Bot reviewed Mar 16, 2026

Comment thread tests/rewards/test_reward_inputs.py

marcuscollins had a problem deploying to gpu-testing March 17, 2026 22:28 — with GitHub Actions Error

marcuscollins approved these changes Mar 17, 2026

k-chrispens and others added 4 commits March 17, 2026 15:32

fix(rf3): add_missing_atoms also, along with giving elements that wer…

37425c4

…en't in the base structure, adds nan coords, b-factors, and occupancies of 0

fix(tests): remove zero occupancy test, we can allow that

040715f

Apply suggestions from code review--handle case where all coordinates…

e813218

… are NaN in tests. Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Update RewardInputs documentation to note that it accepts all non-neg…

37d182e

…ative occupancies, not just positive

marcuscollins force-pushed the kmc/fix-rf3-error branch from 2b3d73a to 37d182e Compare March 17, 2026 22:33

marcuscollins had a problem deploying to gpu-testing March 17, 2026 22:33 — with GitHub Actions Error

marcuscollins temporarily deployed to gpu-testing March 17, 2026 22:33 — with GitHub Actions Inactive

marcuscollins merged commit 86cf528 into main Mar 17, 2026
3 of 4 checks passed

coderabbitai Bot mentioned this pull request Mar 22, 2026

feat(rf3): chiral gradient zeroing and tracking #182

Merged

k-chrispens deleted the kmc/fix-rf3-error branch April 22, 2026 00:26

3 participants