fix(rf3): add_missing_atoms also, along with giving elements that wer… #171

Merged
marcuscollins merged 4 commits into main from kmc/fix-rf3-error on Mar 17, 2026
Conversation

@k-chrispens
Collaborator

@k-chrispens k-chrispens commented Mar 15, 2026

…en't in the base structure, adds NaN coords, NaN B-factors, and occupancies of 0. We now handle this.

Summary by CodeRabbit

  • New Features

    • Automatic replacement of NaN B-factors with a default value (20.0) and informational logging of replacements.
    • New test fixture to simulate structures with missing atoms.
  • Improvements

    • Earlier rejection of NaN B-factors and NaN coordinates to prevent invalid inputs.
    • Occupancy validation relaxed to allow zero while still rejecting values outside [0,1].
  • Tests

    • Added tests covering missing atoms, NaN B-factors, NaN coordinates, and occupancy edge cases.
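The NaN B-factor replacement listed above can be sketched roughly as follows. This is a minimal illustration, not the wrappers' actual code: the helper name `fill_nan_b_factors`, its `context` argument, and the log message format are assumptions; only the default value of 20.0 and the informational logging come from the summary.

```python
import logging

import numpy as np

logger = logging.getLogger(__name__)

DEFAULT_B_FACTOR = 20.0  # default value named in the summary above


def fill_nan_b_factors(b_factor, default=DEFAULT_B_FACTOR, context="featurize"):
    """Hypothetical helper: return (cleaned copy, number of NaNs replaced)."""
    b_factor = np.asarray(b_factor, dtype=float)
    nan_mask = np.isnan(b_factor)
    n_nan = int(nan_mask.sum())
    if n_nan:
        # Informational log only; the input array is not modified in place.
        logger.info("%s: replacing %d NaN B-factors with %.1f", context, n_nan, default)
    return np.where(nan_mask, default, b_factor), n_nan


cleaned, n_replaced = fill_nan_b_factors([15.2, np.nan, 31.0, np.nan])
```

Returning the replacement count alongside the cleaned array keeps the logging decision visible to callers and makes the behavior easy to assert in tests.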

Copilot AI review requested due to automatic review settings March 15, 2026 04:12
@coderabbitai
Contributor

coderabbitai Bot commented Mar 15, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d3a8a2b0-d934-48ff-ae7e-9cf1f8081959

📥 Commits

Reviewing files that changed from the base of the PR and between 2b3d73a and 37d182e.

📒 Files selected for processing (5)
  • src/sampleworks/core/rewards/protocol.py
  • src/sampleworks/models/protenix/wrapper.py
  • src/sampleworks/models/rf3/wrapper.py
  • tests/conftest.py
  • tests/rewards/test_reward_inputs.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • src/sampleworks/models/protenix/wrapper.py
  • tests/rewards/test_reward_inputs.py
  • tests/conftest.py
  • src/sampleworks/core/rewards/protocol.py
  • src/sampleworks/models/rf3/wrapper.py

📝 Walkthrough

Walkthrough

Validate AtomArray B-factors (reject NaN), allow occupancy == 0 while enforcing 0 ≤ occupancy ≤ 1, and add TODO about occupancy handling for multi-particle steering. Model featurizers replace NaN B-factors with 20.0 and log replacements. Tests add fixture for AtomArrays with missing atoms and cover these validations.
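The validation rules in this walkthrough might look something like the sketch below. The function name `validate_atom_fields` and the error messages are hypothetical, and the real checks in `protocol.py` may be structured differently; the sketch only mirrors the stated rules (reject NaN B-factors and NaN coordinates, allow occupancy 0, enforce 0 ≤ occupancy ≤ 1).

```python
import numpy as np


def validate_atom_fields(coord, b_factor, occupancy):
    """Hypothetical validator: raise ValueError on invalid per-atom fields."""
    coord = np.asarray(coord, dtype=float)
    b_factor = np.asarray(b_factor, dtype=float)
    occupancy = np.asarray(occupancy, dtype=float)

    if np.isnan(coord).any():
        raise ValueError("coordinates contain NaN")
    if np.isnan(b_factor).any():
        raise ValueError("B-factors contain NaN")
    # Zero occupancy is allowed. Written as a positive check, so a NaN
    # occupancy also fails here (a choice of this sketch, not of the source).
    if not np.all((occupancy >= 0.0) & (occupancy <= 1.0)):
        raise ValueError("occupancy must lie in [0, 1]")


# Passes: zero occupancy is within the allowed range.
validate_atom_fields(np.zeros((3, 3)), [20.0, 20.0, 20.0], [0.0, 0.5, 1.0])
```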

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Reward protocol: src/sampleworks/core/rewards/protocol.py | Added validation to raise on NaN B-factors; relaxed occupancy check to allow 0 while enforcing [0,1]; added TODO about configurable occupancy handling for multi-particle steering. |
| Model wrapper featurization: src/sampleworks/models/protenix/wrapper.py, src/sampleworks/models/rf3/wrapper.py | When b_factor contains NaNs, replace NaNs with 20.0 and log count/context. protenix adds an explicit local type annotation only. |
| Tests & fixtures: tests/conftest.py, tests/rewards/test_reward_inputs.py | Imported parse_atom_array; added atom_array_1vme_with_missing_atoms fixture producing AtomArray with missing atoms; added tests asserting rejection of NaN B-factors and NaN coordinates, occupancy bounds, and acceptance of cleaned arrays. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐇 I sniffed the NaNs beneath the log,
I nudged them up to twenty in a jog,
I wrote a test to catch the sly,
Now clean arrays hop and multiply,
A carrot for each fixed small fog 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title is truncated and incomplete; it cuts off mid-sentence at 'elements that wer…', making it unclear what the full change description is. | Provide the complete, untruncated pull request title so the full intent and scope of changes can be properly evaluated. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%. |

Contributor

Copilot AI left a comment

Pull request overview

Adds stricter RewardInputs.from_atom_array() input validation (explicitly rejecting NaN B-factors) and updates model wrappers/tests to ensure atom arrays are cleaned accordingly, particularly when add_missing_atoms introduces unresolved atoms.

Changes:

  • Add a focused test suite for RewardInputs.from_atom_array() validation failures (NaN B-factors/coords, invalid occupancy).
  • Introduce a conftest fixture that reproduces add_missing_atoms behavior for a real structure input.
  • Update RF3 and Protenix wrappers to replace NaN B-factors with a default (20.0) before constructing reward inputs.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/rewards/test_reward_inputs.py | New tests covering RewardInputs validation and wrapper-style cleanup behavior |
| tests/conftest.py | Adds fixture to generate an atom array with missing atoms added (incl. unresolved atoms) |
| src/sampleworks/models/rf3/wrapper.py | Sanitizes NaN B-factors on model atom arrays produced via add_missing_atoms |
| src/sampleworks/models/protenix/wrapper.py | Adds typing for sequence_data and sanitizes NaN B-factors similarly to RF3 |
| src/sampleworks/core/rewards/protocol.py | Adds NaN B-factor validation; adjusts occupancy validation logic |

Comment thread src/sampleworks/core/rewards/protocol.py
Comment thread tests/rewards/test_reward_inputs.py Outdated
Comment thread tests/rewards/test_reward_inputs.py
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/sampleworks/core/rewards/protocol.py`:
- Around line 79-80: Docstring inconsistency: update the docstrings in
protocol.py that currently claim "all occupancies positive"/"positive occupancy"
to reflect the current validation (atom_array.occupancy allowed in [0, 1],
inclusive of 0) and note the interaction with downstream filtering; mention that
zeros are permitted by validate (see atom_array.occupancy check) but
RealSpaceRewardFunction.structure_to_reward_input filters atoms with occupancy >
0 so zero-occupancy atoms will be excluded (e.g., atoms added by
add_missing_atoms), ensuring callers understand both the validation rule and
downstream behavior.
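The interaction described in this comment can be illustrated with plain NumPy masks. This is a sketch under the assumptions stated in the comment itself (validation accepts values in [0, 1]; a later step keeps only atoms with occupancy > 0); the variable names are illustrative and nothing here calls the actual `RealSpaceRewardFunction` code.

```python
import numpy as np

# Two of these atoms stand in for placeholders with zero occupancy.
occupancy = np.array([1.0, 0.0, 0.5, 0.0])

# Validation rule described above: every value must lie in [0, 1], zero included.
passes_validation = bool(np.all((occupancy >= 0.0) & (occupancy <= 1.0)))

# Downstream filter described above: only atoms with occupancy > 0 are kept.
keep_mask = occupancy > 0.0
kept_indices = np.flatnonzero(keep_mask)
```

An array can therefore pass validation while contributing fewer atoms downstream than it contains, which is the behavior the docstrings would need to spell out.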

In `@tests/rewards/test_reward_inputs.py`:
- Around line 63-77: The test test_zero_occupancy_rejected assumes zero
occupancy is invalid but protocol validation now accepts occupancy == 0; update
the test to assert the actual invalid case by setting occupancies[0:3] = -0.1
(or another negative value) and still expect
RewardInputs.from_atom_array(atom_array, ensemble_size=1) to raise ValueError,
or if zero occupancy should be rejected instead, revert the protocol validation
change that relaxed the check from occupancy <= 0 to occupancy < 0 so
test_zero_occupancy_rejected and RewardInputs.from_atom_array behavior remain
consistent.
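To make the difference between the two checks concrete, the sketch below contrasts the lower-bound rule before and after the relaxation mentioned in this comment. Both function names are hypothetical, and the upper bound on occupancy is left out to keep the comparison focused.

```python
import numpy as np


def rejects_strict(occupancy):
    """Earlier rule per the comment: any occupancy <= 0 is invalid."""
    return bool(np.any(np.asarray(occupancy, dtype=float) <= 0))


def rejects_relaxed(occupancy):
    """Relaxed rule per the comment: only occupancy < 0 is invalid."""
    return bool(np.any(np.asarray(occupancy, dtype=float) < 0))


zero_case = [0.0, 1.0, 1.0]       # rejected only by the strict rule
negative_case = [-0.1, 1.0, 1.0]  # rejected by both rules
```

A test that expects a `ValueError` should therefore use a value such as -0.1, which stays invalid under either rule.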

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5db115db-0b19-4851-880c-45e734c87d26

📥 Commits

Reviewing files that changed from the base of the PR and between e5db928 and 8022443.

📒 Files selected for processing (5)
  • src/sampleworks/core/rewards/protocol.py
  • src/sampleworks/models/protenix/wrapper.py
  • src/sampleworks/models/rf3/wrapper.py
  • tests/conftest.py
  • tests/rewards/test_reward_inputs.py

Comment thread src/sampleworks/core/rewards/protocol.py
Comment thread tests/rewards/test_reward_inputs.py Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/rewards/test_reward_inputs.py`:
- Around line 12-23: Convert the plain-text docstrings in the
TestRewardInputsFromAtomArray test class and its test methods
(TestRewardInputsFromAtomArray, test_nan_b_factors_from_missing_atoms_rejected,
test_cleaned_missing_atoms_accepted) to NumPy-style docstrings: add a one-line
summary, a blank line, and sections like Parameters (if any), Returns (if any),
and Notes or Examples as appropriate for tests; ensure each docstring follows
the NumPy convention (triple-quoted, summary then section headers) and update
both the class-level docstring and the two method-level docstrings to match the
guideline.
- Around line 56-58: Update the TestRewardInputsFromAtomArray class and its test
methods to use NumPy-style docstrings (describe parameters, returns, and brief
summary) instead of one-line docstrings, and replace the fragile rank-dependent
indexing coords[..., 3, :] in the test with an explicit check on
atom_array.coord: verify it's an np.ndarray and branch on ndim (ndim == 3 for
AtomArrayStack use coords[:, 3, :] = np.nan; otherwise use coords[3, :] =
np.nan) before assigning atom_array.coord = coords so the code is robust to both
AtomArray and AtomArrayStack shapes.
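The explicit `ndim` branch suggested here could be sketched as below. The helper name `set_atom_coord_nan` is hypothetical, and the shape conventions are assumptions taken from the comment: `(n_atoms, 3)` for a single array and `(n_models, n_atoms, 3)` for a stack.

```python
import numpy as np


def set_atom_coord_nan(coord, atom_index):
    """Hypothetical helper: return a copy with one atom's coordinates set to NaN."""
    if not isinstance(coord, np.ndarray):
        raise TypeError("coord must be a numpy array")
    coords = coord.astype(float, copy=True)
    if coords.ndim == 3:
        # Stack layout assumed: (n_models, n_atoms, 3)
        coords[:, atom_index, :] = np.nan
    elif coords.ndim == 2:
        # Single-model layout assumed: (n_atoms, 3)
        coords[atom_index, :] = np.nan
    else:
        raise ValueError(f"unexpected coord ndim: {coords.ndim}")
    return coords


single = set_atom_coord_nan(np.zeros((5, 3)), 3)
stack = set_atom_coord_nan(np.zeros((2, 5, 3)), 3)
```

Working on a copy and assigning the result back to `atom_array.coord` in the test keeps the fixture data untouched between tests.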

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 33adb79a-e8cb-4eac-99c5-a18fc3e8ff42

📥 Commits

Reviewing files that changed from the base of the PR and between 8022443 and aa1091e.

📒 Files selected for processing (1)
  • tests/rewards/test_reward_inputs.py

Comment thread tests/rewards/test_reward_inputs.py
Comment thread tests/rewards/test_reward_inputs.py
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
tests/rewards/test_reward_inputs.py (1)

12-23: ⚠️ Potential issue | 🟡 Minor

Convert class and test docstrings to NumPy style.

Docstrings here are still plain/one-line and don’t follow the repo’s required NumPy-style format for classes/functions.

♻️ Proposed update
 class TestRewardInputsFromAtomArray:
-    """Validate that RewardInputs.from_atom_array rejects invalid atom arrays."""
+    """Validate RewardInputs.from_atom_array input handling.
+
+    Notes
+    -----
+    Covers NaN B-factors, cleaned missing-atom arrays, and NaN coordinates.
+    """

     def test_nan_b_factors_from_missing_atoms_rejected(self, atom_array_1vme_with_missing_atoms):
-        """
-        RewardInputs rejects the raw atom array from add_missing_atoms.
-        """
+        """Reject atom arrays containing NaN B-factors.
+
+        Notes
+        -----
+        Uses a fixture with unresolved atoms introduced by add_missing_atoms.
+        """
@@
     def test_cleaned_missing_atoms_accepted(self, atom_array_1vme_with_missing_atoms):
-        """After applying the same fixes as the RF3 wrapper, the atom array passes."""
+        """Accept cleaned atom arrays after RF3-style preprocessing.
+
+        Notes
+        -----
+        Applies coordinate, occupancy, and B-factor cleanup before validation.
+        """
@@
     def test_nan_coordinates_rejected(self, structure_1vme):
-        """NaN coordinates must be caught before constructing reward tensors."""
+        """Reject atom arrays with NaN coordinates.
+
+        Notes
+        -----
+        Sanitizes B-factors first so coordinate validation is the failing condition.
+        """

As per coding guidelines "Always include NumPy-style docstrings for every function and class".

Also applies to: 50-52

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/rewards/test_reward_inputs.py` around lines 12 - 23, The class and its
test methods use plain one-line docstrings instead of the repository-required
NumPy-style docstrings; update the class TestRewardInputsFromAtomArray and its
test methods test_nan_b_factors_from_missing_atoms_rejected and
test_cleaned_missing_atoms_accepted to include NumPy-style docstrings (summary
line, extended description if needed, and sections like
Parameters/Returns/Raises where applicable), and ensure the docstring for
test_nan_b_factors_from_missing_atoms_rejected documents the expected ValueError
raised by RewardInputs.from_atom_array and test fixtures (e.g.,
atom_array_1vme_with_missing_atoms) following the repo format.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/rewards/test_reward_inputs.py`:
- Around line 29-31: The centroid computation can produce all-NaN results when
resolved_coords is empty; check if resolved_coords.size == 0 (or
resolved_coords.shape[0] == 0) after masking with nan_coord_mask and, if so, set
centroid to an array of zeros with the same coordinate dimensionality (matching
aa.coord.shape[1]) and ensure n_nan is still set from nan_coord_mask; update the
code around resolved_coords, centroid, n_nan to use this fallback so subsequent
cleanup paths get valid coordinates (referencing resolved_coords, centroid,
nan_coord_mask, and aa.coord).
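A minimal sketch of the suggested fallback, assuming a single-model coordinate array of shape `(n_atoms, 3)`. The function name `centroid_with_fallback` is hypothetical; the local names `nan_coord_mask`, `resolved_coords`, `centroid`, and `n_nan` follow the comment.

```python
import numpy as np


def centroid_with_fallback(coord):
    """Hypothetical helper: centroid of resolved atoms, zeros if none are resolved."""
    coord = np.asarray(coord, dtype=float)
    # An atom counts as unresolved if any of its coordinates is NaN.
    nan_coord_mask = np.isnan(coord).any(axis=-1)
    n_nan = int(nan_coord_mask.sum())
    resolved_coords = coord[~nan_coord_mask]
    if resolved_coords.shape[0] == 0:
        # No resolved atoms: avoid a NaN mean by falling back to the origin.
        centroid = np.zeros(coord.shape[1])
    else:
        centroid = resolved_coords.mean(axis=0)
    return centroid, n_nan


mixed = np.array([[0.0, 0.0, 0.0], [2.0, 4.0, 6.0], [np.nan, np.nan, np.nan]])
centroid_mixed, n_nan_mixed = centroid_with_fallback(mixed)
centroid_empty, n_nan_empty = centroid_with_fallback(np.full((4, 3), np.nan))
```

Whether the origin is a sensible placeholder depends on how the cleaned coordinates are used afterwards; the point of the sketch is only that the mean is never taken over an empty selection.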


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9049f951-2fe0-4731-8b6f-1d4e29eec3d4

📥 Commits

Reviewing files that changed from the base of the PR and between aa1091e and a987752.

📒 Files selected for processing (5)
  • src/sampleworks/core/rewards/protocol.py
  • src/sampleworks/models/protenix/wrapper.py
  • src/sampleworks/models/rf3/wrapper.py
  • tests/conftest.py
  • tests/rewards/test_reward_inputs.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/sampleworks/models/protenix/wrapper.py
  • tests/conftest.py
  • src/sampleworks/core/rewards/protocol.py

Comment thread tests/rewards/test_reward_inputs.py
Collaborator

@marcuscollins marcuscollins left a comment

All LGTM. Added an issue to follow up on.

k-chrispens and others added 4 commits March 17, 2026 15:32
…en't in the base structure, adds nan coords, b-factors, and occupancies of 0
… are NaN in tests.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@marcuscollins marcuscollins merged commit 86cf528 into main Mar 17, 2026
3 of 4 checks passed
@k-chrispens k-chrispens deleted the kmc/fix-rf3-error branch April 22, 2026 00:26
3 participants