fix(rf3): squashed RF3 bug that was blocking running on models with unresolved residues #140
Conversation
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/integration/test_mismatch_integration.py`:
- Around line 638-640: The test duplicates the same assertion on
reward_inputs.input_coords.shape[-2], so replace the second duplicate with the
missing contract check (likely on reward_inputs.occupancies) to verify shapes
match expected_n_model; update the assertion to something like asserting
reward_inputs.occupancies.shape[-2] == real_pair_case.expected_n_model and keep
the existing checks on reward_inputs.elements.shape and
reward_inputs.input_coords.shape, referencing the reward_inputs object and
real_pair_case.expected_n_model to locate the change.
In `@tests/integration/test_pipeline_integration.py`:
- Around line 1076-1081: In the test docstring describing
process_structure_to_trajectory_input feeding into to_reward_inputs, replace the
multiplication sign '×' with a plain ASCII 'x' to satisfy Ruff's RUF002; update
the triple-quoted string that currently reads "wrapper × structure" to "wrapper
x structure" so the test's docstring no longer contains the non-ASCII
multiplication character.
- Around line 1113-1120: The assertion only checks atom counts but not the
validity of values; after computing reward_inputs =
processed.to_reward_inputs(device=device) and verifying n_state == n_reward, add
checks to ensure reward_inputs.elements and reward_inputs.occupancy contain
finite numbers (no NaN/Inf) and that occupancy values are within the expected
range (e.g., 0.0..1.0 or boolean 0/1 depending on the contract) so the test
fails if coordinates or occupancies are invalid; reference reward_inputs,
processed.to_reward_inputs, reward_inputs.elements, reward_inputs.occupancy,
state, n_state, and n_reward when locating where to insert these assertions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 29056aec-fb05-424b-bf50-ffad4008dc0c
📒 Files selected for processing (11)
- src/sampleworks/core/rewards/protocol.py
- src/sampleworks/core/samplers/edm.py
- src/sampleworks/eval/generate_synthetic_density.py
- src/sampleworks/eval/grid_search_eval_utils.py
- src/sampleworks/eval/structure_utils.py
- src/sampleworks/models/rf3/wrapper.py
- src/sampleworks/utils/atom_array_utils.py
- src/sampleworks/utils/guidance_script_utils.py
- tests/integration/test_mismatch_integration.py
- tests/integration/test_pipeline_integration.py
- tests/utils/test_density_utils.py
marcuscollins
left a comment
Just a couple questions to be sure of things. If you can answer them, I think it will be fine and I will approve.
```diff
- _save_trajectory(
-     trajectory, atom_array, output_dir, reward_param_mask, subdir_name, save_every
- )
+ _save_trajectory(trajectory, atom_array, output_dir, subdir_name, save_every)
```
The methods in this submodule get used for all models--did only RF3 have this issue that there might be NaN values and strange occupancies? None of the others needed the reward_param_mask?
Correct, specifically its model_atom_array was getting initialized incorrectly. Now that we have the reconciler, no need for reward_param_mask.
Pull request overview
This PR fixes a bug where RF3 models with experimentally unresolved residues would crash during guidance due to NaN coordinates and zero occupancy atoms propagating through the pipeline. The fix ensures the RF3 wrapper initializes unresolved atoms with noise-based coordinates (centered on the resolved atom centroid) and sets their occupancy to 1.0. As a result, the reward_param_mask / mask_like masking mechanism in RewardInputs is no longer needed and has been cleanly removed, simplifying the reward and trajectory saving logic. This also addresses issue #86 (NaN coordinates in output structures).
Changes:
- RF3 wrapper now replaces NaN coordinates with centroid+noise and sets occupancy=1.0 for all model atoms, ensuring clean atom arrays before downstream processing.
- Removed `reward_param_mask` and `mask_like` from `RewardInputs`, adding input validation (element/b_factor annotations, finite coords, valid occupancy) in `from_atom_array` instead.
- Simplified trajectory saving (`_write_coords_into_array`, `save_everything`) to require an exact atom count match, and migrated remaining `pyright: ignore` comments to `ty`.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| `src/sampleworks/core/rewards/protocol.py` | Removed `reward_param_mask`/`mask_like` fields; added input validation in `from_atom_array`; simplified tensor construction. |
| `src/sampleworks/models/rf3/wrapper.py` | Core fix: replace NaN coords with centroid+noise, set occupancy=1.0 for all model atoms. |
| `src/sampleworks/eval/structure_utils.py` | Removed mask-based `input_coords` filtering in `to_reward_inputs`; renamed variable; formatting cleanup. |
| `src/sampleworks/utils/guidance_script_utils.py` | Removed `reward_param_mask` from save/trajectory functions; simplified `_write_coords_into_array`. |
| `src/sampleworks/utils/atom_array_utils.py` | Removed `pyright: ignore` comments (migrated to `ty`). |
| `src/sampleworks/core/samplers/edm.py` | Comment update: pyright → ty. |
| `src/sampleworks/eval/generate_synthetic_density.py` | Comment update: pyright → ty. |
| `src/sampleworks/eval/grid_search_eval_utils.py` | Comment update: pyright → ty. |
| `tests/integration/test_reward_mask_mismatch.py` | New test file verifying model-only atoms participate in rewards and the end-to-end pipeline works. |
| `tests/integration/test_pipeline_integration.py` | Removed `reward_param_mask`/`mask_like` from test constructions; added `TestRealWrapperPreprocessing` slow tests. |
| `tests/integration/test_mismatch_integration.py` | Removed `RealPairExpectation` tests and associated helpers; cleaned up reward assertions. |
| `tests/utils/test_density_utils.py` | Removed `pyright: ignore` comment. |
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/sampleworks/models/rf3/wrapper.py (1)
**318-345: Assert the post-cleanup `model_aa` invariant.**

This block fixes the happy path, but it never validates that the final `model_aa` still matches `atom_to_token_map` and contains only finite coordinates. Adding an explicit guard here would turn any upstream preprocessing drift into a clear failure instead of silently propagating a bad model atom array.

♻️ Suggested guard

```diff
 if len(model_aa) > num_atoms:
     model_aa = cast(AtomArray, model_aa[:num_atoms])
+if len(model_aa) != num_atoms:
+    raise ValueError(
+        "RF3 model atom array length mismatch: "
+        f"len(model_aa)={len(model_aa)}, num_atoms={num_atoms}"
+    )
 # atomworks's add_missing_atoms adds unresolved atoms with
 # occupancy=0.0 and NaN coordinates when we get our atom array with
 # InferenceInput.from_atom_array. RF3 operates on these atoms (they're
 # in atom_to_token_map), so initialize their coordinates with noise and
@@
 # All atoms in model_aa are operated on by RF3 during diffusion.
 # Set occupancy to 1.0 regardless of what atomworks assigned (unresolved
 # atoms from add_missing_atoms get occupancy=0.0, but RF3 should use them)
 model_aa.set_annotation("occupancy", np.ones(len(model_aa), dtype=np.float32))
+if not np.isfinite(model_aa.coord).all():
+    raise ValueError("RF3 model atom array still contains non-finite coordinates")
```

As per coding guidelines, "Fail fast with clear error messages when invalid states are encountered."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/sampleworks/models/rf3/wrapper.py` around lines 318 - 345, Add a fail-fast guard after the cleanup block to validate the model_aa invariant: ensure len(model_aa) equals len(atom_to_token_map) (or that every index in atom_to_token_map is < len(model_aa)) and that all coordinates in model_aa.coord are finite (no NaN/Inf). If either check fails, raise a descriptive exception (including the lengths and a short diagnostics string) so upstream preprocessing drift is caught early; place this check immediately after the occupancy set via model_aa.set_annotation and before further use of model_aa/b_factor.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/integration/test_reward_mask_mismatch.py`:
- Around line 237-240: The test currently only asserts shapes and finite losses
but not the numeric values of emitted coordinates; update the assertions to
verify numeric finiteness of the predicted coordinates by checking that all
entries in result.final_state are finite (e.g., use np.isfinite on
result.final_state.ravel() or result.final_state). Also keep the existing checks
on result.losses (result.losses -> valid_losses) but add an assert that all
coordinates in result.final_state are finite to prevent NaN-coordinate
regressions.
- Around line 255-260: The test currently fabricates `state = torch.randn(1,
case.n_model, 3)`, which masks bugs in `_preprocess()`; replace the synthetic
state with a real pipeline-produced state (e.g. use `processed.input_coords`
from running the pipeline preprocessing step or call the public pipeline method
that yields `processed`) and then compare its atom dimension against
`reward_inputs.elements.shape[-1]`; update the assertion to use `n_state =
processed.input_coords.shape[-2]` (or cast `processed.input_coords` to a tensor
as needed) and assert `n_state == n_reward` to ensure the test validates real
preprocessing output rather than `case.n_model`.
---
Nitpick comments:
In `@src/sampleworks/models/rf3/wrapper.py`:
- Around line 318-345: Add a fail-fast guard after the cleanup block to validate
the model_aa invariant: ensure len(model_aa) equals len(atom_to_token_map) (or
that every index in atom_to_token_map is < len(model_aa)) and that all
coordinates in model_aa.coord are finite (no NaN/Inf). If either check fails,
raise a descriptive exception (including the lengths and a short diagnostics
string) so upstream preprocessing drift is caught early; place this check
immediately after the occupancy set via model_aa.set_annotation and before
further use of model_aa/b_factor.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6ea609b4-f03d-42eb-9d77-7a2021d47d64
📒 Files selected for processing (4)
- src/sampleworks/models/rf3/wrapper.py
- tests/integration/test_mismatch_integration.py
- tests/integration/test_pipeline_integration.py
- tests/integration/test_reward_mask_mismatch.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/integration/test_pipeline_integration.py
Force-pushed from 71377be to 859566a (Compare)
…_atom_array, causing indexing errors downstream
…eward must be computable on the whole model output. This is reasonable because the reconciler takes care of mismatches now, so we can now move the responsibility of ensuring the atoms are all valid for the model to the ModelWrapper itself.
Force-pushed from 859566a to 7c854b9 (Compare)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/sampleworks/utils/guidance_script_utils.py (1)
**108-115: ⚠️ Potential issue | 🟠 Major — Call `.cpu()` before `.numpy()` when saving trajectories on GPU.**

Lines 114 and 138 use `.detach().numpy()`, which raises `RuntimeError` on CUDA tensors. The final-state path at line 308 already uses the correct pattern with `.detach().cpu().numpy()`. These trajectory save paths will cause GPU guidance runs to succeed during computation, then fail only during serialization.

Proposed fix

```diff
- _write_coords_into_array(array_copy, coords.detach().numpy())
+ _write_coords_into_array(array_copy, coords.detach().cpu().numpy())

- _write_coords_into_array(array_copy, coords[0].detach().numpy())
+ _write_coords_into_array(array_copy, coords[0].detach().cpu().numpy())
```
Verify each finding against the current code and only fix it if needed. In `@src/sampleworks/utils/guidance_script_utils.py` around lines 108 - 115, The trajectory-saving code uses coords.detach().numpy() which will raise on CUDA tensors; update the serialization paths that call _write_coords_into_array(...) to call coords.detach().cpu().numpy() instead (where trajectory is iterated and array_copy is prepared and passed to save_structure and _write_coords_into_array) so GPU tensors are moved to CPU before converting to numpy; mirror the same pattern used for the final-state save.
♻️ Duplicate comments (1)
tests/integration/test_mismatch_integration.py (1)
**677-681: ⚠️ Potential issue | 🟡 Minor — Replace the duplicated shape assert with the missing `occupancies` check.**

Line 681 repeats the `input_coords` assertion, so this test still never verifies that `reward_inputs.occupancies` stays aligned with the model atom count after removing `mask_like`. As per coding guidelines, `tests/**/*.py` should "Verify test outputs match contracts: shapes, value ranges, mathematical properties".

Proposed fix

```diff
 assert reward_inputs.elements.shape[-1] == n_model
 assert reward_inputs.b_factors.shape[-1] == n_model
 assert reward_inputs.input_coords.shape[-2] == n_model
-assert reward_inputs.input_coords.shape[-2] == n_model
+assert reward_inputs.occupancies.shape[-1] == n_model
```
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_mismatch_integration.py` around lines 677 - 681, The fourth assert duplicates the input_coords shape check instead of verifying occupancies; replace the second "assert reward_inputs.input_coords.shape[-2] == n_model" with an assertion that reward_inputs.occupancies.shape[-1] == n_model so the test verifies occupancies remain aligned with n_model (use the same n_model variable and reward_inputs object as in the surrounding checks).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/sampleworks/core/rewards/protocol.py`:
- Around line 69-77: The current validation in RewardInputs.from_atom_array()
checks atom_array.coord (and occupancy/b_factor) but the reward pipeline uses
SampleworksProcessedStructure.input_coords as the canonical coordinates; move
NaN/occupancy coordinate validation out of RewardInputs.from_atom_array() and
instead perform it where the final coordinate path is assembled (i.e., validate
SampleworksProcessedStructure.input_coords before rewards are computed), or
alternatively require and document that wrappers producing model_atom_array must
sanitize model_atom_array.coord (and occupancy/b_factor) before calling
to_reward_inputs(); update checks to reference input_coords or add an explicit
precondition in the RewardInputs.from_atom_array() docstring mentioning
sanitized model_atom_array.
---
Outside diff comments:
In `@src/sampleworks/utils/guidance_script_utils.py`:
- Around line 108-115: The trajectory-saving code uses coords.detach().numpy()
which will raise on CUDA tensors; update the serialization paths that call
_write_coords_into_array(...) to call coords.detach().cpu().numpy() instead
(where trajectory is iterated and array_copy is prepared and passed to
save_structure and _write_coords_into_array) so GPU tensors are moved to CPU
before converting to numpy; mirror the same pattern used for the final-state
save.
---
Duplicate comments:
In `@tests/integration/test_mismatch_integration.py`:
- Around line 677-681: The fourth assert duplicates the input_coords shape check
instead of verifying occupancies; replace the second "assert
reward_inputs.input_coords.shape[-2] == n_model" with an assertion that
reward_inputs.occupancies.shape[-1] == n_model so the test verifies occupancies
remain aligned with n_model (use the same n_model variable and reward_inputs
object as in the surrounding checks).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 77993f70-fbeb-4d6c-84e5-28e40962442e
📒 Files selected for processing (11)
- src/sampleworks/core/rewards/protocol.py
- src/sampleworks/core/samplers/edm.py
- src/sampleworks/eval/generate_synthetic_density.py
- src/sampleworks/eval/grid_search_eval_utils.py
- src/sampleworks/eval/structure_utils.py
- src/sampleworks/models/rf3/wrapper.py
- src/sampleworks/utils/atom_array_utils.py
- src/sampleworks/utils/guidance_script_utils.py
- tests/integration/test_mismatch_integration.py
- tests/integration/test_pipeline_integration.py
- tests/utils/test_density_utils.py
🚧 Files skipped from review as they are similar to previous changes (6)
- src/sampleworks/eval/generate_synthetic_density.py
- src/sampleworks/eval/structure_utils.py
- src/sampleworks/models/rf3/wrapper.py
- tests/utils/test_density_utils.py
- src/sampleworks/eval/grid_search_eval_utils.py
- src/sampleworks/core/samplers/edm.py
```python
# input validation: ensure atom_array has required annotations and valid values
if not hasattr(atom_array, "element"):
    raise ValueError("Atom array must have 'element' annotation.")
if not hasattr(atom_array, "b_factor"):
    raise ValueError("Atom array must have 'b_factor' annotation.")
if np.any(np.isnan(atom_array.coord)):
    raise ValueError("Atom array contains NaN coordinates.")
if np.any((atom_array.occupancy <= 0) | (atom_array.occupancy > 1)):
    raise ValueError("Atom array contains invalid occupancy values.")
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
set -euo pipefail
printf '\n=== to_reward_inputs implementation ===\n'
sed -n '1,220p' src/sampleworks/eval/structure_utils.py
printf '\n=== wrapper/model-atom-array sanitation sites ===\n'
rg -n -C3 'model_atom_array|occupanc|isnan|isfinite|filter_zero_occupancy|coord' \
  src/sampleworks/models src/sampleworks/eval/structure_utils.py
```

Repository: diff-use/sampleworks
Length of output: 50376
🏁 Script executed:

```shell
#!/bin/bash
# Check Protenix wrapper for coordinate sanitization
rg -n -A10 'model_atom_array|model_aa\s*=|coord.*protenix' \
  src/sampleworks/models/protenix/wrapper.py | head -100
# Also check if Protenix structure_processing has any sanitization
fd -e py protenix | xargs rg -l 'nan|isfinite|coord' | head -5
```

Repository: diff-use/sampleworks
Length of output: 2209
Move the NaN/occupancy validation to the final coordinate path, or require all wrappers to sanitize model_atom_array beforehand.
The validation at lines 69-77 checks atom_array.coord before it's used in rewards, but RewardInputs.from_atom_array() uses this only for atom accounting. The actual reward coordinates come from SampleworksProcessedStructure.input_coords (line 97 of structure_utils.py), which are already clean and pre-aligned to the model space. Protenix's wrapper sets occupancy and b_factor on model_atom_array but does not sanitize its coordinates, so this validation will reject valid inputs where the template has placeholder NaNs but input_coords is finite. Either validate input_coords instead, or document that all wrappers must sanitize model_atom_array.coord before calling to_reward_inputs().
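A minimal sketch of the first option suggested above — validating the canonical coordinate path rather than the template atom array. `validate_input_coords` is a hypothetical helper, not existing sampleworks API; the idea is simply to move the finiteness/occupancy checks to where rewards actually consume coordinates.

```python
import numpy as np

def validate_input_coords(input_coords: np.ndarray, occupancies: np.ndarray) -> None:
    """Fail fast if the canonical reward coordinates or occupancies are invalid.

    Intended to run just before reward computation, on the coordinates that
    rewards actually use (input_coords), so template arrays with placeholder
    NaNs are not spuriously rejected.
    """
    if not np.isfinite(input_coords).all():
        raise ValueError("input_coords contains non-finite values (NaN/Inf).")
    if np.any((occupancies <= 0) | (occupancies > 1)):
        raise ValueError("occupancies outside the (0, 1] range.")
```

This keeps `RewardInputs.from_atom_array()` purely about atom accounting, while invalid coordinates still fail fast at the point where they would corrupt reward values.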
🧰 Tools
🪛 Ruff (0.15.4)
[warning] 71-71: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 73-73: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 75-75: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 77-77: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/sampleworks/core/rewards/protocol.py` around lines 69 - 77, The current
validation in RewardInputs.from_atom_array() checks atom_array.coord (and
occupancy/b_factor) but the reward pipeline uses
SampleworksProcessedStructure.input_coords as the canonical coordinates; move
NaN/occupancy coordinate validation out of RewardInputs.from_atom_array() and
instead perform it where the final coordinate path is assembled (i.e., validate
SampleworksProcessedStructure.input_coords before rewards are computed), or
alternatively require and document that wrappers producing model_atom_array must
sanitize model_atom_array.coord (and occupancy/b_factor) before calling
to_reward_inputs(); update checks to reference input_coords or add an explicit
precondition in the RewardInputs.from_atom_array() docstring mentioning
sanitized model_atom_array.
marcuscollins
left a comment
Questions answered, thanks. Approved.
This also removed the reward_param_mask as it isn't needed anymore if we are computing common atoms and saving the atoms that the model is operating on instead of having to reconcile and save the original structure. This should close #86 as well.