
feat(grid search args): rename --use-tweedie; switch --models to --model; switch --methods to --method #167

Merged

marcuscollins merged 4 commits into main from feat-rename-tweedie-arg on Mar 17, 2026

Conversation

Collaborator

marcuscollins commented Mar 13, 2026

rename --methods to --method and propagate the change, resolves #151

rename --models to --model and propagate the change; resolves #150

feat(grid search args): rename --use-tweedie argument to --step-scaler-type which defaults to noisespace, resolves #164

Summary by CodeRabbit

  • Refactor
    • Simplified grid search workflow to support single model and method configuration instead of multiple variants, with reorganized command-line arguments for improved clarity
    • Replaced the --use-tweedie boolean flag with a new --step-scaler-type option, allowing explicit selection between dataspace, noisespace, and none for scaler configuration

Contributor

coderabbitai Bot commented Mar 13, 2026

Warning

Rate limit exceeded

@marcuscollins has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 23 minutes and 8 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: aeb1b248-a4d6-4b22-998e-6c3d4fecb738

📥 Commits

Reviewing files that changed from the base of the PR and between 5c35e21 and 48f3490.

📒 Files selected for processing (2)
  • run_all_models.sh
  • run_grid_search.py
📝 Walkthrough

Walkthrough

This PR consolidates CLI argument handling by converting plural flags to singular (--models to --model, --methods to --method) to enforce single model/method selection, and replaces the --use-tweedie boolean flag with a flexible --step-scaler-type flag that supports multiple scaler options (dataspace, noisespace, none).
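The consolidated CLI surface described above can be sketched with argparse. The flag names, choices, and the noisespace default are taken from this PR's description; the parser below is a hypothetical minimal reconstruction, not the actual run_grid_search.py code.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical minimal parser mirroring the renamed flags from this PR;
    # help text and required/default settings are illustrative assumptions.
    parser = argparse.ArgumentParser(description="grid search over guidance settings")
    parser.add_argument("--model", required=True,
                        choices=["boltz2", "protenix", "rf3"],
                        help="single structure predictor (replaces plural --models)")
    parser.add_argument("--method", default=None,
                        help="single experimental method (replaces plural --methods)")
    parser.add_argument("--step-scaler-type", default="noisespace",
                        choices=["dataspace", "noisespace", "none"],
                        help="replaces the old --use-tweedie boolean flag")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args(["--model", "boltz2"])
    print(args.model, args.step_scaler_type)  # boltz2 noisespace
```

Note that argparse stores `--step-scaler-type` under the underscored attribute `args.step_scaler_type`, which is the name the downstream population logic reads.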

Changes

Cohort / File(s) Summary
Grid Search Configuration
run_grid_search.py
Updated GridSearchConfig data model from models: list[str] and methods: list[str] to singular model: str and method: str. Simplified job generation logic by removing nested loops over multiple models/methods; argument parsing now expects single values via args.model and args.method. Expanded CLI with additional arguments (step-scaler-type, loss-order, output-dir, dry-run, model-checkpoint). Updated log messages to reflect singular usage.
Guidance Script Arguments
src/sampleworks/utils/guidance_script_arguments.py
Replaced --use-tweedie boolean flag with new --step-scaler-type string flag (choices: "dataspace", "noisespace", "none"; default "noisespace"). Updated population logic to assign self.step_scaler_type from arguments instead of self.use_tweedie.
Guidance Script Utilities
src/sampleworks/utils/guidance_script_utils.py
Added NoScalingScaler import. Replaced use_tweedie-based conditional branching with step_scaler_type mapping: "dataspace" → DataSpaceDPSScaler, "noisespace" → NoiseSpaceDPSScaler, "none" → NoScalingScaler. Raises ValueError on invalid step_scaler_type values. Updated sampler creation to use EDMSamplerConfig and AF3EDMSampler.
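The dispatch described for guidance_script_utils.py can be sketched as a mapping from flag value to scaler class. The scaler classes below are stand-in stubs, not the real sampleworks implementations, and the helper name is illustrative.

```python
# Stand-in stubs for the real scalers imported in guidance_script_utils.py.
class DataSpaceDPSScaler: ...
class NoiseSpaceDPSScaler: ...
class NoScalingScaler: ...

_STEP_SCALERS = {
    "dataspace": DataSpaceDPSScaler,
    "noisespace": NoiseSpaceDPSScaler,
    "none": NoScalingScaler,
}

def make_step_scaler(step_scaler_type: str):
    """Instantiate the scaler selected by --step-scaler-type."""
    try:
        scaler_cls = _STEP_SCALERS[step_scaler_type]
    except KeyError:
        # Mirror the PR's behavior: fail loudly on unknown values
        # instead of silently falling back to a default scaler.
        raise ValueError(f"invalid step_scaler_type: {step_scaler_type!r}") from None
    return scaler_cls()

print(type(make_step_scaler("none")).__name__)  # NoScalingScaler
```

A dict lookup keeps the valid vocabulary in one place, so the argparse `choices` list and the dispatch can't silently drift apart.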

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 From models many, we picked just one,
And methods singular, now we're done,
No tweedie twists—just scalers to choose,
With dataspace, noisespace, or none to use!
The grid runs cleaner, the flags now sing. 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

• Docstring Coverage (⚠️ Warning): docstring coverage is 78.33%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)

• Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
• Title Check (✅ Passed): the title accurately summarizes the main changes: renaming three CLI arguments (--use-tweedie, --models, --methods) and their propagation across the codebase.
• Linked Issues Check (✅ Passed): all linked issue requirements are met: --methods renamed to --method with updated references, --models renamed to --model with updated references, and --use-tweedie replaced with --step-scaler-type defaulting to 'noisespace'.
• Out of Scope Changes Check (✅ Passed): changes are directly scoped to the three argument renamings and their references; no unrelated modifications detected beyond the explicitly required updates.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@marcuscollins marcuscollins requested review from DorisMai and k-chrispens and removed request for k-chrispens March 13, 2026 23:29
Contributor

coderabbitai (bot) left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
run_grid_search.py (1)

292-345: Scope method to Boltz-2 jobs to avoid irrelevant run permutations.

method_suffix and JobConfig.method are currently applied to every model. For non-Boltz2 models, this creates extra path/key variance without changing behavior.

Refactor sketch
-        method_suffix = f"_{args.method.replace(' ', '_')}" if args.method else ""
+        effective_method = args.method if model == StructurePredictor.BOLTZ_2 else None
+        method_suffix = f"_{effective_method.replace(' ', '_')}" if effective_method else ""
...
-                                    method=args.method,
+                                    method=effective_method,
...
-                                method=args.method,
+                                method=effective_method,
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@run_grid_search.py`:
- Around line 433-436: The CLI argument for "--model" currently omits "boltz1"
from the choices, causing valid workflows to be rejected; update the parser
definition in run_grid_search.py (the "--model" argument choices list) to
re-include "boltz1" alongside "boltz2", "protenix", and "rf3" so the flag
accepts the legacy Boltz-1 option that get_pixi_env() and related code paths
still support; ensure any downstream validation that checks model names (e.g.,
any references to get_pixi_env or model-specific branching) will handle "boltz1"
consistently.

In `@src/sampleworks/utils/guidance_script_arguments.py`:
- Line 127: The assignment self.step_scaler_type = args.step_scaler_type can
raise AttributeError if args lacks that attribute; change it to use
getattr(args, "step_scaler_type", "noisespace") so self.step_scaler_type always
has a safe default consistent with runtime selection; update the
constructor/initializer where self.step_scaler_type is set (search for the
symbol self.step_scaler_type and the assignment from args) to use getattr with
the "noisespace" fallback.

---

Nitpick comments:
In `@run_grid_search.py`:
- Around line 292-345: The code applies method_suffix and sets JobConfig.method
for every model, causing irrelevant permutations; change the logic so
method_suffix and the JobConfig.method field are only set when the model is the
Boltz-2 model (e.g., check if model == "Boltz-2" or whatever identifier you use
for Boltz-2), otherwise use an empty method_suffix and pass None (or an empty
string) for the JobConfig.method; update the loops that construct output_dir
(where method_suffix is used) and the JobConfig instantiation (where
method=args.method is passed) to conditionally include the method only for
Boltz-2 jobs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4f9da7a3-0033-4f05-bef5-b65da64a6122

📥 Commits

Reviewing files that changed from the base of the PR and between 504082c and 1ba1fee.

📒 Files selected for processing (3)
  • run_grid_search.py
  • src/sampleworks/utils/guidance_script_arguments.py
  • src/sampleworks/utils/guidance_script_utils.py

Collaborator

k-chrispens left a comment


We should add the boltz1 option back, but other than that it looks good! Will leave it to you to merge.

Contributor

coderabbitai (bot) left a comment


Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/sampleworks/core/rewards/real_space_density.py (1)

258-289: ⚠️ Potential issue | 🟠 Major

Keep structure_to_reward_input() on the shared extraction path.

This now diverges from extract_density_inputs_from_atomarray(): it skips invalid-coordinate filtering, NaN B-factor repair, and the float32 coercion used everywhere else in the reward path. It also still advertises dict[str, Float[...]] even though "elements" is now torch.long, so callers using this helper can see different runtime and type-checking behavior than the main reward pipeline.

🩹 Suggested simplification
-    def structure_to_reward_input(self, structure: dict) -> dict[str, Float[torch.Tensor, "..."]]:
-        atom_array = structure["asym_unit"]
-        mask = atom_array.occupancy > 0
-        if isinstance(atom_array, AtomArrayStack):
-            atom_array = atom_array[:, mask]
-        else:
-            atom_array = atom_array[mask]
-
-        element_indices = elements_to_scattering_indices(atom_array.element)
-
-        elements = torch.tensor(element_indices, device=self.device, dtype=torch.long).unsqueeze(0)
-        b_factors = torch.from_numpy(atom_array.b_factor).to(self.device).unsqueeze(0)
-        occupancies = torch.from_numpy(atom_array.occupancy).to(self.device).unsqueeze(0)
-
-        coordinates = torch.from_numpy(atom_array.coord).to(self.device)
-        if coordinates.ndim == 2:
-            coordinates = coordinates.unsqueeze(0)
-
-        n_models = coordinates.shape[0]
-        elements = match_batch(elements, target_batch_size=n_models)
-        b_factors = match_batch(b_factors, target_batch_size=n_models)
-        occupancies = match_batch(occupancies, target_batch_size=n_models)
+    def structure_to_reward_input(self, structure: dict) -> dict[str, torch.Tensor]:
+        coordinates, elements, b_factors, occupancies = extract_density_inputs_from_atomarray(
+            structure["asym_unit"], self.device
+        )
 
         return {
             "coordinates": coordinates,
             "elements": elements,
             "b_factors": b_factors,
             "occupancies": occupancies,
         }
As per coding guidelines, "Use cast() or `# ty:ignore[...]` with explanatory comments when type checking cannot be satisfied; treat type errors as real issues."
♻️ Duplicate comments (1)
src/sampleworks/utils/guidance_script_arguments.py (1)

188-188: ⚠️ Potential issue | 🟡 Minor

Use a defensive fallback for step_scaler_type.

Direct access to args.step_scaler_type can fail with AttributeError if callers construct argparse.Namespace without that field. Using getattr with a default matches the fallback pattern.

Proposed fix
-            self.step_scaler_type = args.step_scaler_type
+            self.step_scaler_type = getattr(args, "step_scaler_type", "noisespace")
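The defensive pattern suggested above can be demonstrated in isolation; the Namespace fields here are illustrative, not taken from the real argument classes.

```python
from argparse import Namespace

# A caller-constructed namespace that never set step_scaler_type.
args = Namespace(model="boltz2")

# Direct attribute access (args.step_scaler_type) would raise AttributeError;
# getattr with a default degrades gracefully to the documented default.
step_scaler_type = getattr(args, "step_scaler_type", "noisespace")
print(step_scaler_type)  # noisespace
```

When the field is present, getattr returns it unchanged, so callers that do go through the full parser see no behavior change.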
🧹 Nitpick comments (1)
src/sampleworks/utils/guidance_script_utils.py (1)

439-453: Normalize step_scaler_type through one canonical vocabulary.

_run_guidance() now expects dataspace / noisespace / none, while the rest of the codebase already exposes data_space_dps / noise_space_dps / no_scaling via StepScalers. Keeping both spellings makes configs and job queues brittle.

♻️ Example alias-based normalization
-from sampleworks.utils.guidance_constants import (
-    GuidanceType,
-    StructurePredictor,
-)
+from sampleworks.utils.guidance_constants import (
+    GuidanceType,
+    StepScalers,
+    StructurePredictor,
+)
...
-    step_scaler_type = getattr(args, "step_scaler_type", "noisespace")
-    if step_scaler_type == "dataspace":
+    step_scaler_aliases = {
+        "dataspace": StepScalers.DATA_SPACE_DPS,
+        "data_space_dps": StepScalers.DATA_SPACE_DPS,
+        "noisespace": StepScalers.NOISE_SPACE_DPS,
+        "noise_space_dps": StepScalers.NOISE_SPACE_DPS,
+        "none": StepScalers.NO_SCALING,
+        "no_scaling": StepScalers.NO_SCALING,
+    }
+    step_scaler_type = step_scaler_aliases[getattr(args, "step_scaler_type", "noisespace")]
+    if step_scaler_type is StepScalers.DATA_SPACE_DPS:
         step_scaler = DataSpaceDPSScaler(
             step_size=args.step_size,
             gradient_normalization=args.gradient_normalization,
         )
-    elif step_scaler_type == "noisespace":
+    elif step_scaler_type is StepScalers.NOISE_SPACE_DPS:
         step_scaler = NoiseSpaceDPSScaler(
             step_size=args.step_size,
             gradient_normalization=args.gradient_normalization,
         )
-    elif step_scaler_type == "none":
+    elif step_scaler_type is StepScalers.NO_SCALING:
         step_scaler = NoScalingScaler()
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@GRID_SEARCH.md`:
- Around line 29-30: Update GRID_SEARCH.md to use the singular CLI flags that
the parser expects: replace occurrences of "--models" with "--model" and
"--methods" with "--method" in the documented example commands (these correspond
to the parser changes in run_grid_search.py around the argument handling for
model/method). Ensure all examples (including the blocks around the previous
lines 29-30 and 54-58) consistently use "--model" and "--method" so the docs
match the code.
- Line 85: Replace the typo "imput" with "input" in the user-facing text that
currently reads "the Boltz imput YAML file" (search for the exact phrase "Boltz
imput YAML file" in GRID_SEARCH.md) so the line reads "the Boltz input YAML
file".
- Line 29: Update the documented model options in GRID_SEARCH.md to match the
argparse choices enforced in run_grid_search.py (the argument parser around
lines where model choices are defined, e.g., the parser.add_argument call for
the --models flag in run_grid_search.py:402-425) by removing "boltz1" and
listing only "boltz2, protenix, rf3" (ensure both occurrences at the two lines
noted are changed).
- Around line 66-68: Remove the obsolete reference to --use-tweedie in
GRID_SEARCH.md and replace it with documentation for the new CLI flag
--step-scaler-type (default "noisespace"); note that this flag replaces
--use-tweedie and can affect whether changing the flag triggers a re-run
(similar to other flags that are/aren't reflected in the directory structure),
and mention the default value and where the option is defined (the parser
handling --step-scaler-type in run_grid_search.py around the previous
--use-tweedie logic). Ensure the sentence about using --force-all to re-run
remains and update any examples or flag lists to show --step-scaler-type instead
of --use-tweedie.

In `@run_all_models.sh`:
- Line 17: The script currently uses "set -e" which doesn't propagate failures
through pipelines so a failing "docker run ... | tee" can be masked; update the
top-level shell options to enable pipefail (e.g., replace the existing "set -e"
with "set -e -o pipefail" or "set -eo pipefail") and/or explicitly check
PIPESTATUS after each pipeline that runs "docker run" piped to "tee" (capture
PIPESTATUS[0] into a variable and exit non‑zero if it failed); search for
occurrences of "docker run", "tee" and the existing "set -e" in
run_all_models.sh and apply one of these fixes consistently (modifying the
global set flags is preferred).
- Around line 124-133: In the protenix invocation in run_all_models.sh replace
the incorrect flag --models with --model and add the missing --method flag
(matching how other blocks call run_grid_search.py), e.g., update the protenix
run_grid_search.py CLI line that currently contains --models protenix to use
--model protenix and include the appropriate --method <method_name> token used
elsewhere in the script so the command mirrors the fixed RosettaFold3 block.
- Around line 59-69: The CLI flags in the boltz invocation still use the old
plural names (--models and --methods) which were renamed to --model and --method
in run_grid_search.py; update the boltz command invocation in run_all_models.sh
(the line invoking run_grid_search.py) to use --model boltz2 and --method "X-RAY
DIFFRACTION" instead of --models and --methods so the script matches the new CLI
signature used by the run_grid_search.py entrypoint.
- Around line 81-91: The command in run_all_models.sh uses plural flag names but
the CLI expects singular flags; update the boltz run_grid_search.py invocation
to use --model instead of --models, --method instead of --methods, --scaler
instead of --scalers, --ensemble-size instead of --ensemble-sizes and
--gradient-weight instead of --gradient-weights while keeping the same values
(e.g., replace --models boltz2, --methods "MD", --scalers pure_guidance,
--ensemble-sizes "8", --gradient-weights "0.1 0.2 0.5" with their singular
equivalents) so the flags passed to the script match the expected parameter
names.
- Around line 103-112: Change the rf3 command invocation to use the correct
argument name --model (singular) instead of --models when calling
run_grid_search.py; update the command that launches rf3 (the block invoking
"rf3 run_grid_search.py") to pass --model rf3. Also review the --method flag in
that same invocation: either explicitly set a rf3-appropriate value instead of
relying on the Boltz2 default ("X-RAY DIFFRACTION") or remove the --method flag
if RosettaFold3 (rf3) does not use that concept, so the invocation’s flags match
the run_grid_search.py parser and rf3 semantics.

In `@src/sampleworks/models/rf3/wrapper.py`:
- Around line 357-371: The x_init tensor returned by match_batch can be a
broadcast view and should be cloned to avoid shared-memory in-place mutations;
update the branch that sets x_init = match_batch(x_init.unsqueeze(0),
target_batch_size=ensemble_size) to call .clone() on the result (ensuring
device/dtype are preserved) so it matches the pattern used by other wrappers
(see Boltz/Protenix) and prevent corruption during later in-place diffusion
updates; leave the initialize_from_prior path unchanged.

In `@src/sampleworks/utils/guidance_script_arguments.py`:
- Around line 38-52: The current logic raises ValueError when resolved is empty
and then attempts to check Path(resolved).exists(), which is unreachable; update
the control flow in the function handling candidates/resolved so you first set
resolved = candidates[0] if candidates else "" then if not resolved raise the
ValueError, otherwise perform the filesystem check (if not
Path(resolved).exists() raise ValueError about missing checkpoint) and return
resolved; also replace the EN DASH (–) in the comment with a regular
hyphen-minus (-) to remove the static-analysis warning. Ensure you adjust the
branches around the resolved variable and the Path(...) existence check
accordingly.

---

Outside diff comments:
In `@src/sampleworks/core/rewards/real_space_density.py`:
- Around line 258-289: structure_to_reward_input diverges from
extract_density_inputs_from_atomarray by skipping invalid-coordinate filtering,
NaN b-factor repair, and float32 coercion—and its return type annotation falsely
claims Float tensors while "elements" is torch.long. Fix by applying the same
preprocessing as extract_density_inputs_from_atomarray (filter out invalid
coords, repair NaN b_factors, coerce numeric tensors to torch.float32), or
simply call/reuse extract_density_inputs_from_atomarray inside
structure_to_reward_input to obtain coordinates/elements/b_factors/occupancies;
then ensure "elements" remains torch.long but update the function's declared
return types (or add a cast/# type: ignore with comment) so type-checking
matches runtime.

---

Duplicate comments:
In `@src/sampleworks/utils/guidance_script_arguments.py`:
- Line 188: Replace the direct attribute access on args with a defensive getattr
to avoid AttributeError when callers pass an argparse.Namespace missing
step_scaler_type: change the assignment that sets self.step_scaler_type (in
guidance_script_arguments.py) to use getattr(args, "step_scaler_type",
<fallback>) so it defaults to a safe value (e.g., None or the module's existing
default) instead of accessing args.step_scaler_type directly.

---

Nitpick comments:
In `@src/sampleworks/utils/guidance_script_utils.py`:
- Around line 439-453: The code uses ad-hoc values for step_scaler_type in
_run_guidance() (e.g., "dataspace"/"noisespace"/"none") which conflicts with the
canonical StepScalers vocabulary used elsewhere
(data_space_dps/noise_space_dps/no_scaling); add a small alias normalization
before the existing if/elif block: create an alias map that maps
"dataspace"->"data_space_dps", "noisespace"->"noise_space_dps",
"none"->"no_scaling" (and pass-through unknowns), then assign step_scaler_type =
alias_map.get(step_scaler_type, step_scaler_type) and update the conditional to
check the canonical names (or switch to a mapping from canonical name to class:
data_space_dps->DataSpaceDPSScaler, noise_space_dps->NoiseSpaceDPSScaler,
no_scaling->NoScalingScaler) so _run_guidance(), step_scaler_type,
DataSpaceDPSScaler, NoiseSpaceDPSScaler, and NoScalingScaler all use the single
canonical vocabulary.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a1d7f20e-cd12-42c3-9d2e-3c60fd3b073e

📥 Commits

Reviewing files that changed from the base of the PR and between 1ba1fee and e95365b.

📒 Files selected for processing (19)
  • .github/workflows/docker.yml
  • Dockerfile
  • GRID_SEARCH.md
  • run_all_models.sh
  • scripts/eval/rscc_grid_search_script.py
  • src/sampleworks/core/forward_models/xray/real_space_density_deps/qfit/sf.py
  • src/sampleworks/core/rewards/protocol.py
  • src/sampleworks/core/rewards/real_space_density.py
  • src/sampleworks/core/samplers/edm.py
  • src/sampleworks/models/rf3/wrapper.py
  • src/sampleworks/utils/density_utils.py
  • src/sampleworks/utils/elements.py
  • src/sampleworks/utils/guidance_script_arguments.py
  • src/sampleworks/utils/guidance_script_utils.py
  • tests/conftest.py
  • tests/eval/test_eval_dataclasses.py
  • tests/integration/test_mismatch_integration.py
  • tests/integration/test_pipeline_integration.py
  • tests/rewards/test_real_space_density_reward.py
✅ Files skipped from review due to trivial changes (2)
  • .github/workflows/docker.yml
  • Dockerfile

```bash
pixi run -e boltz python run_grid_search.py \
--proteins proteins.csv \
--models boltz2 \ # options: boltz1, boltz2, protenix, rf3 (make sure env aligns!)
```
Contributor


⚠️ Potential issue | 🟠 Major

Fix documented model choices to match parser constraints.

Line 29 and Line 54 include boltz1, but run_grid_search.py:402-425 only allows boltz2, protenix, rf3. Please align the option list with actual argparse choices.

Also applies to: 54-54

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@GRID_SEARCH.md` at line 29, Update the documented model options in
GRID_SEARCH.md to match the argparse choices enforced in run_grid_search.py (the
argument parser around lines where model choices are defined, e.g., the
parser.add_argument call for the --models flag in run_grid_search.py:402-425) by
removing "boltz1" and listing only "boltz2, protenix, rf3" (ensure both
occurrences at the two lines noted are changed).

| `W` | the gradient weight. |

In that directory, you will find the following files generated by Sampleworks
(in addition to model-specific files, e.g. the Boltz imput YAML file):
Contributor


⚠️ Potential issue | 🟡 Minor

Typo in user-facing docs.

Line 85: “imput” → “input”.

🧰 Tools
🪛 LanguageTool

[grammar] ~85-~85: Ensure spelling is correct
Context: ...to model-specific files, e.g. the Boltz imput YAML file): - refined.cif: the refined...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@GRID_SEARCH.md` at line 85, Replace the typo "imput" with "input" in the
user-facing text that currently reads "the Boltz imput YAML file" (search for
the exact phrase "Boltz imput YAML file" in GRID_SEARCH.md) so the line reads
"the Boltz input YAML file".

# Usage:
# ./run_all_models.sh

set -e
Contributor


⚠️ Potential issue | 🟡 Minor

Background job failures may go undetected.

With set -e but without set -o pipefail, the exit code of docker run is masked by tee. If a Docker job fails, the script will still report "All jobs completed!" and exit 0.

Proposed fix to detect job failures
 set -e
+set -o pipefail

Alternatively, capture and check exit codes explicitly:

-# Wait for all background jobs
-wait
+# Wait for all background jobs and track failures
+FAILED=0
+for pid in "${PIDS[@]}"; do
+    if ! wait "$pid"; then
+        echo "Job PID $pid failed"
+        FAILED=1
+    fi
+done

 echo ""
 echo "=========================================="
-echo "[$(date)] All jobs completed!"
+if [ "$FAILED" -eq 0 ]; then
+    echo "[$(date)] All jobs completed successfully!"
+else
+    echo "[$(date)] Some jobs failed. Check logs for details."
+    exit 1
+fi
 echo "=========================================="

Also applies to: 150-151

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@run_all_models.sh` at line 17, The script currently uses "set -e" which
doesn't propagate failures through pipelines so a failing "docker run ... | tee"
can be masked; update the top-level shell options to enable pipefail (e.g.,
replace the existing "set -e" with "set -e -o pipefail" or "set -eo pipefail")
and/or explicitly check PIPESTATUS after each pipeline that runs "docker run"
piped to "tee" (capture PIPESTATUS[0] into a variable and exit non‑zero if it
failed); search for occurrences of "docker run", "tee" and the existing "set -e"
in run_all_models.sh and apply one of these fixes consistently (modifying the
global set flags is preferred).

Comment on lines +357 to 371
 # x_init here is a shape-compatible reference carried with the featurized
 # model input. During guided sampling, alignment/reference coordinates are
 # built later via process_structure_to_trajectory_input() and AtomReconciler.
 # TODO: figure out if this is necessary or if we should just remove x_init completely.
 if len(true_atom_array) == num_atoms:
     x_init = torch.tensor(true_atom_array.coord, device=self.device, dtype=torch.float32)
-    x_init = match_batch(x_init.unsqueeze(0), target_batch_size=ensemble_size).clone()
+    x_init = match_batch(x_init.unsqueeze(0), target_batch_size=ensemble_size)
 else:
-    logger.warning(
-        "True structure not available or atom count mismatch; initializing "
-        "x_init from prior. This means align_to_input will not work properly,"
-        " and reward functions dependent on this won't be accurate."
+    logger.info(
+        f"Input structure has {len(true_atom_array)} atoms, but RF3 operates on "
+        f"{num_atoms} atoms after model-specific preprocessing. Initializing "
+        "x_init from prior to match model shape; guided sampling will "
+        "reconcile alignment and reward inputs later on the common atom subset."
     )
     x_init = self.initialize_from_prior(batch_size=ensemble_size, shape=(num_atoms, 3))

⚠️ Potential issue | 🟠 Major

Missing .clone() after match_batch may cause shared memory issues.

The match_batch function returns a broadcast view (no copy) when the input batch size is 1. If x_init is later mutated in-place during diffusion steps, this could corrupt the original tensor or cause unexpected behavior.

Other wrappers (Boltz at Line 720, Protenix at Line 512) retain .clone() after match_batch for this reason.

Proposed fix
         if len(true_atom_array) == num_atoms:
             x_init = torch.tensor(true_atom_array.coord, device=self.device, dtype=torch.float32)
-            x_init = match_batch(x_init.unsqueeze(0), target_batch_size=ensemble_size)
+            x_init = match_batch(x_init.unsqueeze(0), target_batch_size=ensemble_size).clone()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sampleworks/models/rf3/wrapper.py` around lines 357 - 371, The x_init
tensor returned by match_batch can be a broadcast view and should be cloned to
avoid shared-memory in-place mutations; update the branch that sets x_init =
match_batch(x_init.unsqueeze(0), target_batch_size=ensemble_size) to call
.clone() on the result (ensuring device/dtype are preserved) so it matches the
pattern used by other wrappers (see Boltz/Protenix) and prevent corruption
during later in-place diffusion updates; leave the initialize_from_prior path
unchanged.
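
The aliasing hazard is easy to reproduce outside the repository. Below is a minimal NumPy analogue, assuming (as the review comment implies) that `match_batch` broadcasts a batch of 1 to the target batch size without copying; `np.broadcast_to` plays the role of torch's `expand`, and `.copy()` the role of `.clone()`. The `match_batch` here is a hypothetical stand-in, not the repository's implementation:

```python
import numpy as np

# Hypothetical stand-in for match_batch: broadcast a batch of 1 to the
# target batch size. The result is a VIEW sharing the input's memory.
def match_batch(x: np.ndarray, target_batch_size: int) -> np.ndarray:
    return np.broadcast_to(x, (target_batch_size, *x.shape[1:]))

ref = np.zeros((1, 4, 3))              # stand-in for reference coordinates
view = match_batch(ref, 8)             # aliases ref's storage, no copy made
safe = match_batch(ref, 8).copy()      # .copy() ~ torch .clone(): own storage

safe += 1.0                            # mutating the copy is harmless
print(ref.sum())                       # 0.0 -- original untouched
print(np.shares_memory(view, ref))     # True  -- the view still aliases ref
print(np.shares_memory(safe, ref))     # False -- the copy does not
```

In torch, in-place writes through such an expanded view can either raise or write back into the original tensor, which is why the sibling wrappers keep `.clone()` after `match_batch`.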

Comment thread src/sampleworks/utils/guidance_script_arguments.py

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@run_grid_search.py`:
- Line 543: Update the misleading log and help text that refer specifically to
"Boltz2": change the runtime log call log.info(f"Boltz2 method: {args.method}")
to a neutral message such as log.info(f"Method: {args.method}") and update the
argparse help string where --method is defined (the parser.add_argument call
that sets the help for --method) to a generic description like "Method parameter
(applies to the selected model; e.g., boltz2, boltz1, protenix, rf3)" so the
help and logging correctly reflect that --method is not exclusive to Boltz2.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f3cb6635-5101-48d3-8001-b537a7175a42

📥 Commits

Reviewing files that changed from the base of the PR and between e95365b and ea968cf.

📒 Files selected for processing (1)
  • run_grid_search.py

Comment thread run_grid_search.py Outdated
     log.info(f"Gradient weights: {args.gradient_weights}")
     log.info(f"GD steps: {args.num_gd_steps}")
-    log.info(f"Boltz2 methods: {args.methods}")
+    log.info(f"Boltz2 method: {args.method}")

⚠️ Potential issue | 🟡 Minor

Log message is misleading for non-Boltz2 models.

The log says "Boltz2 method" but --method applies regardless of which model is selected. Similarly, the help text at line 447 says "Method for Boltz2" which is confusing when using boltz1, protenix, or rf3.

Suggested fix
-    log.info(f"Boltz2 method: {args.method}")
+    log.info(f"Method: {args.method}")

And at line 447:

-        help="Method for Boltz2 ('X-RAY DIFFRACTION', 'MD')",
+        help="Experimental method ('X-RAY DIFFRACTION', 'MD')",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@run_grid_search.py` at line 543, Update the misleading log and help text that
refer specifically to "Boltz2": change the runtime log call log.info(f"Boltz2
method: {args.method}") to a neutral message such as log.info(f"Method:
{args.method}") and update the argparse help string where --method is defined
(the parser.add_argument call that sets the help for --method) to a generic
description like "Experimental method ('X-RAY DIFFRACTION', 'MD'); applies to the
selected model" so the help and logging correctly reflect that --method
is not exclusive to Boltz2.
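
For context, the argument changes in this PR can be sketched with argparse. The default and choices for `--step-scaler-type` follow the PR description; the help strings use the suggested generic wording, and the exact flags in run_grid_search.py may differ:

```python
import argparse

# Illustrative sketch of the renamed grid-search arguments, not the
# repository's actual parser definition.
parser = argparse.ArgumentParser(description="Grid search (sketch)")
parser.add_argument("--model", default="boltz2",
                    help="Model to run ('boltz1', 'boltz2', 'protenix', 'rf3')")
parser.add_argument("--method", default="X-RAY DIFFRACTION",
                    help="Experimental method ('X-RAY DIFFRACTION', 'MD')")
parser.add_argument("--step-scaler-type", default="noisespace",
                    choices=("dataspace", "noisespace", "none"),
                    help="Step scaler type; replaces the old --use-tweedie flag")

args = parser.parse_args(["--model", "rf3", "--step-scaler-type", "none"])
print(args.model, args.method, args.step_scaler_type)
# rf3 X-RAY DIFFRACTION none
```

Note that argparse maps `--step-scaler-type` to the attribute `args.step_scaler_type` automatically.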


@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
run_grid_search.py (1)

276-277: Consider adding a NumPy-style docstring.

The generate_jobs function lacks documentation. Per coding guidelines, functions should include NumPy-style docstrings describing parameters and return values.

📝 Suggested docstring
 def generate_jobs(args: argparse.Namespace) -> list[JobConfig]:
+    """
+    Generate job configurations for grid search.
+
+    Parameters
+    ----------
+    args : argparse.Namespace
+        Parsed command-line arguments including model, method, scalers,
+        ensemble sizes, gradient weights, and GD steps.
+
+    Returns
+    -------
+    list[JobConfig]
+        List of job configurations covering all parameter combinations.
+    """
     jobs = []
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@run_grid_search.py` around lines 276 - 277, Add a NumPy-style docstring to
the generate_jobs function: document the args parameter (type
argparse.Namespace) and describe what fields of args are used, and document the
return value (list[JobConfig]) including what each JobConfig represents; place
the docstring directly under the def generate_jobs(...): signature using the
NumPy format with Parameters and Returns sections so the function is properly
documented for readers and automated tools.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@run_grid_search.py`:
- Around line 276-277: Add a NumPy-style docstring to the generate_jobs
function: document the args parameter (type argparse.Namespace) and describe
what fields of args are used, and document the return value (list[JobConfig])
including what each JobConfig represents; place the docstring directly under the
def generate_jobs(...): signature using the NumPy format with Parameters and
Returns sections so the function is properly documented for readers and
automated tools.
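
The docstring suggestion pairs naturally with the grid-expansion pattern such a function typically implements. Everything below, including the JobConfig fields and the swept argument names, is hypothetical and only illustrates the shape of the code the nitpick refers to:

```python
import argparse
import itertools
from dataclasses import dataclass

@dataclass
class JobConfig:
    """One grid-search job; field names are illustrative assumptions."""
    model: str
    method: str
    scaler: str
    ensemble_size: int
    gradient_weight: float
    num_gd_steps: int

def generate_jobs(args: argparse.Namespace) -> list[JobConfig]:
    """
    Generate job configurations for grid search.

    Parameters
    ----------
    args : argparse.Namespace
        Parsed arguments with single-valued ``model`` and ``method`` and
        list-valued ``scalers``, ``ensemble_sizes``, ``gradient_weights``,
        and ``num_gd_steps``.

    Returns
    -------
    list[JobConfig]
        One configuration per combination of the swept parameters.
    """
    return [
        JobConfig(args.model, args.method, s, e, w, g)
        for s, e, w, g in itertools.product(
            args.scalers, args.ensemble_sizes,
            args.gradient_weights, args.num_gd_steps,
        )
    ]

args = argparse.Namespace(
    model="boltz2", method="X-RAY DIFFRACTION",
    scalers=["noisespace", "dataspace"], ensemble_sizes=[1, 8],
    gradient_weights=[0.1], num_gd_steps=[20],
)
print(len(generate_jobs(args)))  # 4 -- 2 scalers x 2 ensemble sizes
```

The single-valued `model`/`method` (after this PR's rename from `--models`/`--methods`) sit outside the `itertools.product` sweep.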

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b20cd6e7-cb44-4c1f-ab48-4de73e48b87d

📥 Commits

Reviewing files that changed from the base of the PR and between ea968cf and 5c35e21.

📒 Files selected for processing (3)
  • run_grid_search.py
  • src/sampleworks/utils/guidance_script_arguments.py
  • src/sampleworks/utils/guidance_script_utils.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/sampleworks/utils/guidance_script_arguments.py

@marcuscollins marcuscollins merged commit fbbd918 into main Mar 17, 2026
4 checks passed
@marcuscollins marcuscollins deleted the feat-rename-tweedie-arg branch March 17, 2026 19:45