
code(eval): refactoring eval scripts, primarily unifying arguments #157

Merged: marcuscollins merged 4 commits into main from feat-refactor-eval-scripts on Mar 17, 2026

Conversation

marcuscollins (Collaborator) commented Mar 12, 2026

Refactoring and unifying arguments to evaluation scripts.

#110

Summary by CodeRabbit

  • Refactor

    • Consolidated evaluation setup and standardized argument parsing across evaluation scripts; switched to a centralized parameter-driven workflow and configurable results path.
    • Simplified trial/config loading and removed legacy on-disk scanning logic; added clearer runtime logging.
  • Bug Fixes

    • Improved validation and error handling for bond geometry calculations to better detect and report invalid structures.

coderabbitai Bot (Contributor) commented Mar 12, 2026

Warning

Rate limit exceeded

@marcuscollins has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 17 minutes and 6 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e578087f-0a75-47b9-b04e-47b0869ed50c

📥 Commits

Reviewing files that changed from the base of the PR and between 9c05966 and 9894f70.

📒 Files selected for processing (5)
  • scripts/eval/bond_geometry_eval.py
  • scripts/eval/lddt_evaluation_script.py
  • scripts/eval/rscc_grid_search_script.py
  • scripts/eval/run_and_process_phenix_clashscore.py
  • src/sampleworks/eval/grid_search_eval_utils.py
📝 Walkthrough

Centralizes evaluation parameter setup via the new setup_evaluation_parameters() and parse_eval_args(), updates the evaluation scripts to use them, and adds an input-validation helper, check_pose_and_get_bounds(), to bond_geometry_eval.py for bond/angle computations.
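
As a rough illustration of the new flow, a consuming script might look like the sketch below. The helper names come from this walkthrough; the exact signatures and the evaluate_trial placeholder are assumptions, not the repository's actual code.

from sampleworks.eval.grid_search_eval_utils import (
    parse_eval_args,
    setup_evaluation_parameters,
)


def evaluate_trial(trial, protein_configs):
    # Hypothetical placeholder: each eval script (LDDT, RSCC, clashscore)
    # supplies its own per-trial logic here.
    return {"trial": str(trial)}


def main() -> None:
    args = parse_eval_args()  # unified CLI shared by all eval scripts
    all_trials, protein_configs = setup_evaluation_parameters(args)
    rows = [evaluate_trial(t, protein_configs) for t in all_trials]
    # Results land under the configurable results path rather than a
    # hard-coded workspace_root as before.
    print(f"writing {len(rows)} result rows to {args.grid_search_results_path}")


if __name__ == "__main__":
    main()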

Changes

Cohort / File(s) and Summary:

  • Core utility: src/sampleworks/eval/grid_search_eval_utils.py
    Adds setup_evaluation_parameters(args), which returns (TrialList, dict[str, ProteinConfig]); renames parse_args() to parse_eval_args(); replaces workspace_root with grid_search_results_path; exposes ProteinConfig in imports/exports.
  • Bond geometry evaluation: scripts/eval/bond_geometry_eval.py
    Adds check_pose_and_get_bounds(pose: AtomArray) to validate that structures are non-empty and have bonds; refactors bond_length_violations and bond_angle_violations to use the helper and unified exception handling (a hedged sketch follows this list).
  • LDDT evaluation: scripts/eval/lddt_evaluation_script.py
    Replaces the parse_args/scan-based setup with parse_eval_args and setup_evaluation_parameters, obtains trials via the setup function, and writes results to args.grid_search_results_path.
  • RSCC evaluation: scripts/eval/rscc_grid_search_script.py
    Switches to parse_eval_args and setup_evaluation_parameters for trial and config loading; removes the manual ProteinConfig scanning and grid_search_dir-derived paths.
  • Clashscore processing: scripts/eval/run_and_process_phenix_clashscore.py
    Replaces local CLI parsing with parse_eval_args, fetches trials via setup_evaluation_parameters, and writes output under args.grid_search_results_path.
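
For the bond-geometry helper above, here is a minimal validation sketch, assuming the biotite AtomArray API; the get_distance_bounds import path is an assumption based on the review comments further down, not confirmed by the PR.

import numpy as np
from biotite.structure import AtomArray


def check_pose_and_get_bounds(pose: AtomArray) -> np.ndarray:
    # Reject structures that cannot support bond/angle computations.
    if pose.array_length() == 0:
        raise ValueError("structure contains no atoms")
    if pose.bonds is None or pose.bonds.get_bond_count() == 0:
        raise ValueError("structure contains no bonds")
    # Per the review comments, the bounds matrix comes from a
    # get_distance_bounds helper (RDKit via peppr); this import path is an
    # assumption.
    from peppr import get_distance_bounds

    return get_distance_bounds(pose)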

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI (script)
    participant Setup as grid_search_eval_utils.setup_evaluation_parameters
    participant FS as Filesystem (grid_search results)
    participant Config as ProteinConfig loader
    participant Eval as Evaluation script logic

    CLI->>Setup: parse_eval_args(args)
    Setup->>FS: scan_grid_search_results(args.grid_search_results_path)
    Setup->>Config: ProteinConfig.from_csv(args.protein_configs_csv)
    Setup-->>CLI: (all_trials, protein_configs)
    CLI->>Eval: run evaluations using all_trials + protein_configs
    Eval->>FS: write results to args.grid_search_results_path/*.csv

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • k-chrispens

Poem

🐰
I hopped through args and trials today,
Gathered configs in a tidy array.
One helper to rule the scattered script sea,
Now evaluations run coordinated and free.
Hops and carrots for CI—hip, hip, hooray! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 35.71%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately describes the primary focus of the changeset: refactoring evaluation scripts with unified argument handling across multiple eval scripts and their shared utilities.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (4)
src/sampleworks/eval/grid_search_eval_utils.py (2)

180-188: required=True with default=None is contradictory.

When required=True, argparse will always require the argument, making the default=None unreachable. Consider removing the default for clarity.

🔧 Proposed fix
     parser.add_argument(
         "--grid-search-inputs-path",
         type=Path,
         required=True,
         help="Path to the directory containing the grid search inputs, in particular "
              "the protein configuration CSV file, maps, and reference structures.",
-        default=None,
     )
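
A quick standalone check of this argparse behavior (standard library only; the path value is a dummy):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--grid-search-inputs-path", required=True, default=None)

try:
    parser.parse_args([])  # required flag missing, so argparse exits
except SystemExit:
    print("exited: the flag is required; the default was never consulted")

ns = parser.parse_args(["--grid-search-inputs-path", "/tmp/inputs"])
print(ns.grid_search_inputs_path)  # the supplied value; default unused again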

217-241: Missing NumPy-style docstring for setup_evaluation_parameters.

As per coding guidelines, all functions require NumPy-style docstrings. This new public function should document its parameters and return value.

📝 Proposed docstring
 def setup_evaluation_parameters(
         args: argparse.Namespace
 ) -> tuple[TrialList, dict[str, ProteinConfig]]:
+    """
+    Set up evaluation parameters by loading protein configs and scanning for trials.
+
+    Parameters
+    ----------
+    args : argparse.Namespace
+        Parsed command-line arguments containing grid_search_results_path,
+        grid_search_inputs_path, protein_configs_csv, and target_filename.
+
+    Returns
+    -------
+    tuple[TrialList, dict[str, ProteinConfig]]
+        A tuple containing the list of discovered trials and a dictionary
+        mapping protein names to their configurations.
+
+    Raises
+    ------
+    SystemExit
+        If no experiments are found in the grid search directory.
+    """
     grid_search_dir = Path(args.grid_search_results_path)
scripts/eval/bond_geometry_eval.py (2)

42-44: Remove unused exception variable e.

The exception is caught but never used. Replace with underscore to indicate intentional discard.

🔧 Proposed fix
     try:
         bounds = check_pose_and_get_bounds(pose)
-    except (ValueError, BadStructureError) as e:
+    except (ValueError, BadStructureError):
         return np.nan, pd.DataFrame()

90-103: Missing NumPy-style docstring for check_pose_and_get_bounds.

As per coding guidelines, all functions require NumPy-style docstrings.

📝 Proposed docstring
 def check_pose_and_get_bounds(pose: AtomArray):
+    """
+    Validate structure and retrieve distance bounds.
+
+    Parameters
+    ----------
+    pose : AtomArray
+        The structure to validate and compute bounds for.
+
+    Returns
+    -------
+    np.ndarray
+        Distance bounds matrix from RDKit via peppr.
+
+    Raises
+    ------
+    ValueError
+        If the structure is empty or has no bonds.
+    BadStructureError
+        If RDKit cannot compute bounds for the structure.
+    """
     if pose.array_length() == 0:
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/eval/lddt_evaluation_script.py`:
- Line 197: The return value of setup_evaluation_parameters(args) is assigned to all_experiments, but a later loop iterates over all_trials, which is undefined and raises a NameError. Use one identifier consistently: either rename the assignment to all_trials or update the loop to iterate over all_experiments, and confirm that setup_evaluation_parameters returns the expected (all_trials, protein_configs) tuple at every call site.

In `@scripts/eval/rscc_grid_search_script.py`:
- Line 272: rscc_results.csv is written to the wrong location. In the df.to_csv(...) call, replace args.grid_search_inputs_path / "rscc_results.csv" with args.grid_search_results_path / "rscc_results.csv" so results are saved alongside the other evaluation outputs.
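
A self-contained sketch of both fixes under stated assumptions: the Namespace, trial list, and DataFrame values below are dummies standing in for the repository's parse_eval_args() output, TrialList, and real results.

import argparse
from pathlib import Path

import pandas as pd

# Hypothetical stand-ins for the objects the real scripts build.
args = argparse.Namespace(grid_search_results_path=Path("/tmp/results"))
all_trials = ["trial_a", "trial_b"]  # dummy trials, not the repo's TrialList

# Fix 1 (lddt_evaluation_script.py, line 197): iterate over the same name
# that received the setup function's return value.
for trial in all_trials:
    print(f"evaluating {trial}")

# Fix 2 (rscc_grid_search_script.py, line 272): write results under the
# results path, not the inputs path. The rscc values are dummies.
df = pd.DataFrame({"trial": all_trials, "rscc": [0.0, 0.0]})
args.grid_search_results_path.mkdir(parents=True, exist_ok=True)
df.to_csv(args.grid_search_results_path / "rscc_results.csv", index=False)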


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b386793d-a15d-4605-a2a4-a9cd18b74e8d

📥 Commits

Reviewing files that changed from the base of the PR and between 7495131 and dc8763a.

📒 Files selected for processing (5)
  • scripts/eval/bond_geometry_eval.py
  • scripts/eval/lddt_evaluation_script.py
  • scripts/eval/rscc_grid_search_script.py
  • scripts/eval/run_and_process_phenix_clashscore.py
  • src/sampleworks/eval/grid_search_eval_utils.py

Comment thread: scripts/eval/lddt_evaluation_script.py (outdated)
Comment thread: scripts/eval/rscc_grid_search_script.py (outdated)
Quoted diff context:

grid_search_dir = workspace_root / "grid_search_results"
all_trials = scan_grid_search_results(grid_search_dir, target_filename=args.target_filename)
logger.info(f"Found {len(all_trials)} trials with {args.target_filename} files")
# The dropped variable is a list of ProteinConfigs, not used yet in this script
A collaborator commented:

should we reference #97 ?

marcuscollins (Collaborator, Author) replied:

No, this is separate. This doesn't use the ProteinConfigs because it analyzes the whole protein, and doesn't break out individual selections. If we think it would be useful to look at clashes for individual selections I believe that's possible, but should be a separate issue.

k-chrispens (Collaborator) left a comment:

Looks good, but coderabbit caught a few things w.r.t. variable naming and output directory naming. Indicates further need for integration tests on this 😅

Comment thread: scripts/eval/rscc_grid_search_script.py (outdated)
Comment thread: scripts/eval/lddt_evaluation_script.py (outdated)
k-chrispens (Collaborator) left a comment:

LGTM

marcuscollins force-pushed the feat-refactor-eval-scripts branch from 9c05966 to 9894f70 on March 17, 2026 at 19:28
marcuscollins merged commit 6004513 into main on Mar 17, 2026; 1 of 4 checks passed.
marcuscollins deleted the feat-refactor-eval-scripts branch on March 17, 2026 at 19:28.
