
eval: script for classifying conformational changes#223

Merged
k-chrispens merged 5 commits into main from kmc/classify-conf-change
Apr 22, 2026

Conversation

@k-chrispens
Collaborator

@k-chrispens k-chrispens commented Apr 17, 2026

Evaluation script for classifying conformational changes, which we use for downstream analysis

Summary by CodeRabbit

  • Chores

    • Broadened repository ignore patterns to cover additional archive/data file types and expanded dataset directory matching; added logic to better resolve structure file paths when locating CIF files.
  • New Features

    • Added a structural-variant region classification tool that categorizes alternate-location spans, computes backbone similarity scores between variants, and outputs per-span scores and detailed CSV reports.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 17, 2026

Warning

Rate limit exceeded

@k-chrispens has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 49 minutes and 57 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 49 minutes and 57 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cfa4f7f4-d649-43c3-8629-9a60a0db1be8

📥 Commits

Reviewing files that changed from the base of the PR and between f9be48e and da69e98.

📒 Files selected for processing (1)
  • src/sampleworks/eval/grid_search_eval_utils.py
📝 Walkthrough

Walkthrough

Added a new CLI evaluation script to classify contiguous altloc spans in CIF structures by backbone lDDT and span size, a CIF-path resolver helper, and extended .gitignore to ignore archive, CSV, and dataset path patterns.

Changes

Cohort / File(s) Summary
Configuration & Artifacts
.gitignore
Extended ignore patterns: initial_dataset_40* (prefix match), *.tar.gz, *.tgz, *.csv; re‑included src/sampleworks/data/protein_configs.csv; removed trailing newline.
Altloc classification script
scripts/eval/classify_altloc_regions.py
New CLI script that resolves CIF paths, enumerates altloc IDs, builds pairwise AtomArray inputs, computes per-residue backbone lDDT across altloc pairs, classifies spans as side_chain_only, domain_shift, small_loop, or large_loop, and writes CSV output with worst_pair_mean_backbone_lddt, n_backbone_altloc_residues, n_altlocs, and JSON pair_lddts.
CIF path resolution helper
src/sampleworks/eval/grid_search_eval_utils.py
Added `resolve_cif_path(row: pd.Series, cif_root: Path)` helper to resolve structure CIF file paths from selection metadata.
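The four-way span classification the walkthrough describes could be sketched roughly as follows. This is a simplified sketch, not the script's actual code: the threshold names and the lDDT cutoff are assumptions (the review discussion below only confirms a default contiguous-run threshold of 50 residues for `domain_shift`).

```python
import math

# Assumed thresholds -- only the 50-residue domain-shift default is
# confirmed by the review discussion; the others are placeholders.
DOMAIN_SHIFT_MIN_RUN = 50
SMALL_LOOP_MAX_RUN = 10
LDDT_SMALL_CUTOFF = 0.75


def classify_span(
    n_backbone: int,
    max_contiguous_run: int,
    worst_pair_mean_backbone_lddt: float,
) -> str:
    """Bin an altloc span into one of the four conformational-change classes."""
    if n_backbone == 0:
        # No backbone atoms carry altlocs: only side chains differ.
        return "side_chain_only"
    if max_contiguous_run > DOMAIN_SHIFT_MIN_RUN:
        # A long contiguous backbone-altloc run suggests a rigid-body shift.
        return "domain_shift"
    if max_contiguous_run <= SMALL_LOOP_MAX_RUN or (
        not math.isnan(worst_pair_mean_backbone_lddt)
        and worst_pair_mean_backbone_lddt >= LDDT_SMALL_CUTOFF
    ):
        # Short spans (including lone backbone-altloc residues) and spans
        # whose variants stay backbone-similar count as minor perturbations.
        return "small_loop"
    return "large_loop"
```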

Sequence Diagram(s)

sequenceDiagram
  participant CLI as CLI
  participant CSV as Selections CSV
  participant Resolver as resolve_cif_path
  participant Loader as Structure Loader
  participant LDDT as AllAtomLDDT
  participant Output as CSV Writer

  CLI->>CSV: read selection rows
  loop for each row/selection
    CLI->>Resolver: resolve CIF path (row, --cif-root)
    Resolver-->>CLI: cif path
    CLI->>Loader: load CIF, extract altloc atom arrays
    Loader-->>CLI: altloc arrays & metadata
    CLI->>LDDT: build pairwise arrays & compute per-residue backbone lDDT
    LDDT-->>CLI: per-pair per-residue lDDT scores
    CLI->>CLI: classify span (side_chain_only / domain_shift / small_loop / large_loop)
    CLI->>Output: write classification row (incl. pair_lddts JSON)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Poem

🐰 I hop through CIFs where altlocs play,
Counting backbone runs in a curious way,
I pair each pose, score loops with delight,
Tagging shifts by day and by night,
While .gitignore tucks away the stray.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'eval: script for classifying conformational changes' directly and specifically describes the main addition: a new evaluation script for classifying conformational changes.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.



Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

Copilot AI left a comment


Pull request overview

Adds an evaluation-time script to classify contiguous altloc (alternate location) regions into conformational-change bins for downstream analysis, based on backbone altloc presence, span length, and pairwise backbone lDDT between altlocs.

Changes:

  • Introduce scripts/eval/classify_altloc_regions.py to load structures with altlocs, score pairwise altloc comparisons, and emit a per-span classification CSV.
  • Update .gitignore to ignore initial_dataset_40* directories and additional artifact patterns.

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated 3 comments.

File Description
scripts/eval/classify_altloc_regions.py New CLI script to classify altloc spans into side-chain-only / loop-size / domain-shift bins using pairwise backbone lDDT.
.gitignore Expands ignore patterns for large datasets and artifacts (including CSVs).


Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (4)
scripts/eval/classify_altloc_regions.py (4)

356-360: Shadowing the outer row: pd.Series with the per-span row: dict is confusing.

_process_structure takes row: pd.Series and then rebinds row on line 358 inside the loop. It happens to be harmless today (the outer row isn't referenced after the loop), but the name collision makes the function harder to read and one accidental reintroduction of row["protein"] below the loop would silently pick up the last span's dict. Rename one of them, e.g. span_row, covered = out.

♻️ Proposed rename
-        row, covered = out
-        rows.append(row)
+        span_row, covered = out
+        rows.append(span_row)
         classified_res_ids.update(covered)

134-162: Tighten the pair-array typing.

dict[tuple[str, str], tuple[object, object]] loses all information — every caller immediately uses these as AtomArrays. Annotating as tuple[AtomArray, AtomArray] (with from biotite.structure import AtomArray) makes _mean_residue_lddt_for_pair's signature honest too, and lets the type checker catch the None branch on line 286 rather than relying on the runtime guard inside _mean_residue_lddt_for_pair.


62-63: Import-time sys.path.insert is a hidden side effect; prefer a relative/package import.

Mutating sys.path at module import is exactly the kind of side effect the coding guidelines ask to be explicit about, and it also makes lddt_evaluation_script order-sensitive (anything earlier on sys.path with the same name wins). Since both scripts live under scripts/eval/, either:

  • extract translate_selection into src/sampleworks/... and import it normally, or
  • invoke via python -m scripts.eval.classify_altloc_regions with an __init__.py so a proper relative import works.

As per coding guidelines: "Only comment when truly necessary. ALWAYS annotate complex array shapes and note side effects."


339-344: Fragile heuristic for skipping the upstream union selection.

if " or " in selection_str: continue relies on a formatting contract in find_altloc_selections.py that isn't enforced anywhere — if that script ever emits individual spans with embedded or (e.g., once it grows residue-range syntax or multi-chain clauses), those spans will be silently dropped. Safer options:

  • have find_altloc_selections.py emit the union under a distinct column (e.g., union_selection) and not splice it into selection, or
  • tag the union with a sentinel prefix like UNION: that this script can test for explicitly.

At minimum, worth a comment here pointing at the exact contract in find_altloc_selections.py so future edits don't diverge.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.gitignore:
- Around line 227-230: The blanket "*.csv" ignore will cause tracked CSVs to be
ignored on adds; update .gitignore to add a negation for the specific tracked
CSV (protein_configs.csv) so that it remains committable despite the "*.csv"
rule—add a "!<path-to>/protein_configs.csv" negation entry immediately after the
"*.csv" pattern or remove the blanket "*.csv" and instead list only undesired
CSV patterns.

In `@scripts/eval/classify_altloc_regions.py`:
- Around line 292-298: The current RuntimeError raised when all values in
pair_lddts are non-finite aborts the whole batch; change _process_structure so
that instead of raising, it logs an error (including protein, selection_str,
backbone_altloc_res_ids and pair_lddts) and either returns a sentinel result
that marks the span classification as "unknown" or simply skips emitting that
span; optionally add a --strict flag checked in main to preserve the current
RuntimeError behavior when requested. Ensure references: pair_lddts,
finite_vals, _process_structure, main, selection_str, and
backbone_altloc_res_ids are updated so downstream CSV writing handles the
sentinel/unknown case gracefully.
- Around line 231-303: The code silently continues when sel_chain_ids contains
multiple chains which collapses residues across chains and miscomputes
backbone_altloc_res_ids and lDDT (affecting sel_chain_ids,
backbone_altloc_res_ids, covered_altloc_residues, _max_contiguous_run and
_mean_residue_lddt_for_pair); fix by making multi-chain selections fail fast:
replace the current warning branch (len(sel_chain_ids) != 1) with an explicit
error (raise ValueError or RuntimeError) that includes sel_chain_ids and
selection_str so callers must split the selection upstream, ensuring downstream
logic always operates on a single chain and that backbone_altloc_res_ids,
covered_altloc_residues and lDDT computations are correct.
- Around line 306-311: Add NumPy-style docstrings to the functions missing them:
_process_structure and main (and also update the helpers _resolve_cif_path,
_build_pairwise_altloc_arrays, _mean_residue_lddt_for_pair, and
_classify_selection which currently have only prose). For each function include
a one-line summary, a Parameters section listing each argument name, type and
brief description, a Returns section describing the return type and structure
(e.g., list[dict] for _process_structure), and any Raises or Notes if relevant;
keep wording consistent with existing docstrings in the file and ensure
parameter names exactly match the function signatures.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d085183b-96a0-4a58-8a56-0109b95fa812

📥 Commits

Reviewing files that changed from the base of the PR and between 6e7a8cf and a58e610.

📒 Files selected for processing (2)
  • .gitignore
  • scripts/eval/classify_altloc_regions.py

]


def _max_contiguous_run(sorted_res_ids: list[int]) -> int:
Collaborator

I believe this can be vectorized. Does it get used much?

Collaborator Author

no but easy to do with claude
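For reference, a vectorized version along the lines suggested could look like this. It is a sketch, assuming the function's contract matches its name: return the length of the longest run of consecutive residue IDs in a sorted array.

```python
import numpy as np


def max_contiguous_run(sorted_res_ids) -> int:
    """Length of the longest run of consecutive residue IDs (vectorized)."""
    ids = np.asarray(sorted_res_ids, dtype=int)
    if ids.size == 0:
        return 0
    # Indices where consecutive IDs differ by more than 1 end a run.
    breaks = np.flatnonzero(np.diff(ids) != 1)
    # Run boundaries: start of array, each break+1, end of array.
    bounds = np.concatenate(([0], breaks + 1, [ids.size]))
    return int(np.max(np.diff(bounds)))
```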

return best


def _build_pairwise_altloc_arrays(
Collaborator

You can use grid_search_utils.map_altlocs_to_stack instead, if you apply selections first. Are you trying to avoid using selections? It seems like this is used to filter selections, so you could apply the selection then run map_altlocs_to_stack. It would reduce the number of special functions needed.

Collaborator Author

that will drop selections where the whole structure has 3 altlocs, but some residues only have 2. Added comment on this.

Collaborator

Right, so if you do the selection first, then construct the stack on just that selection, it would work, wouldn't it?



def _build_pairwise_altloc_arrays(
atom_array, altloc_ids: list[str]
Collaborator

Why not automatically do this over all discovered altlocs?

Collaborator Author

Do you mean by using detect_altlocs in here?

Collaborator

Yes. I anticipate this is because you are using the entire atom array here, and not a subset. I have always thought about these issues by localizing to the subset first, and only after that worrying about which altlocs are present. I find it easier to think about this way, especially because I don't think there's any clear way to know whether distant altloc stretches are related to each other.

return float("nan")

flat = [
v[0] if isinstance(v, (list, tuple, np.ndarray)) else v for v in residue_scores.values()
Collaborator

This seems weird, for a couple reasons.

  1. I don't like the use of .values() here--I guess now it should be ordered, but I prefer to know that order.
  2. Is it really the case that the value is sometimes a plain number, and other times some list-like thing? If so, I don't recall any good reason for that, and we should fix it upstream.

Collaborator

OTOH, if there is a good reason for 2), you're silently suppressing that behavior here, which also isn't great.

Collaborator Author

I tried to make this clearer? There is no reason the value is a list-like thing, that was overly defensive on my part.

"span_length": int(len(sel_res_ids)),
"n_backbone_altloc_residues": n_backbone,
"n_altlocs": len(altloc_ids),
"pair_lddts": json.dumps({}),
Collaborator

why dump this to a string?

row["classification"] = "side_chain_only"
return row, covered_altloc_residues

# Domain shift: contiguous backbone-altloc run exceeds threshold (default 50).
Collaborator

50 residues seems quite big to me--what about a single helix realignment? These are the kinds of hyperparameters I get worried about.

Collaborator Author

There is one "domain shift" in the dataset that I care about right now, and it is more like 130 residues, so this successfully detects it. in the future we definitely should make this more robust

row["classification"] = "domain_shift"
return row, covered_altloc_residues

# Single residue backbone altlocs cannot yield a meaningful lDDT, since you need at least two
Collaborator

Huh? This comment seems.... wrong. Or I don't understand what it's trying to say.

Collaborator Author

After looking into this more, this was a misunderstanding on my part of how the LDDT we were calculating worked, since I expected that we were looking within a segment only, not considering the neighborhood. I realize this is not correct and that the regular LDDT calculation is done with the 15A neighborhood around each residue.

# Single residue backbone altlocs cannot yield a meaningful lDDT, since you need at least two
# residues for inter-residue distances. A lone backbone-altloc residue is
# a minor local perturbation by definition, which we label as small_loop.
if n_backbone < 2:
Collaborator

I had the impression from above that "small_loop" referred to a small backbone displacement between altlocs which might include some more substantial side chain altlocs as well.

Collaborator Author

Yes, it should include this as well. I think this should be what it is now? If the LDDT calculation is what I think?

pair_lddts[f"{altloc_ids[i]}-{altloc_ids[j]}"] = _mean_residue_lddt_for_pair(
gt, pred, chain, backbone_altloc_res_ids
)
row["pair_lddts"] = json.dumps(pair_lddts)
Collaborator

again, why dump to a string?

Collaborator Author

We can then read the json string back in when using the CSV downstream. I suppose we could also store new columns in the CSV?

Collaborator

If this becomes a column in a data frame that you write out to CSV, it will automatically dump a dictionary to a string:

In [1]: import pandas as pd

In [2]: data = [{"thing": 1, "other": {'blah': 2, 'huh': "yesa"}}]

In [3]: df = pd.DataFrame(data)

In [4]: df
Out[4]: 
   thing                       other
0      1  {'blah': 2, 'huh': 'yesa'}

In [5]: print(df.to_csv())
,thing,other
0,1,"{'blah': 2, 'huh': 'yesa'}"

I'm not sure whether it can be coaxed to read it back in as a dictionary.
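For what it's worth, the value does round-trip cleanly if it was written with json.dumps: pandas' automatic dict-to-string dump uses Python repr (single quotes), which json.loads rejects, whereas a JSON-encoded column can be parsed back on read. A minimal sketch:

```python
import json
from io import StringIO

import pandas as pd

# Writing: serialize the dict explicitly so the CSV cell is valid JSON
# (pandas' default repr-style dump uses single quotes, which json.loads rejects).
df = pd.DataFrame([{"protein": "1abc", "pair_lddts": json.dumps({"A-B": 0.91})}])
csv_text = df.to_csv(index=False)

# Reading back: parse the JSON column into dicts again.
df2 = pd.read_csv(StringIO(csv_text))
df2["pair_lddts"] = df2["pair_lddts"].apply(json.loads)
assert df2.loc[0, "pair_lddts"] == {"A-B": 0.91}
```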

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/eval/classify_altloc_regions.py`:
- Around line 174-185: The code is constructing residue lookup keys as str(r)
but AllAtomLDDT uses chain+res_id (e.g. "A123"), so change the keys generation
to include the chain prefix (e.g. keys = [f"{chain}{r}" for r in residues] or
keys = [chain + str(r) for r in residues]) and use those keys for the missing
check, flat extraction (residue_scores[k][0]) and logging; update any references
to keys/residue_scores in this block (symbols: residue_scores, residues, keys,
chain, flat) so lookups match the AllAtomLDDT key format.
- Around line 162-169: AllAtomLDDT.compute is being passed full
predicted_atom_array_stack and ground_truth_atom_array_stack while relying on
the selection string to report "backbone" scores, but selection is applied after
feature construction so side-chain atoms still influence the lDDT; fix by
pre-filtering predicted_atom_array_stack and ground_truth_atom_array_stack to
only backbone atom rows (atom_name in ['C','CA','N','O']) before calling
AllAtomLDDT().compute, keep the residue/chain filtering (res_clause/chain) for
selection of reported residues, and ensure the filtered stacks preserve
corresponding atom ordering and indexing between prediction and ground truth
when passing them to compute.

In `@src/sampleworks/eval/grid_search_eval_utils.py`:
- Around line 27-38: The guard before constructing Path(pattern) needs to
explicitly detect NaN so NaN doesn't get passed to Path and trigger TypeError;
update the check around row["structure_pattern"] in grid_search_eval_utils.py to
use an explicit NaN check (e.g., pandas.isna(row["structure_pattern"]) or
math.isnan for numeric types) in addition to the existing truthiness test, and
raise the same ValueError if the value is missing/NaN; ensure the condition
references row["structure_pattern"] and the subsequent Path(...) call remains
unchanged.
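The NaN guard that comment asks for can be sketched as follows; `resolve_pattern` and the error message are illustrative stand-ins, not the repo's actual `resolve_cif_path` helper.

```python
from pathlib import Path

import pandas as pd


def resolve_pattern(row: pd.Series, root: Path) -> Path:
    """NaN-safe path resolution sketch; names here are illustrative."""
    pattern = row["structure_pattern"]
    # float('nan') is truthy, so a plain `if not pattern` check misses it
    # and NaN would reach Path() and raise TypeError; pd.isna catches
    # NaN, None, and pd.NA explicitly.
    if pd.isna(pattern) or not pattern:
        raise ValueError(f"missing structure_pattern for row {row.name!r}")
    return root / str(pattern)
```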

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 280a4e1c-c30a-47f2-a7da-afa3a3d742ce

📥 Commits

Reviewing files that changed from the base of the PR and between a58e610 and 9be2ecf.

📒 Files selected for processing (3)
  • .gitignore
  • scripts/eval/classify_altloc_regions.py
  • src/sampleworks/eval/grid_search_eval_utils.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • .gitignore

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
scripts/eval/classify_altloc_regions.py (3)

308-314: ⚠️ Potential issue | 🟡 Minor

Avoid aborting the whole batch for one unscorable span.

This RuntimeError bubbles out through _process_structure() and main(), so one problematic structure prevents writing classifications for the rest of the CSV. Consider logging and skipping the span, emitting an "unknown" classification, or gating strict failure behind a flag.


87-103: ⚠️ Potential issue | 🟡 Minor

Bring the new function docstrings up to NumPy style.

Several helpers only have prose summaries, and _process_structure() / main() have no docstring. Please add Parameters, Returns, and relevant Raises / Notes sections for the new functions.

As per coding guidelines, "Always include NumPy-style docstrings for every function and class."

Also applies to: 114-134, 152-158, 188-222, 322-327, 400-400

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/eval/classify_altloc_regions.py` around lines 87 - 103, Several new
helper functions (e.g., _max_contiguous_run and _chain_from_selection) and
higher-level functions (_process_structure and main) lack NumPy-style
docstrings; update each to include a concise summary plus Parameters, Returns,
and Raises (and Notes if relevant). For _max_contiguous_run: document parameter
sorted_res_ids (ndarray | list[int]), describe returned int (length of longest
contiguous run) and any edge-case behavior (empty input returns 0). For
_chain_from_selection: document selection: str, returned Optional[str], and note
supported selection syntaxes. Add NumPy-style docstrings similarly for
_process_structure and main describing inputs, outputs, side effects, and
exceptions; also apply the same docstring format to the other helper functions
flagged in the review (lines 114-134, 152-158, 188-222, 322-327, 400) so every
function/class follows the project's NumPy-style docstring guideline.

162-169: ⚠️ Potential issue | 🟠 Major

Pre-filter to backbone atoms before computing “backbone lDDT.”

The selection string limits reported residues, but the full atom arrays are still passed into lDDT feature construction, so side-chain atoms can influence a score reported as backbone-only.

Proposed direction
+    backbone_mask = np.isin(gt_array.atom_name, BACKBONE_ATOM_TYPES)
+    if not backbone_mask.any():
+        return float("nan")
+
+    gt_array = gt_array[backbone_mask]
+    pred_array = pred_array[backbone_mask]
+
     res_clause = " or ".join(f"res_id == {r}" for r in residues)
     selection = f"chain_id == '{chain}' and ({res_clause}) and atom_name in ['C','CA','N','O']"
     try:
         result = AllAtomLDDT().compute(
             predicted_atom_array_stack=pred_array,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/eval/classify_altloc_regions.py` around lines 162 - 169, The compute
call to AllAtomLDDT uses full predicted and ground-truth atom stacks
(predicted_atom_array_stack / ground_truth_atom_array_stack) so side-chain atoms
still affect the "backbone" lDDT; filter those stacks to only backbone atoms
before calling AllAtomLDDT().compute. Specifically, use the existing
res_clause/selection logic (chain, residues, atom_name in ['C','CA','N','O']) to
create filtered_pred_array and filtered_gt_array (keeping the same array shapes
expected by AllAtomLDDT) and pass those filtered stacks as
predicted_atom_array_stack and ground_truth_atom_array_stack to
AllAtomLDDT().compute (you can keep or remove the selection arg afterwards).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/sampleworks/eval/grid_search_eval_utils.py`:
- Around line 20-26: The docstring for resolve_cif_path is currently prose;
replace it with a NumPy-style docstring that includes a short summary, a
Parameters section documenting row (pd.Series) and cif_root (Path | None), a
Returns section indicating Path, and a Notes (or Returns/Raises) paragraph
describing the resolution behavior (prefer row["structure"] then
row["structure_pattern"], and when using structure_pattern try cif_root/pattern
and cif_root/protein/pattern). Also document any raised exceptions (e.g.,
FileNotFoundError) if applicable and keep the existing descriptive details from
the current prose.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1fda720c-0dfc-4c3c-bcd2-8305a360aafc

📥 Commits

Reviewing files that changed from the base of the PR and between 9be2ecf and f9be48e.

📒 Files selected for processing (2)
  • scripts/eval/classify_altloc_regions.py
  • src/sampleworks/eval/grid_search_eval_utils.py

Collaborator

@marcuscollins marcuscollins left a comment


I'm okay with it as is, but I would favor a selection-first approach that re-uses existing methods. I'm 99% sure it would work and be less code to maintain.

return best


def _build_pairwise_altloc_arrays(

Right, so if you do the selection first, then construct the stack on just that selection, it would work, wouldn't it?



def _build_pairwise_altloc_arrays(
atom_array, altloc_ids: list[str]

Yes. I anticipate this is because you are using the entire atom array here, and not a subset. I have always thought about these issues by localizing to the subset first, and only after that worrying about which altlocs are present. I find it easier to think about this way, especially because I don't think there's any clear way to know whether distant altloc stretches are related to each other.
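A minimal sketch of the selection-first order described above, using plain NumPy arrays in place of the real atom-array fields. The function and argument names here are illustrative, not the script's API:

```python
import numpy as np


def split_selection_by_altloc(res_ids, altloc_ids, chain_ids, chain, residues):
    """Localize to the selected region first, then group atoms by altloc."""
    res_ids = np.asarray(res_ids)
    altloc_ids = np.asarray(altloc_ids)
    chain_ids = np.asarray(chain_ids)

    # 1) Restrict to the region of interest before touching altlocs.
    in_selection = (chain_ids == chain) & np.isin(res_ids, residues)

    # 2) Only now ask which altlocs are present *within* that region, so
    #    distant altloc stretches cannot leak into the pairing.
    present = sorted(set(altloc_ids[in_selection]) - {"", "."})
    return {
        alt: np.flatnonzero(in_selection & np.isin(altloc_ids, [alt, "", "."]))
        for alt in present
    }
```

Constructing the per-altloc stack from just this subset would make the pairwise comparison local by construction.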

pair_lddts[f"{altloc_ids[i]}-{altloc_ids[j]}"] = _mean_residue_lddt_for_pair(
gt, pred, chain, backbone_altloc_res_ids
)
row["pair_lddts"] = json.dumps(pair_lddts)

If this becomes a column in a data frame that you write out to CSV, it will automatically dump a dictionary to a string:

In [1]: import pandas as pd

In [2]: data = [{"thing": 1, "other": {'blah': 2, 'huh': "yesa"}}]

In [3]: df = pd.DataFrame(data)

In [4]: df
Out[4]: 
   thing                       other
0      1  {'blah': 2, 'huh': 'yesa'}

In [5]: print(df.to_csv())
,thing,other
0,1,"{'blah': 2, 'huh': 'yesa'}"

I'm not sure whether it can be coaxed to read it back in as a dictionary.
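On the open question at the end: the repr-style string pandas writes for a dict column can be parsed back into a dict with `ast.literal_eval` via the `converters` argument to `read_csv` (unlike `json.loads`, it accepts the single-quoted form). A sketch:

```python
import ast
import io

import pandas as pd

df = pd.DataFrame([{"thing": 1, "other": {"blah": 2, "huh": "yesa"}}])
csv_text = df.to_csv(index=False)

# Round-trip: parse the repr string back into a real dict per cell.
restored = pd.read_csv(io.StringIO(csv_text),
                       converters={"other": ast.literal_eval})
```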

@k-chrispens k-chrispens merged commit fbf6d38 into main Apr 22, 2026
8 of 11 checks passed
@k-chrispens k-chrispens deleted the kmc/classify-conf-change branch April 22, 2026 21:41