
eval: script for classifying conformational changes#223

Merged
k-chrispens merged 5 commits into main from kmc/classify-conf-change
Apr 22, 2026

Conversation

@k-chrispens
Collaborator

@k-chrispens k-chrispens commented Apr 17, 2026

Evaluation script for classifying conformational changes, which we use for downstream analysis

Summary by CodeRabbit

  • Chores

    • Broadened repository ignore patterns to cover additional archive/data file types and expanded dataset directory matching; added logic to better resolve structure file paths when locating CIF files.
  • New Features

    • Added a structural-variant region classification tool that categorizes alternate-location spans, computes backbone similarity scores between variants, and outputs per-span scores and detailed CSV reports.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 17, 2026

Warning

Rate limit exceeded

@k-chrispens has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 49 minutes and 57 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 49 minutes and 57 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cfa4f7f4-d649-43c3-8629-9a60a0db1be8

📥 Commits

Reviewing files that changed from the base of the PR and between f9be48e and da69e98.

📒 Files selected for processing (1)
  • src/sampleworks/eval/grid_search_eval_utils.py
📝 Walkthrough

Walkthrough

Added a new CLI evaluation script to classify contiguous altloc spans in CIF structures by backbone lDDT and span size, a CIF-path resolver helper, and extended .gitignore to ignore archive, CSV, and dataset path patterns.

Changes

Cohort / File(s) Summary
Configuration & Artifacts
.gitignore
Extended ignore patterns: initial_dataset_40* (prefix match), *.tar.gz, *.tgz, *.csv; re‑included src/sampleworks/data/protein_configs.csv; removed trailing newline.
Altloc classification script
scripts/eval/classify_altloc_regions.py
New CLI script that resolves CIF paths, enumerates altloc IDs, builds pairwise AtomArray inputs, computes per-residue backbone lDDT across altloc pairs, classifies spans as side_chain_only, domain_shift, small_loop, or large_loop, and writes CSV output with worst_pair_mean_backbone_lddt, n_backbone_altloc_residues, n_altlocs, and JSON pair_lddts.
CIF path resolution helper
src/sampleworks/eval/grid_search_eval_utils.py
Added `resolve_cif_path(row: pd.Series, cif_root: Path)` helper to resolve structure CIF file paths from selection metadata.
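The four-way span classification the walkthrough describes could be sketched roughly as follows. This is a simplified sketch, not the script's actual code: the threshold names and the lDDT cutoff are assumptions (the review discussion below only confirms a default contiguous-run threshold of 50 residues for `domain_shift`).

```python
import math

# Assumed thresholds -- only the 50-residue domain-shift default is
# confirmed by the review discussion; the others are placeholders.
DOMAIN_SHIFT_MIN_RUN = 50
SMALL_LOOP_MAX_RUN = 10
LDDT_SMALL_CUTOFF = 0.75


def classify_span(
    n_backbone: int,
    max_contiguous_run: int,
    worst_pair_mean_backbone_lddt: float,
) -> str:
    """Bin an altloc span into one of the four conformational-change classes."""
    if n_backbone == 0:
        # No backbone atoms carry altlocs: only side chains differ.
        return "side_chain_only"
    if max_contiguous_run > DOMAIN_SHIFT_MIN_RUN:
        # A long contiguous backbone-altloc run suggests a rigid-body shift.
        return "domain_shift"
    if max_contiguous_run <= SMALL_LOOP_MAX_RUN or (
        not math.isnan(worst_pair_mean_backbone_lddt)
        and worst_pair_mean_backbone_lddt >= LDDT_SMALL_CUTOFF
    ):
        # Short spans (including lone backbone-altloc residues) and spans
        # whose variants stay backbone-similar count as minor perturbations.
        return "small_loop"
    return "large_loop"
```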

Sequence Diagram(s)

sequenceDiagram
  participant CLI as CLI
  participant CSV as Selections CSV
  participant Resolver as resolve_cif_path
  participant Loader as Structure Loader
  participant LDDT as AllAtomLDDT
  participant Output as CSV Writer

  CLI->>CSV: read selection rows
  loop for each row/selection
    CLI->>Resolver: resolve CIF path (row, --cif-root)
    Resolver-->>CLI: cif path
    CLI->>Loader: load CIF, extract altloc atom arrays
    Loader-->>CLI: altloc arrays & metadata
    CLI->>LDDT: build pairwise arrays & compute per-residue backbone lDDT
    LDDT-->>CLI: per-pair per-residue lDDT scores
    CLI->>CLI: classify span (side_chain_only / domain_shift / small_loop / large_loop)
    CLI->>Output: write classification row (incl. pair_lddts JSON)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Poem

🐰 I hop through CIFs where altlocs play,
Counting backbone runs in a curious way,
I pair each pose, score loops with delight,
Tagging shifts by day and by night,
While .gitignore tucks away the stray.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'eval: script for classifying conformational changes' directly and specifically describes the main addition: a new evaluation script for classifying conformational changes.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.



Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

Copilot AI left a comment


Pull request overview

Adds an evaluation-time script to classify contiguous altloc (alternate location) regions into conformational-change bins for downstream analysis, based on backbone altloc presence, span length, and pairwise backbone lDDT between altlocs.

Changes:

  • Introduce scripts/eval/classify_altloc_regions.py to load structures with altlocs, score pairwise altloc comparisons, and emit a per-span classification CSV.
  • Update .gitignore to ignore initial_dataset_40* directories and additional artifact patterns.

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated 3 comments.

File Description
scripts/eval/classify_altloc_regions.py New CLI script to classify altloc spans into side-chain-only / loop-size / domain-shift bins using pairwise backbone lDDT.
.gitignore Expands ignore patterns for large datasets and artifacts (including CSVs).


Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (4)
scripts/eval/classify_altloc_regions.py (4)

356-360: Shadowing the outer row: pd.Series with the per-span row: dict is confusing.

_process_structure takes row: pd.Series and then rebinds row on line 358 inside the loop. It happens to be harmless today (the outer row isn't referenced after the loop), but the name collision makes the function harder to read and one accidental reintroduction of row["protein"] below the loop would silently pick up the last span's dict. Rename one of them, e.g. span_row, covered = out.

♻️ Proposed rename
-        row, covered = out
-        rows.append(row)
+        span_row, covered = out
+        rows.append(span_row)
         classified_res_ids.update(covered)

134-162: Tighten the pair-array typing.

dict[tuple[str, str], tuple[object, object]] loses all information — every caller immediately uses these as AtomArrays. Annotating as tuple[AtomArray, AtomArray] (with from biotite.structure import AtomArray) makes _mean_residue_lddt_for_pair's signature honest too, and lets the type checker catch the None branch on line 286 rather than relying on the runtime guard inside _mean_residue_lddt_for_pair.


62-63: Import-time sys.path.insert is a hidden side effect; prefer a relative/package import.

Mutating sys.path at module import is exactly the kind of side effect the coding guidelines ask to be explicit about, and it also makes lddt_evaluation_script order-sensitive (anything earlier on sys.path with the same name wins). Since both scripts live under scripts/eval/, either:

  • extract translate_selection into src/sampleworks/... and import it normally, or
  • invoke via python -m scripts.eval.classify_altloc_regions with an __init__.py so a proper relative import works.

As per coding guidelines: "Only comment when truly necessary. ALWAYS annotate complex array shapes and note side effects."


339-344: Fragile heuristic for skipping the upstream union selection.

if " or " in selection_str: continue relies on a formatting contract in find_altloc_selections.py that isn't enforced anywhere — if that script ever emits individual spans with embedded or (e.g., once it grows residue-range syntax or multi-chain clauses), those spans will be silently dropped. Safer options:

  • have find_altloc_selections.py emit the union under a distinct column (e.g., union_selection) and not splice it into selection, or
  • tag the union with a sentinel prefix like UNION: that this script can test for explicitly.

At minimum, worth a comment here pointing at the exact contract in find_altloc_selections.py so future edits don't diverge.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.gitignore:
- Around line 227-230: The blanket "*.csv" ignore will cause tracked CSVs to be
ignored on adds; update .gitignore to add a negation for the specific tracked
CSV (protein_configs.csv) so that it remains committable despite the "*.csv"
rule—add a "!<path-to>/protein_configs.csv" negation entry immediately after the
"*.csv" pattern or remove the blanket "*.csv" and instead list only undesired
CSV patterns.

In `@scripts/eval/classify_altloc_regions.py`:
- Around line 292-298: The current RuntimeError raised when all values in
pair_lddts are non-finite aborts the whole batch; change _process_structure so
that instead of raising, it logs an error (including protein, selection_str,
backbone_altloc_res_ids and pair_lddts) and either returns a sentinel result
that marks the span classification as "unknown" or simply skips emitting that
span; optionally add a --strict flag checked in main to preserve the current
RuntimeError behavior when requested. Ensure references: pair_lddts,
finite_vals, _process_structure, main, selection_str, and
backbone_altloc_res_ids are updated so downstream CSV writing handles the
sentinel/unknown case gracefully.
- Around line 231-303: The code silently continues when sel_chain_ids contains
multiple chains which collapses residues across chains and miscomputes
backbone_altloc_res_ids and lDDT (affecting sel_chain_ids,
backbone_altloc_res_ids, covered_altloc_residues, _max_contiguous_run and
_mean_residue_lddt_for_pair); fix by making multi-chain selections fail fast:
replace the current warning branch (len(sel_chain_ids) != 1) with an explicit
error (raise ValueError or RuntimeError) that includes sel_chain_ids and
selection_str so callers must split the selection upstream, ensuring downstream
logic always operates on a single chain and that backbone_altloc_res_ids,
covered_altloc_residues and lDDT computations are correct.
- Around line 306-311: Add NumPy-style docstrings to the functions missing them:
_process_structure and main (and also update the helpers _resolve_cif_path,
_build_pairwise_altloc_arrays, _mean_residue_lddt_for_pair, and
_classify_selection which currently have only prose). For each function include
a one-line summary, a Parameters section listing each argument name, type and
brief description, a Returns section describing the return type and structure
(e.g., list[dict] for _process_structure), and any Raises or Notes if relevant;
keep wording consistent with existing docstrings in the file and ensure
parameter names exactly match the function signatures.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d085183b-96a0-4a58-8a56-0109b95fa812

📥 Commits

Reviewing files that changed from the base of the PR and between 6e7a8cf and a58e610.

📒 Files selected for processing (2)
  • .gitignore
  • scripts/eval/classify_altloc_regions.py

]


def _max_contiguous_run(sorted_res_ids: list[int]) -> int:
Collaborator

I believe this can be vectorized. Does it get used much?

Collaborator Author

no but easy to do with claude
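For reference, a vectorized version along the lines suggested could look like this. It is a sketch, assuming the function's contract matches its name: return the length of the longest run of consecutive residue IDs in a sorted array.

```python
import numpy as np


def max_contiguous_run(sorted_res_ids) -> int:
    """Length of the longest run of consecutive residue IDs (vectorized)."""
    ids = np.asarray(sorted_res_ids, dtype=int)
    if ids.size == 0:
        return 0
    # Indices where consecutive IDs differ by more than 1 end a run.
    breaks = np.flatnonzero(np.diff(ids) != 1)
    # Run boundaries: start of array, each break+1, end of array.
    bounds = np.concatenate(([0], breaks + 1, [ids.size]))
    return int(np.max(np.diff(bounds)))
```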

return best


def _build_pairwise_altloc_arrays(
Collaborator

You can use grid_search_utils.map_altlocs_to_stack instead, if you apply selections first. Are you trying to avoid using selections? It seems like this is used to filter selections, so you could apply the selection then run map_altlocs_to_stack. It would reduce the number of special functions needed.

Collaborator Author

that will drop selections where the whole structure has 3 altlocs, but some residues only have 2. Added comment on this.

Collaborator

Right, so if you do the selection first, then construct the stack on just that selection, it would work, wouldn't it?



def _build_pairwise_altloc_arrays(
atom_array, altloc_ids: list[str]
Collaborator

Why not automatically do this over all discovered altlocs?

Collaborator Author

Do you mean by using detect_altlocs in here?

Collaborator

Yes. I anticipate this is because you are using the entire atom array here, and not a subset. I have always thought about these issues by localizing to the subset first, and only after that worrying about which altlocs are present. I find it easier to think about this way, especially because I don't think there's any clear way to know whether distant altloc stretches are related to each other.

return float("nan")

flat = [
v[0] if isinstance(v, (list, tuple, np.ndarray)) else v for v in residue_scores.values()
Collaborator

This seems weird, for a couple reasons.

  1. I don't like the use of .values() here--I guess now it should be ordered, but I prefer to know that order.
  2. Is it really the case that the value is sometimes a plain number, and other times some list-like thing? If so, I don't recall any good reason for that, and we should fix it upstream.

Collaborator

OTOH, if there is a good reason for 2), you're silently suppressing that behavior here, which also isn't great.

Collaborator Author

I tried to make this clearer? There is no reason the value is a list-like thing, that was overly defensive on my part.

"span_length": int(len(sel_res_ids)),
"n_backbone_altloc_residues": n_backbone,
"n_altlocs": len(altloc_ids),
"pair_lddts": json.dumps({}),
Collaborator

why dump this to a string?

row["classification"] = "side_chain_only"
return row, covered_altloc_residues

# Domain shift: contiguous backbone-altloc run exceeds threshold (default 50).
Collaborator

50 residues seems quite big to me--what about a single helix realignment? These are the kinds of hyperparameters I get worried about.

Collaborator Author

There is one "domain shift" in the dataset that I care about right now, and it is more like 130 residues, so this successfully detects it. in the future we definitely should make this more robust

row["classification"] = "domain_shift"
return row, covered_altloc_residues

# Single residue backbone altlocs cannot yield a meaningful lDDT, since you need at least two
Collaborator

Huh? This comment seems.... wrong. Or I don't understand what it's trying to say.

Collaborator Author

After looking into this more, this was a misunderstanding on my part of how the LDDT we were calculating worked, since I expected that we were looking within a segment only, not considering the neighborhood. I realize this is not correct and that the regular LDDT calculation is done with the 15A neighborhood around each residue.

# Single residue backbone altlocs cannot yield a meaningful lDDT, since you need at least two
# residues for inter-residue distances. A lone backbone-altloc residue is
# a minor local perturbation by definition, which we label as small_loop.
if n_backbone < 2:
Collaborator

I had the impression from above that "small_loop" referred to a small backbone displacement between altlocs which might include some more substantial side chain altlocs as well.

Collaborator Author

Yes, it should include this as well. I think this should be what it is now? If the LDDT calculation is what I think?

pair_lddts[f"{altloc_ids[i]}-{altloc_ids[j]}"] = _mean_residue_lddt_for_pair(
gt, pred, chain, backbone_altloc_res_ids
)
row["pair_lddts"] = json.dumps(pair_lddts)
Collaborator

again, why dump to a string?

Collaborator Author

We can then read the json string back in when using the CSV downstream. I suppose we could also store new columns in the CSV?

Collaborator

If this becomes a column in a data frame that you write out to CSV, it will automatically dump a dictionary to a string:

In [1]: import pandas as pd

In [2]: data = [{"thing": 1, "other": {'blah': 2, 'huh': "yesa"}}]

In [3]: df = pd.DataFrame(data)

In [4]: df
Out[4]: 
   thing                       other
0      1  {'blah': 2, 'huh': 'yesa'}

In [5]: print(df.to_csv())
,thing,other
0,1,"{'blah': 2, 'huh': 'yesa'}"

I'm not sure whether it can be coaxed to read it back in as a dictionary.
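For what it's worth, the value does round-trip cleanly if it was written with json.dumps: pandas' automatic dict-to-string dump uses Python repr (single quotes), which json.loads rejects, whereas a JSON-encoded column can be parsed back on read. A minimal sketch:

```python
import json
from io import StringIO

import pandas as pd

# Writing: serialize the dict explicitly so the CSV cell is valid JSON
# (pandas' default repr-style dump uses single quotes, which json.loads rejects).
df = pd.DataFrame([{"protein": "1abc", "pair_lddts": json.dumps({"A-B": 0.91})}])
csv_text = df.to_csv(index=False)

# Reading back: parse the JSON column into dicts again.
df2 = pd.read_csv(StringIO(csv_text))
df2["pair_lddts"] = df2["pair_lddts"].apply(json.loads)
assert df2.loc[0, "pair_lddts"] == {"A-B": 0.91}
```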

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/eval/classify_altloc_regions.py`:
- Around line 174-185: The code is constructing residue lookup keys as str(r)
but AllAtomLDDT uses chain+res_id (e.g. "A123"), so change the keys generation
to include the chain prefix (e.g. keys = [f"{chain}{r}" for r in residues] or
keys = [chain + str(r) for r in residues]) and use those keys for the missing
check, flat extraction (residue_scores[k][0]) and logging; update any references
to keys/residue_scores in this block (symbols: residue_scores, residues, keys,
chain, flat) so lookups match the AllAtomLDDT key format.
- Around line 162-169: AllAtomLDDT.compute is being passed full
predicted_atom_array_stack and ground_truth_atom_array_stack while relying on
the selection string to report "backbone" scores, but selection is applied after
feature construction so side-chain atoms still influence the lDDT; fix by
pre-filtering predicted_atom_array_stack and ground_truth_atom_array_stack to
only backbone atom rows (atom_name in ['C','CA','N','O']) before calling
AllAtomLDDT().compute, keep the residue/chain filtering (res_clause/chain) for
selection of reported residues, and ensure the filtered stacks preserve
corresponding atom ordering and indexing between prediction and ground truth
when passing them to compute.

In `@src/sampleworks/eval/grid_search_eval_utils.py`:
- Around line 27-38: The guard before constructing Path(pattern) needs to
explicitly detect NaN so NaN doesn't get passed to Path and trigger TypeError;
update the check around row["structure_pattern"] in grid_search_eval_utils.py to
use an explicit NaN check (e.g., pandas.isna(row["structure_pattern"]) or
math.isnan for numeric types) in addition to the existing truthiness test, and
raise the same ValueError if the value is missing/NaN; ensure the condition
references row["structure_pattern"] and the subsequent Path(...) call remains
unchanged.
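The NaN guard that comment asks for can be sketched as follows; `resolve_pattern` and the error message are illustrative stand-ins, not the repo's actual `resolve_cif_path` helper.

```python
from pathlib import Path

import pandas as pd


def resolve_pattern(row: pd.Series, root: Path) -> Path:
    """NaN-safe path resolution sketch; names here are illustrative."""
    pattern = row["structure_pattern"]
    # float('nan') is truthy, so a plain `if not pattern` check misses it
    # and NaN would reach Path() and raise TypeError; pd.isna catches
    # NaN, None, and pd.NA explicitly.
    if pd.isna(pattern) or not pattern:
        raise ValueError(f"missing structure_pattern for row {row.name!r}")
    return root / str(pattern)
```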

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 280a4e1c-c30a-47f2-a7da-afa3a3d742ce

📥 Commits

Reviewing files that changed from the base of the PR and between a58e610 and 9be2ecf.

📒 Files selected for processing (3)
  • .gitignore
  • scripts/eval/classify_altloc_regions.py
  • src/sampleworks/eval/grid_search_eval_utils.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • .gitignore

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
scripts/eval/classify_altloc_regions.py (3)

308-314: ⚠️ Potential issue | 🟡 Minor

Avoid aborting the whole batch for one unscorable span.

This RuntimeError bubbles out through _process_structure() and main(), so one problematic structure prevents writing classifications for the rest of the CSV. Consider logging and skipping the span, emitting an "unknown" classification, or gating strict failure behind a flag.


87-103: ⚠️ Potential issue | 🟡 Minor

Bring the new function docstrings up to NumPy style.

Several helpers only have prose summaries, and _process_structure() / main() have no docstring. Please add Parameters, Returns, and relevant Raises / Notes sections for the new functions.

As per coding guidelines, "Always include NumPy-style docstrings for every function and class."

Also applies to: 114-134, 152-158, 188-222, 322-327, 400-400

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/eval/classify_altloc_regions.py` around lines 87 - 103, Several new
helper functions (e.g., _max_contiguous_run and _chain_from_selection) and
higher-level functions (_process_structure and main) lack NumPy-style
docstrings; update each to include a concise summary plus Parameters, Returns,
and Raises (and Notes if relevant). For _max_contiguous_run: document parameter
sorted_res_ids (ndarray | list[int]), describe returned int (length of longest
contiguous run) and any edge-case behavior (empty input returns 0). For
_chain_from_selection: document selection: str, returned Optional[str], and note
supported selection syntaxes. Add NumPy-style docstrings similarly for
_process_structure and main describing inputs, outputs, side effects, and
exceptions; also apply the same docstring format to the other helper functions
flagged in the review (lines 114-134, 152-158, 188-222, 322-327, 400) so every
function/class follows the project's NumPy-style docstring guideline.

162-169: ⚠️ Potential issue | 🟠 Major

Pre-filter to backbone atoms before computing “backbone lDDT.”

The selection string limits reported residues, but the full atom arrays are still passed into lDDT feature construction, so side-chain atoms can influence a score reported as backbone-only.

Proposed direction
+    backbone_mask = np.isin(gt_array.atom_name, BACKBONE_ATOM_TYPES)
+    if not backbone_mask.any():
+        return float("nan")
+
+    gt_array = gt_array[backbone_mask]
+    pred_array = pred_array[backbone_mask]
+
     res_clause = " or ".join(f"res_id == {r}" for r in residues)
     selection = f"chain_id == '{chain}' and ({res_clause}) and atom_name in ['C','CA','N','O']"
     try:
         result = AllAtomLDDT().compute(
             predicted_atom_array_stack=pred_array,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/eval/classify_altloc_regions.py` around lines 162 - 169, The compute
call to AllAtomLDDT uses full predicted and ground-truth atom stacks
(predicted_atom_array_stack / ground_truth_atom_array_stack) so side-chain atoms
still affect the "backbone" lDDT; filter those stacks to only backbone atoms
before calling AllAtomLDDT().compute. Specifically, use the existing
res_clause/selection logic (chain, residues, atom_name in ['C','CA','N','O']) to
create filtered_pred_array and filtered_gt_array (keeping the same array shapes
expected by AllAtomLDDT) and pass those filtered stacks as
predicted_atom_array_stack and ground_truth_atom_array_stack to
AllAtomLDDT().compute (you can keep or remove the selection arg afterwards).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/sampleworks/eval/grid_search_eval_utils.py`:
- Around line 20-26: The docstring for resolve_cif_path is currently prose;
replace it with a NumPy-style docstring that includes a short summary, a
Parameters section documenting row (pd.Series) and cif_root (Path | None), a
Returns section indicating Path, and a Notes (or Returns/Raises) paragraph
describing the resolution behavior (prefer row["structure"] then
row["structure_pattern"], and when using structure_pattern try cif_root/pattern
and cif_root/protein/pattern). Also document any raised exceptions (e.g.,
FileNotFoundError) if applicable and keep the existing descriptive details from
the current prose.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1fda720c-0dfc-4c3c-bcd2-8305a360aafc

📥 Commits

Reviewing files that changed from the base of the PR and between 9be2ecf and f9be48e.

📒 Files selected for processing (2)
  • scripts/eval/classify_altloc_regions.py
  • src/sampleworks/eval/grid_search_eval_utils.py

Collaborator

@marcuscollins marcuscollins left a comment


I'm okay with it as is, but I would favor a selection-first approach that re-uses existing methods. I'm 99% sure it would work and be less code to maintain.

return best


def _build_pairwise_altloc_arrays(

Right, so if you do the selection first, then construct the stack on just that selection, it would work, wouldn't it?



def _build_pairwise_altloc_arrays(
atom_array, altloc_ids: list[str]

Yes. I anticipate this is because you are using the entire atom array here, and not a subset. I have always thought about these issues by localizing to the subset first, and only after that worrying about which altlocs are present. I find it easier to think about this way, especially because I don't think there's any clear way to know whether distant altloc stretches are related to each other.
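A minimal sketch of the selection-first order described above, using plain NumPy arrays in place of the real atom-array fields. The function and argument names here are illustrative, not the script's API:

```python
import numpy as np


def split_selection_by_altloc(res_ids, altloc_ids, chain_ids, chain, residues):
    """Localize to the selected region first, then group atoms by altloc."""
    res_ids = np.asarray(res_ids)
    altloc_ids = np.asarray(altloc_ids)
    chain_ids = np.asarray(chain_ids)

    # 1) Restrict to the region of interest before touching altlocs.
    in_selection = (chain_ids == chain) & np.isin(res_ids, residues)

    # 2) Only now ask which altlocs are present *within* that region, so
    #    distant altloc stretches cannot leak into the pairing.
    present = sorted(set(altloc_ids[in_selection]) - {"", "."})
    return {
        alt: np.flatnonzero(in_selection & np.isin(altloc_ids, [alt, "", "."]))
        for alt in present
    }
```

Constructing the per-altloc stack from just this subset would make the pairwise comparison local by construction.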

pair_lddts[f"{altloc_ids[i]}-{altloc_ids[j]}"] = _mean_residue_lddt_for_pair(
gt, pred, chain, backbone_altloc_res_ids
)
row["pair_lddts"] = json.dumps(pair_lddts)

If this becomes a column in a data frame that you write out to CSV, it will automatically dump a dictionary to a string:

In [1]: import pandas as pd

In [2]: data = [{"thing": 1, "other": {'blah': 2, 'huh': "yesa"}}]

In [3]: df = pd.DataFrame(data)

In [4]: df
Out[4]: 
   thing                       other
0      1  {'blah': 2, 'huh': 'yesa'}

In [5]: print(df.to_csv())
,thing,other
0,1,"{'blah': 2, 'huh': 'yesa'}"

I'm not sure whether it can be coaxed to read it back in as a dictionary.
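On the open question at the end: the repr-style string pandas writes for a dict column can be parsed back into a dict with `ast.literal_eval` via the `converters` argument to `read_csv` (unlike `json.loads`, it accepts the single-quoted form). A sketch:

```python
import ast
import io

import pandas as pd

df = pd.DataFrame([{"thing": 1, "other": {"blah": 2, "huh": "yesa"}}])
csv_text = df.to_csv(index=False)

# Round-trip: parse the repr string back into a real dict per cell.
restored = pd.read_csv(io.StringIO(csv_text),
                       converters={"other": ast.literal_eval})
```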

@k-chrispens k-chrispens merged commit fbf6d38 into main Apr 22, 2026
8 of 11 checks passed
@k-chrispens k-chrispens deleted the kmc/classify-conf-change branch April 22, 2026 21:41