fix issue 17#115
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough
Replace scalar occupancy with per-altloc occupancy mappings across dataclasses, utilities, reference-loading, scripts, and tests; add deterministic occupancy formatting/validation helpers; and remove the RSCC grid-search analysis module.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (1 warning, 1 inconclusive)
Pull request overview
This PR updates occupancy filename-string generation to support arbitrary altloc sets and refactors the RSCC grid search analysis notebook to use shared eval utilities/config loading, aligning the analysis workflow with the newer grid search dataset conventions (and dropping legacy 6B8X occupancy-string compatibility).
Changes:
- Generalize `occupancy_to_str()` to accept arbitrary altloc occupancies via keyword args and omit zero-occupancy altlocs.
- Update `ProteinConfig` path generation to use the new occupancy string format (removing legacy 6B8X-specific fallbacks).
- Refactor `rscc_grid_search_analysis.py` to rely on shared constants/utilities (`scan_grid_search_results`, `rscc`, `compute_density_from_atomarray`, CSV-driven protein configs) and export results under `GRID_SEARCH_DIR`.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/sampleworks/eval/rscc_grid_search_analysis.py | Refactors RSCC analysis to use shared eval utilities/dataclasses/constants and CSV-driven configs; updates map loading/density computation and output pathing. |
| src/sampleworks/eval/occupancy_utils.py | Replaces two-altloc occupancy string helper with a generalized altloc-kwargs implementation. |
| src/sampleworks/eval/eval_dataclasses.py | Updates occupancy string generation in ProteinConfig to match the new generalized format and removes legacy 6B8X fallbacks. |
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/sampleworks/eval/rscc_grid_search_analysis.py (1)
219-239: ⚠️ Potential issue | 🟠 Major: Guard summary aggregation when no experiments are processed.
If `results` is empty, `pd.DataFrame([r.__dict__ for r in results])` can be columnless, and line 234 then raises on missing `protein`/`scaler`/`rscc`.
🛠️ Suggested change
```diff
 def _(pd, results):
     df = pd.DataFrame([r.__dict__ for r in results])
@@
     print("Results Summary:")
     print(df.drop(drop_cols, axis=1, errors="ignore").head(20).to_string())
-    print("\n\nSummary Statistics by Protein and Scaler:")
-    summary = (
-        df.groupby(["protein", "scaler"])["rscc"]
-        .agg(["count", "mean", "std", "min", "max"])
-        .round(4)
-    )
-    print(summary)
+    _required = {"protein", "scaler", "rscc"}
+    if df.empty or not _required.issubset(df.columns):
+        print("\n\nSummary Statistics by Protein and Scaler: no rows to summarize")
+    else:
+        print("\n\nSummary Statistics by Protein and Scaler:")
+        summary = (
+            df.groupby(["protein", "scaler"])["rscc"]
+            .agg(["count", "mean", "std", "min", "max"])
+            .round(4)
+        )
+        print(summary)
     return (df,)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/sampleworks/eval/rscc_grid_search_analysis.py` around lines 219 - 239, The aggregation fails when results is empty; modify the logic around df and summary to guard that case: after building df from results, check if df is empty (or if results is falsy) and either initialize df with the expected columns ("protein", "scaler", "rscc", plus any dropped cols like "exp_dir", "refined_cif_path", "base_map_path", "error", "protein_dir_name") or skip the groupby and set summary to an empty DataFrame/appropriate placeholder; ensure the prints still run without causing KeyError (use df.empty or "protein" in df.columns to decide), and return the same tuple shape (return (df,)) so callers get a consistent return type.
🧹 Nitpick comments (2)
src/sampleworks/eval/occupancy_utils.py (1)
72-77: Canonicalize altloc ordering before assembling the occupancy string.
Line 72 preserves caller insertion order, so equivalent occupancy sets can produce different strings (`A,B` vs `B,A`) and cause avoidable path/cache misses.
♻️ Suggested change
```diff
-    for label, occ in altloc_occupancies.items():
+    for label in sorted(altloc_occupancies):
+        occ = altloc_occupancies[label]
         if occ < 0:
             raise ValueError(f"Negative occupancy for altloc {label}: {occ}")
         occ = round(occ, 2)
         if abs(occ) > 1e-6:
             parts.append(f"{occ}occ{label}")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/sampleworks/eval/occupancy_utils.py` around lines 72 - 77, The loop over altloc_occupancies is using caller insertion order which yields non-deterministic occupancy strings; iterate over a canonical sorted order instead (e.g., iterate over sorted(altloc_occupancies.items()) or sorted(altloc_occupancies.keys())) before applying the existing checks and formatting so parts are appended deterministically; keep the ValueError for negative occupancies, the rounding to two decimals, and the abs(occ) > 1e-6 check but perform them in the sorted iteration that references altloc_occupancies and parts.
src/sampleworks/eval/rscc_grid_search_analysis.py (1)
681-686: Normalize occupancy precision before merging on float keys.
Line 684 merges on raw `occ_a` floats; tiny precision differences can silently drop correlation columns.
♻️ Suggested change
```diff
 if not df_base_vs_pure.empty:
+    _export_df = _export_df.copy()
+    _corr_df = df_base_vs_pure.copy()
+    _export_df["occ_a"] = _export_df["occ_a"].round(2)
+    _corr_df["occ_a"] = _corr_df["occ_a"].round(2)
     _export_df = pd.merge(
         _export_df,
-        df_base_vs_pure[["protein", "occ_a", "base_vs_1occA", "base_vs_1occB"]],
+        _corr_df[["protein", "occ_a", "base_vs_1occA", "base_vs_1occB"]],
         on=["protein", "occ_a"],
         how="left",
     )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/sampleworks/eval/rscc_grid_search_analysis.py` around lines 681 - 686, The merge uses raw float key occ_a which can misalign due to precision; before calling pd.merge on ["protein","occ_a"], normalize/quantize occ_a in both _export_df and df_base_vs_pure (e.g., round to a consistent number of decimals or format/astype string) so the keys match reliably, then perform the pd.merge to bring in base_vs_1occA and base_vs_1occB. Ensure you update the same column name occ_a in both dataframes (or add a normalized key like occ_a_key) so pd.merge(_export_df, df_base_vs_pure[...], on=["protein","occ_a"]) succeeds without silent drops.
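A minimal pandas illustration of the float-key pitfall this comment describes; the frames and column names are hypothetical stand-ins for `_export_df` and `df_base_vs_pure`:

```python
import pandas as pd

# 0.1 + 0.2 != 0.3 exactly, so merging on raw float keys silently
# drops the correlation value for that row.
left = pd.DataFrame({"protein": ["6b8x"], "occ_a": [0.1 + 0.2]})
right = pd.DataFrame({"protein": ["6b8x"], "occ_a": [0.3], "base_vs_1occA": [0.95]})

naive = left.merge(right, on=["protein", "occ_a"], how="left")
assert naive["base_vs_1occA"].isna().all()  # key mismatch drops the value

# Rounding both keys to a fixed precision makes the join deterministic.
left["occ_a"] = left["occ_a"].round(2)
right["occ_a"] = right["occ_a"].round(2)
fixed = left.merge(right, on=["protein", "occ_a"], how="left")
assert fixed["base_vs_1occA"].iloc[0] == 0.95
```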
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/sampleworks/eval/eval_dataclasses.py`:
- Around line 55-56: The calls to occupancy_to_str (e.g., where occ_str =
occupancy_to_str(A=occupancy_a, B=1.0 - occupancy_a) that feed map_path =
self.base_map_dir / self.map_pattern.format(occ_str=occ_str)) can raise
ValueError and must be handled locally to preserve the declared Path | None
return; wrap the occupancy_to_str call in a try/except that catches ValueError
and returns/assigns None for map_path (or otherwise yields the Path | None
fallback) instead of allowing the exception to propagate, updating both sites
(the occ_str usage around map_path and the similar usage at lines ~78-79) to
follow this pattern.
---
Outside diff comments:
In `@src/sampleworks/eval/rscc_grid_search_analysis.py`:
- Around line 219-239: The aggregation fails when results is empty; modify the
logic around df and summary to guard that case: after building df from results,
check if df is empty (or if results is falsy) and either initialize df with the
expected columns ("protein", "scaler", "rscc", plus any dropped cols like
"exp_dir", "refined_cif_path", "base_map_path", "error", "protein_dir_name") or
skip the groupby and set summary to an empty DataFrame/appropriate placeholder;
ensure the prints still run without causing KeyError (use df.empty or "protein"
in df.columns to decide), and return the same tuple shape (return (df,)) so
callers get a consistent return type.
---
Nitpick comments:
In `@src/sampleworks/eval/occupancy_utils.py`:
- Around line 72-77: The loop over altloc_occupancies is using caller insertion
order which yields non-deterministic occupancy strings; iterate over a canonical
sorted order instead (e.g., iterate over sorted(altloc_occupancies.items()) or
sorted(altloc_occupancies.keys())) before applying the existing checks and
formatting so parts are appended deterministically; keep the ValueError for
negative occupancies, the rounding to two decimals, and the abs(occ) > 1e-6
check but perform them in the sorted iteration that references
altloc_occupancies and parts.
In `@src/sampleworks/eval/rscc_grid_search_analysis.py`:
- Around line 681-686: The merge uses raw float key occ_a which can misalign due
to precision; before calling pd.merge on ["protein","occ_a"], normalize/quantize
occ_a in both _export_df and df_base_vs_pure (e.g., round to a consistent number
of decimals or format/astype string) so the keys match reliably, then perform
the pd.merge to bring in base_vs_1occA and base_vs_1occB. Ensure you update the
same column name occ_a in both dataframes (or add a normalized key like
occ_a_key) so pd.merge(_export_df, df_base_vs_pure[...], on=["protein","occ_a"])
succeeds without silent drops.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- src/sampleworks/eval/eval_dataclasses.py
- src/sampleworks/eval/occupancy_utils.py
- src/sampleworks/eval/rscc_grid_search_analysis.py
marcuscollins left a comment
I see at least one place where we don't handle the possibility of more than two altlocs. I think we probably should deal with that, since you partially handle it in this PR.
Because there are two altloc dependencies in the LDDT eval script, I will postpone updating and merging this until that is merged
Force-pushed e0915dd to bffa7e6
Actionable comments posted: 4
♻️ Duplicate comments (2)
src/sampleworks/eval/structure_utils.py (1)
468-469: ⚠️ Potential issue | 🟠 Major: Default occupancy list is still hardcoded to two altlocs (A/B).
This fallback prevents true multi-altloc behavior and will miss valid reference combinations for proteins with 3+ altlocs.
🛠️ Proposed fix (fail fast instead of silently assuming A/B)
```diff
-    if occ_list is None:
-        occ_list = [{"A": 0.0, "B": 1.0}, {"A": 1.0, "B": 0.0}]
+    if occ_list is None:
+        raise ValueError(
+            "occ_list must be provided explicitly for multi-altloc proteins; "
+            "no safe default occupancy set exists."
+        )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/sampleworks/eval/structure_utils.py` around lines 468 - 469, The code currently hardcodes occ_list to two altlocs [{"A":0.0,"B":1.0},{"A":1.0,"B":0.0}] when occ_list is None, which prevents correct handling of proteins with 3+ altlocs; instead remove the A/B fallback and fail-fast: if occ_list is None raise a ValueError (or TypeError) explaining that occ_list must be provided (or compute it dynamically from the structure's altloc identifiers if that logic exists), and update callers to pass a generated occ_list; specifically change the occ_list default handling (the occ_list assignment block) to raise an error or call a utility that enumerates occupancy combinations from the structure's altloc set rather than assuming "A"/"B".
scripts/eval/lddt_evaluation_script.py (1)
223-225: ⚠️ Potential issue | 🟠 Major: Preloading is still hardcoded to two altlocs (A/B).
This loop cannot represent experiments with 3+ altloc occupancies, so generalized occupancy experiments will not be preloaded correctly.
🛠️ Suggested direction
```diff
-    for occ in args.occupancies:
-        altloc_occ = {"A": occ, "B": 1.0 - occ}
-        occ_key = tuple(sorted(altloc_occ.items()))
+    protein_occ_maps = {
+        exp.occ_key: dict(exp.occ_key)
+        for exp in all_experiments
+        if exp.protein.lower() == protein_key.lower()
+    }
+    for occ_key, altloc_occ in protein_occ_maps.items():
```
🧹 Nitpick comments (1)
scripts/eval/lddt_evaluation_script.py (1)
226-229: Load each reference structure once per occupancy, not once per selection.
`get_reference_atomarraystack()` is inside the selection loop, causing repeated disk loads of the same file.
♻️ Proposed refactor
```diff
-    for sel in protein_config.selection:
-        ref_path, reference_proteins = get_reference_atomarraystack(
-            protein_config, altloc_occ
-        )
-        if reference_proteins is None:
-            logger.warning(
-                f"Could not find ref structure for {protein_key} "
-                f"and occupancies {altloc_occ}"
-            )
-            continue
+    ref_path, reference_proteins = get_reference_atomarraystack(
+        protein_config, altloc_occ
+    )
+    if reference_proteins is None:
+        logger.warning(
+            f"Could not find ref structure for {protein_key} "
+            f"and occupancies {altloc_occ}"
+        )
+        continue
+    for sel in protein_config.selection:
         try:
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/eval/lddt_evaluation_script.py` around lines 226 - 229, get_reference_atomarraystack() is being called inside the loop over protein_config.selection causing repeated disk loads; move the call out of the selection loop so each reference structure is loaded once per occupancy (altloc_occ) and reused for all selections—e.g., compute ref_path and reference_proteins once before iterating over protein_config.selection (or cache them keyed by altloc_occ) and then use those variables inside the loop that processes each sel.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/eval/lddt_evaluation_script.py`:
- Around line 281-283: The cache lookup uses _exp.protein (original casing)
instead of the normalized protein variable, causing misses; update the
atom_array_key construction to use the normalized protein (the variable named
protein) together with _exp.occ_key when checking/reference_atom_arrays and
anywhere atom_array_key is created so preloaded references match the normalized
keys.
In `@src/sampleworks/eval/grid_search_eval_utils.py`:
- Around line 102-103: The current validation only checks for an empty
altloc_occupancies but allows mappings with all-zero occupancies; update the
condition around altloc_occupancies (the variable used in the if) to also reject
mappings where every value is zero by adding a check such as not any(v for v in
altloc_occupancies.values()) or all(v == 0.0 for v in
altloc_occupancies.values()); keep the existing checks for model ==
StructurePredictor.BOLTZ_2 and method is None unchanged (i.e., combine the new
all-zero check with the existing or clauses so all-zero mappings are treated as
invalid).
In `@src/sampleworks/eval/occupancy_utils.py`:
- Around line 7-9: The docstring and implementation disagree: the code currently
extracts the protein name with dir_name.split("_")[0] but the docstring claims
it should be "the first underscore-delimited token that does not match the
occupancy pattern." Update the implementation in occupancy_utils.py to actually
find the first token that does not match the occupancy regex (use the same
occupancy matching logic used elsewhere in this module) instead of always taking
the zeroth token, or alternatively change the docstring to state that the
protein name is simply the first underscore-delimited token; reference the
existing dir_name.split("_")[0] usage and the occupancy matching logic in this
module when making the change.
- Line 79: The lambda parameter name `l` in the sorted call over
`altloc_occupancies` triggers Ruff E741; update the key function in the loop
(the sorted(..., key=lambda l: str(l).upper())) to use a non-ambiguous parameter
name such as `label` or `item` (e.g., key=lambda label: str(label).upper()),
keeping the same logic and the loop header `for label in
sorted(altloc_occupancies, ...)` intact.
---
Duplicate comments:
In `@src/sampleworks/eval/structure_utils.py`:
- Around line 468-469: The code currently hardcodes occ_list to two altlocs
[{"A":0.0,"B":1.0},{"A":1.0,"B":0.0}] when occ_list is None, which prevents
correct handling of proteins with 3+ altlocs; instead remove the A/B fallback
and fail-fast: if occ_list is None raise a ValueError (or TypeError) explaining
that occ_list must be provided (or compute it dynamically from the structure's
altloc identifiers if that logic exists), and update callers to pass a generated
occ_list; specifically change the occ_list default handling (the occ_list
assignment block) to raise an error or call a utility that enumerates occupancy
combinations from the structure's altloc set rather than assuming "A"/"B".
---
Nitpick comments:
In `@scripts/eval/lddt_evaluation_script.py`:
- Around line 226-229: get_reference_atomarraystack() is being called inside the
loop over protein_config.selection causing repeated disk loads; move the call
out of the selection loop so each reference structure is loaded once per
occupancy (altloc_occ) and reused for all selections—e.g., compute ref_path and
reference_proteins once before iterating over protein_config.selection (or cache
them keyed by altloc_occ) and then use those variables inside the loop that
processes each sel.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: db0c9f95-2c33-4a5b-bc5f-693de9f973a1
📒 Files selected for processing (7)
- scripts/eval/lddt_evaluation_script.py
- src/sampleworks/eval/eval_dataclasses.py
- src/sampleworks/eval/grid_search_eval_utils.py
- src/sampleworks/eval/occupancy_utils.py
- src/sampleworks/eval/rscc_grid_search_analysis.py
- src/sampleworks/eval/structure_utils.py
- tests/eval/test_structure_utils.py
💤 Files with no reviewable changes (1)
- src/sampleworks/eval/rscc_grid_search_analysis.py
```python
logger.info("Pre-loading reference structures for each protein for coordinate extraction")
reference_atom_arrays = {}
for protein_key, protein_config in protein_configs.items():
    for occ, sel in itertools.product(args.occupancies, protein_config.selection):
```
Is there a reason not to use itertools.product? I used it to facilitate possible future parallelization with joblib.
```diff
-occ_str = occupancy_to_str(occupancy_a, use_6b8x_format=self.protein == "6b8x")
+try:
+    occ_str = occupancy_to_str(**altloc_occupancies)
+except ValueError as e:
```
when does this happen? Can we prevent it with input validation?
this happens when the altlocs have negative or >1 occupancy
This is where input validation can really help. Maybe make an issue to add this feature.
```diff
@@ -242,7 +242,7 @@ class TestGetReferenceAtomArrayStack:
     def test_returns_none_when_not_found(self, mock_protein_config):
```
(Not specific to this line): can you add tests with more than two altlocs?
Force-pushed 23017f4 to 8bf46d7
Actionable comments posted: 2
♻️ Duplicate comments (1)
src/sampleworks/eval/occupancy_utils.py (1)
7-9: ⚠️ Potential issue | 🟡 Minor: Docstring and parser behavior still diverge.
Line 34 always takes the first underscore token, but lines 7-9 document selecting the first token that is not an occupancy token.
🛠️ Proposed fix
```diff
-    protein = dir_name.split("_")[0].lower()
+    tokens = dir_name.split("_")
+    occ_token = re.compile(r"^\d+\.?\d*occ[A-Za-z]$", re.IGNORECASE)
+    protein = next((tok for tok in tokens if not occ_token.match(tok)), tokens[0]).lower()
```
Also applies to: 34-34
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/sampleworks/eval/occupancy_utils.py` around lines 7 - 9, The parser that extracts protein name and occupancy tokens from dir_name currently always selects the first underscore-delimited token as the protein name; update the logic so it instead scans dir_name.split("_") and selects the first token that does NOT match the occupancy pattern (the "{value}occ{label}" pattern), using the same regex or helper (e.g., is_occupancy_token) you use elsewhere to detect occupancy tokens, and keep the rest of tokens for occupancy parsing; also update the function docstring to match this behavior so it states "first underscore-delimited token that does not match the occupancy pattern."
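The behavior the docstring promises can be sketched in isolation. `protein_from_dir_name` is an illustrative name, and the `OCC_TOKEN` regex follows the pattern in the proposed fix above:

```python
import re

# Take the first underscore-delimited token that is NOT an occupancy
# token such as "0.7occA"; fall back to the first token otherwise.
OCC_TOKEN = re.compile(r"^\d+\.?\d*occ[A-Za-z]$", re.IGNORECASE)

def protein_from_dir_name(dir_name: str) -> str:
    tokens = dir_name.split("_")
    return next((t for t in tokens if not OCC_TOKEN.match(t)), tokens[0]).lower()

print(protein_from_dir_name("0.7occA_0.3occB_6B8X"))  # -> 6b8x
```

With the naive `split("_")[0]` approach, the same directory name would yield `0.7occa` instead of the protein name.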
🧹 Nitpick comments (1)
scripts/eval/lddt_evaluation_script.py (1)
225-228: Load each occupancy-specific reference once, then fan out by selection.
`get_reference_atomarraystack()` is currently inside the selection loop, so the same reference file is read repeatedly for each selection.
♻️ Proposed refactor
```diff
 for occ in args.occupancies:
     altloc_occ = {"A": occ, "B": 1.0 - occ}
     occ_key = tuple(sorted(altloc_occ.items()))
-    for sel in protein_config.selection:
-        ref_path, reference_proteins = get_reference_atomarraystack(
-            protein_config, altloc_occ
-        )
-        if reference_proteins is None:
-            logger.warning(
-                f"Could not find ref structure for {protein_key} "
-                f"and occupancies {altloc_occ}"
-            )
-            continue
+    ref_path, reference_proteins = get_reference_atomarraystack(
+        protein_config, altloc_occ
+    )
+    if reference_proteins is None:
+        logger.warning(
+            f"Could not find ref structure for {protein_key} "
+            f"and occupancies {altloc_occ}"
+        )
+        continue
+    for sel in protein_config.selection:
         try:
             logger.info(
                 f"Loaded ref structure for {protein_key} "
                 f"and occupancies {altloc_occ}: {ref_path}"
             )
```
Also applies to: 246-249
🤖 Prompt for AI Agents
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/eval/lddt_evaluation_script.py` around lines 225 - 228, The reference loader get_reference_atomarraystack(...) is being called inside the per-selection loop (for sel in protein_config.selection), causing the same file to be read multiple times; move the call out of that loop so you call get_reference_atomarraystack(protein_config, altloc_occ) once per occupancy and reuse the returned ref_path and reference_proteins for each sel, and apply the same change to the other duplicated block (the similar call at lines 246-249) so selection iterations only fan out from the already-loaded reference instead of reloading it.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/eval/lddt_evaluation_script.py`:
- Around line 223-224: The cache key construction keeps zero-valued
alternate-location entries which can differ from experiment keys parsed from
filenames; update the altloc_occ -> occ_key creation so you drop any entries
with zero (or negligibly small) occupancy before building occ_key (e.g., build
altloc_occ with only occ > 0 then set occ_key =
tuple(sorted(altloc_occ.items()))); apply the same canonicalization in the other
key-creation spot that mirrors this logic so the lookup that uses occ_key will
match preloaded data.
In `@src/sampleworks/eval/occupancy_utils.py`:
- Around line 79-97: The sum check on altloc_occupancies happens on the raw
input but later you round occupancies for output (in the loop that builds
parts), which can produce a filename whose rounded occupancies sum to >1; fix by
canonicalizing occupancies before validation and output: create a canonical_occ
dict (e.g., canonical_occ = {label: round(float(val), 2) for label, val in
altloc_occupancies.items()}) and use canonical_occ for the sum check, the
per-value 0..1 range check, and when building parts (replace uses of occ =
round(...) with occ = canonical_occ[label]), keeping the small tolerance check
(abs(occ) > 1e-6) as needed.
---
Duplicate comments:
In `@src/sampleworks/eval/occupancy_utils.py`:
- Around line 7-9: The parser that extracts protein name and occupancy tokens
from dir_name currently always selects the first underscore-delimited token as
the protein name; update the logic so it instead scans dir_name.split("_") and
selects the first token that does NOT match the occupancy pattern (the
"{value}occ{label}" pattern), using the same regex or helper (e.g.,
is_occupancy_token) you use elsewhere to detect occupancy tokens, and keep the
rest of tokens for occupancy parsing; also update the function docstring to
match this behavior so it states "first underscore-delimited token that does not
match the occupancy pattern."
---
Nitpick comments:
In `@scripts/eval/lddt_evaluation_script.py`:
- Around line 225-228: The reference loader get_reference_atomarraystack(...) is
being called inside the per-selection loop (for sel in
protein_config.selection), causing the same file to be read multiple times; move
the call out of that loop so you call
get_reference_atomarraystack(protein_config, altloc_occ) once per occupancy and
reuse the returned ref_path and reference_proteins for each sel, and apply the
same change to the other duplicated block (the similar call at lines 246-249) so
selection iterations only fan out from the already-loaded reference instead of
reloading it.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: aa4ef3b6-db08-4207-9969-a4060d9c77ab
📒 Files selected for processing (7)
- scripts/eval/lddt_evaluation_script.py
- src/sampleworks/eval/eval_dataclasses.py
- src/sampleworks/eval/grid_search_eval_utils.py
- src/sampleworks/eval/occupancy_utils.py
- src/sampleworks/eval/rscc_grid_search_analysis.py
- src/sampleworks/eval/structure_utils.py
- tests/eval/test_structure_utils.py
💤 Files with no reviewable changes (1)
- src/sampleworks/eval/rscc_grid_search_analysis.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/sampleworks/eval/grid_search_eval_utils.py
Force-pushed 8bf46d7 to 314483e
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/sampleworks/eval/structure_utils.py (1)
467-468: Default `occ_list` assumes two altlocs (A and B).
The default value `[{"A": 0.0, "B": 1.0}, {"A": 1.0, "B": 0.0}]` is consistent with the hardcoded pattern in `lddt_evaluation_script.py`. Per issue #138 mentioned in past comments, this limitation is tracked for future enhancement.
Consider extracting this default to a module-level constant (e.g., `DEFAULT_TWO_ALTLOC_OCCUPANCIES`) shared across files to ensure consistency and simplify future updates when supporting more altlocs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/sampleworks/eval/structure_utils.py` around lines 467 - 468, The hardcoded default occ_list ( [{"A": 0.0, "B": 1.0}, {"A": 1.0, "B": 0.0}] ) should be extracted to a module-level constant named DEFAULT_TWO_ALTLOC_OCCUPANCIES and referenced instead of being inlined where occ_list is set (e.g., in structure_utils.py where occ_list is initialized) so the same constant can be imported and used by other modules (like lddt_evaluation_script.py) to guarantee consistency; add the constant near top-level of the module, replace the inline literal with DEFAULT_TWO_ALTLOC_OCCUPANCIES, and update other files to import and use that constant.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/sampleworks/eval/eval_dataclasses.py`:
- Line 16: The code still accesses the removed Experiment attribute _exp.occ_a;
update rscc_grid_search_script.py to read the renamed field
_exp.altloc_occupancies everywhere you reference occ_a (right now at the several
occurrences noted), ensuring any downstream logic expecting a mapping uses the
dict[str,float] shape of altloc_occupancies; search for _exp.occ_a and replace
with _exp.altloc_occupancies and verify any unpacking, defaulting, or key access
matches the new dict structure (adjust variable names or iteration if needed).
---
Nitpick comments:
In `@src/sampleworks/eval/structure_utils.py`:
- Around line 467-468: The hardcoded default occ_list ( [{"A": 0.0, "B": 1.0},
{"A": 1.0, "B": 0.0}] ) should be extracted to a module-level constant named
DEFAULT_TWO_ALTLOC_OCCUPANCIES and referenced instead of being inlined where
occ_list is set (e.g., in structure_utils.py where occ_list is initialized) so
the same constant can be imported and used by other modules (like
lddt_evaluation_script.py) to guarantee consistency; add the constant near
top-level of the module, replace the inline literal with
DEFAULT_TWO_ALTLOC_OCCUPANCIES, and update other files to import and use that
constant.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 60918848-18d8-4441-8fd0-6410247f1db0
📒 Files selected for processing (7)
- scripts/eval/lddt_evaluation_script.py
- src/sampleworks/eval/eval_dataclasses.py
- src/sampleworks/eval/grid_search_eval_utils.py
- src/sampleworks/eval/occupancy_utils.py
- src/sampleworks/eval/rscc_grid_search_analysis.py
- src/sampleworks/eval/structure_utils.py
- tests/eval/test_structure_utils.py
💤 Files with no reviewable changes (1)
- src/sampleworks/eval/rscc_grid_search_analysis.py
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/eval/test_structure_utils.py
- src/sampleworks/eval/grid_search_eval_utils.py
…of altlocs and associated occupancies
Force-pushed 314483e to 6b76fd7
Actionable comments posted: 1
♻️ Duplicate comments (2)
src/sampleworks/eval/occupancy_utils.py (1)
79-97: ⚠️ Potential issue | 🟠 Major: Canonicalize occupancies before validating and formatting them.
This still validates the raw float inputs and only rounds inside the output loop. Inputs like `A=0.335, B=0.335, C=0.33` pass the sum check, but serialize to `0.34occA_0.34occB_0.33occC`, which now represents a total occupancy of 1.01. Please normalize once first, then use that canonical mapping for both validation and string generation.
Proposed fix
```diff
-    if sum(altloc_occupancies.values()) > 1:
-        raise ValueError(
-            "Altloc occupancies cannot sum to more than 1, currently "
-            f"they sum to {sum(altloc_occupancies.values())}: {altloc_occupancies}"
-        )
-    if any(occ < 0 or occ > 1 for occ in altloc_occupancies.values()):
-        raise ValueError(
-            "Altloc occupancies must be between 0 and 1, currently "
-            f"they don't: {altloc_occupancies}"
-        )
-    parts = []
-    for label in sorted(altloc_occupancies, key=lambda label_name: str(label_name).upper()):
-        occ = altloc_occupancies[label]
-        occ = round(occ, 2)
-        if abs(occ) > 1e-6:
-            label_str = str(label).upper()
-            parts.append(f"{occ}occ{label_str}")
+    canonical_occupancies: dict[str, float] = {}
+    for label, occ in altloc_occupancies.items():
+        canonical_occ = round(float(occ), 2)
+        if canonical_occ < 0 or canonical_occ > 1:
+            raise ValueError(
+                "Altloc occupancies must be between 0 and 1, currently "
+                f"they don't: {altloc_occupancies}"
+            )
+        if abs(canonical_occ) > 1e-6:
+            canonical_occupancies[str(label).upper()] = canonical_occ
+
+    if not canonical_occupancies:
+        raise ValueError("At least one altloc must have non-zero occupancy")
+
+    total_occ = sum(canonical_occupancies.values())
+    if total_occ > 1 + 1e-6:
+        raise ValueError(
+            "Altloc occupancies cannot sum to more than 1, currently "
+            f"they sum to {total_occ}: {canonical_occupancies}"
+        )
+
+    parts = []
+    for label in sorted(canonical_occupancies):
+        parts.append(f"{canonical_occupancies[label]}occ{label}")
-    if not parts:
-        raise ValueError("At least one altloc must have non-zero occupancy")
     return "_".join(parts)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/sampleworks/eval/occupancy_utils.py` around lines 79 - 97, the code must canonicalize (normalize) altloc_occupancies before validating and formatting: compute total = sum(altloc_occupancies.values()), if total == 0 raise, then create a normalized mapping (e.g., norm = {label: val/total for label, val in altloc_occupancies.items()}) and use that mapping for the subsequent checks (sum, bounds check using norm values) and for generating parts; keep rounding (round(occ, 2)) only when producing the output strings (the parts list) so the validation uses canonical values and the serialized labels reflect the normalized occupancies; reference the existing variables altloc_occupancies, occ, parts and label when making these changes.

scripts/eval/lddt_evaluation_script.py (1)
222-224: ⚠️ Potential issue | 🟠 Major

Canonicalize the occupancy cache key, and stop hard-coding an A/B pair here.

This preload path still stores references under raw `{"A": occ, "B": 1.0 - occ}` keys. At the 0.0/1.0 boundaries that produces keys like `(("A", 0.0), ("B", 1.0))`, but experiments parsed from filenames like `1.0occB` use `_exp.occ_key == (("B", 1.0),)`, so the later lookup misses a valid cached reference. This block also still only models two altlocs, which means the generalized 3+ altloc path is still incomplete here.

Also applies to: 280-281
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/eval/lddt_evaluation_script.py` around lines 222 - 224, The occupancy cache key currently hard-codes an A/B pair (altloc_occ = {"A": occ, "B": 1.0 - occ}) and creates occ_key without normalizing absent/zero entries, causing mismatches with keys like (("B", 1.0),). Change the construction so you derive the altloc->occupancy mapping from the actual altloc labels present (or the keys of the occupancy dict) rather than assuming "A"/"B", then create occ_key as a canonical, sorted tuple of (altloc, occ) pairs filtering out entries with zero occupancy (e.g., occ > 0) so keys like (("B", 1.0),) and (("A",0.0),("B",1.0)) canonicalize to the same representation; apply the same fix to the other occurrence of this logic around the later block (lines ~280-281).
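The canonical key the review describes — sorted, zero-occupancy entries dropped — can be sketched with a hypothetical helper (illustrative only, not the script's actual code; the name `occ_cache_key` and the tolerance are assumptions):

```python
def occ_cache_key(altloc_occ: dict[str, float], tol: float = 1e-6) -> tuple:
    """Canonical cache key: drop ~zero occupancies, upper-case and sort labels."""
    return tuple(sorted(
        (str(label).upper(), round(float(occ), 2))
        for label, occ in altloc_occ.items()
        if occ > tol
    ))

# {"A": 0.0, "B": 1.0} and a bare {"B": 1.0} now map to the same key,
# so references preloaded at the 0.0/1.0 boundary are found again later.
```

With such a key, the preload side and the filename-parsing side agree regardless of how many altlocs each one happens to mention explicitly.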
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8ea7be87-1598-484e-ba57-8d92bf40b0e4
📒 Files selected for processing (8)
- scripts/eval/lddt_evaluation_script.py
- scripts/eval/rscc_grid_search_script.py
- src/sampleworks/eval/eval_dataclasses.py
- src/sampleworks/eval/grid_search_eval_utils.py
- src/sampleworks/eval/occupancy_utils.py
- src/sampleworks/eval/rscc_grid_search_analysis.py
- src/sampleworks/eval/structure_utils.py
- tests/eval/test_structure_utils.py
💤 Files with no reviewable changes (1)
- src/sampleworks/eval/rscc_grid_search_analysis.py
🚧 Files skipped from review as they are similar to previous changes (2)
- src/sampleworks/eval/grid_search_eval_utils.py
- src/sampleworks/eval/eval_dataclasses.py
```python
if occ_list is None:
    occ_list = [{"A": 0.0, "B": 1.0}, {"A": 1.0, "B": 0.0}]
```
The default occupancy sweep still hard-codes exactly two altlocs.
get_reference_structure_coords() is now part of the generalized altloc path, but its default still only loads {"A": 0.0, "B": 1.0} and {"A": 1.0, "B": 0.0}. scripts/eval/rscc_grid_search_script.py uses this default path, so any protein with a C/D/… altloc never contributes to the reference mask and the RSCC region is built from incomplete coordinates. Please either require occ_list explicitly here or derive it from the available reference occupancies instead of assuming two endpoints.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/sampleworks/eval/structure_utils.py` around lines 464 - 465,
get_reference_structure_coords currently defaults occ_list to two hard-coded
endpoints [{"A":0.0,"B":1.0}, {"A":1.0,"B":0.0}] which misses additional
altlocs; change the function to either require occ_list be passed or, better,
derive occ_list from the reference structure’s altloc set (e.g., inspect the
reference occupancy keys or altloc identifiers present and generate a sweep of
occupancy dicts covering all present altlocs) so that all altlocs (A/B/C/...)
are considered when building the reference mask used by scripts like
rscc_grid_search_script.py; update the get_reference_structure_coords signature
and callers to ensure occ_list is provided or autogenerated from the structure’s
available occupancies.
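Deriving the sweep from the structure's own altloc labels, as the prompt suggests, could look like the following sketch (the helper name and signature are hypothetical — it just builds one "endpoint" occupancy dict per label):

```python
def occ_sweep_from_altlocs(altlocs) -> list[dict[str, float]]:
    """One occupancy dict per altloc: that altloc at 1.0, all others at 0.0."""
    # De-duplicate, drop blank altloc ids, and sort for deterministic order.
    labels = sorted({str(a).upper() for a in altlocs if str(a).strip()})
    return [
        {label: (1.0 if label == hot else 0.0) for label in labels}
        for hot in labels
    ]
```

For a two-altloc reference this reproduces the current hard-coded default; for an A/B/C structure it adds the third endpoint automatically instead of silently dropping C from the reference mask.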
leaving this one for now.
@marcuscollins, understood — I'll leave this one for now. Feel free to revisit the hard-coded two-altloc default in get_reference_structure_coords in a future PR when it becomes relevant.
(╯°□°)╯
🧠 Learnings used
Learnt from: CR
Repo: diff-use/sampleworks PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-04T20:19:01.356Z
Learning: Applies to src/sampleworks/**/*.py : Use AtomReconciler to handle atom count mismatches between model and structure representations
Learnt from: k-chrispens
Repo: diff-use/sampleworks PR: 73
File: src/sampleworks/core/samplers/protocol.py:1-18
Timestamp: 2026-02-02T19:13:51.594Z
Learning: In Python files across the repo, do not suggest or add inline or file-level suppressions for Ruff F722 (jaxtyping dimension strings). Instead rely on a global ignore configured in pyproject.toml (as done by the sampleworks project). Verify that F722 is included in the global ignore, and avoid recommending per-file # ruff: noqa: F722 directives unless explicitly documented as an exception.
Learnt from: marcuscollins
Repo: diff-use/sampleworks PR: 132
File: src/sampleworks/utils/guidance_script_utils.py:585-586
Timestamp: 2026-03-05T16:30:40.514Z
Learning: In Python code, if enums subclassing StrEnum are used (e.g., GuidanceType, StructurePredictor), they serialize to plain strings with json.dump and pickle without special handling. Do not flag these as non-serializable in code reviews. Treat StrEnum values as strings for JSON serialization and ensure tests cover that behavior; no extra pickle handling needed.
marcuscollins
left a comment
There are a couple CodeRabbit comments that you should probably double check, although I don't think they're big issues. Otherwise looks okay. I will leave it to you to judge and merge when ready.
```
Returns
-------
tuple[str, dict[str, float]]
    ``(protein, altloc_occupancies)`` where *altloc_occupancies* maps
```
Oh, I discovered that and meant to remove this comment. Sorry.
```python
if occ_list is None:
    occ_list = [{"A": 0.0, "B": 1.0}, {"A": 1.0, "B": 0.0}]
```
leaving this one for now.
If we're going to run this script ever again, it would be worth having the occupancy_to_str function generalized to more altlocs, so I did that and updated the scripts. This will now not be compatible with the old 6B8X occupancy strings, which we have abandoned with the new grid search set anyway.
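The generalized occupancy string can be exercised with a small sketch. The function below is illustrative only — it follows the canonicalize-first ordering CodeRabbit proposed, not necessarily the merged implementation:

```python
def occupancy_to_str(altloc_occupancies: dict[str, float]) -> str:
    """Serialize altloc occupancies as e.g. '0.3occA_0.7occB' (sketch)."""
    # Round once up front so validation and the output string agree.
    canonical = {
        str(label).upper(): round(float(occ), 2)
        for label, occ in altloc_occupancies.items()
    }
    # Zero-occupancy altlocs are omitted from the filename string.
    canonical = {k: v for k, v in canonical.items() if abs(v) > 1e-6}
    if not canonical:
        raise ValueError("At least one altloc must have non-zero occupancy")
    if any(v < 0 or v > 1 for v in canonical.values()):
        raise ValueError(f"Occupancies must lie in [0, 1]: {canonical}")
    total = sum(canonical.values())
    if total > 1 + 1e-6:
        raise ValueError(f"Occupancies sum to {total}: {canonical}")
    return "_".join(f"{canonical[k]}occ{k}" for k in sorted(canonical))

# Any number of altlocs works, unlike the abandoned two-altloc 6B8X strings.
```

Because the same rounded mapping feeds both the sum check and the string, inputs whose rounded values exceed 1.0 are rejected instead of silently serializing to an impossible total.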
Summary by CodeRabbit
Breaking Changes
Refactor
Bug Fixes
Tests