fix: quick patch to enforce that the ref chain is the one assigned to…#237
Conversation
📝 WalkthroughWalkthroughThis PR makes two independent changes: it expands ChangesOutput handling improvements
🎯 2 (Simple) | ⏱️ ~8 minutes
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
scripts/patch_output_cif_files.py (1)
113-115: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winAdd NumPy-style docstring to document behavior and limitations.
As per coding guidelines, every function should include a NumPy-style docstring. This is particularly important here given the chain mismatch handling and the TODO indicating broken multi-chain behavior.
📝 Suggested docstring structure
def patch_individual_cif_file( cif_file: Path, rcsb_regex: str, reference_dir: Path, input_pdb_pattern: str -) -> str | None: # returns an error message if there was one +) -> str | None: + """ + Patch a CIF file with metadata and residue numbering from reference structure. + + Parameters + ---------- + cif_file : Path + Path to the CIF file to patch. + rcsb_regex : str + Regex pattern to extract RCSB ID from the CIF file path. + reference_dir : Path + Directory containing reference PDB structures. + input_pdb_pattern : str + Pattern to locate input PDB file, with {pdb_id} placeholder. + + Returns + ------- + str | None + Error message if patching failed, None if successful. + + Notes + ----- + Chain mismatch handling is currently broken for multi-chain structures. + When chain IDs don't match between reference and CIF, an error is logged + but processing continues, which may result in incorrect residue mapping. + """As per coding guidelines, Python files should always include NumPy-style docstrings for every function and class.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/patch_output_cif_files.py` around lines 113 - 115, Add a NumPy-style docstring to the function patch_individual_cif_file describing its purpose, parameters (cif_file: Path, rcsb_regex: str, reference_dir: Path, input_pdb_pattern: str), return value (str | None indicating an error message or None), and key behavior/limitations including how chain-mismatch handling works and the current TODO about broken multi-chain behavior; ensure the docstring documents side-effects, exceptions raised (if any), and any preconditions/assumptions so future readers understand the chain-mismatch edge case and known limitation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/patch_output_cif_files.py`:
- Around line 162-164: The code currently comments out the early "return msg" on
chain ID mismatch which lets execution continue and later updates
asym_unit.res_id using a possibly incorrect mapping; restore proper error
propagation by reinstating the original early return behavior (return msg) in
the chain-mismatch detection so callers receive the error string, or
alternatively change the function signature to return a tuple like (None,
warning_msg) / (error_msg, None) and update callers accordingly; locate the
chain mismatch check and the mapping code that leads to asym_unit.res_id to
apply this fix.
---
Outside diff comments:
In `@scripts/patch_output_cif_files.py`:
- Around line 113-115: Add a NumPy-style docstring to the function
patch_individual_cif_file describing its purpose, parameters (cif_file: Path,
rcsb_regex: str, reference_dir: Path, input_pdb_pattern: str), return value (str
| None indicating an error message or None), and key behavior/limitations
including how chain-mismatch handling works and the current TODO about broken
multi-chain behavior; ensure the docstring documents side-effects, exceptions
raised (if any), and any preconditions/assumptions so future readers understand
the chain-mismatch edge case and known limitation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a28e2a66-7930-476d-8603-cd544f67ceb2
📒 Files selected for processing (2)
.gitignorescripts/patch_output_cif_files.py
There was a problem hiding this comment.
Pull request overview
This PR tweaks the CIF post-processing patch script to no longer abort on reference/output chain mismatches, and updates .gitignore patterns to more broadly ignore large run-result directories.
Changes:
- Disable early-return failure on chain mismatches during residue remapping in
patch_output_cif_files.py. - Broaden
.gitignorepatterns for run outputs (grid_search_results*,output*) to avoid accidentally committing large result folders.
Reviewed changes
Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| scripts/patch_output_cif_files.py | Stops returning an error on chain mismatch during residue remapping (continues patching). |
| .gitignore | Expands ignored run-result directory patterns to match more output folder name variants. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if cif_key[0] != ref_key[0]: | ||
| msg = f"Chain mismatch while remapping residues for {cif_path} vs {reference_path}" | ||
| logger.error(msg) | ||
| return msg | ||
| # return msg | ||
| # TODO: fix chain mismatches upstream (protenix json creation needs update) |
There was a problem hiding this comment.
This is fine for now
marcuscollins
left a comment
There was a problem hiding this comment.
Approved but please do not forget to file that issue and follow up to get it fixed, otherwise this will cause us other problems down the road.
| logger.error(msg) | ||
| return msg | ||
| # return msg | ||
| # TODO: fix chain mismatches upstream (protenix json creation needs update) |
There was a problem hiding this comment.
Can you make sure to add an issue, please mark it as a bug actually, Can Michael or Justin take this one on?
There was a problem hiding this comment.
Added it! And probably yes. Will add tags now
… the cif file
Summary by CodeRabbit