
Add AI contributor skills, per-script structure, and CI pipeline#2

Merged
ypriverol merged 16 commits into main from add/ai-contributor-skills
Mar 24, 2026

Conversation

@timosachsenberg (Contributor) commented Mar 24, 2026

Summary

  • Migrate 7 pyopenms scripts from feature branch into per-script directories, each with its own requirements.txt, README.md, tests/conftest.py, and test file (28 tests total, all passing)
  • Add ruff.toml (E/F/W/I rules, line-length 120) and ensure all scripts pass lint
  • Add two Claude Code skills: contribute-script (guided new tool creation) and validate-script (isolated venv validation)
  • Add AGENTS.md — platform-agnostic contributor guide for any AI agent (Copilot, Cursor, Gemini, etc.)
  • Add GitHub Actions CI (.github/workflows/validate.yml) that detects changed scripts on PR and validates each in a parallel isolated venv
  • Update CLAUDE.md and README.md to reflect the new structure
  • Fix deprecated pyopenms 3.5.0 APIs in feature_detection_proteomics and metabolite_feature_detection

Test plan

  • All 7 scripts pass ruff check with zero errors
  • All 28 tests pass via PYTHONPATH=<script> python -m pytest <script>/tests/ -v
  • CI workflow triggers correctly on a PR that touches scripts/
  • Verify contribute-script skill works end-to-end by adding a new tool
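
The per-script invocation above can be looped over the whole tree. A minimal sketch under the stated layout assumption (`scripts/<domain>/<name>/tests/`); the `run_all_tests` helper is illustrative, not part of the repo:

```python
import os
import pathlib
import subprocess
import sys

def run_all_tests(root="scripts"):
    """Run each script's test suite with PYTHONPATH set to its own directory."""
    failures = []
    for tests_dir in sorted(pathlib.Path(root).glob("*/*/tests")):
        script_dir = tests_dir.parent
        # Mirrors: PYTHONPATH=<script> python -m pytest <script>/tests/ -v
        env = dict(os.environ, PYTHONPATH=str(script_dir))
        result = subprocess.run(
            [sys.executable, "-m", "pytest", str(tests_dir), "-v"], env=env
        )
        if result.returncode != 0:
            failures.append(str(script_dir))
    return failures  # empty list means every suite passed
```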

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added 7 new pyopenms CLI tools for metabolomics and proteomics: isotope pattern matching, mass accuracy calculation, metabolite feature detection, proteomics feature detection, peptide mass calculation, protein digestion, and spectrum file analysis.
  • Documentation

    • Added contributor guidelines, Claude Code integration instructions, and repository setup documentation for AI-assisted tool development.
  • Chores

    • Added linting configuration, automated validation workflow, and skill templates for contributing and validating new tools.

timosachsenberg and others added 16 commits March 24, 2026 20:07
Defines per-script directory structure, Claude Code skills (contribute-script,
validate-script), AGENTS.md contributor guide, and GitHub Actions CI for
validating pyopenms script contributions in isolated venvs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…mplate

- Resolve PYTHONPATH vs sys.path contradiction: PYTHONPATH is primary,
  conftest.py provides fallback for direct pytest invocation
- Add concrete conftest.py template to reduce implementation ambiguity
- Clarify migration: copy files from feature branch, decompose monolithic
  test files and shared requirements.txt into per-script structure
- Use consistent language for branch migration (copy + restructure)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
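
The conftest.py template described above might look like this minimal sketch (the exact file in the repo may differ):

```python
# tests/conftest.py - fallback sys.path setup plus a pyopenms availability flag.
import os
import sys

# PYTHONPATH=<script-dir> is the primary mechanism; this insert is the fallback
# so that invoking pytest directly on <script>/tests/ still resolves the module.
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))

try:
    import pyopenms  # noqa: F401
    HAS_PYOPENMS = True
except ImportError:
    HAS_PYOPENMS = False
```

Tests can then guard themselves with `pytest.mark.skipif(not HAS_PYOPENMS, reason="pyopenms not installed")`.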
…line

14 tasks covering: ruff.toml, migration of 7 scripts to per-script
directories, Claude Code skills, AGENTS.md, GitHub Actions CI, and
documentation updates. Reviewed and approved with minor fixes applied.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…t data

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…thetic test data

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…thetic test data

Fix FeatureFindingMetabo.run() API change in pyopenms >= 3.x: third argument
is now output_chromatograms list, not a chrom_fwhm float.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- validate-script: isolated venv validation (ruff + pytest) for any script
- contribute-script: guided workflow for creating new pyopenms tools

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Detects changed script directories on PR, validates each in parallel
with its own venv: pip install from script's requirements.txt, ruff lint,
pytest. Uses Python 3.11 for pyopenms wheel availability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add 'run all tests' one-liner to CLAUDE.md
- Fix feature_detection_proteomics README to reference
  FeatureFinderAlgorithmPicked (pyopenms 3.5.0+ API)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai bot commented Mar 24, 2026

📝 Walkthrough

This pull request establishes a comprehensive AI-assisted contribution and validation framework for the agentomics repository. It introduces standardized directory structure, validation procedures, and CI/CD automation for stand-alone pyopenms-based CLI tools, along with seven new functional scripts for proteomics and metabolomics analysis.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Documentation & Guidance**<br>`AGENTS.md`, `CLAUDE.md`, `README.md` | New contributor guidance defining per-script directory layout, required files, code patterns, and validation workflow. Updated README with project overview, requirements, scripts table, and validation summary. |
| **Configuration & Infrastructure**<br>`ruff.toml`, `.github/workflows/validate.yml` | Added linting rules (E, F, W, I lint categories with 120-char line length) and a GitHub Actions workflow that detects changed script directories on pull requests, then validates each via isolated venv with ruff and pytest checks. |
| **Claude Code Skills**<br>`.claude/skills/contribute-script.md`, `.claude/skills/validate-script.md` | Defined two rigid, step-by-step skills: contribute-script scaffolds new tool directories with boilerplate, and validate-script runs linting and tests in a fresh temporary virtual environment. |
| **Planning & Design Documentation**<br>`docs/superpowers/plans/2026-03-24-ai-contributor-skills.md`, `docs/superpowers/specs/2026-03-24-ai-contributor-skills-design.md` | Comprehensive implementation plan and design specification documenting the full contribution/validation infrastructure, migration workflow, and delivery requirements. |
| **Metabolomics Scripts**<br>`scripts/metabolomics/isotope_pattern_matcher/*`, `scripts/metabolomics/mass_accuracy_calculator/*`, `scripts/metabolomics/metabolite_feature_detection/*` | Three new metabolomics tools: isotope pattern generation/matching, ppm mass accuracy calculation, and metabolite feature detection from LC-MS data. Each includes script, requirements, README, conftest with conditional skip marker, and test suite. |
| **Proteomics Scripts**<br>`scripts/proteomics/feature_detection_proteomics/*`, `scripts/proteomics/peptide_mass_calculator/*`, `scripts/proteomics/protein_digest/*`, `scripts/proteomics/spectrum_file_info/*` | Four new proteomics tools: LC-MS feature detection, peptide mass and fragment ion calculation, in-silico protein digestion, and mzML file summary statistics. Each follows the same structure: script with pyopenms import guard, isolated dependencies, README documentation, and pyopenms-conditional tests. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant PR as Pull Request
    participant GHA as GitHub Actions
    participant Detect as detect-changes Job
    participant Validate as validate Job
    participant Env as Isolated venv
    participant Ruff as Ruff Linter
    participant Pytest as Pytest Runner

    PR->>GHA: Trigger on scripts/** changes
    GHA->>Detect: Run detect-changes
    Detect->>Detect: Diff against base branch
    Detect->>Detect: Extract script directories
    Detect->>GHA: Output matrix with script_dir list

    GHA->>Validate: Run validate (if has_changes)
    loop For each script_dir in matrix
        Validate->>Env: Create isolated venv
        Validate->>Env: Install script's requirements.txt
        Validate->>Env: Install pytest and ruff
        Validate->>Ruff: Run ruff check on script_dir
        Ruff->>Validate: Return lint results
        Validate->>Pytest: Run pytest on script_dir/tests/
        Pytest->>Validate: Return test results
        Validate->>Env: Cleanup venv
    end
    Validate->>PR: Report pass/fail status
```
```mermaid
sequenceDiagram
    participant Dev as Developer
    participant Claude as Claude (contribute-script)
    participant Repo as Repository
    participant Venv as Isolated venv
    participant Validate as validate-script Skill

    Dev->>Claude: Run contribute-script skill
    Claude->>Dev: Prompt for tool domain & name
    Dev->>Claude: Provide inputs
    Claude->>Repo: Create script directory structure
    Claude->>Repo: Write scaffolded script.py (with pyopenms guard)
    Claude->>Repo: Write requirements.txt, README.md
    Claude->>Repo: Create tests/conftest.py (with skip marker)
    Claude->>Repo: Write test_*.py files

    Claude->>Validate: Invoke validate-script
    Validate->>Venv: Create temporary venv
    Validate->>Venv: Install script dependencies + pytest + ruff
    Validate->>Venv: Run ruff check
    Validate->>Venv: Run pytest
    Venv->>Validate: Return lint and test results
    Validate->>Venv: Cleanup
    Validate->>Claude: Report validation outcome
    Claude->>Dev: Confirm ready for commit
```

Poem

🐰 Hops excitedly
Scripts now scaffold with a hop and a bound,
Validation venvs spin and dance all around,
Seven new tools in their neat little holes,
With tests that skip smartly and lint that controls!
Agentomics blooms—oh what joy to behold! 🌱

🚥 Pre-merge checks (2 passed, 1 warning)

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 32.69%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The title 'Add AI contributor skills, per-script structure, and CI pipeline' directly and clearly summarizes the main changes: introduction of AI skills, restructured per-script organization, and CI/CD automation. |


@coderabbitai bot left a comment

Actionable comments posted: 15

🧹 Nitpick comments (7)
scripts/proteomics/feature_detection_proteomics/requirements.txt (1)

1-1: Constrain pyopenms to a validated range.

Line 1 is unpinned; future upstream changes can break this script unexpectedly in CI and local setups.

Suggested change

```diff
-pyopenms
+pyopenms>=3.5,<4
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/proteomics/feature_detection_proteomics/requirements.txt` at line 1,
The requirements file currently lists an unpinned dependency "pyopenms"; update
that entry in requirements.txt to a validated version or range (for example pin
to the exact tested version or a bounded range like >=X.Y.Z,<A.B.C) so CI and
local environments remain reproducible; after changing the "pyopenms" line, run
the proteomics feature_detection_proteomics tests or the project's dependency
verification job to confirm the chosen version works.
scripts/proteomics/peptide_mass_calculator/requirements.txt (1)

1-1: Pin pyopenms to a tested range for reproducibility.

Line 1 is currently unbounded, so CI behavior can drift as new releases land. Please constrain this to a validated range.

Suggested change

```diff
-pyopenms
+pyopenms>=3.5,<4
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/proteomics/peptide_mass_calculator/requirements.txt` at line 1,
Replace the unbounded pyopenms entry in requirements.txt with a pinned, tested
version range (or exact version) to ensure reproducible CI; update the single
line "pyopenms" to a constrained specification such as a specific version or a
range that your tests validate (for example, pyopenms>=2.7.0,<2.8.0) and ensure
this matches the version used in your CI/test matrix.
scripts/proteomics/feature_detection_proteomics/tests/test_feature_detection_proteomics.py (1)

32-34: Assert persisted output content, not just path existence.

This test would be stronger if it verifies the written featureXML can be loaded and matches the returned map size, instead of only checking that the file exists.

🔧 Suggested test hardening

```diff
             fm = detect_features(input_path, output_path)
             assert isinstance(fm, oms.FeatureMap)
             assert os.path.exists(output_path)
+            reloaded = oms.FeatureMap()
+            oms.FeatureXMLFile().load(output_path, reloaded)
+            assert reloaded.size() == fm.size()
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@scripts/proteomics/feature_detection_proteomics/tests/test_feature_detection_proteomics.py`
around lines 32 - 34, The test currently only checks that the output_path file
exists; instead, load the written featureXML into a new oms.FeatureMap (use
oms.FeatureXMLFile().load or the appropriate loader) and assert the loaded map
is an oms.FeatureMap and that its feature count matches the returned fm (compare
fm.size() or len(fm) to loaded_map.size() or len(loaded_map)); replace the
assert os.path.exists(output_path) with these checks so the persisted content is
validated against the returned fm from detect_features.
scripts/metabolomics/mass_accuracy_calculator/tests/test_mass_accuracy_calculator.py (1)

9-40: Please add invalid-charge tests.

Given the public API takes charge: int, add coverage for charge <= 0 to protect against division-by-zero/regression paths.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@scripts/metabolomics/mass_accuracy_calculator/tests/test_mass_accuracy_calculator.py`
around lines 9 - 40, Add tests that pass invalid charges (0 and a negative int)
to the public APIs theoretical_mz_from_sequence and theoretical_mz_from_formula
to guard against division-by-zero/regression; for each function call with charge
0 and charge -1 assert that the call raises a clear exception (e.g. ValueError)
using pytest.raises, and include both sequence- and formula-based paths
(theoretical_mz_from_sequence("PEPTIDEK", 0),
theoretical_mz_from_sequence("PEPTIDEK", -1),
theoretical_mz_from_formula("C6H12O6", 0),
theoretical_mz_from_formula("C6H12O6", -1)) so the test suite ensures
invalid-charge handling is enforced.
scripts/proteomics/spectrum_file_info/tests/test_spectrum_file_info.py (1)

25-57: Add a regression test for spectra that have no peaks.

Current tests don’t cover the edge case where spectra exist but get_peaks() is empty. That case is important for validating tic_per_spectrum length and mz_range fallback behavior.

Suggested test addition

```diff
+    def test_summarise_spectra_with_no_peaks(self):
+        import pyopenms as oms
+        from spectrum_file_info import summarise_experiment
+
+        exp = oms.MSExperiment()
+        for i in range(3):
+            spec = oms.MSSpectrum()
+            spec.setMSLevel(1)
+            spec.setRT(float(i))
+            exp.addSpectrum(spec)
+
+        summary = summarise_experiment(exp)
+        assert summary["n_spectra"] == 3
+        assert len(summary["tic_per_spectrum"]) == 3
+        assert all(v == 0.0 for v in summary["tic_per_spectrum"])
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/proteomics/spectrum_file_info/tests/test_spectrum_file_info.py`
around lines 25 - 57, Add a regression test that builds an MSExperiment
containing spectra whose get_peaks() returns empty (i.e., spectra with no
peaks), call summarise_experiment on it, then assert the experiment-level
counters still report the correct number of spectra (summary["n_spectra"] == n)
and that len(summary["tic_per_spectrum"]) == n to ensure TIC is recorded per
spectrum, and verify the mz_range fallback is handled by asserting
summary["mz_range"] is falsy (e.g., None or an empty/None tuple) when no peaks
exist; reference summarise_experiment, tic_per_spectrum, get_peaks(), and
mz_range to locate the behavior to test.
docs/superpowers/plans/2026-03-24-ai-contributor-skills.md (1)

64-68: Consider adding language specifiers to fenced code blocks.

Several code blocks (e.g., lines 66-68 for requirements.txt content) lack language specifiers. While this doesn't affect execution, adding txt or text improves markdown rendering and satisfies linters.

📝 Example fix

````diff
- ```
+ ```txt
  pyopenms
````
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @docs/superpowers/plans/2026-03-24-ai-contributor-skills.md around lines 64 -
68, Update the fenced code block under "Step 3: Create requirements.txt" that
currently contains "pyopenms" so it includes a language specifier (e.g., change
the opening ``` to ```txt or ```text) to improve markdown rendering and satisfy
linters; locate the fenced block associated with the "Step 3: Create
requirements.txt" heading and add the specifier to the opening backticks.


scripts/metabolomics/isotope_pattern_matcher/isotope_pattern_matcher.py (1)

82-89: Consider using closest match instead of first match within tolerance.

The current implementation takes the first observed peak within the m/z tolerance window (line 86 `break`). If multiple observed peaks fall within the tolerance, using the closest match would be more accurate. However, for typical isotope patterns where peaks are well-separated (>1 Da), this is unlikely to cause issues.


♻️ Optional enhancement for closest-match selection

```diff
     for tmz, tint in theoretical:
         matched = 0.0
+        best_delta = float('inf')
         for omz, oint in observed:
-            if abs(omz - tmz) <= mz_tolerance:
+            delta = abs(omz - tmz)
+            if delta <= mz_tolerance and delta < best_delta:
+                best_delta = delta
                 matched = oint
-                break
         theo_vec.append(tint)
         obs_vec.append(matched)
```
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/metabolomics/isotope_pattern_matcher/isotope_pattern_matcher.py`
around lines 82 - 89, The loop that matches theoretical peaks to observed picks
the first observed peak within mz_tolerance (variables tmz/tint and omz/oint
with the early break), which can select a non-closest match when multiple
observed peaks fall inside the window; change the logic in the matching block so
you search all observed peaks for the minimum abs(omz - tmz) and only accept it
if that minimum <= mz_tolerance (leave matched = 0.0 if none found), then append
tint and the chosen matched intensity to theo_vec and obs_vec — update the
matching code that currently sets matched and uses break to instead compute and
use the closest match.


🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/contribute-script.md:

  • Around line 75-80: The snippet uses sys.exit in the ImportError handler but
    never imports sys, which will raise NameError on failure; add an import sys at
    the top of the block so the try/except around importing pyopenms (import
    pyopenms as oms) can call sys.exit(...) safely when ImportError is caught.

In @.claude/skills/validate-script.md:

  • Around line 21-30: Add a trap-based cleanup to guarantee removal of the
    temporary virtualenv: immediately after VENV_DIR=$(mktemp -d) register trap 'rm
    -rf "$VENV_DIR"' EXIT (so it runs on any exit/failure) and delete the final
    explicit rm -rf "$VENV_DIR" line at the end; ensure the trap references the same
    VENV_DIR variable and uses the EXIT signal so the venv is removed on both
    success and failure.

In @.github/workflows/validate.yml:

  • Around line 24-27: The CHANGED assignment pipeline can fail when grep finds no
    matches; modify the command that pipes to grep (the part using grep -oP
    'scripts/[^/]+/[^/]+/') to add a fallback (append || true) so grep's non-zero
    exit doesn't break the step, ensuring CHANGED becomes an empty list instead of
    causing the job to fail; update the pipeline around the CHANGED variable
    assignment and keep the rest of the processing (sort -u | jq -R -s -c ...)
    unchanged.

In @docs/superpowers/specs/2026-03-24-ai-contributor-skills-design.md:

  • Around line 28-37: The conftest.py template currently combines imports on one
    line; update the template so imports follow PEP8/ruff isort style by splitting
    "import sys, os, pytest" into separate, ordered imports (import os; import sys;
    import pytest) with appropriate blank lines, keeping the existing
    sys.path.insert(0, os.path.join(os.path.dirname(file), "..")) and preserving
    the try/except that sets HAS_PYOPENMS; modify the template's import block to
    match the actual conftest.py pattern used in
    scripts/metabolomics/metabolite_feature_detection/tests/conftest.py and ensure
    symbols like HAS_PYOPENMS and sys.path.insert remain unchanged.

In @scripts/metabolomics/isotope_pattern_matcher/isotope_pattern_matcher.py:

  • Around line 53-56: The code computes max_ab = max(ab for _, ab in peaks) and
    divides by it, which can raise ZeroDivisionError if all intensities are zero;
    fix by checking if max_ab == 0 and in that case return the peaks with normalized
    intensity 0.0 (e.g. return [(mz, 0.0) for mz, _ in peaks]) instead of performing
    the division; update the branch that currently builds the normalized list (the
    list comprehension using mz, ab in peaks) to handle this zero-max_ab case.

In @scripts/metabolomics/mass_accuracy_calculator/mass_accuracy_calculator.py:

  • Around line 37-55: Validate and reject non-positive charge values to avoid
    divide-by-zero: in the importable helper theoretical_mz_from_sequence(sequence:
    str, charge: int) add an explicit guard that raises a clear exception (e.g.,
    ValueError) when charge <= 0, and apply the same validation to any other helper
    functions that compute (mass + charge * PROTON) / charge; additionally enforce
    and validate the CLI argument for --charge so it cannot accept 0 or negative
    values (use argparse type/check or a custom validator) and return a helpful
    message for users when an invalid charge is provided.
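
A minimal sketch of the charge guard described above (the `mz_from_mass` name and the approximate proton constant are illustrative; the repo's helpers wrap pyopenms):

```python
PROTON = 1.007276  # approximate proton mass in Da

def mz_from_mass(neutral_mass: float, charge: int) -> float:
    """Compute m/z for a charge state, rejecting charge <= 0 before dividing."""
    if not isinstance(charge, int) or charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON) / charge
```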

In
@scripts/metabolomics/metabolite_feature_detection/metabolite_feature_detection.py:

  • Around line 118-122: The current output default logic (declared in
    parser.add_argument("--output", ...)) incorrectly documents ".featureXML"
    and uses a brittle string replace that fails for inputs without a .mzML
    extension or with different casing; change the code that computes the default
    output (where args.output is set/constructed around line 139) to use
    pathlib.Path: derive output by taking Path(args.input), if path.suffix.lower()
    == ".mzml" replace the suffix with "_metabolites.featureXML", otherwise append
    "_metabolites.featureXML" to the stem; also update the parser help text to
    reflect the actual default naming (e.g., "(default:
    _metabolites.featureXML)").
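
The pathlib-based default suggested above could look like this sketch (`default_output_path` is a hypothetical helper name):

```python
from pathlib import Path

def default_output_path(input_path: str) -> Path:
    """Derive '<stem>_metabolites.featureXML' from the input, case-insensitively."""
    p = Path(input_path)
    if p.suffix.lower() == ".mzml":
        return p.with_name(p.stem + "_metabolites.featureXML")
    # No recognised extension: append rather than replace.
    return p.with_name(p.name + "_metabolites.featureXML")
```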

In
@scripts/proteomics/feature_detection_proteomics/feature_detection_proteomics.py:

  • Around line 17-22: The module currently calls sys.exit() in the module-level
    try/except that imports pyopenms (the try: import pyopenms as oms except
    ImportError: sys.exit(...)), which will terminate the interpreter on import;
    instead either move the pyopenms import and dependency check into the main()
    function so missing dependency only impacts CLI invocation, or replace the
    sys.exit() with raising ImportError (e.g., re-raise or raise ImportError with
    the same message) so importers/tests can catch it; update the import usage to
    reference oms after the deferred import or ensure tests that import this module
    can handle the ImportError from the module-level raise.
  • Line 93: The default output path calculation for output_path (using
    args.output or args.input.replace(".mzML", ".featureXML")) is case-sensitive and
    can produce the same path as args.input (e.g., "sample.mzml" or no matching
    suffix), risking overwrite; update the logic that sets output_path in
    feature_detection_proteomics.py to perform a case-insensitive extension swap
    (use os.path.splitext on args.input and append ".featureXML") and ensure the
    resulting output_path is different from args.input (if identical, add a suffix
    like ".featureXML" or ".featureXML.1" to avoid clobbering); reference the
    output_path assignment where args.output and args.input are used.

In @scripts/proteomics/peptide_mass_calculator/peptide_mass_calculator.py:

  • Around line 32-59: In peptide_masses, validate the charge parameter before
    performing m/z calculations: in the function peptide_masses(sequence: str,
    charge: int = 1) check that charge is a positive integer (e.g. if not
    isinstance(charge, int) or charge <= 0) and raise a clear ValueError such as
    "charge must be a positive integer" to prevent ZeroDivisionError when computing
    mz_monoisotopic and mz_average using PROTON.
  • Around line 22-27: Current module-level sys.exit on ImportError terminates the
    whole test run; replace the immediate exit with deferred error handling by
    capturing the optional import (set oms = None) and adding a helper function
    _require_pyopenms() that attempts to import/validate pyopenms or raises
    ImportError with the same install message; then call _require_pyopenms() at the
    start of the public functions peptide_masses() and fragment_ions() so the
    ImportError is raised at runtime and the @requires_pyopenms test decorator can
    skip tests instead of killing the process.

In @scripts/proteomics/protein_digest/protein_digest.py:

  • Around line 38-44: The digest_protein function currently allows invalid
    digestion params; add explicit validation at the top of digest_protein to ensure
    missed_cleavages is a non-negative integer and min_length and max_length are
    positive integers with min_length <= max_length (raise ValueError with clear
    message on invalid input), and clamp or reject values outside sensible bounds;
    apply the same validation logic to the other digestion functions that accept the
    same parameters (any other functions that take missed_cleavages, min_length,
    max_length in this module) so invalid inputs are caught before digestion logic
    runs.
  • Around line 62-64: The docstring describing the returned list-of-dicts is out
    of sync: it mentions keys start, end, and missed_cleavages while the
    actual payload contains sequence, length, and monoisotopic_mass. Update
    the function's return docstring (the docstring block that currently lists
    sequence, start, end, monoisotopic_mass, missed_cleavages) to
    accurately describe the actual returned keys (sequence, length,
    monoisotopic_mass) or alternatively modify the function that builds the dicts
    so they include start, end, and missed_cleavages; ensure the docstring and
    the returned dict structure (produced by the code that constructs each peptide
    dict) match exactly.
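
The parameter validation asked for in the first comment above could be factored out as a sketch like this (`validate_digest_params` is an illustrative helper name):

```python
def validate_digest_params(missed_cleavages: int, min_length: int, max_length: int) -> None:
    """Reject parameter combinations that would make digestion meaningless."""
    if not isinstance(missed_cleavages, int) or missed_cleavages < 0:
        raise ValueError("missed_cleavages must be a non-negative integer")
    if not isinstance(min_length, int) or min_length <= 0:
        raise ValueError("min_length must be a positive integer")
    if not isinstance(max_length, int) or max_length < min_length:
        raise ValueError("max_length must be an integer >= min_length")
```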

In @scripts/proteomics/spectrum_file_info/spectrum_file_info.py:

  • Around line 4-5: The docstring/help claims support for "mzML (or mzXML)" but
    the loader call MzMLFile().load(...) only handles mzML; update the code to make
    the behavior match the docs by either changing the top-level help text to "mzML
    only" or adding explicit mzXML handling: detect the input file extension inside
    spectrum_file_info.py and call the appropriate loader (e.g., use
    MzXMLFile().load(...) for .mzXML files) so that the loader selection (MzMLFile
    vs MzXMLFile) aligns with the advertised formats.
  • Around line 56-70: summarise_experiment currently skips spectra with zero
    peaks which misaligns tic_per_spectrum and can leave mz_min/mz_max as inf/-inf;
    change the loop that calls spec.get_peaks() so you always append a TIC value
    (append 0.0 when intensities is empty) and only update mz_min/mz_max when peaks
    exist, then after the loop replace any leftover inf/-inf mz bounds with a safe
    default (e.g., (0.0, 0.0) or None) before returning; apply the same fix in the
    duplicate block around the second occurrence (lines noted as 124-126) so
    functions/methods summarise_experiment, spec.get_peaks(), mz_min, mz_max, and
    tic_per_spectrum are updated accordingly.
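
The empty-spectrum fix above can be sketched over plain peak lists, stand-ins for what `spec.get_peaks()` returns (`summarise_peaks` and its input shape are illustrative):

```python
def summarise_peaks(spectra_peaks):
    """spectra_peaks: list of (mz_list, intensity_list) pairs, one per spectrum."""
    tic_per_spectrum = []
    mz_min, mz_max = float("inf"), float("-inf")
    for mzs, intensities in spectra_peaks:
        tic_per_spectrum.append(float(sum(intensities)))  # 0.0 for empty spectra
        if mzs:  # only update m/z bounds when peaks exist
            mz_min = min(mz_min, min(mzs))
            mz_max = max(mz_max, max(mzs))
    # Replace the inf/-inf sentinels with a safe default when no peaks exist.
    mz_range = (mz_min, mz_max) if mz_min <= mz_max else None
    return {
        "n_spectra": len(spectra_peaks),
        "tic_per_spectrum": tic_per_spectrum,
        "mz_range": mz_range,
    }
```

This keeps `tic_per_spectrum` aligned with the spectrum count and avoids returning infinite m/z bounds.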

Nitpick comments:
In @docs/superpowers/plans/2026-03-24-ai-contributor-skills.md:

  • Around line 64-68: Update the fenced code block under "Step 3: Create
    requirements.txt" that currently contains "pyopenms" so it includes a language
    specifier (e.g., change the opening ``` to ```txt or ```text) to improve
    markdown rendering and satisfy linters; locate the fenced block associated
    with the "Step 3: Create requirements.txt" heading and add the specifier to
    the opening backticks.

In @scripts/metabolomics/isotope_pattern_matcher/isotope_pattern_matcher.py:

  • Around line 82-89: The loop that matches theoretical peaks to observed picks
    the first observed peak within mz_tolerance (variables tmz/tint and omz/oint
    with the early break), which can select a non-closest match when multiple
    observed peaks fall inside the window; change the logic in the matching block so
    you search all observed peaks for the minimum abs(omz - tmz) and only accept it
    if that minimum <= mz_tolerance (leave matched = 0.0 if none found), then append
    tint and the chosen matched intensity to theo_vec and obs_vec — update the
    matching code that currently sets matched and uses break to instead compute and
    use the closest match.

In
@scripts/metabolomics/mass_accuracy_calculator/tests/test_mass_accuracy_calculator.py:

  • Around line 9-40: Add tests that pass invalid charges (0 and a negative int)
    to the public APIs theoretical_mz_from_sequence and theoretical_mz_from_formula
    to guard against division-by-zero/regression; for each function call with charge
    0 and charge -1 assert that the call raises a clear exception (e.g. ValueError)
    using pytest.raises, and include both sequence- and formula-based paths
    (theoretical_mz_from_sequence("PEPTIDEK", 0),
    theoretical_mz_from_sequence("PEPTIDEK", -1),
    theoretical_mz_from_formula("C6H12O6", 0),
    theoretical_mz_from_formula("C6H12O6", -1)) so the test suite ensures
    invalid-charge handling is enforced.

In @scripts/proteomics/feature_detection_proteomics/requirements.txt:

  • Line 1: The requirements file currently lists an unpinned dependency
    "pyopenms"; update that entry in requirements.txt to a validated version or
    range (for example pin to the exact tested version or a bounded range like
    >=X.Y.Z,<A.B.C) so CI and local environments remain reproducible; after
    changing the "pyopenms" line, run the proteomics feature_detection_proteomics
    tests or the project's dependency verification job to confirm the chosen
    version works.

In
@scripts/proteomics/feature_detection_proteomics/tests/test_feature_detection_proteomics.py:

  • Around line 32-34: The test currently only checks that the output_path file
    exists; instead, load the written featureXML into a new oms.FeatureMap (use
    oms.FeatureXMLFile().load or the appropriate loader) and assert the loaded map
    is an oms.FeatureMap and that its feature count matches the returned fm (compare
    fm.size() or len(fm) to loaded_map.size() or len(loaded_map)); replace the
    assert os.path.exists(output_path) with these checks so the persisted content is
    validated against the returned fm from detect_features.

In @scripts/proteomics/peptide_mass_calculator/requirements.txt:

  • Line 1: Replace the unbounded pyopenms entry in requirements.txt with a
    pinned, tested version range (or exact version) to ensure reproducible CI;
    update the single line "pyopenms" to a constrained specification such as a
    specific version or a range that your tests validate (for example,
    pyopenms>=2.7.0,<2.8.0) and ensure this matches the version used in your CI/test
    matrix.

In @scripts/proteomics/spectrum_file_info/tests/test_spectrum_file_info.py:

  • Around line 25-57: Add a regression test that builds an MSExperiment
    containing spectra whose get_peaks() returns empty (i.e., spectra with no
    peaks), call summarise_experiment on it, then assert the experiment-level
    counters still report the correct number of spectra (summary["n_spectra"] == n)
    and that len(summary["tic_per_spectrum"]) == n to ensure TIC is recorded per
    spectrum, and verify the mz_range fallback is handled by asserting
    summary["mz_range"] is falsy (e.g., None or an empty/None tuple) when no peaks
    exist; reference summarise_experiment, tic_per_spectrum, get_peaks(), and
    mz_range to locate the behavior to test.

</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: Organization UI

**Review profile**: CHILL

**Plan**: Pro

**Run ID**: `99ad1eec-e6f7-4632-9c4b-a6b7038cfb99`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between a7ade618be33016aa93f8fed05ea900091fbf767 and 6ca1237b84940522ea34f992d9d1f1c108cb0b5e.

</details>

<details>
<summary>📒 Files selected for processing (44)</summary>

* `.claude/skills/contribute-script.md`
* `.claude/skills/validate-script.md`
* `.github/workflows/validate.yml`
* `AGENTS.md`
* `CLAUDE.md`
* `README.md`
* `docs/superpowers/plans/2026-03-24-ai-contributor-skills.md`
* `docs/superpowers/specs/2026-03-24-ai-contributor-skills-design.md`
* `ruff.toml`
* `scripts/metabolomics/isotope_pattern_matcher/README.md`
* `scripts/metabolomics/isotope_pattern_matcher/isotope_pattern_matcher.py`
* `scripts/metabolomics/isotope_pattern_matcher/requirements.txt`
* `scripts/metabolomics/isotope_pattern_matcher/tests/conftest.py`
* `scripts/metabolomics/isotope_pattern_matcher/tests/test_isotope_pattern_matcher.py`
* `scripts/metabolomics/mass_accuracy_calculator/README.md`
* `scripts/metabolomics/mass_accuracy_calculator/mass_accuracy_calculator.py`
* `scripts/metabolomics/mass_accuracy_calculator/requirements.txt`
* `scripts/metabolomics/mass_accuracy_calculator/tests/conftest.py`
* `scripts/metabolomics/mass_accuracy_calculator/tests/test_mass_accuracy_calculator.py`
* `scripts/metabolomics/metabolite_feature_detection/README.md`
* `scripts/metabolomics/metabolite_feature_detection/metabolite_feature_detection.py`
* `scripts/metabolomics/metabolite_feature_detection/requirements.txt`
* `scripts/metabolomics/metabolite_feature_detection/tests/conftest.py`
* `scripts/metabolomics/metabolite_feature_detection/tests/test_metabolite_feature_detection.py`
* `scripts/proteomics/feature_detection_proteomics/README.md`
* `scripts/proteomics/feature_detection_proteomics/feature_detection_proteomics.py`
* `scripts/proteomics/feature_detection_proteomics/requirements.txt`
* `scripts/proteomics/feature_detection_proteomics/tests/conftest.py`
* `scripts/proteomics/feature_detection_proteomics/tests/test_feature_detection_proteomics.py`
* `scripts/proteomics/peptide_mass_calculator/README.md`
* `scripts/proteomics/peptide_mass_calculator/peptide_mass_calculator.py`
* `scripts/proteomics/peptide_mass_calculator/requirements.txt`
* `scripts/proteomics/peptide_mass_calculator/tests/conftest.py`
* `scripts/proteomics/peptide_mass_calculator/tests/test_peptide_mass_calculator.py`
* `scripts/proteomics/protein_digest/README.md`
* `scripts/proteomics/protein_digest/protein_digest.py`
* `scripts/proteomics/protein_digest/requirements.txt`
* `scripts/proteomics/protein_digest/tests/conftest.py`
* `scripts/proteomics/protein_digest/tests/test_protein_digest.py`
* `scripts/proteomics/spectrum_file_info/README.md`
* `scripts/proteomics/spectrum_file_info/requirements.txt`
* `scripts/proteomics/spectrum_file_info/spectrum_file_info.py`
* `scripts/proteomics/spectrum_file_info/tests/conftest.py`
* `scripts/proteomics/spectrum_file_info/tests/test_spectrum_file_info.py`

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment on lines +75 to +80
```python
try:
    import pyopenms as oms
except ImportError:
    sys.exit("pyopenms is required. Install it with: pip install pyopenms")
```

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Verify whether script modules using sys.exit also import sys.
# Expected: no output (or review any printed files).

fd -e py . scripts | while read -r f; do
  if rg -n '\bsys\.exit\s*\(' "$f" >/dev/null; then
    if ! rg -n '^\s*import\s+sys\b|^\s*from\s+sys\s+import\b' "$f" >/dev/null; then
      echo "Missing import sys: $f"
    fi
  fi
done
```

Repository: OpenMS/agentomics

Length of output: 43


🏁 Script executed:

cat -n .claude/skills/contribute-script.md | sed -n '70,85p'

Repository: OpenMS/agentomics

Length of output: 773


🏁 Script executed:

cat -n .claude/skills/contribute-script.md | sed -n '60,90p'

Repository: OpenMS/agentomics

Length of output: 1231


Add import sys to the pyopenms import-guard pattern.

The documented pattern at lines 75–80 uses sys.exit() without showing import sys. Contributors copying this exact snippet will encounter a NameError on the import failure path.

Suggested doc fix

````diff
 - pyopenms import guard:
   ```python
+  import sys
   try:
       import pyopenms as oms
   except ImportError:
       sys.exit("pyopenms is required. Install it with:  pip install pyopenms")
````
</details>

<!-- suggestion_start -->

<details>
<summary>📝 Committable suggestion</summary>

> ‼️ **IMPORTANT**
> Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```suggestion
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/contribute-script.md around lines 75 - 80, The snippet uses
sys.exit in the ImportError handler but never imports sys, which will raise
NameError on failure; add an import sys at the top of the block so the
try/except around importing pyopenms (import pyopenms as oms) can call
sys.exit(...) safely when ImportError is caught.

Comment on lines +21 to +30
```bash
SCRIPT_DIR=<path-to-script-directory>
VENV_DIR=$(mktemp -d)
python -m venv "$VENV_DIR"
"$VENV_DIR/bin/python" -m pip install -r "$SCRIPT_DIR/requirements.txt"
"$VENV_DIR/bin/python" -m pip install pytest ruff
"$VENV_DIR/bin/python" -m ruff check "$SCRIPT_DIR/"
PYTHONPATH="$SCRIPT_DIR" "$VENV_DIR/bin/python" -m pytest "$SCRIPT_DIR/tests/" -v
rm -rf "$VENV_DIR"
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Demonstrate cleanup behavior with and without trap.
# Expected: without trap, temp dir remains after failure; with trap, it is removed.

set -euo pipefail

echo "Without trap:"
if bash -c 'set -e; V=$(mktemp -d); false; rm -rf "$V"' 2>/dev/null; then :; fi
# Can't read V from subshell directly; show equivalent reproducible pattern:
bash -c 'set -e; V=$(mktemp -d); echo "$V" > /tmp/venv_path_no_trap; false; rm -rf "$V"' || true
V1=$(cat /tmp/venv_path_no_trap)
if [ -d "$V1" ]; then echo "Leaked: $V1"; rm -rf "$V1"; fi

echo "With trap:"
bash -c 'set -e; V=$(mktemp -d); echo "$V" > /tmp/venv_path_trap; trap "rm -rf \"$V\"" EXIT; false' || true
V2=$(cat /tmp/venv_path_trap)
if [ -d "$V2" ]; then echo "Unexpected leak: $V2"; else echo "Cleaned as expected"; fi
```

Repository: OpenMS/agentomics

Length of output: 133


🏁 Script executed:

```bash
# Check if the file exists and read the relevant lines
if [ -f ".claude/skills/validate-script.md" ]; then
  echo "=== File exists, reading lines 15-40 for context ==="
  sed -n '15,40p' ".claude/skills/validate-script.md" | cat -n
else
  echo "File not found at .claude/skills/validate-script.md"
  # Search for similar files
  find . -name "*validate*script*" -o -name "*.md" -path "*claude*" 2>/dev/null | head -20
fi
```

Repository: OpenMS/agentomics

Length of output: 1064


🏁 Script executed:

```bash
# Get actual line numbers and check for line 34 reference
wc -l ".claude/skills/validate-script.md"
echo "=== Lines 20-35 (to see full context and line 34) ==="
sed -n '20,35p' ".claude/skills/validate-script.md" | cat -n
```

Repository: OpenMS/agentomics

Length of output: 886


Add trap-based cleanup to ensure venv removal on failure.

The current code does not guarantee cleanup if any command fails before the final rm -rf. Step 5 (line 34) explicitly requires cleanup even when validation fails. Add trap 'rm -rf "$VENV_DIR"' EXIT after VENV_DIR=$(mktemp -d) and remove the final rm -rf "$VENV_DIR" line.

Suggested command block update

```diff
 SCRIPT_DIR=<path-to-script-directory>
 VENV_DIR=$(mktemp -d)
+trap 'rm -rf "$VENV_DIR"' EXIT
 python -m venv "$VENV_DIR"
 "$VENV_DIR/bin/python" -m pip install -r "$SCRIPT_DIR/requirements.txt"
 "$VENV_DIR/bin/python" -m pip install pytest ruff
 "$VENV_DIR/bin/python" -m ruff check "$SCRIPT_DIR/"
 PYTHONPATH="$SCRIPT_DIR" "$VENV_DIR/bin/python" -m pytest "$SCRIPT_DIR/tests/" -v
-rm -rf "$VENV_DIR"
```
</details>

Also applies to: 34-34

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @.claude/skills/validate-script.md around lines 21 - 30, Add a trap-based
cleanup to guarantee removal of the temporary virtualenv: immediately after
VENV_DIR=$(mktemp -d) register trap 'rm -rf "$VENV_DIR"' EXIT (so it runs on any
exit/failure) and delete the final explicit rm -rf "$VENV_DIR" line at the end;
ensure the trap references the same VENV_DIR variable and uses the EXIT signal
so the venv is removed on both success and failure.


</details>
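The difference the trap makes is easy to reproduce outside the repo. The sketch below simulates a failing validation step inside `bash -e` and checks that the registered EXIT trap still removed the temporary directory (paths and the `venv` subdirectory are illustrative stand-ins, not the skill's actual layout):

```shell
#!/bin/bash
# Simulate: create a 'venv' dir, register an EXIT trap, then fail under set -e.
workdir=$(mktemp -d)
bash -c "
  set -e
  trap 'rm -rf \"$workdir/venv\"' EXIT
  mkdir -p \"$workdir/venv\"
  false  # simulated failing validation command
" || true

# The trap fires on the failure exit, so the directory should be gone.
if [ -d "$workdir/venv" ]; then result=leaked; else result=cleaned; fi
echo "$result"
rm -rf "$workdir"
```

Without the trap, the `false` would abort the inner script before any trailing `rm -rf` line, leaving the directory behind.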

<!-- fingerprinting:phantom:poseidon:hawk -->

<!-- This is an auto-generated comment by CodeRabbit -->

Comment on lines +24 to +27
CHANGED=$(git diff --name-only origin/${{ github.base_ref }}...HEAD -- 'scripts/' \
| grep -oP 'scripts/[^/]+/[^/]+/' \
| sort -u \
| jq -R -s -c 'split("\n") | map(select(length > 0))')

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Reproduce grep no-match behavior under bash -e in a command-substitution pipeline.
# Expected: first block fails; second block with `|| true` succeeds.

set -euo pipefail

echo "a/b/c.txt" > /tmp/changed_paths.txt

echo "Case 1 (without || true) - should fail:"
if bash -e -c 'X=$(cat /tmp/changed_paths.txt | grep -oP "scripts/[^/]+/[^/]+/" | sort -u); echo "$X"'; then
  echo "Unexpected success"
else
  echo "Failed as expected"
fi

echo "Case 2 (with || true) - should succeed:"
bash -e -c 'X=$(cat /tmp/changed_paths.txt | grep -oP "scripts/[^/]+/[^/]+/" || true); echo "X=${X:-<empty>}"'
```

Repository: OpenMS/agentomics

Length of output: 170


🏁 Script executed:

cat -n .github/workflows/validate.yml | head -50

Repository: OpenMS/agentomics

Length of output: 1897


🏁 Script executed:

git ls-files '.github/workflows/'

Repository: OpenMS/agentomics

Length of output: 91


🌐 Web query:

GitHub Actions bash run shell set -e default behavior

💡 Result:

In GitHub Actions, for run steps on Linux/macOS runners (non-Windows), the default shell behavior when no shell is explicitly specified is to use bash -e {0} if bash is available, or sh -e {0} otherwise. The -e flag enables set -e semantics, causing the script to exit immediately if any command returns a non-zero exit code (fail-fast behavior). When shell: bash is explicitly specified, it uses bash --noprofile --norc -eo pipefail {0}, which includes set -e, set -o pipefail (fail on pipeline failure), and suppresses profile/rc loading. This is the default to enforce fail-fast behavior in CI scripts, ensuring steps fail early on errors. To disable, use shell: bash {0} or set +e in the script. On Windows, the default is PowerShell, not bash.

Citations:


🏁 Script executed:

fd -t f '.github/workflows/' -name '*.yml' -o -name '*.yaml' | xargs grep -l 'git diff' | head -5

Repository: OpenMS/agentomics

Length of output: 291


🏁 Script executed:

```bash
#!/bin/bash
# Test grep behavior with pipefail (which GitHub Actions uses by default)

echo "a/b/c.txt" > /tmp/changed_paths.txt

echo "Case 1 (bash -eo pipefail, without || true) - should fail:"
if bash -eo pipefail -c 'X=$(cat /tmp/changed_paths.txt | grep -oP "scripts/[^/]+/[^/]+/" | sort -u); echo "$X"'; then
  echo "Unexpected success"
else
  echo "Failed as expected (exit code: $?)"
fi

echo ""
echo "Case 2 (bash -eo pipefail, with || true) - should succeed:"
if bash -eo pipefail -c 'X=$(cat /tmp/changed_paths.txt | grep -oP "scripts/[^/]+/[^/]+/" || true | sort -u); echo "X=${X:-<empty>}"'; then
  echo "Succeeded as expected"
else
  echo "Unexpected failure"
fi
```

Repository: OpenMS/agentomics

Length of output: 245


🏁 Script executed:

fd '.github/workflows' -type f

Repository: OpenMS/agentomics

Length of output: 231


🏁 Script executed:

find .github/workflows -type f -name '*.yml' -o -name '*.yaml'

Repository: OpenMS/agentomics

Length of output: 91


🏁 Script executed:

grep -r "grep -oP" .github/workflows/

Repository: OpenMS/agentomics

Length of output: 139


Add fallback to grep to prevent pipeline failure on zero matches.

In GitHub Actions, the default shell (bash -eo pipefail) will fail the entire step if grep finds no matches and returns exit code 1. When no script directories change, this pipeline fails before reaching the error-handling check at line 29. Add || true after grep to ensure the pipeline continues.

Suggested fix (note the command group: a bare `|| true` spliced mid-pipeline would change how bash groups the commands)

```diff
           CHANGED=$(git diff --name-only origin/${{ github.base_ref }}...HEAD -- 'scripts/' \
-            | grep -oP 'scripts/[^/]+/[^/]+/' \
+            | { grep -oP 'scripts/[^/]+/[^/]+/' || true; } \
             | sort -u \
             | jq -R -s -c 'split("\n") | map(select(length > 0))')
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/validate.yml around lines 24 - 27, The CHANGED assignment
pipeline can fail when grep finds no matches; modify the command that pipes to
grep (the part using grep -oP 'scripts/[^/]+/[^/]+/') to add a fallback (append
|| true) so grep's non-zero exit doesn't break the step, ensuring CHANGED
becomes an empty list instead of causing the job to fail; update the pipeline
around the CHANGED variable assignment and keep the rest of the processing (sort
-u | jq -R -s -c ...) unchanged.
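The grouping subtlety is worth demonstrating: under `bash -eo pipefail`, a `grep` with zero matches exits 1, and the safest fallback scopes `|| true` to grep alone with a command group so the rest of the pipeline still runs. A minimal, repo-independent sketch (using `grep -oE`, which handles this pattern just as well as `-oP`):

```shell
#!/bin/bash
set -euo pipefail

# Input with no scripts/ paths, so grep matches nothing and exits 1.
paths="a/b/c.txt"

# The command group scopes '|| true' to grep alone; sort still runs
# and CHANGED ends up empty instead of the step failing.
CHANGED=$(printf '%s\n' "$paths" \
  | { grep -oE 'scripts/[^/]+/[^/]+/' || true; } \
  | sort -u)

echo "CHANGED=<${CHANGED}>"  # prints CHANGED=<>
```

With a matching path such as `scripts/proteomics/protein_digest/protein_digest.py`, the same pipeline yields `scripts/proteomics/protein_digest/`.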

Comment on lines +28 to +37
```python
import sys, os, pytest
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
try:
    import pyopenms
    HAS_PYOPENMS = True
except ImportError:
    HAS_PYOPENMS = False
requires_pyopenms = pytest.mark.skipif(not HAS_PYOPENMS, reason="pyopenms not installed")
```

⚠️ Potential issue | 🟡 Minor

Inconsistency: conftest.py template differs from actual implementation.

The template shows import sys, os, pytest on a single line (line 29), which violates PEP 8 and the ruff isort rules configured in ruff.toml. The actual implementation in scripts/metabolomics/metabolite_feature_detection/tests/conftest.py correctly separates imports.

Update the template to match the actual implementation pattern:

📝 Proposed fix

````diff
- ```python
- import sys, os, pytest
- sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
+ ```python
+ import os
+ import sys
+
+ import pytest
+
+ sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
````
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```python
import os
import sys

import pytest

sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
try:
    import pyopenms
    HAS_PYOPENMS = True
except ImportError:
    HAS_PYOPENMS = False
requires_pyopenms = pytest.mark.skipif(not HAS_PYOPENMS, reason="pyopenms not installed")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/superpowers/specs/2026-03-24-ai-contributor-skills-design.md` around
lines 28 - 37, The conftest.py template currently combines imports on one line;
update the template so imports follow PEP8/ruff isort style by splitting "import
sys, os, pytest" into separate, ordered imports (import os; import sys; import
pytest) with appropriate blank lines, keeping the existing sys.path.insert(0,
os.path.join(os.path.dirname(__file__), "..")) and preserving the try/except
that sets HAS_PYOPENMS; modify the template's import block to match the actual
conftest.py pattern used in
scripts/metabolomics/metabolite_feature_detection/tests/conftest.py and ensure
symbols like HAS_PYOPENMS and sys.path.insert remain unchanged.

Comment on lines +53 to +56
```python
    if not peaks:
        return []
    max_ab = max(ab for _, ab in peaks)
    return [(mz, ab / max_ab * 100) for mz, ab in peaks]
```

⚠️ Potential issue | 🟡 Minor

Potential division by zero when all peak intensities are zero.

If max_ab is 0 (all peaks have zero intensity), line 56 will raise a ZeroDivisionError. While rare, this could occur with unusual formula inputs.

🛡️ Proposed fix

```diff
     if not peaks:
         return []
     max_ab = max(ab for _, ab in peaks)
+    if max_ab == 0:
+        return [(mz, 0.0) for mz, _ in peaks]
     return [(mz, ab / max_ab * 100) for mz, ab in peaks]
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```python
    if not peaks:
        return []
    max_ab = max(ab for _, ab in peaks)
    if max_ab == 0:
        return [(mz, 0.0) for mz, _ in peaks]
    return [(mz, ab / max_ab * 100) for mz, ab in peaks]
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/metabolomics/isotope_pattern_matcher/isotope_pattern_matcher.py`
around lines 53 - 56, The code computes max_ab = max(ab for _, ab in peaks) and
divides by it, which can raise ZeroDivisionError if all intensities are zero;
fix by checking if max_ab == 0 and in that case return the peaks with normalized
intensity 0.0 (e.g. return [(mz, 0.0) for mz, _ in peaks]) instead of performing
the division; update the branch that currently builds the normalized list (the
list comprehension using mz, ab in peaks) to handle this zero-max_ab case.
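Pulled out as a self-contained helper, the guarded normalisation might look like the sketch below (`normalise` is a hypothetical name; in the script the logic lives inline):

```python
def normalise(peaks):
    """Scale (mz, abundance) tuples so the most intense peak is 100."""
    if not peaks:
        return []
    max_ab = max(ab for _, ab in peaks)
    if max_ab == 0:
        # All-zero intensities: avoid ZeroDivisionError, report flat zeros.
        return [(mz, 0.0) for mz, _ in peaks]
    return [(mz, ab / max_ab * 100) for mz, ab in peaks]
```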

Comment on lines +32 to +59
```python
def peptide_masses(sequence: str, charge: int = 1) -> dict:
    """Return monoisotopic and average masses for the given peptide sequence.

    Parameters
    ----------
    sequence:
        Amino acid sequence, optionally with bracket-enclosed modifications,
        e.g. ``"PEPTM[147]IDEK"``.
    charge:
        Desired charge state for m/z calculation (default 1).

    Returns
    -------
    dict
        Dictionary with keys ``monoisotopic_mass``, ``average_mass``,
        ``mz_monoisotopic``, and ``mz_average``.
    """
    aa_seq = oms.AASequence.fromString(sequence)
    mono = aa_seq.getMonoWeight()
    avg = aa_seq.getAverageWeight()
    return {
        "sequence": sequence,
        "charge": charge,
        "monoisotopic_mass": mono,
        "average_mass": avg,
        "mz_monoisotopic": (mono + charge * PROTON) / charge,
        "mz_average": (avg + charge * PROTON) / charge,
    }
```

⚠️ Potential issue | 🟠 Major

Validate charge before m/z calculation.

Line 57 and Line 58 divide by charge without bounds checks; --charge 0 will crash with ZeroDivisionError.

Suggested fix

```diff
 def peptide_masses(sequence: str, charge: int = 1) -> dict:
@@
-    aa_seq = oms.AASequence.fromString(sequence)
+    if charge <= 0:
+        raise ValueError("charge must be >= 1")
+    aa_seq = oms.AASequence.fromString(sequence)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/proteomics/peptide_mass_calculator/peptide_mass_calculator.py` around
lines 32 - 59, In peptide_masses, validate the charge parameter before
performing m/z calculations: in the function peptide_masses(sequence: str,
charge: int = 1) check that charge is a positive integer (e.g. if not
isinstance(charge, int) or charge <= 0) and raise a clear ValueError such as
"charge must be a positive integer" to prevent ZeroDivisionError when computing
mz_monoisotopic and mz_average using PROTON.
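Beyond the in-function guard, the same invariant can be enforced at the CLI boundary with an argparse `type` callable, so an invalid charge is rejected before `peptide_masses` ever runs. A minimal sketch (the `--charge` flag name is an assumption about the script's interface):

```python
import argparse

def positive_int(value: str) -> int:
    """argparse type= callable that rejects charges below 1 at parse time."""
    ivalue = int(value)
    if ivalue <= 0:
        raise argparse.ArgumentTypeError(f"charge must be >= 1, got {ivalue}")
    return ivalue

parser = argparse.ArgumentParser()
parser.add_argument("--charge", type=positive_int, default=1)
```

With this, `--charge 0` produces a usage error and non-zero exit instead of a ZeroDivisionError traceback.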

Comment on lines +38 to +44
```python
def digest_protein(
    sequence: str,
    enzyme: str = "Trypsin",
    missed_cleavages: int = 0,
    min_length: int = 6,
    max_length: int = 40,
) -> list:
```

⚠️ Potential issue | 🟠 Major

Missing boundary validation for digestion parameters.

missed_cleavages, min_length, and max_length accept invalid values (e.g., negatives or min_length > max_length) and are passed through directly. Add explicit validation before digestion.

Suggested fix

```diff
@@
 def digest_protein(
@@
 ) -> list:
@@
+    if missed_cleavages < 0:
+        raise ValueError("missed_cleavages must be >= 0")
+    if min_length < 1 or max_length < 1:
+        raise ValueError("min_length and max_length must be >= 1")
+    if min_length > max_length:
+        raise ValueError("min_length must be <= max_length")
+
     protein_seq = oms.AASequence.fromString(sequence)
@@
     args = parser.parse_args()
@@
     if not args.sequence:
         parser.error("--sequence is required unless --list-enzymes is used.")
+    if args.missed_cleavages < 0:
+        parser.error("--missed-cleavages must be >= 0.")
+    if args.min_length < 1 or args.max_length < 1:
+        parser.error("--min-length and --max-length must be >= 1.")
+    if args.min_length > args.max_length:
+        parser.error("--min-length must be <= --max-length.")
```

Also applies to: 66-73, 101-121, 137-146

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/proteomics/protein_digest/protein_digest.py` around lines 38 - 44,
The digest_protein function currently allows invalid digestion params; add
explicit validation at the top of digest_protein to ensure missed_cleavages is a
non-negative integer and min_length and max_length are positive integers with
min_length <= max_length (raise ValueError with clear message on invalid input),
and clamp or reject values outside sensible bounds; apply the same validation
logic to the other digestion functions that accept the same parameters (any
other functions that take missed_cleavages, min_length, max_length in this
module) so invalid inputs are caught before digestion logic runs.
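The suggested bounds can be collected into one helper called at the top of `digest_protein` and any sibling digestion function (the helper name is hypothetical):

```python
def validate_digestion_params(missed_cleavages: int, min_length: int, max_length: int) -> None:
    """Raise ValueError for out-of-range digestion parameters."""
    if missed_cleavages < 0:
        raise ValueError("missed_cleavages must be >= 0")
    if min_length < 1 or max_length < 1:
        raise ValueError("min_length and max_length must be >= 1")
    if min_length > max_length:
        raise ValueError("min_length must be <= max_length")
```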

Comment on lines +62 to +64
list of dict
Each entry contains ``sequence``, ``start``, ``end``,
``monoisotopic_mass``, and ``missed_cleavages``.

⚠️ Potential issue | 🟡 Minor

Return docstring is out of sync with actual payload.

Line [63]-[64] documents start, end, and missed_cleavages, but returned dicts only contain sequence, length, and monoisotopic_mass.

Also applies to: 79-83

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/proteomics/protein_digest/protein_digest.py` around lines 62 - 64,
The docstring describing the returned list-of-dicts is out of sync: it mentions
keys `start`, `end`, and `missed_cleavages` while the actual payload contains
`sequence`, `length`, and `monoisotopic_mass`. Update the function's return
docstring (the docstring block that currently lists `sequence`, `start`, `end`,
`monoisotopic_mass`, `missed_cleavages`) to accurately describe the actual
returned keys (`sequence`, `length`, `monoisotopic_mass`) or alternatively
modify the function that builds the dicts so they include `start`, `end`, and
`missed_cleavages`; ensure the docstring and the returned dict structure
(produced by the code that constructs each peptide dict) match exactly.

Comment on lines +4 to +5
Read an mzML (or mzXML) file and print a summary of its contents:
number of spectra, MS levels, retention time range, m/z range,

⚠️ Potential issue | 🟡 Minor

Input-format claim is broader than the implemented loader.

Line [4] advertises mzML/mzXML, but Line [86] only uses MzMLFile().load(...). Please either narrow the docs/help text to mzML-only or add explicit mzXML handling.

Also applies to: 73-87

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/proteomics/spectrum_file_info/spectrum_file_info.py` around lines 4 -
5, The docstring/help claims support for "mzML (or mzXML)" but the loader call
MzMLFile().load(...) only handles mzML; update the code to make the behavior
match the docs by either changing the top-level help text to "mzML only" or
adding explicit mzXML handling: detect the input file extension inside
spectrum_file_info.py and call the appropriate loader (e.g., use
MzXMLFile().load(...) for .mzXML files) so that the loader selection (MzMLFile
vs MzXMLFile) aligns with the advertised formats.
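The extension dispatch could be as small as a suffix lookup. In this sketch the strings stand in for the pyopenms loader classes (`oms.MzMLFile`, `oms.MzXMLFile`) so the example stays dependency-free:

```python
from pathlib import Path

# Maps lowercase suffixes to loader names; in the script the values would be
# the loader classes themselves, e.g. oms.MzMLFile and oms.MzXMLFile.
LOADERS = {".mzml": "MzMLFile", ".mzxml": "MzXMLFile"}

def pick_loader(path: str) -> str:
    """Return the loader for a file, or raise on an unsupported extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"unsupported input format: {suffix or path}")
    return LOADERS[suffix]
```

Failing fast on unknown extensions keeps the help text and the loader behaviour aligned.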

Comment on lines +56 to +70
```python
        mzs, intensities = spec.get_peaks()
        if len(mzs) > 0:
            mz_min = min(mz_min, float(mzs.min()))
            mz_max = max(mz_max, float(mzs.max()))
            tic_values.append(float(intensities.sum()))

    return {
        "n_spectra": len(spectra),
        "ms_levels": ms_levels,
        "rt_range_sec": (rt_min, rt_max),
        "mz_range": (mz_min, mz_max),
        "tic_total": sum(tic_values),
        "tic_max": max(tic_values) if tic_values else 0.0,
        "tic_per_spectrum": tic_values,
    }
```

⚠️ Potential issue | 🟠 Major

summarise_experiment drops zero-peak spectra and can emit invalid m/z ranges.

At Line [60], TIC is appended only when peaks exist, so tic_per_spectrum may not align with spectrum count/indexing. Also, if all spectra have no peaks, Line [66] returns inf/-inf for mz_range, which is misleading in CLI output.

Suggested fix

```diff
@@
-    mz_min = float("inf")
-    mz_max = float("-inf")
+    mz_min = float("inf")
+    mz_max = float("-inf")
+    has_peak_data = False
@@
-        mzs, intensities = spec.get_peaks()
-        if len(mzs) > 0:
+        mzs, intensities = spec.get_peaks()
+        tic_values.append(float(intensities.sum()))
+        if len(mzs) > 0:
+            has_peak_data = True
             mz_min = min(mz_min, float(mzs.min()))
             mz_max = max(mz_max, float(mzs.max()))
-            tic_values.append(float(intensities.sum()))
@@
+    mz_range = (mz_min, mz_max) if has_peak_data else (None, None)
     return {
@@
-        "mz_range": (mz_min, mz_max),
+        "mz_range": mz_range,
@@
-    mz_lo, mz_hi = summary["mz_range"]
-    print(f"{'m/z range':<22}: {mz_lo:.4f} – {mz_hi:.4f}")
+    mz_lo, mz_hi = summary["mz_range"]
+    if mz_lo is None:
+        print(f"{'m/z range':<22}: n/a (no centroid/profile peaks)")
+    else:
+        print(f"{'m/z range':<22}: {mz_lo:.4f} – {mz_hi:.4f}")
```

Also applies to: 124-126

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/proteomics/spectrum_file_info/spectrum_file_info.py` around lines 56
- 70, summarise_experiment currently skips spectra with zero peaks which
misaligns tic_per_spectrum and can leave mz_min/mz_max as inf/-inf; change the
loop that calls spec.get_peaks() so you always append a TIC value (append 0.0
when intensities is empty) and only update mz_min/mz_max when peaks exist, then
after the loop replace any leftover inf/-inf mz bounds with a safe default
(e.g., (0.0, 0.0) or None) before returning; apply the same fix in the duplicate
block around the second occurrence (lines noted as 124-126) so functions/methods
summarise_experiment, spec.get_peaks(), mz_min, mz_max, and tic_per_spectrum are
updated accordingly.
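The per-spectrum TIC alignment and the m/z-range fallback are easiest to see on plain lists. Below, each spectrum is just an `(mzs, intensities)` pair standing in for a pyopenms spectrum object, and `summarise` is a trimmed stand-in for `summarise_experiment`:

```python
def summarise(spectra):
    """Summarise a list of (mzs, intensities) pairs; tolerates empty spectra."""
    tic, mz_min, mz_max = [], float("inf"), float("-inf")
    has_peaks = False
    for mzs, intensities in spectra:
        tic.append(float(sum(intensities)))  # always one TIC entry per spectrum
        if mzs:
            has_peaks = True
            mz_min = min(mz_min, min(mzs))
            mz_max = max(mz_max, max(mzs))
    return {
        "n_spectra": len(spectra),
        "mz_range": (mz_min, mz_max) if has_peaks else (None, None),
        "tic_per_spectrum": tic,
    }
```

Appending the TIC unconditionally keeps `tic_per_spectrum` index-aligned with the spectrum list, and the `(None, None)` fallback avoids the misleading `inf/-inf` range.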

@ypriverol ypriverol merged commit e4c5f86 into main Mar 24, 2026
9 checks passed