Skip to content

feat: pyopenms add str repr#8429

Merged
timosachsenberg merged 18 commits intodevelopfrom
pr-8332
Dec 5, 2025
Merged

feat: pyopenms add str repr#8429
timosachsenberg merged 18 commits intodevelopfrom
pr-8332

Conversation

@timosachsenberg
Copy link
Copy Markdown
Contributor

@timosachsenberg timosachsenberg commented Dec 4, 2025

This pull request adds custom string representations (__str__ and __repr__ methods) to a wide range of core pyOpenMS data classes. These representations provide concise, readable summaries of key properties for each object, making debugging and interactive exploration much easier. The changes ensure that printing or inspecting these objects gives meaningful information about their contents, such as sequence, mass, intensity, charge, and other relevant attributes.

Object string representation improvements

  • Added __str__ and __repr__ methods to AASequence, EmpiricalFormula, IsotopeDistribution, MSChromatogram, MSExperiment, MSSpectrum, Peak1D, Peak2D, MobilityPeak1D, ChromatogramPeak, Feature, ConsensusFeature, MRMFeature, PeptideEvidence, PeptideHit, PeptideIdentification, PeptideIdentificationList, and ProteinHit, each returning a formatted string summarizing key properties. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18]

API consistency

  • Ensured that __str__ delegates to __repr__ for all affected classes, so both print(obj) and repr(obj) yield the same informative output. (All references above)

Enhanced usability for interactive analysis

  • The new representations include truncation for long strings (such as sequences and descriptions) and conditional display of optional attributes, making them suitable for quick inspection in notebooks and debugging sessions. [1] [2]

These changes significantly improve the developer experience when working with pyOpenMS objects by providing clear, context-rich summaries of their state.## Description

Checklist

  • Make sure that you are listed in the AUTHORS file
  • Add relevant changes and new features to the CHANGELOG file
  • I have commented my code, particularly in hard-to-understand areas
  • New and existing unit tests pass locally with my changes
  • Updated or added python bindings for changed or new classes (Tick if no updates were necessary.)

How can I get additional information on failed tests during CI

Click to expand If your PR is failing you can check out
  • The details of the action statuses at the end of the PR or the "Checks" tab.
  • http://cdash.seqan.de/index.php?project=OpenMS and look for your PR. Use the "Show filters" capability on the top right to search for your PR number.
    If you click in the column that lists the failed tests you will get detailed error messages.

Advanced commands (admins / reviewer only)

Click to expand
  • /reformat (experimental) applies the clang-format style changes as additional commit. Note: your branch must have a different name (e.g., yourrepo:feature/XYZ) than the receiving branch (e.g., OpenMS:develop). Otherwise, reformat fails to push.
  • setting the label "NoJenkins" will skip tests for this PR on jenkins (saves resources e.g., on edits that do not affect tests)
  • commenting with rebuild jenkins will retrigger Jenkins-based CI builds

⚠️ Note: Once you opened a PR try to minimize the number of pushes to it as every push will trigger CI (automated builds and test) and is rather heavy on our infrastructure (e.g., if several pushes per day are performed).

Summary by CodeRabbit

  • New Features

    • Added human-readable string representations (str/repr) to many data classes, showing key properties (RT, m/z, intensity, mass, charge, counts) with concise formatting and sensible truncation for improved debugging, logging and display.
    • Added Python container semantics for identification lists (len()).
  • Tests

    • Added comprehensive unit tests validating the new string representations and ensuring str matches repr.

✏️ Tip: You can customize this high-level summary in your review settings.

Copilot AI and others added 14 commits October 27, 2025 15:13
Co-authored-by: poshul <146827+poshul@users.noreply.github.com>
Co-authored-by: poshul <146827+poshul@users.noreply.github.com>
Co-authored-by: poshul <146827+poshul@users.noreply.github.com>
Co-authored-by: poshul <146827+poshul@users.noreply.github.com>
Co-authored-by: poshul <146827+poshul@users.noreply.github.com>
Co-authored-by: poshul <146827+poshul@users.noreply.github.com>
Co-authored-by: timosachsenberg <5803621+timosachsenberg@users.noreply.github.com>
Adds human-readable string representations to pyOpenMS classes for better
debugging and Jupyter notebook display. Implements __repr__ and __str__
for:

- AASequence (sequence, length, mono_mass, modified)
- PeptideHit (score, sequence, charge, num_evidences)
- PeptideIdentification (rt, mz, score_type, num_hits, top_hit)
- Feature (rt, mz, intensity, charge, quality, subordinates, peptide_ids)
- IsotopeDistribution (num_isotopes, mass_range, most_abundant)
- MRMFeature (rt, mz, intensity, charge, num_transitions)
- MSSpectrum (ms_level, rt, num_peaks, mz_range, drift_time)
- MSChromatogram (name, num_peaks, rt_range)
- MSExperiment (num_spectra, num_chromatograms, ms_levels, rt_range)
- ReactionMonitoringTransition (id, precursor_mz, product_mz, peptide_ref)
- ConsensusFeature, EmpiricalFormula, Peak1D, Peak2D, MobilityPeak1D,
  ChromatogramPeak, ResidueModification, ProteinHit

Also adds __len__ to PeptideIdentificationList for Pythonic interface.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@timosachsenberg timosachsenberg changed the title Pr 8332 feat: pyopenms add str rep Dec 4, 2025
@timosachsenberg timosachsenberg changed the title feat: pyopenms add str rep feat: pyopenms add str repr Dec 4, 2025
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Dec 4, 2025

Walkthrough

Adds Python string representations (__str__ and __repr__) to many pyOpenMS binding classes (22 files), an idiomatic __len__ for PeptideIdentificationList, and unit tests validating the new formatted outputs and __str__/__repr__ consistency.

Changes

Cohort / File(s) Summary
Peak & Point Classes
src/pyOpenMS/addons/Peak1D.pyx, src/pyOpenMS/addons/Peak2D.pyx, src/pyOpenMS/addons/MobilityPeak1D.pyx, src/pyOpenMS/addons/ChromatogramPeak.pyx
Added __str__ (delegates to __repr__) and __repr__ returning formatted strings with rt/mz/mobility and intensity using standardized decimal precision.
Spectrum & Chromatogram Classes
src/pyOpenMS/addons/MSSpectrum.pyx, src/pyOpenMS/addons/MSChromatogram.pyx
Added __str__ and __repr__ summarizing ms_level, rt, num_peaks, optional mz/drift_time, name and rt_range when applicable.
Feature & MRM Classes
src/pyOpenMS/addons/Feature.pyx, src/pyOpenMS/addons/MRMFeature.pyx, src/pyOpenMS/addons/ConsensusFeature.pyx
Added __str__ and __repr__ including rt/mz/intensity/quality/charge and counts (subordinates, peptide_ids, transitions) with conditional inclusion.
Molecular / Chemistry Classes
src/pyOpenMS/addons/AASequence.pyx, src/pyOpenMS/addons/EmpiricalFormula.pyx, src/pyOpenMS/addons/IsotopeDistribution.pyx, src/pyOpenMS/addons/ResidueModification.pyx
Added __str__ and __repr__ showing sequence/formula/mono_mass/charge, isotope counts/ranges, and residue modification metadata with formatting/truncation rules.
Peptide Identification Classes
src/pyOpenMS/addons/PeptideHit.pyx, src/pyOpenMS/addons/PeptideEvidence.pyx, src/pyOpenMS/addons/PeptideIdentification.pyx, src/pyOpenMS/addons/PeptideIdentificationList.pyx
Added __str__ and __repr__ for hits/evidence/identifications (showing scores, sequences, rt/mz, top_hit info) and __len__ + __repr__ for the list.
Protein & Transition Classes
src/pyOpenMS/addons/ProteinHit.pyx, src/pyOpenMS/addons/ReactionMonitoringTransition.pyx
Added __str__ and __repr__ showing accession/score/coverage/description, precursor/product m/z, library intensity, native_id, and peptide_ref as applicable.
MSExperiment
src/pyOpenMS/addons/MSExperiment.pyx
Added __str__ and __repr__ summarizing num_spectra, num_chromatograms, optional ms_levels and rt_range when spectra exist.
Unit Tests
src/pyOpenMS/tests/unittests/test000.py
Added tests asserting presence and formatting of class names and key fields in __repr__ and parity of __str__ with __repr__ across the modified classes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Areas to inspect closely:
    • Decimal formatting consistency across classes (rt: 2, mz: 4, intensity: 1, mono_mass: 4, mobility: 4)
    • Conditional inclusion/exclusion of optional fields (charge, coverage, ranges, empty strings)
    • Correct use of getters and byte → UTF‑8 conversions where applicable
    • Tests covering edge cases (empty collections, zero/absent values, truncation behavior)

Suggested reviewers

  • cbielow

Poem

🐰 I nibble code with gentle care,
Dunder names now sparkle in the air,
Rt and m/z and intensity too,
All neatly printed — the output's new,
Hop, inspect, and cheer — the bindings grew!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: pyopenms add str repr' accurately describes the main change: adding str and repr methods to multiple pyOpenMS classes for improved string representations.
Docstring Coverage ✅ Passed Docstring coverage is 96.67% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch pr-8332

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (7)
src/pyOpenMS/addons/MobilityPeak1D.pyx (1)

4-21: Align intensity type with other peak classes for consistency

intensity is currently declared as float while other peak-like wrappers (e.g., ChromatogramPeak) use double for intensity. This works but introduces unnecessary narrowing and inconsistency.

Consider changing to:

-        cdef double mobility = self.getMobility()
-        cdef float intensity = self.getIntensity()
+        cdef double mobility = self.getMobility()
+        cdef double intensity = self.getIntensity()

This avoids any precision loss and keeps the representation consistent across peak types.

src/pyOpenMS/addons/AASequence.pyx (1)

28-60: Use !r for sequence to avoid quoting/escaping issues and simplify truncation

The overall logic (truncation, mono mass formatting, modified flag) looks good. The only fragile part is manually embedding the sequence inside single quotes, which can misbehave if the sequence string ever contains quotes or unusual characters.

You can both simplify and harden this block by using a display variable plus !r:

-        parts = []
-        if seq_decoded:
-            # Truncate very long sequences
-            if len(seq_decoded) > 50:
-                parts.append(f"sequence='{seq_decoded[:47]}...'")
-            else:
-                parts.append(f"sequence='{seq_decoded}'")
+        parts = []
+        if seq_decoded:
+            # Truncate very long sequences
+            if len(seq_decoded) > 50:
+                seq_display = seq_decoded[:47] + "..."
+            else:
+                seq_display = seq_decoded
+            parts.append(f"sequence={seq_display!r}")

This avoids manual quoting/escaping and keeps the example representation unchanged for typical inputs. The same !r pattern could be applied to other string/ID fields in this PR (e.g., EmpiricalFormula.formula, ResidueModification.id/name) if you want fully robust string handling.

src/pyOpenMS/addons/ResidueModification.pyx (1)

4-36: ResidueModification repr logic is solid; consider !r for robust ID/name rendering

The selection of fields (id, non-redundant name, non-zero mass_diff, non-default origin) is well thought out and the guards around origin are safe.

If you want to harden the string rendering against unusual characters in IDs/names, you could switch to !r instead of manual single quotes:

-        if mod_id_str:
-            parts.append(f"id='{mod_id_str}'")
-        if name_str and name_str != mod_id_str:
-            parts.append(f"name='{name_str}'")
+        if mod_id_str:
+            parts.append(f"id={mod_id_str!r}")
+        if name_str and name_str != mod_id_str:
+            parts.append(f"name={name_str!r}")

Not required for correctness, but slightly more robust and consistent with Python repr conventions.

src/pyOpenMS/addons/ReactionMonitoringTransition.pyx (1)

4-36: Use !r for string fields to avoid broken/ambiguous repr output

If native_id or peptide_ref contain quotes or special characters, the current formatting (id='...') can produce misleading or syntactically broken representations. Using !r makes the output more robust and closer to typical Python __repr__ semantics.

Suggested change:

-        parts = []
-        if native_id_str:
-            parts.append(f"id='{native_id_str}'")
+        parts = []
+        if native_id_str:
+            parts.append(f"id={native_id_str!r}")
         parts.append(f"precursor_mz={precursor_mz:.4f}")
         parts.append(f"product_mz={product_mz:.4f}")
-        if peptide_ref_str:
-            parts.append(f"peptide_ref='{peptide_ref_str}'")
+        if peptide_ref_str:
+            parts.append(f"peptide_ref={peptide_ref_str!r}")
         if lib_intensity > 0:
             parts.append(f"library_intensity={lib_intensity:.1f}")
src/pyOpenMS/addons/ProteinHit.pyx (1)

3-39: Escape string fields in ProteinHit repr using !r

The core logic (decoding, truncation, conditional coverage) looks good, but directly interpolating strings inside single quotes can yield confusing output if values contain ' or other special characters. Using !r produces safer, more conventional reprs.

Recommended diff:

-        parts = []
-        if acc_str:
-            parts.append(f"accession='{acc_str}'")
-        parts.append(f"score={score}")
+        parts = []
+        if acc_str:
+            parts.append(f"accession={acc_str!r}")
+        parts.append(f"score={score}")
@@
-        if desc_str:
-            # Truncate description if too long
-            if len(desc_str) > 50:
-                desc_str = desc_str[:47] + "..."
-            parts.append(f"description='{desc_str}'")
+        if desc_str:
+            # Truncate description if too long
+            if len(desc_str) > 50:
+                desc_str = desc_str[:47] + "..."
+            parts.append(f"description={desc_str!r}")

If you want full consistency with other classes, you could also consider formatting score/coverage with a fixed precision (e.g. score={score:.3f}, coverage={coverage:.1f}), but that’s purely cosmetic.

src/pyOpenMS/addons/MSSpectrum.pyx (1)

203-237: Consider avoiding full mz array allocation for mz_range in repr

The representation logic is correct and nicely formatted, but computing mz_range via get_mz_array() means allocating and filling an array of size num_peaks just to access the first and last entries. For large spectra this can make repr(spectrum) unexpectedly expensive.

If _MSSpectrum exposes min/max m/z accessors (e.g. getMinMZ()/getMaxMZ() or equivalent), or if there’s a cheaper way in the bindings to read just the first/last peak, consider switching to that instead of materializing the full numpy array. This keeps the repr usable even for very large spectra.

src/pyOpenMS/addons/MSChromatogram.pyx (1)

121-153: Escape chromatogram name and consider a cheaper rt_range computation

Two minor points here:

  1. String quoting for name
    Embedding the name as "name='...'" can produce confusing reprs if the name contains quotes or special characters. Using !r is safer and matches typical __repr__ style:
-        # Add name if set
-        name = self.getName()
-        if name:
-            name_str = name.decode('utf-8') if isinstance(name, bytes) else str(name)
-            if name_str:
-                parts.append(f"name='{name_str}'")
+        # Add name if set
+        name = self.getName()
+        if name:
+            name_str = name.decode('utf-8') if isinstance(name, bytes) else str(name)
+            if name_str:
+                parts.append(f"name={name_str!r}")
  1. Performance of rt_range
    rt_range is currently derived via get_peaks(), which allocates and fills two numpy arrays of length num_peaks just to read rts[0] and rts[-1]. For large chromatograms this can make repr(chrom) relatively expensive. If the underlying C++ API or bindings expose direct access to the first/last RT (or min/max RT), consider using that instead of materializing the full arrays for the repr.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9e92c1b and 00be8ab.

📒 Files selected for processing (21)
  • src/pyOpenMS/addons/AASequence.pyx (2 hunks)
  • src/pyOpenMS/addons/ChromatogramPeak.pyx (1 hunks)
  • src/pyOpenMS/addons/ConsensusFeature.pyx (1 hunks)
  • src/pyOpenMS/addons/EmpiricalFormula.pyx (1 hunks)
  • src/pyOpenMS/addons/Feature.pyx (1 hunks)
  • src/pyOpenMS/addons/IsotopeDistribution.pyx (1 hunks)
  • src/pyOpenMS/addons/MRMFeature.pyx (1 hunks)
  • src/pyOpenMS/addons/MSChromatogram.pyx (1 hunks)
  • src/pyOpenMS/addons/MSExperiment.pyx (1 hunks)
  • src/pyOpenMS/addons/MSSpectrum.pyx (1 hunks)
  • src/pyOpenMS/addons/MobilityPeak1D.pyx (1 hunks)
  • src/pyOpenMS/addons/Peak1D.pyx (1 hunks)
  • src/pyOpenMS/addons/Peak2D.pyx (1 hunks)
  • src/pyOpenMS/addons/PeptideEvidence.pyx (1 hunks)
  • src/pyOpenMS/addons/PeptideHit.pyx (1 hunks)
  • src/pyOpenMS/addons/PeptideIdentification.pyx (1 hunks)
  • src/pyOpenMS/addons/PeptideIdentificationList.pyx (1 hunks)
  • src/pyOpenMS/addons/ProteinHit.pyx (1 hunks)
  • src/pyOpenMS/addons/ReactionMonitoringTransition.pyx (1 hunks)
  • src/pyOpenMS/addons/ResidueModification.pyx (1 hunks)
  • src/pyOpenMS/tests/unittests/test000.py (17 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/pyOpenMS/tests/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Write pyOpenMS tests for new Python bindings

Files:

  • src/pyOpenMS/tests/unittests/test000.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: OpenMS/OpenMS PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-08-05T12:43:11.681Z
Learning: Applies to src/pyOpenMS/**/*.h : Add Doxygen comments for new public methods and classes in pyOpenMS bindings
📚 Learning: 2025-08-05T12:43:11.681Z
Learnt from: CR
Repo: OpenMS/OpenMS PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-08-05T12:43:11.681Z
Learning: Applies to src/pyOpenMS/**/*.h : Add Doxygen comments for new public methods and classes in pyOpenMS bindings

Applied to files:

  • src/pyOpenMS/addons/MobilityPeak1D.pyx
  • src/pyOpenMS/addons/MSSpectrum.pyx
  • src/pyOpenMS/addons/ResidueModification.pyx
  • src/pyOpenMS/addons/ChromatogramPeak.pyx
  • src/pyOpenMS/addons/ProteinHit.pyx
  • src/pyOpenMS/addons/Peak1D.pyx
  • src/pyOpenMS/addons/PeptideEvidence.pyx
  • src/pyOpenMS/tests/unittests/test000.py
  • src/pyOpenMS/addons/PeptideHit.pyx
  • src/pyOpenMS/addons/IsotopeDistribution.pyx
📚 Learning: 2025-08-05T12:43:11.681Z
Learnt from: CR
Repo: OpenMS/OpenMS PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-08-05T12:43:11.681Z
Learning: Applies to src/pyOpenMS/tests/**/*.py : Write pyOpenMS tests for new Python bindings

Applied to files:

  • src/pyOpenMS/tests/unittests/test000.py
📚 Learning: 2025-08-05T12:43:11.681Z
Learnt from: CR
Repo: OpenMS/OpenMS PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-08-05T12:43:11.681Z
Learning: Applies to src/tests/class_tests/openms/**/*.cpp : Write unit tests for new functionality

Applied to files:

  • src/pyOpenMS/tests/unittests/test000.py
🧬 Code graph analysis (1)
src/pyOpenMS/tests/unittests/test000.py (1)
src/openms/source/CHEMISTRY/AASequence.cpp (8)
  • AASequence (110-113)
  • AASequence (115-118)
  • AASequence (120-123)
  • AASequence (125-128)
  • fromString (1691-1696)
  • fromString (1691-1691)
  • fromString (1698-1703)
  • fromString (1698-1698)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: build-and-test (ubuntu-22.04, clang++, 15)
  • GitHub Check: build-and-test (windows-2022, cl.exe, 14.36)
  • GitHub Check: build-and-test (ubuntu-22.04, g++, 11)
  • GitHub Check: build-and-test (ubuntu-22.04-arm, g++, 11)
  • GitHub Check: build-lnx-arm
  • GitHub Check: build-lnx
🔇 Additional comments (11)
src/pyOpenMS/addons/MSExperiment.pyx (1)

136-169: MSExperiment repr is safe, concise, and correctly guarded

The representation correctly aggregates only cheap, global properties and guards getMinRT/getMaxRT behind num_spectra > 0, avoiding invalid access on empty experiments. __str__ delegating to __repr__ keeps the API consistent.

src/pyOpenMS/addons/ChromatogramPeak.pyx (1)

4-21: ChromatogramPeak string representation looks correct and idiomatic

Using getRT()/getIntensity() with fixed decimal formatting is straightforward and safe, and delegating __str__ to __repr__ matches the pattern used across the PR.

src/pyOpenMS/addons/PeptideIdentificationList.pyx (1)

4-22: PeptideIdentificationList length and representation are appropriate

__len__ correctly forwards to size(), and the minimalist PeptideIdentificationList(size=...) repr is cheap and predictable while still useful in debugging. __str__ delegating to __repr__ keeps the interface consistent with other classes in this PR.

src/pyOpenMS/addons/Feature.pyx (1)

4-46: Feature string representations look solid and consistent

__str__ delegating to __repr__ and the selected fields/precisions make sense; optional attributes are handled cleanly and safely.

src/pyOpenMS/addons/ConsensusFeature.pyx (1)

4-32: ConsensusFeature str/repr are straightforward and correct

Field selection and formatting are consistent with other peak/feature types, and the implementation is safe and cheap.

src/pyOpenMS/addons/Peak2D.pyx (1)

4-22: Peak2D str/repr are clean and idiomatic

Implementation is minimal, uses appropriate formatting for rt/mz/intensity, and safely delegates __str__ to __repr__.

src/pyOpenMS/addons/Peak1D.pyx (1)

4-21: Peak1D representations are simple and correct

Delegation pattern and numeric formatting are consistent with the rest of the API; no issues spotted.

src/pyOpenMS/addons/MRMFeature.pyx (1)

4-35: LGTM! Clean implementation of string representations.

The __repr__ method provides a comprehensive view of the MRMFeature with appropriate precision for each field type. The conditional inclusion of charge (only when non-zero) keeps the output clean.

src/pyOpenMS/addons/PeptideHit.pyx (1)

4-37: LGTM! Well-structured representation.

The implementation correctly uses toString() for sequence conversion and conditionally includes num_evidences only when present. Good use of cdef declarations for type safety.

src/pyOpenMS/addons/PeptideIdentification.pyx (1)

4-47: LGTM! Excellent use of conditional field inclusion.

The implementation properly checks for RT/MZ availability using hasRT() and hasMZ() before including them in the representation. The top hit information is a nice addition that provides context without overwhelming the output.

src/pyOpenMS/tests/unittests/test000.py (1)

281-297: LGTM! Comprehensive test coverage for new representations.

The tests appropriately verify that:

  1. __repr__ output contains the class name and expected fields
  2. __str__ delegates to __repr__ (str_str == repr_str)
  3. Tests populate objects with data before checking representations

The pattern of checking for presence of key strings rather than exact formats is appropriate, as it allows flexibility in formatting details while ensuring essential information is included.

Also applies to: 3845-3864, 4818-4849, 4876-4896, 4966-4991

timosachsenberg and others added 3 commits December 5, 2025 08:06
- Fix __str__ to return just the formula string for backwards compatibility
  with existing code that uses str(formula) in expressions/assertions
- Fix __repr__ to not use .decode() since toString() already returns Python string
- Now str(EmpiricalFormula) returns "C6H12O6" (formula string)
- And repr(EmpiricalFormula) returns detailed format for debugging

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Update example to match actual implementation:
- most_abundant -> most_abundant_mass
- Use correct decimal formatting (.2f for mass_range, .4f for most_abundant_mass)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@poshul poshul added this to the 3.5 milestone Dec 5, 2025
@timosachsenberg
Copy link
Copy Markdown
Contributor Author

fixing tests right now

autowrap methods return Python types (str, int), not C++ types
(String, char). Removed cdef String/char declarations that caused
TypeError when accessing return values from wrapped methods.

Also fixed test assertions:
- EmpiricalFormula: __str__ now returns formula only, __repr__ returns details
- PeptideEvidence: setAABefore/setAAAfter expect bytes, not int
- PeptideHit: __repr__ now includes full evidence details

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/pyOpenMS/addons/EmpiricalFormula.pyx (1)

11-32: __repr__ formatting is clear; consider using !r for the formula field

The overall structure is good: you collect key properties, conditionally include charge, and the docstring example now matches the actual output (4-decimal mono mass, charge omitted when zero).

If you want __repr__ to be a bit more robust for “odd” formula strings (e.g., containing quotes or whitespace), you could render the formula via repr() instead of manually adding single quotes:

-        if formula_str:
-            parts.append(f"formula='{formula_str}'")
+        if formula_str:
+            parts.append(f"formula={formula_str!r}")

This keeps normal cases readable while avoiding edge cases where the embedded quotes could make the repr ambiguous.

src/pyOpenMS/addons/ResidueModification.pyx (1)

4-35: ResidueModification __repr__/__str__ are clear and match intended semantics

Logic for including id, name, mass_diff, and origin is straightforward and robust; delegating __str__ to __repr__ keeps behavior consistent. One minor improvement would be to use !r when formatting string fields (e.g. id={mod_id_str!r}) so values containing quotes or special characters are always represented unambiguously.

src/pyOpenMS/tests/unittests/test000.py (1)

5000-5069: Consider adding explicit tests for PeptideIdentificationList.__repr__/__str__

This block thoroughly exercises construction, push_back, indexing, iteration, and clear behavior of PeptideIdentificationList, but doesn’t currently assert anything about its new __repr__/__str__ implementations mentioned in the PR summary. Adding a small check that repr(pil) includes the class name and size, and that str(pil) behaves as intended, would make the tests fully cover the new binding.

As per coding guidelines for pyOpenMS tests, I can help sketch a concise repr/str test here if you’d like.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 702d89d and 3fe19e5.

📒 Files selected for processing (6)
  • src/pyOpenMS/addons/EmpiricalFormula.pyx (1 hunks)
  • src/pyOpenMS/addons/PeptideEvidence.pyx (1 hunks)
  • src/pyOpenMS/addons/PeptideHit.pyx (1 hunks)
  • src/pyOpenMS/addons/ProteinHit.pyx (1 hunks)
  • src/pyOpenMS/addons/ResidueModification.pyx (1 hunks)
  • src/pyOpenMS/tests/unittests/test000.py (18 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/pyOpenMS/addons/PeptideHit.pyx
  • src/pyOpenMS/addons/ProteinHit.pyx
  • src/pyOpenMS/addons/PeptideEvidence.pyx
🧰 Additional context used
📓 Path-based instructions (1)
src/pyOpenMS/tests/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Write pyOpenMS tests for new Python bindings

Files:

  • src/pyOpenMS/tests/unittests/test000.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: OpenMS/OpenMS PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-08-05T12:43:11.681Z
Learning: Applies to src/pyOpenMS/tests/**/*.py : Write pyOpenMS tests for new Python bindings
Learnt from: CR
Repo: OpenMS/OpenMS PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-08-05T12:43:11.681Z
Learning: Applies to src/pyOpenMS/**/*.h : Add Doxygen comments for new public methods and classes in pyOpenMS bindings
📚 Learning: 2025-08-05T12:43:11.681Z
Learnt from: CR
Repo: OpenMS/OpenMS PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-08-05T12:43:11.681Z
Learning: Applies to src/pyOpenMS/tests/**/*.py : Write pyOpenMS tests for new Python bindings

Applied to files:

  • src/pyOpenMS/tests/unittests/test000.py
📚 Learning: 2025-08-05T12:43:11.681Z
Learnt from: CR
Repo: OpenMS/OpenMS PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-08-05T12:43:11.681Z
Learning: Applies to src/pyOpenMS/**/*.h : Add Doxygen comments for new public methods and classes in pyOpenMS bindings

Applied to files:

  • src/pyOpenMS/tests/unittests/test000.py
  • src/pyOpenMS/addons/ResidueModification.pyx
📚 Learning: 2025-08-05T12:43:11.681Z
Learnt from: CR
Repo: OpenMS/OpenMS PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-08-05T12:43:11.681Z
Learning: Applies to src/tests/class_tests/openms/**/*.cpp : Write unit tests for new functionality

Applied to files:

  • src/pyOpenMS/tests/unittests/test000.py
🧬 Code graph analysis (1)
src/pyOpenMS/tests/unittests/test000.py (1)
src/openms/source/KERNEL/MSSpectrum.cpp (9)
  • setRT (585-588)
  • setRT (585-585)
  • MSSpectrum (515-515)
  • MSSpectrum (517-520)
  • MSSpectrum (522-522)
  • setMSLevel (620-623)
  • setMSLevel (620-620)
  • setName (630-633)
  • setName (630-630)
🔇 Additional comments (19)
src/pyOpenMS/addons/EmpiricalFormula.pyx (1)

4-9: __str__ implementation and docstring look consistent and idiomatic

Returning self.toString() here matches the documented intent of giving just the bare formula string, and keeps __str__ nicely focused on the “equation-friendly” view. No issues from my side.

src/pyOpenMS/tests/unittests/test000.py (18)

72-76: MetaInfoInterface test strengthens key/value round‑trip checks

The extra assertions on presence and values of b"key"/b"key2" before and after removal solidify the MetaInfoInterface contract without changing behavior.

Also applies to: 83-84


286-303: AASequence __repr__/__str__ tests exercise key fields and modified flag

The new checks validate that the representation includes the class name, sequence, length, mono mass, and the modified flag for modified vs unmodified sequences, plus str(aas) == repr(aas). This is good, focused coverage for the new bindings.


392-407: ResidueModification representation test covers the new binding minimally but sufficiently

Using ModificationsDB.getModification and asserting the prefix ResidueModification( plus str(mod) == repr(mod) is a reasonable smoke test for the new repr/str without hard‑coding UniMod‑specific content.


409-425: MobilityPeak1D __repr__/__str__ test is aligned with the new summary format

Asserting presence of MobilityPeak1D(, mobility=, and intensity=, and checking str(p) == repr(p) gives good coverage of the new representation.


452-463: IsotopeDistribution repr test validates structural summary rather than raw container

The test builds a small distribution and checks for IsotopeDistribution(, num_isotopes=, and mass_range=, with str == repr, which matches the typical summary style and avoids over‑specifying internal details.


573-583: EmpiricalFormula test clearly differentiates __str__ (formula) from __repr__ (detailed)

Checking that repr contains EmpiricalFormula(, formula=, mono_mass=, while str(ef) is just the bare formula and not equal to repr(ef) accurately encodes the intended API contract.


718-724: ChromatogramPeak representation test covers RT and intensity summary

The test ensures ChromatogramPeak( plus rt= and intensity= appear and that str(p) == repr(p), which is exactly what you want for an interactive summary.


811-827: ConsensusFeature __repr__/__str__ tests validate all important dimensions

Using a populated ConsensusFeature and checking for rt=, mz=, intensity=, and num_features= in repr (and str == repr) gives strong coverage for the new representation.


1358-1376: Feature representation test matches typical RT/mz/intensity/charge/quality summary

The test builds a realistic Feature and asserts that repr contains the main quantitative fields and that str matches repr. This is solid coverage of the new binding behavior.


3196-3215: MSExperiment repr test checks core structural fields

By creating an experiment with two spectra and asserting that repr contains MSExperiment(, num_spectra=, and num_chromatograms=, plus str == repr, the test nicely guards the high‑level summary format.


3544-3556: MSSpectrum representation test validates ms_level/rt/num_peaks reporting

The new test ensures MSSpectrum(, ms_level=, rt=, and num_peaks= appear in the repr and that str delegates to repr, which aligns with the corresponding addon implementation.


3805-3814: MSChromatogram __repr__/__str__ test focuses on peak count summary

Checking for MSChromatogram( and num_peaks= and enforcing str(chrom) == repr(chrom) is the right level of assertion for this binding.


3852-3870: MRMFeature repr test captures RT/mz/intensity and transition count

Populating an MRMFeature with RT, m/z, intensity, charge, and two transitions, then asserting presence of num_transitions= and that str == repr, provides meaningful coverage for the new human‑readable summary.


4174-4187: ReactionMonitoringTransition representation test validates core numeric fields

The test sets native ID, precursor/product m/z, peptide ref, and library intensity, then asserts that repr includes ReactionMonitoringTransition(, precursor_mz=, and product_mz=, with str matching repr. This is appropriate and stable coverage.


4825-4855: PeptideHit __repr__/__str__ tests cover both basic and evidence‑rich cases

You first assert presence of score=, sequence=, and charge= in the base repr, then add a PeptideEvidence and check that evidences= and nested PeptideEvidence( with the accession show up, plus str == repr. This thoroughly exercises the new representation.


4883-4902: PeptideEvidence representation test validates all positional and flanking info

The repr checks for PeptideEvidence(, protein=, start=, end=, aa_before=, and aa_after=, and ensures str matches, which directly reflects the added binding behavior.


4973-4997: PeptideIdentification repr test asserts summary of RT/mz/score type and top hit

By setting RT, m/z, score type, and a single top hit, then checking for rt=, mz=, score_type=, num_hits=, top_hit=, and the peptide sequence in repr, plus str == repr, you get excellent coverage of the new textual summary.


5306-5324: ProteinHit representation test verifies accession/score/coverage in repr

The test builds a realistic ProteinHit, then asserts that repr contains ProteinHit(, accession=, the accession string, score=, and coverage=, and that str equals repr. This nicely validates the new string representation.

@timosachsenberg timosachsenberg enabled auto-merge (squash) December 5, 2025 13:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants