Skip to content

feat(diann): Enable DIANN 2.0 for DIANNtoMSstatsPTMFormat#125

Merged
tonywu1999 merged 5 commits intodevelfrom
fix-diann
Feb 26, 2026
Merged

feat(diann): Enable DIANN 2.0 for DIANNtoMSstatsPTMFormat#125
tonywu1999 merged 5 commits intodevelfrom
fix-diann

Conversation

@tonywu1999
Copy link
Contributor

@tonywu1999 tonywu1999 commented Feb 18, 2026

Motivation and Context

This PR enables DIANN 2.0 support in the DIANN PTM converter by adding a flexible quantificationColumn parameter to handle differing DIANN output formats across versions (1.8.x, 1.9.x, 2.x). It lets DIANNtoMSstatsPTMFormat accept DIANN 2.0 fragment-intensity parquet inputs (per-fragment columns) via an "auto" mode while preserving existing behavior for earlier DIANN versions.

Summary of the solution

Introduce a new public argument quantificationColumn (default "FragmentQuantCorrected") to DIANNtoMSstatsPTMFormat, propagate it into the underlying MSstatsConvert::DIANNtoMSstatsFormat calls for both PTM and optional PROTEIN paths, update documentation and examples to demonstrate DIANN 2.0 usage (including arrow::read_parquet), and update package metadata to suggest arrow and require a newer MSstatsConvert.

Detailed changes

  • R/converters.R

    • Function signature: DIANNtoMSstatsPTMFormat(...) gains quantificationColumn = "FragmentQuantCorrected".
    • Propagate quantificationColumn into DIANNtoMSstatsFormat(...) calls for both PTM and PROTEIN branches (passed as named argument).
    • Replaced positional DIANNtoMSstatsFormat calls with named-argument calls for clarity and to ensure correct binding of the new parameter.
    • Maintain existing input processing and PTM site localization logic; only wiring for quantificationColumn added.
    • Examples: added a DIANN 2.0 example showing reading parquet via arrow::read_parquet and using quantificationColumn = "auto".
  • man/DIANNtoMSstatsPTMFormat.Rd

    • Updated Rd to document new quantificationColumn argument and its supported values:
      • "FragmentQuantCorrected" (default) — DIANN 1.8.x
      • "FragmentQuantRaw" — DIANN 1.9.x
      • "auto" — DIANN 2.x (per-fragment columns like Fr0Quantity)
    • Improved/clarified multiple argument descriptions and included the DIANN 2.0 usage example.
  • DESCRIPTION

    • MSstatsConvert import version bumped to (>= 1.19.1).
    • arrow added to Suggests (to support DIANN 2.0 parquet examples).
  • Test data / examples

    • Added DIANN 2.0 example data files under inst/tinytest/raw_data/DIANN:
      • diann_2_ptm.parquet
      • annotation_diann_2.0_ptm.csv
    • Examples in converters.R demonstrate usage with these files.

Unit tests added or modified

  • Test data files for DIANN 2.0 were included (inst/tinytest/raw_data/DIANN/diann_2_ptm.parquet and annotation_diann_2.0_ptm.csv).
  • Existing test suites remain present (inst/tinytest/test_converters.R and other tinytest files), but no new explicit unit test cases were added that assert the behavior of quantificationColumn (e.g., quantificationColumn = "auto" parsing or DIANN 2.0-specific code paths). This leaves a coverage gap for the new functionality.

Coding guidelines (violations or notes)

  • No coding guideline violations detected in the modified files:
    • Named-argument usage improves readability and reduces risk of misbinding.
    • Roxygen documentation updated to reflect API changes.
  • Note: Although examples and test data for DIANN 2.0 were added, automated tests validating the new quantificationColumn behavior were not added — consider adding targeted unit tests to verify "auto" behavior and parquet handling (requires arrow in test environment).

@coderabbitai
Copy link

coderabbitai bot commented Feb 18, 2026

📝 Walkthrough

Walkthrough

Adds a new public parameter quantificationColumn (default "FragmentQuantCorrected") to DIANNtoMSstatsPTMFormat, threads it into both PTM and protein conversion calls to MSstatsConvert::DIANNtoMSstatsFormat via named arguments, updates the function documentation and examples, and adds arrow to Suggests in DESCRIPTION.

Changes

Cohort / File(s) Summary
Converters Implementation
R/converters.R
Added quantificationColumn = "FragmentQuantCorrected" to DIANNtoMSstatsPTMFormat signature; replaced positional calls with named-argument calls to MSstatsConvert::DIANNtoMSstatsFormat and passed quantificationColumn through both PTM and protein conversion branches.
Documentation
man/DIANNtoMSstatsPTMFormat.Rd
Documented new quantificationColumn parameter, revised several parameter descriptions for clarity, and expanded Examples with DIANN 2.0 usage demonstrating quantificationColumn (including "auto" example).
Package Metadata
DESCRIPTION
Added arrow to Suggests and tightened MSstatsConvert import version to >= 1.19.1.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 I nibbled code beneath moonlight pale,
A column named that tells the tale,
Args aligned like rows of peas,
PTM and proteins pass with ease,
Hop, hum, convert — a happy trail!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Description check ⚠️ Warning No pull request description was provided by the author; all required template sections (Motivation and Context, Changes, Testing, Checklist) are missing. Add a detailed description following the template: explain motivation for DIANN 2.0 support, list all changes including the quantificationColumn parameter, document testing performed, and complete the pre-review checklist.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: adding DIANN 2.0 support to DIANNtoMSstatsPTMFormat via the new quantificationColumn parameter.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix-diann

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@tonywu1999 tonywu1999 changed the title Fix diann feat(diann): Enable DIANN 2.0 for MSstatsPTM converter Feb 18, 2026
@tonywu1999 tonywu1999 changed the title feat(diann): Enable DIANN 2.0 for MSstatsPTM converter feat(diann): Enable DIANN 2.0 for DIANNtoMSstatsPTMFormat Feb 18, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@man/DIANNtoMSstatsPTMFormat.Rd`:
- Around line 62-66: The sentence "Run should be the same as filename." is
misplaced in the pg_qvalue_cutoff help text; remove that sentence from the
pg_qvalue_cutoff description in man/DIANNtoMSstatsPTMFormat.Rd and instead add
it to the annotation parameter documentation (either by fixing upstream in
MSstatsConvert::DIANNtoMSstatsFormat or by adding a local `@param` annotation
override in R/converters.R). Update the documentation for the annotation
parameter to include "Run should be the same as filename." and ensure
pg_qvalue_cutoff retains only its q-value cutoff description (function/parameter
references: pg_qvalue_cutoff, annotation, MSstatsConvert::DIANNtoMSstatsFormat,
R/converters.R).
- Line 68: The documentation typo in the `useUniquePeptide` description
("pepties") must be corrected; either fix it upstream in
MSstatsConvert::DIANNtoMSstatsFormat or override it locally by adding a `@param`
useUniquePeptide docstring in R/converters.R that replaces the inherited text
with the correct phrase (e.g. "Should unique peptides be removed?"); update the
`DIANNtoMSstatsPTMFormat` documentation by ensuring the corrected `@param` is
present so the generated Rd shows "peptides" instead of "pepties".

In `@R/converters.R`:
- Around line 50-70: Add arrow to DESCRIPTION under Suggests and guard the
example block that calls arrow::read_parquet so R CMD check won’t warn when
arrow isn’t installed; wrap the DIANN 2.0 example (the calls around
arrow::read_parquet and related example code that demonstrates
DIANNtoMSstatsPTMFormat) in an if (requireNamespace("arrow", quietly = TRUE)) {
... } conditional (or use requireNamespace check before calling
arrow::read_parquet) so the example is skipped when arrow is absent, and update
DESCRIPTION to include arrow in Suggests.
- Line 88: The package uses the quantificationColumn argument when calling
MSstatsConvert::DIANNtoMSstatsFormat (e.g., in functions referencing
quantificationColumn) which requires MSstatsConvert >= 1.19.1; update the
DESCRIPTION's Imports field to include MSstatsConvert (>= 1.19.1) so
installations pull a compatible version and avoid runtime failures when
DIANNtoMSstatsFormat is invoked. Ensure the package namespace and any NAMESPACE
imports remain correct after updating DESCRIPTION.

---

Duplicate comments:
In `@man/DIANNtoMSstatsPTMFormat.Rd`:
- Around line 127-147: The example for DIANNtoMSstatsPTMFormat includes an
unguarded call to arrow::read_parquet() which will break R CMD check if arrow
isn't installed; modify the converter that generates the roxygen example for the
DIANNtoMSstatsPTMFormat documentation so that the example block is wrapped in
\dontrun{} (or conditionally checks for requireNamespace("arrow", quietly=TRUE))
and emits the guarded code (e.g., wrap the arrow::read_parquet() and downstream
example calls including DIANNtoMSstatsPTMFormat and head(msstatsptm_format$PTM)
in the dontrun or conditional) so the auto-generated man page no longer contains
unguarded arrow calls.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
R/converters.R (1)

50-70: ⚠️ Potential issue | 🟠 Major

DIANN 2.0 example still missing \dontrun{} guard — will fail R CMD check without arrow installed.

arrow is correctly listed in Suggests (DESCRIPTION), but Suggests packages are not guaranteed to be available during R CMD check. The example at Line 53 calls arrow::read_parquet() unconditionally, which will produce an error when arrow is absent.

Wrap the DIANN 2.0 block in \dontrun{}:

📄 Proposed fix
+#' \dontrun{
 #' # Example DIANN 2.0
 #' input = system.file("tinytest/raw_data/DIANN/diann_2_ptm.parquet", 
 #'                                         package = "MSstatsPTM")
 #' input = arrow::read_parquet(input)
 #' annot = system.file("tinytest/raw_data/DIANN/annotation_diann_2.0_ptm.csv", 
 #'                                         package = "MSstatsPTM")
 #' annot = data.table::fread(annot)
 #' fasta_path = system.file("extdata", "diann.fasta", 
 #'                        package="MSstatsPTM")
 #' 
 #' msstatsptm_format = DIANNtoMSstatsPTMFormat(
 #'     input, 
 #'     annot, 
 #'     protein_id_col = "Protein.Names", 
 #'     fasta_path = fasta_path, 
 #'     fasta_protein_name = "entry_name", 
 #'     use_log_file = FALSE,
 #'     quantificationColumn = "auto"
 #' )
 #' 
 #' head(msstatsptm_format$PTM)
+#' }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/converters.R` around lines 50 - 70, The example DIANN 2.0 block calls
arrow::read_parquet() unconditionally and will fail R CMD check if arrow is not
installed; wrap the entire example block (the lines showing usage of
DIANNtoMSstatsPTMFormat, arrow::read_parquet, system.file calls and
head(msstatsptm_format$PTM)) inside a \dontrun{} section so it is skipped during
checks when Suggested packages are unavailable, ensuring the example remains but
is not executed during R CMD check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@R/converters.R`:
- Around line 50-70: The example DIANN 2.0 block calls arrow::read_parquet()
unconditionally and will fail R CMD check if arrow is not installed; wrap the
entire example block (the lines showing usage of DIANNtoMSstatsPTMFormat,
arrow::read_parquet, system.file calls and head(msstatsptm_format$PTM)) inside a
\dontrun{} section so it is skipped during checks when Suggested packages are
unavailable, ensuring the example remains but is not executed during R CMD
check.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d95980c and 4c2f863.

⛔ Files ignored due to path filters (2)
  • inst/tinytest/raw_data/DIANN/annotation_diann_2.0_ptm.csv is excluded by !**/*.csv
  • inst/tinytest/raw_data/DIANN/diann_2_ptm.parquet is excluded by !**/*.parquet
📒 Files selected for processing (3)
  • DESCRIPTION
  • R/converters.R
  • man/DIANNtoMSstatsPTMFormat.Rd

@tonywu1999 tonywu1999 merged commit 0f49cc2 into devel Feb 26, 2026
2 checks passed
@tonywu1999 tonywu1999 deleted the fix-diann branch February 26, 2026 02:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant