Implement alignment-adjusted PEP recalculation and re-ranking by Copilot · Pull Request #8 · singjc/pyprophet

Copilot · 2025-10-31T03:56:17Z

Building on PRs #1 and #7 which integrated alignment data into exports, this PR computes adjusted posterior error probabilities (PEPs) that combine MS2 and alignment evidence, then re-ranks peak groups and computes new model-based FDR.

Changes

Core Implementation

Added compute_adjusted_pep_and_rerank() in pyprophet/io/util.py
- Computes pep_adj = 1 - (1 - pep_ms2) × (1 - pep_align) for aligned features
- Reference features (alignment anchors) maintain MS2 PEP unchanged to avoid double-counting
- Re-ranks within each (run_id, transition_group_id) by adjusted PEP
- Computes new qvalues via compute_model_fdr on top-1 features per group

Integration

Called automatically in all readers (OSW, Parquet, SplitParquet)
- Executes after data augmentation, before export
- Gracefully skips when no alignment data present

Output Schema

Preserves MS2-only results: m_score → ms2_m_score, peak_group_rank → ms2_peak_group_rank
New columns:
- ms2_aligned_adj_pep: Combined PEP from MS2 and alignment
- m_score: Model-based qvalues from adjusted PEPs (replaces old m_score)
- peak_group_rank: New ranking based on adjusted PEPs (replaces old ranking)

Example

# Reference feature (alignment anchor)
pep_ms2 = 0.02
alignment_pep = N/A  # Reference doesn't align to itself
→ ms2_aligned_adj_pep = 0.02  # Unchanged

# Aligned feature with weak MS2 but good alignment
pep_ms2 = 0.90
alignment_pep = 0.06
→ ms2_aligned_adj_pep = 1 - (1-0.90)×(1-0.06) = 0.906

Documentation

Updated ALIGNMENT_INTEGRATION_WORKFLOW.md with detailed process overview, formula interpretation, and output column descriptions

Bug Fix

Fixed incorrect return statement placement in SplitParquetReader.read() that prevented proper cleanup

Original prompt

From PR's #1 and #7, we've endabled the incorproation and export of alignment results if they're present, when exporting regular standard OpenSwath results. Now the next step is to use the alignment results to perform a recalulcation of the PEPs, qvalue and re-ranking of peak-groups.

What to do with the reference feature’s missing alignment_pep?

Treat it as neutral: set alignment_pep = 1.0.
Reasoning:

The reference is the anchor; the alignment statistic is defined relative to it, so it doesn’t provide extra independent evidence for itself.

Setting it to 0 (perfect support) or to pep_ms2 would double-count evidence and bias the anchor downward (over-confident).

With alignment_pep = 1, the combined posterior stays pep_adj = pep_ms2 for the reference row, which is exactly what we want.

(If you want a reporting column that summarizes group support, you can propagate the group-best adjusted score to all siblings for display, but keep the ranking/FDR based on each row’s own pep_adj and use only top-1 per (run, precursor) for FDR.)

Can you reuse compute_model_fdr on pep_adj?

Yes. That function expects a vector of PEPs and returns model-based q-values (the mean PEP up to each rank). Use it like this:

Compute pep_adj = 1 - (1 - pep_ms2)·(1 - pep_align) (clip to (ε, 1−ε) as usual).

Within each (run_id, FullPeptideName, Charge) (or your key), keep top-1 = argmin(pep_adj).

Pass that top-1 vector of pep_adj to compute_model_fdr. The result is your new m_score_adj (q-values) for those winners.

If desired, join m_score_adj back to all rows and/or propagate the group-best for display.

That’s fully consistent with PyProphet’s semantics: you combined calibrated posteriors first, then computed model-based FDR on the winners. If you instead want decoy-based q-values, you’d refit the usual target/decoy q from the new score order—but for model-based FDR your compute_model_fdr is exactly the right tool.

Tiny implementation detail

For the reference row (your feature id 5405272318039692409), set alignment_pep=1.0 ⇒ pep_adj = pep_ms2 = 0.02.

For the three alignment-only rows (pep_ms2=1), pep_adj = alignment_pep (0.06, 0.14, 0.08).

Re-rank within each run/precursor by pep_adj (or -log10(pep_adj)); if you want MS2 to win on exact ties, break ties with from_alignment (MS2 preferred).

That’s it—you’ll get calibrated, conservative q-values that let alignment-rescued peaks rise when the alignment evidence is strong, while keeping the anchor unbiased.

So the end result of what we would want basically, is the old m_score, and peak_group_rank would be kept in the output tsv and renamed to ms2_m_score, and providence_peak_group_rank. We would then add the add the pep_adj (which we can name ms2_aligned_adj_pep), and m_score (which would represent the new qvalue from the new adjusted PEPs).

💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

coderabbitai · 2025-10-31T03:56:31Z

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Co-authored-by: singjc <32938975+singjc@users.noreply.github.com>

Initial plan

777af66

Copilot AI assigned Copilot and singjc Oct 31, 2025

Copilot started work on behalf of singjc October 31, 2025 03:56 View session

Copilot AI and others added 3 commits October 31, 2025 04:05

Add alignment-adjusted PEP calculation and re-ranking functionality

8617636

Co-authored-by: singjc <32938975+singjc@users.noreply.github.com>

Fix alignment PEP adjustment logic for reference features

a365ab2

Co-authored-by: singjc <32938975+singjc@users.noreply.github.com>

Add documentation for alignment-adjusted PEP calculation

501b748

Co-authored-by: singjc <32938975+singjc@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Update PEP recalculation using alignment results~~ Implement alignment-adjusted PEP recalculation and re-ranking Oct 31, 2025

Copilot AI requested a review from singjc October 31, 2025 04:19

Copilot finished work on behalf of singjc October 31, 2025 04:19

Copilot AI mentioned this pull request Oct 31, 2025

Add alignment-integration command for adjusted PEP/q-value computation on full feature set #9

Draft

8 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement alignment-adjusted PEP recalculation and re-ranking#8

Implement alignment-adjusted PEP recalculation and re-ranking#8
Copilot wants to merge 4 commits intomasterfrom
copilot/update-pep-recalculation

Copilot AI commented Oct 31, 2025 •

edited

Loading

Uh oh!

coderabbitai bot commented Oct 31, 2025

Review skipped

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Copilot AI commented Oct 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Core Implementation

Integration

Output Schema

Example

Documentation

Bug Fix

Uh oh!

coderabbitai bot commented Oct 31, 2025

Review skipped

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Copilot AI commented Oct 31, 2025 •

edited

Loading