Skip to content

Unified global FLR (decoy-AA method) across AScore / PhosphoRS / LucXor for same-FDR/same-FLR comparison #40

@timosachsenberg

Description

@timosachsenberg

Goal

Compute one common global False-Localization-Rate (FLR) for all three localization
algorithms (AScore, PhosphoRS, LucXor) so they can be benchmarked at the same PSM-FDR and
the same FLR level
(e.g. "at 5 % global FLR, how many STY sites does each tool recover?").

TL;DR feasibility

Possible in principle — not deliverable by the code as-is. The decoy-amino-acid method
(Alanine = PhosphoDecoy) is the correct and only common basis, and it is already the
method we partially implement. But a correct, comparable FLR is currently blocked by two
per-tool scoring bugs, a merge that can align different peptides, and the fact that the only
existing decoy-FLR estimator (snippets.py) is statistically invalid (~3.7× optimistic on
our data). The fixes are localized, not architectural.

Background: there are two different "FLR"s in this repo — don't conflate them

Decoy-AA counting FLR LucXor native model FLR (onsite/lucxor/flr.py)
Basis Count wins on a residue that can't be phosphorylated (A) KDE of delta-scores: REAL vs decoy-peptide permutations
Scope Tool-agnostic — definable identically for all 3 LucXor only
Granularity Per-site or per-PSM Per-PSM (delta_score)

Only the decoy-AA FLR is common to all three, so it is the only valid axis for a 3-way
comparison. LucXor's native Luciphor_global_flr is a different estimator and must not be
mixed in — and note that with --add-decoys, the PhosphoDecoy(A) wins are scored on the
REAL/target side of LucXor's KDE and contribute nothing to its native decoy density,
so Luciphor_global_flr is not a decoy-AA FLR.

Method grounding (Ramsbottom et al. 2022)

Ramsbottom, K. A. et al. "Method for Independent Estimation of the False Localization Rate
for Phosphoproteomics."
J. Proteome Res. 2022. DOI
10.1021/acs.jproteome.1c00827 ·
open access PMC9251759. This is the
source of the PhosphoDecoy strategy. Equations quoted verbatim from the full text:

  • Decoy-AA global FLR (Eq. 2): pX_FLR_n = (2 × (T_c/X_c) × Σ₁ⁿ(p_X_c)) / n
    • T_c = total STY (target) residue count in the scored-phosphosite PSM set
    • X_c = total decoy-AA (A) residue count in that set
    • p_X_c = cumulative phospho-on-decoy count down the ranked list
    • T_c/X_c normalizes for decoy positions being scarcer; ×2 models wrong assignments
      landing in both pools.
  • Global model FLR (Eq. 1): Global model FLR = Σ₁ⁿ (P^PSM + P^PTM) / n — sums
    PSM-identification error and localization error (union-bound), i.e. the principled
    "same PSM-FDR and FLR" estimator.

Method rules the comparison must follow: rank sites by the tool's localization confidence
(not by site count); collapse redundant sites (max prob per site, bin to 2 dp, tie-break by
PSM count); threshold at 5 %, not 1 % (the paper states 1 % estimates are not robust);
decoy AA = Ala or Leu (we use A ✓).

Empirical code audit — per-tool status

Tool A competes in scoring? A-win annotated in idXML? Ranking score (granularity) Status
PhosphoRS ✅ verified end-to-end (phosphors.py:1012,1167) (PhosphoDecoy) + phospho_decoy_count (cli.py:531,647) PhosphoRS_site_probs (per-site, ↑=better) Ready
AScore ⚠️ single-site only ✅ when not dropped by the bug AScore_N (per-site, ↑=better) Needs fix
LucXor ❌ A scored as unmodified backbone ❌ A-wins can't be serialized delta_score (per-PSM only) Needs fix

Blocking bug 1 — AScore drops decoy-A on multi-phospho peptides

getSites_ appends A-sites after S/T/Y without re-sorting (onsite/ascore/ascore.py:483-491),
so combinations() emits unsorted index combos (e.g. [4,1]); createTheoreticalSpectra_
(ascore.py:641-659) walks positions ascending and silently drops any out-of-order
modification. Verified: createTheoreticalSpectra_([[4,1]],'SAAAYK')SAAAY(Phospho)K
(the A is lost) instead of SA(PhosphoDecoy)AAY(Phospho)K. Net effect: A is systematically
disadvantaged on multi-site peptides → decoy-AA FLR biased low.
Fix: sort sites in getSites_ (or sort each permutation before spectrum build).
Secondary: add_losses is hard-coded false (ascore.py:37) so −H3PO4 losses are never
generated for STY or A; score scale is polluted by sentinels (1000.0/0.0/-1.0) a
score-sweep must exclude.

Blocking bug 2 — LucXor scores A as unmodified backbone and can't write A-wins

Production scoring _get_mod_positions_from_perm (onsite/lucxor/psm.py:1115) only
recognizes lowercase s/t/y, not a, so an A-localization hypothesis is scored as the
unmodified backbone (no +79.966 on any fragment) → A almost always loses → decoy-AA FLR
biased low / falsely optimistic
. The output path convert_sequence_to_standard_format
(psm.py:1651) has no case for a, so an A-win produces a bare-a string that fails
AASequence.fromString; the except (cli.py:948-951) silently keeps the original input
sequence — A-wins are never written as A(PhosphoDecoy) and can't be counted downstream.
(A separate path, _to_pyopenms_format peptide.py:377-380, does handle a correctly and
is used for KDE model-building — "the code handles A" is false comfort; the wrong path is
wired into scoring.) Mass/NL config is correct (79.966331 + H3PO4).
Fix (localized): teach _get_mod_positions_from_perm, _get_mod_map (psm.py:996) and
convert_sequence_to_standard_format to treat a as PhosphoDecoy.

Merge / "same PSM set" — not guaranteed

All three tools read the same input idXML (onsite/onsitec.py:149-192), but
merge_algorithm_results zips outputs by raw positional index with only a length warning
(onsitec.py:259-283); LucXor (keepNBestHits, spectrum-miss cli.py:740, min-PSM abort
cli.py:846) and AScore (error-drops cli.py:167-177) drop/reorder PSMs, so a single
mid-list drop silently aligns hits from different peptides. Even bypassing the merge, the
three tools retain different PSM subsets, so a fair comparison must be on the
intersection of PSMs all three scored, with per-tool drop counts reported as a result.

snippets.py decoy-FLR is statistically invalid

  • Not score-ranked — accumulates by site count (snippets.py:971-1023), so the curve
    is an artifact of ordering, not confidence.
  • UnnormalizedFLR = PhosphoDecoy/(Phospho+PhosphoDecoy) (snippets.py:982,1570)
    omits both T_c/X_c and ×2. On our data T_c/X_c ≈ 1.83, so it underestimates the true
    FLR by ≈ 2 × 1.83 ≈ 3.7× (a "1 % FLR" read off it is really ~3–4 %).
  • Cosmetic — sigmoid surge-smoothing, double Gaussian filtering, hard extension to 0.1
    (snippets.py:1075-1347). It is untracked, has hard-coded Windows paths, and is never
    wired into onsitec.

Proposed work

  • AScore: sort sites in getSites_ so multi-phospho decoy-A permutations aren't dropped (ascore.py:483-491/641-659).
  • LucXor: handle lowercase a as PhosphoDecoy in the scoring + output paths (psm.py:1115, 996, 1651; cli.py:948-951).
  • Merge: join by spectrum_reference+sequence, not positional index; compare on the PSM intersection and report per-tool drop counts (onsitec.py:259-283).
  • FLR module: replace snippets.py with a single downstream estimator implementing Eq. 2 (apply 2·T_c/X_c), ranking each tool by its own localization confidence, collapsing sites per the paper, thresholding at 5 %.
  • Granularity decision (needs a maintainer call): site-level FLR (paper-faithful; requires adding a per-site score to LucXor, which only has per-PSM delta_score) vs PSM-level FLR for all three (works after the bug fixes; coarser).
  • Keep LucXor's native FLR separate; optionally use it as a cross-check of the decoy-AA FLR.
  • (Optional) implement Eq. 1 (P^PSM + P^PTM) using the search-engine PEP for P^PSM and each tool's localization probability for P^PTM.

Acceptance

A single command produces, for AScore / PhosphoRS / LucXor on the same q-value-filtered PSM
intersection, a normalized decoy-AA global-FLR curve (sites recovered vs FLR) and the site
count each tool reports at 5 % global FLR, with per-tool PSM drop counts.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingenhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions