Goal
Compute one common global False-Localization-Rate (FLR) for all three localization
algorithms (AScore, PhosphoRS, LucXor) so they can be benchmarked at the same PSM-FDR and
the same FLR level (e.g. "at 5 % global FLR, how many STY sites does each tool recover?").
TL;DR feasibility
Possible in principle — not deliverable by the code as-is. The decoy-amino-acid method
(Alanine = PhosphoDecoy) is the correct and only common basis, and it is already the
method we partially implement. But a correct, comparable FLR is currently blocked by two
per-tool scoring bugs, a merge that can align different peptides, and the fact that the only
existing decoy-FLR estimator (snippets.py) is statistically invalid (~3.7× optimistic on
our data). The fixes are localized, not architectural.
Background: there are two different "FLR"s in this repo — don't conflate them
|
Decoy-AA counting FLR |
LucXor native model FLR (onsite/lucxor/flr.py) |
| Basis |
Count wins on a residue that can't be phosphorylated (A) |
KDE of delta-scores: REAL vs decoy-peptide permutations |
| Scope |
Tool-agnostic — definable identically for all 3 |
LucXor only |
| Granularity |
Per-site or per-PSM |
Per-PSM (delta_score) |
Only the decoy-AA FLR is common to all three, so it is the only valid axis for a 3-way
comparison. LucXor's native Luciphor_global_flr is a different estimator and must not be
mixed in — and note that with --add-decoys, the PhosphoDecoy(A) wins are scored on the
REAL/target side of LucXor's KDE and contribute nothing to its native decoy density,
so Luciphor_global_flr is not a decoy-AA FLR.
Method grounding (Ramsbottom et al. 2022)
Ramsbottom, K. A. et al. "Method for Independent Estimation of the False Localization Rate
for Phosphoproteomics." J. Proteome Res. 2022. DOI
10.1021/acs.jproteome.1c00827 ·
open access PMC9251759. This is the
source of the PhosphoDecoy strategy. Equations quoted verbatim from the full text:
- Decoy-AA global FLR (Eq. 2):
pX_FLR_n = (2 × (T_c/X_c) × Σ₁ⁿ(p_X_c)) / n
T_c = total STY (target) residue count in the scored-phosphosite PSM set
X_c = total decoy-AA (A) residue count in that set
p_X_c = cumulative phospho-on-decoy count down the ranked list
T_c/X_c normalizes for decoy positions being scarcer; ×2 models wrong assignments
landing in both pools.
- Global model FLR (Eq. 1):
Global model FLR = Σ₁ⁿ (P^PSM + P^PTM) / n — sums
PSM-identification error and localization error (union-bound), i.e. the principled
"same PSM-FDR and FLR" estimator.
Method rules the comparison must follow: rank sites by the tool's localization confidence
(not by site count); collapse redundant sites (max prob per site, bin to 2 dp, tie-break by
PSM count); threshold at 5 %, not 1 % (the paper states 1 % estimates are not robust);
decoy AA = Ala or Leu (we use A ✓).
Empirical code audit — per-tool status
| Tool |
A competes in scoring? |
A-win annotated in idXML? |
Ranking score (granularity) |
Status |
| PhosphoRS |
✅ verified end-to-end (phosphors.py:1012,1167) |
✅ (PhosphoDecoy) + phospho_decoy_count (cli.py:531,647) |
PhosphoRS_site_probs (per-site, ↑=better) |
Ready |
| AScore |
⚠️ single-site only |
✅ when not dropped by the bug |
AScore_N (per-site, ↑=better) |
Needs fix |
| LucXor |
❌ A scored as unmodified backbone |
❌ A-wins can't be serialized |
delta_score (per-PSM only) |
Needs fix |
Blocking bug 1 — AScore drops decoy-A on multi-phospho peptides
getSites_ appends A-sites after S/T/Y without re-sorting (onsite/ascore/ascore.py:483-491),
so combinations() emits unsorted index combos (e.g. [4,1]); createTheoreticalSpectra_
(ascore.py:641-659) walks positions ascending and silently drops any out-of-order
modification. Verified: createTheoreticalSpectra_([[4,1]],'SAAAYK') → SAAAY(Phospho)K
(the A is lost) instead of SA(PhosphoDecoy)AAY(Phospho)K. Net effect: A is systematically
disadvantaged on multi-site peptides → decoy-AA FLR biased low.
Fix: sort sites in getSites_ (or sort each permutation before spectrum build).
Secondary: add_losses is hard-coded false (ascore.py:37) so −H3PO4 losses are never
generated for STY or A; score scale is polluted by sentinels (1000.0/0.0/-1.0) a
score-sweep must exclude.
Blocking bug 2 — LucXor scores A as unmodified backbone and can't write A-wins
Production scoring _get_mod_positions_from_perm (onsite/lucxor/psm.py:1115) only
recognizes lowercase s/t/y, not a, so an A-localization hypothesis is scored as the
unmodified backbone (no +79.966 on any fragment) → A almost always loses → decoy-AA FLR
biased low / falsely optimistic. The output path convert_sequence_to_standard_format
(psm.py:1651) has no case for a, so an A-win produces a bare-a string that fails
AASequence.fromString; the except (cli.py:948-951) silently keeps the original input
sequence — A-wins are never written as A(PhosphoDecoy) and can't be counted downstream.
(A separate path, _to_pyopenms_format peptide.py:377-380, does handle a correctly and
is used for KDE model-building — "the code handles A" is false comfort; the wrong path is
wired into scoring.) Mass/NL config is correct (79.966331 + H3PO4).
Fix (localized): teach _get_mod_positions_from_perm, _get_mod_map (psm.py:996) and
convert_sequence_to_standard_format to treat a as PhosphoDecoy.
Merge / "same PSM set" — not guaranteed
All three tools read the same input idXML (onsite/onsitec.py:149-192), but
merge_algorithm_results zips outputs by raw positional index with only a length warning
(onsitec.py:259-283); LucXor (keepNBestHits, spectrum-miss cli.py:740, min-PSM abort
cli.py:846) and AScore (error-drops cli.py:167-177) drop/reorder PSMs, so a single
mid-list drop silently aligns hits from different peptides. Even bypassing the merge, the
three tools retain different PSM subsets, so a fair comparison must be on the
intersection of PSMs all three scored, with per-tool drop counts reported as a result.
snippets.py decoy-FLR is statistically invalid
- Not score-ranked — accumulates by site count (
snippets.py:971-1023), so the curve
is an artifact of ordering, not confidence.
- Unnormalized —
FLR = PhosphoDecoy/(Phospho+PhosphoDecoy) (snippets.py:982,1570)
omits both T_c/X_c and ×2. On our data T_c/X_c ≈ 1.83, so it underestimates the true
FLR by ≈ 2 × 1.83 ≈ 3.7× (a "1 % FLR" read off it is really ~3–4 %).
- Cosmetic — sigmoid surge-smoothing, double Gaussian filtering, hard extension to 0.1
(snippets.py:1075-1347). It is untracked, has hard-coded Windows paths, and is never
wired into onsitec.
Proposed work
Acceptance
A single command produces, for AScore / PhosphoRS / LucXor on the same q-value-filtered PSM
intersection, a normalized decoy-AA global-FLR curve (sites recovered vs FLR) and the site
count each tool reports at 5 % global FLR, with per-tool PSM drop counts.
Goal
Compute one common global False-Localization-Rate (FLR) for all three localization
algorithms (AScore, PhosphoRS, LucXor) so they can be benchmarked at the same PSM-FDR and
the same FLR level (e.g. "at 5 % global FLR, how many STY sites does each tool recover?").
TL;DR feasibility
Possible in principle — not deliverable by the code as-is. The decoy-amino-acid method
(Alanine =
PhosphoDecoy) is the correct and only common basis, and it is already themethod we partially implement. But a correct, comparable FLR is currently blocked by two
per-tool scoring bugs, a merge that can align different peptides, and the fact that the only
existing decoy-FLR estimator (
snippets.py) is statistically invalid (~3.7× optimistic onour data). The fixes are localized, not architectural.
Background: there are two different "FLR"s in this repo — don't conflate them
onsite/lucxor/flr.py)delta_score)Only the decoy-AA FLR is common to all three, so it is the only valid axis for a 3-way
comparison. LucXor's native
Luciphor_global_flris a different estimator and must not bemixed in — and note that with
--add-decoys, thePhosphoDecoy(A)wins are scored on theREAL/target side of LucXor's KDE and contribute nothing to its native decoy density,
so
Luciphor_global_flris not a decoy-AA FLR.Method grounding (Ramsbottom et al. 2022)
Ramsbottom, K. A. et al. "Method for Independent Estimation of the False Localization Rate
for Phosphoproteomics." J. Proteome Res. 2022. DOI
10.1021/acs.jproteome.1c00827 ·
open access PMC9251759. This is the
source of the
PhosphoDecoystrategy. Equations quoted verbatim from the full text:pX_FLR_n = (2 × (T_c/X_c) × Σ₁ⁿ(p_X_c)) / nT_c= total STY (target) residue count in the scored-phosphosite PSM setX_c= total decoy-AA (A) residue count in that setp_X_c= cumulative phospho-on-decoy count down the ranked listT_c/X_cnormalizes for decoy positions being scarcer;×2models wrong assignmentslanding in both pools.
Global model FLR = Σ₁ⁿ (P^PSM + P^PTM) / n— sumsPSM-identification error and localization error (union-bound), i.e. the principled
"same PSM-FDR and FLR" estimator.
Method rules the comparison must follow: rank sites by the tool's localization confidence
(not by site count); collapse redundant sites (max prob per site, bin to 2 dp, tie-break by
PSM count); threshold at 5 %, not 1 % (the paper states 1 % estimates are not robust);
decoy AA = Ala or Leu (we use A ✓).
Empirical code audit — per-tool status
phosphors.py:1012,1167)(PhosphoDecoy)+phospho_decoy_count(cli.py:531,647)PhosphoRS_site_probs(per-site, ↑=better)AScore_N(per-site, ↑=better)delta_score(per-PSM only)Blocking bug 1 — AScore drops decoy-A on multi-phospho peptides
getSites_appends A-sites after S/T/Y without re-sorting (onsite/ascore/ascore.py:483-491),so
combinations()emits unsorted index combos (e.g.[4,1]);createTheoreticalSpectra_(
ascore.py:641-659) walks positions ascending and silently drops any out-of-ordermodification. Verified:
createTheoreticalSpectra_([[4,1]],'SAAAYK')→SAAAY(Phospho)K(the A is lost) instead of
SA(PhosphoDecoy)AAY(Phospho)K. Net effect: A is systematicallydisadvantaged on multi-site peptides → decoy-AA FLR biased low.
Fix: sort sites in
getSites_(or sort each permutation before spectrum build).Secondary:
add_lossesis hard-codedfalse(ascore.py:37) so −H3PO4 losses are nevergenerated for STY or A; score scale is polluted by sentinels (
1000.0/0.0/-1.0) ascore-sweep must exclude.
Blocking bug 2 — LucXor scores A as unmodified backbone and can't write A-wins
Production scoring
_get_mod_positions_from_perm(onsite/lucxor/psm.py:1115) onlyrecognizes lowercase
s/t/y, nota, so an A-localization hypothesis is scored as theunmodified backbone (no +79.966 on any fragment) → A almost always loses → decoy-AA FLR
biased low / falsely optimistic. The output path
convert_sequence_to_standard_format(
psm.py:1651) has no case fora, so an A-win produces a bare-astring that failsAASequence.fromString; theexcept(cli.py:948-951) silently keeps the original inputsequence — A-wins are never written as
A(PhosphoDecoy)and can't be counted downstream.(A separate path,
_to_pyopenms_formatpeptide.py:377-380, does handleacorrectly andis used for KDE model-building — "the code handles A" is false comfort; the wrong path is
wired into scoring.) Mass/NL config is correct (79.966331 + H3PO4).
Fix (localized): teach
_get_mod_positions_from_perm,_get_mod_map(psm.py:996) andconvert_sequence_to_standard_formatto treataasPhosphoDecoy.Merge / "same PSM set" — not guaranteed
All three tools read the same input idXML (
onsite/onsitec.py:149-192), butmerge_algorithm_resultszips outputs by raw positional index with only a length warning(
onsitec.py:259-283); LucXor (keepNBestHits, spectrum-misscli.py:740, min-PSM abortcli.py:846) and AScore (error-dropscli.py:167-177) drop/reorder PSMs, so a singlemid-list drop silently aligns hits from different peptides. Even bypassing the merge, the
three tools retain different PSM subsets, so a fair comparison must be on the
intersection of PSMs all three scored, with per-tool drop counts reported as a result.
snippets.pydecoy-FLR is statistically invalidsnippets.py:971-1023), so the curveis an artifact of ordering, not confidence.
FLR = PhosphoDecoy/(Phospho+PhosphoDecoy)(snippets.py:982,1570)omits both
T_c/X_cand×2. On our dataT_c/X_c ≈ 1.83, so it underestimates the trueFLR by ≈ 2 × 1.83 ≈ 3.7× (a "1 % FLR" read off it is really ~3–4 %).
(
snippets.py:1075-1347). It is untracked, has hard-coded Windows paths, and is neverwired into
onsitec.Proposed work
getSites_so multi-phospho decoy-A permutations aren't dropped (ascore.py:483-491/641-659).aasPhosphoDecoyin the scoring + output paths (psm.py:1115,996,1651;cli.py:948-951).spectrum_reference+sequence, not positional index; compare on the PSM intersection and report per-tool drop counts (onsitec.py:259-283).snippets.pywith a single downstream estimator implementing Eq. 2 (apply2·T_c/X_c), ranking each tool by its own localization confidence, collapsing sites per the paper, thresholding at 5 %.delta_score) vs PSM-level FLR for all three (works after the bug fixes; coarser).P^PSM + P^PTM) using the search-engine PEP forP^PSMand each tool's localization probability forP^PTM.Acceptance
A single command produces, for AScore / PhosphoRS / LucXor on the same q-value-filtered PSM
intersection, a normalized decoy-AA global-FLR curve (sites recovered vs FLR) and the site
count each tool reports at 5 % global FLR, with per-tool PSM drop counts.