Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve fragment matching plot #13

Open
sgibb opened this issue Aug 10, 2022 · 1 comment
Open

Improve fragment matching plot #13

sgibb opened this issue Aug 10, 2022 · 1 comment
Labels
question Further information is requested

Comments

@sgibb
Copy link
Member

sgibb commented Aug 10, 2022

Currently our fragment matching plot looks a bit boring:

Code
library("Spectra")
#> Loading required package: S4Vectors
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#> 
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:stats':
#> 
#>     IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#> 
#>     anyDuplicated, append, as.data.frame, basename, cbind, colnames,
#>     dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
#>     grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
#>     order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
#>     rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
#>     union, unique, unsplit, which.max, which.min
#> 
#> Attaching package: 'S4Vectors'
#> The following objects are masked from 'package:base':
#> 
#>     expand.grid, I, unname
#> Loading required package: BiocParallel
#> Loading required package: ProtGenerics
#> 
#> Attaching package: 'ProtGenerics'
#> The following object is masked from 'package:stats':
#> 
#>     smooth
#> 
#> Attaching package: 'Spectra'
#> The following object is masked from 'package:ProtGenerics':
#> 
#>     addProcessing
library("MsCoreUtils")
#> 
#> Attaching package: 'MsCoreUtils'
#> The following objects are masked from 'package:Spectra':
#> 
#>     bin, smooth
#> The following objects are masked from 'package:ProtGenerics':
#> 
#>     bin, smooth
#> The following object is masked from 'package:stats':
#> 
#>     smooth
library("PSMatch")

sp <- DataFrame(msLevel = 2L, rtime = 2345, sequence = "SIGFEGDSIGR")
sp$mz <- list(c(100.048614501953, 110.069030761719, 112.085876464844,
                117.112571716309, 158.089569091797, 163.114898681641,
                175.117172241211, 177.098587036133, 214.127075195312,
                232.137542724609, 233.140335083008, 259.938415527344,
                260.084167480469, 277.111572265625, 282.680786132812,
                284.079437255859, 291.208282470703, 315.422576904297,
                317.22509765625, 327.2060546875, 362.211944580078,
                402.235290527344, 433.255004882812, 529.265991210938,
                549.305236816406, 593.217041015625, 594.595092773438,
                609.848327636719, 631.819702148438, 632.324035644531,
                632.804931640625, 640.8193359375, 641.309936523438,
                641.82568359375, 678.357238769531, 679.346252441406,
                688.291259765625, 735.358947753906, 851.384033203125,
                880.414001464844, 881.40185546875, 919.406433105469,
                938.445861816406, 1022.56658935547, 1050.50415039062,
                1059.82800292969, 1107.52734375, 1138.521484375,
                1147.51538085938, 1226.056640625))
sp$intensity <- list(c(83143.03, 65473.8, 192735.53, 3649178.5,
                       379537.81, 89117.58, 922802.69, 61190.44,
                       281353.22, 2984798.75, 111935.03, 42512.57,
                       117443.59, 60773.67, 39108.15, 55350.43,
                       209952.97, 37001.18, 439515.53, 139584.47,
                       46842.71, 1015457.44, 419382.31, 63378.77,
                       444406.66, 58426.91, 46007.71, 58711.72,
                       80675.59, 312799.97, 134451.72, 151969.72,
                       3215457.75, 1961975, 395735.62, 71002.98,
                       69405.73, 136619.47, 166158.69, 682329.75,
                       239964.69, 242025.44, 1338597.62, 50118.02,
                       1708093.12, 43119.03, 97048.02, 2668231.75,
                       83310.2, 40705.72))
sp <- Spectra(sp)

plotSpectra(sp, labels = addFragments, labelPos = 3)

Created on 2022-08-10 by the reprex package (v2.0.1)

I have seen the following figure in https://www.nature.com/articles/s41586-022-04839-2:

41586_2022_4839_Fig14_ESM

We could generate something similar:

(currently addFragments doesn't allow to set the fragment types, also plotSpectra modifies xlim so we could not generate identical x axes; both has to be fixed if we really want to produce this image)

Code
library("Spectra")
#> Loading required package: S4Vectors
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#> 
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:stats':
#> 
#>     IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#> 
#>     anyDuplicated, append, as.data.frame, basename, cbind, colnames,
#>     dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
#>     grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
#>     order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
#>     rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
#>     union, unique, unsplit, which.max, which.min
#> 
#> Attaching package: 'S4Vectors'
#> The following objects are masked from 'package:base':
#> 
#>     expand.grid, I, unname
#> Loading required package: BiocParallel
#> Loading required package: ProtGenerics
#> 
#> Attaching package: 'ProtGenerics'
#> The following object is masked from 'package:stats':
#> 
#>     smooth
#> 
#> Attaching package: 'Spectra'
#> The following object is masked from 'package:ProtGenerics':
#> 
#>     addProcessing
library("MsCoreUtils")
#> 
#> Attaching package: 'MsCoreUtils'
#> The following objects are masked from 'package:Spectra':
#> 
#>     bin, smooth
#> The following objects are masked from 'package:ProtGenerics':
#> 
#>     bin, smooth
#> The following object is masked from 'package:stats':
#> 
#>     smooth
library("PSMatch")

sp <- DataFrame(msLevel = 2L, rtime = 2345, sequence = "SIGFEGDSIGR")
sp$mz <- list(c(100.048614501953, 110.069030761719, 112.085876464844,
                117.112571716309, 158.089569091797, 163.114898681641,
                175.117172241211, 177.098587036133, 214.127075195312,
                232.137542724609, 233.140335083008, 259.938415527344,
                260.084167480469, 277.111572265625, 282.680786132812,
                284.079437255859, 291.208282470703, 315.422576904297,
                317.22509765625, 327.2060546875, 362.211944580078,
                402.235290527344, 433.255004882812, 529.265991210938,
                549.305236816406, 593.217041015625, 594.595092773438,
                609.848327636719, 631.819702148438, 632.324035644531,
                632.804931640625, 640.8193359375, 641.309936523438,
                641.82568359375, 678.357238769531, 679.346252441406,
                688.291259765625, 735.358947753906, 851.384033203125,
                880.414001464844, 881.40185546875, 919.406433105469,
                938.445861816406, 1022.56658935547, 1050.50415039062,
                1059.82800292969, 1107.52734375, 1138.521484375,
                1147.51538085938, 1226.056640625))
sp$intensity <- list(c(83143.03, 65473.8, 192735.53, 3649178.5,
                       379537.81, 89117.58, 922802.69, 61190.44,
                       281353.22, 2984798.75, 111935.03, 42512.57,
                       117443.59, 60773.67, 39108.15, 55350.43,
                       209952.97, 37001.18, 439515.53, 139584.47,
                       46842.71, 1015457.44, 419382.31, 63378.77,
                       444406.66, 58426.91, 46007.71, 58711.72,
                       80675.59, 312799.97, 134451.72, 151969.72,
                       3215457.75, 1961975, 395735.62, 71002.98,
                       69405.73, 136619.47, 166158.69, 682329.75,
                       239964.69, 242025.44, 1338597.62, 50118.02,
                       1708093.12, 43119.03, 97048.02, 2668231.75,
                       83310.2, 40705.72))
sp <- Spectra(sp)

x_data <- peaksData(sp)[[1L]]
y_data <- calculateFragments(
    spectraData(sp)[["sequence"]],
    type = c("a", "b", "c", "x", "y", "z"),
    verbose = FALSE
)
y_data <- y_data[order(y_data$mz), ]

tolerance <- 0
ppm <- 5000

## Find common peaks and prepare annotations
idx <- MsCoreUtils::closest(
    y_data[, "mz"], x_data[, "mz"],
    duplicates = "closest", tolerance = tolerance, ppm = ppm
)
idy <- MsCoreUtils::closest(
    x_data[, "mz"], y_data[, "mz"],
    duplicates = "closest", tolerance = tolerance, ppm = ppm
)

## Prepare labels
labels <- data.frame(
   ion = y_data[idy, "ion"],
   type = y_data[idy, "type"],
   delta = NA,
   col = NA
)
labels$delta[idx[!is.na(idx)]] <-
    x_data[idx[!is.na(idx)], "mz"] - y_data[idy[!is.na(idy)], "mz"]

col <- setNames(palette.colors(6), c("a", "b", "c", "x", "y", "z"))
labels$col <- col[substr(labels$type, 1, 1)]
labels$col[is.na(labels$col)] <- "#808080"

## Annotate the spectum with the fragment labels
par.old <- par(
    no.readonly = TRUE
)
on.exit(par(par.old))

layout(
    matrix(c(1, 2), nrow = 2), heights = c(5, 1)
)
par(mar = c(7, 4, 4, 2) + 0.1)
plotSpectra(
    sp, col = labels$col,
    labels = labels$ion, labelCol = labels$col, labelPos = 3,
    xaxt = "n", yaxt = "n", bty = "l", main = spectraData(sp)[["sequence"]]
)
axis(1)
axis(2)

par(mar = c(2, 4, 0, 2) + 0.1)
plot(
    x_data[, "mz"], labels$delta, col = labels$col, type = "h", pch = 19,
    ann = FALSE, xaxt = "n"
)
points(
    x_data[, "mz"], labels$delta, col = labels$col, type = "p", pch = 19
)
abline(h = 0, col = "#808080", lty = 2)
title(ylab = "delta m/z")
axis(3)

psmatch

Created on 2022-08-10 by the reprex package (v2.0.1)

Has anyone an idea how to plot the fragment marker between the amino acid letters?

What do you think about these plots?

@sgibb sgibb added the question Further information is requested label Aug 10, 2022
@lgatto
Copy link
Member

lgatto commented Aug 11, 2022

Hi Sebastian! Yes, I also saw similar plots and wanted to have a go to implement them - thank you!

Not sure about the markers between the amino acids, but I have already seen them in R plots (or possibly python), I think, but can't remember where. I'll update you here if I figure out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants