feature request: read different types of scan #20

tentrillion · 2023-09-15T15:01:54Z

I've been using RaMS for about a year and it is amazing; finally an easy, dependency-light way to quickly read and tidily manipulate MS data! My feature request: could there be a way to label different types of MS1 scans acquired in the same experiment? (I often acquire data like this when varying ion source parameters.)

I have *.mzML files generated from Sciex *.wiff2 files that were acquired from a qTOF that was running multiple MS1 scan types. Sciex calls the different scan types "experiments" and in the XML (generated via ProteoWizard) these different scan types are referred to like this:

<spectrum index="0" id="sample=1 period=1 cycle=1 experiment=2" defaultArrayLength="2271">
[...]
<spectrum index="1" id="sample=1 period=1 cycle=1 experiment=4" defaultArrayLength="4300">
[...]
<spectrum index="3" id="sample=1 period=1 cycle=1 experiment=7" defaultArrayLength="3">

I'd be happy to supply an example mzML file.

I imagine one output type might be an extra column (relative to what grab_what = c('MS1') returns) containing the spectrum id strings, e.g. sample=1 period=1 cycle=1 experiment=4.
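To make the proposed column concrete, here is a minimal base R sketch of how such id strings could be split into one column per key. The helper name `parse_spectrum_id` is hypothetical and not part of RaMS, and the sketch assumes every id uses the same space-separated `key=value` layout with integer values, as in the examples above.

```r
# Hypothetical helper (not part of RaMS): split Sciex-style spectrum ids
# such as "sample=1 period=1 cycle=1 experiment=4" into one column per key.
# Assumes every id has the same keys, in the same order, with integer values.
parse_spectrum_id <- function(ids) {
  ids <- trimws(ids)
  parsed <- lapply(strsplit(ids, " ", fixed = TRUE), function(fields) {
    key_value <- strsplit(fields, "=", fixed = TRUE)
    values <- as.integer(vapply(key_value, `[`, character(1), 2))
    names(values) <- vapply(key_value, `[`, character(1), 1)
    as.data.frame(as.list(values))
  })
  # Keep the original id string next to the parsed columns
  data.frame(id = ids, do.call(rbind, parsed), stringsAsFactors = FALSE)
}

example_ids <- c(
  "sample=1 period=1 cycle=1 experiment=2",
  "sample=1 period=1 cycle=1 experiment=4",
  "sample=1 period=1 cycle=1 experiment=7"
)
id_table <- parse_spectrum_id(example_ids)
print(id_table)
```

Keeping the raw id alongside the parsed columns means nothing is lost if a vendor adds keys that a downstream filter does not know about.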

wkumler · 2023-09-25T18:22:58Z

Hi @tentrillion, thanks for the feature request. I'm a little swamped with other tasks for my PhD right now (as you may have already determined from the backlog of issues) but I'm hoping to push out v1.4 by the end of the year or early 2024. This looks like a good contribution for that but I will need some demo mzML files. If you're able to share one or two publicly with a Box or Dropbox link that'd be great - otherwise we'll have to chat about a good way of getting those to me for testing.

wkumler · 2023-11-13T18:45:54Z

Hi @tentrillion, I've got some time to work on this issue now and think it's worth the effort. Do you have a demo mzML file you're able to share?

tentrillion · 2024-07-24T17:00:37Z

Apologies for missing your November reply until now. I couldn't figure out how to attach an mzML directly in this thread. As an (odd, I know) workaround I've committed it to a random git repo I use to store and publish notebooks. LMK if there's a better way to send you these; I have more if you need them.
https://github.com/tentrillion/ipython_notebooks/blob/master/example_sciex_multiMS1scantypes.mzML

wkumler · 2024-08-01T17:20:46Z

Hi @tentrillion, thanks for providing the demo file! It's a good question how best to get this data and combine it with the rest of the MS1 info. This feels similar to what grabAccessionData does, but since the information is stored in the spectrum tag itself, that function doesn't work for extraction. Instead I had to manually read in the XML and extract the experiment number, bind that to the associated retention time, and then merge it back onto the MS1 info. This could definitely be streamlined into a single function (which would then also avoid having to read the mzML file twice), but is this essentially what you're looking for?

library(xml2)
library(RaMS)
library(ggplot2)

mzml_path <- "~/../Downloads/example_sciex_multiMS1scantypes.mzML"

# Read the raw XML and pull the experiment number out of each spectrum id
xml_data <- read_xml(mzml_path)
all_spectra <- xml_find_all(xml_data, "//d1:spectrum")
scan_ids <- xml_attr(all_spectra, "id")
experiment_nums <- as.numeric(gsub(".*experiment=", "", scan_ids))

# Pair each scan's start time (accession MS:1000016) with its experiment number
scan_rts <- grabAccessionData(mzml_path, "MS:1000016")
rt_id_df <- cbind(rt=as.numeric(scan_rts$value), exp_num=experiment_nums)

# Read the MS data as usual, then merge the experiment numbers onto MS1 by rt
msdata <- grabMSdata(mzml_path)

ms1_w_expnum <- merge(msdata$MS1, rt_id_df)
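The merge step can be illustrated on toy data in base R. This is only a sketch: the two data frames below are made-up stand-ins for the MS1 table and the rt/experiment lookup, not values from the demo file. It also shows one thing to watch for: the join matches on exact equality of `rt`, so both tables need to carry the retention time in the same units and with identical numeric values.

```r
# Toy stand-in for the MS1 table: one row per (scan, m/z) data point.
ms1 <- data.frame(
  rt  = c(0.10, 0.10, 0.20, 0.20, 0.30),
  mz  = c(371.0946, 372.0979, 371.0945, 372.0981, 371.0947),
  int = c(1200, 300, 950, 0, 1500)
)

# Toy stand-in for the lookup table: one row per scan.
rt_id <- data.frame(rt = c(0.10, 0.20, 0.30), exp_num = c(2, 4, 7))

# all.x = TRUE keeps MS1 rows that find no match and fills exp_num with NA,
# so a failed join shows up as NAs instead of silently dropped rows.
ms1_labelled <- merge(ms1, rt_id, by = "rt", all.x = TRUE)
print(ms1_labelled)

# Every data point should have picked up an experiment number
stopifnot(!anyNA(ms1_labelled$exp_num))
```

Each data point inherits the experiment number of the scan it came from, which is why the lookup table only needs one row per scan.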

There seems to be some quirky data in the file: each mass is "bracketed" by two zero-intensity points, one at a slightly lower and one at a slightly higher mass, creating a strange triplet pattern of data points:

ggplot(ms1_w_expnum[mz%between%pmppm(371.09458, 100)]) +
  geom_point(aes(x=rt, y=mz, color=int>0))

But when those zero-intensity points are removed, you can see the instrument cycling through each of the different MS1 scan types:

ggplot(ms1_w_expnum[mz%between%pmppm(371.09458, 100)][int>0]) +
  geom_point(aes(x=rt, y=mz, color=factor(exp_num)))

You can then use the experiment number to separate out the types of scans and plot them individually:

ggplot(ms1_w_expnum[mz%between%pmppm(371.09458, 10)][int>0]) +
  geom_line(aes(x=rt, y=int)) +
  facet_wrap(~exp_num, ncol=1, scales = "free_y")
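Outside of plotting, the same two steps (dropping the zero-intensity bracketing points, then separating by experiment number) can be sketched in base R. The table below is a made-up stand-in shaped like ms1_w_expnum, not data from the demo file.

```r
# Toy table shaped like ms1_w_expnum (values are invented for illustration).
# Each real point is bracketed by two zero-intensity points.
scans <- data.frame(
  rt      = c(0.10, 0.10, 0.10, 0.20, 0.20, 0.20),
  mz      = c(371.0940, 371.0946, 371.0952, 371.0939, 371.0945, 371.0951),
  int     = c(0, 1200, 0, 0, 950, 0),
  exp_num = c(2, 2, 2, 4, 4, 4)
)

# Drop the zero-intensity bracketing points
nonzero <- scans[scans$int > 0, ]

# One data frame per MS1 scan type, named by experiment number
by_experiment <- split(nonzero, nonzero$exp_num)
print(names(by_experiment))
```

Each element of the resulting list can then be handled like an ordinary single-scan-type MS1 table.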

wkumler mentioned this issue Nov 13, 2023

RaMS v1.3.2 #23

Closed

9 tasks

tentrillion added a commit to tentrillion/ipython_notebooks that referenced this issue Jul 24, 2024

hacky way to send a file to service wkumler/RaMS#20

77fcbda
