Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mz values not in order #135

Closed
lgatto opened this issue Jul 25, 2016 · 11 comments
Closed

mz values not in order #135

lgatto opened this issue Jul 25, 2016 · 11 comments

Comments

@lgatto
Copy link
Owner

lgatto commented Jul 25, 2016

Following up from @sgibb's comment:

> library("mzR")
> rw <- openMSfile("~/Data2/Thermo_HELA_PRT/Thermo_Hela_PRTC_1.mzML")
> pk1 <- peaks(rw, 1)
> which(diff(pk1[, 1]) < 0)
[1] 18053 19085 19749 21879 21920 22303
> pk1[18050:18055, ]
         [,1]      [,2]
[1,] 1120.546 4862.1455
[2,] 1120.555 3165.6448
[3,] 1120.564 1505.5123
[4,] 1120.573  442.5249
[5,] 1120.570    0.0000
[6,] 1120.579    0.0000

The file is available here. It was converted from RAW using msconvert without any filters.

@lgatto
Copy link
Owner Author

lgatto commented Jul 25, 2016

@sgibb suggested to update the constructor to automatically sort the mz values. First, it might be good to double check the actual data with other software. I don't have the raw data at hand, but could add it.

@lgatto
Copy link
Owner Author

lgatto commented Jul 25, 2016

@jotsetung

I don't know from where it comes (eventually from the mzML files, i.e. some manufacturer or other software not exporting as expected), but also in xcms there are sometimes checks for that (e.g. https://github.com/sneumann/xcms/blob/devel/R/xcmsRaw.R#L1926-L1927).

Would be nice if that could already be done downstream in mzR.

mzR just returns a data.frame. I would prefer keep it as it is at that level (or possibly have a private function to sort it according to mz) and do the checks on formally defined data structures (in the Spectrum constructor, for example).

@jorainer
Copy link
Collaborator

OK; I would like to keep the spectra construction as fast as possible. For Spectrum objects the check could go into the validObject or do you think it might be even better to automatically ensure sorting by M/Z in the constructor?

@sgibb
Copy link
Collaborator

sgibb commented Jul 26, 2016

validObject is called very often and will slow things down. Would it be enough to modify readMSData to ensure that the order of the mz values is correct?

@jorainer
Copy link
Collaborator

Not for OnDiskMSnExp objects: the data is not read in readMSData (actually readMSData2), just the header is imported; the actual spectra data are read on-the-fly. For MSnExp and readMSData it would however work.

Presently I'm playing around with automatic ordering of M/Z-intensity values by M/Z in the c-level Spectrum1 (Spectrum2) constructor. If I can ensure that this runs fast enough we could put it there.

@lgatto
Copy link
Owner Author

lgatto commented Jul 26, 2016

The re-ordering should be done in the Spectrum constuctor (C-level and R-level). I will add one in the Spectrum initialize method now.

Done in 43e5976. I have also added this check in the Spectrum validity.

@jotsetung - ping us when/if you work on this at the C-level.

(PS: I have also removed the sort in .isCentroided now)

@jorainer
Copy link
Collaborator

@lgatto I'm currently working on the C-level stuff; for Spectrum1 it is implemented (commit 8659768). I still have to do the Spectrum2 constructor and fix failing test cases.

@jorainer
Copy link
Collaborator

@lgatto there is a problem in the initialize method for Spectrum objects: the ordering of mzand intensity should be done before the callNextMethod that creates the object (and hence throws an error if the mz values are unsorted). I fixed that too and will push in the next commit. Just wanted to let you know.

jorainer added a commit that referenced this issue Jul 27, 2016
o Clean up code for Spectrum1 constructors.
o Fix failing unit tests and cleanup.
o Fix issue in initialize of Spectrum objects: M/Z sorting should be
  done prior to initialization (see issue #135).
@lgatto
Copy link
Owner Author

lgatto commented Jul 27, 2016

I think this is now sorted, thus closing. @jotsetung - please reopen if necessary.

@lgatto lgatto closed this as completed Jul 27, 2016
@jorainer
Copy link
Collaborator

Not completely done yet; the Spectrum2 C-constructor is not yet ready. Will close once that's done too.

@jorainer jorainer reopened this Jul 27, 2016
jorainer added a commit that referenced this issue Jul 27, 2016
o Add Spectra2 C-constructor that enforces ordering by M/Z values.
o Add functionality to Spectrum1/2 C-constructor that calculates TIC from
  the data if not provided.
o tic method for OnDiskMSnExp: load spectra and get TIC if any of the
  totIonCurrent values in fData are 0.
@jorainer
Copy link
Collaborator

Now it's done (1d3024f). Ensured that ordering is correct in unit tests for Spectrum1 and Spectrum2 classes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants