Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with getEICs in xcms #7

Closed
sneumann opened this issue Dec 22, 2014 · 8 comments
Closed

Problem with getEICs in xcms #7

sneumann opened this issue Dec 22, 2014 · 8 comments

Comments

@sneumann
Copy link
Owner

Alan Smith reported:

We ran into a problem with the newer versions of xcms (1.38, 1.40, and 1.43 have been tested) after tyring to upgrade from version 1.26.1. Interestingly the newer versions of the getEIC function works fine on data xcmsSet objects generated under xcms version 1.26.1. I just wanted to let you know that if this problem interests you and you would like real examples of our data I would be happy to provide it to you if the example below is not helpful. Any advice on how to handle this issue will be greatly appreciated especially if it is a user error on my part.

We are experiencing problems generating EICs using the getEIC function with xcmsSet objects and group IDs when the "rt" argument is set to "cor". Recently we upgraded xcms from version 1.26.1 to version 1.40 and began to experience the issue where all of the EICs generated by getEIC do not make sense when rt="cor". The problem occurs with or without data that has been rt corrected when rt is set to cor. The EICs look as expected when using rt = raw. This problem occurs with every independent data set we have evaluated. I have been able to reproduce the problem with the faahKO package [example code following session info]. I have attached a pdf from the example code below where page 1 and 3 correspond to the same feature. We have rolled back to version 1.26.1, but would like to use the current version of xcms. Any information on the origin of this problem and advice on how to solve this problem would be appreciated so that we can evaluate features with RT corrected data using the current builds of xcms and the getEIC fucntion.

@sneumann
Copy link
Owner Author

Example script from Alan:

library(xcms)
library(faahKO)

cdfpath <- system.file("cdf", package = "faahKO")
list.files(cdfpath, recursive = TRUE)
cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)

xset<-xcmsSet(cdffiles,profmethod = "bin", method="centWave", ppm=30, peakwidth=c(5,50), snthresh=10, prefilter=c(7,1000), integrate=1, mzdiff= -0.001, fitgauss=FALSE, scanrange= c(0,500),mzCenterFun="wMean",noise=300,sleep=0, verbose.columns=T,nSlaves=3);
xset2<-retcor(xset, method = "obiwarp",distFunc="cor_opt",profStep=1,plot="deviation",gapInit=0.8, gapExtend=3.3)
gxset2<-group(xset2,method="density", bw=8, minfrac = .5, minsamp = 2, mzwid = .01, max = 50, sleep = 0)
xset3<-fillPeaks(gxset2);

peaktab <-data.frame(ID=groupnames(xset3),peakTable(xset3))

a<-getEIC(xset3,groupidx=c("M205T2790","M392T2927"))
b<-getEIC(xset3,groupidx=c("M205T2790","M392T2927"),rt="raw")
pdf("plot.example.pdf")
plot(a)
plot(b)
dev.off()
sessionInfo()

@sneumann
Copy link
Owner Author

So the raw RT EIC has :

> str(b)
Formal class 'xcmsEIC' [package "xcms"] with 5 slots
  ..@ eic       :List of 12
  .. ..$ ko15:List of 2
  .. .. ..$ : num [1:128, 1:2] 2689 2691 2692 2694 2695 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : NULL
  .. .. .. .. ..$ : chr [1:2] "rt" "intensity"
  .. .. ..$ : num [1:128, 1:2] 2828 2830 2832 2833 2835 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : NULL
  .. .. .. .. ..$ : chr [1:2] "rt" "intensity"
[...]
  .. ..$ wt22:List of 2
  .. .. ..$ : num [1:128, 1:2] 2689 2691 2692 2694 2695 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : NULL
  .. .. .. .. ..$ : chr [1:2] "rt" "intensity"
  .. .. ..$ : num [1:128, 1:2] 2828 2830 2832 2833 2835 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : NULL
  .. .. .. .. ..$ : chr [1:2] "rt" "intensity"
  ..@ mzrange   : num [1:2, 1:2] 205 392 205 392
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : NULL
  .. .. ..$ : chr [1:2] "mzmin" "mzmax"
  ..@ rtrange   : num [1:2, 1:2] 2689 2827 2889 3027
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : NULL
  .. .. ..$ : chr [1:2] "rtmin" "rtmax"
  ..@ rt        : chr "raw"
  ..@ groupnames: chr [1:2] "M205T2790" "M392T2927"

i.e. the raw have 128 datapoints per EIC.

@sneumann
Copy link
Owner Author

sneumann commented Oct 7, 2015

Hi Alan,
it would be a fantastic help if you could create a self-contained and standalone test here.
The we could try to isolate the relevant commit between 1.26.1 and 1.44.x with "git bisect"
(see https://git-scm.com/book/en/v2/Git-Tools-Debugging-with-Git or https://robots.thoughtbot.com/git-bisect )
Yours,
Steffen

@sneumann
Copy link
Owner Author

Alan Smith made some excellent progress:

Hi Steffen,

I picked this up again and found that the problems in EIC extraction are caused by the length of the vector of corrected retention times. The vector of corrected RTs corresponds to the length of the all of the scans in an xcmsRaw object in xcms version 1.26.1 and only the scan range used for peak detection in xcms version 1.46.0.

  1. When we perform peak detection using findPeaks.centWave we only evaluate a portion of the gradient that contains peaks of interest. This subset represents a subset of the total number of scans present in the mzData data file. We use the argument scanrange to reduce scans to regions of interest using centwave. This reduces computation time and focus downstream analysis on the regions of the gradient we are interested in.
    a) A typical LC-MS analysis contains 7007 scans.
    b) We perform peak detection over 5325 scans.

  2. We typically analyze EICs to assess performance of peak detection, grouping and specifically retention correction.
    a) To evaluate retention time alignment we plot two EICs for each feature. getEICs with "rt" argument set to "raw" and then set to "corrected"
    b) We found that accurate EICs could not be generated when the "rt" argument was set to "corrected" with newer versions of xcms, but that accurate EICs
    could be generated with "rt" set to raw.
    c) The line of code in getEICs which leads to this observations is below. The code came from getMethod("getEIC", "xcmsSet") for xcms version 1.46.0. lcraw@scantime <- object@rt$corrected[[sampidx[i]]]

  3. The root of the EIC extraction problem seems to come from how the xcms class object is initially created. The older libraries created an rt slot that had values of the entire scan range [7007 entries per file] while the newer versions of xcms create a rt slot with the truncated scan ranges [5325 entries per file ]. When getEICs is called the scantime vector for the raw lcms file is overwritten with the truncated range of the corrected retention time range. This overwriting of the scan range causes some problems with when the eics are created. I was not able to follow the code through getMethod("getEICNew", "xcmsRaw") or getMethod("getEICOld", "xcmsRaw") to find the exact line of code where an index is messed up.

  4. I was able to hack the centwave (xset) object by taking the following steps.

# hack to plot rt corrected EICs on samples analyzed using a subset of the retention time.
lcraw <- xcmsRaw(filepaths(xset)[1], profmethod = profinfo(xset)$method, profstep = 0)
xset@scanrange <- c(0,length(lcraw@scantime)) 
raw.dat <- sapply(filepaths(xset),xcmsRaw,profmethod = profinfo(xset)$method, profstep = 0)
names(raw.dat ) <- 1:length(raw.dat)
xset@rt$raw <- lapply(raw.dat,function(x) x@scantime)
xset@rt$corrected <- lapply(raw.dat,function(x) x@scantime)

Attach is a pdf with an EIC generated using data after filling peaks [no hack] and an EIC generated using the hack on a centwave prior to retention time correction and filling peaks.

Let me know if there is anything else I can do to help.
Alan

eic-hackexample-0
eic-hackexample-1

@jorainer
Copy link
Collaborator

Concerning Alan Smith's point 3) above:
for the getEICOld and getEICNew, I introduced that split as I had some problems with the "original" getEIC method (now referred to as getEICOld) on xcmsRaw objects. Just out of curiosity, do you see the same problem you're describing when using the getEICNew method? You can switch to getEICNew using:

library(xcms)
opts <- options()$BioC
opts$xcms$getEIC.method <- "getEICNew"
options(BioC=opts)

thanks, jo

@jorainer
Copy link
Collaborator

Actually, I repeated the code chunk above after changing xcms to use the "new" getEIC method:

library(xcms)
library(faahKO)
## change getEIC to use getEICNew
opts <- options()$BioC
opts$xcms$getEIC.method <- "getEICNew"
options(BioC=opts)

cdfpath <- system.file("cdf", package = "faahKO")
list.files(cdfpath, recursive = TRUE)
cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)

xset<-xcmsSet(cdffiles,profmethod = "bin", method="centWave", ppm=30, peakwidth=c(5,50), snthresh=10, prefilter=c(7,1000), integrate=1, mzdiff= -0.001, fitgauss=FALSE, scanrange= c(0,500),mzCenterFun="wMean",noise=300,sleep=0, verbose.columns=T,nSlaves=3);
xset2<-retcor(xset, method = "obiwarp",distFunc="cor_opt",profStep=1,plot="deviation",gapInit=0.8, gapExtend=3.3)
gxset2<-group(xset2,method="density", bw=8, minfrac = .5, minsamp = 2, mzwid = .01, max = 50, sleep = 0)
xset3<-fillPeaks(gxset2);

peaktab <-data.frame(ID=groupnames(xset3),peakTable(xset3))

a<-getEIC(xset3,groupidx=c("M205T2790","M392T2927"))
b<-getEIC(xset3,groupidx=c("M205T2790","M392T2927"),rt="raw")
pdf("plot.example-new.pdf")
plot(a)
plot(b)
dev.off()

The results look actually pretty good to me, but please reconfirm.
Eventually, we should switch by default to getEICNew (i.e. change getEIC.method="getEICOld" in init.R to getEIC.method="getEICNew")? What I didn't like about the getEICOld was that an EIC "buffer" was created, apparently to speed things up, but I guess this caused ultimately the problem. In the getEICNew I kept the code as simple as possible, loosing eventually some milliseconds in the calculation but be more stable.

sneumann pushed a commit that referenced this issue Mar 30, 2016
+ Fix problem in getEIC on xcmsSet objects reported by Alan Smith in
  issue #7 and add a RUnit test case to test for this (test.issue7 in
  runit.getEIC.R).
+ Change some unnecessary warnings into messages.
sneumann pushed a commit that referenced this issue Mar 30, 2016
+ Fix bug in getEIC reported by Alan Smith in issue #7.
+ Replace unnecessary warnings with messages.
@sneumann
Copy link
Owner Author

Merged and pushed as 1.147.3 with Jo's fixes 2f5e291 and 35be059 . If Alain confirms, this can be closed.

sneumann pushed a commit that referenced this issue Apr 1, 2016
+ Fix problem in getEIC on xcmsSet objects reported by Alan Smith in
  issue #7 and add a RUnit test case to test for this (test.issue7 in
  runit.getEIC.R).
+ Change some unnecessary warnings into messages.

git-svn-id: https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/xcms@115441 bc3139a8-67e5-0310-9ffc-ced21a209358
sneumann pushed a commit that referenced this issue Apr 1, 2016
+ Fix bug in getEIC reported by Alan Smith in issue #7.
+ Replace unnecessary warnings with messages.

git-svn-id: https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/xcms@115442 bc3139a8-67e5-0310-9ffc-ced21a209358
sneumann added a commit that referenced this issue Aug 17, 2017
+ Fix problem in getEIC on xcmsSet objects reported by Alan Smith in
  issue #7 and add a RUnit test case to test for this (test.issue7 in
  runit.getEIC.R).
+ Change some unnecessary warnings into messages.

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/xcms@115441 bc3139a8-67e5-0310-9ffc-ced21a209358
sneumann added a commit that referenced this issue Aug 17, 2017
+ Fix bug in getEIC reported by Alan Smith in issue #7.
+ Replace unnecessary warnings with messages.

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/xcms@115442 bc3139a8-67e5-0310-9ffc-ced21a209358
jorainer pushed a commit that referenced this issue Sep 27, 2017
+ Fix problem in getEIC on xcmsSet objects reported by Alan Smith in
  issue #7 and add a RUnit test case to test for this (test.issue7 in
  runit.getEIC.R).
+ Change some unnecessary warnings into messages.

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/xcms@115441 bc3139a8-67e5-0310-9ffc-ced21a209358
jorainer pushed a commit that referenced this issue Sep 27, 2017
+ Fix bug in getEIC reported by Alan Smith in issue #7.
+ Replace unnecessary warnings with messages.

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/xcms@115442 bc3139a8-67e5-0310-9ffc-ced21a209358
@jorainer
Copy link
Collaborator

closing this issue now - should have been fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants