merge generateCompounds from different tools #108
Hi @Boris-Droz,
Interesting question... In principle, taking a consensus is indeed the way to go to merge the results. There are a few options to control this process, but I'm unsure if they would help you in this case.
There might be a few reasons for candidates to drop in ID level. From what I can quickly foresee, it could be related to the re-ranking of candidates or the averaging of scores that occurs while making the consensus.
Perhaps to get some more hints on what is actually happening, you could check and compare a few log files that
Thanks,
Hi Rick,
Anyway, I tried to track down how to improve this process using the lo/ident. That raised more questions about how the consensus process works. For the test I focused on one specific feature.
For MetFrag annotation only I get:
For LibMatch I get:
Then, using a consensus between MetFrag + LibMatch, I was expecting a confidence level of 2a, but get
I tried to change the parameters of the consensus function but always get the same values.
Hi Boris, I just quickly tried to do something similar in order to have some data to test. Using the patRoon demo data and suspects lists, I got
So there was one candidate in the consensus that got degraded to a level 3, but that's because with the consensus it was ranked second instead of first. I think this is usually quite reasonable, but if you are really sure it's not, then perhaps you could somehow filter out unwanted candidates from the compounds object (either before or after making a consensus), e.g. by using the
Thanks,
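The ranking effect described in this comment can be illustrated with a small toy example. This is only a sketch in base R on made-up data: the candidate names, the scores, and the simple "mean of normalized scores" rule are assumptions chosen for illustration, and it is not a reproduction of what patRoon's consensus() does internally.

```r
# Toy data (NOT patRoon output): normalized scores for one feature group,
# as reported by two hypothetical annotation runs.
metfrag  <- data.frame(candidate = c("A", "B", "C"), score = c(1.0, 0.9, 0.4))
libmatch <- data.frame(candidate = c("B", "A"),      score = c(1.0, 0.6))

# Outer join, so candidates found by only one tool are kept.
merged <- merge(metfrag, libmatch, by = "candidate", all = TRUE,
                suffixes = c("_MF", "_lib"))

# Assumed consensus rule for this sketch: average the available scores.
merged$consScore <- rowMeans(merged[, c("score_MF", "score_lib")], na.rm = TRUE)

# Re-rank on the averaged score.
merged <- merged[order(-merged$consScore), ]
merged$consRank <- seq_len(nrow(merged))

print(merged)
```

In this toy case candidate A is ranked first by the MetFrag-like table on its own, but drops to second once the scores are averaged. Any ID level rule that depends on a candidate being top-ranked would then assign it a poorer level, which matches the kind of degradation discussed here.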
Hi Rick,
Hi Boris, All was mostly with defaults. Below is the script I used, which is mostly a template from
# Script automatically generated on Mon Apr 29 16:13:00 2024
library(patRoon)
# -------------------------
# initialization
# -------------------------
workPath <- "E:/devel/tests/test2"
setwd(workPath)
# Example data from patRoonData package (triplicate solvent blank + triplicate standard)
anaInfo <- patRoonData::exampleAnalysisInfo("positive")
# -------------------------
# features
# -------------------------
# Find all features
# NOTE: see the reference manual for many more options
fList <- findFeatures(anaInfo, "openms", noiseThrInt = 1000, chromSNR = 3, chromFWHM = 5, minFWHM = 1, maxFWHM = 30)
# Group and align features between analyses
fGroups <- groupFeatures(fList, "openms", rtalign = TRUE)
# Basic rule based filtering
fGroups <- filter(fGroups, preAbsMinIntensity = 100, absMinIntensity = 10000, relMinReplicateAbundance = 1,
                  maxReplicateIntRSD = 0.75, blankThreshold = 5, removeBlanks = TRUE,
                  retentionRange = NULL, mzRange = NULL)
# -------------------------
# suspect screening
# -------------------------
# Get example suspect list
suspList <- patRoonData::suspectsPos
# Set onlyHits to FALSE to retain features without suspects (eg for full NTA)
# Set adduct to NULL if suspect list contains an adduct column
fGroups <- screenSuspects(fGroups, suspList, rtWindow = 12, mzWindow = 0.005, adduct = "[M+H]+", onlyHits = TRUE)
# -------------------------
# annotation
# -------------------------
# Retrieve MS peak lists
avgMSListParams <- getDefAvgPListParams(clusterMzWindow = 0.005)
mslists <- generateMSPeakLists(fGroups, "mzr", maxMSRtWindow = 5, precursorMzWindow = 4,
                               avgFeatParams = avgMSListParams,
                               avgFGroupParams = avgMSListParams)
# Rule based filtering of MS peak lists. You may want to tweak this. See the manual for more information.
mslists <- filter(mslists, absMSIntThr = NULL, absMSMSIntThr = NULL, relMSIntThr = NULL, relMSMSIntThr = 0.05,
                  topMSPeaks = NULL, topMSMSPeaks = 25)
# Calculate formula candidates
formulas <- generateFormulas(fGroups, mslists, "genform", relMzDev = 5, adduct = "[M+H]+", elements = "CHNOP",
                             oc = FALSE, calculateFeatures = TRUE,
                             featThresholdAnn = 0.75)
# Calculate compound structure candidates
compounds <- generateCompounds(fGroups, mslists, "metfrag", dbRelMzDev = 5, fragRelMzDev = 5, fragAbsMzDev = 0.002,
                               adduct = "[M+H]+", database = "pubchemlite",
                               maxCandidatesToStop = 2500)
compounds <- addFormulaScoring(compounds, formulas, updateScore = TRUE)
# Annotate suspects
fGroups <- annotateSuspects(fGroups, formulas = formulas, compounds = compounds, MSPeakLists = mslists,
                            IDFile = "idlevelrules.yml")
mslib <- loadMSLibrary("~/../Downloads/MassBank_NIST (1).msp", "msp")
compoundsLib <- generateCompounds(fGroups, mslists, "library", mslib, adduct = "[M+H]+")
fGroupsLib <- annotateSuspects(fGroups, formulas = formulas, compounds = compoundsLib, MSPeakLists = mslists,
                               IDFile = "idlevelrules.yml")
compoundsCons <- consensus(compounds, compoundsLib)
fGroupsCons <- annotateSuspects(fGroups, formulas = formulas, compounds = compoundsCons, MSPeakLists = mslists,
                                IDFile = "idlevelrules.yml")
siMF <- screenInfo(fGroups)
siLib <- screenInfo(fGroupsLib)
siCons <- screenInfo(fGroupsCons)
siCons <- siCons[name %in% c(siMF$name, siLib$name)]
siCons[, IDL_MF := {
    n <- name
    siMF[match(n, name)]$estIDLevel
}]
siCons[, IDL_lib := {
    n <- name
    siLib[match(n, name)]$estIDLevel
}]
siCons <- siCons[numericIDLevel(estIDLevel) > pmin(numericIDLevel(IDL_MF), numericIDLevel(IDL_lib))]
Thanks,
Hi Rick,
Hi Rick,
I am trying to merge the results of generateCompounds obtained with MetFrag and with an in-house library.
At first I thought I could use consensus(), but it does not give me exactly what I want. Maybe I am lost with the settings.
Briefly: 1. I run generateCompounds with MetFrag and, separately, with a library. 2. I would like a compounds object that merges the information from the two tools and keeps the highest match score, in order to increase the confidence level of my final annotation.
Using consensus() decreases my final confidence level, because some candidates are LC2a with MetFrag but not with the library, or the reverse.
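The "keep the best of both tools" idea can be pictured as a per-suspect comparison of estimated ID levels, independent of how the compounds objects themselves are merged. The following is a rough sketch in base R on made-up data. The table contents, the IDL_cons column, and the levelNum helper are hypothetical (IDL_MF and IDL_lib only mirror the column names used in the script in this thread), and nothing here calls patRoon.

```r
# Toy ID levels per suspect (NOT patRoon output).
idl <- data.frame(
    name     = c("suspectA", "suspectB", "suspectC"),
    IDL_MF   = c("2a", "3c", NA),    # level from the MetFrag-only run
    IDL_lib  = c("3a", "2a", "2a"),  # level from the library-only run
    IDL_cons = c("3a", "2a", "2a"),  # level after annotating with the consensus
    stringsAsFactors = FALSE
)

# Hypothetical helper: drop the sub-level letter, e.g. "2a" -> 2.
levelNum <- function(l) as.integer(gsub("[[:alpha:]]", "", l))

# Best (numerically lowest) level reached by either tool; NA means "no hit".
idl$best <- pmin(levelNum(idl$IDL_MF), levelNum(idl$IDL_lib), na.rm = TRUE)

# Keep the label of the better tool; ties on the number go to MetFrag here.
useMF <- !is.na(idl$IDL_MF) &
    (is.na(idl$IDL_lib) | levelNum(idl$IDL_MF) <= levelNum(idl$IDL_lib))
idl$bestLevel <- ifelse(useMF, idl$IDL_MF, idl$IDL_lib)

# Flag suspects whose consensus level is poorer than the best single-tool level.
idl$degraded <- levelNum(idl$IDL_cons) > idl$best

print(idl)
```

A table like this makes it easy to see which suspects lose confidence in the consensus and to inspect only those cases, rather than comparing the three annotation runs by hand.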
Thank you for your help
Best
Boris