In fgsea(pathways = geneSets, stats = geneList, nperm = nPerm, minSize = minGSSize, : There are duplicate gene names, fgsea may produce unexpected results #189
Try |
If I use the command, I get the following. However, if I run unique_GSEA <- unique(genekegg_6) to get all the unique entries, I get the same number of genes as in the original genekegg_6. |
Did you remove those duplicated entries? Or did you somehow correct the names of the values? Unless the names are unique, it won't work. |
How can I remove them? Just using unique does not seem to get rid of them.
|
If you use |
Thank you so much! The error message is now gone 👍 |
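For anyone landing here with the same question ("How can I remove them?"), one common way to drop entries with duplicated names is sketched below. The vector and its values are hypothetical, and keeping the largest value per ID is just one possible choice, not something clusterProfiler or fgsea prescribes.

```r
# Toy ranked list (hypothetical values): the ID "114794" appears twice.
gene_list <- c("400746" = 5.70, "114794" = 4.96, "90204" = 4.85, "114794" = 4.55)

# Sort first so that, for a repeated ID, the largest value comes first ...
gene_list <- sort(gene_list, decreasing = TRUE)

# ... then keep only the first occurrence of each name.
dedup <- gene_list[!duplicated(names(gene_list))]

# The names are now unique, which is what the warning is about.
stopifnot(anyDuplicated(names(dedup)) == 0)
dedup
```

Note that unique(gene_list) would not help here, because it compares the values and ignores the names.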
Would you please tell me how to prepare geneSet for fgsea? |
@rodela71 If you have a question, you could ask at https://support.bioconductor.org; this tracker is for issues with the package itself, for example when it is not working properly or there are other problems. To get help, I would suggest showing what you have tried and explaining at which step you are having trouble. |
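As a rough pointer on the input shapes: the fgsea call quoted in the issue title takes pathways (the gene sets) and stats (the ranked list). A minimal sketch, assuming pathways is a named list of character vectors and stats is a named numeric vector in the same ID space, could look like the following. The set names and memberships are made up; the statistics are the head() values from the original post.

```r
# Hypothetical gene sets: a named list of character vectors of gene IDs.
pathways <- list(
  set_A = c("400746", "114794", "90204"),
  set_B = c("29881", "55515", "90204")
)

# Ranked statistics: named numeric vector, names in the same ID space as the sets.
stats <- c("400746" = 5.700247, "114794" = 4.955668, "90204" = 4.846596,
           "104355220" = 4.550389, "29881" = 4.490711, "55515" = 4.188874)

# Sanity checks before running the enrichment.
stopifnot(is.list(pathways), !is.null(names(pathways)))
stopifnot(anyDuplicated(names(stats)) == 0)

# How many members of each set are actually present in the ranked list?
overlap <- vapply(pathways, function(p) sum(p %in% names(stats)), integer(1))
overlap

# With the fgsea package installed, the call would look roughly like:
# res <- fgsea::fgsea(pathways = pathways, stats = stats, minSize = 1, maxSize = 500)
```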
Hi @llrs, apologies for commenting in this closed issue, but my question is very related: Ideally, I would like to keep gene duplicates and treat them as individual entries of the same gene set. Is there a mode for this, or could you point me to the code snippet where this step happens? Many thanks! |
@AStubbusch this warning comes from fgsea, so it has nothing to do with clusterProfiler (except that clusterProfiler doesn't check for duplicates before passing the data to fgsea). The problem in fgsea is that, with duplicates, the bootstrapping and selection of genes might not follow the underlying mathematical assumptions of the test. For GSEA to make sense, you can't keep duplicated genes as separate entries while treating them as one gene. I would merge them into a single entity. If these are transcripts, you can use gene expression instead of transcript expression, and similarly for other entities. If people have more questions, please post on the support.bioconductor.org site so that other people (and maintainers) can help. I will stop helping here, as this is not the right place to ask questions and I don't want to encourage them here. |
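Merging duplicates into a single entity could be sketched as below. This assumes that averaging the statistic per gene is acceptable for the analysis; another summary (for example the value with the largest absolute magnitude) may be more appropriate, and the data here are hypothetical.

```r
# Toy ranked list (hypothetical values): the ID "114794" appears twice.
gene_list <- c("400746" = 5.70, "114794" = 4.96, "90204" = 4.85, "114794" = 4.55)

# Collapse entries sharing a name into one value per gene.
# Using mean() is an assumption, not a recommendation from fgsea.
merged <- tapply(gene_list, names(gene_list), mean)

# tapply() returns a 1-d array; convert it back to a plain named vector
# and restore the decreasing order expected for a ranked list.
merged <- sort(setNames(as.vector(merged), names(merged)), decreasing = TRUE)
merged
```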
Thanks a lot for your answer @llrs, and apologies; further questions will go to support.bioconductor.org! |
I have been using the clusterProfiler guide https://bioconductor.org/packages/release/bioc/vignettes/clusterProfiler/inst/doc/clusterProfiler.html and have created a ranked gene list according to the instructions from the Wiki.
head(genekegg_6)
400746 114794 90204 104355220 29881 55515
5.700247 4.955668 4.846596 4.550389 4.490711 4.188874
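For context, a ranked list of this shape is typically built as a named numeric vector sorted in decreasing order. The sketch below assumes a hypothetical table de with an ID column and a statistic column; the column names are illustrative, and the values are taken from the head() output above.

```r
# Hypothetical differential-expression table (column names are assumptions).
de <- data.frame(
  entrez = c("114794", "400746", "90204"),
  logFC  = c(4.955668, 5.700247, 4.846596),
  stringsAsFactors = FALSE
)

# Values are the ranking statistic, names are the gene IDs.
gene_list <- de$logFC
names(gene_list) <- de$entrez

# GSEA expects the vector sorted in decreasing order.
gene_list <- sort(gene_list, decreasing = TRUE)
gene_list
```

If the ID column contains the same ID more than once, the resulting vector will carry duplicated names even when all values differ.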
When I run gseKEGG or gsePathway, I get the same type of warning.
gsea_6_kegg <- gseKEGG(geneList = genekegg_6,
                       organism = 'hsa',
                       nPerm = 1000,
                       maxGSSize = 500,
                       minGSSize = 20,
                       pAdjustMethod = "BH",
                       pvalueCutoff = 0.05,
                       verbose = TRUE)
preparing geneSet collections...
GSEA analysis...
leading edge analysis...
done...
Warning message:
In fgsea(pathways = geneSets, stats = geneList, nperm = nPerm, minSize = minGSSize, :
There are duplicate gene names, fgsea may produce unexpected results
The warning says that there are duplicated gene names, but I cannot find any duplicated gene names in the list.
anyDuplicated(genekegg_6)
[1] 0
I do not understand why I get this warning message.
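A likely explanation, sketched with a hypothetical toy vector: anyDuplicated(genekegg_6) compares the values of the vector, whereas the warning concerns its names. Checking the names directly would look like this.

```r
# Toy ranked list (hypothetical values): the ID "114794" appears twice,
# but every value is distinct.
gene_list <- c("400746" = 5.70, "114794" = 4.96, "90204" = 4.85, "114794" = 4.55)

anyDuplicated(gene_list)          # checks the values: 0, nothing flagged
anyDuplicated(names(gene_list))   # checks the names: 4, index of the first repeat

# unique() also works on values, so the length does not change.
length(unique(gene_list)) == length(gene_list)   # TRUE

# List every name that occurs more than once.
dup_ids <- unique(names(gene_list)[duplicated(names(gene_list))])
dup_ids
```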