getSignificance very memory intensive when fit = ANOVA #63
Comments
Update: I have subsetted each of the individual cell types, and am now running

Update: Using some meta-data kung-fu, I was able to segregate the cell types into individual Seurat objects and successfully run

With regard to the interpretation: for the sake of argument, say that in cell type X, for the GO term 'increased carbohydrate synthesis', I am comparing males to females and the p-value is <0.001 (the column name is malesvsfemales.pval). Say also that the median for males is 4000 and the median for females is 2000. Would that suggest that this specific GO term is significantly up-regulated in males (in the males_vs_females direction), owing to the higher median in males?

Cheers,
🐉
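To make the interpretation question concrete, here is a minimal sketch in plain Python. It is not the escape API, and the scores, the p-value, and the helper name `direction` are all hypothetical. It only reports which group has the higher median and whether a separately computed p-value falls below a chosen threshold; whether that combination should be read as "up-regulated" depends on the test that produced the p-value.

```python
from statistics import median

# Hypothetical per-cell enrichment scores for one GO term in one cell type.
scores = {
    "males": [3800.0, 4000.0, 4100.0, 4300.0, 3950.0],
    "females": [1900.0, 2000.0, 2100.0, 2050.0, 1950.0],
}


def direction(scores, group_a, group_b, p_value, alpha=0.05):
    """Report both medians, the group with the higher median, and significance.

    The p-value is taken as given (assumed to come from the statistical test);
    this helper does not compute it.
    """
    med_a = median(scores[group_a])
    med_b = median(scores[group_b])
    if med_a == med_b:
        higher = None
    else:
        higher = group_a if med_a > med_b else group_b
    return {
        "median_" + group_a: med_a,
        "median_" + group_b: med_b,
        "higher": higher,
        "significant": p_value < alpha,
    }


# p_value=0.0005 stands in for the "<0.001" value in the example above.
result = direction(scores, "males", "females", p_value=0.0005)
print(result)
```

With these made-up numbers the male median is 4000 and the female median is 2000, so the helper reports males as the higher group and the difference as significant at the chosen threshold.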
Hey Fahd,

Apologies for the delay. Thanks for reaching out and giving an extensive summary (with follow-up) of the problem. You are completely correct: there is a large computational requirement for a lot of the additional testing as

Please let me know if you have any suggestions from your experience, and I will test some ideas I have in the meantime.

Nick
Cheers, thanks legend!!
Hi Nick!
Amazing tool! I was wondering if you could help me with an issue I am having owing to a large dataset.
I have a `seurat` object which consists of sample data integrated across 15 individuals: 50k+ subsetted, high-quality, doublet-free cells. In this dataset I have a `metadata` field for `celltype` (15 types), which are further broken down by race (2 types) and sex (2 types). This leads to a joined metadata slot with 60 cell types on the basis of sex and race.
Now I have already downloaded the entire `msigdb` for cat `C5` (gene ontologies), using:
and successfully ran enrichIt:
At this point, I have successfully reproduced all of the plotting params that I have tested and which are outlined in your vignette (obviously after transferring an enormous amount of C5 metadata, or the ES file, to the Seurat object).
Now I am interested in running a statistical test across all 60 samples in various configurations to test for pathways that are more activated in some sex_ancestral_celltype vs another sex_ancestral_celltype (obviously the comparison is for the same cell type but differing sex and ancestry). In order to achieve this I run:
As I write this, I have just crossed 450 GB of RAM utilization. Is there any way I can reduce this colossal computational complexity?
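For a sense of scale, here is a rough count of the comparison space, sketched in Python. It assumes, purely for illustration, that the test considers pairwise comparisons between groups; how `getSignificance` actually enumerates comparisons is not confirmed here. The group counts come from the description above (15 cell types, 2 sexes, 2 ancestries).

```python
from math import comb

n_celltypes = 15
n_sex = 2
n_race = 2

# Each cell type is split into sex x race groups.
groups_per_celltype = n_sex * n_race                # 4
total_groups = n_celltypes * groups_per_celltype    # 60

# Every pair among all 60 joined groups.
all_pairs = comb(total_groups, 2)

# Only pairs that share a cell type (the comparisons of interest).
within_celltype_pairs = n_celltypes * comb(groups_per_celltype, 2)

print(total_groups, all_pairs, within_celltype_pairs)  # 60 1770 90
```

Under that assumption, restricting the comparisons to groups within the same cell type reduces 1770 candidate pairs to 90.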
I am thinking that probably the only way is to subset the cell types individually, calculate the significance, and then add that metadata iteratively back to the primary Seurat file. Or is there some way of using `getSignificance` to specify the testing combinations?

Thanks again for the wonderful tool.
Cheers,
🐉
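The subset-then-merge idea described above can be sketched in a language-agnostic way. The Python below is not the escape or Seurat API. It uses hypothetical toy scores for a single gene set, groups cells by cell type, computes a one-way ANOVA F statistic within each cell type only, and collects the per-cell-type results into one table. The helper `f_statistic` is written out by hand for illustration and does not compute p-values.

```python
from collections import defaultdict
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F statistic for a dict {group_label: [scores]}.

    Assumes at least two groups and non-zero within-group variance.
    """
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(
        len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values()
    )
    ss_within = sum(
        (v - mean(vals)) ** 2 for vals in groups.values() for v in vals
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical (celltype, group, enrichment score) records for one gene set.
cells = [
    ("T", "male", 1.0), ("T", "male", 2.0), ("T", "male", 3.0),
    ("T", "female", 4.0), ("T", "female", 5.0), ("T", "female", 6.0),
    ("B", "male", 1.0), ("B", "male", 2.0), ("B", "male", 3.0),
    ("B", "female", 1.0), ("B", "female", 2.0), ("B", "female", 3.0),
]

# Step 1: subset by cell type, keeping the groups separate inside each subset.
by_celltype = defaultdict(lambda: defaultdict(list))
for celltype, group, score in cells:
    by_celltype[celltype][group].append(score)

# Step 2: test within each subset; Step 3: merge results into one table.
results = {}
for celltype, groups in by_celltype.items():
    results[celltype] = f_statistic(groups)

print(results)  # {'T': 13.5, 'B': 0.0}
```

Because each cell type is tested on its own, only the groups belonging to that cell type are compared at any one time, which is the property the per-cell-type workaround relies on.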