GSVA calculation takes extremely long #18

Open
jonrot1906 opened this issue Oct 25, 2023 · 3 comments


@jonrot1906

Dear @guokai8,

Thanks for your great package. I am struggling to use it on my dataset because the GSVA calculation takes an extremely long time.
I am using a custom gene set with this structure:

GeneID | Annot
PTGS2 | Ferroptosis

And I am running these commands:

gene_set <- read.csv("gene_set.csv")
res <- scgsva(nft_ad, annot = gene_set, method = "gsva", useTerm = FALSE)

This produces the following console messages (which look fine in my opinion):

Setting parallel calculations through a MulticoreParam back-end
with workers=4 and tasks=100.
Estimating GSVA scores for 1 gene sets.
Estimating ECDFs with Poisson kernels
Estimating ECDFs in parallel on 4 cores

About 21 iterations (cells, I assume) took around 12 hours. I am running this on an M1 Pro MacBook with 32 GB RAM. Do you think it will be faster once I switch to a computer with better specifications? I want to run the GSVA analysis on around 100,000 cells, which would take ages at this rate.

I am keen to get your recommendations!
Thanks and best regards,
Jonas
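
For readers following along, here is a minimal sketch of how the two-column table described above (GeneID, Annot) can be inspected before running any scoring. Only the PTGS2 row comes from the post; the other rows and the `expr_genes` vector are made up for illustration. Nothing here calls scGSVA: it uses base R only to count how many genes each set contains and how many of them are present in the expression data.

```r
# Toy stand-in for gene_set.csv. Only the PTGS2 row comes from the post;
# the remaining rows are illustrative assumptions.
gene_set <- data.frame(
  GeneID = c("PTGS2", "GPX4", "ACSL4", "TP53"),
  Annot  = c("Ferroptosis", "Ferroptosis", "Ferroptosis", "Apoptosis"),
  stringsAsFactors = FALSE
)

# Group gene symbols by annotation term: one character vector per gene set.
gene_sets <- split(gene_set$GeneID, gene_set$Annot)

# Number of genes in each set.
set_sizes <- lengths(gene_sets)
print(set_sizes)

# Hypothetical row names of an expression matrix, used to check overlap.
expr_genes <- c("PTGS2", "GPX4", "ACTB")
overlap <- vapply(gene_sets, function(g) sum(g %in% expr_genes), integer(1))
print(overlap)
```

A check like this shows whether a set is very small or barely overlaps the data, which is worth knowing before committing to a long run.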

@guokai8
Owner

guokai8 commented Nov 17, 2023

Hi @jonrot1906 ,
I am working on the new version now and will fix this issue soon. Thanks!
K

@guokai8
Owner

guokai8 commented Nov 22, 2023

Hi @jonrot1906 ,
I am now testing two approaches: (1) batch methods and (2) sampling methods. I may release the new version in a few days.
Best,
K

@guokai8
Owner

guokai8 commented Nov 28, 2023

Hi @jonrot1906 ,
The batch method is available now. You can also calculate UCell scores by setting method="UCell". I am now working on the sampling methods.
K
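
The two ideas mentioned in this thread, batching and sampling, can be sketched generically in base R. This is not scGSVA's implementation and does not use its API. The matrix is simulated, and `score_cells` is a hypothetical placeholder (mean expression of the set's genes) standing in for a real scoring method. Note that for a method whose score for one cell depends on the other cells, as GSVA's ECDF estimation step in the console output above suggests, batching or sampling may change the scores, whereas a purely per-cell score is unaffected.

```r
set.seed(1)

# Toy expression matrix: 5 genes x 10 cells (simulated counts).
expr <- matrix(rpois(50, lambda = 3), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("cell", 1:10)))
genes_in_set <- c("gene1", "gene3")

# Placeholder per-cell score: mean expression of the set's genes.
# This is NOT GSVA; it only stands in for a scoring method.
score_cells <- function(mat, genes) {
  colMeans(mat[intersect(genes, rownames(mat)), , drop = FALSE])
}

# Approach 1: batching. Split the cell indices into chunks of 4,
# score each chunk separately, then concatenate in the original order.
batch_size <- 4
cell_idx <- seq_len(ncol(expr))
batches <- split(cell_idx, ceiling(cell_idx / batch_size))
batch_scores <- unlist(lapply(batches, function(idx) {
  score_cells(expr[, idx, drop = FALSE], genes_in_set)
}), use.names = FALSE)
names(batch_scores) <- colnames(expr)

# Approach 2: sampling. Score only a random subset of cells.
sampled_cells <- sample(colnames(expr), size = 5)
sample_scores <- score_cells(expr[, sampled_cells, drop = FALSE], genes_in_set)

# For a per-cell score, batching matches a single pass over all cells.
full_scores <- score_cells(expr, genes_in_set)
stopifnot(isTRUE(all.equal(batch_scores, full_scores)))
```

Batching keeps every cell but bounds the size of each computation. Sampling reduces the total work at the cost of scoring only a subset of cells.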
