Question about running the scDECAF #2
Comments
Hi! Thank you for using the issue tracker!
The vector space representation computed by scDECAF requires more than one gene signature, so I'd recommend adding other gene sets to run the model. For your project, for example, the differentially expressed genes in differentially abundant neighbourhoods, which you can get from miloDE, will provide a sufficient number of gene sets to use as input to scDECAF. Link to miloDE: https://github.com/MarioniLab/miloDE. Hope this helps.
So, for example, if I have two gene signatures, s1 = [a, b, c] and s2 = [d, e, f], I can create a geneset like [s1, s2]. Then I run the
so, the
As I mentioned, due to the nature of the model we generally need more than 2 gene sets. You got the order of running the functions right, but if your full geneset list has fewer than 10 gene sets, pruning via
I suggest you also check out our tutorials from the reproducibility repo. Hope this helps.
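To make the [s1, s2] idea above concrete, here is a minimal, language-agnostic sketch in Python. It is not scDECAF's API (scDECAF is an R package), and `build_membership` is a hypothetical helper. It only illustrates one common way to hold several named gene sets and turn them into a 0/1 gene-by-set membership table over the genes present in the data; whether scDECAF represents gene sets internally in this way is an assumption.

```python
# Hypothetical illustration only; not part of scDECAF.
def build_membership(genes, gene_sets):
    """Return (set_names, matrix), where matrix maps each gene to a 0/1 row
    with one entry per gene set, in the order given by set_names."""
    set_names = list(gene_sets)
    members = {name: set(gene_sets[name]) for name in set_names}
    matrix = {
        gene: [1 if gene in members[name] else 0 for name in set_names]
        for gene in genes
    }
    return set_names, matrix


# Genes measured in the data; "g" belongs to no signature.
genes = ["a", "b", "c", "d", "e", "f", "g"]
# Two named signatures, as in the s1/s2 example above.
gene_sets = {"s1": ["a", "b", "c"], "s2": ["d", "e", "f"]}

set_names, matrix = build_membership(genes, gene_sets)
```

With only two columns this table is very narrow, which is consistent with the advice above to supply a larger collection of gene sets.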
Great! Thanks for your suggestion! I will try it in the coming days.
Hi. So the error is suggesting that
For embedding, you can use UMAP as you're doing here, but you can also consider any other embedding (PCA, PHATE, etc., with > 2 dimensions). Hope this helps!
OK, thanks. Are the row names set for the embedding matrix? At some point scDECAF matches column names in
Sorry, I'm not sure what you mean. The row names of the embedding are the gene names and the column names are PC1, PC2, PC3, PC4. The row name of
Ah, then I see what's going wrong. The embedding is a cell embedding, i.e. it has dimensions n_cells x n_D, where n_D is the number of dimensions in the reduced-dimension space, whereas you are providing a gene embedding. Your initial code was correct because you had
So the
Correct.
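The cell-versus-gene embedding mix-up described above can be caught with a quick sanity check before running the model. The sketch below is in Python rather than R and is not part of scDECAF; `embedding_orientation` is a hypothetical helper. It assumes that the row names of a cell embedding should be the same names as the cells (columns) of the expression matrix.

```python
# Hypothetical illustration only; not part of scDECAF.
def embedding_orientation(cell_names, gene_names, embedding_row_names):
    """Classify what the rows of an embedding correspond to."""
    rows = list(embedding_row_names)
    if sorted(rows) == sorted(cell_names):
        return "cells"  # n_cells x n_D: a cell embedding
    if rows and set(rows) <= set(gene_names):
        return "genes"  # n_genes x n_D: a gene embedding
    return "unknown"


cell_names = ["cell1", "cell2", "cell3"]
gene_names = ["g1", "g2", "g3", "g4"]

ok = embedding_orientation(cell_names, gene_names, ["cell1", "cell2", "cell3"])
bad = embedding_orientation(cell_names, gene_names, ["g1", "g2", "g3", "g4"])
```

Here `ok` is "cells" and `bad` is "genes"; an embedding whose rows are gene names, as in the PC1 to PC4 matrix described earlier, would fall into the second case.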
Also, since you only have 8 gene sets,
Thanks! Will try it.
Hi, I'm NianzhenGu's teammate. I still cannot run scDECAF successfully. Here's the error:
"merged_logcounts" was defined by "merged_logcounts <- logcounts(merged_sce)", and the logcounts assay was generated by "merged_sce <- logNormCounts(merged_sce)".
"target" was defined by "target <- genesets2ids(merged_logcounts, gene_signature)", where "gene_signature" was a list of gene sets, as below:
hvg_union was a vector of highly variable genes we chose.
Reduced dimensions were generated by:
Do you have any idea what the problem could be? Thanks a lot :)
Hey :).
Yes, it worked! Thank you so much for your help!
No worries. Please close the issue if this is done!
Hi! I'm working on a single-cell RNA project that compares single-cell transcriptomic data of embryonic and adult mouse colons to identify embryonic-specific gene signatures and use these genes to score colon cancer single-cell data. I find the scDECAF algorithm is suitable for this project.
I have some problems understanding the inputs to the algorithm in the Quick Start part.
geneset and the HM_geneset. Does the HM_geneset represent the human geneset that can be downloaded online? How about the mouse geneset?
What I have now is the gene signature, a vector of gene ids, like "ENSMUSG00000031957" "ENSMUSG00000069893" "ENSMUSG00000055827"...; the data I want to score: a SingleCellExperiment object; a list of highly variable genes (hvg).
I would very much appreciate it if you could give me some instructions. Thanks!