gene set expression analysis using addModuleScore() #5549
Comments
Thanks for your question. Yes, the module score represents relative expression. If gene A is highly expressed across all cells, the module score assesses whether a given cell expresses gene A more often than other highly expressed genes. Therefore, if the module score is high, it indicates these genes are expressed. If the module score is low, it might mean that these genes are expressed but not more so than in other cells. I would recommend looking at co-expression of the specific gene and each member of the gene set. To quantify this, you could calculate the correlation of expression of the specific gene with each member of the gene set.
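A minimal sketch of the per-gene correlation suggested above, using simulated data in base R so it runs without a Seurat object. The gene names (GeneA, Gene1 to Gene3) and the simulated values are assumptions for illustration only. With a real object, a comparable cells-by-genes data frame could be pulled with FetchData(object, vars = c(specific_gene, gene_set)).

```r
# Simulated normalized expression: rows are cells, columns are genes.
# All gene names and values here are hypothetical.
set.seed(1)
n_cells <- 200
specific_gene <- "GeneA"
gene_set <- c("Gene1", "Gene2", "Gene3")

base <- rexp(n_cells)
expr <- data.frame(
  GeneA = pmax(base + rnorm(n_cells, sd = 0.2), 0),
  Gene1 = pmax(base + rnorm(n_cells, sd = 0.5), 0),      # co-varies with GeneA
  Gene2 = rexp(n_cells),                                 # independent of GeneA
  Gene3 = pmax(2 - base + rnorm(n_cells, sd = 0.5), 0)   # anti-correlated
)

# Spearman correlation of the specific gene with each gene-set member.
cors <- sapply(gene_set, function(g) {
  cor(expr[[specific_gene]], expr[[g]], method = "spearman")
})
print(round(cors, 2))

# Fraction of cells in which both genes are detected (value > 0).
coexpr <- sapply(gene_set, function(g) {
  mean(expr[[specific_gene]] > 0 & expr[[g]] > 0)
})
print(round(coexpr, 2))
```

Spearman is used here because single-cell expression is sparse and skewed; Pearson (the default of cor) would be an equally valid choice depending on the data.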
@mhkowalski, could you please elaborate on how to calculate the correlation of expression of the specific gene with each member of the gene set, and how to check their co-expression? Thanks.
@gt7901b @mhkowalski Do you have any update on how to calculate the correlation of expression for a given gene? Thanks in advance for your help!
Hi @mhkowalski and @samuel-marsh, I am trying to understand the calculations behind Seurat's AddModuleScore function. Luckily, I found a post that explains the fundamentals very well. To compute the score, I used the script suggested in #3521 with 4 known signature genes of my cell type of interest (labeled "cell_typeA" so I can trace them back). When I run the snippet, it appends 4 score columns to the metadata (one per gene, I guess): cell_typeA_score1, cell_typeA_score2, cell_typeA_score3, cell_typeA_score4. (With 100 genes, I got 100 score columns suffixed "1" to "100".) Please help me understand what the appended number means (for instance, the "1" in cell_typeA_score1) and how it differs from the others (why not 2, 3, or 4?). I am sure I am missing something in the function, so please clarify. Again, thank you so much, Seurat team, and kudos to Seurat 5! With regards,
Hi Rajneesh, So you need the gene set to be of class list. To score a set of genes (let's call them T cell genes), you would run something like the following:
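A minimal sketch of that pattern, assuming a Seurat object named `object` and using marker names chosen only for illustration. The base R part runs on its own and shows why the wrapping matters: AddModuleScore computes one score per element of `features`, and the number appended to the column name is the index of that element.

```r
# Hypothetical T cell marker names, used only for illustration.
t_cell_genes <- c("CD3D", "CD3E", "CD2", "IL7R")

# A character vector has one element per gene, while a list wrapping it
# has a single element: the whole gene set.
as_vector <- t_cell_genes
as_list <- list(t_cell_genes)
print(length(as_vector))  # one element per gene
print(length(as_list))    # one element for the whole set

# With Seurat loaded and a Seurat object called `object` (an assumption),
# the call would then look like this:
# object <- AddModuleScore(object, features = as_list, name = "t_cell_score")
# head(object@meta.data)  # expect a single new column, "t_cell_score1"
```

Passing the bare character vector instead treats each gene as its own set, which matches the one-column-per-gene result described in the question.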
Best,
This is not quite right
Hi @YU-LIN96, So I don't see anything that you are doing wrong. Your variable
The important thing is to check the meta.data slot after adding the score and make sure it only added one column.
Best,
I am interested in cells expressing both a specific gene and a gene set:
I used the code from #3521:
After adding the score to the object metadata, I do a FeaturePlot with the specific gene and the gene set to check their co-expression:
FeaturePlot(object, c('specificGene', 'geneSet1'), blend = TRUE, label = TRUE, pt.size = 2)
Some cells seem to express both, but it's hard to tell from the shades of color, even after playing with the blend threshold and choosing a different palette.
To overcome the view issue, I did:
DimPlot(object, cells.highlight = WhichCells(object, expression = specificGene > 0 & geneSet1 > 0))
However, after reading #522, I realized that my way of doing this might be wrong, since I am using the gene set score as if it were a gene count.
Since the gene set score is unitless and reflects how likely the gene set is to be more expressed than expected, it cannot give any information on the level of expression of the gene set, can it?
If so, would there be another way of assessing the level of co-expression of the specific gene and the gene set?
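To illustrate why a score threshold is not an expression threshold, here is a simplified sketch of the idea behind a module score: the average expression of the gene set minus the average expression of a group of control genes. This is not Seurat's implementation, which picks its control genes from bins of similarly expressed genes; the data and the hand-made control group below are assumptions.

```r
set.seed(42)
n_cells <- 100

# Simulated expression, rows are genes and columns are cells.
# Gene set and control genes are drawn from the same distribution.
set_genes  <- matrix(rexp(3 * n_cells, rate = 0.5), nrow = 3)
ctrl_genes <- matrix(rexp(30 * n_cells, rate = 0.5), nrow = 30)

# Simplified score: mean of the gene set minus mean of the controls.
score <- colMeans(set_genes) - colMeans(ctrl_genes)

# Cells in which every gene of the set is detected.
expressed <- colSums(set_genes > 0) == nrow(set_genes)

print(mean(expressed))  # fraction of cells expressing the whole set
print(mean(score < 0))  # fraction of those same cells with a negative score
```

In this toy data every cell expresses every gene of the set, yet many cells get a negative score, so filtering on geneSet1 > 0 selects cells by relative rather than absolute expression.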