Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gene set expression analysis using addModuleScore() #5549

Closed
andynkili opened this issue Jan 26, 2022 · 7 comments
Closed

gene set expression analysis using addModuleScore() #5549

andynkili opened this issue Jan 26, 2022 · 7 comments

Comments

@andynkili
Copy link

I am interested in cells expressing both a specific gene and a gene set:
I used the code from #3521:

geneSet_list <- list(geneSet)
object <- AddModuleScore(object = object, features = geneSet_list , name = "geneSet")

After adding the score to the object metadata, I do a FeaturePlot with the specific gene and the gene set to check their co-expression:
FeaturePlot(object, c('specificGene', 'geneSet1'), blend = TRUE, label = T, pt.size = 2)

Some cells seem to express both but it's hard to tell with the shades of colors, even playing with the blend threshold and choosing different palette.
To overcome the view issue, I did:
DimPlot(object, cells.highlight= WhichCells(object,expression = specificGene>0 & geneSet1>0))

However after reading #522, I realized that my way of doing this might be wrong since I am using the gene set score as if it was gene count .
Since the score of gene set is unitless and reflects the likeliness of the gene set being more expressed that expected, it cannot give any information on the level of expression of the gene set, can it?

If so, would it be an other way of assessing the level of co-expression of the specific gene and the gene set?

@mhkowalski
Copy link
Contributor

Thanks for your question. Yes, the module score represents relative expression. If gene A is highly expressed across all cells, the module scores assess whether a given cell expresses gene A more often than other highly expressed genes. Therefore, if the module score is high, it indicates these genes are expressed. If the module score is low, it might mean that these genes are expressed but not more so than in other cells.

I would recommend looking at co-expression of the specific gene and each member of the gene set. To quantify this, you could calculate the correlation of expression of the specific gene with each member of the gene set.

@gt7901b
Copy link

gt7901b commented Feb 1, 2022

@mhkowalski will you please elaborate on how to calculate the correlation of expression of the specific gene with each member of the gene set and check co-expression of the specific gene and each member of the gene set? thanks

@denvercal1234GitHub
Copy link

@gt7901b @mhkowalski Did you guys have any update on how to calculate the correlation of expression for a given gene? Thanks in advance for your help!

@RajneeshSrivastava
Copy link

Hi @mhkowalski and @samuel-marsh,

I am trying to understand the calculations powered in the "AddModule Score" function of seurat. Luckily, I got a post that explains the fundamentals very well.

To compute the score, I used the script for 4 known signature genes (of my cell type of interest - labeling "cell_typeA" to trace back) as suggested (#3521). When I use the snippet, it adds on to 4 score (i guess for individual gene) columns appended to metadata viz. cell_typeA_score1, cell_typeA_score2, cell_typeA_score3, cell_typeA_score4. (tested for 100 genes, gave 100 score columns suffixed "1"..."100")

Please help me understand on adding "1" (for instance - cell_typeA_score"1") and its significance over other (Why not 2 /3 / 4?)
Are these scores computed gene-wise (and appended)? If, so how AddModuleScore is computing the cumulative enrichment?

I am sure, I might be missing something in the function, please clarify.

Again thank you so much, Seurat team and kudos to seurat5!

With regards,
Rajneesh

@samuel-marsh
Copy link
Collaborator

Hi Rajneesh,

So you need the gene set to be of class list in order to be scored together. Otherwise it will create one score for each gene. The numeral is simply that. It starts at 1 and goes up. So if you have list of two vectors they will be suffixed 1 and 2.

To score a set of genes you run (let’s call them T cell genes) you would run something like the following:

t_genes <- c(“gene1”, “gene2”, “gene3”, etc)

obj <- AddModuleScore(pbmc, features = list(t_genes), name="T_Score")

Best,
Sam

@YU-LIN96
Copy link

YU-LIN96 commented Dec 3, 2023

Otherwise it will create one score for each gene. The numeral is simply that. It starts at 1 and goes up. So if you have list of two vectors they will be suffixed 1 and 2.

This is not quite right
Given my code
"""

fear_condition_genes <- mapIds(org.Mm.eg.db,
      keys = "GO:0042596",
      keytype="GO",
      column="SYMBOL", multiVals = "list")

fear_condition_genes <- fear_condition_genes$`GO:0042596`


merge.obj <- AddModuleScore(merge.obj, 
               features = list(fear_condition_genes),
               name = "fear_response")

"""
and
"""

fear_condition_genes
[1] "Adra2a" "Adrb1" "Bdnf" "Cacna1e" "Crhr1" "Dbh" "Drd4" "Ext1" "Grin2b"
class(fear_condition_genes)
[1] "character"
"""
I do not see how this is wrong. I am currently using Seurat4.
And it is giving me fear_response1-9 still

@samuel-marsh
Copy link
Collaborator

Hi @YU-LIN96,

So I don't see anything that you are doing wrong. Your variable fear_condition_genes won't be of class list because you didn't change the format before using in function. You can either make variable a list before or during AddModuleScore.

gene_list <- c("gene1", "gene2", etc)

Obj <- AddModuleScore(Obj, features = list(gene_list))

# OR

gene_list <- list(c("gene1", "gene2", etc))

Obj <- AddModuleScore(Obj, features = gene_list)

The important thing is to check meta.data slot after adding score and make sure it only added one column

grep(pattern = "fear_condition_genes", x = merge.obj@meta.data, value = TRUE)

Best,
Sam

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants