gene set expression analysis using addModuleScore() #5549

andynkili · 2022-01-26T16:25:33Z

I am interested in cells expressing both a specific gene and a gene set:
I used the code from #3521:

geneSet_list <- list(geneSet)
object <- AddModuleScore(object = object, features = geneSet_list , name = "geneSet")

After adding the score to the object metadata, I do a FeaturePlot with the specific gene and the gene set to check their co-expression:
FeaturePlot(object, c('specificGene', 'geneSet1'), blend = TRUE, label = T, pt.size = 2)

Some cells seem to express both but it's hard to tell with the shades of colors, even playing with the blend threshold and choosing different palette.
To overcome the view issue, I did:
DimPlot(object, cells.highlight= WhichCells(object,expression = specificGene>0 & geneSet1>0))

However after reading #522, I realized that my way of doing this might be wrong since I am using the gene set score as if it was gene count .
Since the score of gene set is unitless and reflects the likeliness of the gene set being more expressed that expected, it cannot give any information on the level of expression of the gene set, can it?

If so, would it be an other way of assessing the level of co-expression of the specific gene and the gene set?

The text was updated successfully, but these errors were encountered:

mhkowalski · 2022-01-28T16:41:16Z

Thanks for your question. Yes, the module score represents relative expression. If gene A is highly expressed across all cells, the module scores assess whether a given cell expresses gene A more often than other highly expressed genes. Therefore, if the module score is high, it indicates these genes are expressed. If the module score is low, it might mean that these genes are expressed but not more so than in other cells.

I would recommend looking at co-expression of the specific gene and each member of the gene set. To quantify this, you could calculate the correlation of expression of the specific gene with each member of the gene set.

gt7901b · 2022-02-01T23:00:26Z

@mhkowalski will you please elaborate on how to calculate the correlation of expression of the specific gene with each member of the gene set and check co-expression of the specific gene and each member of the gene set? thanks

denvercal1234GitHub · 2022-03-18T23:13:22Z

@gt7901b @mhkowalski Did you guys have any update on how to calculate the correlation of expression for a given gene? Thanks in advance for your help!

RajneeshSrivastava · 2023-11-11T16:07:09Z

Hi @mhkowalski and @samuel-marsh,

I am trying to understand the calculations powered in the "AddModule Score" function of seurat. Luckily, I got a post that explains the fundamentals very well.

To compute the score, I used the script for 4 known signature genes (of my cell type of interest - labeling "cell_typeA" to trace back) as suggested (#3521). When I use the snippet, it adds on to 4 score (i guess for individual gene) columns appended to metadata viz. cell_typeA_score1, cell_typeA_score2, cell_typeA_score3, cell_typeA_score4. (tested for 100 genes, gave 100 score columns suffixed "1"..."100")

Please help me understand on adding "1" (for instance - cell_typeA_score"1") and its significance over other (Why not 2 /3 / 4?)
Are these scores computed gene-wise (and appended)? If, so how AddModuleScore is computing the cumulative enrichment?

I am sure, I might be missing something in the function, please clarify.

Again thank you so much, Seurat team and kudos to seurat5!

With regards,
Rajneesh

samuel-marsh · 2023-11-11T19:19:46Z

Hi Rajneesh,

So you need the gene set to be of class list in order to be scored together. Otherwise it will create one score for each gene. The numeral is simply that. It starts at 1 and goes up. So if you have list of two vectors they will be suffixed 1 and 2.

To score a set of genes you run (let’s call them T cell genes) you would run something like the following:

t_genes <- c(“gene1”, “gene2”, “gene3”, etc)

obj <- AddModuleScore(pbmc, features = list(t_genes), name="T_Score")

Best,
Sam

YU-LIN96 · 2023-12-03T01:51:02Z

Otherwise it will create one score for each gene. The numeral is simply that. It starts at 1 and goes up. So if you have list of two vectors they will be suffixed 1 and 2.

This is not quite right
Given my code
"""

fear_condition_genes <- mapIds(org.Mm.eg.db,
      keys = "GO:0042596",
      keytype="GO",
      column="SYMBOL", multiVals = "list")

fear_condition_genes <- fear_condition_genes$`GO:0042596`

merge.obj <- AddModuleScore(merge.obj, 
               features = list(fear_condition_genes),
               name = "fear_response")

"""
and
"""

fear_condition_genes
[1] "Adra2a" "Adrb1" "Bdnf" "Cacna1e" "Crhr1" "Dbh" "Drd4" "Ext1" "Grin2b"
class(fear_condition_genes)
[1] "character"
"""
I do not see how this is wrong. I am currently using Seurat4.
And it is giving me fear_response1-9 still

samuel-marsh · 2023-12-04T13:40:29Z

Hi @YU-LIN96,

So I don't see anything that you are doing wrong. Your variable fear_condition_genes won't be of class list because you didn't change the format before using in function. You can either make variable a list before or during AddModuleScore.

gene_list <- c("gene1", "gene2", etc)

Obj <- AddModuleScore(Obj, features = list(gene_list))

# OR

gene_list <- list(c("gene1", "gene2", etc))

Obj <- AddModuleScore(Obj, features = gene_list)

The important thing is to check meta.data slot after adding score and make sure it only added one column

grep(pattern = "fear_condition_genes", x = merge.obj@meta.data, value = TRUE)

Best,
Sam

mhkowalski closed this as completed Jan 28, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gene set expression analysis using addModuleScore() #5549

gene set expression analysis using addModuleScore() #5549

andynkili commented Jan 26, 2022

mhkowalski commented Jan 28, 2022

gt7901b commented Feb 1, 2022

denvercal1234GitHub commented Mar 18, 2022

RajneeshSrivastava commented Nov 11, 2023

samuel-marsh commented Nov 11, 2023

YU-LIN96 commented Dec 3, 2023 •

edited

Loading

samuel-marsh commented Dec 4, 2023

gene set expression analysis using addModuleScore() #5549

gene set expression analysis using addModuleScore() #5549

Comments

andynkili commented Jan 26, 2022

mhkowalski commented Jan 28, 2022

gt7901b commented Feb 1, 2022

denvercal1234GitHub commented Mar 18, 2022

RajneeshSrivastava commented Nov 11, 2023

samuel-marsh commented Nov 11, 2023

YU-LIN96 commented Dec 3, 2023 • edited Loading

samuel-marsh commented Dec 4, 2023

YU-LIN96 commented Dec 3, 2023 •

edited

Loading