How to add average expression scale to dotplot of merged gene list (plotted onto single dot plot) #4544

Closed
ksaunders73 opened this issue May 27, 2021 · 2 comments

ksaunders73 commented May 27, 2021

Hello!

From #3521 I learned how to plot a list of genes onto one dot plot:

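# Gene set to score; AddModuleScore expects a list of character vectors
nklist <- list(c("TFF1", "MB", "ANKRD30B",
                 "LINC00173", "DSCAM-AS1", "IGHG1", "SERPINA5"))
# The score is stored in the metadata column "NK_List1" (name plus list index)
sobj <- AddModuleScore(object = sobj, features = nklist, name = "NK_List")
DotPlot(object = sobj, features = "NK_List1")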

[image: dot plot of the NK_List1 module score]

However, unlike the output of the usual DotPlot call, this dot plot does not include the average expression scale:

DotPlot(object = sobj, features = c("TFF1", "MB", "ANKRD30B", "LINC00173", "DSCAM-AS1", "IGHG1", "SERPINA5")) + RotatedAxis()

[image: dot plot of the individual genes]

Is there a way to add the average expression scale onto the merged dot plot also?

Hopefully this makes sense, and thanks for reading!

@samuel-marsh
Collaborator

Hi Again,

I'm not a member of the dev team, but hopefully I can be helpful. In my opinion, DotPlot is probably not the best tool for visualizing module scores. Two issues with plotting module scores via DotPlot are of particular note.

In a basic sense, as described in the manual, AddModuleScore does the following:
Calculate the average expression levels of each program (cluster) on single cell level, subtracted by the aggregated expression of control feature sets. All analyzed features are binned based on averaged expression, and the control features are randomly selected from each bin.

So each cell gets a score (in a very basic sense, positive scores are enriched relative to the randomly selected control gene set, and vice versa for negative scores). Because the code DotPlot uses to determine % expressed has a threshold of 0, it counts as "expressing" only those cells with a positive module score. However, a positive module score doesn't necessarily mean the enrichment is statistically significant, so % expressed is a bit deceiving.
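To make the threshold point concrete, here is a minimal base R sketch of what a "% expressed" value amounts to when it is computed on a module score with a cutoff of 0. The scores and the cluster labels "A" and "B" are simulated and purely hypothetical; this is an illustration of the idea, not DotPlot's actual code.

```r
set.seed(1)

# Simulated module scores for two hypothetical clusters. In a real analysis
# these would come from the object, e.g. sobj$NK_List1 and Idents(sobj).
scores  <- c(rnorm(200, mean = 0.02, sd = 0.1),   # cluster A: scores centred near 0
             rnorm(200, mean = 0.30, sd = 0.1))   # cluster B: clearly enriched
cluster <- factor(rep(c("A", "B"), each = 200))

# Percentage of cells per cluster whose score is above 0
pct_positive <- tapply(scores, cluster, function(x) mean(x > 0) * 100)
print(round(pct_positive, 1))

# The per-cluster distribution is usually more informative than the percentage
print(tapply(scores, cluster, summary))
```

In this setup a cluster whose scores are centred near zero still gets a sizeable share of "expressing" cells, which is why the percentage alone can be misleading.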

Due to the nature of module scores, I think DotPlot's average expression metric also becomes problematic (it is normally based on scaled expression data anyway, which is not the case for a module score). Personally, I think the distribution of scores is often more relevant.

Overall, I think the more relevant plotting functions, other than simply overlaying the score on a tSNE/UMAP, are VlnPlot and/or FeatureScatter. Here is an example from my own data, using VlnPlot with a score based on a core signature of microglial genes:
[image: violin plot of the microglia module score]

Or the same microglia score plotted against an artificial score using FeatureScatter (see Figure 1H)
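As a rough sketch of how these two plots could be produced for the score from the original post, assuming sobj already carries the metadata column "NK_List1" created by AddModuleScore: the helper name plot_module_score and the second score column "Other_List1" are hypothetical and only there for illustration.

```r
# Hypothetical helper wrapping the two Seurat plotting calls mentioned above.
# score_col: metadata column created by AddModuleScore (e.g. "NK_List1")
# other_col: optional second score column to plot against (hypothetical name)
plot_module_score <- function(sobj, score_col = "NK_List1", other_col = NULL) {
  plots <- list(
    # Distribution of the module score per identity class
    violin = Seurat::VlnPlot(object = sobj, features = score_col)
  )
  if (!is.null(other_col)) {
    # One score against another, one point per cell
    plots$scatter <- Seurat::FeatureScatter(object = sobj,
                                            feature1 = score_col,
                                            feature2 = other_col)
  }
  plots
}

# Usage, assuming both scores were added beforehand with AddModuleScore:
# p <- plot_module_score(sobj, "NK_List1", other_col = "Other_List1")
# p$violin
# p$scatter
```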

You can also run statistics independently on the module score (see #3719) to compare different populations or experimental conditions. This can easily be done with the wilcox.test function in base R.
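A minimal sketch of such a comparison with wilcox.test, using simulated scores so that it runs on its own. The group labels "control" and "treated" are hypothetical; with a real object the data frame would be built from the metadata instead, as noted in the comment.

```r
set.seed(42)

# Simulated module scores for two hypothetical groups. With a Seurat object,
# something like data.frame(score = sobj$NK_List1, group = Idents(sobj))
# would take its place, subset to the two groups being compared.
df <- data.frame(
  score = c(rnorm(150, mean = 0.05, sd = 0.1),
            rnorm(150, mean = 0.20, sd = 0.1)),
  group = factor(rep(c("control", "treated"), each = 150))
)

# Two-sample Wilcoxon rank-sum test on the per-cell scores
res <- wilcox.test(score ~ group, data = df)
print(res)
```

The formula interface needs exactly two groups; for more than two, pairwise comparisons with a multiple-testing correction would be one option.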

Best,
Sam

@ksaunders73
Author

Hello again @samuel-marsh !
Thanks so much again for your help in getting me to a better understanding of all of these!
