Heatmap appears to mix up rank of cluster averages for a particular gene #73

Open
TeresaSteyn opened this issue Jul 24, 2023 · 1 comment

@TeresaSteyn

Hi ShinyCell team,
Thank you so much for this amazing software! It has been extremely useful.

I recently noticed that when I plot the expression of one gene (Slc12a2) across three clusters in a heatmap in the ShinyCell app, the colouring of the cluster averages differs from the plot I generate myself with dittoHeatmap from the dittoSeq package. As the attached pictures show, the ShinyCell heatmap indicates that the Astrocyte cluster has the highest expression of Slc12a2, followed by Inhibitory neurons and then Excitatory neurons. The dittoHeatmap plot instead indicates that Inhibitory neurons have the highest expression of Slc12a2, followed by Astrocytes and then Excitatory neurons. All other genes appear to show the same expression pattern in both heatmaps.

[Image: ShinyCell heatmap]
[Image: dittoHeatmap]

I checked the average expression levels of the Slc12a2 gene across the three different clusters as follows:

library(dittoSeq)
library(Seurat)
library(stringr)
library(ComplexHeatmap)
library(circlize)
library(dplyr)
library(ggplot2)

seurat_object <- readRDS("/scratch/styter001/snRNAseq/Mouse/Merged/mt_genes_removed/Annotation/Multi/results/seu_Allen_user_agreement.rds")

seurat_object_editeda2 <- subset(x = seurat_object, subset = (cluster_id == c("Astrocytes", "Excitatory neurons", "Inhibitory neurons")))

DefaultAssay(seurat_object_editeda2) <- 'RNA'

sample.averages <- AverageExpression(seurat_object_editeda2, assay = 'RNA', slot = 'data', return.seurat=TRUE, group.by = c("cluster_id"))

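# Find the average expression of Slc12a2 across the three clusters
# (sample.averages is the Seurat object returned by AverageExpression above):
sample.averages@assays$RNA@data[grep("Slc12a2", rownames(sample.averages@assays$RNA@data)), ]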

        Astrocytes Excitatory neurons Inhibitory neurons
         0.4555909          0.3919602          0.4713912
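
One detail worth checking in the code above: the `subset()` call compares `cluster_id` to a three-element vector with `==`. In R, `==` recycles the shorter vector element by element, so it is not a membership test, and `%in%` is the usual way to keep every cell from the listed clusters. The sketch below uses made-up labels (the `Microglia` cluster and the cell counts are hypothetical) to show the difference. Whether this affected the averages reported here cannot be told from the post alone.

```r
# Hypothetical per-cell cluster labels, not taken from the real dataset
cluster_id <- c("Astrocytes", "Astrocytes", "Astrocytes",
                "Excitatory neurons", "Excitatory neurons", "Excitatory neurons",
                "Inhibitory neurons", "Inhibitory neurons", "Inhibitory neurons",
                "Microglia", "Microglia", "Microglia")
wanted <- c("Astrocytes", "Excitatory neurons", "Inhibitory neurons")

keep_eq <- cluster_id == wanted    # recycles `wanted` along the label vector
keep_in <- cluster_id %in% wanted  # tests membership for each cell

sum(keep_eq)  # number of cells kept by the recycled comparison
sum(keep_in)  # number of cells kept by the membership test
```

With these toy labels the `==` comparison keeps only a third of the cells that `%in%` keeps, so averages computed from the two subsets can differ.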

For both the ShinyCell app and the dittoHeatmap plot, I used the RNA assay and the data slot, and I scaled by row in both cases. I am therefore unsure why the ShinyCell app shows the highest average expression in the Astrocyte cluster. Do you have any idea where the discrepancy could have come from?

Regards,
Teresa

@jfouyang
Collaborator

AverageExpression in Seurat first exponentiates the log-normalized expression values and then averages them, which is what the code above does.
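
To illustrate why the order of these operations matters, here is a minimal sketch with made-up values (the clusters `A` and `B` and all numbers are hypothetical). It compares averaging directly on the log scale with exponentiating, averaging, and taking `log1p` again. It only shows that the two orders can rank clusters differently, and says nothing about which order either tool uses for a given plot.

```r
# Hypothetical log-normalized (log1p-scale) expression of one gene in two clusters
cluster_a <- c(0, 0, 3)        # mostly zeros plus one high-expressing cell
cluster_b <- c(1.2, 1.2, 1.2)  # uniform moderate expression

# Average taken directly on the log scale
mean_log <- c(A = mean(cluster_a), B = mean(cluster_b))

# Exponentiate, average, then log1p again
mean_exp <- c(A = log1p(mean(expm1(cluster_a))),
              B = log1p(mean(expm1(cluster_b))))

names(which.max(mean_log))  # top cluster when averaging on the log scale
names(which.max(mean_exp))  # top cluster when exponentiating before averaging
```

In this toy case the top cluster switches from `B` to `A`, because exponentiating gives the single high-expressing cell much more weight in the average.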

In ShinyCell, the averaged expression is then log1p-transformed before being scaled. I am not sure whether this extra step caused the difference.
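
As a quick check on the extra step: `log1p` and row scaling are both strictly increasing transformations, so on their own they should not reorder one gene's cluster averages. The sketch below applies both to the three averages reported in this issue, assuming that "scaled by row" means a z-score (centre, then divide by the standard deviation). That is an assumption about both tools, not something confirmed in this thread.

```r
# Cluster averages for Slc12a2 as reported earlier in this issue
avg <- c("Astrocytes"         = 0.4555909,
         "Excitatory neurons" = 0.3919602,
         "Inhibitory neurons" = 0.4713912)

logged <- log1p(avg)                 # the extra log1p step
scaled <- as.numeric(scale(logged))  # z-score across the three clusters
names(scaled) <- names(avg)

# Clusters ordered from highest to lowest, before and after the transformations
names(sort(avg, decreasing = TRUE))
names(sort(scaled, decreasing = TRUE))
```

Both orderings put Inhibitory neurons first, then Astrocytes, then Excitatory neurons, so a different ranking in the app would have to come from different input averages rather than from these two steps.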
