# Dot plot of expression of cell type markers in the data

In [1]:
suppressWarnings({suppressMessages({
    library(Seurat) # the main framework for the scRNA-Seq analyses
    library(readxl)
    library(ggplot2)
    library(circlize)
    library(RColorBrewer)
})})

Loading the data.

In [2]:
hgsoc <- readRDS("HGSOC_CellHashing_CLUSTERED.RDS")

Loading the ovarian cancer markers for different cell types.

In [3]:
celltypes <- as.data.frame(read_xlsx(path = "ovarian_cancer_markers.xlsx"))
genemarkers <- na.omit(unique(unlist(sapply(celltypes$geneSymbolmore1, function(x) strsplit(x = x, split = ",")[[1]]))))

In [4]:
genemarkers
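
The marker-parsing step above can be illustrated on a toy table. This is a minimal sketch, assuming a hypothetical data frame `toy_markers` whose `geneSymbolmore1` column holds comma-separated gene symbols like the spreadsheet does; the `cellName` column and all values are made up. The `trimws()` call is an addition that is not in the original cell and only matters if symbols are separated by a comma followed by a space.

```r
# Hypothetical stand-in for the marker spreadsheet (not the real data).
toy_markers <- data.frame(
    cellName = c("Epithelial cell", "T cell", "Fibroblast"),
    geneSymbolmore1 = c("EPCAM,KRT8", "CD3E, CD3D", NA),
    stringsAsFactors = FALSE
)

# Split each comma-separated entry into individual symbols.
split_symbols <- unlist(strsplit(toy_markers$geneSymbolmore1, split = ","))

# Trim stray whitespace (an assumption about the input), then drop
# duplicates and missing values, as in the cell above.
toy_genes <- na.omit(unique(trimws(split_symbols)))
toy_genes <- as.character(toy_genes)

print(toy_genes)
```

If the real spreadsheet never contains spaces after commas, the trimmed and untrimmed results are identical.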

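Restricting the plot to the DMSO-treated cells while keeping the whole object, so that the normalization applied to the whole dataset is retained. To do this, we add a `treatment_ext` metadata column that combines the model and the treatment group; the DMSO groups are then selected from it at plotting time.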

In [5]:
hgsoc@meta.data$treatment_ext <- paste0(hgsoc@meta.data$model, "_", hgsoc@meta.data$Treatment_group)

In [6]:
head(hgsoc@meta.data$treatment_ext)
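
How the combined labels let the DMSO groups be picked out can be sketched on toy metadata. The column names `model` and `Treatment_group` follow the notebook; the data frame `toy_meta` and the treatment label "DrugA" are hypothetical and only serve as an illustration.

```r
# Toy metadata; "DrugA" is a made-up treatment label for illustration.
toy_meta <- data.frame(
    model = c("JHOS2", "JHOS2", "PDC2", "PDC3", "PDC3"),
    Treatment_group = c("DMSO", "DrugA", "DMSO", "DMSO", "DrugA"),
    stringsAsFactors = FALSE
)

# Same construction as in the cell above: "<model>_<treatment>".
toy_meta$treatment_ext <- paste0(toy_meta$model, "_", toy_meta$Treatment_group)

# Labels of the DMSO groups, one per model.
dmso_idents <- sort(unique(grep("_DMSO$", toy_meta$treatment_ext, value = TRUE)))

print(dmso_idents)
```

Labels of this form are what the `idents` argument of `DotPlot()` receives in the next cell, so the object itself does not have to be subset.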

In [7]:
pdf(file = "DotPlot_cellTypeMarkers_reviewersOnly.pdf", height = 4, width = 15)
DefaultAssay(hgsoc) <- "SCT"
Idents(hgsoc) <- "treatment_ext"
DotPlot(hgsoc, 
        idents = c("JHOS2_DMSO", "PDC3_DMSO", "PDC2_DMSO"), 
        features = na.omit(genemarkers), 
        scale = FALSE) +
ylab("") +
xlab("") + 
geom_point(aes(size = pct.exp), shape = 21, colour="black", stroke = 0.5) +
scale_color_gradient2(low = "blue", mid = "white", high = "red") +
guides(size = guide_legend(title = "% Expression", override.aes = list(shape = 21, colour = "black", fill = "white")))+ 
theme(axis.text.x = element_text(angle = 45, hjust = 1, face = "italic"))
dev.off()

Scale for colour is already present.
Adding another scale for colour, which will replace the existing scale.


Plotting the expression of _B2M_ and _CD298_ (gene symbol _ATP1B3_), which encode the two cell surface proteins used for Cell Hashing.

In [8]:
pdf(width = 55, height = 10, file = "VlnPlot_B2M_CD298.pdf")
DefaultAssay(hgsoc) <- "SCT"
Idents(hgsoc) <- "treatment_ext"
VlnPlot(hgsoc, ncol = 1,
        features = c("B2M", "ATP1B3") # CD298 is ATP1B3
       ) + ggtitle("CD298") & theme(plot.title = element_text(face = "italic")) & xlab("")
dev.off()

For the _B2M_ and _CD298_ violin plots, we need three figures of stacked violin plots, one per model (JHOS2, PDC3, and PDC2).

In [9]:
pdf(width = 20, height = 10, file = "JHOS2_B2M_CD298_VlnPlots.pdf")
DefaultAssay(hgsoc) <- "SCT"
Idents(hgsoc) <- "treatment_ext"
levels(hgsoc) <- sort(as.character(unique(Idents(hgsoc))))
VlnPlot(hgsoc, ncol = 1, pt.size = 0.01, 
        cols = colorRampPalette(brewer.pal(8, "Paired"))(length(unique(Idents(hgsoc)[grep(Idents(hgsoc), pattern = "JHOS2")]))),
        idents = sort(as.character(unique(Idents(hgsoc)[grep(Idents(hgsoc), pattern = "JHOS2")]))),
        features = c("B2M", "ATP1B3") # CD298 is ATP1B3
       ) + ggtitle("CD298") & theme(plot.title = element_text(face = "italic"), 
                                    axis.text.x = element_text(angle = 90)) & xlab("")
dev.off()

pdf(width = 20, height = 10, file = "PDC3_B2M_CD298_VlnPlots.pdf")
VlnPlot(hgsoc, ncol = 1, pt.size = 0.01, 
        cols = colorRampPalette(brewer.pal(8, "Paired"))(length(unique(Idents(hgsoc)[grep(Idents(hgsoc), pattern = "PDC3")]))),
        idents = sort(as.character(unique(Idents(hgsoc)[grep(Idents(hgsoc), pattern = "PDC3")]))),
        features = c("B2M", "ATP1B3") # CD298 is ATP1B3
       ) + ggtitle("CD298") & theme(plot.title = element_text(face = "italic"), 
                                    axis.text.x = element_text(angle = 90)) & xlab("")
dev.off()

pdf(width = 20, height = 10, file = "PDC2_B2M_CD298_VlnPlots.pdf")
VlnPlot(hgsoc, ncol = 1, pt.size = 0.01, 
        cols = colorRampPalette(brewer.pal(8, "Paired"))(length(unique(Idents(hgsoc)[grep(Idents(hgsoc), pattern = "PDC2")]))),
        idents = sort(as.character(unique(Idents(hgsoc)[grep(Idents(hgsoc), pattern = "PDC2")]))),
        features = c("B2M", "ATP1B3") # CD298 is ATP1B3
       ) + ggtitle("CD298") & theme(plot.title = element_text(face = "italic"), 
                                    axis.text.x = element_text(angle = 90)) & xlab("")
dev.off()
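
The three blocks above differ only in the model name, so the per-model inputs (output file, identity classes, colours) could be derived by one helper. The following is a minimal sketch, not the code used for the figures: `model_plot_inputs()` is a hypothetical helper, the identity labels with "DrugA" and "DrugB" are made up, and the three base colours stand in for `brewer.pal(8, "Paired")` so that the example runs with base R only.

```r
# Hypothetical helper: collect the identity classes of one model and
# build one interpolated colour per class.
model_plot_inputs <- function(all_idents, model, base_cols) {
    idents <- sort(unique(grep(model, as.character(all_idents), value = TRUE)))
    list(
        file = paste0(model, "_B2M_CD298_VlnPlots.pdf"),
        idents = idents,
        cols = grDevices::colorRampPalette(base_cols)(length(idents))
    )
}

# Toy identity classes; "DrugA" and "DrugB" are made-up treatment labels.
toy_idents <- c("JHOS2_DMSO", "JHOS2_DrugA", "JHOS2_DrugB",
                "PDC2_DMSO", "PDC2_DrugA", "PDC3_DMSO")

# Stand-in base colours; the notebook uses brewer.pal(8, "Paired").
base_cols <- c("#1F78B4", "#33A02C", "#E31A1C")

models <- c("JHOS2", "PDC3", "PDC2")
inputs <- lapply(models, function(m) model_plot_inputs(toy_idents, m, base_cols))
names(inputs) <- models

print(inputs$JHOS2$idents)
```

Each element of `inputs` could then feed `pdf(file = ...)` and the `idents` and `cols` arguments of `VlnPlot()` inside a loop. In that case the plot would need to be wrapped in `print()`, because ggplot objects are not printed automatically inside a loop.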

In [10]:
pdf(height = 40, width = 5, file = "DotPlot_B2M_CD298.pdf")
DefaultAssay(hgsoc) <- "SCT"
Idents(hgsoc) <- "treatment_ext"
DotPlot(hgsoc, 
        features = c("B2M", "ATP1B3"), #CD298 is ATP1B3
        scale = FALSE
       ) +
ylab("") +
xlab("") + 
scale_x_discrete(labels = c("B2M", "CD298")) +
geom_point(aes(size = pct.exp), shape = 21, colour="black", stroke = 0.5) +
scale_color_gradient2(low = "blue", mid = "white", high = "red") +
guides(size = guide_legend(title = "% Expression", override.aes = list(shape = 21, colour = "black", fill = "white")))+ 
theme(axis.text.x = element_text(angle = 45, hjust = 1, face = "italic"))
dev.off()

Scale for colour is already present.
Adding another scale for colour, which will replace the existing scale.


In [11]:
sessionInfo()

R version 4.2.2 (2022-10-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Rocky Linux 8.8 (Green Obsidian)

Matrix products: default
BLAS/LAPACK: /homedir01/adini22/.conda/envs/cellhashing_analyses/lib/libopenblasp-r0.3.21.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RColorBrewer_1.1-3 circlize_0.4.15    ggplot2_3.4.2      readxl_1.4.1      
[5] SeuratObject_4.1.3 Seurat_4.3.0.9002 

loaded via a namespace (and not attached):
  [1] ggbeeswarm_0.7.1       Rtsne_0.16             colorspace_2.1-0      
  [4] deldir_1.0-6           elli