In [None]:
########################################################################
# Author    : A. Alsema
# Date      : Nov 2023
# Dataset   : Visium Spatial Transcriptomics for MS lesions, 15 slices with WM 
# Purpose   : calculate a Seurat module score for a gene list
# Output    : csv file with module score per spot: "celltype-modulescores.csv"

# Required inputs:
# -  Seurat-compatible object with the same barcodes as the monocle3 cds: "TEMP_seu_filt_as_monocle.rds"
# -  gene lists for module scores: "genesets/for_monocle_branches_v6.csv"
## Note: we tried to make the gene lists as specific as possible for the indicated category.
## This means the file generally contains short lists of 5-15 genes per category.
########################################################################

In [1]:
rm(list = ls())

library(monocle3)
library(Seurat)
library(dplyr)



In [26]:
# load data
# Before loading, I checked that the barcodes are exactly the same as in 3.monocle_traj_WM_allgenes.rds
seu <- readRDS("RData/monocle3/TEMP_seu_filt_as_monocle.rds")
seu

An object of class Seurat 
37538 features across 55084 samples within 2 assays 
Active assay: Spatial (33538 features, 0 variable features)
 3 layers present: counts, data, scale.data
 1 other assay present: integrated
 2 dimensional reductions calculated: pca, umap
 15 images present: section1, section2, section3, section4, section5, section6, section7, section8, section9, section10, section11, section12, section13, section14, section15

In [40]:
# load a list of gene modules (the code below uses the columns "module" and "gene")
gene_module_df <- read.csv("genesets/for_monocle_branches_v6.csv", header = TRUE)
genesets <- unique(gene_module_df$module)

genesets
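
With lists as short as 5-15 genes, a few symbols missing from the object can noticeably change a module score. The sketch below shows one way to list missing genes per module before scoring. It uses toy stand-ins: `toy_modules`, `toy_features`, the module names and the placeholder symbol `NOTAGENE` are assumptions for illustration; in the notebook the inputs would be `gene_module_df` and `rownames(seu)`.

```r
# Toy stand-ins (assumptions): the real inputs are gene_module_df and rownames(seu)
toy_modules <- data.frame(
    module = c("astro", "astro", "astro", "oligo", "oligo"),
    gene   = c("GFAP", "AQP4", "NOTAGENE", "MBP", "PLP1"),
    stringsAsFactors = FALSE
)
toy_features <- c("GFAP", "AQP4", "MBP", "PLP1", "MOG")

# Split genes by module, then keep the ones absent from the feature names
genes_by_module <- split(toy_modules$gene, toy_modules$module)
missing_by_module <- lapply(genes_by_module, setdiff, y = toy_features)

# Number of usable genes per module after dropping the missing ones
n_present <- lengths(genes_by_module) - lengths(missing_by_module)
print(missing_by_module)
print(n_present)
```

A module with very few usable genes may be worth revisiting before its score is interpreted.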

# Create module scores

In [None]:
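meta.vars <- ncol(seu@meta.data) # number of metadata columns before scoring

for (geneset in genesets) {
    print(geneset)
    # genes of the current module, wrapped in a list as AddModuleScore expects
    current_geneset <- gene_module_df %>% filter(module == geneset) %>% pull(gene) %>% list()
    print(current_geneset)
    # adds one metadata column per module, named after the module with "1" appended
    seu <- AddModuleScore(seu, features = current_geneset, name = geneset)
}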
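
`AddModuleScore` appends a number to `name`, so a module called `astro` ends up in a metadata column called `astro1`. Selecting the score columns by name rather than by position is one possible alternative to the index-based slice used in this notebook. The sketch below assumes module names that are already valid column names; `toy_meta`, `toy_genesets` and `sample_id` are hypothetical stand-ins for `seu@meta.data` and `genesets`.

```r
# Toy metadata (assumption): two original columns plus two score columns,
# named the way AddModuleScore names them (module name + "1")
toy_meta <- data.frame(
    nCount_Spatial = c(100, 200),
    sample_id      = c("s1", "s2"),
    astro1         = c(0.12, -0.05),
    oligo1         = c(-0.30, 0.41)
)
toy_genesets <- c("astro", "oligo")

# Columns expected from the loop: one per module, with "1" appended
score_cols <- paste0(toy_genesets, "1")
stopifnot(all(score_cols %in% colnames(toy_meta)))

# Select by name, then drop the trailing "1" for readability
scores <- toy_meta[, score_cols, drop = FALSE]
colnames(scores) <- sub("1$", "", colnames(scores))
print(colnames(scores))
```

Selecting by name keeps working if other metadata columns are added between scoring and export.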

In [34]:
start_slice <- meta.vars + 1
end_slice <- ncol(seu@meta.data)

# create df for plotting

In [35]:
data_plot = as.data.frame(seu[["umap"]]@cell.embeddings)
print("this should print TRUE:")
identical(row.names(data_plot), row.names(seu@meta.data)) # sanity check

[1] "this should print TRUE:"
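
A printed TRUE or FALSE is easy to overlook when the notebook is run top to bottom. As a sketch, the same check can be wrapped in `stopifnot()` so that a barcode mismatch stops execution. The tables and barcodes below are made-up stand-ins for `data_plot` and `seu@meta.data`.

```r
# Toy stand-ins (assumptions) for the UMAP embedding and the metadata table
toy_umap <- data.frame(UMAP_1 = c(0.1, 0.2), UMAP_2 = c(1.5, -0.7),
                       row.names = c("AAAC-1_1", "AAAG-1_1"))
toy_meta <- data.frame(astro1 = c(0.12, -0.05),
                       row.names = c("AAAC-1_1", "AAAG-1_1"))

# Halts with an error unless barcodes match in content and order
stopifnot(identical(rownames(toy_umap), rownames(toy_meta)))

# A reordered table fails the same comparison
shuffled <- toy_meta[rev(rownames(toy_meta)), , drop = FALSE]
order_matters <- !identical(rownames(toy_umap), rownames(shuffled))
print(order_matters)
```

Because `cbind()` joins rows by position and not by barcode, the order check matters as much as the content check.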


In [None]:
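# append the module score columns (added after the original metadata) to the UMAP coordinates
data_plot <- cbind(data_plot, seu@meta.data[, start_slice:end_slice, drop = FALSE])
head(data_plot, 4)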

In [None]:
write.csv(data_plot, "RData/monocle3/celltype-modulescores.csv")
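
`write.csv()` stores the row names (here the spot barcodes) as an unnamed first column. The sketch below shows one way to read such a file back with the barcodes restored as row names. It uses a temporary file and a toy table, so `toy_plot` and its barcodes are assumptions, not the real output.

```r
# Toy table (assumption) with barcodes as row names, like data_plot
toy_plot <- data.frame(UMAP_1 = c(0.1, 0.2), astro1 = c(0.12, -0.05),
                       row.names = c("AAAC-1_1", "AAAG-1_1"))

# Write to a temporary file; write.csv stores row names in the first column
tmp <- tempfile(fileext = ".csv")
write.csv(toy_plot, tmp)

# row.names = 1 turns that first column back into row names
reloaded <- read.csv(tmp, row.names = 1)
print(all.equal(toy_plot, reloaded))
```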

In [38]:
sessionInfo()

R version 4.2.0 (2022-04-22)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS

Matrix products: default
BLAS/LAPACK: /data/bcn/p283607/anaconda3/envs/R4.2/lib/libopenblasp-r0.3.21.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] dplyr_1.1.4                 SeuratObject_5.0.1         
 [3] Seurat_4.3.0                monocle3_1.3.1             
 [5] SingleCellExperiment_1.20.1 SummarizedExperiment_1.28.0
 [7] GenomicRanges_1.50.2        GenomeInfoDb_1.34.9        
 [9] IRanges_2.32.0              S4Vectors_0