# Standard Seurat Processing for Mol Bio sequencing

## Importing commonly used libraries

In [1]:
library(dplyr)
library(Seurat)
library(patchwork)
library(H5weaver)
library(hise)
library(tidyverse)
library(SeuratObject)


Attaching package: ‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union


Loading required package: SeuratObject

Loading required package: sp


Attaching package: ‘SeuratObject’


The following objects are masked from ‘package:base’:

    intersect, t


Loading required package: data.table


Attaching package: ‘data.table’


The following objects are masked from ‘package:dplyr’:

    between, first, last


Loading required package: Matrix

Loading required package: rhdf5


Attaching package: ‘H5weaver’


The following objects are masked from ‘package:rhdf5’:

    h5dump, h5ls


“running command 'timedatectl' had status 1”
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ forcats   1.0.0     ✔ readr     2.1.5
✔ ggplot2   3.4.3     ✔ stringr

## Creating Seurat objects from Cell Ranger h5 outputs

### Locating the h5 files

In [2]:
# Recursively collect the full paths of all filtered feature-barcode matrices
h5s <- list.files(
    path = '/home/jupyter/CS15_WHBL/CWB_Paper/01_Final_Data/02_Data/2E_exp823/Exp00823_w2',
    pattern = 'filtered_feature_bc_matrix\\.h5$',
    full.names = TRUE,
    recursive = TRUE
)


### Creating Seurat Objects

In [3]:
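fixed_so <- lapply(h5s, function(x) {
    # Sample name: the directory between 'per_sample_outs/' and '/count/'
    pro <- strsplit(strsplit(x, '/per_sample_outs/')[[1]][2], '/count/')[[1]][1]
    # Experiment name: the last directory component before '/outs/'
    exp <- basename(strsplit(x, '/outs/')[[1]][1])

    # Project label of the form <experiment>_<sample>
    pro <- paste(exp, pro, sep = '_')

    # Read the count matrix and wrap it in a Seurat object
    mtx <- Read10X_h5(x)
    so <- CreateSeuratObject(mtx, project = pro)
    return(so)
})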
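The path parsing in the cell above can be checked in isolation before any matrices are loaded. The following is a minimal sketch, assuming a Cell Ranger `multi`-style directory layout. The example path is hypothetical (only `Exp00823_w2` appears in this notebook; `/data` and `sampleA` are made up), and `label_from_path` is an illustrative helper name, not part of Seurat.

```r
# Hypothetical helper: derive "<experiment>_<sample>" from an h5 file path.
label_from_path <- function(x) {
    # Sample: directory between 'per_sample_outs/' and '/count/'
    sample <- strsplit(
        strsplit(x, "/per_sample_outs/", fixed = TRUE)[[1]][2],
        "/count/", fixed = TRUE
    )[[1]][1]
    # Experiment: last directory component before '/outs/'
    experiment <- basename(strsplit(x, "/outs/", fixed = TRUE)[[1]][1])
    paste(experiment, sample, sep = "_")
}

# Made-up path, for illustration only
example_path <- paste0(
    "/data/Exp00823_w2/outs/per_sample_outs/sampleA/count/",
    "sample_filtered_feature_bc_matrix.h5"
)
label <- label_from_path(example_path)
print(label)
```

If a path does not contain `/per_sample_outs/`, the sample part comes back as `NA`, so printing the labels for all files is a cheap sanity check before creating the objects.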


In [4]:
# Merge all per-sample objects into one, then join the per-sample layers
fully <- Reduce(merge, fixed_so) %>% JoinLayers()


In [5]:
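# Percentage of each cell's counts from mitochondrial genes (symbols starting with "MT-")
fully[["percent.mt"]] <- PercentageFeatureSet(fully, pattern = "^MT-")
# Keep only cells with less than 5% mitochondrial counts
fully <- subset(fully, subset = percent.mt < 5)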
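To make the mitochondrial filter concrete, here is a minimal base R sketch of the quantity that `PercentageFeatureSet()` reports: for each cell, the counts from the matching genes divided by that cell's total counts, times 100. The count matrix and gene selection are made up for illustration and do not come from this dataset.

```r
# Toy count matrix: rows are genes, columns are cells (values are made up)
counts <- matrix(
    c( 2,  0,
       3,  1,
       5,  9,
      10, 90),
    nrow = 4, byrow = TRUE,
    dimnames = list(c("MT-CO1", "MT-ND1", "CD3E", "LYZ"), c("cell1", "cell2"))
)

# Genes whose symbols start with "MT-" (human mitochondrial genes)
mt_genes <- grepl("^MT-", rownames(counts))

# Percentage of each cell's counts that come from mitochondrial genes
percent_mt <- 100 * colSums(counts[mt_genes, , drop = FALSE]) / colSums(counts)

# Apply the same 5% threshold used in the notebook
keep <- percent_mt < 5
print(percent_mt)
print(names(keep)[keep])
```

The `^MT-` pattern matches human gene symbols. Mouse mitochondrial genes are lowercase (`mt-`), so the pattern would need to change for mouse data.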

### Normalizing, running PCA and UMAP, and clustering

In [6]:
fully <- NormalizeData(fully) %>% 
    FindVariableFeatures() %>% 
    ScaleData() %>% 
    RunPCA() %>% 
    RunUMAP(dims = 1:20) %>% 
    FindNeighbors(dims = 1:20) %>% 
    FindClusters(resolution = 0.5)


Normalizing layer: counts

Finding variable features for layer counts

Centering and scaling data matrix

PC_ 1 
Positive:  TRBC2, LTB, IL32, LEF1, TRBC1, CD2, CCR7, CD69, CD247, TRAT1 
	   IKZF3, PIM2, MAL, PCED1B, BCL2, THEMIS, ISG20, NELL2, TRABD2A, ITGA6 
	   CD27, OBSCN, CCND2, PYHIN1, TRIB2, MYC, SAMD3, ITM2A, ITGB7, RNF157 
Negative:  LYZ, S100A9, SPI1, SERPINA1, MPEG1, NCF2, MNDA, IFI30, CD68, CYBB 
	   S100A8, CST3, VCAN, CSF3R, TYMP, MS4A6A, CLEC7A, KLF4, CLEC12A, FGL2 
	   EMILIN2, PLXNB2, HCK, GRN, CEBPD, CD14, ZNF385A, LILRB2, TIMP2, CTSS 
PC_ 2 
Positive:  CD14, LEF1, ANXA1, S100A8, VCAN, S100A9, HK3, SLC11A1, IL17RA, IRS2 
	   IL32, MAL, TRBC1, CFD, TRAT1, CSF3R, TRABD2A, S100A6, UBXN11, NELL2 
	   CEBPD, SOCS3, CRISPLD2, RXRA, FCAR, LYZ, RGCC, S100A12, CEBPB, LRP1 
Negative:  NIBAN3, IGHM, COBLL1, BCL11A, BLNK, MS4A1, IGKC, TCF4, CD79A, BANK1 
	   FCRL2, RUBCNL, PLD4, FCRL1, CD22, PAX5, JCHAIN, POU2AF1, IGHD, TSPAN13 
	   BLK, AFF3, SETBP1, SPIB, CD79B, FCRLA, TCL1A, OS

Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 4264
Number of edges: 158406

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9171
Number of communities: 15
Elapsed time: 0 seconds
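The first step of the pipeline above, `NormalizeData()` with its default settings, applies log-normalization: each cell's counts are divided by that cell's total, multiplied by a scale factor of 10,000, and transformed with `log1p`. The sketch below reproduces that arithmetic in base R on a made-up matrix. It illustrates the default method only and is not a substitute for the Seurat call.

```r
# Toy count matrix: rows are genes, columns are cells (values are made up)
counts <- matrix(
    c(1, 0,
      3, 5,
      6, 5),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2"))
)

scale_factor <- 1e4

# Divide each column (cell) by its total, scale, then apply log(1 + x)
norm <- log1p(sweep(counts, 2, colSums(counts), "/") * scale_factor)
print(round(norm, 3))
```

Because `log1p(0)` is 0, genes with no counts in a cell stay at zero, which is what keeps the normalized matrix sparse.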


### Saving the Seurat object to an RDS file for later use


In [7]:
saveRDS(fully, '/home/jupyter/CS15_WHBL/CWB_Paper/01_Final_Data/02_Data/2E_exp823/Fig2E_Final.rds') # write the processed object to disk
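An object saved with `saveRDS()` is restored in a later session with `readRDS()`. A minimal round-trip sketch follows, using a plain list as a stand-in for the Seurat object and a temporary file instead of the path above. Both are assumptions for illustration.

```r
# Stand-in for the processed object (a plain list, not a Seurat object)
obj <- list(project = "example", clusters = c(0L, 1L, 1L))

# Write to a temporary file rather than the notebook's output path
rds_path <- tempfile(fileext = ".rds")
saveRDS(obj, rds_path)

# Read it back; the result is assigned to whatever name you choose
restored <- readRDS(rds_path)
print(identical(obj, restored))
```

Unlike `save()` and `load()`, `saveRDS()` stores a single object without its name, so the restored object can be bound to any variable.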

In [8]:
sessionInfo()

R version 4.1.3 (2022-03-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lubridate_1.9.3    forcats_1.0.0      stringr_1.5.1      purrr_1.0.2       
 [5] readr_2.1.5        tidyr_1.3.1        tibble_3.2.1       ggplot2_3.4.3     
 [9] tidyverse_2.0.0    hise_2.15.0        H5weaver_1.2.0     rhdf5_2.38.1      
[13] Matrix_1.6-4       data.table_1.14.2  patchwork_1.1.1    Seurat_