## Introduction

##### This example is for users who want to run their own Preprocess step and already have a Seurat object available for the Selection step. A Seurat object is loaded first, and then its features are screened. You can customize the feature-screening step and keep the features you are interested in. The data is then saved in the 10x format required by the Selection step in cellMarkerPipe.

## Methods

### Step 0: load libraries

In [1]:
# import libraries
library(Seurat)

library(dplyr)
library(patchwork)
library(ggplot2)
library(DropletUtils)
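
Before running the rest of the notebook, it can help to confirm that every package loaded above is installed. The following is a minimal sketch using only base R; the variable names `required`, `installed`, and `missing_pkgs` are illustrative and not part of cellMarkerPipe.

```r
# Packages loaded in Step 0 of this notebook.
required <- c("Seurat", "dplyr", "patchwork", "ggplot2", "DropletUtils")

# requireNamespace() returns FALSE (instead of raising an error)
# when a package is not installed.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- required[!installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are available.")
}
```

Note that DropletUtils is distributed through Bioconductor rather than CRAN, so it is typically installed with `BiocManager::install("DropletUtils")`.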

### Step 1: load the Seurat object

In [2]:
# The work directory (work.dir)
work.dir <- "/work/sabirianov/yinglu/test/cellmaker-rev"
# The data directory (data.dir) where you saved the Seurat object
data.dir <- "/work/sabirianov/yinglu/software/cellMarkerPipe/data/Zeisel/10x"

In [18]:
# Keep this output directory name so that the Selection step in cellMarkerPipe can find the data
Wdata.dir <- file.path(work.dir, "data")
dir.create(Wdata.dir)

“'/work/sabirianov/yinglu/test/cellmaker-rev/data' already exists”
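
The "already exists" warning above is harmless, but it can be silenced when the notebook is re-run. Here is a small sketch of that pattern, assuming a throwaway location under `tempdir()`; the `demo.*` names are hypothetical and stand in for `work.dir` and `Wdata.dir`.

```r
# Hypothetical stand-ins for work.dir and Wdata.dir, placed in a temp folder.
demo.work.dir <- file.path(tempdir(), "cellmarker-demo")
demo.data.dir <- file.path(demo.work.dir, "data")

# recursive = TRUE also creates missing parent directories;
# showWarnings = FALSE suppresses the warning when the directory exists.
dir.create(demo.data.dir, recursive = TRUE, showWarnings = FALSE)

# Calling it a second time is silent and leaves the directory in place.
dir.create(demo.data.dir, recursive = TRUE, showWarnings = FALSE)

dir.exists(demo.data.dir)
```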


In [3]:
zeisel <- readRDS(file = file.path(data.dir, "seurat.rds"))
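
If the path is wrong, `readRDS` fails with a fairly terse error, so checking for the file first can make problems easier to diagnose. The sketch below illustrates the check with a placeholder object written to a temporary folder; `demo.file` and `obj` are illustrative names, and the placeholder list is not a real Seurat object.

```r
# Write a placeholder object so the example is self-contained.
demo.file <- file.path(tempdir(), "seurat.rds")
saveRDS(list(note = "placeholder object"), file = demo.file)

# Fail early with a clear message if the file is missing.
if (!file.exists(demo.file)) {
  stop("Cannot find the RDS file: ", demo.file)
}

obj <- readRDS(demo.file)
class(obj)
```

In the real notebook, a follow-up check such as `inherits(zeisel, "Seurat")` would confirm that the loaded object has the expected class.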

In [4]:
zeisel

An object of class Seurat 
4998 features across 2989 samples within 1 assay 
Active assay: RNA (4998 features, 2000 variable features)

### Step 2: screen the features you want to use in downstream calculations

In [9]:
# n.variable is the number of highly variable genes to keep for the Selection step
n.variable <- 2000
print("Find High Variable Features...")
zeisel <- FindVariableFeatures(zeisel, selection.method = "vst", nfeatures = n.variable)
mat_keep_rows <- head(VariableFeatures(object = zeisel),n.variable)
print("The number of high variable genes used:")
print(length(mat_keep_rows))

[1] "Find High Variable Features..."
[1] "The number of high variable genes used:"
[1] 2000
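
To build intuition for what "highly variable" means, the toy sketch below ranks genes by their variance-to-mean ratio using base R only. This is a deliberately simplified stand-in, not Seurat's `vst` method, which fits a mean-variance trend and ranks genes by standardized variance. The data and all variable names here are made up for illustration.

```r
set.seed(1)

# Toy count matrix: 50 genes x 30 cells (hypothetical data, not the Zeisel set).
counts <- matrix(rpois(50 * 30, lambda = 2), nrow = 50,
                 dimnames = list(paste0("gene", 1:50), paste0("cell", 1:30)))

# Overwrite the first five genes with overdispersed counts.
counts[1:5, ] <- matrix(rnbinom(5 * 30, size = 0.5, mu = 10), nrow = 5)

gene_mean <- rowMeans(counts)
gene_var  <- apply(counts, 1, var)

# Variance-to-mean ratio; pmax() guards against division by zero.
dispersion <- gene_var / pmax(gene_mean, 1e-8)

# Keep the n_keep genes with the largest ratio.
n_keep <- 5
top_genes <- names(sort(dispersion, decreasing = TRUE))[seq_len(n_keep)]
print(top_genes)
```

Any custom screening rule can replace this ranking, as long as it ends with a character vector of gene names to keep, which is the role `mat_keep_rows` plays in this notebook.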


### Step 3: save the count matrix of the selected features

In [15]:
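# Extract the raw count matrix (in Seurat v5, `slot` is deprecated in favor of `layer`)
mat <- GetAssayData(object = zeisel, slot = 'counts')
# Keep only the selected genes; rows stay in the original matrix order
mat_subset <- mat[rownames(mat) %in% mat_keep_rows, ]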

print("The dimension of the selected matrix is:")
print(dim(mat_subset))
print("Save the counts of high variables")
start_time <- Sys.time()
# The file is tab-delimited despite its .csv extension; row names (genes) are kept
write.table(mat_subset, file = file.path(Wdata.dir, "counts_high_variable.csv"), sep = "\t")
end_time <- Sys.time()
print("time used:")
print(end_time - start_time)

print("Obtained High Variable Features!")

[1] "The dimension of the selected matrix is:"
[1] 2000 2989
[1] "Save the counts of high variables"
[1] "time used:"
Time difference of 4.658097 secs
[1] "Obtained High Variable Features!"
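
The introduction mentions saving data in 10x format, and DropletUtils is loaded in Step 0, but the cells shown here only write a tab-delimited table. As a hedged sketch, the subset could also be written as a 10x-style directory with `write10xCounts` from DropletUtils. Whether cellMarkerPipe expects this exact layout, and which `version` it needs, are assumptions; the toy matrix and the `out.dir` path are hypothetical.

```r
library(Matrix)
library(DropletUtils)

set.seed(42)

# Toy sparse count matrix: 20 genes x 6 cells (stand-in for mat_subset).
toy <- Matrix(matrix(rpois(20 * 6, lambda = 1), nrow = 20), sparse = TRUE)
rownames(toy) <- paste0("gene", 1:20)
colnames(toy) <- paste0("cell", 1:6)

# Hypothetical output folder; overwrite = TRUE replaces it if it already exists.
out.dir <- file.path(tempdir(), "10x_demo")

# version = "3" is an assumption; version = "2" writes uncompressed files instead.
write10xCounts(path = out.dir, x = toy,
               barcodes = colnames(toy),
               gene.id = rownames(toy),
               gene.symbol = rownames(toy),
               type = "sparse", version = "3", overwrite = TRUE)

list.files(out.dir)
```

In the notebook itself, `toy` would be replaced by `mat_subset` and `out.dir` by a folder under `Wdata.dir`.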
