### This code prepares RENIN objects using 20221221_324701_cells_wnn_for_RENIN_rerun_pca_lsi_harmony.RData

In [1]:
library(Seurat)
library(Signac)
library(SeuratWrappers)
library(RENIN)
library(harmony)

Attaching SeuratObject

Attaching sp


Attaching package: ‘Signac’


The following object is masked from ‘package:Seurat’:

    FoldChange


Loading required package: Rcpp



In [4]:
# step 1. load novaseq.wnn object
Sys.time()
print("step 1. load novaseq wnn object")
load("../../processed_data/wnn/20221221_324701_cells_wnn.RData")
Sys.time()

[1] "2023-06-28 14:45:09 CDT"

[1] "step 1. load novaseq wnn object"


[1] "2023-06-28 14:49:23 CDT"
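
The paired `Sys.time()` calls above show that loading takes about four minutes. As a minimal sketch, the timing can be wrapped in a small helper, and the `.RData` file can be loaded into its own environment so that its contents can be listed without overwriting anything in the workspace. The helper `timed()` and the toy objects are hypothetical and are not part of the original notebook.

```r
# Hypothetical helper: evaluate an expression and report the elapsed time.
timed <- function(label, expr) {
  start <- Sys.time()
  value <- force(expr)  # lazy evaluation: the expression runs here
  elapsed <- as.numeric(difftime(Sys.time(), start, units = "secs"))
  message(sprintf("%s: %.1f s", label, elapsed))
  invisible(list(value = value, seconds = elapsed))
}

# Toy stand-in for the large WNN .RData file (assumption, for illustration only).
demo_file <- tempfile(fileext = ".RData")
toy_counts <- matrix(1:6, nrow = 2)
toy_levels <- c("a", "b")
save(toy_counts, toy_levels, file = demo_file)

# Load into a separate environment and list what the file contained.
loaded <- new.env()
res <- timed("load demo file", load(demo_file, envir = loaded))
loaded_names <- sort(ls(loaded))
print(loaded_names)
```

Listing the loaded names this way also makes it easy to confirm which auxiliary variables (for example, levels and palettes) arrive together with the main object.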

In [5]:
novaseq.wnn

An object of class Seurat 
237522 features across 324701 samples within 2 assays 
Active assay: peaks (189184 features, 189184 variable features)
 1 other assay present: RNA
 6 dimensional reductions calculated: pca, harmony_RNA, lsi, harmony_peaks, umap.peaks, WNN.UMAP

### duplicate novaseq.wnn to novaseq.sub

In [6]:
novaseq.sub = novaseq.wnn

In [7]:
novaseq.sub

An object of class Seurat 
237522 features across 324701 samples within 2 assays 
Active assay: peaks (189184 features, 189184 variable features)
 1 other assay present: RNA
 6 dimensional reductions calculated: pca, harmony_RNA, lsi, harmony_peaks, umap.peaks, WNN.UMAP

In [8]:
table(novaseq.sub$renal_region_new)


      Cortex      Medulla      Papilla Renal Artery       Ureter 
      178521        76815        58891         3524         6950 
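
As a quick sanity check, the per-region counts printed above should add up to the 324701 cells reported for the object. The sketch below retypes the counts from the table by hand (in the notebook they would come from `table(novaseq.sub$renal_region_new)`) and also computes each region's share of the cells.

```r
# Counts copied from the table output above.
region_counts <- c(Cortex = 178521, Medulla = 76815, Papilla = 58891,
                   `Renal Artery` = 3524, Ureter = 6950)

# The regions should account for every cell in the object.
total_cells <- sum(region_counts)
stopifnot(total_cells == 324701)

# Percentage of cells per region, rounded to one decimal place.
region_pct <- round(100 * region_counts / total_cells, 1)
print(region_pct)
```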

### make processed_dir

In [9]:
processed_dir = file.path("..", "..", "processed_data", "RENIN")
processed_dir
dir.create(processed_dir, recursive = TRUE, showWarnings = FALSE)

### rename RNA assay to SCT assay

In [10]:
novaseq.sub = RenameAssays(object = novaseq.sub, RNA = 'SCT')
novaseq.sub

“Cannot add objects with duplicate keys (offending key: rna_) setting key to original value 'sct_'”


An object of class Seurat 
237522 features across 324701 samples within 2 assays 
Active assay: peaks (189184 features, 189184 variable features)
 1 other assay present: SCT
 6 dimensional reductions calculated: pca, harmony_RNA, lsi, harmony_peaks, umap.peaks, WNN.UMAP

### change default assay to peaks

In [11]:
DefaultAssay(novaseq.sub) <- "peaks"
novaseq.sub

An object of class Seurat 
237522 features across 324701 samples within 2 assays 
Active assay: peaks (189184 features, 189184 variable features)
 1 other assay present: SCT
 6 dimensional reductions calculated: pca, harmony_RNA, lsi, harmony_peaks, umap.peaks, WNN.UMAP

### change Idents to celltype5_rna

In [12]:
Idents(novaseq.sub) = novaseq.sub$celltype5_rna
head(Idents(novaseq.sub))

### create pseudocells for the background CRE adaptive elastic net (AEN) analysis

In [13]:
Sys.time()
mats <- prepare_pseudocell_matrix(novaseq.sub, 
                                  assay = c("peaks", "SCT"), 
                                  cells_per_partition = 100,
                                  reduction1 = "pca",
                                  reduction2 = "lsi")
Sys.time()

[1] "2023-06-28 14:55:46 CDT"

Loading required package: tidyverse

── Attaching packages ─────────────────────────────────────────────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.1     ✔ purrr   0.3.4
✔ tibble  3.1.8     ✔ dplyr   1.0.9
✔ tidyr   1.2.0     ✔ stringr 1.4.0
✔ readr   2.1.2     ✔ forcats 0.5.1
── Conflicts ────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
✖ purrr::reduce() masks Signac::reduce()
Loading required package: Matrix


Attaching package: ‘Matrix’


The following objects are masked from ‘package:tidyr’:

    expand, pack, unpack


Loading requi

[1] "2023-06-28 15:58:18 CDT"
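
To illustrate what pseudocell aggregation does conceptually, here is a minimal base-R sketch: cells are grouped by their position in a low-dimensional embedding, and the counts of all cells in a group are summed into one pseudocell column. This is not RENIN's implementation of `prepare_pseudocell_matrix`. The toy data, the use of hierarchical clustering, and every variable name are assumptions made for illustration, and unlike a fixed `cells_per_partition` the groups here can differ in size.

```r
set.seed(1)
n_cells <- 200
n_features <- 5
cells_per_partition <- 20  # target average size, analogous in spirit to the argument above

# Toy stand-ins: a 2-D embedding (in place of PCA/LSI) and a feature x cell count matrix.
embedding <- matrix(rnorm(n_cells * 2), ncol = 2)
counts <- matrix(rpois(n_features * n_cells, lambda = 2), nrow = n_features,
                 dimnames = list(paste0("feature", seq_len(n_features)),
                                 paste0("cell", seq_len(n_cells))))

# Group nearby cells; cutree() always returns exactly k non-empty groups.
k <- ceiling(n_cells / cells_per_partition)
partition <- cutree(hclust(dist(embedding)), k = k)

# Sum counts over the cells in each group: result is features x pseudocells.
pseudo <- t(rowsum(t(counts), group = partition))
print(dim(pseudo))
```

Summing rather than averaging keeps the total counts unchanged, which makes the aggregation easy to verify.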

In [14]:
expr_mat <- mats[["SCT"]]
peak_mat <- mats[["peaks"]]

In [15]:
unique(novaseq.sub$celltype_atac5)
unique(novaseq.sub$celltype5_rna)
unique(Idents(novaseq.sub))

### add motif

In [16]:
Sys.time()
library(chromVARmotifs)
library(BSgenome.Hsapiens.UCSC.hg19)

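# Take the third "_"-separated field of each motif name as its identifier
sout <- sapply(strsplit(names(human_pwms_v1), split = "_"), function(s) c(s[3]))
# Keep only the first motif for each unique identifier
human_pwms_v2 <- human_pwms_v1[match(unique(sout), sout)]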

[1] "2023-06-28 15:58:18 CDT"



“no function found corresponding to methods exports from ‘BSgenome’ for: ‘releaseName’”
Loading required package: BSgenome

Loading required package: Biostrings

Loading required package: XVector


Attaching package: ‘XVector’


The following object is masked from ‘package:purrr’:

    compact



Attaching package: ‘Biostrings’


The following object is masked from ‘package:base’:

    strsplit


Loading required package: rtracklayer



In [17]:
motifs = human_pwms_v2
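
The deduplication above relies on the `match(unique(x), x)` idiom, which returns the index of the first occurrence of each distinct value. A small sketch on toy data shows the effect. The motif names below are hypothetical and only mimic an underscore-separated naming scheme whose third field is the identifier of interest.

```r
# Hypothetical motif names: the third "_"-separated field is the identifier.
toy_names <- c("ID01_LINE1_TFA_D", "ID02_LINE2_TFB_D",
               "ID03_LINE3_TFA_I", "ID04_LINE4_TFC_D")
toy_pwms <- setNames(as.list(seq_along(toy_names)), toy_names)

# Extract the third field from every name.
tf <- sapply(strsplit(names(toy_pwms), split = "_"), function(s) s[3])

# Index of the first entry for each distinct identifier.
keep <- match(unique(tf), tf)
toy_dedup <- toy_pwms[keep]
print(names(toy_dedup))
```

Here the second `TFA` entry is dropped because an earlier entry with the same identifier already exists; which motif survives therefore depends on the order of the list.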

In [18]:
Sys.time()
novaseq.sub <- AddMotifs(novaseq.sub, genome = BSgenome.Hsapiens.UCSC.hg19, pfm = motifs, assay = "peaks")
Sys.time()

[1] "2023-06-28 15:58:26 CDT"

Building motif matrix

Finding motif positions

Creating Motif object



[1] "2023-06-28 16:05:03 CDT"

In [19]:
novaseq.sub

An object of class Seurat 
237522 features across 324701 samples within 2 assays 
Active assay: peaks (189184 features, 189184 variable features)
 1 other assay present: SCT
 6 dimensional reductions calculated: pca, harmony_RNA, lsi, harmony_peaks, umap.peaks, WNN.UMAP

In [20]:
ls()

In [21]:
processed_dir

### save variables needed for the following steps

In [22]:
Sys.time()
save(list = c("novaseq.sub", "expr_mat", "peak_mat",
              "level.novaseq.rna", "palette.novaseq.rna",
              "level.novaseq", "palette.novaseq",
              "level.novaseq.renal_region_new", "palette.novaseq.renal_region_new"),
     file = file.path(processed_dir, "RENIN_324701_cells_preprocess.RData"), compress = TRUE)
Sys.time()

[1] "2023-06-28 16:11:20 CDT"

[1] "2023-06-28 16:49:17 CDT"
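
The save step takes roughly 38 minutes, and `save(list = ...)` stops with an error if any listed name does not exist. Several of the names (the `level.*` and `palette.*` variables) are not created in this notebook and presumably come from the loaded `.RData` file. A minimal sketch of checking the names first and then verifying the written file might look like the following. The toy objects and names are hypothetical.

```r
# Toy workspace: two objects exist, one requested name does not.
obj_a <- 1:3
obj_b <- letters[1:2]
needed <- c("obj_a", "obj_b", "obj_missing")

# Check which requested names exist in the current environment.
env <- environment()
is_present <- vapply(needed, exists, logical(1), envir = env, inherits = FALSE)
missing_vars <- needed[!is_present]
if (length(missing_vars) > 0) {
  message("Not found, skipping: ", paste(missing_vars, collapse = ", "))
}

# Save only what exists, then reload into a fresh environment to verify.
out_file <- tempfile(fileext = ".RData")
save(list = needed[is_present], file = out_file, envir = env, compress = TRUE)
check <- new.env()
restored <- load(out_file, envir = check)
print(restored)
```

Running the existence check before the long save avoids discovering a missing variable only after the expensive step has started.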

In [23]:
sessionInfo()

R version 4.2.2 (2022-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg19_1.4.3 BSgenome_1.60.0                  
 [3] rtracklayer_1.52.0                Biostrings_2.60.0                
 [5] XVector_0.32.0                    chromVARmotifs_0.2.0             
 [7] SingleCellExperiment

In [24]:
Sys.time()

[1] "2023-06-28 16:49:17 CDT"