In [5]:
quiet_library <- function(...) {
    suppressPackageStartupMessages(library(...))
}
quiet_library('tidyverse')
quiet_library("hise")
quiet_library('ArchR')
quiet_library('data.table')
quiet_library('jsonlite')
quiet_library('parallel')
quiet_library("Seurat")
quiet_library('H5weaver')
addArchRGenome("hg38")
addArchRThreads(threads = 60)
options(future.globals.maxSize = 1000 * 1024^5)

Setting default genome to Hg38.

Setting default number of Parallel threads to 60.
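
For scale, the `future.globals.maxSize` value set in the cell above can be converted into familiar units. This is a small base-R sketch; the helper name `bytes_to_gib` is an assumption made for illustration and is not part of any package.

```r
# Hypothetical helper (not from any package): convert bytes to GiB.
bytes_to_gib <- function(bytes) bytes / 1024^3

limit_bytes <- 1000 * 1024^5          # value passed to options() above
limit_gib   <- bytes_to_gib(limit_bytes)
limit_pib   <- limit_bytes / 1024^5   # 1024^5 bytes is one PiB

# The limit is far larger than any realistic object,
# so it effectively disables the size check on exported globals.
print(limit_gib)
print(limit_pib)
```

If a real ceiling is wanted, a value such as `8 * 1024^3` (8 GiB) would be a more conventional choice; that number is only an example.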



In [2]:
T_cell_L3<-c('CD8aa',
 'DN T cell',
 'CD8 MAIT','CD4 MAIT','ISG+ MAIT',
 'CM CD4 T cell',
 'GZMB- CD27+ EM CD4 T cell','GZMB- CD27- EM CD4 T cell',
 'ISG+ memory CD4 T cell','KLRF1- GZMB+ CD27- memory CD4 T cell','GZMK+ CD27+ EM CD8 T cell',
 'CM CD8 T cell','KLRF1- GZMB+ CD27- EM CD8 T cell','GZMK- CD27+ EM CD8 T cell',
 'KLRF1+ GZMB+ CD27- EM CD8 T cell',
 'ISG+ memory CD8 T cell','Core naive CD4 T cell','SOX4+ naive CD4 T cell',
 'ISG+ naive CD4 T cell','Core naive CD8 T cell','SOX4+ naive CD8 T cell',
 'ISG+ naive CD8 T cell','Proliferating T cell',
 'Naive CD4 Treg','Memory CD4 Treg',
 'KLRB1+ memory CD4 Treg','Memory CD8 Treg','GZMK+ memory CD4 Treg',
 'KLRB1+ memory CD8 Treg','GZMK+ Vd2 gdT',
 'GZMB+ Vd2 gdT','Naive Vd1 gdT','KLRF1+ effector Vd1 gdT','SOX4+ Vd1 gdT','KLRF1- effector Vd1 gdT')

In [8]:
remotes::install_version("Matrix", version = "1.5.3")

Downloading package from url: https://cran.r-project.org/src/contrib/Archive/Matrix/Matrix_1.5-3.tar.gz

Installing package into ‘/home/jupyter/libb’
(as ‘lib’ is unspecified)



In [2]:
meta<-read.csv('meta_data_GEO.csv')


In [10]:
mclapply(meta$combined_sample_id, function(i) {
  # Build one Arrow file per sample from its fragment file
  createArrowFiles(
    inputFiles = paste0(i, '_fragments.tsv.gz'),
    sampleNames = i,
    minTSS = 4, # Don't set this too high; the threshold can always be raised later
    minFrags = 1000,
    addTileMat = TRUE,
    addGeneScoreMat = TRUE
  )
}, mc.cores = 16)
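
Before launching the parallel Arrow-file creation, it can help to confirm that every expected fragment file is present. The following is a minimal base-R sketch using made-up sample IDs and a temporary directory; `sample_ids` and `frag_dir` are hypothetical stand-ins for `meta$combined_sample_id` and the working directory.

```r
# Toy sample IDs standing in for meta$combined_sample_id (hypothetical values).
sample_ids <- c("sampleA", "sampleB", "sampleC")

# Create two of the three fragment files in a temporary directory
# so that the check below has something to find.
frag_dir <- tempfile("frags_")
dir.create(frag_dir)
invisible(file.create(file.path(frag_dir, paste0(sample_ids[1:2], "_fragments.tsv.gz"))))

# Expected input paths, built with the same suffix as in the cell above.
frag_files  <- file.path(frag_dir, paste0(sample_ids, "_fragments.tsv.gz"))
missing_ids <- sample_ids[!file.exists(frag_files)]

# Samples whose fragment file is absent.
print(missing_ids)
```

In the real workflow the same check would run against the actual paths, stopping early when `missing_ids` is non-empty.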

In [11]:
projHeme1 <- ArchRProject(
  ArrowFiles = paste0(meta$combined_sample_id,'.arrow'), 
  outputDirectory = "PenSen_ATAC_L2",
  copyArrows = TRUE # Recommended: keeps an original copy of the Arrow files in case they are modified later.
)

Using GeneAnnotation set by addArchRGenome(Hg38)!

Using GeneAnnotation set by addArchRGenome(Hg38)!

Validating Arrows...

Getting SampleNames...



Copying ArrowFiles to Ouptut Directory! If you want to save disk space set copyArrows = FALSE

1 
2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
12 
13 
14 
15 
16 


Getting Cell Metadata...



Merging Cell Metadata...

Initializing ArchRProject...


                                                   / |
                                                 /    \
            .                                  /      |.
            \\\                              /        |.
              \\\                          /           `|.
                \\\                      /              |.
                  \                    /                |\
                  \\#####\           /                  ||
                ==###########>      /                   ||
                 \\##==......\    /                     ||
            ______ =       =|__ /_

In [12]:
projHeme1 <- addIterativeLSI(projHeme1, name = 'IterativeLSI', force = TRUE, varFeatures = 75000)


Checking Inputs...

ArchR logging to : ArchRLogs/ArchR-addIterativeLSI-2498991441-Date-2024-04-10_Time-16-09-08.758623.log
If there is an issue, please report to github with logFile!

2024-04-10 16:09:10.138291 : Computing Total Across All Features, 0.006 mins elapsed.

2024-04-10 16:09:15.028767 : Computing Top Features, 0.088 mins elapsed.

###########
2024-04-10 16:09:16.629768 : Running LSI (1 of 2) on Top Features, 0.115 mins elapsed.
###########

2024-04-10 16:09:16.899037 : Sampling Cells (N = 10010) for Estimated LSI, 0.119 mins elapsed.

2024-04-10 16:09:16.901294 : Creating Sampled Partial Matrix, 0.119 mins elapsed.

2024-04-10 16:09:37.035537 : Computing Estimated LSI (projectAll = FALSE), 0.455 mins elapsed.

2024-04-10 16:11:34.475518 : Identifying Clusters, 2.412 mins elapsed.

2024-04-10 16:11:49.263693 : Identified 6 Clusters, 2.659 mins elapsed.

2024-04-10 16:11:49.280901 : Saving LSI Iteration, 2.659 mins elapsed.

2024-04-10 16:12:05.6715 : Creating Cluster Matrix 

In [13]:
scRNA_labels<-read.csv('PedSenior_TEAseq_Labels_2024-03-26.csv')
scRNA_labels$real_barcodes<-paste0(scRNA_labels$well_id,'-',scRNA_labels$original_barcodes,'-1')

In [14]:
scATAC_labels<-as.data.frame(projHeme1@cellColData)
scATAC_labels$barcodes <- str_extract(rownames(scATAC_labels), "(?<=#).*$")
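
ArchR cell names have the form `<Sample>#<barcode>`, which is why the cell above keeps the text after `#`. A base-R sketch of the same split on made-up cell names, assuming each name contains a single `#`:

```r
# Toy ArchR-style cell names of the form "<Sample>#<barcode>" (made-up values).
cell_names <- c("S1#AAACGAA-1", "S1#TTTGGCC-1", "S2#AAACGAA-1")

# Drop everything up to and including the first '#'.
barcodes <- sub("^[^#]*#", "", cell_names)

# Keep only the part before the first '#'.
samples <- sub("#.*$", "", cell_names)

print(data.frame(sample = samples, barcode = barcodes))
```

Note that the same barcode can occur in more than one sample, so the barcode alone is not necessarily a unique key.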

In [15]:
scATAC_meta_data_files<-paste0(meta$combined_sample_id,'_filtered_metadata.csv.gz')

In [16]:
meta_data_list_ATAC <- mclapply(scATAC_meta_data_files, function(x) {
  # Read one gzip-compressed per-sample metadata table
  read.csv(gzfile(x))
}, mc.cores = 16)

In [17]:
meta_data_ATAC<-do.call(rbind,meta_data_list_ATAC)
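
`rbind` on data frames requires every table to have the same columns, so a quick consistency check before combining can save a confusing error. A minimal sketch on toy tables; `meta_list` is a hypothetical stand-in for `meta_data_list_ATAC`.

```r
# Toy per-sample metadata tables (made-up values).
meta_list <- list(
  data.frame(barcodes = c("b1", "b2"), well_id = "W1", stringsAsFactors = FALSE),
  data.frame(barcodes = "b3", well_id = "W2", stringsAsFactors = FALSE)
)

# TRUE for each table whose column names match the first table exactly.
same_cols <- vapply(
  meta_list,
  function(d) identical(names(d), names(meta_list[[1]])),
  logical(1)
)
stopifnot(all(same_cols))

# Safe to stack once the columns agree.
combined <- do.call(rbind, meta_list)
print(dim(combined))
```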

In [18]:
scATAC_labels<-left_join(scATAC_labels,meta_data_ATAC,by=('barcodes'))

In [19]:
scATAC_labels$real_barcodes<-paste0(gsub('-A','-',scATAC_labels$well_id),'-',scATAC_labels$original_barcodes)

In [20]:
scATAC_labels<-left_join(scATAC_labels,scRNA_labels,by=c('real_barcodes'))
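
Two things are worth checking around a left join like the ones above: keys that occur more than once on the right-hand side (they duplicate rows) and keys on the left with no partner (they produce `NA` labels). A base-R sketch on toy tables; the names `atac` and `rna` and all values are made up.

```r
# Toy stand-ins for the ATAC and RNA label tables (all values are made up).
atac <- data.frame(real_barcodes = c("W1-AAA-1", "W1-CCC-1", "W2-GGG-1"),
                   stringsAsFactors = FALSE)
rna  <- data.frame(real_barcodes = c("W1-AAA-1", "W2-GGG-1", "W2-GGG-1"),
                   AIFI_L2 = c("MAIT", "Treg", "Treg"),
                   stringsAsFactors = FALSE)

# Keys present more than once on the right side would duplicate rows after the join.
dup_keys <- unique(rna$real_barcodes[duplicated(rna$real_barcodes)])

# Keys on the left side with no partner end up with NA labels.
unmatched <- setdiff(atac$real_barcodes, rna$real_barcodes)

print(dup_keys)
print(unmatched)
```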

In [21]:
scATAC_labels<-scATAC_labels %>% filter(predicted_doublet=="False",pct_counts_mito<15,n_genes>200,n_genes<2500)
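
Cells that found no RNA partner in the join carry `NA` in the QC columns, and `dplyr::filter` drops rows whose condition evaluates to `NA`. The sketch below shows the same behaviour with base `subset()` on a made-up table, using the thresholds from the cell above.

```r
# Toy QC table; cell c4 mimics an ATAC cell with no matching RNA label.
qc <- data.frame(
  cell              = c("c1", "c2", "c3", "c4"),
  predicted_doublet = c("False", "True", "False", NA),
  pct_counts_mito   = c(5, 3, 20, NA),
  n_genes           = c(1200, 900, 1500, NA),
  stringsAsFactors  = FALSE
)

# Same thresholds as the filter above; rows with NA conditions are dropped.
kept <- subset(
  qc,
  predicted_doublet == "False" & pct_counts_mito < 15 &
    n_genes > 200 & n_genes < 2500
)

print(kept$cell)
```

Here `c2` fails the doublet test, `c3` fails the mitochondrial cutoff, and `c4` is removed only because its values are missing.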

In [22]:
projHeme1@cellColData$barcodes<-str_extract(rownames(as.data.frame(projHeme1@cellColData)), "(?<=#).*$")

In [23]:
scATAC_labels$ATAC_barcodes<-paste0(scATAC_labels$Sample,'#',scATAC_labels$barcodes.x)

In [24]:
table(scATAC_labels$AIFI_L2)


        CD14 monocyte         CD16 monocyte    CD56bright NK cell 
                 1430                    11                    79 
      CD56dim NK cell                 CD8aa             DN T cell 
                 1761                  1160                  1882 
      Effector B cell           Erythrocyte                   gdT 
                   11                   682                  5614 
                  ILC                  MAIT         Memory B cell 
                    1                 13036                   197 
    Memory CD4 T cell     Memory CD8 T cell          Naive B cell 
                65762                 53170                  3111 
     Naive CD4 T cell      Naive CD8 T cell                   pDC 
               100931                 41225                    35 
          Plasma cell              Platelet Proliferating NK cell 
                    3                   436                    33 
 Proliferating T cell   Transitional B cell                  

In [25]:
scATAC_labels <- scATAC_labels %>%
  filter(AIFI_L1 == 'T cell') %>%
  filter(AIFI_L2 %in% c('Treg', 'Proliferating T cell',
                        'Naive CD8 T cell', 'Naive CD4 T cell', 'CD8aa', 'MAIT',
                        'Memory CD8 T cell', 'Memory CD4 T cell', 'DN T cell')) %>%
  filter(AIFI_L3 %in% T_cell_L3)
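
When filtering on a fixed list of labels, it is useful to know which requested labels never occur in the data, since a misspelled label silently matches nothing. A base-R sketch with made-up vectors; `wanted` and `observed` are hypothetical stand-ins for `T_cell_L3` and the `AIFI_L3` column.

```r
# Requested labels (a short, made-up subset) and labels seen in the data.
wanted   <- c("CM CD4 T cell", "CD8 MAIT", "GZMB+ Vd2 gdT")
observed <- c("CM CD4 T cell", "CM CD4 T cell", "CD8 MAIT", "Naive B cell")

# Requested labels that do not appear in the data at all.
absent <- setdiff(wanted, unique(observed))

# Entries that survive the %in% filter.
kept <- observed[observed %in% wanted]

print(absent)
print(table(kept))
```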

In [26]:
write.csv(scATAC_labels,'scATAC_cell_meta_data_L2.csv')

In [27]:
projHeme1<-projHeme1[scATAC_labels$ATAC_barcodes,]

In [28]:
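# Align labels to the project's cell order by barcode instead of relying on row position
idx <- match(rownames(projHeme1@cellColData), scATAC_labels$ATAC_barcodes)
projHeme1@cellColData$AIFI_L2 <- scATAC_labels$AIFI_L2[idx]
projHeme1@cellColData$AIFI_L3 <- scATAC_labels$AIFI_L3[idx]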
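
Per-cell labels copied into `cellColData` have to be in the same order as the project's cells. A minimal base-R sketch of aligning by barcode with `match()`; the table `labels` and the vector `project_cells` are made-up stand-ins for `scATAC_labels` and the project's cell names.

```r
# Toy label table and a project cell order that differs from it (made-up values).
labels <- data.frame(
  ATAC_barcodes = c("S1#AAA", "S1#CCC", "S2#GGG"),
  AIFI_L2       = c("MAIT", "Treg", "CD8aa"),
  stringsAsFactors = FALSE
)
project_cells <- c("S2#GGG", "S1#AAA", "S1#CCC")

# Position of each project cell inside the label table.
idx <- match(project_cells, labels$ATAC_barcodes)

# Every project cell must have a label row; NA here would mean a missing barcode.
stopifnot(!anyNA(idx))

aligned <- labels$AIFI_L2[idx]
print(setNames(aligned, project_cells))
```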

In [29]:
saveArchRProject(ArchRProj = projHeme1, outputDirectory = "PenSen_ATAC_L2", load = TRUE)

Saving ArchRProject...

Loading ArchRProject...

Successfully loaded ArchRProject!



class: ArchRProject 
outputDirectory: /home/jupyter/BRI_Analysis/scRNA/Analysis-Cross-Sectional/TEAseq/PenSen_ATAC_L2 
samples(16): GSM6611363_B065-P1_PB00593-04
  GSM6611364_B069-P1_PB00323-02 ... GSM6611377_B065-P1_PB00192-02
  GSM6611378_B065-P1_PB00197-02
sampleColData names(1): ArrowFiles
cellColData names(16): Sample TSSEnrichment ... AIFI_L2 AIFI_L3
numberOfCells(1): 287761
medianTSS(1): 25.364
medianFrags(1): 6889

# AIFI_L2

In [8]:
projHeme1 <- loadArchRProject(path = 'PenSen_ATAC_L2/')

Successfully loaded ArchRProject!



In [9]:
projHeme1 <- addGroupCoverages(
  ArchRProj = projHeme1,
  groupBy = "AIFI_L2",
  maxReplicates = 16,
  maxCells = 2000,
  threads = 10,
  force = TRUE
)

ArchR logging to : ArchRLogs/ArchR-addGroupCoverages-734d1d409c90-Date-2024-04-16_Time-04-44-27.612041.log
If there is an issue, please report to github with logFile!

CD8aa (1 of 9) : CellGroups N = 10

DN T cell (2 of 9) : CellGroups N = 16

MAIT (3 of 9) : CellGroups N = 16

Memory CD4 T cell (4 of 9) : CellGroups N = 16

Memory CD8 T cell (5 of 9) : CellGroups N = 16

Naive CD4 T cell (6 of 9) : CellGroups N = 16

Naive CD8 T cell (7 of 9) : CellGroups N = 16

Proliferating T cell (8 of 9) : CellGroups N = 2

Treg (9 of 9) : CellGroups N = 16

2024-04-16 04:44:50.132104 : Creating Coverage Files!, 0.375 mins elapsed.

2024-04-16 04:44:50.135882 : Batch Execution w/ safelapply!, 0.375 mins elapsed.

2024-04-16 04:53:40.423563 : Adding Kmer Bias to Coverage Files!, 9.214 mins elapsed.

Completed Kmer Bias Calculation

Adding Kmer Bias (1 of 124)

Adding Kmer Bias (2 of 124)

Adding Kmer Bias (3 of 124)

Adding Kmer Bias (4 of 124)

Adding Kmer Bias (5 of 124)

Adding Kmer Bias (6 of 

In [10]:
pathToMacs2<-'/opt/conda/bin/macs3'

In [11]:
projHeme1 <- addReproduciblePeakSet(
  ArchRProj = projHeme1,
  groupBy = "AIFI_L2",
  pathToMacs2 = pathToMacs2,
  force = TRUE
)

ArchR logging to : ArchRLogs/ArchR-addReproduciblePeakSet-734df0c7f32-Date-2024-04-16_Time-05-02-35.500954.log
If there is an issue, please report to github with logFile!

Calling Peaks with Macs2

2024-04-16 05:02:35.785032 : Peak Calling Parameters!, 0.005 mins elapsed.



                                    Group nCells nCellsUsed nReplicates nMin
CD8aa                               CD8aa   1157       1157          10   59
DN T cell                       DN T cell   1879       1879          16   51
MAIT                                 MAIT  13035      10607          16  164
Memory CD4 T cell       Memory CD4 T cell  65762      29048          16 1420
Memory CD8 T cell       Memory CD8 T cell  53146      27467          16  874
Naive CD4 T cell         Naive CD4 T cell 100931      32000          16 2000
Naive CD8 T cell         Naive CD8 T cell  41225      24189          16  416
Proliferating T cell Proliferating T cell      3          3           2    3
Treg                                 Treg  10623      10623          16  249
                     nMax maxPeaks
CD8aa                 240   150000
DN T cell             266   150000
MAIT                 2000   150000
Memory CD4 T cell    2000   150000
Memory CD8 T cell    2000   150000
Naive CD4 T cell    

2024-04-16 05:02:35.793882 : Batching Peak Calls!, 0.005 mins elapsed.

2024-04-16 05:02:35.800179 : Batch Execution w/ safelapply!, 0 mins elapsed.

2024-04-16 05:06:27.907851 : Identifying Reproducible Peaks!, 3.873 mins elapsed.

2024-04-16 05:06:51.209554 : Creating Union Peak Set!, 4.262 mins elapsed.

Converged after 12 iterations!

Plotting Ggplot!

2024-04-16 05:07:00.91853 : Finished Creating Union Peak Set (167345)!, 4.424 mins elapsed.
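
The `nCellsUsed` column in the peak-calling table can be sanity-checked against the coverage settings. Assuming each pseudo-bulk replicate holds at most `maxCells` cells, a group can contribute at most `maxReplicates * maxCells` cells. The sketch below uses three rows copied from the table above; the variable names are illustrative.

```r
# Settings used in addGroupCoverages() above.
max_replicates <- 16
max_cells      <- 2000

# Upper bound on cells per group under the stated assumption.
upper_bound <- max_replicates * max_cells

# nCells and nCellsUsed for three groups, copied from the table above.
n_cells <- c("Naive CD4 T cell" = 100931, "Memory CD4 T cell" = 65762, "CD8aa" = 1157)
n_used  <- c("Naive CD4 T cell" = 32000,  "Memory CD4 T cell" = 29048, "CD8aa" = 1157)

# No group may use more cells than it has, or more than the bound.
stopifnot(all(n_used <= pmin(n_cells, upper_bound)))

print(upper_bound)
print(n_used / n_cells)   # fraction of each group's cells used for peak calling
```

Only the largest group (Naive CD4 T cell) reaches the bound; small groups such as CD8aa use all of their cells.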



In [12]:
projHeme1 <- addPeakMatrix(projHeme1, force = TRUE)

ArchR logging to : ArchRLogs/ArchR-addPeakMatrix-734d1b189805-Date-2024-04-16_Time-05-07-00.938938.log
If there is an issue, please report to github with logFile!

2024-04-16 05:07:01.151648 : Batch Execution w/ safelapply!, 0 mins elapsed.

Overriding previous entry for ReadsInPeaks

Overriding previous entry for FRIP

ArchR logging successful to : ArchRLogs/ArchR-addPeakMatrix-734d1b189805-Date-2024-04-16_Time-05-07-00.938938.log



In [13]:
library(BSgenome.Hsapiens.UCSC.hg38)
library(TFBSTools)
library(JASPAR2020)
tair.motif=getMatrixSet(x=JASPAR2020,opts=list(collection="CORE",species="9606",matrixtype="PWM"))

In [14]:
projHeme1=addMotifAnnotations(projHeme1,name="JASPARMotif",motifPWMs=tair.motif,force=TRUE)

ArchR logging to : ArchRLogs/ArchR-addMotifAnnotations-734d7d467993-Date-2024-04-16_Time-05-11-20.778506.log
If there is an issue, please report to github with logFile!

peakAnnotation name already exists! Overriding.

2024-04-16 05:11:20.860524 : Gettting Motif Set, Species : , 0.001 mins elapsed.

2024-04-16 05:11:21.47616 : Finding Motif Positions with motifmatchr!, 0.012 mins elapsed.

2024-04-16 05:13:47.425046 : All Motifs Overlap at least 1 peak!, 2.444 mins elapsed.

2024-04-16 05:13:47.427889 : Creating Motif Overlap Matrix, 2.444 mins elapsed.

2024-04-16 05:13:49.573944 : Finished Getting Motif Info!, 2.48 mins elapsed.

ArchR logging successful to : ArchRLogs/ArchR-addMotifAnnotations-734d7d467993-Date-2024-04-16_Time-05-11-20.778506.log



In [15]:
projHeme1


           ___      .______        ______  __    __  .______      
          /   \     |   _  \      /      ||  |  |  | |   _  \     
         /  ^  \    |  |_)  |    |  ,----'|  |__|  | |  |_)  |    
        /  /_\  \   |      /     |  |     |   __   | |      /     
       /  _____  \  |  |\  \\___ |  `----.|  |  |  | |  |\  \\___.
      /__/     \__\ | _| `._____| \______||__|  |__| | _| `._____|
    



class: ArchRProject 
outputDirectory: /home/jupyter/BRI_Analysis/scRNA/Analysis-Cross-Sectional/TEAseq/PenSen_ATAC_L2 
samples(16): GSM6611363_B065-P1_PB00593-04
  GSM6611364_B069-P1_PB00323-02 ... GSM6611377_B065-P1_PB00192-02
  GSM6611378_B065-P1_PB00197-02
sampleColData names(1): ArrowFiles
cellColData names(18): Sample TSSEnrichment ... ReadsInPeaks FRIP
numberOfCells(1): 287761
medianTSS(1): 25.364
medianFrags(1): 6889

In [16]:
#remotes::install_version("Matrix", version = "1.6.3")

In [17]:
saveArchRProject(ArchRProj = projHeme1, outputDirectory = "PenSen_ATAC_L2", load = TRUE)

Saving ArchRProject...

Loading ArchRProject...

Successfully loaded ArchRProject!



class: ArchRProject 
outputDirectory: /home/jupyter/BRI_Analysis/scRNA/Analysis-Cross-Sectional/TEAseq/PenSen_ATAC_L2 
samples(16): GSM6611363_B065-P1_PB00593-04
  GSM6611364_B069-P1_PB00323-02 ... GSM6611377_B065-P1_PB00192-02
  GSM6611378_B065-P1_PB00197-02
sampleColData names(1): ArrowFiles
cellColData names(18): Sample TSSEnrichment ... ReadsInPeaks FRIP
numberOfCells(1): 287761
medianTSS(1): 25.364
medianFrags(1): 6889

In [18]:
projHeme1 <- addBgdPeaks(projHeme1,force=TRUE)

Identifying Background Peaks!



In [19]:
projHeme1 <- addDeviationsMatrix(
  ArchRProj = projHeme1, 
  peakAnnotation = "JASPARMotif", # must match the name given to addMotifAnnotations()
  force = TRUE
)


Using Previous Background Peaks!

ArchR logging to : ArchRLogs/ArchR-addDeviationsMatrix-734d57e93f06-Date-2024-04-16_Time-05-14-26.13394.log
If there is an issue, please report to github with logFile!



NULL


2024-04-16 05:14:32.060723 : Batch Execution w/ safelapply!, 0 mins elapsed.

###########
2024-04-16 06:28:39.973058 : Completed Computing Deviations!, 74.231 mins elapsed.
###########

ArchR logging successful to : ArchRLogs/ArchR-addDeviationsMatrix-734d57e93f06-Date-2024-04-16_Time-05-14-26.13394.log



In [20]:
saveArchRProject(ArchRProj = projHeme1, outputDirectory = "PenSen_ATAC_L2", load = TRUE)

Saving ArchRProject...

Loading ArchRProject...

Successfully loaded ArchRProject!



class: ArchRProject 
outputDirectory: /home/jupyter/BRI_Analysis/scRNA/Analysis-Cross-Sectional/TEAseq/PenSen_ATAC_L2 
samples(16): GSM6611363_B065-P1_PB00593-04
  GSM6611364_B069-P1_PB00323-02 ... GSM6611377_B065-P1_PB00192-02
  GSM6611378_B065-P1_PB00197-02
sampleColData names(1): ArrowFiles
cellColData names(18): Sample TSSEnrichment ... ReadsInPeaks FRIP
numberOfCells(1): 287761
medianTSS(1): 25.364
medianFrags(1): 6889

In [3]:
projHeme1 <- loadArchRProject(path = 'PenSen_ATAC_L2_cisbp/')

Successfully loaded ArchRProject!



In [7]:
library(BSgenome.Hsapiens.UCSC.hg38)
projHeme1 <- addMotifAnnotations(ArchRProj = projHeme1, motifSet = "cisbp", name = "cisbp", force = TRUE)
projHeme1 <- addBgdPeaks(projHeme1,force = TRUE)
projHeme1 <- addDeviationsMatrix(
  ArchRProj = projHeme1, 
  peakAnnotation = "cisbp", # must match the name given to addMotifAnnotations()
  force = TRUE,
  threads = 50
)

saveArchRProject(ArchRProj = projHeme1, outputDirectory = "PenSen_ATAC_L2_cisbp", load = TRUE)

Loading required package: BSgenome

Loading required package: Biostrings

Loading required package: XVector


Attaching package: ‘XVector’


The following object is masked from ‘package:plyr’:

    compact


The following object is masked from ‘package:purrr’:

    compact



Attaching package: ‘Biostrings’


The following object is masked from ‘package:grid’:

    pattern


The following object is masked from ‘package:base’:

    strsplit


Loading required package: BiocIO

Loading required package: rtracklayer


Attaching package: ‘rtracklayer’


The following object is masked from ‘package:BiocIO’:

    FileForFormat


ArchR logging to : ArchRLogs/ArchR-addMotifAnnotations-a95151975971-Date-2024-04-16_Time-15-13-57.350329.log
If there is an issue, please report to github with logFile!

peakAnnotation name already exists! Overriding.

2024-04-16 15:13:57.399649 : Gettting Motif Set, Species : Homo sapiens, 0.001 mins elapsed.

Using version 2 motifs!

2024-04-16 15:13:58.849237 : Fin

NULL


'as(<lgCMatrix>, "dgCMatrix")' is deprecated.
Use 'as(., "dMatrix")' instead.
See help("Deprecated") and help("Matrix-deprecated").

2024-04-16 15:18:55.103107 : Batch Execution w/ safelapply!, 0 mins elapsed.

###########
2024-04-16 17:07:19.647215 : Completed Computing Deviations!, 108.505 mins elapsed.
###########

ArchR logging successful to : ArchRLogs/ArchR-addDeviationsMatrix-a9514b12213c-Date-2024-04-16_Time-15-18-49.329871.log

Saving ArchRProject...

Loading ArchRProject...

Successfully loaded ArchRProject!



class: ArchRProject 
outputDirectory: /home/jupyter/BRI_Analysis/scRNA/Analysis-Cross-Sectional/TEAseq/PenSen_ATAC_L2_cisbp 
samples(16): GSM6611374_B076-P1_PB00127-02
  GSM6611366_B076-P1_PB00353-03 ... GSM6611363_B065-P1_PB00593-04
  GSM6611372_B069-P1_PB00172-02
sampleColData names(1): ArrowFiles
cellColData names(18): Sample TSSEnrichment ... ReadsInPeaks FRIP
numberOfCells(1): 287761
medianTSS(1): 25.364
medianFrags(1): 6889

# AIFI_L3

In [39]:
projHeme1 <- loadArchRProject(path = 'PenSen_ATAC_L2/')

Successfully loaded ArchRProject!



In [40]:
cell_meta_filtered<-as.data.frame(projHeme1@cellColData) %>% 
filter(AIFI_L3%in% T_cell_L3)

In [41]:
projHeme1<-projHeme1[rownames(cell_meta_filtered),]

Dropping ImputeWeights Since You Are Subsetting Cells! ImputeWeights is a cell-x-cell Matrix!



In [43]:
projHeme1 <- addGroupCoverages(ArchRProj = projHeme1, groupBy = "AIFI_L3", maxReplicates = 16, maxCells = 2000, threads = 10, force = TRUE )

ArchR logging to : ArchRLogs/ArchR-addGroupCoverages-81f145e2ae73-Date-2024-04-10_Time-20-26-25.468317.log
If there is an issue, please report to github with logFile!

CD4 MAIT (1 of 35) : CellGroups N = 2

CD8 MAIT (2 of 35) : CellGroups N = 16

CD8aa (3 of 35) : CellGroups N = 9

CM CD4 T cell (4 of 35) : CellGroups N = 16

CM CD8 T cell (5 of 35) : CellGroups N = 15

Core naive CD4 T cell (6 of 35) : CellGroups N = 16

Core naive CD8 T cell (7 of 35) : CellGroups N = 16

DN T cell (8 of 35) : CellGroups N = 13

GZMB- CD27- EM CD4 T cell (9 of 35) : CellGroups N = 2

GZMB- CD27+ EM CD4 T cell (10 of 35) : CellGroups N = 16

GZMB+ Vd2 gdT (11 of 35) : CellGroups N = 2

GZMK- CD27+ EM CD8 T cell (12 of 35) : CellGroups N = 2

GZMK+ CD27+ EM CD8 T cell (13 of 35) : CellGroups N = 16

GZMK+ memory CD4 Treg (14 of 35) : CellGroups N = 2

GZMK+ Vd2 gdT (15 of 35) : CellGroups N = 2

ISG+ MAIT (16 of 35) : CellGroups N = 2

ISG+ memory CD4 T cell (17 of 35) : CellGroups N = 2

ISG+ memory C

In [44]:

pathToMacs2<-'/opt/conda/bin/macs3'


In [45]:
projHeme1 <- addReproduciblePeakSet(
  ArchRProj = projHeme1,
  groupBy = "AIFI_L3",
  pathToMacs2 = pathToMacs2,
  force = TRUE
)

ArchR logging to : ArchRLogs/ArchR-addReproduciblePeakSet-81f140a708bd-Date-2024-04-10_Time-21-04-48.441453.log
If there is an issue, please report to github with logFile!

Calling Peaks with Macs2

2024-04-10 21:04:48.837942 : Peak Calling Parameters!, 0.007 mins elapsed.



                                                                    Group
CD4 MAIT                                                         CD4 MAIT
CD8 MAIT                                                         CD8 MAIT
CD8aa                                                               CD8aa
CM CD4 T cell                                               CM CD4 T cell
CM CD8 T cell                                               CM CD8 T cell
Core naive CD4 T cell                               Core naive CD4 T cell
Core naive CD8 T cell                               Core naive CD8 T cell
DN T cell                                                       DN T cell
GZMB- CD27- EM CD4 T cell                       GZMB- CD27- EM CD4 T cell
GZMB- CD27+ EM CD4 T cell                       GZMB- CD27+ EM CD4 T cell
GZMB+ Vd2 gdT                                               GZMB+ Vd2 gdT
GZMK- CD27+ EM CD8 T cell                       GZMK- CD27+ EM CD8 T cell
GZMK+ CD27+ EM CD8 T cell             

2024-04-10 21:04:48.84752 : Batching Peak Calls!, 0.007 mins elapsed.

2024-04-10 21:04:48.853907 : Batch Execution w/ safelapply!, 0 mins elapsed.

2024-04-10 21:09:54.880916 : Identifying Reproducible Peaks!, 5.107 mins elapsed.

2024-04-10 21:10:17.351342 : Creating Union Peak Set!, 5.482 mins elapsed.

Converged after 12 iterations!

Plotting Ggplot!

2024-04-10 21:10:28.735155 : Finished Creating Union Peak Set (182757)!, 5.672 mins elapsed.



In [46]:
projHeme1 <- addPeakMatrix(projHeme1, force = TRUE)

ArchR logging to : ArchRLogs/ArchR-addPeakMatrix-81f13be4f3f2-Date-2024-04-10_Time-21-10-28.752561.log
If there is an issue, please report to github with logFile!

2024-04-10 21:10:28.929324 : Batch Execution w/ safelapply!, 0 mins elapsed.

Overriding previous entry for ReadsInPeaks

Overriding previous entry for FRIP

ArchR logging successful to : ArchRLogs/ArchR-addPeakMatrix-81f13be4f3f2-Date-2024-04-10_Time-21-10-28.752561.log



In [47]:
library(BSgenome.Hsapiens.UCSC.hg38)
library(TFBSTools)
library(JASPAR2020)
tair.motif=getMatrixSet(x=JASPAR2020,opts=list(collection="CORE",species="9606",matrixtype="PWM"))

In [48]:
projHeme1=addMotifAnnotations(projHeme1,name="JASPARMotif",motifPWMs=tair.motif,force=TRUE)

ArchR logging to : ArchRLogs/ArchR-addMotifAnnotations-81f1519c92b9-Date-2024-04-10_Time-21-14-25.020458.log
If there is an issue, please report to github with logFile!

peakAnnotation name already exists! Overriding.

2024-04-10 21:14:25.100729 : Gettting Motif Set, Species : , 0.001 mins elapsed.

2024-04-10 21:14:25.640346 : Finding Motif Positions with motifmatchr!, 0.01 mins elapsed.

2024-04-10 21:16:40.810333 : All Motifs Overlap at least 1 peak!, 2.263 mins elapsed.

2024-04-10 21:16:40.813341 : Creating Motif Overlap Matrix, 2.263 mins elapsed.

2024-04-10 21:16:44.200172 : Finished Getting Motif Info!, 2.32 mins elapsed.

ArchR logging successful to : ArchRLogs/ArchR-addMotifAnnotations-81f1519c92b9-Date-2024-04-10_Time-21-14-25.020458.log



In [49]:
projHeme1 <- addBgdPeaks(projHeme1, force = TRUE)

Identifying Background Peaks!



In [50]:
projHeme1 <- addDeviationsMatrix(
  ArchRProj = projHeme1, 
  peakAnnotation = "JASPARMotif", # must match the name given to addMotifAnnotations()
  force = TRUE
)


Using Previous Background Peaks!

ArchR logging to : ArchRLogs/ArchR-addDeviationsMatrix-81f140965259-Date-2024-04-10_Time-21-17-02.747311.log
If there is an issue, please report to github with logFile!



NULL


2024-04-10 21:17:08.812261 : Batch Execution w/ safelapply!, 0 mins elapsed.

###########
2024-04-10 22:34:47.905838 : Completed Computing Deviations!, 77.753 mins elapsed.
###########

ArchR logging successful to : ArchRLogs/ArchR-addDeviationsMatrix-81f140965259-Date-2024-04-10_Time-21-17-02.747311.log



In [51]:
projHeme1


    



class: ArchRProject 
outputDirectory: /home/jupyter/BRI_Analysis/scRNA/Analysis-Cross-Sectional/TEAseq/PenSen_ATAC_L2 
samples(16): GSM6611363_B065-P1_PB00593-04
  GSM6611364_B069-P1_PB00323-02 ... GSM6611377_B065-P1_PB00192-02
  GSM6611378_B065-P1_PB00197-02
sampleColData names(1): ArrowFiles
cellColData names(18): Sample TSSEnrichment ... ReadsInPeaks FRIP
numberOfCells(1): 287254
medianTSS(1): 25.363
medianFrags(1): 6890

In [52]:
projHeme1 <- addImputeWeights(projHeme1)


ArchR logging to : ArchRLogs/ArchR-addImputeWeights-81f1382c0265-Date-2024-04-10_Time-22-56-09.887072.log
If there is an issue, please report to github with logFile!

2024-04-10 22:56:09.927168 : Computing Impute Weights Using Magic (Cell 2018), 0 mins elapsed.

Filtering 1 dims correlated > 0.75 to log10(depth + 1)



In [53]:
saveArchRProject(ArchRProj = projHeme1, outputDirectory = "PenSen_ATAC_L3", load = TRUE)

Copying ArchRProject to new outputDirectory : /home/jupyter/BRI_Analysis/scRNA/Analysis-Cross-Sectional/TEAseq/PenSen_ATAC_L3

Copying Arrow Files...

Copying Arrow Files (1 of 16)

Copying Arrow Files (2 of 16)

Copying Arrow Files (3 of 16)

Copying Arrow Files (4 of 16)

Copying Arrow Files (5 of 16)

Copying Arrow Files (6 of 16)

Copying Arrow Files (7 of 16)

Copying Arrow Files (8 of 16)

Copying Arrow Files (9 of 16)

Copying Arrow Files (10 of 16)

Copying Arrow Files (11 of 16)

Copying Arrow Files (12 of 16)

Copying Arrow Files (13 of 16)

Copying Arrow Files (14 of 16)

Copying Arrow Files (15 of 16)

Copying Arrow Files (16 of 16)

Getting ImputeWeights

Dropping ImputeWeights...

Copying Other Files...

Copying Other Files (1 of 6): Annotations

Copying Other Files (2 of 6): Background-Peaks.rds

Copying Other Files (3 of 6): GroupCoverages

Copying Other Files (4 of 6): IterativeLSI

Copying Other Files (5 of 6): PeakCalls

Copying Other Files (6 of 6): Plots

Saving Ar

class: ArchRProject 
outputDirectory: /home/jupyter/BRI_Analysis/scRNA/Analysis-Cross-Sectional/TEAseq/PenSen_ATAC_L3 
samples(16): GSM6611374_B076-P1_PB00127-02
  GSM6611366_B076-P1_PB00353-03 ... GSM6611363_B065-P1_PB00593-04
  GSM6611372_B069-P1_PB00172-02
sampleColData names(1): ArrowFiles
cellColData names(18): Sample TSSEnrichment ... ReadsInPeaks FRIP
numberOfCells(1): 287254
medianTSS(1): 25.363
medianFrags(1): 6890

In [None]:
projHeme1 <- addMotifAnnotations(ArchRProj = projHeme1, motifSet = "cisbp", name = "cisbp", force = TRUE)
projHeme1 <- addBgdPeaks(projHeme1,force = TRUE)
projHeme1 <- addDeviationsMatrix(
  ArchRProj = projHeme1, 
  peakAnnotation = "cisbp", # must match the name given to addMotifAnnotations()
  force = TRUE,
  threads = 50
)

saveArchRProject(ArchRProj = projHeme1, outputDirectory = "PenSen_ATAC_L3_cisbp", load = TRUE)

ArchR logging to : ArchRLogs/ArchR-addMotifAnnotations-81f15e32463c-Date-2024-04-10_Time-23-17-52.434011.log
If there is an issue, please report to github with logFile!

peakAnnotation name already exists! Overriding.

2024-04-10 23:17:52.489305 : Gettting Motif Set, Species : Homo sapiens, 0.001 mins elapsed.

Using version 2 motifs!

2024-04-10 23:17:54.16636 : Finding Motif Positions with motifmatchr!, 0.029 mins elapsed.

2024-04-10 23:21:45.984995 : All Motifs Overlap at least 1 peak!, 3.892 mins elapsed.

2024-04-10 23:21:45.997494 : Creating Motif Overlap Matrix, 3.893 mins elapsed.

2024-04-10 23:21:49.75955 : Finished Getting Motif Info!, 3.955 mins elapsed.

ArchR logging successful to : ArchRLogs/ArchR-addMotifAnnotations-81f15e32463c-Date-2024-04-10_Time-23-17-52.434011.log

Identifying Background Peaks!

Using Previous Background Peaks!

ArchR logging to : ArchRLogs/ArchR-addDeviationsMatrix-81f137615a70-Date-2024-04-10_Time-23-22-15.843755.log
If there is an issue, please

NULL


2024-04-10 23:22:21.642919 : Batch Execution w/ safelapply!, 0 mins elapsed.

