# Dataset integration 

We will integrate the Neurons 5K dataset with a 5,000-cell subset of the larger 10x Genomics dataset [1.3 million brain cells from E18 mouse](https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.3.0/1M_neurons).

In [None]:
suppressPackageStartupMessages({
  library(dplyr)
  library(patchwork)
  library(Seurat)
  library(SummarizedExperiment)
  library(TENxBrainData)
})

## Load data

### Pre-processing of the reference data

There is no need to run the pre-processing during the workshop; just load the prepared objects.

In [None]:
load(file='../data/objects/a3.refseurat.RData',verbose = TRUE)
ref.sobj[["Dataset"]]<-'nr1M'
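
For reference, an object of this kind could be built roughly as follows. This is only a sketch under assumptions, not the workshop's actual pre-processing: it assumes a simple random subset of 5,000 cells (the seed and the sampling scheme are hypothetical), gene symbols as feature names, and `SCTransform` normalization. Note that the first call to `TENxBrainData()` downloads several gigabytes of data.

```r
library(Seurat)
library(SummarizedExperiment)
library(TENxBrainData)

# HDF5-backed SingleCellExperiment with all ~1.3M cells (downloaded and cached on first use)
tenx <- TENxBrainData()

# Assumption: a simple random subset of 5,000 cells; the seed is arbitrary
set.seed(42)
idx <- sort(sample(ncol(tenx), 5000))

# Load only the selected columns into memory, then name rows and columns
counts <- as.matrix(assay(tenx[, idx], "counts"))
rownames(counts) <- make.unique(as.character(rowData(tenx)$Symbol))
colnames(counts) <- make.unique(as.character(colData(tenx)$Barcode[idx]))

# Build the Seurat object and normalize it with SCTransform
ref.sobj <- CreateSeuratObject(counts = counts, project = "nr1M")
ref.sobj <- SCTransform(ref.sobj, verbose = FALSE)
```

Sorting the sampled indices keeps the reads from the HDF5 file in on-disk order, which tends to be faster than random access.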

In [None]:
load(file="../data/objects/a2.neur5k.RData",verbose = TRUE)
nr5k[['Dataset']]<-'nr5k'
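
Before integrating, it can help to confirm that both objects carry the `Dataset` label and an `SCT` assay, since the SCT-based integration below presumes that each object was normalized with `SCTransform`. A minimal check, assuming the two `.RData` files define `ref.sobj` and `nr5k` as in the cells above:

```r
library(Seurat)

load(file = '../data/objects/a3.refseurat.RData')
load(file = '../data/objects/a2.neur5k.RData')
ref.sobj[["Dataset"]] <- 'nr1M'
nr5k[["Dataset"]] <- 'nr5k'

for (obj in list(nr5k, ref.sobj)) {
  # Number of cells per dataset label
  print(table(obj$Dataset))
  # Assays stored in the object; "SCT" is expected if SCTransform was run
  print(Assays(obj))
}

# Number of genes present in both objects
length(intersect(rownames(nr5k), rownames(ref.sobj)))
```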

## Integrate datasets

In [None]:
# Allow about 4 GB of exported globals for the future framework; warn = -1 silences R warnings
options(future.globals.maxSize = 4000 * 1024^2, future.seed = NULL, warn = -1)

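objects <- list(nr5k, ref.sobj)
# Select features that are variable across both datasets
features <- SelectIntegrationFeatures(object.list = objects, nfeatures = 1000)
# Make sure the SCT residuals for those features are available in every object
objects <- PrepSCTIntegration(object.list = objects, anchor.features = features, verbose = FALSE)

# Find anchors between the datasets, then integrate
anchors <- FindIntegrationAnchors(object.list = objects, normalization.method = "SCT", anchor.features = features, verbose = FALSE)
nr.int <- IntegrateData(anchorset = anchors, normalization.method = "SCT", verbose = FALSE)

# Save the integrated object for the following sections
save(nr.int, file = '../data/objects/a3.integrated.RData')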

In [None]:
load(file='../data/objects/a3.integrated.RData',verbose=TRUE)
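
A quick look at the integrated object can confirm that the integration produced what is expected. The sketch below assumes that `IntegrateData()` stored its result in an assay named `integrated` (the Seurat default) and that the `Dataset` labels set earlier were carried over:

```r
library(Seurat)

load(file = '../data/objects/a3.integrated.RData')

# Assays in the object and the one used by default
Assays(nr.int)
DefaultAssay(nr.int)

# Cells contributed by each dataset
table(nr.int$Dataset)

# Dimensions (features x cells) of the integrated assay
dim(nr.int[["integrated"]])
```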

## Downstream analysis

### Hands-on activity 2

---

Repeat the downstream analysis steps that we ran on the Neurons 5K dataset in the previous section, this time on the integrated data:
1. Dimensionality reduction 
2. Clustering
3. Cell type annotation 

Can you find more cell types?

If you do not have the output from the previous sections, just load the following RData object:

In [None]:
load(file='../data/objects/a3.integrated.RData',verbose=TRUE)
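
One possible route through the three steps is sketched below. The number of principal components, the clustering resolution, and the marker genes (`Gad1`, `Slc17a6`, `Eomes`) are illustrative assumptions, not values taken from the previous section; adjust them to match what you used there.

```r
library(Seurat)
library(patchwork)

load(file = '../data/objects/a3.integrated.RData')

# 1. Dimensionality reduction on the integrated assay
#    (the SCT integration workflow does not call ScaleData here)
DefaultAssay(nr.int) <- "integrated"
nr.int <- RunPCA(nr.int, npcs = 30, verbose = FALSE)
nr.int <- RunUMAP(nr.int, reduction = "pca", dims = 1:30, verbose = FALSE)

# 2. Graph-based clustering on the same principal components
nr.int <- FindNeighbors(nr.int, reduction = "pca", dims = 1:30, verbose = FALSE)
nr.int <- FindClusters(nr.int, resolution = 0.5, verbose = FALSE)

# Compare dataset mixing and cluster structure side by side
DimPlot(nr.int, reduction = "umap", group.by = "Dataset") +
  DimPlot(nr.int, reduction = "umap", label = TRUE)

# 3. Cell type annotation: inspect hypothetical marker genes on the SCT assay
if ("SCT" %in% Assays(nr.int)) {
  DefaultAssay(nr.int) <- "SCT"
  # Keep only the markers that are actually present in the object
  markers <- intersect(c("Gad1", "Slc17a6", "Eomes"), rownames(nr.int))
  if (length(markers) > 0) print(FeaturePlot(nr.int, features = markers))
}
```

If cells from `nr5k` and `nr1M` overlap well in the first UMAP, clusters that contain cells from both datasets are good candidates for shared cell types, while clusters dominated by one dataset may point to cell types that the other one lacks.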