## The preparation for plotting

### Loading data

We use a simulated dataset from *SymSim* to demonstrate the plotting functions. The raw expression matrix and the cell annotation are stored as *.txt* files, and the R code below reads them.

In [1]:
data.d4 <- read.table("DemoData/simulated_counts_discrete_Sigma0.40.txt", header = TRUE)
annotation.d4 <- read.table("DemoData/cell_labels_discrete_Sigma0.40.txt", header = TRUE)

In [2]:
# create placeholder gene names for the expression matrix
data.d4 <- as.matrix(data.d4)

# one name per row: Gene1, Gene2, ...
rownames(data.d4) <- paste0("Gene", seq_len(nrow(data.d4)))
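
Before building the *Seurat* object, it can help to confirm that the matrix and the labels line up. The sketch below uses a small made-up matrix rather than the SymSim files. `check_inputs` is a hypothetical helper, and it assumes that genes are in rows, cells are in columns, and the annotation has one row per cell.

```r
# Hypothetical helper: verify that a count matrix and a label table agree.
# Assumes genes in rows, cells in columns, one annotation row per cell.
check_inputs <- function(counts, labels) {
  stopifnot(is.matrix(counts), is.data.frame(labels))
  stopifnot(ncol(counts) == nrow(labels))       # one label per cell
  stopifnot(!anyDuplicated(rownames(counts)))   # gene names must be unique
  invisible(TRUE)
}

# Toy data standing in for the real files (values are made up)
toy.counts <- matrix(rpois(12, lambda = 5), nrow = 4, ncol = 3)
rownames(toy.counts) <- paste0("Gene", seq_len(nrow(toy.counts)))
toy.labels <- data.frame(pop = c(1, 1, 2))

check_inputs(toy.counts, toy.labels)
```

If the same assumptions hold for the real data, the equivalent call would be `check_inputs(data.d4, annotation.d4)`.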

### Creating Seurat project

The preprocessing steps rely on the *Seurat* package: we build the shared nearest neighbor (SNN) graph with it and then run HGC on that graph. Our environment is listed below.

R version 4.0.0 (2020-04-24)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

attached base packages: stats, graphics, grDevices, utils, datasets, methods, base
other attached packages: Seurat_3.2.1

In [4]:
library(Seurat)
library(HGC)

In [6]:
# preprocessing with Seurat, the parameters follow the default settings
set.seed(82225)
data.d4.seuratobj <- CreateSeuratObject(counts = data.d4, min.cells = 20)
data.d4.seuratobj <- NormalizeData(object = data.d4.seuratobj, verbose = FALSE)
data.d4.seuratobj <- ScaleData(object = data.d4.seuratobj, features = row.names(data.d4.seuratobj), verbose = FALSE)
data.d4.seuratobj <- FindVariableFeatures(object = data.d4.seuratobj, nfeatures = 2000, verbose = FALSE)
data.d4.seuratobj <- RunPCA(object = data.d4.seuratobj, npcs = 100, verbose = FALSE)
data.d4.seuratobj <- RunTSNE(object = data.d4.seuratobj, reduction.name = "tsne", perplexity = 30, dims = 1:25, verbose = FALSE)
data.d4.seuratobj <- FindNeighbors(object = data.d4.seuratobj, nn.eps = 0.5, k.param = 30, dims = 1:25, verbose = FALSE)

In [7]:
# add the cell annotation to the Seurat object
data.d4.seuratobj$cell.type <- annotation.d4$pop

### Running HGC

The SNN graph is converted to a sparse matrix with the *as.sparse* function from *Seurat*. We then save the clustering tree and the *Seurat* object in an *.Rdata* file for plotting.

In [8]:
data.d4.graph <- as.sparse(data.d4.seuratobj@graphs$RNA_snn)
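
A quick structural check on the graph can catch problems before clustering. The sketch below builds a small made-up adjacency matrix with the *Matrix* package (a dependency of *Seurat*) and tests two properties we would expect of an undirected SNN graph: it is square and symmetric. The toy weights are illustrative only, and the expectation of symmetry is our assumption rather than a documented HGC requirement.

```r
library(Matrix)  # sparse matrix classes

# Toy symmetric adjacency matrix for 4 nodes (weights are made up)
toy.graph <- sparseMatrix(
  i = c(1, 2, 1, 3, 3, 4),
  j = c(2, 1, 3, 1, 4, 3),
  x = c(0.5, 0.5, 0.2, 0.2, 0.8, 0.8),
  dims = c(4, 4)
)

nrow(toy.graph) == ncol(toy.graph)   # square: one row and one column per node
isSymmetric(toy.graph)               # undirected: weight(i, j) equals weight(j, i)
```

The same two expressions can be evaluated on `data.d4.graph`.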

In [9]:
# run HGC to get the clustering tree and save the results
clus.tree <- HGC.paris(G = data.d4.graph)
clus.tree


Call:
NA

Cluster method   : HGC 
Distance         : Node pair sampling ratio 
Number of objects: 1000 
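
The printed summary has the layout that base R uses for `hclust` objects, which suggests the tree can be cut into flat clusters with `cutree`. The sketch below shows the idea on a toy `hclust` tree built from made-up points. Whether `clus.tree` accepts the same call is an assumption based on that printed output.

```r
# Toy stand-in for clus.tree: a base R hclust object on made-up points
set.seed(1)
toy.points <- rbind(matrix(rnorm(20, mean = 0), ncol = 2),
                    matrix(rnorm(20, mean = 10), ncol = 2))
toy.tree <- hclust(dist(toy.points))

# Cut the tree into k groups; the result holds one cluster label per object
toy.clusters <- cutree(toy.tree, k = 2)
table(toy.clusters)
```

Under that assumption, `cutree(clus.tree, k = 5)` would return one label for each of the 1000 cells, with `k = 5` chosen here only as an example value.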


In [10]:
save(clus.tree, data.d4.seuratobj, file = "DemoData/datad4objects.Rdata")
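
For reference, objects written with `save` are restored under their original names by `load`. The sketch below runs the round trip with made-up objects and a temporary file, so it does not touch *DemoData*. Loading into a separate environment is our choice here, to avoid overwriting variables in the workspace.

```r
# Round trip with made-up objects and a temporary file
toy.tree <- hclust(dist(matrix(1:10, ncol = 2)))
toy.meta <- data.frame(cell = 1:5)
tmp <- tempfile(fileext = ".Rdata")
save(toy.tree, toy.meta, file = tmp)

# Restore into a fresh environment; load() returns the object names
restored <- new.env()
loaded.names <- load(tmp, envir = restored)
loaded.names
```

In a plotting session, `load("DemoData/datad4objects.Rdata")` would bring back `clus.tree` and `data.d4.seuratobj`.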