# NetID Tutorial

In this tutorial, I use a hematopoietic single-cell RNA-seq dataset to show how to run network inference with NetID. First, load the NetID package and the single-cell dataset.

In [1]:
library(NetID)
sce <- readRDS("blood_sce2.rds")

The sce object contains the spliced and unspliced read count matrices and the cell metadata. To evaluate performance, I also load a set of transcription factor genes and a tissue-nonspecific GRN curated from ChIP-seq datasets, which serves as the ground truth.

In [2]:
sce

class: SingleCellExperiment 
dim: 909 4444 
metadata(0):
assays(2): spliced unspliced
rownames(909): Aatf Abl1 ... Zswim4 Zzz3
rowData names(0):
colnames(4444): AATTCAGTATCACCGAG GAACGCCATTGTAATCCC ...
  ATTTGGGAGATCAAGTG AGTGGATGGCTGTTCTT
colData names(7): orig.ident nCount_RNA ... seurat_clusters celltype
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):

In [3]:
TF <- read.csv("mouse-tfs.csv",header=TRUE,stringsAsFactors=FALSE)
gt_net <- read.csv("Non-Specific-ChIP-seq-network.csv",header=TRUE, stringsAsFactors=FALSE)

## 1. Learning GRN skeleton from sketched and aggregated single cell RNA-seq datasets
In the first step, NetID samples the single-cell RNA-seq dataset with a sketching method, e.g. geosketch or Seurat sketch. These methods sample cells as “meta-cells” that cover the whole latent manifold of the single-cell transcriptome. Each nearest neighbor of the meta-cells is then assigned to exactly one meta-cell according to the edge P-values calculated by VarID, and the assigned cells are aggregated. The resulting gene expression profiles (GEPs) of the meta-cells are used as input to a network inference method such as GENIE3 (the default) to learn the basic network structure.
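The sample, assign, and aggregate idea can be illustrated with a small base R sketch on simulated counts. This is not NetID's implementation: uniform random sampling stands in for geosketch or Seurat sketch, and nearest-seed assignment by Euclidean distance stands in for the VarID edge P-values. All object names here are hypothetical.

```r
# Toy sketch-and-aggregate example in base R (no NetID calls).
set.seed(1)

n_genes <- 20
n_cells <- 60
counts <- matrix(rpois(n_genes * n_cells, lambda = 5),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", 1:n_genes),
                                 paste0("cell", 1:n_cells)))

# Step 1: pick seed cells ("meta-cells").
# Assumption: uniform random sampling replaces a real sketching method.
n_meta <- 6
seeds <- sort(sample(n_cells, n_meta))

# Step 2: assign every cell to exactly one seed.
# Assumption: nearest seed by Euclidean distance on log counts replaces
# the P-value-based assignment described in the text.
log_counts <- log1p(counts)
d <- as.matrix(dist(t(log_counts)))          # cell-by-cell distance matrix
nearest <- apply(d[, seeds, drop = FALSE], 1, which.min)
assignment <- seeds[nearest]                 # seed index for each cell

# Step 3: sum the raw counts of all cells assigned to the same seed.
meta_counts <- t(rowsum(t(counts), group = assignment))
colnames(meta_counts) <- colnames(counts)[as.integer(colnames(meta_counts))]

dim(meta_counts)   # genes x meta-cells
```

Because every cell goes to exactly one meta-cell, the total count per gene is preserved by the aggregation, which gives a quick sanity check on the assignment.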

In [4]:
library(scater)
dyn.out <- RunNetID(sce, regulators = TF$TF, targets = TF$TF, netID_params = list(normalize = FALSE), dynamicInfer = FALSE)

Find VarID object at local dictionary, Read VarID object...
Using NetID to perform skeleton estimation...
prune sampled neighbourhoods according to q-value...
assign weight for edges using p-value...
aggregated matrix: the number of genes:895; the number of samples:424


Tree method: RF
K: sqrt
Number of trees: 500


Using 12 cores.



Done...


This step only outputs the global GRN learned from all sampled meta-cells. To learn lineage-specific GRNs, we first need to run CellRank and Palantir. NetID provides a function, FateDynamic, for end-to-end cell fate analysis with CellRank and Palantir.

In [7]:
FateDynamic(sce,
            global_params = list(cluster_label = "celltype", min_counts = 0, nhvgs = 3000),
            palantir_params = list(start_cell = "TGACACAAGGCCTTCAGGT"))

Using palantir to perform cell fate analysis...
Using cellrank with stochastic mode velocity to perform cell fate analysis...


In [None]:
dyn.out <- RunNetID(sce, regulators = TF$TF, targets = TF$TF, netID_params = list(normalize = FALSE), dynamicInfer = TRUE)

## 2. Learning cell-fate-specific GRNs with a Granger causal regression model
The dyn.out object contains the skeleton of the global network (learned from all cells without lineage information) and the cell fate probabilities. For each target gene, NetID fits an L2-penalized Granger regression model to re-estimate the regulatory coefficients of the skeleton, using the cell fate probability as the “time series”, and then aggregates the learned coefficients with the global coefficients by rank aggregation.
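To make the idea of an L2-penalized Granger regression concrete, here is a minimal base R sketch on simulated series. It is an illustration under stated assumptions, not the code inside FateCausal: the series, the lag count, the penalty `lambda`, and the scoring rule are all hypothetical, and an ordinary time index stands in for the fate-probability ordering.

```r
# Toy ridge-penalized Granger regression in base R (no NetID calls).
set.seed(42)
n_t  <- 200   # number of ordered samples (hypothetical "time" points)
lags <- 2     # number of lags used as predictors

# Simulate two regulators and one target; only reg1 drives the target (lag 1).
reg1 <- rnorm(n_t)
reg2 <- rnorm(n_t)
noise <- rnorm(n_t, sd = 0.1)
target <- numeric(n_t)
for (i in 2:n_t) {
  target[i] <- 0.8 * reg1[i - 1] + noise[i]
}

# Lagged design matrix: each series at lags 1..lags predicts the target now.
series <- cbind(reg1 = reg1, reg2 = reg2, target = target)
rows <- (lags + 1):n_t
X <- do.call(cbind, lapply(seq_len(lags), function(l) {
  lagged <- series[rows - l, , drop = FALSE]
  colnames(lagged) <- paste0(colnames(series), "_lag", l)
  lagged
}))
y <- target[rows]

# Ridge solution on centered data: beta = (X'X + lambda * I)^(-1) X'y.
lambda <- 1
Xc <- scale(X, center = TRUE, scale = FALSE)
yc <- y - mean(y)
beta <- drop(solve(crossprod(Xc) + lambda * diag(ncol(Xc)), crossprod(Xc, yc)))
names(beta) <- colnames(X)

# Score each regulator by the summed absolute coefficients over its lags.
score <- sapply(c("reg1", "reg2"), function(g) {
  sum(abs(beta[grepl(paste0("^", g, "_lag"), names(beta))]))
})
score
```

In this simulation the score for `reg1` should clearly exceed that for `reg2`, since only `reg1` carries information about the future of the target. Including the target's own lags in the design is what makes this a Granger-style test rather than a plain lagged correlation.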

In [None]:
GRN <- FateCausal(dyn.out, velo_infor = FALSE, L = 10, aggre_method = "RobustRankAggreg")

The output of FateCausal is a list containing the lineage-specific GRNs.
Finally, we benchmark the learned lineage-specific GRNs against the network learned by GENIE3 in the first stage, using the ChIP-seq network as the ground truth.
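
The tutorial does not state which metric is used for the benchmark, so the following base R sketch shows one common choice, early precision, on toy data. The edge lists, the column names (`regulator`, `target`, `weight`), and the helper functions are hypothetical; the real columns of `gt_net` and of the FateCausal output may differ.

```r
# Toy early-precision benchmark in base R (no NetID calls).

# Hypothetical predicted edges; a higher weight means a more confident edge.
pred <- data.frame(
  regulator = c("Gata1", "Gata1", "Spi1", "Spi1", "Cebpa", "Cebpa"),
  target    = c("Klf1", "Spi1", "Cebpa", "Gata1", "Klf1", "Spi1"),
  weight    = c(0.9, 0.8, 0.7, 0.4, 0.3, 0.1),
  stringsAsFactors = FALSE
)

# Hypothetical ground-truth edges (column names are assumptions).
truth <- data.frame(
  regulator = c("Gata1", "Spi1", "Cebpa"),
  target    = c("Klf1", "Cebpa", "Spi1"),
  stringsAsFactors = FALSE
)

# Encode a directed edge as a single string so edges can be matched.
edge_key <- function(df) paste(df$regulator, df$target, sep = "->")

# Early precision: fraction of true edges among the top-k predictions,
# where k is the number of ground-truth edges.
early_precision <- function(pred, truth) {
  k <- min(nrow(truth), nrow(pred))
  ranked <- pred[order(pred$weight, decreasing = TRUE), ]
  top <- ranked[seq_len(k), ]
  mean(edge_key(top) %in% edge_key(truth))
}

ep <- early_precision(pred, truth)

# Baseline: expected precision of a random ranking, assuming all true edges
# are among the candidate edges.
random_ep <- nrow(truth) / nrow(pred)
c(early_precision = ep, ratio_vs_random = ep / random_ep)
```

Running the same function on the GENIE3 skeleton and on each lineage-specific network, with the same ground truth, would give directly comparable scores.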