# scMLnet

**Language:** R (for statistical analysis) and Python (for visualization)

**Paper:** Cheng, J., et al., Inferring microenvironmental regulation of gene expression from single-cell RNA sequencing data using scMLnet with an application to COVID-19. Brief Bioinform, 2020.

**Code Resource:** https://github.com/SunXQlab/scMLnet

**Claiming:** Inference of intercellular and intracellular signaling networks (ligand-receptor-TF-target gene)

**Method:** Fisher's exact test and correlation, overlapping molecules

**Database:** 1. ligand-receptor information: 2557 pairs (from the DLRP, IUPHAR, HPMR, HPRD, and STRING databases and previous studies); 2. receptor-TF information: 39141 pairs (from the STRING database); 3. TF-target gene information (from the TRED, KEGG, GeneCards, and TRANSFAC databases)


## Input
1. scRNA-seq data: raw gene-expression matrix with genes as rows and cells as columns
2. Cell type annotation: one cluster label per cell barcode
3. Cell types of interest: the sender (ligand) cluster and the receiver (receptor) cluster
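Before running the analysis it can help to confirm that the three inputs agree with each other. The following is a minimal sketch on toy data, not part of scMLnet: `check_inputs` is a hypothetical helper, and the column names `Barcode` and `Cluster` are taken from the annotation table shown later in this tutorial.

```r
library(Matrix)

set.seed(42)

# Toy inputs in the expected shape: genes x cells counts plus an annotation table.
barcodes <- paste0("cell", 1:6)
genes <- paste0("gene", 1:4)
counts <- Matrix(
  matrix(rpois(24, lambda = 1), nrow = 4, dimnames = list(genes, barcodes)),
  sparse = TRUE
)
annotation <- data.frame(
  Barcode = barcodes,
  Cluster = rep(c("B cells", "Secretory", "T cells"), each = 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: stops with an error if the inputs are inconsistent,
# otherwise returns the number of cells in the sender and receiver clusters.
check_inputs <- function(counts, annotation, sender, receiver) {
  stopifnot(
    all(c("Barcode", "Cluster") %in% colnames(annotation)),
    all(colnames(counts) %in% annotation$Barcode),
    sender %in% annotation$Cluster,
    receiver %in% annotation$Cluster
  )
  table(annotation$Cluster)[c(sender, receiver)]
}

cell_counts <- check_inputs(counts, annotation, "B cells", "Secretory")
print(cell_counts)
```

A cluster with very few cells is worth noticing at this point, since the later statistical tests are run per cluster.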

For this tutorial, we use scMLnet to construct the multi-layer signaling network between B cells and Secretory cells from scRNA-seq data of bronchoalveolar lavage fluid (BALF) from COVID-19 patients.

Expression matrix and annotation: https://zenodo.org/record/4267609#.YNVskzoRVhF

Prior information (ligand-receptor, receptor-TF, and TF-target gene tables): https://zenodo.org/record/5031204#.YNVtkzoRVhE

## Installation


### Preparation

In [1]:
# Required packages
    library(Seurat)
    library(Matrix)
    library(parallel)
    library(scMLnet)

Attaching SeuratObject



In [2]:
# Input data
    # import sample data
    GCMat <- readRDS("data.Rdata")
    GCMat <- as(GCMat, "dgCMatrix")
    
    # import sample annotation
    BarCluFile <- "barcodetype.txt"
    BarCluTable <- read.table(BarCluFile,sep = "\t",header = TRUE,stringsAsFactors = FALSE)

In [7]:
head(as.matrix(GCMat))
head(BarCluTable)

Unnamed: 0,AAACCTGAGATGTCGG-1_5,AAACCTGAGGCTCATT-1_5,AAACCTGCAATCCGAT-1_5,AAACCTGCATGGTCAT-1_5,AAACCTGGTTTAGCTG-1_5,AAACCTGTCAATCACG-1_5,AAACCTGTCCGAGCCA-1_5,AAACCTGTCCTCCTAG-1_5,AAACGGGAGAACTCGG-1_5,AAACGGGAGTCAAGCG-1_5,⋯,TTTGGTTCACGAGAGT-1_13,TTTGGTTCACTATCTT-1_13,TTTGGTTCATCGGGTC-1_13,TTTGGTTGTCTCTCTG-1_13,TTTGGTTTCAAACCAC-1_13,TTTGGTTTCGCTTGTC-1_13,TTTGTCAAGAAACGCC-1_13,TTTGTCACAACACCCG-1_13,TTTGTCAGTCTGCAAT-1_13,TTTGTCAGTGTTGGGA-1_13
AL627309.1,0,0,0,0,0,0,0,0,0,0,⋯,0,0,0,0,0,0,0,0,0,0
AL669831.5,0,0,0,0,0,0,0,0,0,0,⋯,0,0,0,0,0,0,0,0,0,0
FAM87B,0,0,0,0,0,0,0,0,0,0,⋯,0,0,0,0,0,0,0,0,0,0
LINC00115,0,0,0,0,0,0,0,0,0,0,⋯,0,0,0,0,0,0,0,0,0,0
FAM41C,0,0,0,0,0,0,0,0,0,0,⋯,0,0,0,0,0,0,0,0,0,0
NOC2L,0,1,0,0,0,0,0,0,0,1,⋯,0,1,0,0,1,0,0,0,2,0


| | Barcode | Cluster |
|---|---|---|
| | `<chr>` | `<chr>` |
| 1 | AAACCTGAGATGTCGG-1_5 | T cells |
| 2 | AAACCTGAGGCTCATT-1_5 | macrophages |
| 3 | AAACCTGCAATCCGAT-1_5 | macrophages |
| 4 | AAACCTGCATGGTCAT-1_5 | T cells |
| 5 | AAACCTGGTTTAGCTG-1_5 | T cells |
| 6 | AAACCTGTCAATCACG-1_5 | T cells |


Next, we define the sender and receiver cell types whose communication we want to explore. In this example, we focus on the inter-/intracellular signaling network with B cells as senders and Secretory cells as receivers.

In [8]:
    types <- unique(BarCluTable$Cluster)
    
    LigClu <- "B cells"       #types[4]
    RecClu <- "Secretory"     #types[8]

### Default Parameters

In [9]:
    pval <- 0.05
    logfc <- 0.15
    LigRecLib <- "LigRec.txt"
    TFTarLib <- "TFTargetGene.txt"
    RecTFLib <- "RecTF.txt"
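`pval` and `logfc` act as thresholds when scMLnet selects the highly expressed genes of each cluster, and the run log further down mentions a t-test for this step. The sketch below illustrates that kind of filter on simulated counts. It is an assumption-laden illustration, not scMLnet's code: the one-sided Welch t-test, the pseudocount of 1, and the natural-log fold change are all choices made here for the example.

```r
set.seed(1)

pval <- 0.05
logfc <- 0.15

# Simulated counts: 50 genes, 30 cells inside the cluster, 60 cells outside.
n_genes <- 50
in_cluster <- matrix(rpois(n_genes * 30, lambda = 2), nrow = n_genes)
out_cluster <- matrix(rpois(n_genes * 60, lambda = 2), nrow = n_genes)

# Make the first five genes clearly higher inside the cluster.
in_cluster[1:5, ] <- matrix(rpois(5 * 30, lambda = 8), nrow = 5)
rownames(in_cluster) <- rownames(out_cluster) <- paste0("gene", seq_len(n_genes))

# Log fold change of mean expression (pseudocount of 1 is an assumption).
log_fc <- log(rowMeans(in_cluster) + 1) - log(rowMeans(out_cluster) + 1)

# One-sided t-test per gene: is expression higher inside the cluster?
p_values <- vapply(
  seq_len(n_genes),
  function(i) t.test(in_cluster[i, ], out_cluster[i, ], alternative = "greater")$p.value,
  numeric(1)
)

# Keep genes that pass both thresholds.
high_genes <- rownames(in_cluster)[p_values < pval & log_fc > logfc]
print(high_genes)
```

Raising `logfc` or lowering `pval` shrinks this gene list, which in turn shrinks every downstream layer of the network.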

### Construction of Multi-layer Signaling Networks
Running the code exactly as given in the original tutorial (https://github.com/SunXQlab/scMLnet/blob/master/vignettes/Tutorial_of_scMLnet.md) produces errors. As a workaround, we convert *GCMat* to a dense matrix with `as.matrix()` before passing it to `RunMLnet`. This avoids the errors but may lead to a longer analysis time and higher memory use.
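To gauge the cost of the dense conversion, the sketch below compares the size of a toy sparse matrix with its dense copy and then estimates the size of a dense double-precision matrix with the dimensions reported in the run log (23916 genes, 44513 cells). The toy matrix and its 5% density are invented for illustration; the real data may be more or less sparse.

```r
library(Matrix)

set.seed(1)

# Toy sparse count matrix: 2000 genes x 500 cells, about 5% non-zero entries.
toy_sparse <- rsparsematrix(
  2000, 500, density = 0.05,
  rand.x = function(n) rpois(n, lambda = 2) + 1
)
toy_dense <- as.matrix(toy_sparse)

sparse_mb <- as.numeric(object.size(toy_sparse)) / 1024^2
dense_mb <- as.numeric(object.size(toy_dense)) / 1024^2
cat(sprintf("toy sparse: %.2f MB, toy dense: %.2f MB\n", sparse_mb, dense_mb))

# A dense numeric matrix stores 8 bytes per entry regardless of zeros.
full_dense_gb <- 23916 * 44513 * 8 / 1024^3
cat(sprintf("estimated dense size of the full matrix: %.1f GB\n", full_dense_gb))
```

If that estimate exceeds the available memory, subsetting the matrix to the sender and receiver cells before converting is one option to consider, although whether `RunMLnet` gives the same result on a subset is not covered here.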

In [11]:
    netList <- RunMLnet(as.matrix(GCMat), BarCluFile, RecClu, LigClu, 
                        pval, logfc, 
                        LigRecLib, TFTarLib, RecTFLib)

[1] "check table cell:"
[1] 44513     2
[1] "Rec Cluster:"
[1] "Secretory"
[1] "Lig Cluster:"
[1] "B cells"
[1] "p val:"
[1] 0.05
[1] "logfc:"
[1] 0.15
[1] "get High Exp Gene in Secretory"
[1] "Secretory:1231"
[1] "gene:23916"
[1] "logfc.threshold:"
[1] 0.15


“the condition has length > 1 and only the first element will be used”
“the condition has length > 1 and only the first element will be used”


[1] "T-test in parallel"
[1] "find high gene num:5602"
[1] "-----------------------"
[1] "get High Exp Gene in B cells"
[1] "B cells:185"
[1] "gene:23916"
[1] "logfc.threshold:"
[1] 0.15


“the condition has length > 1 and only the first element will be used”
“the condition has length > 1 and only the first element will be used”


[1] "T-test in parallel"
[1] "find high gene num:723"
[1] "-----------------------"
[1] "Lig_Rec Num:18"
[1] "TF_Target Num:1330"
[1] "XZRec_XZTF Num:4707"
[1] "Rec common in LigRec and RecTF:"
[1] "TF common in RecTF and TFTar:"
[1] "calculate Cor between RecTF"
[1] "calculate Cor between TFTar"


In [13]:
str(netList)

List of 3
 $ LigRec: chr "LTB_CD40"
 $ RecTF : chr [1:41] "CD40_FOSL1" "CD40_NR3C1" "CD40_ZFP36" "CD40_SRF" ...
 $ TFTar : chr [1:488] "ABL1_BAX" "ABL1_BCL2" "ABL1_BCL6" "ABL1_CDKN1A" ...
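Each layer of `netList` is a character vector of pairs written as `source_target`. For downstream work it is often easier to hold the network as a single edge table. The sketch below does that for the entries visible in the output above. `split_pairs` is a hypothetical helper, and splitting on `_` assumes that no gene symbol itself contains an underscore.

```r
# Only the entries printed by str(netList) above; the real lists are longer.
netList <- list(
  LigRec = "LTB_CD40",
  RecTF  = c("CD40_FOSL1", "CD40_NR3C1", "CD40_ZFP36", "CD40_SRF"),
  TFTar  = c("ABL1_BAX", "ABL1_BCL2", "ABL1_BCL6", "ABL1_CDKN1A")
)

# Hypothetical helper: split "A_B" strings into a source/target data frame.
split_pairs <- function(pairs, layer) {
  parts <- strsplit(pairs, "_", fixed = TRUE)
  data.frame(
    source = vapply(parts, `[`, character(1), 1),
    target = vapply(parts, `[`, character(1), 2),
    layer = layer,
    stringsAsFactors = FALSE
  )
}

# One row per edge, with the layer name kept as a column.
edges <- do.call(rbind, Map(split_pairs, netList, names(netList)))
rownames(edges) <- NULL
print(edges)
```

An edge table in this form can be written out with `write.table` or handed to a graph package for plotting.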


### Saving and Visualizing Multi-layer Signaling Networks

In [16]:
workdir <- "sample"
DrawMLnet(netList, LigClu, RecClu, workdir, plotMLnet = FALSE)

Save Results
Finish!


On Windows, if we pass the path to the Python executable (python.exe) as `PyHome` and set `plotMLnet` to `TRUE`, `DrawMLnet` also draws the signaling network automatically.

In [None]:
workdir <- "sample"
PyHome <- "D:/Miniconda3/envs/R36/python.exe" # for Windows
DrawMLnet(netList, LigClu, RecClu, workdir, PyHome, plotMLnet = TRUE)
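A hard-coded interpreter path breaks as soon as the notebook moves to another machine. The sketch below looks for a Python executable on the `PATH` and only enables plotting when one is found. `find_python` is a hypothetical helper, and whether `DrawMLnet` accepts an interpreter located this way on platforms other than Windows is not confirmed by this tutorial, so the call is left commented out.

```r
# Hypothetical helper: return the first Python executable found on the PATH,
# or NA if none of the candidate names resolves.
find_python <- function(candidates = c("python3", "python")) {
  paths <- Sys.which(candidates)
  paths <- paths[nzchar(paths)]
  if (length(paths) == 0) NA_character_ else unname(paths[1])
}

PyHome <- find_python()

# Plot only when an interpreter was actually located.
plotMLnet <- !is.na(PyHome) && file.exists(PyHome)
message("Python found: ", plotMLnet)

# Same call as above, with the flag set from the check (requires scMLnet):
# DrawMLnet(netList, LigClu, RecClu, workdir, PyHome, plotMLnet = plotMLnet)
```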

Last updated: June 12, 2021