# Using netZooR to analyze the regulatory processes of obesity in colon cancer

Authors: Tian Wang<sup>1</sup>, Camila Lopes-Ramos<sup>1</sup>, Marouen Ben Guebila<sup>1</sup>

<sup>1</sup> Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA.

## Introduction to the Network Zoo (netZoo)

The Network Zoo (netZoo) is an open-source integrated software suite to reconstruct and analyze gene regulatory networks.

It is available in four programming languages: R (netZooR), Python (netZooPy), MATLAB (netZooM), and C (netZooC), as separate packages. 
<br><br>The netZoo includes the following tools: PANDA (netZooR, netZooPy, netZooM, netZooC), LIONESS (netZooR, netZooPy, netZooM), CONDOR (netZooR, netZooPy), MONSTER (netZooR), ALPACA (netZooR), PUMA (netZooPy, netZooM, netZooC), SAMBAR (netZooR, netZooPy), SPIDER (netZooM), OTTER (netZooR, netZooPy, netZooM), CRANE (netZooR). For more details of each tool and their publications please check the netZoo homepage [https://netzoo.github.io](https://netzoo.github.io/), and the source code in GitHub: https://github.com/netZoo.

## 1. Regulatory network differences associated with obesity in colon cancer
In this case study, we demonstrate the utility of the *netZooR* package for analyzing biological networks. We use PANDA, CONDOR, and ALPACA to model colon cancer gene expression data from [The Cancer Genome Atlas (TCGA)](https://gdc.cancer.gov) and analyze the regulatory network differences associated with obesity in colon cancer. We start by loading the data.

In [None]:
load("/opt/data/netZooR/colonObesity/netZooR_tutorial_coloncancer.RData")

### Install netZooR
Then we download and install the netZooR package from GitHub: https://github.com/netZoo/netZooR. Since it is already installed on the server, we can skip this part.

In [None]:
#install.packages("devtools")
#library(devtools)
# install netZooR package development repo
#devtools::install_github("netZoo/netZooR@devel")

### Install the dependency packages
We can also install additional packages to help us carry out the analysis. These lines are commented out on the Netbooks server since these packages are already installed.

In [None]:
#if (!requireNamespace("BiocManager", quietly = TRUE))
#    install.packages("BiocManager")
#BiocManager::install("fgsea")
#BiocManager::install("limma")

### Load the packages
Now, we can load all the required packages.

In [None]:
library('netZooR')     # load panda, condor, alpaca
library('fgsea')       # for enrichment analysis
library('ggplot2')     # for plotting
library('reshape2')    # to reshape data frames
library('limma')       # to compute differential targeting
library('viridisLite') # plot communities
library('visNetwork')  # for network visualization

### 1.1. PANDA
PANDA (Passing Attributes between Networks for Data Assimilation) is a method for constructing gene regulatory networks. It uses message passing to merge 3 different data layers: protein-protein interaction (PPI), gene expression, and transcription factor (TF) motif data.

More details can be found in the published paper [(1)](https://doi.org/10.1371/journal.pone.0064832).

### The pre-processed RNA-seq data of primary colon tumor samples from TCGA
We pre-processed level 3 RNASeq V2 and clinical data for colon cancer from [The Cancer Genome Atlas (TCGA)](https://tcga-data.nci.nih.gov) on June 16, 2016. After quality control, the discovery dataset included 445 primary colon tumor samples taken before [treatment](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6169995/) and 12,817 genes. To correct for batch effects, we applied smooth quantile normalization (qsmooth) and stored the result in an `.Rdata` file. This file can also be downloaded using [this link](https://netzoo.s3.us-east-2.amazonaws.com/netZooR/colon_cancer_case_study/data/tcga_matched_XY.rdata).

In [None]:
# load tcga ExpressionSet data
load("/opt/data/netZooR/colonObesity/tcga_matched_XY.rdata")
# overview of colon cancer data from TCGA
class(tcga5)
dim(tcga5)

### Calculate Body Mass Index (BMI)
We calculate the Body Mass Index (BMI) from weight and height using the following formula: [weight (kg) / [height (m)]<sup>2</sup>](https://www.cdc.gov/healthyweight/assessing/bmi/childrens_bmi/childrens_bmi_formula.html).

According to the [CDC](https://www.cdc.gov/healthyweight/assessing/bmi/adult_bmi/index.html#Interpreted), the following BMI ranges define the weight status categories:
<br>Below 18.5:	Underweight
<br>18.5 – 24.9:	Normal or healthy Weight
<br>25.0 – 29.9: Overweight 
<br>30.0 and Above:	Obese.

Then, we use this standard to classify our data:

In [None]:
# calculate BMI values
tcga5$BMI <- as.numeric(tcga5$weight_kg_at_diagnosis)/(as.numeric(tcga5$height_cm_at_diagnosis)/100)^2
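# convert the numeric BMI vector into a categorical vector
# note: values between the listed bounds (e.g. 24.95 or 29.95) fall through to the final "OBESE" branch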
bmi <- as.numeric(pData(tcga5)$BMI)
bmi_cat <- as.character(ifelse(bmi <= 18.5, "UNDER", ifelse((bmi >= 18.5) & (bmi <= 24.9), "NORMAL", ifelse((bmi >= 25) & (bmi <= 29.9), "OVER", "OBESE"))))
bmi_cat[which(is.na(bmi_cat))] <- "NA"
tcga5$BMI_cat <- bmi_cat
# summary of BMI 
table(bmi_cat)
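
The CDC ranges are listed to one decimal place, so a continuous BMI value can fall between two listed bounds. As a minimal sketch, assuming half-open intervals are the intended reading of the table, the classification could also be written with `cut()`. The helper name `classify_bmi` is hypothetical and is not part of netZooR; it is not used in the rest of this tutorial.

```r
# Hypothetical helper (not part of netZooR): classify BMI with half-open intervals
classify_bmi <- function(bmi) {
  out <- as.character(cut(bmi,
                          breaks = c(-Inf, 18.5, 25, 30, Inf),
                          labels = c("UNDER", "NORMAL", "OVER", "OBESE"),
                          right = FALSE))  # intervals are [lower, upper)
  out[is.na(out)] <- "NA"                  # label missing values as the string "NA"
  out
}

# toy values, including boundary cases and a missing value
classify_bmi(c(17.9, 18.5, 24.95, 25, 29.95, 30, NA))
```

With this variant every non-missing value lands in exactly one category, so group sizes may differ slightly from those produced by the nested `ifelse()` call.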

### Expression data files
For our analysis, we only consider two sample groups: **obese** and **normal weight**. Therefore, we extract the normalized expression data of those samples from the `ExpressionSet` object of our processed gene expression data, then write each group to a `.txt` file to be given as input to PANDA.
Columns represent sample IDs and rows represent gene identifiers.

In [None]:
all_expr <- as.data.frame(assayData(tcga5)$logQsmooth)
# expression data of normal weight (i.e. BMI_cat == "NORMAL")
normal_expr <- all_expr[,bmi_cat=="NORMAL"]
head(normal_expr)
#write.table(normal_expr, file = "normal_weight_expr.txt", sep = "\t",
#            row.names = T, col.names = F)

# expression data of obese
obese_expr <- all_expr[,bmi_cat=="OBESE"]

#write.table(obese_expr, file = "obese_expr.txt", sep = "\t",
#            row.names = T, col.names = F)

### Motif prior data
The transcription factor motif prior represents putative regulation events where a transcription factor (TF) binds in the promoter of a gene to regulate its expression, as predicted by the presence of transcription factor binding motifs in the promoter region of the gene. The motif prior is thus a directed network linking transcription factors to their predicted gene targets. These are small example priors for the purposes of demonstrating this method. A complete set of priors by species can be downloaded from: https://sites.google.com/a/channing.harvard.edu/kimberlyglass/tools/resources.

In our case study, we use a motif prior network from the Human Motif Scan (Homo sapiens; hg38), which is a three-column table. The first column contains TFs and the second column contains gene identifiers. The third column is a binary variable: the TF-gene edge exists if it is equal to 1 and does not exist if it is equal to 0.

In [None]:
motif <- read.delim("/opt/data/netZooR/colonObesity/motif_hg38.txt", stringsAsFactors=F, header=F)
motif[1:5,]
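
To make the three-column layout concrete, here is a minimal sketch that turns a toy edge list into a TF-by-gene adjacency matrix. The TF and gene names are made up for illustration and do not come from `motif_hg38.txt`.

```r
# Toy motif prior in the same three-column layout (TF, gene, edge indicator)
toy_motif <- data.frame(V1 = c("TF1", "TF1", "TF2"),
                        V2 = c("geneA", "geneB", "geneB"),
                        V3 = c(1, 1, 1),
                        stringsAsFactors = FALSE)

tfs   <- sort(unique(toy_motif$V1))
genes <- sort(unique(toy_motif$V2))

# start from an empty TF x gene matrix, then fill in the listed edges
adj <- matrix(0, nrow = length(tfs), ncol = length(genes),
              dimnames = list(tfs, genes))
adj[cbind(toy_motif$V1, toy_motif$V2)] <- toy_motif$V3
adj
```

Pairs that are absent from the edge list keep the value 0, which matches the "edge does not exist" reading of the third column.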

### Protein-protein interaction
The protein-protein interaction (PPI) network is an undirected network that represents physical and other types of interactions between transcription factor proteins. Here, we use the function `source.PPI()` in the netZooR package to obtain the PPI network from STRINGdb v10 for the TFs present in the motif prior. The PPI network is a table with three columns: the first two columns are proteins, and the third column is an interaction score between the two proteins. We provide a precomputed PPI network, so we can skip this part.

In [None]:
# obtain Protein-protein cooperative interaction data in STRINGdb by using source.PPI()
#TF <- data.frame(TF=motif$V1)
#PPI <- source.PPI(TF,version="10", species=9606)
#head(PPI)
#write.table(PPI, file = "ppi_hg38.txt", sep = "\t",
#            row.names = F, col.names = F)

### Building a PANDA regulatory network
Now we run PANDA by calling the function `panda.py` in netZooR, pointing it to the parsed expression data, motif prior, and ppi prior to generate the regulatory network for normal weight cohort and obese cohort. We use the "intersection" mode of PANDA, which will select the genes present in both expression data and motif prior, and TFs present in both motif prior and PPI data, to build a PANDA network.
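
The node selection described for the "intersection" mode can be illustrated with base R set operations. This is only a sketch of the idea on made-up identifiers, not the code that `panda.py` runs internally.

```r
# Toy identifiers for the three inputs (all names are made up)
expr_genes  <- c("geneA", "geneB", "geneC")  # genes in the expression data
motif_genes <- c("geneB", "geneC", "geneD")  # genes in the motif prior
motif_tfs   <- c("TF1", "TF2", "TF3")        # TFs in the motif prior
ppi_tfs     <- c("TF2", "TF3", "TF4")        # TFs in the PPI data

# genes must appear in both the expression data and the motif prior
genes_kept <- intersect(expr_genes, motif_genes)
# TFs must appear in both the motif prior and the PPI data
tfs_kept <- intersect(motif_tfs, ppi_tfs)

# a complete TF-to-gene network on the retained nodes has this many edges
length(tfs_kept) * length(genes_kept)
```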

In [None]:
# generate PANDA list (entire PANDA network, indegree network, and outdegree network)
normal_panda <- panda.py("/opt/data/netZooR/colonObesity/normal_weight_expr.txt", "/opt/data/netZooR/colonObesity/motif_hg38.txt", "/opt/data/netZooR/colonObesity/ppi_hg38.txt", modeProcess = "intersection")$panda
obese_panda <- panda.py("/opt/data/netZooR/colonObesity/obese_expr.txt", "/opt/data/netZooR/colonObesity/motif_hg38.txt", "/opt/data/netZooR/colonObesity/ppi_hg38.txt", modeProcess = "intersection")$panda

The final PANDA network consists of four columns: the first column "TF" is the TF identifier, the second column "Gene" is the gene identifier, the third column "Motif" is a binary vector that indicates whether the TF-gene edge exists in the motif prior, and the fourth column is the edge weight calculated by PANDA, representing the "likelihood" that a transcription factor binds the promoter of its target gene and regulates its expression.

In [None]:
head(obese_panda)

Both aggregate PANDA networks (obese and normal weight) consist of 712 TFs and 12,500 genes. Since PANDA networks are complete bipartite networks, each contains 712 × 12,500 = 8,900,000 edges.

In [None]:
dim(normal_panda)
length(unique(normal_panda$TF))
length(unique(normal_panda$Gene))

In [None]:
dim(obese_panda)
length(unique(obese_panda$TF))
length(unique(obese_panda$Gene))

### Explore differential edges in PANDA networks
### Visualize PANDA networks 

In this section we will visualize parts of the network using the JavaScript library `visNetwork`.
netZooR also provides functions to export networks to Cytoscape: `vis.panda.in.cytoscape()` plots networks and `create.panda.style()` creates a PANDA-specific network style in Cytoscape.

First, we select the 200 top-scoring edges of each PANDA network and plot the resulting subnetwork with visNetwork.

In [None]:
# the 200 highest edge weights of each PANDA network
normal_panda_top200 <- head(normal_panda[order(normal_panda$Score, decreasing = TRUE), ],200)
obese_panda_top200 <- head(obese_panda[order(obese_panda$Score, decreasing = TRUE), ],200)

edges = normal_panda_top200
edges$arrows = "to" 
colnames(edges) <- c("from","to","motif","force","arrows")
nodes <- data.frame(id = unique(as.vector(as.matrix(edges[,c(1,2)]))) , 
                    label=unique(as.vector(as.matrix(edges[,c(1,2)]))))
nodes$group <- ifelse(nodes$id %in% edges$from, "TF", "gene")

net <- visNetwork(nodes, edges, width = "100%")
net <- visGroups(net, groupname = "TF", shape = "square",
                     color = list(background = "teal", border="black"))
net <- visGroups(net, groupname = "gene", shape = "dot",       
                     color = list(background = "gold", border="black"))
visLegend(net, main="Legend", position="right", ncol=1) 

In [None]:
edges = obese_panda_top200
edges$arrows = "to" 
colnames(edges) <- c("from","to","motif","force","arrows")
nodes <- data.frame(id = unique(as.vector(as.matrix(edges[,c(1,2)]))) , 
                    label=unique(as.vector(as.matrix(edges[,c(1,2)]))))
nodes$group <- ifelse(nodes$id %in% edges$from, "TF", "gene")

net <- visNetwork(nodes, edges, width = "100%")
net <- visGroups(net, groupname = "TF", shape = "square",
                     color = list(background = "teal", border="black"))
net <- visGroups(net, groupname = "gene", shape = "dot",       
                     color = list(background = "gold", border="black"))
visLegend(net, main="Legend", position="right", ncol=1) 

### Visualize the top differential edges between normal and obese cohort PANDA network

We wanted to identify potential regulatory interactions that best characterized each of the subtype-specific networks. Therefore, we selected edges based both on the probability that they are “supported” in the network inference, and on whether they are “different” between the subtypes. 

To determine the probability that an edge is “supported,” we took the value of the cumulative distribution function of a normal distribution to assign a probability value between zero and one for each edge (instead of a z-score). 

To determine the probability that an edge is “different” between the networks, we first subtracted the z-score weight values estimated by PANDA for the two networks and then determined the value of the cumulative distribution for this difference. The product of these two probabilities represents the probability that an edge is both “supported” and “different.” We select edges for which this combined probability is greater than 80% (default value is 0.8) [(3)](https://pubmed.ncbi.nlm.nih.gov/25888305/).

We can use the function `panda.diff.edges()` to perform the calculation described above. Here, we use 0.98 as the threshold to reduce the number of edges shown in visNetwork.

Green edges indicate higher edge weight in the defined condition_name parameter (normal weight in our example), and red edges indicate higher edge weight in the other condition (obese in our example). However, since we took only the positive tail of the difference between the networks, only the green edges are plotted.
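
The edge-selection rule described above can be sketched in a few lines of base R. This is an illustration that follows the text literally, with made-up z-score weights; the helper name `edge_probability` is hypothetical and the sketch is not claimed to reproduce the internals of `panda.diff.edges()`.

```r
# Hypothetical helper: probability that an edge is both "supported" in
# network A and "different" (stronger) in A compared with network B
edge_probability <- function(w_a, w_b) {
  p_supported <- pnorm(w_a)        # normal CDF of the z-score weight in A
  p_different <- pnorm(w_a - w_b)  # normal CDF of the weight difference
  p_supported * p_different
}

# toy z-score weights for three edges (made-up values)
w_normal <- c(3.0, 0.0, 2.5)
w_obese  <- c(0.0, 0.0, 2.5)

p <- edge_probability(w_normal, w_obese)
p > 0.8  # edges passing the default 0.8 threshold
```

Only the first toy edge passes: the second is neither supported nor different, and the third is well supported but identical in both networks, so its "different" probability is 0.5.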

In [None]:
diff_panda <- head(panda.diff.edges(normal_panda,obese_panda,threshold = 0.98, condition_name = "normal"),200)

edges = diff_panda
edges$arrows = "to" 
colnames(edges) <- c("from","to","motif","force","normal","arrows")
edges$color  = ifelse(edges$force > 0, "green", "red")
nodes <- data.frame(id = unique(as.vector(as.matrix(edges[,c(1,2)]))) , 
                    label=unique(as.vector(as.matrix(edges[,c(1,2)]))))
nodes$group <- ifelse(nodes$id %in% edges$from, "TF", "gene")

net <- visNetwork(nodes, edges, width = "100%")
net <- visGroups(net, groupname = "TF", shape = "square",
                     color = list(background = "teal", border="black"))
net <- visGroups(net, groupname = "gene", shape = "dot",       
                     color = list(background = "gold", border="black"))
visLegend(net, main="Legend", position="right", ncol=1) 

### 1.2. CONDOR

COmplex Network Description Of Regulators (CONDOR) is a method for analyzing the bipartite community structure of biological networks [(4)](https://pubmed.ncbi.nlm.nih.gov/27618581/).

### Generate CONDOR object from a PANDA network

We can use the function `panda.to.condor.object` to convert a PANDA network into an object for the CONDOR algorithm. Since CONDOR requires positive edge weights, this function keeps only the PANDA edges with positive weights.

In [None]:
normal_condor <- panda.to.condor.object(normal_panda)

### Explore CONDOR object
Now we can perform community identification and get the membership of each node in the network.

In [None]:
# cluster nodes and produce overall modularity 
normal_condor <- condor.cluster(normal_condor,project = F)

# print membership of community
normal_condor$red.memb
normal_condor$blue.memb
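
Community detection in CONDOR is based on maximizing a bipartite modularity score. As a rough illustration of what that score measures, the sketch below computes bipartite modularity for a toy biadjacency matrix and a given community assignment. The function name `bipartite_modularity` is hypothetical, and the sketch only evaluates the score; it does not perform the optimization done by `condor.cluster()`.

```r
# Hypothetical helper: bipartite modularity of a given community assignment
# A        : biadjacency matrix (rows = one node set, columns = the other)
# row_comm : community label of each row node
# col_comm : community label of each column node
bipartite_modularity <- function(A, row_comm, col_comm) {
  m <- sum(A)                                 # total edge weight
  k <- rowSums(A)                             # degree of each row node
  d <- colSums(A)                             # degree of each column node
  expected <- outer(k, d) / m                 # expected weight under the null model
  same <- outer(row_comm, col_comm, "==")     # pairs placed in the same community
  sum((A - expected)[same]) / m
}

# toy network made of two disconnected blocks
A <- rbind(c(1, 1, 0, 0),
           c(1, 1, 0, 0),
           c(0, 0, 1, 1),
           c(0, 0, 1, 1))

bipartite_modularity(A, c(1, 1, 2, 2), c(1, 1, 2, 2))  # two communities: 0.5
bipartite_modularity(A, c(1, 1, 1, 1), c(1, 1, 1, 1))  # one community: 0
```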

### Plot communities in each network
Links within communities (colored points) are shown along the diagonal, with links that go between communities in black. Community IDs are plotted along the x-axis.

In [None]:
normal_color_num <- max(normal_condor$red.memb$com)
normal_color <- viridis(normal_color_num, alpha = 1, begin = 0, end = 1, 
direction = 1, option = "D")
condor.plot.communities(normal_condor, color_list=normal_color, 
point.size=0.01, xlab="Gene", ylab="TF")

In [None]:
obese_condor <- panda.to.condor.object(obese_panda)
# cluster nodes and produce overall modularity 
obese_condor <- condor.cluster(obese_condor,project = F)

# print membership of community
obese_condor$red.memb
obese_condor$blue.memb

In [None]:
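# build the palette from the obese network's own community count
obese_color <- viridis(max(obese_condor$red.memb$com))
condor.plot.communities(obese_condor, color_list=obese_color, 
point.size=0.01, xlab="Gene", ylab="TF")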

### 1.3. ALPACA

ALtered Partitions Across Community Architectures (ALPACA) is a graph-based approach that compares two networks and identifies de novo the gene modules that best distinguish the networks. 

### Run ALPACA
First, we need to remove the `reg_` and `tar_` prefixes from the TF and gene identifiers before calling ALPACA.

In [None]:
# head(normal_panda)
# head(obese_panda)

normal_panda_1 <- normal_panda
normal_panda_1$TF <- gsub("reg_","",normal_panda_1$TF)
normal_panda_1$Gene <- gsub("tar_","",normal_panda_1$Gene)

obese_panda_1 <- obese_panda
obese_panda_1$TF <- gsub("reg_","",obese_panda_1$TF)
obese_panda_1$Gene <- gsub("tar_","",obese_panda_1$Gene)

head(normal_panda_1)
head(obese_panda_1)

# run ALPACA
alpaca <- panda.to.alpaca(normal_panda_1, obese_panda_1,NULL,verbose = F)

### Interpretation of ALPACA results

In this step, we extract the top 100 genes in each community.

In [None]:
summary(alpaca)
alp_topgene <- alpaca.ExtractTopGenes(alpaca,100)
alp_topgene

### GO term enrichment analysis
Using the selected genes, we perform an enrichment analysis in Gene Ontology (GO) biological process using the function `alpaca.list.to.go`.

In [None]:
alpaca.go.list <- alpaca.list.to.go(alp_topgene[[1]],unique(normal_panda_1$Gene),alp_topgene[[2]])

In [None]:
alpaca.go.list[,c(3,9)] <- alpaca.go.list[,c(9,3)]
alpaca.go.list

Then, we sort the terms by p-value from the smallest to the largest.

In [None]:
alpaca.go.list[order(alpaca.go.list$Pvalue),]

## 2. Sex-linked regulatory processes in obese patients with colon cancer

### 2.1. LIONESS
Linear Interpolation to Obtain Network Estimates for Single Samples (LIONESS) is a method that estimates individual sample networks by applying linear interpolation to the predictions made by an existing aggregate network inference approach (here, PANDA).

In [None]:
# do NOT run this chunk
obese_lioness <- lioness.py("/opt/data/netZooR/colonObesity/obese_expr.txt","/opt/data/netZooR/colonObesity/motif_hg38.txt", "/opt/data/netZooR/colonObesity/ppi_hg38.txt", modeProcess = "intersection")

We can print the first 5 rows of the LIONESS networks.

In [None]:
head(obese_lioness,5)
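
The linear interpolation behind LIONESS can be sketched as follows: the network for sample q is estimated as N × (network from all N samples − network without q) + network without q. The example below is a minimal illustration on random toy data that uses Pearson correlation as a stand-in aggregate method instead of PANDA; the function name `lioness_sketch` is hypothetical and this is not the implementation behind `lioness.py`.

```r
set.seed(1)
# toy expression matrix: 5 genes x 8 samples (random values)
expr <- matrix(rnorm(5 * 8), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("s", 1:8)))

# Hypothetical helper: single-sample network for sample q.
# agg is the aggregate network method; Pearson correlation is a stand-in here.
lioness_sketch <- function(x, q, agg = function(x) cor(t(x))) {
  n <- ncol(x)
  net_all     <- agg(x)                      # network from all samples
  net_minus_q <- agg(x[, -q, drop = FALSE])  # network without sample q
  n * (net_all - net_minus_q) + net_minus_q  # linear interpolation
}

net_s1 <- lioness_sketch(expr, 1)
dim(net_s1)
```

When the aggregate method is exactly linear in the samples, such as an average of per-sample outer products, this interpolation recovers the contribution of sample q exactly.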

Then, we count the TFs and genes in the LIONESS networks before calculating the gene in-degree of each network.

In [None]:
length(unique(obese_lioness$TF))
length(unique(obese_lioness$Gene))

The in-degree of a gene is the sum of the weights of its inbound edges.

In [None]:
obese_lioness_indegree <- aggregate(.~Gene,data = obese_lioness[,-1], FUN=sum)
# use gene names as rowname and remove Gene column
rownames(obese_lioness_indegree) <- sub("tar_", "", obese_lioness_indegree$Gene)
obese_lioness_indegree <- obese_lioness_indegree[,-1]
head(obese_lioness_indegree)
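
As a small worked example of the same `aggregate()` call, the toy network below has two TFs, two genes, and two samples. All names and weights are made up.

```r
# toy single-sample networks: one row per TF-gene edge, one column per sample
toy_net <- data.frame(TF      = c("TF1", "TF2", "TF1", "TF2"),
                      Gene    = c("geneA", "geneA", "geneB", "geneB"),
                      sample1 = c(0.5, 1.5, -1.0, 2.0),
                      sample2 = c(1.0, 1.0, 0.0, -0.5),
                      stringsAsFactors = FALSE)

# drop the TF column, then sum the edge weights of each gene within each sample
toy_indegree <- aggregate(. ~ Gene, data = toy_net[, -1], FUN = sum)
toy_indegree
```

For `geneA` the in-degree is 0.5 + 1.5 = 2.0 in `sample1` and 1.0 + 1.0 = 2.0 in `sample2`.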

### 2.2. Differential targeting analysis with limma
Using the single-sample networks, we can compare the in-degree between males and females using a linear regression model (limma package), adjusting for covariates: stage, age, and race.

First, we select the obese cohort by keeping the samples whose `bmi_cat` value is `OBESE` in the `ExpressionSet` object `tcga5`.

In [None]:
obese_eset <- tcga5[,which(bmi_cat=="OBESE")]
obese_eset

Then we build the design matrix, adjusting for the covariates stage, age, and race in the obese cohort.

In [None]:
# Define the covariates

gender_ob <- factor(as.character(pData(obese_eset)$gender),levels=c("MALE","FEMALE"))

stage_ob <- (as.character(pData(obese_eset)$uicc_stage))
stage_ob[which(is.na(stage_ob))] <- "NA"    

race_ob <- as.character(pData(obese_eset)$race)
race_ob[which(is.na(race_ob))] <- "NA"

age_ob <- as.numeric(pData(obese_eset)$age_at_initial_pathologic_diagnosis)
age_ob[which(is.na(age_ob))] <- mean(age_ob,na.rm=TRUE)


race_ob <- as.factor(race_ob)
stage_ob <- as.factor(stage_ob)
gender_ob <- as.factor(gender_ob)

# define design matrix
design_ob = model.matrix(~ stage_ob + race_ob + age_ob + gender_ob)

Finally, we compute differential gene targeting (gene in-degree) using limma.

In [None]:
# Run limma
fit_ob = lmFit(as.matrix(obese_lioness_indegree),design_ob)
fit_ob = eBayes(fit_ob)

We can see the top results for females by selecting the variable `gender_obFEMALE` in `topTable`.

In [None]:
tb_ob = topTable(fit_ob,coef="gender_obFEMALE",number=Inf)
head(tb_ob)
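
The coefficient name comes from how `model.matrix()` encodes factors: the first factor level is the reference, and each remaining level gets a column named after the variable followed by the level. A minimal sketch on made-up toy data:

```r
# toy covariates (made-up values); MALE is the first level, hence the reference
gender <- factor(c("MALE", "FEMALE", "FEMALE", "MALE"),
                 levels = c("MALE", "FEMALE"))
age <- c(60, 55, 70, 65)

design_toy <- model.matrix(~ age + gender)
colnames(design_toy)
```

The last column is named `genderFEMALE`. In the tutorial the variable is called `gender_ob`, which gives `gender_obFEMALE`, and a positive coefficient indicates higher in-degree in females relative to males.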

### 2.3. Gene Set Enrichment Analysis (GSEA)

Gene set enrichment analysis (GSEA) is a widely used tool for analyzing gene expression data. 

We will use the `fgsea()` function from the `fgsea` package to run pre-ranked GSEA. The function requires a list of gene sets and a named vector of gene-level statistics, whose names must match the gene names in the pathways list.

In [None]:
# store named vector of test statistics
indegree_rank_ob <- setNames(object=tb_ob[,"t"], rownames(tb_ob))
head(indegree_rank_ob)
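
To show how a named vector of statistics is used in pre-ranked GSEA, the sketch below computes the classic running-sum enrichment score for a toy gene set. The function name `enrichment_score` and the toy statistics are hypothetical; `fgsea()` additionally estimates p-values and normalized enrichment scores, which this sketch does not attempt.

```r
# Hypothetical helper: running-sum enrichment score for one gene set
# stats    : named vector of gene-level statistics
# gene_set : character vector of gene names
# p        : weight exponent applied to the statistics of genes in the set
enrichment_score <- function(stats, gene_set, p = 1) {
  stats <- sort(stats, decreasing = TRUE)        # rank genes by statistic
  hit <- names(stats) %in% gene_set
  hit_step  <- ifelse(hit, abs(stats)^p, 0)      # weighted step up for hits
  miss_step <- ifelse(hit, 0, 1)                 # uniform step down for misses
  running <- cumsum(hit_step / sum(hit_step)) -
             cumsum(miss_step / sum(miss_step))
  unname(running[which.max(abs(running))])       # maximum deviation from zero
}

# toy statistics for six genes (made-up values)
toy_stats <- c(g1 = 3, g2 = 2, g3 = 1, g4 = -1, g5 = -2, g6 = -3)

enrichment_score(toy_stats, c("g1", "g2"))  # set at the top of the ranking
enrichment_score(toy_stats, c("g5", "g6"))  # set at the bottom of the ranking
```

A set concentrated at the top of the ranking gets a positive score and a set concentrated at the bottom gets a negative one, which is how the sign of the NES is read in the cells that follow.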

Then we test the gene list for enrichment in KEGG canonical pathways, using the unsorted gene list.
The gene sets can be downloaded from MSigDB: http://software.broadinstitute.org/gsea/msigdb. Here, we use the `c2.cp.kegg.v7.1.symbols` pathways.

In [None]:
set.seed(5) # for reproducibility
gmt.file <- "/opt/data/netZooR/colonObesity/c2.cp.kegg.v7.1.symbols.gmt"
pathways <- gmtPathways(gmt.file)

fgseaRes_ob <- fgsea(pathways, indegree_rank_ob, minSize=15, maxSize=500, nperm=1000)
head(fgseaRes_ob)


In [None]:
# Subset to pathways with FDR < 0.05 and sort them by adjusted p-value
sig_ob <- fgseaRes_ob[fgseaRes_ob$padj < 0.05,]
sig_ob <- sig_ob[order(sig_ob$padj),]
# Top 10 pathways enriched in females (positive NES)
head(sig_ob$pathway[sig_ob$NES > 0], 10)

In [None]:
# Top 10 pathways enriched in males (negative NES)
head(sig_ob$pathway[sig_ob$NES < 0], 10)

We use an FDR cut-off of *0.05* to select significant gene sets and draw a bubble plot with gene sets on the y-axis and adjusted p-value (padj) on the x-axis. Bubble size indicates the number of genes in each gene set, and bubble color indicates the normalized enrichment score (NES). Blue is for negative NES (enrichment of genes more highly targeted in males), and red is for positive NES (enrichment of genes more highly targeted in females).

In [None]:
plotBubblePlot <- function(dat,fdrcut=0.05, figTitle='Obese cohort:  Female(Red) vs \n Male(Blue)'){
# Settings
# fdrcut <- 0.05 # FDR cut-off to use as output for significant signatures
dencol_neg <- "blue" # bubble plot color for negative ES
dencol_pos <- "red" # bubble plot color for positive ES
signnamelength <- 4 # set to remove prefix from signature names (2 for "GO", 4 for "KEGG", 8 for "REACTOME")
asp <- 3 # aspect ratio of bubble plot
charcut <- 100 # cut signature name in heatmap to this nr of characters

# Make signature names more readable
a <- as.character(dat$pathway) # 'a' is a great variable name to substitute row names with something more readable
for (j in 1:length(a)){
  a[j] <- substr(a[j], signnamelength+2, nchar(a[j]))
}
a <- tolower(a) # convert to lower case (you may want to comment this out, it really depends on what signatures you are looking at, c6 signatures contain gene names, and converting those to lower case may be confusing)
for (j in 1:length(a)){
  if(nchar(a[j])>charcut) { a[j] <- paste(substr(a[j], 1, charcut), "...", sep=" ")}
} # cut signature names that have more characters than charcut, and add "..."
a <- gsub("_", " ", a)
dat$NAME <- a

# Determine what signatures to plot (based on FDR cut)
dat2 <- dat[dat[,"padj"]<fdrcut,]
dat2 <- dat2[order(dat2[,"padj"]),] 
dat2$signature <- factor(dat2$NAME, rev(as.character(dat2$NAME)))
# Determine what labels to color
sign_neg <- which(dat2[,"NES"]<0)
sign_pos <- which(dat2[,"NES"]>0)
# Color labels
signcol <- rep(NA, length(dat2$signature))
signcol[sign_neg] <- dencol_neg # text color of negative signatures
signcol[sign_pos] <- dencol_pos # text color of positive signatures
signcol <- rev(signcol) # need to revert vector of colors, because ggplot starts plotting these from below

# Plot bubble plot
g<-ggplot(dat2, aes(x=padj,y=signature,size=size)) # adjusted p-value on the x-axis
g+geom_point(aes(fill=NES), shape=21, colour="white")+
  theme_bw()+ # white background, needs to be placed before the "signcol" line
  xlim(0,fdrcut)+
  scale_size_area(max_size=10,guide="none")+
  scale_fill_gradient2(low=dencol_neg, high=dencol_pos)+
  theme(axis.text.y = element_text(colour=signcol))+
  theme(aspect.ratio=asp, axis.title.y=element_blank())+ggtitle(figTitle) # test aspect.ratio
}

plotBubblePlot(as.data.frame(fgseaRes_ob), fdrcut=0.05)

Finally, the top enriched KEGG pathways are described below.

In males:
**ECM-receptor interaction**: The extracellular matrix (ECM) consists of a complex mixture of structural and functional macromolecules and serves an important role in tissue and organ morphogenesis and in the maintenance of cell and tissue structure and function. Specific interactions between cells and the ECM are mediated by transmembrane molecules, mainly integrins and perhaps also proteoglycans, CD36, or other cell-surface-associated components. These interactions lead to a direct or indirect control of cellular activities such as adhesion, migration, differentiation, proliferation, and apoptosis. In addition, integrins function as mechanoreceptors and provide a force-transmitting physical link between the ECM and the cytoskeleton. Integrins are a family of glycosylated, heterodimeric transmembrane adhesion receptors that consist of noncovalently bound alpha- and beta-subunits.

**NOD-like receptor signaling pathway - Homo sapiens (human)**: Specific families of pattern recognition receptors are responsible for detecting various pathogens and generating innate immune responses. The intracellular NOD-like receptor (NLR) family contains more than 20 members in mammals and plays a pivotal role in the recognition of intracellular ligands. NOD1 and NOD2, two prototypic NLRs, sense the cytosolic presence of bacterial peptidoglycan fragments that escaped from endosomal compartments, driving the activation of NF-κB and MAPK, cytokine production, and apoptosis. On the other hand, a different set of NLRs induces caspase-1 activation through the assembly of multiprotein complexes called inflammasomes. Activated caspase-1 regulates maturation of the pro-inflammatory cytokines IL-1B and IL-18 and drives pyroptosis.