# CARNIVAL output analysis

This tutorial shows how to use the functions and analyses developed in the network_tools repo.
The _data_ folder contains the CARNIVAL output (edge and node information) for two differential expression analyses (DEA), each comparing one group of hepatoblastoma patients (C1 or C2) against Normal Tumor (NT).
The t-values from the DEA were used to calculate DoRothEA TF and PROGENy scores.
These scores were then used as the input to CARNIVAL, together with the input network (omnipath_200303_all_geneSymbol_noComplex.csv).

In [1]:
import os
import pandas as pd
print ("pandas version: ", pd.__version__)

import networkx as nx
print ("networkX version: ", nx.__version__)

import rpy2.rinterface
%load_ext rpy2.ipython

edgesC1 = pd.read_csv('data/HB_C1vsNT_CARNIVAL_edges.csv', header = 0)
edgesC2 = pd.read_csv('data/HB_C2vsNT_CARNIVAL_edges.csv', header = 0)

nodesC1 = pd.read_csv('data/HB_C1vsNT_CARNIVAL_nodes.csv', header = 0)
nodesC2 = pd.read_csv('data/HB_C2vsNT_CARNIVAL_nodes.csv', header = 0)
omnipathNetwork = pd.read_csv('data/omnipath_200303_all_geneSymbol_noComplex.csv', header = 0)

pandas version:  1.0.1
networkX version:  2.4



In [None]:
# Here we can run python code directly

In [None]:
%%R

# Adding %%R at the top of a cell creates the environment to run R code.
# We can easily pass objects from Python to R with -i, and back to Python with -o.
# See this post to know more:
# https://www.linkedin.com/pulse/interfacing-r-from-python-3-jupyter-notebook-jared-stufft/

# Create graph objects and calculate general network measurements
### Use of graph_measurments.py

We can produce a graph from the interaction files using _carnival2directGraph_.
This function uses the networkX module to create a directed graph whose edge weights are _sign * weight_.
If there are duplicated edges, the one with the higher weight is kept (set verbose=True to report this).
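The weighting and de-duplication rule described above can be sketched in plain Python. This is only an illustration, not the code of _carnival2directGraph_: the field names (`Node1`, `Sign`, `Node2`, `Weight`), the toy records, and the choice to compare duplicates by their unsigned weight are all assumptions.

```python
# Hypothetical edge records; the field names are an assumption about the
# CARNIVAL edge table and are not taken from the network_tools repo.
edges = [
    {"Node1": "A", "Sign": 1, "Node2": "B", "Weight": 100},
    {"Node1": "A", "Sign": -1, "Node2": "B", "Weight": 40},  # duplicate of A -> B
    {"Node1": "B", "Sign": -1, "Node2": "C", "Weight": 60},
]


def signed_edges(records):
    """Map (source, target) to sign * weight, keeping the heaviest duplicate."""
    best = {}
    for rec in records:
        key = (rec["Node1"], rec["Node2"])
        signed = rec["Sign"] * rec["Weight"]
        # Assumed rule: a duplicate replaces the stored edge only if its
        # unsigned weight is strictly larger.
        if key not in best or abs(signed) > abs(best[key]):
            best[key] = signed
    return best


graph_weights = signed_edges(edges)
print(graph_weights)
```

In this toy input the A -> B edge with weight 40 is dropped in favour of the one with weight 100, and the inhibition B -> C is stored with a negative weight.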

Once we have the graph, we can calculate several general measurements using _get_measurments_.
By default it returns the number of nodes and edges, the density, and the average betweenness and degree centrality.
Set extended=True to get extra measurements such as diameter, average closeness and eigenvector centrality,
and eccentricity.

Finally, if we want to know which nodes are the initiators and the effectors of the network,
we can call _get_initiators_effectors_, which returns 2 sets: the initiators and the effectors.
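One plausible reading of these two sets, sketched below in plain Python, is that initiators are nodes with no incoming edge and effectors are nodes with no outgoing edge. That definition is an assumption made for illustration; the function name and the toy edges are hypothetical, and the repo's _get_initiators_effectors_ may differ.

```python
def initiators_and_effectors(edge_pairs):
    """Return (initiators, effectors) for a list of (source, target) pairs.

    Assumed definitions: an initiator is never a target, an effector is
    never a source.
    """
    sources = {source for source, _ in edge_pairs}
    targets = {target for _, target in edge_pairs}
    return sources - targets, targets - sources


# Toy network: A and D feed into B, which regulates C.
toy_edges = [("A", "B"), ("D", "B"), ("B", "C")]
initiators, effectors = initiators_and_effectors(toy_edges)
print(sorted(initiators), sorted(effectors))
```

Here A and D come out as initiators and C as the only effector; B is neither, because it has both incoming and outgoing edges.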

In [32]:
import graph_measurments as gm

# create a directed graph
c1G = gm.carnival2directGraph(weigthSample = edgesC1, inverse=False, verbose=True)

# calculate initiators and effectors
c1I, c1E = gm.get_initiators_effectors(weigthSample=edgesC1, inverse=False)
print("set of initiators: ", c1I)

# calculate measurements
measureC1 = gm.get_measurments(DG= c1G, extended=False)
print("dictionary of measurements: ", measureC1)

# Find all simple paths that connect two nodes
paths = nx.all_simple_paths(c1G, 'PCSK7', 'HIF1A')
print("All paths connecting PCSK7 and HIF1A")
print(list(paths))

set of initiators:  {'ZNF76', 'PCSK7', 'NSD3', 'JAZF1', 'AKAP8', 'TLX1', 'KAT6A', 'POU2AF1', 'PTPRM', 'SP2', 'IRF2BP1', 'BLVRA', 'SUMO2', 'CDKN2B', 'NANOS1', 'RAP1GDS1', 'NKRF', 'SMARCB1', 'SIKE1', 'ATAD2', 'TMEM173', 'GZMA', 'PCGF2', 'PHLPP2', 'MED14', 'PTMA', 'HSBP1', 'E2F7', 'TRIO', 'E2F6', 'HOXB8', 'ZNF521', 'CSNK2A2', 'CDH15', 'FHL5', 'SMARCAD1'}
dictionary of measurements:  {'nNodes': 119, 'nEdges': 113, 'density': 0.008047286711294687, 'avg betweenness centrality': 0.0003073806663038966, 'avg degree centrality': 0.016094573422589374}
All paths connecting PCSK7 and HIF1A
[['PCSK7', 'PPP2CA', 'AKT3', 'EP300', 'STAT3', 'HIF1A'], ['PCSK7', 'PPP2CA', 'AKT3', 'CREB1', 'CREBBP', 'STAT3', 'HIF1A'], ['PCSK7', 'PPP2CA', 'AKT2', 'EP300', 'STAT3', 'HIF1A'], ['PCSK7', 'PPP2CA', 'AKT2', 'CREB1', 'CREBBP', 'STAT3', 'HIF1A']]
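The density and average degree centrality reported for the C1 network (119 nodes, 113 edges) can be checked by hand. The sketch below assumes that the measurements follow the standard networkX definitions for directed graphs: density is m / (n (n - 1)), and the degree centrality of a node is its in-degree plus out-degree divided by n - 1, so the average over all nodes is 2m / (n (n - 1)). The variable names are illustrative.

```python
# Node and edge counts reported for the C1 network in the output above.
n_nodes = 119
n_edges = 113

# Number of possible directed edges between distinct nodes.
possible_edges = n_nodes * (n_nodes - 1)

# Density of a directed graph: fraction of possible edges that are present.
density = n_edges / possible_edges

# Each edge adds one to the degree of two nodes, so the degrees sum to 2m;
# dividing by (n - 1) per node and averaging over n nodes gives 2m / (n (n - 1)).
avg_degree_centrality = 2 * n_edges / possible_edges

print(density, avg_degree_centrality)
```

Both values agree with the reported dictionary to floating-point precision, which also shows that in a directed graph the average degree centrality is always twice the density.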


# Create adjacency matrices and compare them pair-wise

We can create adjacency matrices based on the initial network fed to CARNIVAL (_createAdjacencyMatrix_).
Using this network as a common scaffold gives all matrices the same dimensions, which makes comparing them simpler.

Once the adjacency matrices are produced, they can be compared using _compareAdjacencies_.
This can compare two matrices based only on the occurrence of the interaction in both matrices (weighted=F),
or it can take the weights into account during the comparison (weighted=T).
The output of this function is a list of 3 matrices:

- sharedMTX: shared interactions between both matrices
- uMtx1: unique interactions for matrix 1
- uMtx2: unique interactions for matrix 2
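The idea behind this three-way split can be sketched in plain Python on sparse matrices stored as `{(source, target): value}` dictionaries. This is not the R implementation of _compareAdjacencies_: the function name `compare_adjacencies`, the dictionary representation, and the rule that an interaction is shared only when both entries are equal are assumptions made for illustration.

```python
def compare_adjacencies(adj1, adj2, weighted=False):
    """Split two sparse signed adjacency maps into shared and unique entries."""

    def view(value):
        # Without weights, only the sign (+1 or -1) of an entry is compared.
        return value if weighted else (value > 0) - (value < 0)

    m1 = {edge: view(v) for edge, v in adj1.items() if v != 0}
    m2 = {edge: view(v) for edge, v in adj2.items() if v != 0}
    # Assumed rule: an interaction is shared when both matrices hold the same value.
    shared = {edge: v for edge, v in m1.items() if m2.get(edge) == v}
    unique1 = {edge: v for edge, v in m1.items() if edge not in shared}
    unique2 = {edge: v for edge, v in m2.items() if edge not in shared}
    return shared, unique1, unique2


# Toy matrices holding sign * weight for each interaction.
adj_c1 = {("A", "B"): 100, ("B", "C"): -50, ("C", "D"): 70}
adj_c2 = {("A", "B"): 100, ("B", "C"): -30, ("D", "E"): 20}

shared_sign, u1_sign, u2_sign = compare_adjacencies(adj_c1, adj_c2, weighted=False)
shared_w, u1_w, u2_w = compare_adjacencies(adj_c1, adj_c2, weighted=True)
print(shared_sign, u1_sign, u2_sign)
print(shared_w, u1_w, u2_w)
```

In the toy data, B -> C is shared when only the sign is compared, but it becomes unique to each matrix once the differing weights (-50 and -30) are taken into account.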

In [3]:
%%R -i edgesC1 -i edgesC2 -i omnipathNetwork -o sharedMTX -o uMtx1 -o uMtx2

# load the functions
source('compare_topology_adjacency.r')

# create adjacency matrices with sign (always; - inhibition, + activation) and weights (optional)
adMTX = lapply(list(edgesC1,edgesC2), createAdjacencyMatrix, scafoldNET=omnipathNetwork, weighted=T)
print("Adjacency matrix dimensions")
print(lapply(adMTX, dim))

# compare adjacency matrices based only on the existence of an interaction (weighted=F)
matComparison = compareAdjacencies(adjMAT1=adMTX[[1]], adjMAT2=adMTX[[2]], weighted=F)
print("Type of interactions per matrix (no weight taken into account)")
print(lapply(matComparison, table))

# compare adjacency matrices based on the interaction and its weight (weighted=T)
matComparisonWeight = compareAdjacencies(adjMAT1=adMTX[[1]], adjMAT2=adMTX[[2]], weighted=T)
print("Type of interactions per matrix (using weights)")
print(lapply(matComparisonWeight, table))
 
# Return the comparison to Python. The objects are also kept in R and can be used at any time
sharedMTX = data.frame(matComparison[[1]], stringsAsFactors=F)
uMtx1 = data.frame(matComparison[[2]], stringsAsFactors=F)
uMtx2 = data.frame(matComparison[[3]], stringsAsFactors=F)

[1] "Adjacency matrix dimensions"
[[1]]
[1] 3075 3136

[[2]]
[1] 3075 3136

[1] "Type of interactions per matrix (no weight taken into account)"
$sharedMTX

  -1    0    1 
  34 7928   48 

$uMtx1

  -1    0    1 
  14 7979   17 

$uMtx2

  -1    0    1 
  15 7969   26 

[1] "Type of interactions per matrix (using weights)"
$sharedMTX

-100    0  100 
  16 7962   32 

$uMtx1

-100  -71  -70  -54  -50  -46  -30  -29  -23    0   26   30   31   46   47   54 
   6    3    1    8    1    9    1    2    1 7945    1    1    1    3    1    3 
  69   70   71  100 
   1    1    1   20 

$uMtx2

-100  -68  -60  -56  -52  -50  -48  -44  -25  -16  -14    0   12   16   32   44 
   5    1    1    1    3   10    5    1    4    1    1 7935    1    1    3    1 
  48   50   51   52   56   68  100 
   5    4    1   10    1    3   12 



The objects _sharedMTX_, _uMtx1_, and _uMtx2_ are now available in both environments: Python and R.
Using _adjacency2DG_, we can transform the adjacency matrices into networkX directed graphs,
and then use the functions of _graph_measurments_.
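Conceptually, turning a signed adjacency matrix into a directed graph means reading every non-zero entry as one weighted edge. The plain-Python sketch below shows that step on a toy matrix; it is not the code of _adjacency2DG_, and both the nested-dictionary layout and the convention that rows are sources and columns are targets are assumptions.

```python
# Toy signed adjacency matrix: matrix[source][target] = sign * weight.
# Reading rows as sources and columns as targets is an assumption.
matrix = {
    "A": {"A": 0, "B": 100, "C": 0},
    "B": {"A": 0, "B": 0, "C": -50},
    "C": {"A": 0, "B": 0, "C": 0},
}


def adjacency_to_edges(adj):
    """List (source, target, weight) for every non-zero matrix entry."""
    return [
        (source, target, weight)
        for source, row in adj.items()
        for target, weight in row.items()
        if weight != 0
    ]


edge_list = adjacency_to_edges(matrix)
print(edge_list)
```

A list of `(source, target, weight)` triples like this one is the shape accepted by networkX's `DiGraph.add_weighted_edges_from`, so it could be loaded into a directed graph in one call.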

In [33]:
sDG = gm.adjacency2DG(adjaMTX = sharedMTX)

gm.get_measurments(DG= sDG, extended=False)

{'nNodes': 100,
 'nEdges': 82,
 'density': 0.008282828282828282,
 'avg betweenness centrality': 0.0001082251082251082,
 'avg degree centrality': 0.01656565656565657}