# 08 Performing Analytics on a Multimodal Cancer Network

Mambo is built on top of the [SNAP platform](http://snap.stanford.edu/) and can leverage the [state-of-the-art large-scale network analytics methods provided by SNAP](http://snap.stanford.edu/proj/snap-www/).

To perform analytics, we first convert the multimodal network into a directed graph (TNGraph), an undirected graph (TUNGraph), or a directed multigraph with attributes (TNEANet), depending on the desired analysis. We then conduct the analysis and run machine learning algorithms on the converted graph.

TUNGraph, TNGraph, and TNEGraph are the three main graph types in SNAP:

* [TUNGraph](https://snap.stanford.edu/snappy/doc/reference/graphs.html#tungraph) is an undirected graph (a single edge between an unordered pair of nodes),
* [TNGraph](https://snap.stanford.edu/snappy/doc/reference/graphs.html#tngraph) is a directed graph (a single directed edge between an ordered pair of nodes),
* [TNEGraph](https://snap.stanford.edu/snap/doc/snapuser-ref/de/de8/classTNEGraph.html) is a directed multigraph (multiple directed edges between a pair of nodes).

The fourth graph type is [TNEANet](https://snap.stanford.edu/snap/doc/snapuser-ref/d6/db6/classTNEANet.html), which is similar to TNEGraph but also allows attributes on nodes and edges. 
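
To make the distinction concrete, the sketch below counts how many edges remain when the same edge list is stored under each of the three semantics. It uses plain Python sets rather than SNAP objects, and the edge list is made up for illustration.

```python
# Hypothetical edge list with a repeated edge and a reciprocal pair.
edges = [(1, 2), (1, 2), (2, 1), (2, 3)]

# TNEGraph-like: every directed edge is kept, including repeats.
multi_edges = len(edges)

# TNGraph-like: at most one edge per ordered pair (src, dst).
directed_edges = len(set(edges))

# TUNGraph-like: at most one edge per unordered pair {u, v}.
undirected_edges = len(set(frozenset(e) for e in edges))

print("multigraph: %d, directed: %d, undirected: %d"
      % (multi_edges, directed_edges, undirected_edges))
```

The same three counts appear later in this notebook as `Edges`, `Unique directed edges`, and `Unique undirected edges` in the network statistics.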

In [4]:
import snap
import time

from utils.network_utils import get_num_elem_per_mode

### We begin by loading the multimodal cancer network from the file

In [5]:
# This network was constructed and saved to a file in notebook:
# 06 Constructing a Multimodal Network from Mode and Link Tables
filename = "output/cancer_example/cancer_example.graph"
FIn = snap.TFIn(filename)
Graph = snap.TMMNet.Load(FIn)

### Print the number of modes and link types to check that the network has been loaded correctly

In [30]:
print 'Modes: %d' % Graph.GetModeNets()
print 'Link types: %d' % Graph.GetCrossNets()

Modes: 5
Link types: 32


### Convert the multimodal network into a directed network (TNEANet)

In [11]:
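# Collect the IDs of all link types (crossnets) in the multimodal network.
crossnetids = snap.TIntV()
crossneti = Graph.BegCrossNetI()
while crossneti < Graph.EndCrossNetI():
    crossnetids.Add(crossneti.GetCrossId())
    crossneti.Next()

# The attribute mappings are left empty, so no attribute mappings are requested.
nodeattrmapping = snap.TIntStrStrTrV()
edgeattrmapping = snap.TIntStrStrTrV()

# Time the conversion of the selected crossnets into a single TNEANet.
start_time = time.time()
DirectedNetwork = Graph.ToNetwork(crossnetids, nodeattrmapping, edgeattrmapping)
end_time = time.time()
print "Converting to TNEANet  takes %s seconds" % (end_time - start_time)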

Converting to TNEANet  takes 3.14202690125 seconds


### Analyze the resulting network, starting with basic network statistics

In [17]:
snap.PrintInfo(DirectedNetwork, "Python type PNEANet", "output/output.txt", False)

In [22]:
map(lambda x: x.replace("\n", ""), open("output/output.txt").readlines())

['Python type PNEANet: Directed Multigraph',
 '  Nodes:                    24119',
 '  Edges:                    1805297',
 '  Zero Deg Nodes:           9744',
 '  Zero InDeg Nodes:         18300',
 '  Zero OutDeg Nodes:        10590',
 '  NonZero In-Out Deg Nodes: 4973',
 '  Unique directed edges:    900672',
 '  Unique undirected edges:  892822',
 '  Self Edges:               0',
 '  BiDir Edges:              19756',
 '  Closed triangles:         446625',
 '  Open triangles:           786261314',
 '  Frac. of closed triads:   0.000568',
 '  Connected component size: 0.537543',
 '  Strong conn. comp. size:  0.018823',
 '  Approx. full diameter:    13',
 '  90% effective diameter:  5.025004']
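
Some of these statistics can be cross-checked against each other with simple arithmetic. The sketch below recomputes the fraction of closed triads, assuming it is defined as closed / (closed + open), and the average number of parallel edges per unique directed edge. The numbers are copied from the output above; the variable names are illustrative only.

```python
# Values copied from the PrintInfo output above.
edges = 1805297
unique_directed_edges = 900672
closed_triangles = 446625
open_triangles = 786261314

# Fraction of closed triads, assuming closed / (closed + open).
frac_closed = closed_triangles / float(closed_triangles + open_triangles)

# Average number of parallel edges per unique directed edge.
avg_multiplicity = edges / float(unique_directed_edges)

print("Frac. of closed triads: %.6f" % frac_closed)
print("Average edge multiplicity: %.3f" % avg_multiplicity)
```

An average multiplicity of about 2 shows why the network is stored as a multigraph: on average, each connected ordered pair of nodes is linked by roughly two parallel edges.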

### Estimate the network diameter using BFS from 10 random start nodes

In [23]:
print "Diameter %d" % snap.GetBfsFullDiam(DirectedNetwork, 10)

Diameter 10
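
`GetBfsFullDiam` approximates the diameter by running breadth-first search from a sample of start nodes (10 here) and keeping the longest shortest-path distance it finds. The result is therefore a lower bound. It can change between runs and can differ from the approximate diameter reported by `PrintInfo`. The plain-Python sketch below illustrates the idea on a small hypothetical path graph; it is not SNAP's implementation, and the function names are made up for this example.

```python
import random
from collections import deque

def bfs_eccentricity(adj, start):
    """Return the largest BFS distance from start to any reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

def estimate_diameter(adj, n_test_nodes, seed=0):
    """Lower-bound the diameter using BFS from a random sample of nodes."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    starts = rng.sample(nodes, min(n_test_nodes, len(nodes)))
    return max(bfs_eccentricity(adj, s) for s in starts)

# Hypothetical undirected path graph 0 - 1 - ... - 9 (true diameter 9).
adj = {i: [] for i in range(10)}
for i in range(9):
    adj[i].append(i + 1)
    adj[i + 1].append(i)

print("1 start node: %d" % estimate_diameter(adj, 1))
print("all 10 nodes: %d" % estimate_diameter(adj, 10))
```

With a single start node the estimate depends on which node is sampled. Using every node as a start recovers the true diameter, at the cost of one BFS per node.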


### Compute size distribution of weakly connected components (WCCs)

In [25]:
CntV = snap.TIntPrV()
snap.GetWccSzCnt(DirectedNetwork, CntV)
sizestring = ""
for p in CntV:
    sizestring += "%d\t\t%d\n" % (p.GetVal1(), p.GetVal2())
print 'WCC Size\tCount'
print sizestring

WCC Size	Count
1		9744
2		6
3		1
15		1
471		1
909		1
12965		1



## Network Science Analytics and Machine Learning on Graphs

Mambo provides [a variety of other network analytics methods](https://snap.stanford.edu/snappy/doc/reference/index-ref.html), which are available through SNAP. 

Prominent examples include: 
* **Connected components**
* **Breadth and depth first search**
* **Node centrality measures**
* **Network community detection**
* **Triads and clustering coefficient**
* **K-core computations**
* **Approximate neighborhoods**
* **Eigen and singular value decomposition**
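
As one illustration of what such a method computes, the k-core of a graph is the largest subgraph in which every node has degree at least k. It can be found by repeatedly removing nodes whose degree is below k. The sketch below is plain Python on a small hypothetical graph; the function `k_core` is made up for this example and is not part of the SNAP API.

```python
def k_core(adj, k):
    """Return the node set of the k-core of an undirected graph."""
    # Work on a copy so the caller's adjacency sets are left untouched.
    adj = {u: set(vs) for u, vs in adj.items()}
    to_remove = [u for u in adj if len(adj[u]) < k]
    while to_remove:
        u = to_remove.pop()
        if u not in adj:
            continue  # already removed earlier
        for v in adj.pop(u):
            adj[v].discard(u)
            if len(adj[v]) < k:
                to_remove.append(v)
    return set(adj)

# Hypothetical graph: a triangle (1, 2, 3) with a pendant node 4 attached to 3.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

print(sorted(k_core(graph, 2)))
```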

The [reference manual](https://snap.stanford.edu/snappy/doc/reference/index-ref.html) provides code documentation, tutorials, and usage examples.