# 4. NetworkX

[NetworkX](https://networkx.github.io/documentation/networkx-1.10/overview.html) is a Python language software package for the creation, manipulation, and study of the structure, dynamics, and function of complex networks.

With NetworkX you can load and store networks in standard and nonstandard data formats, generate many types of random and classic networks, analyze network structure, build network models, design new network algorithms, draw networks, and much more.

Creating a network begins with conceptualising our data into **nodes** and **edges**:

In [None]:
from IPython.display import Image
Image("GraphNodesEdges_1000.png")

Nodes are the elements or objects themselves, and edges are the *relationships* between the nodes.

In [None]:
import networkx as nx

In [None]:
G = nx.Graph()
G

By definition, a Graph is a collection of nodes (vertices) along with identified pairs of nodes (called edges, links, etc). In NetworkX, nodes can be any hashable object e.g. a text string, an image, an XML object, another Graph, a customized node object, etc. (Note: Python’s None object should not be used as a node as it determines whether optional function arguments have been assigned in many functions.)
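To make the "any hashable object" rule concrete, here is a minimal sketch. The node labels are made up for illustration; the only point is that hashable objects are accepted while an unhashable one (a list) is rejected.

```python
import networkx as nx

G = nx.Graph()

# Any hashable object can be a node: integers, strings, tuples, frozensets.
G.add_node(1)
G.add_node("gene_A")                 # hypothetical label, for illustration only
G.add_node((0, 1))                   # a tuple, e.g. a grid coordinate
G.add_node(frozenset({"a", "b"}))

# Unhashable objects such as lists cannot be used as nodes.
try:
    G.add_node([1, 2])
    list_rejected = False
except TypeError:
    list_rejected = True

print(G.number_of_nodes())
print(list_rejected)
```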

## Nodes

We can add one node or several at a time:

In [None]:
G.add_node(1)
G.nodes()

In [None]:
G.add_nodes_from([2, 3])
G.nodes()

## Edges

Adding edges between nodes is equally trivial:

In [None]:
G.add_edge(1,2)
G.edges()

In [None]:
G.add_edges_from([(1,3), (2,3)])
G.edges()

We can remove a node, together with all edges connected to it:

In [None]:
G.remove_node(1)

In [None]:
G.nodes()

In [None]:
G.edges()

Or remove everything:

In [None]:
G.clear()
G.nodes()

If we add nodes that already exist, NetworkX quietly ignores any duplicates:

In [None]:
G.add_nodes_from([1,2,3])
G.add_edges_from([(1,2),(1,3),(2,3)])
G.add_node(2)
G.nodes()

In [None]:
G.add_nodes_from("spam")
G.nodes()

In [None]:
G.number_of_nodes()

In [None]:
G.number_of_edges()
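The `add_nodes_from("spam")` call above can be surprising: a string is itself an iterable, so each character becomes its own node. The sketch below contrasts it with `add_node("spam")` on a fresh graph (named `H` here only to keep it separate from `G`).

```python
import networkx as nx

H = nx.Graph()
H.add_node("spam")          # one node labelled "spam"
H.add_nodes_from("spam")    # four nodes: "s", "p", "a", "m"
H.add_node("s")             # already present, so quietly ignored

print(sorted(H.nodes()))
print(H.number_of_nodes())
```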

You might notice that nodes and edges are not specified as NetworkX objects. This leaves you free to use meaningful items as nodes and edges. The most common choices are numbers or strings, but a node can be any hashable object (except None), and an edge can be associated with any object x using `G.add_edge(n1,n2,object=x)`.

As an example, n1 and n2 could be protein objects from the RCSB Protein Data Bank, and x could refer to an XML record of publications detailing experimental observations of their interaction.
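A rough sketch of that idea follows. The protein names and the publication record are hypothetical placeholders (a plain dict stands in for the XML record); only `add_edge` with a keyword attribute comes from the text above.

```python
import networkx as nx

G = nx.Graph()

# Hypothetical stand-ins: real code might use protein objects and an XML record.
n1, n2 = "protein_A", "protein_B"
x = {"publications": ["placeholder reference"]}

G.add_edge(n1, n2, object=x)

# The object is stored in the edge's attribute dictionary under the key "object".
stored = G[n1][n2]["object"]
print(stored is x)
```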

We have found this power quite useful, but its abuse can lead to surprises unless one is familiar with Python. If in doubt, consider using `convert_node_labels_to_integers()` to obtain a more traditional graph with integer labels.
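As a small illustration of `convert_node_labels_to_integers()`, the sketch below relabels a graph with string nodes. The attribute name `"old_label"` is an arbitrary choice made here so the original labels stay recoverable.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([("s", "p"), ("p", "a"), ("a", "m")])

# Returns a new graph; the original labels are kept in the "old_label" node attribute.
H = nx.convert_node_labels_to_integers(G, first_label=0, label_attribute="old_label")

new_labels = sorted(H.nodes())
old_labels = sorted(nx.get_node_attributes(H, "old_label").values())
print(new_labels)
print(old_labels)
```

The relabelled graph keeps the same structure, so algorithms that expect integer labels can run on `H` while results are mapped back through `old_label`.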

### Accessing Edges

`G.nodes`, `G.edges` and `G.neighbors()` can be iterated over directly, which saves you from building large lists when you only need to loop through them (NetworkX 1.x offered separate iterator methods such as `edges_iter()` for this):

In [None]:
for i in G.edges:
    print(i)

### Direct Access

In [None]:
G[1] # do not change the resulting dict

In [None]:
G[1][2] 

In [None]:
G[1][2]['color']='blue'
G[1][2]

In [None]:
G.add_weighted_edges_from([('s', 'p', 0.5), ('p', 'm', .1), ('m', 3, .7), ('a', 2, .3)])
G.edges(data=True)
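The weights added by `add_weighted_edges_from` are stored under the `weight` key of each edge's attribute dictionary. A minimal sketch on a fresh two-edge graph shows a few ways of reading them back:

```python
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([("s", "p", 0.5), ("p", "m", 0.1)])

# Direct access through the adjacency structure.
w_sp = G["s"]["p"]["weight"]

# Other attributes can sit alongside the weight.
G["p"]["m"]["color"] = "blue"

# data="weight" yields (u, v, weight) triples instead of full attribute dicts.
triples = list(G.edges(data="weight"))

print(w_sp)
print(G["p"]["m"])
print(triples)
```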

### Node Attributes

You can also add these attributes to nodes:

In [None]:
G.nodes[1]['room'] = 700  # G.node is the older spelling and is gone from recent NetworkX releases
G.nodes(data=True)

## Directed Graphs

The DiGraph class provides additional methods specific to directed edges, e.g. `DiGraph.out_edges()`, `DiGraph.in_degree()`, `DiGraph.predecessors()`, `DiGraph.successors()` etc. To allow algorithms to work with both classes easily, the directed versions of `neighbors()` and `degree()` are equivalent to `successors()` and the sum of `in_degree()` and `out_degree()` respectively even though that may feel inconsistent at times.

In [None]:
DG = nx.DiGraph()
DG.add_weighted_edges_from([(1,2,.5), (1,3,.75),(2,3,.25)])
DG.nodes()

In [None]:
DG.out_degree()

In [None]:
DG.degree()

In [None]:
list(DG.successors(1))  # successors() returns an iterator in NetworkX 2.x and later
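The equivalences described at the start of this section can be checked directly. The sketch below rebuilds the same small `DG` so that it runs on its own:

```python
import networkx as nx

DG = nx.DiGraph()
DG.add_weighted_edges_from([(1, 2, 0.5), (1, 3, 0.75), (2, 3, 0.25)])

# neighbors() on a DiGraph follows outgoing edges, exactly like successors().
same_neighbors = list(DG.neighbors(1)) == list(DG.successors(1))

# degree() is the sum of in_degree() and out_degree() for every node.
degree_is_sum = all(
    DG.degree(n) == DG.in_degree(n) + DG.out_degree(n) for n in DG.nodes()
)

# Incoming edges are reached through predecessors().
preds_of_3 = sorted(DG.predecessors(3))

# Passing weight= sums edge weights instead of counting edges.
weighted_out_1 = DG.out_degree(1, weight="weight")

print(same_neighbors, degree_is_sum, preds_of_3, weighted_out_1)
```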

In [None]:
# convert to undirected
G = nx.Graph(DG)
G

## Graph Generators and Operations

In [None]:
petersen = nx.petersen_graph()
print(petersen.nodes())
print(petersen.edges())

In [None]:
tutte = nx.tutte_graph()
maze = nx.sedgewick_maze_graph()
tet = nx.tetrahedral_graph()

Constructive generators for classic graphs:

In [None]:
K5 = nx.complete_graph(5)
K5.edges()

In [None]:
K35 = nx.complete_bipartite_graph(3,5)
print(K35.nodes())
print(K35.edges())
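A quick sanity check on generated graphs is to compare their sizes with the known values: the Petersen graph has 10 nodes and 15 edges, the complete graph on 5 nodes has 5·4/2 = 10 edges, and the complete bipartite graph on 3 and 5 nodes has 3·5 = 15 edges. A minimal sketch:

```python
import networkx as nx

petersen = nx.petersen_graph()
K5 = nx.complete_graph(5)
K35 = nx.complete_bipartite_graph(3, 5)

# Collect (nodes, edges) counts for each generated graph.
sizes = {
    "Petersen": (petersen.number_of_nodes(), petersen.number_of_edges()),
    "K5": (K5.number_of_nodes(), K5.number_of_edges()),
    "K3,5": (K35.number_of_nodes(), K35.number_of_edges()),
}

for name, (n_nodes, n_edges) in sizes.items():
    print(name, n_nodes, n_edges)
```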

## Reading/Writing Graphs

NetworkX can read and write graphs stored in files using common graph formats, such as edge lists, adjacency lists, GML, GraphML, pickle, LEDA and others.

In [None]:
K10 = nx.complete_graph(10)
nx.write_gml(K10, "toy_graph.gml")

In [None]:
mygraph = nx.read_gml("toy_graph.gml")
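The same round trip works for the other formats. As a sketch, here is the edge list variant; the file name is hypothetical, and a temporary directory is used so nothing is left behind. Note that `nodetype=int` is needed to get integer labels back, since labels are read from text.

```python
import os
import tempfile

import networkx as nx

K10 = nx.complete_graph(10)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "toy_graph.edgelist")  # hypothetical file name
    nx.write_edgelist(K10, path)
    # Without nodetype=int the node labels would come back as strings.
    restored = nx.read_edgelist(path, nodetype=int)

print(sorted(restored.nodes()) == sorted(K10.nodes()))
print(restored.number_of_edges() == K10.number_of_edges())
```

An edge list only records edges, so isolated nodes would not survive this round trip; the complete graph used here has none.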

## Analysing Graphs

In [None]:
for i in nx.connected_components(mygraph):
    print(i)

In [None]:
nx.degree(mygraph)

In [None]:
nx.clustering(mygraph)
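For a complete graph the results of these functions are easy to predict, which makes it a useful check: there is a single connected component, every node has degree n − 1, and every clustering coefficient is 1. The sketch below builds the complete graph directly rather than reading it from file:

```python
import networkx as nx

K10 = nx.complete_graph(10)

components = list(nx.connected_components(K10))   # list of node sets
degrees = dict(nx.degree(K10))                    # node -> degree
clustering = nx.clustering(K10)                   # node -> clustering coefficient

print(len(components))
print(set(degrees.values()))
print(set(clustering.values()))
```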

## Drawing Graphs

NetworkX is not primarily a graph drawing package, but it includes basic drawing with Matplotlib as well as an interface to the open source Graphviz software package. These are part of the `networkx.drawing` package and will be imported if possible.

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
nx.draw(mygraph)

In [None]:
nx.draw_circular(mygraph)
plt.savefig("circular_complete.png")

In [None]:
nx.draw_spectral(mygraph)

In [None]:
nx.draw_random(mygraph)

## Task

Gene-Gene interactions (GGI) are part of the large network of gene-protein analysis that occurs in *Bioinformatics*. Here we draw from a subset of GGIs that have a weighted value between interacting genes in GGI.txt.

Build a NetworkX Graph from GGI.gml.

Draw it with a circular layout, using $\alpha=0.5$ and a node size of 20. Weight the connections using the *Weight* attribute of each connection, and colour them using a cmap from Matplotlib.

In [None]:
import numpy as np
import pandas as pd
# your code here