# [NTDS'19] demo 4: Network models with NetworkX
[ntds'19]: https://github.com/mdeff/ntds_2019
adapted from [NetworkX demo of NTDS 2017](https://github.com/mdeff/ntds_2017/blob/master/demos/04_networkx.ipynb)

Effrosyni Simou, [EPFL LTS4](http://lts4.epfl.ch)

In this session we introduce NetworkX, explore some of the most common network models, examine their basic properties, and compare them.

# Creating graphs using network models
There are many libraries for creating and manipulating graph data. We will use NetworkX to create basic network models, as they are already implemented in the library. The full documentation of NetworkX (version 2.3 is installed in the environment of this class) is available [here](https://networkx.github.io/documentation/stable/).

In [None]:
%matplotlib inline

import random
import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
import collections
import warnings
warnings.filterwarnings('ignore')

Create an Erdos-Renyi graph with $N=100$ vertices in which each pair of vertices is connected independently with probability $p=0.15$.

In [None]:
N = 100  # number of nodes
p = 0.15  # probability of connection
er = nx.erdos_renyi_graph(N,p)
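
As a quick sanity check, the expected number of edges in this model is $p \cdot N(N-1)/2$, since each of the $N(N-1)/2$ possible pairs is connected independently with probability $p$. The sketch below compares that value with one sampled graph. The `seed` argument and the variable names are additions made here for reproducibility and are not part of the original demo.

```python
import networkx as nx

N = 100   # number of nodes
p = 0.15  # probability of connection

# seed fixed only for reproducibility (an assumption of this sketch)
er_check = nx.erdos_renyi_graph(N, p, seed=0)

expected_edges = p * N * (N - 1) / 2   # mean of a Binomial(N(N-1)/2, p)
observed_edges = er_check.number_of_edges()
print(expected_edges, observed_edges)
```

The observed count fluctuates around the expected one from sample to sample, so the two numbers should be close but will rarely match exactly.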

You can retrieve the adjacency matrix of the graph from the Graph object ``er`` as follows:

In [None]:
er_adj = nx.adjacency_matrix(er, range(N))
er_adj = er_adj.todense()
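
To see what the adjacency matrix encodes, here is a minimal sketch on a small deterministic graph (a 4-node cycle, chosen here only for illustration). It uses `nx.to_numpy_array`, which returns a dense NumPy array directly. For an undirected graph the matrix is symmetric, and each row sum equals the degree of the corresponding node.

```python
import networkx as nx
import numpy as np

cycle = nx.cycle_graph(4)                              # edges 0-1, 1-2, 2-3, 3-0
A = nx.to_numpy_array(cycle, nodelist=[0, 1, 2, 3])    # dense adjacency matrix

is_symmetric = np.array_equal(A, A.T)                  # undirected => symmetric
row_sums = A.sum(axis=1)                               # one entry per node
degrees = [cycle.degree(n) for n in range(4)]
print(A)
print(is_symmetric, row_sums, degrees)
```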

You can now visualise the adjacency matrix:

In [None]:
plt.spy(er_adj);

# Plot graphs using NetworkX

With NetworkX and Matplotlib we can also plot a graph. For example, we can plot the Erdos-Renyi graph that we created before as follows:

In [None]:
nx.draw(er)

It is easy to add or remove both nodes and edges. If we add an edge between nodes that do not yet exist, those nodes are created automatically.

In [None]:
er.add_node(100)

In [None]:
er.nodes()

You can add or remove nodes and edges either one at a time or as a collection:
* Adding nodes with:
    - **G.add_node** : One node at a time
    - **G.add_nodes_from** : A container of nodes
* Adding edges with:
    - **G.add_edge**: One edge at a time
    - **G.add_edges_from** : A container of edges
    
    
* Removing nodes with:
    - **G.remove_node** : One node at a time
    - **G.remove_nodes_from** : A container of nodes
* Removing edges with:
    - **G.remove_edge**: One edge at a time
    - **G.remove_edges_from** : A container of edges
    
You can check the number of edges with **G.size()**.
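
The methods listed above can be tried on a tiny graph where the result is easy to follow by hand. This is a minimal sketch; the graph `G` and the node labels are made up for illustration.

```python
import networkx as nx

G = nx.Graph()
G.add_node(0)                         # one node
G.add_nodes_from([1, 2, 3])           # several nodes
G.add_edge(0, 1)                      # one edge
G.add_edges_from([(1, 2), (2, 3)])    # several edges
G.add_edge(4, 5)                      # nodes 4 and 5 are created automatically
print(G.number_of_nodes(), G.size())  # 6 4

G.remove_edge(4, 5)                   # removes the edge, keeps nodes 4 and 5
G.remove_nodes_from([4, 5])
G.remove_node(3)                      # also removes the incident edge (2, 3)
print(sorted(G.nodes()), G.size())    # [0, 1, 2] 2
```

Note that removing a node also removes every edge attached to it, which is why the edge count drops when node 3 is removed.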

Add an edge between two non-existent vertices. Remove the first 50 nodes (nodes 0 to 49). Draw the graph after each change.

In [None]:
er.add_edge(101,102)
nx.draw(er)

In [None]:
er.remove_nodes_from(range(50))
nx.draw(er)
er.nodes()

In [None]:
er.size()

## Exercise:
Create a Barabasi-Albert graph and a Watts-Strogatz graph and plot them.

In [None]:
# Create a Barabasi-Albert graph
ba = # your code here

In [None]:
# Create a Watts-Strogatz graph
ws = # your code here
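
One possible way to approach this exercise is with the NetworkX generators `nx.barabasi_albert_graph(n, m)` and `nx.watts_strogatz_graph(n, k, p)`. The parameter values below (`m`, `k`, `p_rewire`, and the seed) are illustrative choices, not values prescribed by the exercise.

```python
import networkx as nx

n = 100          # number of nodes
m = 3            # BA: number of edges attached from each new node (illustrative)
k = 4            # WS: each node starts joined to its k nearest ring neighbours (illustrative)
p_rewire = 0.1   # WS: probability of rewiring each edge (illustrative)

ba_example = nx.barabasi_albert_graph(n, m, seed=0)
ws_example = nx.watts_strogatz_graph(n, k, p_rewire, seed=0)

print(ba_example.number_of_edges(), ws_example.number_of_edges())
# nx.draw(ba_example) and nx.draw(ws_example) would plot the two graphs
```

Rewiring in the Watts-Strogatz model moves edges without adding or deleting any, so that graph always has $nk/2$ edges.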

# Degree distribution of known network models

**G.degree()** returns a ``DegreeView`` object containing (node, degree) pairs. If we pass a node, as in **G.degree(n)**, it returns the degree of that node.
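
A minimal sketch of both uses on a 4-node path graph (the graph is chosen here only because its degrees are obvious):

```python
import networkx as nx

path = nx.path_graph(4)          # 0 - 1 - 2 - 3

print(list(path.degree()))       # [(0, 1), (1, 2), (2, 2), (3, 1)]
print(path.degree(1))            # 2
degree_of = dict(path.degree())  # plain dict: node -> degree
print(degree_of[3])              # 1
```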

Create an Erdos-Renyi network and plot a histogram of node degrees.  

In [None]:
N = 100  # number of nodes
p = 0.15  # probability of connection
er = nx.erdos_renyi_graph(N,p)

In [None]:
d = er.degree()

In [None]:
print(d)

In [None]:
# Erdos-Renyi node degree histogram
degree_sequence = sorted([d for n, d in er.degree()], reverse=True)  # degree sequence
degreeCount = collections.Counter(degree_sequence)
deg, cnt = zip(*degreeCount.items())

fig, ax = plt.subplots()
plt.bar(deg, cnt, width=0.80, color='b');
plt.title("Degree Histogram");
plt.ylabel("Count");
plt.xlabel("Degree");
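
The counting step of the histogram can be checked on a graph whose degrees are known in advance. The sketch below uses a star graph (an illustrative choice) and also shows `nx.degree_histogram`, a built-in that returns the same counts as a list indexed by degree.

```python
import collections
import networkx as nx

star = nx.star_graph(4)   # node 0 is the hub, nodes 1..4 are leaves

# Same counting approach as in the cell above
degree_count = collections.Counter(d for _, d in star.degree())
print(sorted(degree_count.items()))   # [(1, 4), (4, 1)]

# Built-in alternative: index = degree, value = number of nodes with that degree
hist = nx.degree_histogram(star)
print(hist)                           # [0, 4, 0, 0, 1]
```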

## Exercise: 
Try to fit a Poisson distribution.

In [None]:
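# Poisson probability mass function: P(K = k) = exp(-mu) * mu**k / k!
import math
def poisson(mu, k):
    # int(k) because math.factorial rejects floats (e.g. values from np.linspace)
    return np.exp(-mu) * mu**k / math.factorial(int(k))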

In [None]:
# Erdos-Renyi node degree histogram
degree_sequence = sorted([d for n, d in er.degree()], reverse=True)  # degree sequence
degreeCount = collections.Counter(degree_sequence)
deg, cnt = zip(*degreeCount.items())

fig, ax = plt.subplots()
plt.bar(deg, cnt, width=0.80, color='b')

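# Poisson distribution with the empirical mean degree (2 * edges / nodes)
mu = 2*er.size()/N
k = np.arange(0, 26)  # integer degrees 0..25
expected_cnt = [N*poisson(mu, i) for i in k]  # expected number of nodes per degree
plt.plot(k, expected_cnt, color='r');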

plt.title("Degree Histogram");
plt.ylabel("Count");
plt.xlabel("Degree");
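
The value of `mu` used above is the average degree: every edge contributes to the degree of two nodes, so the degrees sum to twice the number of edges. In an Erdos-Renyi graph each degree follows a Binomial$(N-1, p)$ distribution, which is close to a Poisson distribution with mean $(N-1)p$ when $N$ is large and $p$ is small. The sketch below uses only the standard library to compare the two for the parameters of this demo; the helper names are hypothetical.

```python
import math

def poisson_pmf(mu, k):
    # evaluated in log space to avoid overflow of mu**k and k! for large k
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def binomial_pmf(n, p, k):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

N, p = 100, 0.15
mu = (N - 1) * p   # expected degree of a node in G(N, p)

# columns: degree, binomial probability, Poisson probability
for k in (5, 10, 15, 20, 25):
    print(k, round(binomial_pmf(N - 1, p, k), 4), round(poisson_pmf(mu, k), 4))

total = sum(poisson_pmf(mu, k) for k in range(200))   # should be close to 1
```

With $p=0.15$ the approximation is reasonable but not exact: the binomial has a slightly smaller variance, so its peak is somewhat higher than the Poisson one.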

## Exercise: 
Repeat the same exercise for the Barabasi-Albert and Watts-Strogatz networks.

In [None]:
# your code here

# Create a Random manifold-based network

We can also create a graph on our own. This sort of manifold-based graph is often used in practice when we need a graph representation of data lying on a manifold. Generate 100 two-dimensional data points with both coordinates drawn from a uniform random distribution between 0 and 1. These will be the coordinates of your nodes. Connect two nodes if their Euclidean distance is smaller than the threshold 0.2. In that case, the weight of the edge should be equal to $w(i,j) = \exp \left(-{\frac {dist(i,j)^{2}}{2\theta ^{2}}}\right)$. For this experiment, set $\theta$ to 0.9.

In [None]:
def random_gaussian(nodes_coords, theta, threshold):
    """Weighted adjacency matrix: Gaussian kernel on pairs closer than threshold."""
    n = len(nodes_coords)
    adj = np.zeros((n, n))
    for i in range(n):
        for j in range(i+1, n):
            dist = np.linalg.norm(nodes_coords[i]-nodes_coords[j])
            if dist < threshold:
                # w(i,j) = exp(-dist^2 / (2 theta^2)): the distance is squared
                w = np.exp(-dist**2/(2*theta**2))
                adj[i, j] = w
                adj[j, i] = w  # undirected graph: symmetric matrix
    return adj
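
For larger point sets the double loop becomes slow. A vectorized variant can be sketched with NumPy broadcasting, following the weight formula given in the text. The function name `gaussian_adjacency` and the seeded generator are assumptions made for this example.

```python
import numpy as np

def gaussian_adjacency(coords, theta, threshold):
    # pairwise coordinate differences via broadcasting: shape (n, n, d)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)          # (n, n) Euclidean distances
    weights = np.exp(-dist**2 / (2 * theta**2))   # Gaussian kernel from the formula
    mask = dist < threshold                       # keep only close pairs
    np.fill_diagonal(mask, False)                 # no self-loops
    return np.where(mask, weights, 0.0)

rng = np.random.default_rng(0)                    # seed chosen only for reproducibility
coords = rng.random((100, 2))                     # uniform points in the unit square
adj_vec = gaussian_adjacency(coords, theta=0.9, threshold=0.2)
print(adj_vec.shape, np.allclose(adj_vec, adj_vec.T))
```

With $\theta = 0.9$ and a threshold of 0.2, every retained weight lies between $\exp(-0.2^2 / (2 \cdot 0.9^2)) \approx 0.976$ and 1, so the graph is close to unweighted.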

In [None]:
# generate the coordinates of the nodes 
nodes_coords = np.random.rand(100, 2)

In [None]:
# generate the adjacency matrix of the manifold-based graph
adj = random_gaussian(nodes_coords, theta=0.9, threshold=0.2)

In [None]:
# plot the adjacency matrix
plt.spy(adj);

Plot the graph using NetworkX. 
* Hints: 
    - `nx.from_numpy_array(adj)` creates a graph object from an adjacency matrix (in numpy form)
    - `nx.draw(G,pos)` will draw vertices at coordinates specified in pos. Variable pos is a dictionary assigning a pair of coordinates to each node.
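
Both hints can be tried on a three-node example before applying them to the full graph. The matrix `W` and the coordinates below are made up for illustration.

```python
import networkx as nx
import numpy as np

# Small symmetric weight matrix: edges 0-1 (weight 0.5) and 1-2 (weight 0.8)
W = np.array([[0.0, 0.5, 0.0],
              [0.5, 0.0, 0.8],
              [0.0, 0.8, 0.0]])
coords = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])

G_small = nx.from_numpy_array(W)       # non-zero entries become weighted edges
pos_small = dict(enumerate(coords))    # node index -> (x, y) coordinates

print(sorted(G_small.edges(data="weight")))
# nx.draw(G_small, pos_small) would place each node at its coordinates
```

Each matrix entry is stored in the `weight` attribute of the corresponding edge, and zero entries produce no edge.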

In [None]:
g = nx.from_numpy_array(adj)  # from_numpy_matrix was removed in NetworkX 3.0
plt.spy(nx.adjacency_matrix(g).todense());

In [None]:
pos = dict(zip(range(100),nodes_coords))

In [None]:
nx.draw(g, pos)

Plot the degree distribution of this graph. What can you say about the distribution?

In [None]:
# node degree histogram
degree_sequence = sorted([d for n, d in g.degree()], reverse=True)  # degree sequence
degreeCount = collections.Counter(degree_sequence)
deg, cnt = zip(*degreeCount.items())

fig, ax = plt.subplots()
plt.bar(deg, cnt, width=0.80, color='b');
plt.title("Degree Histogram");
plt.ylabel("Count");
plt.xlabel("Degree");