<span>
<b>Python version:</b>  >=3.7<br/>
<b>Networkx version:</b>  >=2.3<br/>
<b>Last update:</b> 22/11/2021
</span>

<a id='top'></a>
# *Chapter 2: Networks & Graphs*

``networkx`` is a Python library for the analysis of complex networks.

This notebook introduces some of the library's main features and gives an overview of its functionality.

**Note:** this notebook is deliberately not comprehensive: it covers only the basics you need to get started. <br/> Complete documentation (and a tutorial) is available on the project [website](https://networkx.github.io/documentation/latest/).

**Note 2:** textbooks that approach network analysis (practice and theory) using ``networkx`` include:
- "Complex Network Analysis in Python"
Dmitry Zinoviev, The Pragmatic Programmer. 2018.
- "A First Course in Network Science"
Menczer, Fortunato, and Davis. 2020.

## Importing the library
As a first step just import the ``networkx`` library.

In [None]:
import networkx as nx

In [None]:
import warnings
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
warnings.filterwarnings('ignore')

In our example we will not only analyse graphs but also visualise them, which is why ``matplotlib`` is imported as well.

In [None]:
%matplotlib inline 

## Design our first graph

``networkx`` provides support for several graph models. 

Among them:
- undirected graphs, available through the ``Graph`` class
- directed graphs, available through the ``DiGraph`` class

In this brief tutorial we will focus only on undirected graphs.
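
As a quick illustration of the difference between the two classes, the sketch below adds the same edge to a ``Graph`` and to a ``DiGraph`` and checks whether the reverse edge exists. The variable names ``ug`` and ``dg`` are illustrative only.

```python
import networkx as nx

# Undirected: the edge (a, b) is the same edge as (b, a)
ug = nx.Graph()
ug.add_edge("a", "b")

# Directed: the edge a -> b does not imply b -> a
dg = nx.DiGraph()
dg.add_edge("a", "b")

print(ug.has_edge("b", "a"))  # True
print(dg.has_edge("b", "a"))  # False
```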

We can build a graph by adding nodes and edges as follows:

In [None]:
g = nx.Graph(name='first network')

# g.add_node("a")  # optional: add_edge creates missing endpoints automatically
g.add_edge("a", "b")
g.add_edge("a", "c")
g.add_edge("b", "c")

nx.draw(g, with_labels=True)

In [None]:
g.name

Nodes and edges can also be easily removed:

In [None]:
g.remove_node("a")
g.remove_edge("b", "c")

#re-draw the graph
nx.draw(g, with_labels=True)
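
The sketch below rebuilds the triangle under a separate, illustrative name (``tri``) to show what the two removals do: removing a node also removes every edge incident to it, while removing an edge leaves both endpoints in the graph.

```python
import networkx as nx

tri = nx.Graph()
tri.add_edges_from([("a", "b"), ("a", "c"), ("b", "c")])

tri.remove_node("a")       # also drops the edges (a, b) and (a, c)
print(list(tri.edges()))   # only the edge between b and c is left

tri.remove_edge("b", "c")  # b and c stay in the graph, now isolated
print(list(tri.nodes()), tri.number_of_edges())
```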

## Reading a graph from file
``networkx`` natively supports several network file formats.

Among them, one of the most frequently used in online repositories is the *edgelist* format.

An edge list is a text file (usually saved as .csv) in which each line identifies an edge. <br/>
For instance, the triangle defined before can be described as:

    a,b
    b,c
    c,a

To preview and then read an edgelist file (here ``network.csv``, whose node labels are integers), just write:

In [None]:
pd.read_csv("network.csv", 
            sep=',', header=None, names=['node1', 'node2']).head()

In [None]:
g = nx.read_edgelist("network.csv", 
                     delimiter=",", nodetype=int)

In [None]:
print(g)

Similarly, a graph can be written to file using ``nx.write_edgelist(g, filename)``.

For all the I/O methods, refer to the [official documentation](https://networkx.github.io/documentation/latest/reference/readwrite/index.html).
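
A minimal round-trip sketch, assuming a temporary file and string node labels: it writes the triangle from the example above as a comma-separated edge list and reads it back. The file name ``triangle.csv`` and the variables ``tri`` and ``loaded`` are made up for illustration.

```python
import os
import tempfile
import networkx as nx

tri = nx.Graph()
tri.add_edges_from([("a", "b"), ("b", "c"), ("c", "a")])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "triangle.csv")
    # data=False writes only the two endpoints on each line
    nx.write_edgelist(tri, path, delimiter=",", data=False)
    # nodetype=str keeps the labels as strings (use int for numeric ids)
    loaded = nx.read_edgelist(path, delimiter=",", nodetype=str)

print(sorted(loaded.nodes()), loaded.number_of_edges())
```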

## Accessing nodes and edges
Given a ``Graph`` object, it is possible to iterate over its nodes with a simple ``for`` loop:

In [None]:
for n in g.nodes():
    # do something
    pass

Following a similar rationale, it is also possible to loop over the edge set:

In [None]:
for e in g.edges():
    # do something
    pass

All graph entities (the graph itself, its nodes and its edges) can store additional attributes (weights, labels...).

For further details, refer to the [official documentation](https://networkx.github.io/documentation/latest/tutorial.html#adding-attributes-to-graphs-nodes-and-edges).
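
As a small sketch of how attributes are set and read back, the example below attaches a ``label`` to a node and a ``weight`` to each edge. The graph ``h`` and the attribute values are illustrative.

```python
import networkx as nx

h = nx.Graph()
h.add_node("a", label="source")   # node attribute
h.add_edge("a", "b", weight=0.5)  # edge attribute
h.add_edge("b", "c", weight=2.0)

# data=True yields (node, attribute_dict) pairs
for n, attrs in h.nodes(data=True):
    print(n, attrs)

# data="weight" yields (u, v, weight) triples
for u, v, w in h.edges(data="weight"):
    print(u, v, w)
```

Attributes can also be read directly, for example ``h.nodes["a"]["label"]`` or ``h["a"]["b"]["weight"]``.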

## Network base statistics
``networkx`` makes it possible to manipulate nodes and edges, count them, and extract relevant global features.

In [None]:
g.number_of_nodes()

In [None]:
g.number_of_edges()

In [None]:
g.is_directed()

## Degrees and Degree distribution
Node degree can be easily obtained as follows:

In [None]:
g.degree(7) # degree of node 7

In [None]:
g.degree()

Similarly, the average degree can be computed with:

In [None]:
sum(dict(g.degree()).values())/float(len(g))
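
In an undirected graph every edge contributes to the degree of two nodes, so the average degree also equals 2m/n, with m edges and n nodes. The sketch below checks this on a small path graph; ``p`` is an illustrative name.

```python
import networkx as nx

p = nx.path_graph(4)  # 0 - 1 - 2 - 3, degrees 1, 2, 2, 1

avg_from_degrees = sum(d for _, d in p.degree()) / p.number_of_nodes()
avg_from_counts = 2 * p.number_of_edges() / p.number_of_nodes()

print(avg_from_degrees, avg_from_counts)  # 1.5 1.5
```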

An easy way to compute, and visualise, the degree distribution is the following

In [None]:
hist = nx.degree_histogram(g)

plt.plot(range(0, len(hist)), hist, ".")
plt.title("Degree Distribution")
plt.xlabel("Degree")
plt.ylabel("#Nodes")
plt.loglog()
plt.grid(alpha=0.2)
plt.show()
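
To make the return value of ``nx.degree_histogram`` concrete: position ``k`` of the list holds the number of nodes with degree ``k``. The sketch below shows this on a small star graph and normalises the counts into fractions; ``s`` and ``pk`` are illustrative names.

```python
import networkx as nx

s = nx.star_graph(3)  # centre node 0 linked to leaves 1, 2, 3

hist = nx.degree_histogram(s)
print(hist)  # [0, 3, 0, 1]: three nodes of degree 1, one of degree 3

# Fraction of nodes having each degree
pk = [count / s.number_of_nodes() for count in hist]
print(pk)    # [0.0, 0.75, 0.0, 0.25]
```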

## Graph components
``networkx`` allows us to select node-specific views of the original graph:

In [None]:
list(g.neighbors(0)) # obtain the list of neighbors of node 0

In [None]:
ego = nx.ego_graph(g, 273) # ego network of node 273
nx.draw(ego, with_labels=True)

In [None]:
ego = nx.ego_graph(g, 306) # ego network of node 306
nx.draw(ego, with_labels=True)
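
By default ``nx.ego_graph`` keeps the chosen node and its direct neighbours; the ``radius`` parameter widens the neighbourhood. A minimal sketch on a path graph, with illustrative names:

```python
import networkx as nx

path = nx.path_graph(5)  # 0 - 1 - 2 - 3 - 4

ego_1 = nx.ego_graph(path, 2)            # default radius=1
ego_2 = nx.ego_graph(path, 2, radius=2)  # neighbours of neighbours too

print(sorted(ego_1.nodes()))  # [1, 2, 3]
print(sorted(ego_2.nodes()))  # [0, 1, 2, 3, 4]
```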

Using the same rationale, connected components can also be extracted:

In [None]:
nx.number_connected_components(g)

In [None]:
comps = sorted(nx.connected_components(g), key=len, reverse=True) # list of connected components, largest first
comp_0 = nx.subgraph(g, comps[0]) # build a subgraph on the first (largest) component
nx.draw(comp_0)

In [None]:
comp_1 = nx.subgraph(g, comps[84]) # build a subgraph on the component at index 84
nx.draw(comp_1)
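
``nx.connected_components`` yields sets of nodes in no guaranteed size order, so an explicit sort is the safe way to pick the giant component. The toy graph ``f`` below is an illustrative assumption with three components of different sizes.

```python
import networkx as nx

f = nx.Graph()
f.add_edges_from([(0, 1), (2, 3), (3, 4), (4, 5)])
f.add_node(6)  # an isolated node is a component of size 1

# Sort the node sets from largest to smallest
comps_sorted = sorted(nx.connected_components(f), key=len, reverse=True)
giant = f.subgraph(comps_sorted[0])

print([len(c) for c in comps_sorted])  # [4, 2, 1]
print(sorted(giant.nodes()))           # [2, 3, 4, 5]
```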

## Paths and Diameter
Shortest paths can be extracted as well, using the following syntax:

In [None]:
nx.shortest_path(g, source=0, target=30)

In [None]:
nx.shortest_path_length(g, source=0, target=30)
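
Both calls raise ``nx.NetworkXNoPath`` when the two nodes lie in different components. A minimal sketch on an illustrative toy graph ``r``, using ``nx.has_path`` to check reachability first:

```python
import networkx as nx

r = nx.Graph()
# A 5-cycle 0-1-2-3-4-0 plus a separate edge 10-11
r.add_edges_from([(0, 1), (1, 2), (2, 3), (3, 4), (0, 4), (10, 11)])

print(nx.shortest_path(r, source=0, target=3))         # [0, 4, 3]
print(nx.shortest_path_length(r, source=0, target=3))  # 2

reachable = nx.has_path(r, 0, 10)
print(reachable)  # False

try:
    nx.shortest_path(r, source=0, target=10)
except nx.NetworkXNoPath:
    print("no path between 0 and 10")
```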

Moreover, the network diameter can be computed as follows

In [None]:
nx.diameter(g.subgraph(max(comps, key=len))) # the diameter is defined only for connected graphs, so we compute it on the giant component
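
The restriction to a single component matters because ``nx.diameter`` raises ``nx.NetworkXError`` on a disconnected graph. The sketch below shows both cases on an illustrative toy graph ``d``.

```python
import networkx as nx

d = nx.Graph()
d.add_edges_from([(0, 1), (1, 2), (2, 3), (10, 11)])  # two components

try:
    nx.diameter(d)
except nx.NetworkXError as err:
    print("diameter failed:", err)

# Restrict the computation to the largest component
largest = max(nx.connected_components(d), key=len)
giant_diameter = nx.diameter(d.subgraph(largest))
print(giant_diameter)  # 3, the distance between nodes 0 and 3
```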

## Triangles, density and clustering
Other indices that can be computed using the library are density, triangle counts and clustering coefficients:

In [None]:
nx.density(g)

In [None]:
nx.triangles(g)[0] # count the triangles each node is involved in (and access the value of node 0)

In [None]:
nx.clustering(g)[0] # compute the local clustering coefficient for all nodes (and access the value for node 0)

In [None]:
nx.average_clustering(g) # compute the average of the local clustering coefficients (nx.transitivity(g) gives the global one)
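
To see how these indices relate, the sketch below evaluates them on an illustrative toy graph ``t``: a triangle with one extra pendant edge. It also shows that the average of the local clustering coefficients differs from the transitivity (three times the number of triangles over the number of connected triples).

```python
import networkx as nx

t = nx.Graph()
t.add_edges_from([(0, 1), (1, 2), (0, 2), (2, 3)])  # triangle 0-1-2 plus edge 2-3

print(nx.density(t))             # 4 edges out of 6 possible
print(nx.triangles(t))           # {0: 1, 1: 1, 2: 1, 3: 0}
print(nx.clustering(t))          # node 2 has 1 closed pair out of 3
print(nx.average_clustering(t))  # (1 + 1 + 1/3 + 0) / 4
print(nx.transitivity(t))        # 3 * 1 triangle / 5 connected triples
```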