## Network analysis with networkx
<p style='text-align: justify;'>Network analysis becomes harder as a network grows, yet the most interesting relationships often only emerge in large networks. Such analyses therefore rely on the computing power of a computer. The Python package networkx provides a user-friendly collection of functions for creating and analysing graphs (networks).</p>

### Generating a graph:

<p style='text-align: justify;'>Creating a graph with networkx is comparatively simple and requires only a few commands:</p>

```Python
import networkx as nx
import matplotlib.pyplot as plt

G = nx.Graph()
#generates an empty undirected graph
G = nx.DiGraph()
#generates an empty directed graph (replaces the graph above; use one or the other)

G.add_node(1)
G.add_nodes_from([2,3])
#add single node or a list of nodes

G.add_edge(1,2)
G.add_edges_from([(1,2),(1,3)])
#add single edge or a list of edges

nx.draw(G, with_labels=True)
plt.show()
#show the graph

G.remove_edge(1,2)
G.remove_node(1)
#removes an edge or a node (removing a node also removes all of its edges)

G.clear()
#removes the whole content of the graph
```
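<p style='text-align: justify;'>The difference between the two graph types can be illustrated with a minimal sketch. The edge list and the variable names <code>undirected</code> and <code>directed</code> are chosen here purely for illustration:</p>

```Python
import networkx as nx

# Same edge list, once as an undirected and once as a directed graph
edges = [(1, 2), (1, 3), (2, 3)]

undirected = nx.Graph()
undirected.add_edges_from(edges)  # missing nodes are created automatically

directed = nx.DiGraph()
directed.add_edges_from(edges)

# In an undirected graph the order of the two nodes does not matter
print(undirected.has_edge(2, 1))  # True
# In a directed graph the edge only exists from node 1 to node 2
print(directed.has_edge(2, 1))    # False

print(undirected.number_of_nodes(), undirected.number_of_edges())  # 3 3
```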
### Exercise:
> - Import networkx as nx and matplotlib.pyplot as plt
> - Generate a graph with nine nodes and nine edges and display it.

### Query graph parameters:
<p style='text-align: justify;'>With a network of this size, all parameters are still easy to keep track of: it is easy to see how the nodes are linked and how many nodes and edges there are.
<br>With large graphs, this is usually no longer possible visually, which is why networkx provides functions for querying nodes and edges. Within a script, the results of these functions can also be used in calculations:</p>

```Python
G.nodes() #all nodes
G.edges() #all edges
G.edges(1) #all edges at node 1 (in a directed graph: edges starting from node 1)
G.degree() #degree (number of edges) of every node
G.degree(1) #degree (number of edges) of node 1
```
<p style='text-align: justify;'>
Instead of individual nodes, lists of nodes can also be passed to the functions.
</p>    

```Python
G.adj[1] #all nodes connected to node 1
list(G.adjacency())
#all nodes and their neighbours
```
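<p style='text-align: justify;'>As a hedged illustration of how these queries can feed into calculations, the following sketch computes the average degree of a small example graph and restricts queries to a list of nodes. The graph and all variable names are assumptions made for this example:</p>

```Python
import networkx as nx

G = nx.Graph()
G.add_edges_from([(1, 2), (1, 3), (2, 3), (3, 4)])

# G.degree() yields (node, degree) pairs, so it can be turned into a dict
degrees = dict(G.degree())
average_degree = sum(degrees.values()) / G.number_of_nodes()
print(average_degree)  # 2.0

# Passing a list of nodes restricts the query to those nodes
degrees_subset = dict(G.degree([1, 4]))
edges_subset = list(G.edges([1, 4]))
print(degrees_subset)  # {1: 2, 4: 1}

# Neighbours of node 3, sorted for a stable order
neighbours_of_3 = sorted(G.adj[3])
print(neighbours_of_3)  # [1, 2, 4]
```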
<p style='text-align: justify;'>
In addition to these functions, there are others for querying basic properties of a graph. For more information, see:</p>

https://networkx.github.io/documentation/stable/reference/functions.html

### Exercise:
> - Test the functions with the previously created graph.

### Attributes:
<p style='text-align: justify;'>Attributes can be assigned to graphs, nodes and edges. For example, a link between two nodes can be given a higher weight than another. <br>A weight can be assigned to edges when they are created, and any attribute can be added later.</p>

```Python
G = nx.Graph(Typ="Prokaryot")
#assigns an attribute to the graph when it is created
G.graph['Typ'] = "Eukaryot"
#assigns an attribute to the graph afterwards

G.add_weighted_edges_from([(1,2, 0.25),(2,3, 0.5)])
#the first two numbers define the edge, the third the weighting (stored as the edge attribute 'weight')

G.nodes[1]['Größe'] = "5"
#assigns an attribute to a node
G.add_nodes_from([6,7,8],Attribute = 10) #Attribute stands for any variable
#assigns an attribute to a list of nodes

G.edges[1,2]['Distanz'] = "2"
#assigns an attribute to an edge
G.add_edges_from([(6,3),(7,8),(8,9)],Attribute = 10) #Attribute stands for any variable
#assigns an attribute to a list of edges
```
<p style='text-align: justify;'>Some functions, such as <code>list(G.adjacency())</code>, display edge attributes automatically. Attributes can also be displayed specifically for the nodes, the edges and the graph itself.</p>

```Python
G.nodes.data()
G.edges.data()
G.graph
#show the attributes of the nodes, the edges and the graph (G.graph is a dictionary, not a function)
```
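<p style='text-align: justify;'>A minimal sketch of how attributes can be read back, assuming a small example graph built for this purpose. The attribute names follow the examples above; the variable <code>sizes</code> is introduced only for illustration:</p>

```Python
import networkx as nx

G = nx.Graph(Typ="Prokaryot")  # graph attribute
G.add_weighted_edges_from([(1, 2, 0.25), (2, 3, 0.5)])  # stored under the key 'weight'
G.nodes[1]["Größe"] = 5  # node attribute, set on node 1 only

print(G.graph)                       # {'Typ': 'Prokaryot'}
print(list(G.nodes.data()))          # [(1, {'Größe': 5}), (2, {}), (3, {})]
print(list(G.edges.data("weight")))  # [(1, 2, 0.25), (2, 3, 0.5)]

# Query a single attribute; the default fills in for nodes that lack it
sizes = dict(G.nodes.data("Größe", default=0))
print(sizes)                         # {1: 5, 2: 0, 3: 0}
```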

### Exercises:
> - Assign attributes to the previously created graph, its nodes and edges.
> - Use the function `list(G.adjacency())` again.
> - Display the attributes.

### Create / load graphs faster:
<p style='text-align: justify;'>To quickly load larger networks, entire lists of nodes and edges can be loaded or graphs can be created directly from a file or dataframe.</p>

```Python
G = nx.from_pandas_edgelist(df, source = 'Spalte1', target = 'Spalte2', edge_attr = ['Spalte3', 'Spalte4'], create_using = nx.DiGraph())
```
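<p style='text-align: justify;'>(creates a directed graph with edges running from the nodes in column <code>Spalte1</code> to the nodes in column <code>Spalte2</code> and assigns the respective values from columns <code>Spalte3</code> and <code>Spalte4</code> to the edges as attributes; "Spalte" is German for "column")</p>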
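<p style='text-align: justify;'>The following sketch shows the call on a tiny, made-up dataframe. The data values are hypothetical and only the column names mirror the placeholders used above:</p>

```Python
import networkx as nx
import pandas as pd

# Hypothetical edge table: one row per edge
df = pd.DataFrame({
    "Spalte1": ["A", "A", "B"],      # start node of each edge
    "Spalte2": ["B", "C", "C"],      # end node of each edge
    "Spalte3": [3, 1, 2],            # first edge attribute
    "Spalte4": [120.0, 80.5, 45.0],  # second edge attribute
})

G = nx.from_pandas_edgelist(
    df,
    source="Spalte1",
    target="Spalte2",
    edge_attr=["Spalte3", "Spalte4"],
    create_using=nx.DiGraph,
)

print(G.edges["A", "B"])     # attribute dictionary of the edge A -> B
print(G.has_edge("B", "A"))  # False, because the graph is directed
```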

### Exercises:
> - Import the package pandas as pd.
> - Load the file Routen.csv into a variable.
> - Display the head of the dataframe (df.head()).

> - Use the dataframe to create a directed graph whose edges have the attributes number and distance.
> - Display all nodes and their neighbours.

> - Display the graph.

### Algorithms for analysis:
<p style='text-align: justify;'>In addition to functions for querying basic properties of a graph, networkx offers many built-in algorithms for analysing graphs. Only a few are discussed here. Further information on the individual algorithms and on other features of networkx can be found at:</p>

https://networkx.github.io/documentation/stable/reference/algorithms/index.html

```Python
#Knoten1 and Knoten2 are placeholders for two nodes of G
nx.density(G)
#value between 0 and 1: share of all possible edges that are actually present
nx.average_node_connectivity(G)
#robustness of a graph: average number of nodes that must be removed to disconnect a pair of nodes

nx.has_path(G, Knoten1, Knoten2)
#Is there a path between the two nodes?
nx.shortest_path(G, Knoten1, Knoten2,weight='Distanz')
#shortest path between node 1 and node 2
nx.shortest_path(G,weight='Distanz')
#shortest paths between all pairs of nodes of a graph (weight is optional)
nx.average_shortest_path_length(G, weight='Distanz')
#average length of the shortest paths (weight is optional)

nx.degree_centrality(G)
#importance of each node in the network (fraction of the other nodes it is connected to)
nx.eigenvector_centrality_numpy(G,weight='Distanz')
#importance of each node in the network (influence on the network, taking the importance of the neighbouring nodes into account)
```
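<p style='text-align: justify;'>To make the effect of the weight parameter concrete, here is a minimal sketch on a small example graph. The nodes, the distances and the variable names are assumptions made for this illustration:</p>

```Python
import networkx as nx

G = nx.Graph()
# The third value of each tuple is stored under the edge attribute 'Distanz'
G.add_weighted_edges_from(
    [("A", "B", 1), ("B", "C", 1), ("A", "C", 5), ("C", "D", 2)],
    weight="Distanz",
)

density = nx.density(G)  # 4 of 6 possible edges are present

# Without a weight, the path with the fewest edges is returned
hops = nx.shortest_path(G, "A", "C")
print(hops)   # ['A', 'C']

# With a weight, the path with the smallest total distance is returned
route = nx.shortest_path(G, "A", "C", weight="Distanz")
length = nx.shortest_path_length(G, "A", "C", weight="Distanz")
print(route, length)  # ['A', 'B', 'C'] 2

# Node with the highest degree centrality
centrality = nx.degree_centrality(G)
most_central = max(centrality, key=centrality.get)
print(most_central)   # C
```

<p style='text-align: justify;'>The direct edge from A to C is the shortest path by number of edges, but the detour via B is shorter once the distances are taken into account.</p>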

### Exercises:
> - Analyse the graph created at the beginning

> - Analyse the flight path graph, find the most important nodes in the network and think about the fastest way to travel in the network.

## More applications:
<p style='text-align: justify;'>Finally, a function for tree graphs such as phylogenetic trees: this algorithm quickly and easily finds the last common ancestor (LCA) of two nodes of the "tree", which networkx calls the lowest common ancestor:</p>

```Python
nx.lowest_common_ancestor(G, Knoten1, Knoten2)
```
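<p style='text-align: justify;'>A minimal sketch on a small, made-up tree. It assumes that the tree is stored as a directed graph whose edges point from parent to child; the variable names are chosen only for illustration:</p>

```Python
import networkx as nx

# Small tree: 1 is the root, 2 and 3 are its children
tree = nx.DiGraph()
tree.add_edges_from([(1, 2), (1, 3), (2, 4), (2, 5), (3, 6)])

lca_siblings = nx.lowest_common_ancestor(tree, 4, 5)
lca_cousins = nx.lowest_common_ancestor(tree, 4, 6)
# A node counts as its own ancestor here
lca_ancestor = nx.lowest_common_ancestor(tree, 2, 4)

print(lca_siblings, lca_cousins, lca_ancestor)  # 2 1 2
```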
### Exercise:
> - Execute the following code cell to load a graph
> - Find the LCA of different node pairs in this graph

```Python
import networkx as nx
import matplotlib.pyplot as plt

baum = nx.DiGraph()
#tree as a directed graph, edges point from parent to child
baum.add_nodes_from([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])
baum.add_edges_from([(1,2),(2,3),(2,4),(3,5),(4,6),(5,7),(5,9),(5,11),(6,8),(6,10),(7,13),(7,15),(9,17),(9,19),(8,12),(10,14),(12,16),(12,18),(16,20)])
nx.draw(baum,with_labels=True)
plt.show()
```