In [1]:
import networkx as nx
from custom import load_data as cf
from networkx.algorithms import bipartite
from nxviz import CircosPlot
import numpy as np
import matplotlib.pyplot as plt

%load_ext autoreload
%autoreload 2
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

# Introduction

Bipartite graphs are graphs that have two (bi-) partitions (-partite) of nodes. Nodes within each partition are not allowed to be connected to one another; rather, they can only be connected to nodes in the other partition.

Bipartite graphs can be useful for modelling relations between two sets of entities. We will explore the construction and analysis of bipartite graphs here.
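To make the definition concrete, here is a minimal sketch that builds a tiny bipartite graph by hand and checks it. The names and edges are made-up toy data, not part of the crime dataset. Marking partitions with a `bipartite` node attribute follows the NetworkX convention, but it is an assumption here that your own data is labelled the same way.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Toy data (hypothetical, not from the crime dataset).
people = ["alice", "bob", "carol"]
crimes = ["c1", "c2"]

B = nx.Graph()
# Label each node with the partition it belongs to.
B.add_nodes_from(people, bipartite=0)
B.add_nodes_from(crimes, bipartite=1)
# Edges only ever join a person to a crime.
B.add_edges_from([("alice", "c1"), ("bob", "c1"), ("bob", "c2"), ("carol", "c2")])

is_bip = bipartite.is_bipartite(B)

# Recover one partition from the node attribute.
person_partition = {n for n, d in B.nodes(data=True) if d["bipartite"] == 0}

# Connecting two people directly closes the triangle alice-c1-bob,
# so the graph can no longer be split into two partitions.
B_bad = B.copy()
B_bad.add_edge("alice", "bob")
is_bip_bad = bipartite.is_bipartite(B_bad)

print(is_bip, is_bip_bad)
```

The second check shows why edges within a partition are not allowed: a single person-person edge is enough to break the bipartite structure.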

![bipartite graph](https://upload.wikimedia.org/wikipedia/commons/thumb/e/e8/Simple-bipartite-graph.svg/600px-Simple-bipartite-graph.svg.png)

Let's load a [crime data](http://konect.uni-koblenz.de/networks/moreno_crime) bipartite graph and quickly explore it.

> This bipartite network contains persons who appeared in at least one crime case as either a suspect, a victim, a witness or both a suspect and victim at the same time. A left node represents a person and a right node represents a crime. An edge between two nodes shows that the left node was involved in the crime represented by the right node.

In [None]:
G = cf.load_crime_network()
list(G.edges(data=True))[0:5]  # wrap in list() so slicing also works on NetworkX 2.x views

In [None]:
list(G.nodes(data=True))[0:10]  # wrap in list() so slicing also works on NetworkX 2.x views

## Projections

Bipartite graphs can be projected down onto one of their partitions. For example, we can generate a person-person graph from the person-crime graph by declaring that two person nodes that share a crime node are joined by an edge.

![bipartite graph](https://upload.wikimedia.org/wikipedia/commons/thumb/e/e8/Simple-bipartite-graph.svg/600px-Simple-bipartite-graph.svg.png)
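To make the projection rule concrete, here is a minimal, library-free sketch on toy data. The edge list is hypothetical, and the code deliberately avoids NetworkX so that it only illustrates the idea; it is not how NetworkX implements its own projection function.

```python
from itertools import combinations

# Toy person-crime edge list (hypothetical data).
edges = [("alice", "c1"), ("bob", "c1"), ("bob", "c2"), ("carol", "c2")]

# Group people by the crime they are linked to.
people_by_crime = {}
for person, crime in edges:
    people_by_crime.setdefault(crime, set()).add(person)

# Every pair of people sharing a crime becomes one edge in the projection.
projected_edges = set()
for persons in people_by_crime.values():
    for u, v in combinations(sorted(persons), 2):
        projected_edges.add((u, v))

print(sorted(projected_edges))
```

Here `alice` and `bob` are linked through `c1`, and `bob` and `carol` through `c2`, while `alice` and `carol` share no crime and stay unconnected.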

### Exercise

Find the bipartite projection function in the NetworkX `bipartite` module [docs](https://networkx.github.io/documentation/networkx-1.10/reference/algorithms.bipartite.html), and use it to obtain the unipartite projection of the bipartite graph onto the person nodes.

In [None]:
person_nodes = 
pG = 
list(pG.nodes(data=True))[0:5]  # wrap in list() so slicing also works on NetworkX 2.x views

### Exercise

Try visualizing the person-person crime network by using a Circos plot. Ensure that the nodes are grouped by gender and then by number of connections.

In [None]:
nodes = sorted(____, key=lambda x: (____________, ___________))
edges = pG.edges()
edgeprops = dict(alpha=0.1)
node_cmap = {0:'blue', 1:'red'}
nodecolor = [__________________ for n in nodes]

fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(111)
c = CircosPlot(nodes, edges, radius=10, ax=ax, fig=fig, edgeprops=edgeprops, nodecolor=nodecolor)
c.draw()
c.fig.savefig('images/crime-person.png', dpi=300)

### Exercise

Use similar logic to obtain the crime-crime projection, in which two crimes are linked if they share at least one person.

In [None]:
crime_nodes = _________
cG = _____________  # cG stands for "crime graph"

### Exercise

Can you plot how the crimes are connected, using a Circos plot? Try ordering it by number of connections.

In [None]:
for n in cG.nodes():
    ___________

c = CircosPlot(___________)
___________
plt.savefig('images/crime-crime.png', dpi=300)

### Exercise

NetworkX also implements centrality measures for bipartite graphs, which allow you to compute these metrics without first converting to a particular projection. This is useful for exploratory data analysis.

Try the following challenges, referring to the [API documentation](https://networkx.github.io/documentation/networkx-1.9/reference/algorithms.bipartite.html) to help you:

1. Which crimes have the most people involved?
1. Which people are involved in the most crimes?

In [None]:
# Degree Centrality
bpdc = _______________________
sorted(___________, key=lambda x: ___, reverse=True)

## Bonus Lecture: Matrix Representation

Bipartite graphs have a natural matrix representation, known as the **biadjacency matrix**. Nodes on one partition are the rows, and nodes on the other partition are the columns.

NetworkX's `bipartite` module provides a function for computing the biadjacency matrix of a bipartite graph.

In [None]:
mat = bipartite.biadjacency_matrix(G, row_order=sorted(person_nodes), column_order=sorted(crime_nodes))
mat

With the biadjacency matrix, you can do some useful matrix operations. For example, if the rows are `people` and the columns are `crimes`, then multiplying the `people-by-crime` matrix by its transpose (`crime-by-people`) gives us the `people-by-people` projection of the graph. For an unweighted graph, each off-diagonal entry counts the crimes that two people share.

In [None]:
%%timeit
mat @ mat.T

In [None]:
%%timeit 
bipartite.projected_graph(G, person_nodes)

Note how the matrix multiplication is much faster than `projected_graph`. The tradeoff, though, is that we lose the rich metadata that might be encoded on the nodes and edges of the graph.

The diagonal of the product matrix encodes the original degree of each person node in the bipartite graph. Let's check that.

In [None]:
person_projection = (mat @ mat.T)
person_projection.diagonal()
noi = np.argmax(person_projection.diagonal())  # row index of the person node with the highest degree

person_projection[noi, noi]   # the original degree in the bipartite graph.

In [None]:
len(list(G.neighbors(sorted(person_nodes)[noi])))  # original number of neighbors; list() handles the NetworkX 2.x iterator
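The same relationship can be checked by hand on a tiny matrix. This is a minimal sketch using a made-up, unweighted 3-person, 2-crime biadjacency matrix (not the crime dataset), with a dense NumPy array standing in for the sparse matrix that NetworkX returns.

```python
import numpy as np

# Toy biadjacency matrix: rows are people, columns are crimes.
#              c1 c2
B = np.array([[1, 0],   # alice
              [1, 1],   # bob
              [0, 1]])  # carol

# People-by-people product of the matrix with its transpose.
P = B @ B.T

# Degree of each person in the bipartite graph = row sum of B.
degrees = B.sum(axis=1)

print(P)
print(P.diagonal(), degrees)
```

The diagonal of `P` matches `degrees`, and each off-diagonal entry counts the crimes two people have in common: `alice` and `bob` share one, `alice` and `carol` share none.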

This is just a teaser of what you can do with the matrix representation of a graph. I hope it has whetted your appetite for more!