### CS4423 - Networks
Angela Carnevale<br/>
School of Mathematical and Statistical Sciences<br/>
NUI Galway

#### 1. Graphs and Graph Theory


# Week 3, Lecture 1: 
# Graphs, Relations and Matrices. Bipartite Graphs.

 Together with `networkx`, we import `numpy` for matrix operations.

In [None]:
import networkx as nx
import numpy as np
opts={'with_labels':True,'node_color':'y'}

## Graphs are Relations

We have seen that graphs can be interpreted as relations with certain properties:

A **simple** graph with node set $X$ is a **symmetric**, **irreflexive** relation on $X$. 

The relation on the nodes is that of *being adjacent*.

The **adjacency matrix** of a graph is a special case of the adjacency matrix of a relation.


## Composition and Adjacency Matrices.

* Relations can be composed, like functions.  

* If $R$ is a relation
from a set $X$ to a set $Y$, and if $S$ is a relation from $Y$ to a set $Z$,
then the __composite relation__ $R \circ S$ is the relation
from $X$ to $Z$, defined by $x (R \circ S) z$ if there is
an element $y \in Y$ such that $x R y$ and $y S z$.

* The adjacency matrix of the composite relation $R \circ S$
is *essentially* the (matrix) product of the adjacency matrices
of the individual relations $R$ and $S$. 
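
As a minimal sketch of what *essentially* means here, the following uses two small made-up relations (the matrices `R` and `S` are hypothetical, not survey data). The matrix product counts the intermediate elements $y$, and replacing each nonzero count by $1$ gives the adjacency matrix of the composite.

```python
import numpy as np

# Hypothetical toy relations, chosen only for illustration.
# R relates X = {x0, x1} to Y = {y0, y1, y2}.
R = np.array([[1, 1, 0],
              [0, 0, 1]])
# S relates Y = {y0, y1, y2} to Z = {z0, z1}.
S = np.array([[1, 0],
              [1, 0],
              [0, 1]])

# Entry (i, k) counts the elements y with x_i R y and y S z_k.
counts = R @ S
# Every nonzero count becomes 1: the adjacency matrix of the composite.
composite = (counts > 0).astype(int)

print(counts)     # [[2 0], [0 1]]
print(composite)  # [[1 0], [0 1]]
```

Here $x_0$ reaches $z_0$ through both $y_0$ and $y_1$, which is why the product has a $2$ where the composite has a $1$.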

Let's see how this works in practice...

We'll construct two relations starting from (part of) the data gathered through our survey on TV shows watched. 

By the way, here's a summary of the results:

![TV survey summary](https://angelacarnevale.github.io/TV_survey_summary.png)

... and here's a list of the first $8$ responses that we will use to construct a relation:

In [None]:
!cat TV_cut.txt

If we now take $X=\{1,2,3,4,5,6,7,8\}$ to be the set of the first $8$ respondents and $Y=\{BB,GoT,Succ,DG,ST,TW,PB,NP,MrR,SG\}$ to be the set of all $10$ TV shows in the list, the above data defines a relation $R\subseteq X\times Y$ with adjacency matrix:

$$ A=\begin{bmatrix}
0&0&0&0&1&0&0&1&1&0\\
1&1&0&1&1&1&0&1&1&1\\
1&1&0&1&1&0&0&0&1&1\\
0&1&0&1&0&0&0&0&0&1\\
1&1&0&1&1&0&1&1&0&1\\
1&0&0&1&1&0&1&1&0&1\\
0&1&1&1&1&1&0&1&0&1\\
0&0&0&0&0&0&0&0&1&0\\\end{bmatrix}$$

(here rows are indexed by elements of $X$ and columns by elements of $Y$).

Here's our adjacency matrix as a `numpy` array:

In [None]:
A=np.array([[0,0,0,0,1,0,0,1,1,0],
 [1,1,0,1,1,1,0,1,1,1],
 [1,1,0,1,1,0,0,0,1,1],
 [0,1,0,1,0,0,0,0,0,1],
 [1,1,0,1,1,0,1,1,0,1],
 [1,0,0,1,1,0,1,1,0,1],
 [0,1,1,1,1,1,0,1,0,1],
 [0,0,0,0,0,0,0,0,1,0]])
print(A)

What if we compose the relation $R:X\to Y$ with its *transpose* $R^T:Y\to X$? 

The transpose of the relation $R$ is simply the one that has $A^T$ as adjacency matrix. So at the level of matrices, this means multiplying $A$ by $A^T$, which we can do in `numpy` as follows:

In [None]:
B=A@A.transpose()
print(B)

How can we interpret the entries of the above matrix in terms of TV shows and respondents?

We've seen that for two relations $R$ and $S$ that can be composed,

* If $A = (a_{ij})$ is the adjacency matrix of $R$, and $B = (b_{jk})$ the adjacency matrix of $S$,
then the $i,k$-entry of the product $AB$ is
$$(AB)_{ik} = \sum_{j} a_{ij} b_{jk},$$
which is exactly the __number__ of elements $y \in Y$ such that $x_i R y$ and $y S z_k$.




* This tells us that all that is needed for $x_i$ to be $(R \circ S)$-related
to $z_k$ is for this number to be at least $1$.

* Hence, replacing all
nonzero entries in the product matrix $AB$ with $1$ yields the
adjacency matrix of the composite $R \circ S$.

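* In our case, where $S = R^T$, the entry in position $(i,j)$ of $AA^T$ is **exactly the number of shows in common between the $i$th and $j$th respondents**. In particular, the diagonal entry in position $(i,i)$ is the number of shows watched by the $i$th respondent.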

* We can use the observation above to transform the matrix $B$ into an adjacency matrix of a graph.

* In `numpy`, one can use **boolean indexing** and other convenient methods to do so.
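
The counting interpretation can be checked directly. The sketch below rebuilds each respondent's watchlist as a set of column indices (the names `watchlists` and `shared` are introduced here for illustration only) and compares the sizes of the pairwise intersections with the entries of $AA^T$.

```python
import numpy as np

# Adjacency matrix A of the relation R, copied from the cell above.
A = np.array([[0,0,0,0,1,0,0,1,1,0],
              [1,1,0,1,1,1,0,1,1,1],
              [1,1,0,1,1,0,0,0,1,1],
              [0,1,0,1,0,0,0,0,0,1],
              [1,1,0,1,1,0,1,1,0,1],
              [1,0,0,1,1,0,1,1,0,1],
              [0,1,1,1,1,1,0,1,0,1],
              [0,0,0,0,0,0,0,0,1,0]])
B = A @ A.T

# Each row of A becomes the set of shows (column indices) that respondent watched.
watchlists = [set(np.flatnonzero(row).tolist()) for row in A]

# Size of the intersection of every pair of watchlists.
shared = np.array([[len(s & t) for t in watchlists] for s in watchlists])

print(np.array_equal(B, shared))  # True
```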

First, we simply replace all the entries of $B$ which are strictly greater than $1$ with $1$s:

In [None]:
B[B>1]=1
print(B)

Then, we fill the diagonal with zeros:

In [None]:
np.fill_diagonal(B,0)
print(B)
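
Both steps above modify `B` in place, so the counts of shared shows are overwritten. If the counts are still needed, one possible alternative (a sketch, with the names `counts` and `adj` introduced here) is to build the $0/1$ matrix as a separate array.

```python
import numpy as np

# Adjacency matrix A of the relation R, copied from the lecture.
A = np.array([[0,0,0,0,1,0,0,1,1,0],
              [1,1,0,1,1,1,0,1,1,1],
              [1,1,0,1,1,0,0,0,1,1],
              [0,1,0,1,0,0,0,0,0,1],
              [1,1,0,1,1,0,1,1,0,1],
              [1,0,0,1,1,0,1,1,0,1],
              [0,1,1,1,1,1,0,1,0,1],
              [0,0,0,0,0,0,0,0,1,0]])

counts = A @ A.T                # number of shared shows, left untouched
adj = (counts > 0).astype(int)  # new array: 1 wherever the count is nonzero
np.fill_diagonal(adj, 0)        # drop the loops; only adj is modified

print(adj)
```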

The above matrix is the adjacency matrix of the composite relation, with the diagonal set to zero. It tells us something about the **common interests** of the $8$ respondents under consideration: for $i \neq j$, there is a $1$ in position $(i,j)$ if and only if the $i$th respondent and the $j$th respondent share at least one show on their watchlist. Here, we can see that the $1$st respondent and the $4$th, for instance, do not have any show in common (from our list).

This is now the adjacency matrix of a homogeneous relation (on $X$) and we can therefore construct a graph. We'll do so directly from our `numpy` adjacency matrix.

In [None]:
# from_numpy_array replaces from_numpy_matrix, which was removed in networkx 3.0
G=nx.from_numpy_array(B)

In [None]:
nx.draw(G,**opts)

A graph like this one, representing the patterns of intersections of a family of sets, is also known as an [**intersection graph**](https://en.wikipedia.org/wiki/Intersection_graph). We'll encounter this graph in a different guise soon.
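
An intersection graph can also be built straight from the sets, without going through a matrix. The following sketch uses a small hypothetical family of sets (the dictionary `family` is made up for illustration): two labels are joined by an edge exactly when their sets have an element in common.

```python
from itertools import combinations
import networkx as nx

# Hypothetical family of sets, chosen only for illustration.
family = {"a": {1, 2}, "b": {2, 3}, "c": {4}, "d": {3, 4}}

H = nx.Graph()
H.add_nodes_from(family)
# Join two labels whenever the corresponding sets intersect.
H.add_edges_from(
    (s, t) for s, t in combinations(family, 2) if family[s] & family[t]
)

print(sorted(H.edges()))
```

In this example the edges are `a-b`, `b-d` and `c-d`, since these are the only pairs of sets that share an element.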

Note that we could also have used the *transpose* construction (that is, we could have taken $R^T\circ R$):

In [None]:
C=A.transpose()@A
print(C)

This time, the $(i,j)$ entry of the matrix tells us the number of people that have watched both the $i$th and the $j$th show. 
For instance, we note that nobody watched both *Breaking Bad* and *Succession* ($c_{1,3}=0=c_{3,1}$), and that nobody watched both *Peaky Blinders* and *Mr Robot* ($c_{7,9}=0=c_{9,7}$).

As an exercise, you could construct an adjacency matrix and corresponding graph from the matrix $C$ (see also below).

## Bipartite Graphs

A (simple) graph $G = (X, E)$ is called **bipartite** if the vertex set $X$ is a disjoint union
of two sets $X_1$ and $X_2$ such that each edge in $E$ links a vertex in $X_1$ with a vertex in $X_2$.

We can think of the vertices in the two sets as **coloured** with different colours. For instance, we can think of nodes in $X_1$ as white nodes and those in $X_2$ as black nodes.

Here is a sample bipartite graph $B$, specified to the `Graph` constructor by its edge list. (Note that this reuses the name `B`, which previously held the matrix above.)

In [None]:
edges = [(0,5), (1,5), (1,6), (1,7), (1,8), 
  (2,8), (3,5), (3,9), (4,7), (4,8), (4,9)]
B = nx.Graph(edges)
nx.draw(B, **opts)

In this graph, the **white** nodes can be taken  as the set $X_1 = \{0,1,2,\dots,4\}$ 
and the **black** nodes as $X_2 = \{5,6,\dots,9\}$.
The drawing command `nx.draw` takes as an optional argument a dictionary `pos` that specifies for
each node a (relative) position in the drawing.  Here, the node is the key and the
position is a pair of $x$,$y$-coordinates.  In this example we can use the (integer) quotient
and remainder, as returned by the Python built-in function `divmod`, to quickly compute a dictionary of positions
that have the white nodes on the left, and the black nodes on the right: `divmod(x, 5)` has quotient $0$ for the nodes $0,\dots,4$ and quotient $1$ for the nodes $5,\dots,9$.

In [None]:
divmod(7, 5)

In [None]:
pos = {x: divmod(x, 5) for x in range(10)}
print(pos)

In [None]:
nx.draw(B, pos, **opts)

Node colors can be specified as a *list* assigned to the keyword argument `node_color`.  We can use the $x$-coordinates of the node positions for that purpose.

In [None]:
color = [pos[x][0] for x in B.nodes()]
color

In [None]:
print(B.nodes)

In [None]:
opts2 = { "with_labels": True, "node_color": color, "font_color": 'r' }
nx.draw(B, **opts2)

In [None]:
nx.draw(B, pos, **opts2)

A **(vertex)-coloring** of a graph $G$ is an assignment of (finitely many) colors to the nodes of $G$,
so that any two nodes which are connected by an edge have *different* colors.

A graph is called **$N$-colorable** if it has a vertex coloring with (at most) $N$ colors.

**Theorem.** Let $G$ be a graph.  The following are equivalent:

* $G$ is bipartite;

* $G$ is $2$-colorable;
 
* each cycle in $G$ has even length.

(We'll give precise definitions of **cycle** and **length** in a bit.)
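
The theorem can be illustrated on two small cycle graphs. This is only a sketch on examples, not a proof; it relies on `nx.is_bipartite` and on the `color` function from the `networkx` bipartite module, and the variable names are introduced here.

```python
import networkx as nx
from networkx.algorithms import bipartite

even_cycle = nx.cycle_graph(6)  # a single cycle of length 6
odd_cycle = nx.cycle_graph(5)   # a single cycle of length 5

print(nx.is_bipartite(even_cycle))  # True
print(nx.is_bipartite(odd_cycle))   # False

# For a bipartite graph, bipartite.color returns a dict node -> 0 or 1.
coloring = bipartite.color(even_cycle)
# A proper coloring gives different colors to the two ends of every edge.
proper = all(coloring[u] != coloring[v] for u, v in even_cycle.edges())
print(coloring, proper)
```

Calling `bipartite.color` on `odd_cycle` raises an error instead, in line with the theorem: a graph containing an odd cycle has no $2$-coloring.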



2D grids are naturally bipartite:

In [None]:
G44 = nx.grid_2d_graph(4, 4)
nx.draw(G44)

How would you find a $2$-coloring of this graph?

##  Code Corner

### `python`

* `divmod`: [[doc]](https://docs.python.org/3/library/functions.html#divmod) the built-in quotient-and-remainder

In [None]:
divmod(-7, 5)
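
For a negative first argument the quotient is rounded down (towards $-\infty$), so that the remainder stays between $0$ and $4$. A short check of the defining identity, with the names `q` and `r` introduced here:

```python
# divmod(a, b) returns (q, r) with a == q * b + r; for b > 0, 0 <= r < b.
q, r = divmod(-7, 5)
print(q, r)  # -2 3
print(-7 == q * 5 + r)  # True
```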

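### `numpy`

* `toarray`: convert a sparse matrix into a proper array (strictly speaking, this is a method of SciPy sparse matrices, such as the one returned by `adjacency_matrix` below, rather than a `numpy` function)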

* `fill_diagonal`: fill the diagonal entries of an array with a given value.

### `networkx`

* `adjacency_matrix`: compute the adjacency matrix of a graph

* `from_numpy_array` (called `from_numpy_matrix` in older versions of `networkx`): construct a graph from its adjacency matrix
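
A minimal round trip between a graph and its adjacency matrix might look as follows. As an assumption made to avoid a SciPy dependency, the sketch uses `nx.to_numpy_array` rather than `adjacency_matrix(...).toarray()`, and the names `P`, `M` and `P2` are introduced here.

```python
import networkx as nx
import numpy as np

P = nx.path_graph(4)                 # path 0 - 1 - 2 - 3
M = nx.to_numpy_array(P, dtype=int)  # dense adjacency matrix of P
P2 = nx.from_numpy_array(M)          # graph rebuilt from the matrix

print(M)
# The rebuilt graph has the same edges as the original one.
same = {frozenset(e) for e in P.edges()} == {frozenset(e) for e in P2.edges()}
print(same)  # True
```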

##  Exercises

1. Use the `complete_graph` function in `networkx` to construct a $5 \times 5$ matrix
   with entries $0$ on the diagonal and all other entries $1$.

2. Construct an adjacency matrix and a graph from the matrix $C$ above. What does it mean for two vertices in this graph to be adjacent at the level of the TV shows?