# Computing transitive closure on relations and graphs

Review slides https://pages.mtu.edu/~nilufer/classes/cs2311/2012-march/cs2311-s12-ch9-relations-part2.pdf#slide.1
These slides clarify that transitive closure is something we can compute by thinking of a relation as a graph.
They also clarify that the complete transitive closure combines the powers of the relation's matrix from 1 up to n (the number of elements), because that covers all paths up to length n.

A relation between two sets can be visualized as a bipartite (two-part) graph.
In [[https://rpubs.com/pjmurphy/317838][R we can visualize a bipartite graph]] directly with the plot function.
This builds on [[https://www.rdocumentation.org/packages/igraph/versions/1.2.4.2/topics/graph_from_adjacency_matrix][creating a graph from the adjacency matrix.]]
When we have two distinct sets, we need an adjacency matrix whose vertices combine all the elements of both sets.
It's easy to [[https://en.wikipedia.org/wiki/Adjacency_matrix#Of_a_bipartite_graph][construct this adjacency matrix of a bipartite graph]].
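As a minimal sketch of that construction, assume a hypothetical relation between two small sets `X` and `Y`, recorded in a matrix `B` with one row per element of `X` and one column per element of `Y`. The names and data are illustrative only. The combined adjacency matrix places `B` and its transpose in the off-diagonal blocks and zeros elsewhere, since no edge joins two elements of the same set.

```r
# Hypothetical relation between two distinct sets X and Y (illustrative data).
X <- c("x1", "x2")
Y <- c("y1", "y2", "y3")

# B[i, j] = 1 when X[i] is related to Y[j].
B <- matrix(c(1, 0, 1,
              0, 1, 1), nrow = 2, byrow = TRUE, dimnames = list(X, Y))

# Zero blocks: no edges inside X or inside Y.
zero_X <- matrix(0, length(X), length(X), dimnames = list(X, X))
zero_Y <- matrix(0, length(Y), length(Y), dimnames = list(Y, Y))

# Combined adjacency matrix over all vertices of both sets.
A <- rbind(cbind(zero_X, B),
           cbind(t(B), zero_Y))
print(A)
```

The result is a symmetric 5 x 5 matrix that can be handed to `graph_from_adjacency_matrix()` like any other adjacency matrix.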

We can draw these networks in R.
[[https://hal.archives-ouvertes.fr/hal-01722543/][It might be easier to use the ggplot2 routines]].
Or, once the network is defined in a standard format, we can use a dedicated tool for network visualization.
[[https://medium.com/@Elise_Deux/list-of-free-graph-visualization-applications-9c4ff5c1b3cd][There are many to choose from.]]

We learn by discovering relationships between ideas.

We can document these connections using the notation of sets and recording the relations between items in those sets.
Combining these relations can help us discover new connections between items. Combining relations is called relation composition.

We can "see" relations by viewing them as a graph.
In graph form, relations are treated as networks of connected nodes.
The connections between items in the relation are visualized as edges between the nodes in the graph.
Graphs allow us to visually explore connected items and discover new relationships by following the edges that make up the paths between nodes.

For example, social networks contain many familiar relations like "friend of" and can be drawn as a graph where nodes represent people and edges between them represent their friendships.
Another network we interact with every day is the World Wide Web.   Relations between web pages are defined by the hyperlinks between pages. The web can be visualized as a network called the web graph.  Here the web pages are nodes and the links between web pages are the edges.

We might also have data sets that describe two different relations between items, and composing them could help discover secondary relationships.  For example, there are data sets that document drugs which are known to activate specific genes and other data sets that tell us which genes activate other genes. By composing these relations, we can discover drugs which could be investigated to activate genes indirectly through other genes.  This would give us a way to control genes that can't be directly controlled through known drug interactions.

Let's summarize these ideas with an example of a relation drawn as a graph. It is a simple social network for the "friend of" relation between four imaginary people $\{ a, b, c, d \}$. We draw it using [R's igraph library](https://igraph.org/r/doc/) and its plotting routines.  The nodes in the graph represent the people and the edges their "friend of" relation.  We can visually compose the "friend of a friend" relation and discover that the pair $(b,d)$ is a member of that relation because $b$ is a "friend of" $c$ and $c$ is a "friend of" $d$.

In [None]:
library(igraph)
options(repr.plot.width=5, repr.plot.height=5)
# Symmetric adjacency matrix: "friend of" runs both ways (edges a-b, a-c, b-c, c-d).
adjm <- matrix(c(0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0), ncol=4,
               dimnames=list(c('a', 'b', 'c', 'd'), c('a', 'b', 'c', 'd')))
g <- graph_from_adjacency_matrix( adjm, mode="undirected" )
plot(g)

## Sets and relations

Let's review the language of relations and their composition.

Let's say we have a set of people $\{ a, b, c, d \}$ and know a parent-child relation, $R = \{(a,b)\}$, and a sibling relation with two pairs, $S=\{(b,c),(b,d)\}$, on that set.
Given this information, we can compose, or compute, the relation of aunts or uncles between items in the set.
That is, we can determine the composed relation $S \circ R=\{(a,c),(a,d)\}$, which tells us that person $a$ is a "niece or nephew of" persons $c$ and $d$.
Or conversely, that persons $c$ and $d$ are an "aunt or uncle of" person $a$.
An aunt (or uncle) is a person who is a sibling of a parent.
Therefore, if we know $a$ is a "child of" $b$, $aRb$, and $b$ is "sibling of" $c$ and $d$,  $bSc$ and $bSd$, then the composition of the relations, $S \circ R$, tells us which people have aunt or uncle relationships.
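The same composition can be sketched directly on the pairs, before any matrices are involved. This is a minimal illustration that assumes each relation is stored as a two-column data frame; the `compose()` helper and the column names `x`, `y`, `z` are hypothetical and exist only for this example.

```r
# Relations stored as data frames of ordered pairs (an illustrative layout).
R <- data.frame(x = "a", y = "b", stringsAsFactors = FALSE)                 # a "child of" b
S <- data.frame(y = c("b", "b"), z = c("c", "d"), stringsAsFactors = FALSE) # b "sibling of" c, d

# compose() is a hypothetical helper: it returns every pair (x, z) for which
# some y exists with (x, y) in the first relation and (y, z) in the second.
compose <- function(first, second) {
  names(first) <- c("x", "y")
  names(second) <- c("y", "z")
  joined <- merge(first, second, by = "y")   # match on the shared middle item
  unique(joined[, c("x", "z")])
}

print(compose(R, S))   # the pairs of S after R
```

The join on the shared middle item is what "a sibling of a parent" means in set terms: the parent $b$ is the item both relations have in common.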

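This new, composed relation $S \circ R$ is built by chaining two relationships through a shared item, which is the same idea that underlies a transitive relation.
A relation is transitive if, whenever $a$ is related to $b$ and $b$ is related to $c$, $a$ is also related to $c$.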
In other words, the connection between $a$ and $c$ transits across $b$.
In graph terms, there is a path between $a$ and $c$ that passes through $b$.

Transitive relationships can transit any number of intermediate items.
For example, we could have a transitive relationship between $a$ and $d$ because we know relationships where $a$ is related to $b$, $b$ is related to $c$, and $c$ is related to $d$. In a graph, we would call that a path between $a$ and $d$ of length three.

Transitivity is a property of some relations. Not all relations are transitive. 
When a relation is transitive, however, it provides a powerful tool for discovering connections between items.

The transitive closure of a relation on a set contains all the relationships that result from repeatedly chaining the relation's pairs together.
A closure is just the complete collection of all the relationships in which we are interested.
The transitive closure can be computed by following all the connections between the nodes that are defined by the transitive relation, repeatedly applying the relation to the set until all pairs are documented.
If we have a relation $R$, we can compute the transitive closure $R^*$, which is a new relation that contains the original relation and the complete collection of transitive relationships of any length between items in the set.
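Written out in symbols, as a sketch that assumes $R^k$ denotes $R$ composed with itself $k$ times (a notation introduced here only for illustration), the closure over an $n$-item set is

$$R^1 = R, \qquad R^{k+1} = R \circ R^k, \qquad R^* = R^1 \cup R^2 \cup \dots \cup R^n$$

For the chain $R = \{(a,b),(b,c),(c,d)\}$, this gives $R^2 = \{(a,c),(b,d)\}$, $R^3 = \{(a,d)\}$, and $R^4 = \emptyset$, so $R^*$ contains all six pairs.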

## Matrix notation for relations

Relations can be conveniently represented as a matrix.
The $n$ items in the set define the $n$ rows and $n$ columns of an $n \times n$ matrix.
The relation is represented by putting a one in the matrix entry $i,j$ if there is a relation between element $i$ and $j$ of the set and a zero if there is not.
This gives us a binary matrix of $1$'s and $0$'s.
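As a small sketch in R, the language used for the rest of the examples here, such a matrix can be filled directly from a list of pairs. The variable names `items`, `pairs`, and `M` are illustrative. The sketch relies on R's ability to index a matrix with a two-column character matrix, which matches each row of `pairs` against the row and column names.

```r
items <- c("a", "b", "c", "d")

# Each row is one ordered pair (i, j) of the relation.
pairs <- rbind(c("b", "c"),
               c("b", "d"))

# Start from all zeros, then set a one for every pair in the relation.
M <- matrix(0, nrow = length(items), ncol = length(items),
            dimnames = list(items, items))
M[pairs] <- 1
print(M)
```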

Let's represent our relations above in matrix form.
We'll use the R language in these examples, mainly because it has native support for matrix representations which simplifies this discussion and it has great [plotting support if we want to visualize our networks](https://hal.archives-ouvertes.fr/hal-01722543).

Let's define a list to represent the items in our set $\{a, b, c, d\}$.  These are names of our people.  Defining them as a list makes it easier to use the names when we create the matrix. Good labels make data easier to read.

In [None]:
person = c('a','b','c','d')

Our first relation $R$ described the "child of" relation between $a$ and $b$.  Let's represent this in matrix form as $M_R$

[In R we create a matrix with the matrix() function](https://www.tutorialspoint.com/r/r_matrices.htm) and pass it our data, matrix dimensions, and names for the dimensions. We'll use the person names to name the dimensions.  Since we have four items in the set, the matrix will be four rows by four columns and have 16 entries.  The data is represented as a list of zeros and ones, where a one represents a relation between two items in our set.  We will read the entries by row to make the input list a little easier to read for those of us who read matrices in row-major order.  In this example, the second entry in the list will populate the cell in row $a$ and column $b$, meaning $a$ is a "child of" $b$.

In [None]:
M_R <- matrix( c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), nrow=4, byrow=TRUE, dimnames=list(person,person))
print(M_R)

We'll also create a matrix $M_S$ for our "sibling of" relation.  Recall that $b$ is a sibling of $c$ and $d$, which means there are ones in entries $(b,c)$ and $(b,d)$ of the matrix.

In [None]:
M_S <- matrix( c(0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0), nrow=4, byrow=TRUE, dimnames=list(person,person))
print(M_S)

## Composing Relations with Matrix Multiplication

Composing relations represented as a matrix is simple.
[We can use matrix multiplication to compute transitive relations.](http://www.facweb.iitkgp.ac.in/~niloy/COURSE/Autumn2008/DiscreetStructure/scribe/Lecture07CS1039.pdf)
If we have two relations $R$ and $S$ over a set of $n$ items, then we can represent each relation in its $n \times n$ binary matrix form as $M_R$ and $M_S$.
The composition $S \circ R$ is the product of matrix $M_R$ and matrix $M_S$.  Concisely,  $M_{(S \circ R)} = M_R \odot M_S$.  Boolean matrix multiplication, indicated by $\odot$, is discussed below.

This composition gives the connections of length two, i.e. pairs joined by $aRb$ and $bSc$. We can see from the result of the matrix computation that $a$ is indeed a "niece or nephew of" $c$ and $d$ because the entries $(a,c)$ and $(a,d)$ contain a $1$.

In [None]:
M_RS=M_R %*% M_S
print(M_RS)

## Transitive Closure and Matrix Multiplication

More generally, we can compute transitive relations of any length by repeated matrix multiplication.
Each multiplication increases the path length by one.

This makes sense.
If we start with the matrix $M_R$, it contains all the relations of length one.
That is, direct relationships between any pair of items in the set.
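If $R$ implies connections that pass through intermediate items, multiplication lets us fill in those missing pairs.
If we multiply $M_R * M_R$, we get the connections of length exactly two.
If we do it again and multiply $M_R * (M_R * M_R)$, we get the connections of length exactly three.
The longest path we need to consider has $n$ steps: a path that never revisits an item has at most $n-1$ steps, and a cycle that returns to its starting item has at most $n$.
Such a path steps through each item in the set.
Each power records only the paths of one exact length, so the $n^{th}$ power alone is not the closure. For $R = \{(a,b)\}$, for instance, $M_R * M_R$ is already all zeros.
This means we can compute the transitive closure of a relation $R$ on an $n$-item set by multiplying by $M_R$ up to $n-1$ times and combining all the resulting powers with an element-wise OR, written $\vee$.
Therefore the transitive closure $R^*$ can be computed as $M_{(R^*)} = M_R \vee (M_R)^2 \vee \dots \vee (M_R)^n$.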
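A minimal sketch of this repeated multiplication follows. It keeps a running Boolean OR of every power from 1 to $n$, so that short paths found early are not lost when a later power no longer contains them. The function name `transitive_closure()` and the chain used as input are illustrative, not part of the relations defined earlier.

```r
# Illustrative chain a -> b -> c -> d.
items <- c("a", "b", "c", "d")
M <- matrix(c(0, 1, 0, 0,
              0, 0, 1, 0,
              0, 0, 0, 1,
              0, 0, 0, 0), nrow = 4, byrow = TRUE, dimnames = list(items, items))

# transitive_closure() is a hypothetical helper name.
transitive_closure <- function(M) {
  n <- nrow(M)
  power <- M      # paths of exactly length 1
  closure <- M    # paths of length 1..k found so far
  for (k in seq_len(n - 1)) {
    power <- (power %*% M > 0) * 1         # paths of exactly length k + 1
    closure <- (closure + power > 0) * 1   # Boolean OR with the pairs already found
  }
  closure
}

print(transitive_closure(M))
```

For this chain the fourth power is all zeros, yet the closure still holds all six reachable pairs because each earlier power was folded in as it was computed.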

Our use of matrix multiplication for relation composition has been ignoring a detail about the summation step in matrix multiplication.
We know that $1+1=2$. With ordinary matrix multiplication, whenever a row and a column both have $1$'s in more than one matching position, the summation will be $2$, $3$, or whatever number of connecting paths exist between the two items.
Since we are only interested in the existence of a relation, the only value ever needed in the result is a $1$, the presence of a relation.
We can instead use Boolean matrix multiplication, indicated by $\odot$, to achieve that outcome.
In Boolean math we use logical AND for multiplication and logical OR for addition.
Here $1 \times 1$ is still $1$ and $1 \times 0$ is still $0$.
Boolean multiplication retains the relation filtering function.
But now, $1 + 1 = 1$.
Boolean addition records the existence of a relation without being concerned about the degree of connectivity.

We can emulate Boolean matrix multiplication with ordinary matrix multiplication by simply setting all non-zero elements in the final result to $1$. This makes sure the final values remain $1$ and $0$.
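The difference is easy to see on a relation with two routes between the same pair of items. The sketch below uses an illustrative relation and a hypothetical helper named `bool_mult()`; neither comes from the examples above.

```r
items <- c("a", "b", "c", "d")

# Illustrative relation: a -> b, a -> c, b -> d, c -> d (two routes from a to d).
M <- matrix(c(0, 1, 1, 0,
              0, 0, 0, 1,
              0, 0, 0, 1,
              0, 0, 0, 0), nrow = 4, byrow = TRUE, dimnames = list(items, items))

# Ordinary product: each entry counts the length-two paths.
counts <- M %*% M

# bool_mult() is a hypothetical helper: multiply, then reduce every
# non-zero count to 1 so only the presence of a path is recorded.
bool_mult <- function(A, B) (A %*% B > 0) * 1
presence <- bool_mult(M, M)

print(counts["a", "d"])
print(presence["a", "d"])
```

The ordinary product reports both routes from $a$ to $d$, while the Boolean version records only that a connection exists.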

## The Adjacency Matrix and Graphs

There's another perspective on the relation-as-a-matrix representation that makes the transitive closure extremely useful to a broad set of applications.
We can represent a graph of n vertices as an n x n adjacency matrix.
The adjacency matrix is a matrix with values of 1 for vertices i,j that have an edge connecting them in the graph and values of 0 when there is no edge.
A binary matrix that maps elements of a set according to some relation is identical to an adjacency matrix for a graph.
This means we can represent interesting graphs like social networks or maps of the web as adjacency matrices.
We can then compute the transitive closure of these relations and find all nodes that are connected to each other either directly or through intermediate nodes.
Limiting the path length, for example to two, would make easy work of determining who is connected by a friend-of-friends relation.
This is a powerful tool for community detection in graphs.
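As a sketch of the friend-of-friends idea, the code below assumes the symmetric form of the four-person friendship graph drawn at the start (edges a-b, a-c, b-c, c-d). The variable names are illustrative. It finds pairs joined by a path of length two and then drops pairs who are already direct friends, along with the trivial path from a person back to themselves.

```r
people <- c("a", "b", "c", "d")

# Symmetric "friend of" adjacency matrix (edges a-b, a-c, b-c, c-d).
A <- matrix(c(0, 1, 1, 0,
              1, 0, 1, 0,
              1, 1, 0, 1,
              0, 0, 1, 0), nrow = 4, byrow = TRUE, dimnames = list(people, people))

# Pairs connected by some path of exactly two edges.
two_step <- (A %*% A > 0) * 1

# Keep only new connections: not already direct friends, and not oneself.
fof <- two_step * (1 - A)
diag(fof) <- 0
print(fof)
```

Here $d$ turns up as a friend of a friend for both $a$ and $b$, in each case through $c$.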

## Compute Considerations for Matrix Multiplication

So, is it practical to just multiply the matrices together to compute the transitive closure?
Whenever we do work on a computer we should be interested in, meaning concerned about, the amount of time it takes to do that work.
If our data sizes are small, then we might get away with a naive approach for computing transitive closure.
By naive we mean a brute force computation that does all the multiplication and addition steps you'd expect to do for matrix multiplication.
Multiplying two nxn matrices means n^3 operations, so the naive approach is bounded by O(n^3).
But since we have to do n of these matrix multiplications, that really works out to O(n^4).

Deciding if our data is too big depends on that computational cost.
If we have 100 elements in our set, then a naive computation of transitive closure would be 100^4 or 100 million operations.
Probably not impossible to wait for, but the count grows by a factor of 10,000 for every factor of 10 increase in our set of elements.
Data becomes big pretty quickly at that rate.
A one-thousand node network would take 10,000 times longer than the 100 million operations that it took for our 100 element set, or about a trillion operations.
This could quickly start to feel like waiting or, worse, limit the amount of information we are able to consider.

With social networks or the web graph we easily imagine thousands, millions or even billions of nodes in our network.
Even if we are only interested in nodes connected by one or two intermediate nodes we are still bounded by O(n^3).
The naive approach quickly becomes unmanageable at the scale of thousands of nodes.

In 1969 Strassen introduced the idea of fast matrix multiplication and was able to demonstrate performance bounded by O(n^(2.8)).
These methods have been improved and the state of the art for fast matrix multiply is now about O(n^(2.373)).
Keep in mind that if we want the transitive closure we still need to do this n times, which bumps us back above O(n^3).

Are there faster methods?
The well-known Floyd-Warshall algorithm for computing the shortest path between all pairs of nodes in a weighted graph can be applied to this problem.
It runs in O(n^3), for n-node networks.
We don't really care about weights for transitive closure, but finding the cheapest path is a nice feature to have in real world networks.
Floyd-Warshall does this using a dynamic programming technique to keep the computational bound at O(n^3).
It considers each node in turn as a possible intermediate stop and keeps track of the shortest path found so far between every pair of nodes.
Updating each of the n^2 node pairs once for each of the n intermediate nodes keeps it bounded at O(n^3).
Because it finds the shortest path between all pairs, we end up with a complete set of paths across the graph at the end of the computation.
That is, we have the transitive closure in O(n^3) operations.
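Since the closure only needs reachability, not distances, a Boolean variant of the same idea (usually called Warshall's algorithm) is enough. The sketch below is one way to write it in R. The function name `warshall()` and the test chain are illustrative, and the inner two loops over node pairs are expressed as a single `outer()` call per intermediate node.

```r
# warshall() is a hypothetical helper name for the Boolean variant.
warshall <- function(M) {
  reach <- M > 0
  n <- nrow(reach)
  for (k in seq_len(n)) {
    # Allow item k as an intermediate stop: i reaches j if it already did,
    # or if i reaches k and k reaches j.
    reach <- reach | outer(reach[, k], reach[k, ], "&")
  }
  reach * 1
}

# Illustrative chain a -> b -> c -> d.
items <- c("a", "b", "c", "d")
M <- matrix(c(0, 1, 0, 0,
              0, 0, 1, 0,
              0, 0, 0, 1,
              0, 0, 0, 0), nrow = 4, byrow = TRUE, dimnames = list(items, items))
print(warshall(M))
```

Each pass over `k` touches all n^2 pairs once, and there are n passes, which is where the O(n^3) bound comes from.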

Let's revisit the earlier observation that we are working with binary matrices, which means we can use Boolean matrix multiplication.
We replace the element-wise multiplication by Boolean AND operations and the sum by OR operations.
This means the value for cell i,j in the result matrix will be 1 if, for any position k, there is a 1 in both cell i,k of the left matrix and cell k,j of the right matrix.
The Boolean OR operation means the result will be 1 if any one of these cell-level operations for the row and column multiplication is one.

We can rewrite our left-side matrix of the multiplication as a set of rows that just record the positions of the ones in each row.
To compute cell i,j of the result, we step through the recorded positions k in row i and inspect the corresponding cell k,j in the right-side matrix.
If we find a 1 in that position then we know there is a path and set the value of the resulting cell to 1.
We can then stop and move on to the next cell.
The expected time for most matrices is O(n^2).
The worst-case performance is O(n^3), but that's not any worse than the solutions above.
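A minimal sketch of this row-list approach follows. The function name `sparse_bool_mult()` is hypothetical, and plain R loops are used for clarity rather than speed. It is run on the "child of" and "sibling of" matrices from earlier, rebuilt here so the example is self-contained.

```r
# sparse_bool_mult() is a hypothetical helper name.
sparse_bool_mult <- function(A, B) {
  n <- nrow(A)
  # For each row i of the left matrix, keep only the positions k of its ones.
  ones_in_row <- lapply(seq_len(n), function(i) which(A[i, ] > 0))
  C <- matrix(0, nrow = n, ncol = ncol(B),
              dimnames = list(rownames(A), colnames(B)))
  for (i in seq_len(n)) {
    for (j in seq_len(ncol(B))) {
      for (k in ones_in_row[[i]]) {
        if (B[k, j] > 0) {
          C[i, j] <- 1
          break   # one connecting item is enough; skip the rest of the row
        }
      }
    }
  }
  C
}

person <- c("a", "b", "c", "d")
M_R <- matrix(0, 4, 4, dimnames = list(person, person))
M_R["a", "b"] <- 1                    # a is a "child of" b
M_S <- matrix(0, 4, 4, dimnames = list(person, person))
M_S["b", c("c", "d")] <- 1            # b is a "sibling of" c and d
print(sparse_bool_mult(M_R, M_S))
```

Rows with no ones cost nothing in the innermost loop, and the early `break` stops the search for a cell as soon as one connecting item is found.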

This gives us an expected O(n^2) algorithm for computing paths between nodes.
This means we can compute a 1-million node network in the same time it took us to compute a 1000 node network with our naive approach above.
That's a huge increase in the size of networks we can explore.
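The comparison is simple arithmetic on the two growth rates. As a quick check, the sketch below assumes the naive cost is n^4 operations and the faster method costs n^2, and ignores constant factors. The function names are illustrative.

```r
naive_ops <- function(n) n^4   # n matrix products at n^3 steps each
fast_ops  <- function(n) n^2   # expected-case bound for the row-list method

sizes <- c(100, 1000, 1e6)
print(data.frame(n = sizes,
                 naive = naive_ops(sizes),
                 fast = fast_ops(sizes)))

# A million nodes at n^2 costs the same as a thousand nodes at n^4.
print(isTRUE(all.equal(fast_ops(1e6), naive_ops(1000))))
```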
There's a lot of value in thinking about performance when working with computers.