# Differences in Block Matrices

In the previous section, we saw a situation where we had two networks which we thought could be effectively characterized as RDPGs. Further, we knew that the nodes were the same across both networks: that is, node $1$ in the first network was the same as node $1$ in the second network, and so on all the way up to node $n$. In this situation, we found that we could test whether the latent positions for the underlying RDPGs are the same; that is, whether $H_0: X_1 = X_2R$ against $H_A: X_1 \neq X_2R$. We called this the two-sample hypothesis test for RDPGs.

What if we can take this a step further, however, and say that the networks are realizations of SBMs? How can we check whether the block matrices are the same? If you remember from [Chapter 5.4](#link?), we learned that SBMs are just a "special case" of the RDPG. What this means is that, since the SBM is an RDPG, the SBM also has a latent position matrix. Therefore, we could test whether the networks are the same by just using the two-sample hypothesis test for the RDPG. If we fail to reject the null hypothesis, we have no evidence that the latent positions, and therefore the probability matrices, differ across the two networks; if we reject it, we conclude that they are different. This is excellent news, so are we done?

Not quite yet; as it turns out, when we think that the networks are realizations of SBMs, there are a lot more interesting questions we can ask about the probabilities that might arise. Specifically, we can deduce many useful ways in which two probability matrices for SBMs might be different, but *still* share similar characteristics.

For this example, we will introduce a new scenario. We have two networks which summarize the traffic patterns between $n=100$ towns (represented by the nodes in our network) across $K=3$ states (represented by the communities in our network). The first forty towns are in Pennsylvania (the first community), the next thirty towns are in New Jersey (the second community), and the final thirty towns are in New York (the third community). The community assignment vector $\vec z$ looks like this:
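
As a minimal sketch, the community assignment vector described above can be built with NumPy. The variable names `ns` and `z` are our own choices for illustration; only the community sizes (40, 30, 30) come from the scenario.

```python
import numpy as np

# Community sizes from the scenario: 40 towns in Pennsylvania,
# 30 in New Jersey, and 30 in New York.
ns = [40, 30, 30]

# Label the communities 1, 2, 3 to match the b_{kl} notation in the text.
z = np.repeat([1, 2, 3], ns)

print(z.shape)       # one label per town
print(np.unique(z))  # the three community labels
```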

For a month, we measure the number of drivers who commute from one town to the other in a specified time window, and if more than $1,000$ drivers regularly make this commute, we add an edge between the pair of towns. In general, we know that people tend to commute more frequently within their state, so the probability that an edge exists between a pair of towns in the same state exceeds the probability that an edge exists between a pair of towns in different states. Now, here's the twist: we have measured the first network between 8 AM and 8 PM (covering the bulk of the work day), and the second network between 8 PM and 8 AM (covering the bulk of the night). We know that a lot of people in New Jersey commute to New York for the work day, so the probability of an edge existing between a New Jersey town and a New York town is higher during the day than at night. We don't think that the driving patterns themselves really change, but we do think that the probability of an edge existing is about $50\%$ higher during the daytime. Further, since New York is a large hub for workers, we think that this applies to all pairs of communities in the network. The block matrices look like this:

We then sample two networks with the above parameters, giving us the following two networks for the day and the night:
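
The setup can be sketched in code as follows. The entries of `B_night` are illustrative values that we made up for this sketch, not the values used in this section. Only the community sizes and the factor of $1.5$ between day and night come from the scenario. The helper `sample_sbm` is a hypothetical function written here with plain NumPy, and it assumes the networks are undirected and have no self-loops.

```python
import numpy as np


def sample_sbm(z, B, rng):
    """Sample an undirected, loopless SBM adjacency matrix (hypothetical helper).

    z holds community labels 1..K and B is the K x K block matrix.
    """
    idx = z - 1                 # convert labels 1..K to indices 0..K-1
    P = B[idx][:, idx]          # n x n matrix with P[i, j] = B[z_i, z_j]
    n = len(z)
    # Draw each pair (i, j) with i < j once, then mirror to keep A symmetric.
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper | upper.T).astype(int)


z = np.repeat([1, 2, 3], [40, 30, 30])

# ASSUMED values for illustration only: within-state probabilities are
# larger than between-state probabilities, as the scenario describes.
B_night = np.array([[0.4, 0.1, 0.1],
                    [0.1, 0.4, 0.2],
                    [0.1, 0.2, 0.4]])
B_day = 1.5 * B_night           # every block is 50% higher during the day

rng = np.random.default_rng(0)
A_day = sample_sbm(z, B_day, rng)
A_night = sample_sbm(z, B_night, rng)

print(A_day.sum() // 2, A_night.sum() // 2)  # edge counts, day vs. night
```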

How can we ask whether the block matrices have similarities?

## Testing whether the block matrices in an SBM are different

Based on what we learned above, we know ahead of time that the block matrices for the SBMs are different. However, how can we actually test this? Well, let's start by being clear about what we mean by "different". To make this a little bit more mathematical, we'll introduce some new notation for the block matrices during the daytime ($B^{(d)}$) and at nighttime ($B^{(n)}$). The block matrices are:

\begin{align*}
    B^{(d)} &= \begin{bmatrix}
    b^{(d)}_{11} & b^{(d)}_{12} & b^{(d)}_{13} \\
    b^{(d)}_{21} & b^{(d)}_{22} & b^{(d)}_{23} \\
    b^{(d)}_{31} & b^{(d)}_{32} & b^{(d)}_{33}
    \end{bmatrix}; \;\;\; B^{(n)} = \begin{bmatrix}
    b^{(n)}_{11} & b^{(n)}_{12} & b^{(n)}_{13} \\
    b^{(n)}_{21} & b^{(n)}_{22} & b^{(n)}_{23} \\
    b^{(n)}_{31} & b^{(n)}_{32} & b^{(n)}_{33}
    \end{bmatrix}
\end{align*}
The hypothesis we want to test is the null hypothesis that the block matrices are the same, $H_0: B^{(d)} = B^{(n)}$, against the alternative hypothesis that the block matrices are different, $H_A: B^{(d)} \neq B^{(n)}$. Remember that two matrices are equal if all of their corresponding entries are identical, and unequal if at least one pair of corresponding entries is unequal. We can reformulate the null and alternative hypotheses with this logic.

The null hypothesis, $H_0: B^{(d)} = B^{(n)}$, is therefore equivalent to saying that for all pairs of communities $k$ and $l$, $b^{(d)}_{kl} = b^{(n)}_{kl}$. We will write each of these statements down as an individual hypothesis, using the convention $H_{0, kl}: b_{kl}^{(d)} = b^{(n)}_{kl}$. The null hypothesis $H_0$ then says that $H_{0,kl}$ is true for every pair of communities $k$ and $l$. Likewise, the alternative hypothesis, $H_A: B^{(d)} \neq B^{(n)}$, is equivalent to saying that for at least one pair of communities $k$ and $l$, $b^{(d)}_{kl} \neq b^{(n)}_{kl}$. We will write these statements down as individual hypotheses too, using the convention $H_{A, kl} : b_{kl}^{(d)} \neq b^{(n)}_{kl}$. The alternative hypothesis $H_A$ then says that $H_{A,kl}$ is true for at least one pair of communities $k$ and $l$.

Now that we have broken a statement about two matrices down into numerous statements about pairs of probabilities, we have almost completed our job. As it turns out, we have already seen the way we will test this, back in [Chapter 8.3](#link?)! Testing whether the block probabilities between communities $k$ and $l$ are the same, $H_{0,kl}: b_{kl}^{(d)} =  b_{kl}^{(n)}$, against whether they are different, $H_{A, kl}:  b_{kl}^{(d)} \neq  b_{kl}^{(n)}$, is the [two-sample testing problem](#link?)! How did we address this problem before?

We used Fisher's exact test. 
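
One way this could look in code is sketched below, using `scipy.stats.fisher_exact` once per pair of communities. This is a sketch under several assumptions, not a definitive implementation: the block matrix values are made up for illustration, the networks are taken to be undirected and loopless (so only pairs with $k \leq l$ are tested), and `sample_sbm` and `block_counts` are hypothetical helpers defined here.

```python
import numpy as np
from scipy.stats import fisher_exact


def sample_sbm(z, B, rng):
    """Sample an undirected, loopless SBM adjacency matrix (hypothetical helper)."""
    idx = z - 1
    P = B[idx][:, idx]
    n = len(z)
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper | upper.T).astype(int)


def block_counts(A, z, k, l):
    """Count (edges, non-edges) over distinct node pairs in block (k, l)."""
    sub = A[np.ix_(z == k, z == l)]
    if k == l:
        # Within a community, count each unordered pair once and skip the diagonal.
        vals = sub[np.triu_indices(sub.shape[0], k=1)]
    else:
        vals = sub.ravel()
    edges = int(vals.sum())
    return edges, int(vals.size) - edges


z = np.repeat([1, 2, 3], [40, 30, 30])
B_night = np.array([[0.4, 0.1, 0.1],   # ASSUMED values, for illustration only
                    [0.1, 0.4, 0.2],
                    [0.1, 0.2, 0.4]])
B_day = 1.5 * B_night

rng = np.random.default_rng(0)
A_day = sample_sbm(z, B_day, rng)
A_night = sample_sbm(z, B_night, rng)

# One 2 x 2 contingency table, and one test of H_{0,kl}, per block.
pvals = {}
for k in range(1, 4):
    for l in range(k, 4):
        table = [block_counts(A_day, z, k, l),
                 block_counts(A_night, z, k, l)]
        _, pvals[(k, l)] = fisher_exact(table)

for block, p in pvals.items():
    print(block, p)
```

Each p-value addresses a single $H_{0,kl}$ against $H_{A,kl}$. This sketch does not combine the six p-values into one decision about $H_0$, and it does not adjust for making several comparisons at once.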

## Testing whether the block matrices in an SBM are multiples of one another