
# Network modularity using the Zachary Karate network

A commonly used example in network analysis is Zachary's Karate Club. Wayne Zachary introduced this dataset in his 1977 paper "An Information Flow Model for Conflict and Fission in Small Groups", and it has been widely used ever since.

The network models the interactions between 34 members of a karate club. Each node represents an individual, and each link (edge) connects two individuals who interact outside of the karate club (i.e. socialise outside of the club).

There are two important individuals in the club: the officer John A (node 33) and the instructor Mr. Hi (node 0). A conflict occurred between John A and Mr. Hi, causing the club to split into two separate clubs, one led by John A and the other by Mr. Hi.

It is reasonable to assume that each member's decision to join either new club would be influenced by their relationships with other members of the club. As we have the data on these relationships, i.e. the network, we can attempt to predict which new club each person will join using the principles of network analysis.

First, we will begin by installing the required packages in Python for our analysis:

In [None]:
# Install networkx and import the required packages
!pip install -q networkx

import sys
import networkx as nx 
import matplotlib.pyplot as plt
import numpy as np
import math
import pandas as pd
import matplotlib as mpltlb

%matplotlib inline

We will then import the graph for Zachary's karate club:

In [None]:
#Let's import the ZKC graph:
ZKC_graph = nx.karate_club_graph()
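
As a quick sanity check on the imported graph, the sketch below reads its size and the `club` node attribute that NetworkX stores for each member. The variable names (`G`, `clubs`) are illustrative and not part of the notebook.

```python
import networkx as nx

G = nx.karate_club_graph()

# Basic size of the network
n_nodes = G.number_of_nodes()
n_edges = G.number_of_edges()
print(n_nodes, "nodes and", n_edges, "edges")

# Each node carries a "club" attribute recording the club it joined after the split
clubs = nx.get_node_attributes(G, "club")
print("Node 0 joined:", clubs[0])
print("Node 33 joined:", clubs[33])
```

The `club` attribute is the recorded outcome of the split, so it can serve as a reference when judging any partition proposed later.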

One way to represent a network mathematically is to use an adjacency matrix, denoted $A$. Each element $A_{i,j}$ is equal to $1$ if member $i$ interacts with member $j$ outside of the club, and $0$ if they do not. Because interactions are mutual, the matrix is symmetric: $A_{i,j} = A_{j,i}$.
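
To make the definition concrete, here is a minimal sketch that builds the adjacency matrix of a hypothetical four-member network by hand. The link list and the names `toy_links` and `toy_A` are made up for illustration.

```python
import numpy as np

# Hypothetical 4-member network with links 0-1, 0-2 and 2-3
toy_links = [(0, 1), (0, 2), (2, 3)]

toy_A = np.zeros((4, 4), dtype=int)
for i, j in toy_links:
    toy_A[i, j] = 1  # i interacts with j
    toy_A[j, i] = 1  # and therefore j interacts with i

print(toy_A)

# Undirected links make the matrix symmetric
is_symmetric = np.array_equal(toy_A, toy_A.T)

# Every link appears twice in the matrix, so halve the sum to count links
n_links = int(toy_A.sum()) // 2
print("symmetric:", is_symmetric, "| links:", n_links)
```

The factor of two matters later: summing all entries of an undirected adjacency matrix counts each link twice.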

Displaying $A$ as a "spy" plot will help us look into what is happening in the graph:

In [None]:
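# to_numpy_matrix was removed in NetworkX 3.0; to_numpy_array replaces it.
# weight=None returns a 0/1 matrix even if the edges carry weights.
A = nx.to_numpy_array(ZKC_graph, weight=None)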

# Create a little plot of the adjacency matrix that shows which nodes are
# connected to which other nodes
fig, axs = plt.subplots()
axs.spy(A, markersize=5)
print(A)

# Check the size of the adjacency matrix
print(A.shape)

The blue squares in this matrix indicate values of 1, and the white areas show values of zero. The structure of the network is already fairly clear from this matrix, which suggests that the members are numbered so that closely connected individuals sit near each other. Members 0 to 15 are fairly closely connected, and members 24 to 33 are also well connected.
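
One way to put numbers on this visual impression is to compare how densely filled the corresponding blocks of the matrix are. The sketch below is an illustration only: the helper `block_density` and the index ranges are assumptions chosen to match the ranges named above.

```python
import networkx as nx
import numpy as np

# 0/1 adjacency matrix of the karate club graph
adj = nx.to_numpy_array(nx.karate_club_graph(), weight=None)

def block_density(matrix, rows, cols):
    """Fraction of entries equal to 1 in the sub-matrix rows x cols."""
    return float(matrix[np.ix_(rows, cols)].mean())

low = list(range(0, 16))    # members 0 to 15
high = list(range(24, 34))  # members 24 to 33

d_low = block_density(adj, low, low)
d_high = block_density(adj, high, high)
d_between = block_density(adj, low, high)

print("within 0-15:     ", d_low)
print("within 24-33:    ", d_high)
print("between the two: ", d_between)
```

The within-block figures include the zero diagonal, so they slightly understate the density among distinct pairs.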

Let's propose a two-partition grouping for the network. Let's put the even-numbered members (0, 2, 4, ...) in group 1 and the odd-numbered members in group 2. There are 34 members in total.

In [None]:
Groups = np.array([1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2])
print(Groups)
print(len(Groups))
print(A.shape)

TotalNodes = 34
TotalLinks = 0

# Count the non-zero entries of A. The matrix is symmetric, so every
# undirected link is counted twice (once as i-j and once as j-i).
for i in range(34):
  for j in range(34):
    if A[i, j] == 1:
      TotalLinks = TotalLinks + 1

Nodes_in_module_1 = 0
Nodes_in_module_2 = 0

for i in range(34):
  if Groups[i] == 1:
    Nodes_in_module_1 = Nodes_in_module_1 + 1
  if Groups[i] == 2:
    Nodes_in_module_2 = Nodes_in_module_2 + 1

# Halve the count to report the number of undirected links
print("There are",TotalLinks // 2,"links in total")
print("There are",Nodes_in_module_1,"nodes in module 1")
print("There are",Nodes_in_module_2,"nodes in module 2")
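
The same counts can be obtained without explicit loops. This is a sketch of a vectorised equivalent rather than the notebook's own method; the names `adj`, `groups`, `nonzero_entries` and `group_sizes` are illustrative.

```python
import networkx as nx
import numpy as np

G = nx.karate_club_graph()

# weight=None gives a 0/1 matrix even if the edges carry weights
adj = nx.to_numpy_array(G, weight=None)

# Same alternating labels as Groups: 1, 2, 1, 2, ...
groups = np.tile([1, 2], 17)

# Summing the matrix counts each undirected link twice
nonzero_entries = int(adj.sum())
undirected_links = nonzero_entries // 2

# Number of nodes carrying each label
group_sizes = {g: int((groups == g).sum()) for g in (1, 2)}

print(nonzero_entries, "non-zero entries,", undirected_links, "links")
print(group_sizes)
```

Comparing `undirected_links` with `G.number_of_edges()` is a cheap check that the matrix was built as intended.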


Let's calculate the modularity of this proposed partition, using the methods that were shown on the slides.
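
Written out, the quantity computed in the next cell is the following. This is a transcription of the code, not a formula copied from the slides, and the symbols $W_m$, $n_m$, $L$ and $N$ are introduced here only for this purpose.

$$
Q_m = \frac{W_m}{n_m \, L / N}, \qquad Q = Q_1 + Q_2
$$

Here $W_m$ is the sum of $A_{i,j}$ over all pairs $(i, j)$ with both members in module $m$, $n_m$ is the number of nodes in module $m$, $L$ is the sum of all entries of $A$, and $N = 34$ is the number of nodes. Both $W_m$ and $L$ count each undirected link twice, so the factor of two cancels in the ratio.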

In [None]:
Within_module_1 = 0
Within_module_2 = 0

# As with TotalLinks, each within-module link is counted twice here
for i in range(34):
  for j in range(34):
    if A[i, j] == 1 and Groups[i] == 1 and Groups[j] == 1:
      Within_module_1 = Within_module_1 + 1

for i in range(34):
  for j in range(34):
    if A[i, j] == 1 and Groups[i] == 2 and Groups[j] == 2:
      Within_module_2 = Within_module_2 + 1

Q1 = Within_module_1/(Nodes_in_module_1*TotalLinks/TotalNodes) 
Q2 = Within_module_2/(Nodes_in_module_2*TotalLinks/TotalNodes) 
Q = Q1 + Q2

print("The modularity of the first module is",Q1)
print("The modularity of the second module is",Q2)
print("The modularity of the partition is",Q)

# The within-module counters count each link twice, so halve them here
print(Within_module_1 // 2,"edges were entirely within the first module")
print(Within_module_2 // 2,"edges were entirely within the second module")
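
For comparison, NetworkX ships its own `modularity` function, which implements the Newman-style definition rather than the ratio used above, so the numbers are not directly comparable with `Q`. The sketch below scores the alternating partition against the split recorded in the `club` node attribute; the names `parity_partition` and `club_partition` are illustrative.

```python
import networkx as nx
from networkx.algorithms.community import modularity

G = nx.karate_club_graph()

# Alternating partition from above: even-numbered vs odd-numbered nodes
parity_partition = [
    {n for n in G if n % 2 == 0},
    {n for n in G if n % 2 == 1},
]

# Partition recorded in the data: the club each member actually joined
club_partition = [
    {n for n, c in G.nodes(data="club") if c == "Mr. Hi"},
    {n for n, c in G.nodes(data="club") if c == "Officer"},
]

# weight=None treats every link as unweighted
Q_parity = modularity(G, parity_partition, weight=None)
Q_club = modularity(G, club_partition, weight=None)

print("Alternating partition:", Q_parity)
print("Recorded club split:  ", Q_club)
```

Under this definition a partition that ignores the network structure scores close to zero, while one that follows the real communities scores clearly higher.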


Now let's plot the proposed partition.

In [None]:
# create the figure
fig, axs = plt.subplots()

# make a vector of locations for each of the nodes
location = np.zeros([34, 2])

# initially just assign them a random value
for n in range(34):
  location[n, :] = np.random.rand(1, 2)

for i in range(34):
  if Groups[i] == 1:
    location[i, 0] = location[i, 0]+2
    axs.plot(location[i, 0], location[i, 1], 'go', markersize = 10)

for i in range(34):
  if Groups[i] == 2:
    axs.plot(location[i, 0], location[i, 1], 'mo', markersize = 10)

# now need to plot the connections between members
for i in range(34):
  for j in range(i, 34):

    # if a connection between i and j exists, plot the connection
    if A[i, j] == 1:
      axs.plot([location[i, 0], location[j, 0]], [location[i, 1], location[j, 1]], 'k', alpha = 0.5, linestyle='dashed', linewidth = 0.5)

# show the figure
plt.show()
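
Because the node positions are drawn at random, the figure looks different on every run. A minimal sketch of a reproducible, loop-free way to build the same kind of layout is shown below. The seed value and the names `rng`, `groups` and `positions` are arbitrary choices for illustration.

```python
import numpy as np

# Fixed seed so the layout is identical on every run (value chosen arbitrarily)
rng = np.random.default_rng(seed=0)

# Same alternating labels as Groups: 1, 2, 1, 2, ...
groups = np.tile([1, 2], 17)

# Random x and y coordinates in [0, 1) for all 34 nodes
positions = rng.random((34, 2))

# Shift module 1 two units to the right so the two modules sit apart
positions[groups == 1, 0] += 2

print(positions[:4])
```

The resulting array has the same shape as `location` and could be passed to the plotting loops in its place.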