# P02-03: Modularity Optimisation

*April 30 2020*  

In the third and final unit of this week's practice lecture, we explore the partition quality function $Q(n, C)$, which captures how well a given mapping $C$ of nodes to communities aligns with the topology of a network $n$. We use this function to develop a simple community detection algorithm based on the heuristic optimisation of partition quality. We also implement the community assortativity coefficient, which allows us to compare the optimal partition quality (i.e. the modularity) to the maximum possible modularity in a given network.

In [1]:
import pathpy as pp
import numpy as np

import sqlite3
from tqdm import tqdm

import random
import seaborn as sns
import matplotlib.pyplot as plt

plt.style.use('default')
sns.set_style("whitegrid")
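
As a minimal sketch of the partition quality function $Q(n, C)$ introduced at the start of this unit, assume the standard Newman-Girvan definition for undirected, unweighted networks: $Q = \sum_c \left[ L_c/m - (D_c/2m)^2 \right]$, where $m$ is the number of links, $L_c$ the number of links inside community $c$, and $D_c$ the total degree of the nodes in $c$. The function name `modularity` and the plain edge-list representation are illustrative assumptions and not part of `pathpy`.

```python
from collections import defaultdict


def modularity(edges, C):
    """Partition quality Q of an undirected, unweighted network.

    edges: list of (u, v) pairs, each undirected link listed once
    C:     dict mapping every node to a community label
    (hypothetical helper, not a pathpy function)
    """
    m = len(edges)
    internal = defaultdict(int)    # L_c: links with both endpoints in c
    degree_sum = defaultdict(int)  # D_c: total degree of the nodes in c
    for u, v in edges:
        degree_sum[C[u]] += 1
        degree_sum[C[v]] += 1
        if C[u] == C[v]:
            internal[C[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Example network from the lecture, written as an edge list
edges = [('a', 'b'), ('b', 'c'), ('a', 'c'), ('b', 'd'), ('d', 'f'),
         ('d', 'g'), ('d', 'e'), ('e', 'f'), ('f', 'g')]

# Partition separating the triangle {a, b, c} from {d, e, f, g}
C = {'a': 0, 'b': 0, 'c': 0, 'd': 1, 'e': 1, 'f': 1, 'g': 1}

q = modularity(edges, C)
print('Q =', q)
```

For this partition the formula gives $3/9 - (7/18)^2 + 5/9 - (11/18)^2 = 118/324 \approx 0.364$.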

We first construct the example network from the lecture, on which we can test modularity-based community detection.

In [2]:
n = pp.Network(directed=False)
n.add_edge('a', 'b')
n.add_edge('b', 'c')
n.add_edge('a', 'c')
n.add_edge('b', 'd')
n.add_edge('d', 'f')
n.add_edge('d', 'g')
n.add_edge('d', 'e')
n.add_edge('e', 'f')
n.add_edge('f', 'g')
n.plot()

In [3]:
C, q = pp.algorithms.community_detection.modularity_maximisation(n, iterations=100)
print('Community partition =', C)
print('Modularity =', q)

Community partition = {'g': 0, 'e': 0, 'b': 2, 'a': 2, 'd': 0, 'f': 0, 'c': 2}
Modularity = 0.3641975308641976
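
The community assortativity coefficient mentioned in the introduction relates this modularity to the largest value the same partition could reach. The sketch below assumes that the maximum is $Q_{max} = 1 - \sum_c (D_c/2m)^2$, i.e. the modularity obtained if every link fell inside a community while all node degrees stayed the same. The helper name `max_modularity` and the edge-list input are illustrative assumptions, not `pathpy` functions.

```python
from collections import defaultdict


def max_modularity(edges, C):
    """Assumed upper bound Q_max = 1 - sum_c (D_c / 2m)^2 for partition C.

    edges: list of (u, v) pairs, each undirected link listed once
    C:     dict mapping every node to a community label
    (hypothetical helper, not a pathpy function)
    """
    m = len(edges)
    degree_sum = defaultdict(int)  # D_c: total degree of the nodes in c
    for u, v in edges:
        degree_sum[C[u]] += 1
        degree_sum[C[v]] += 1
    return 1 - sum((d / (2 * m)) ** 2 for d in degree_sum.values())


# Example network and the partition reported above
edges = [('a', 'b'), ('b', 'c'), ('a', 'c'), ('b', 'd'), ('d', 'f'),
         ('d', 'g'), ('d', 'e'), ('e', 'f'), ('f', 'g')]
C = {'g': 0, 'e': 0, 'b': 2, 'a': 2, 'd': 0, 'f': 0, 'c': 2}
q = 0.3641975308641976  # modularity reported above

q_max = max_modularity(edges, C)
print('Q_max =', q_max)
print('Community assortativity coefficient =', q / q_max)
```

Under this assumption, $Q_{max} = 154/324 \approx 0.475$ for the example, so the coefficient is $118/154 \approx 0.766$.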


We can also use the function `color_map` to generate a color mapping that can be directly passed to the `plot` function:

In [4]:
n.plot(node_color=pp.algorithms.community_detection.color_map(n, C))

## Modularity-based Community Detection in Empirical Networks

We conclude this week's practice lecture with an application of modularity-based community detection to empirical networks. We limit ourselves to the undirected networks `highschool`, `physicians`, and `lotr` in our database:

In [5]:
n_highschool = pp.io.sql.read_network('networks.db', sql='SELECT source, target FROM "highschool"', directed=False)
n_physicians = pp.io.sql.read_network('networks.db', sql='SELECT source, target FROM "physicians"', directed=False)
n_lotr = pp.io.sql.read_network('networks.db', sql='SELECT source, target FROM "lotr"', directed=False)

We use `matplotlib` to plot the number of detected communities against the number of iterations performed by the optimisation algorithm in our implementation of the method `find_communities`. This helps us judge whether the algorithm has converged.
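
The following is a minimal sketch of such a heuristic optimisation, not necessarily the implementation used in the lecture. It assumes a simple random-merge strategy: start with every node in its own community, repeatedly pick two communities at random, and keep the merge only if $Q$ increases. It works on a plain edge list rather than a `pathpy` network, and the `seed` parameter and the `modularity` helper are illustrative assumptions. Like the call to `find_communities` further below, it returns the partition, its modularity, and the number of communities after each iteration.

```python
import random
from collections import defaultdict


def modularity(edges, C):
    """Q = sum_c [L_c/m - (D_c/2m)^2] for an undirected edge list."""
    m = len(edges)
    internal, degree_sum = defaultdict(int), defaultdict(int)
    for u, v in edges:
        degree_sum[C[u]] += 1
        degree_sum[C[v]] += 1
        if C[u] == C[v]:
            internal[C[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def find_communities(edges, iterations=100, seed=None):
    """Random-merge heuristic (assumed strategy, see text above)."""
    rng = random.Random(seed)
    nodes = sorted({v for e in edges for v in e})
    C = {v: i for i, v in enumerate(nodes)}  # one community per node
    q = modularity(edges, C)
    nums = []  # number of communities after each iteration
    for _ in range(iterations):
        labels = sorted(set(C.values()))
        if len(labels) > 1:
            x, y = rng.sample(labels, 2)
            # Tentatively merge community y into community x
            candidate = {v: (x if c == y else c) for v, c in C.items()}
            q_new = modularity(edges, candidate)
            if q_new > q:  # accept only merges that improve Q
                C, q = candidate, q_new
        nums.append(len(set(C.values())))
    return C, q, nums


# Example network from the lecture, written as an edge list
edges = [('a', 'b'), ('b', 'c'), ('a', 'c'), ('b', 'd'), ('d', 'f'),
         ('d', 'g'), ('d', 'e'), ('e', 'f'), ('f', 'g')]
C, q, nums = find_communities(edges, iterations=200, seed=42)
print('Number of communities =', nums[-1])
print('Modularity =', q)
```

Because merges are never undone, `nums` can only decrease or stay constant, so a long flat tail in the plot suggests that no further improving merge was found. Each iteration recomputes $Q$ from scratch, which keeps the sketch short but becomes slow for larger networks.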

In [18]:
C, q = pp.algorithms.community_detection.modularity_maximisation(n_lotr, iterations=2000)



maximising modularity:   0%|                                                                  | 0/2000 [00:00<?, ?it/s]
[...]
maximising modularity:  95%|████████████████████████████████████████████████████▍  | 1905/2000 [06:13<01:02,  1.51it/s]

In [20]:
# n_lotr.plot(node_color=pp.algorithms.community_detection.color_map(n_lotr, C))

In [6]:
C, q_opt, nums = find_communities(n_highschool, iterations=2000)
q_max = Qmax(n_highschool, C)

print("Number of communities =", len(set(C.values())))
print("Modularity =", q_opt)
print("Community assortativity coefficient =", q_opt/q_max)
plt.clf()
plt.plot(nums)
plt.show()

NameError: name 'find_communities' is not defined