# Testing modularity code for HyperNetX

This notebook tests new code for the `hypergraph_modularity` module that takes advantage of the HNX2 data structure.
The code can be installed from the following forked repository (branch `modularity`):

```
pip install git+https://github.com/ftheberge/HyperNetX.git@modularity#egg=hypernetx
```


# Updates to HNX2.0 modularity module

### Unchanged functions

- dict2part(D)
- part2dict(A)
- linear(d, c)
- majority(d, c)
- strict(d, c)
- two_section(HG)
- kumar(HG, delta=0.01, verbose=False)
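
The two conversion helpers at the top of this list switch between two ways of writing a partition: a dictionary mapping each vertex to a part label, and a list of vertex sets. The sketch below illustrates that round trip in plain Python. The names `dict_to_parts` and `parts_to_dict` are hypothetical stand-ins, and the real `dict2part` and `part2dict` may differ in details such as the ordering of parts.

```python
from collections import defaultdict


def dict_to_parts(labels):
    """Group vertices by label: {vertex: label} -> list of sets."""
    groups = defaultdict(set)
    for vertex, label in labels.items():
        groups[label].add(vertex)
    return list(groups.values())


def parts_to_dict(parts):
    """Index each part: list of sets -> {vertex: index of its part}."""
    return {vertex: i for i, part in enumerate(parts) for vertex in part}


# Hypothetical three-vertex example
labels = {"a": 0, "b": 0, "c": 1}
parts = dict_to_parts(labels)
print(parts)
print(parts_to_dict(parts))
```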

### No longer required

- precompute_attributes(H)
- _compute_partition_probas(HG, A)
- _degree_tax(HG, Pr, wdc)
- _edge_contribution(HG, A, wdc)
- _delta_ec(HG, P, v, a, b, wdc)
- _bin_ppmf(d, c, p)
- _delta_dt(HG, P, v, a, b, wdc)

### New versions

- modularity(HG, A, wdc=linear)
- last_step(HG, L, wdc=linear, delta=0.01, verbose=False)
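
The `wdc` argument selects how much a hyperedge of size `d` contributes when `c` of its vertices fall in the same part. The sketch below shows one plausible reading of the three choices `linear`, `majority` and `strict`, under the assumption that all three require a strict majority (`c > d/2`) and that `strict` additionally requires `c == d`. The `_sketch` names are hypothetical, and the definitions in `hypergraph_modularity` remain authoritative.

```python
def linear_sketch(d, c):
    """Assumed: weight grows linearly with the majority fraction c/d."""
    return c / d if c > d / 2 else 0


def majority_sketch(d, c):
    """Assumed: full weight as soon as more than half the vertices agree."""
    return 1 if c > d / 2 else 0


def strict_sketch(d, c):
    """Assumed: weight only when the whole hyperedge lies in one part."""
    return 1 if c == d else 0


# Compare the three weights for a hyperedge of size 5
for c in range(1, 6):
    print(c, linear_sketch(5, c), majority_sketch(5, c), strict_sketch(5, c))
```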

### New (hidden) functions

- _last_step_unweighted
- _last_step_weighted


In [1]:
import pandas as pd
import numpy as np
import igraph as ig
import hypernetx as hnx
import hypernetx.algorithms.hypergraph_modularity as hmod ## we re-wrote some of those functions
import pickle
import matplotlib.pyplot as plt
%matplotlib inline
from collections import Counter
import warnings
warnings.simplefilter('ignore')
print('HNX version:',hnx.__version__)
Datadir = "./Data/"

HNX version: 2.0.4


# Experiment with h-ABCD hypergraphs

We generated 4 h-ABCD hypergraphs with parameters:

* -n 1000 -d 2.5,5,50 -c 1.5,50,200 -x 0.5 -q 0.0,0.4,0.3,0.2,0.1 -w :**linear** -s 1234 -o linear_1000
* same as above with **strict**, **majority**
* -n 1000 -d 2.5,5,50 -c 1.5,50,200 -x 0.5 -q 0.0,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1 -w :linear -s 1234 -o linear_large_edges_1000
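
As a quick sanity check on the `-q` vectors, the sketch below computes the mean hyperedge size they would imply. It assumes that the i-th entry of `-q` is the proportion of hyperedges of size i (so the leading `0.0` means no edges of size 1); that interpretation is not stated in this notebook and should be checked against the h-ABCD generator's documentation. The helper name `expected_edge_size` is hypothetical.

```python
def expected_edge_size(q):
    """Mean hyperedge size, assuming q[i] is the share of edges of size i + 1."""
    return sum((i + 1) * p for i, p in enumerate(q))


q_small = [0.0, 0.4, 0.3, 0.2, 0.1]   # first three hypergraphs
q_large = [0.0] + [0.1] * 10          # linear_large_edges_1000

for name, q in [("small edges", q_small), ("large edges", q_large)]:
    print(name, "max size:", len(q), "mean size:", round(expected_edge_size(q), 3))
```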


In [2]:
## pick one of the 4 examples
file_vertex_labels = Datadir+'linear_1000_assign.txt'
file_hyperedges = Datadir+'linear_1000_he.txt'

In [3]:
## read data, build hypergraph
with open(file_hyperedges, 'r') as file:
    # Read all the lines of the file into a list
    lines = file.readlines()
hyperedges = [x.replace('\n','').split(',') for x in lines]

## for testing purposes - add edges of size 1 and multi-edges
#hyperedges.extend([['1'],['2'],['3']])
#hyperedges.extend([['1','1'],['2'],['3','3']])

with open(file_vertex_labels, 'r') as file:
    # Read all the lines of the file into a list
    vertex_labels = np.array([int(y) for y in file.read().splitlines()])

H = hnx.Hypergraph(hyperedges)
## optional - add random edge weights for testing purposes
#for e in H.edges:
#    H.edges[e].weight = np.random.choice(10)+1
H.shape

(1000, 3385)
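
The hyperedge file read above holds one hyperedge per line, with vertex names separated by commas. The sketch below parses the same layout from an in-memory string so it can be tried without the data files. The helper `parse_hyperedges` is hypothetical, and skipping blank lines is an addition that the cell above does not perform.

```python
import io


def parse_hyperedges(handle):
    """Return one list of vertex names per non-empty line."""
    edges = []
    for line in handle:
        line = line.strip()
        if line:  # ignore blank lines (an assumption, see the note above)
            edges.append(line.split(","))
    return edges


# Hypothetical file contents: three hyperedges and one blank line
sample = io.StringIO("1,2,3\n2,4\n\n5,6,7,8\n")
edges = parse_hyperedges(sample)
sizes = [len(e) for e in edges]
print(edges)
print(sizes)
```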

In [4]:
%%time
## Cluster the 2-section graph (with Louvain)
G = hmod.two_section(H)
G.vs['louvain'] = G.community_multilevel(weights='weight').membership
ML = hmod.dict2part({v['name']:v['louvain'] for v in G.vs})


CPU times: user 176 ms, sys: 3.96 ms, total: 180 ms
Wall time: 180 ms
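
The 2-section step replaces each hyperedge by a clique on its vertices, so that a graph algorithm such as Louvain can be applied. The sketch below builds the pair weights in plain Python, assuming unit hyperedge weights and assuming that an edge of size `d` gives weight `1/(d-1)` to each of its vertex pairs (a choice that keeps each vertex's weighted degree equal to its hypergraph degree). The function `two_section_weights` is hypothetical; `hmod.two_section` returns an igraph graph and may weight pairs differently.

```python
from collections import defaultdict
from itertools import combinations


def two_section_weights(hyperedges):
    """Map each vertex pair to its accumulated 2-section weight.

    Assumes unit hyperedge weights; an edge of size d spreads
    weight 1 / (d - 1) over each of its vertex pairs.
    """
    weights = defaultdict(float)
    for edge in hyperedges:
        members = sorted(set(edge))
        d = len(members)
        if d < 2:
            continue  # singleton edges create no pairs
        for u, v in combinations(members, 2):
            weights[(u, v)] += 1.0 / (d - 1)
    return dict(weights)


# Hypothetical example: one edge of size 3 and one of size 2
pairs = two_section_weights([["a", "b", "c"], ["a", "b"]])
print(pairs)
```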


In [5]:
%%time
## Compute qH's
print('qH-linear:',hmod.modularity(H, ML, wdc=hmod.linear))
print('qH-majority:',hmod.modularity(H, ML, wdc=hmod.majority))
print('qH-strict:',hmod.modularity(H, ML, wdc=hmod.strict))


qH-linear: 0.36274807787558344
qH-majority: 0.3846497301541902
qH-strict: 0.3122958376208558
CPU times: user 50.2 ms, sys: 3.73 ms, total: 53.9 ms
Wall time: 52.8 ms
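
As a rough illustration of what these scores measure, the sketch below computes a hypergraph modularity as an edge contribution minus a degree tax, where the degree tax is the expected contribution under a random model with the same part volumes. It assumes unit edge weights, hyperedges of size at least 2 with distinct vertices, and every vertex assigned to a part. `modularity_sketch` and `linear_wdc` are hypothetical names; this is not the HNX implementation and may not match its handling of weights or small edges.

```python
from collections import Counter
from math import comb


def linear_wdc(d, c):
    """Assumed linear weight: c/d for a strict majority, else 0."""
    return c / d if c > d / 2 else 0


def modularity_sketch(hyperedges, parts, wdc=linear_wdc):
    """(edge contribution - degree tax) / number of hyperedges."""
    part_of = {v: i for i, part in enumerate(parts) for v in part}
    m = len(hyperedges)

    # Edge contribution: reward edges whose strict majority sits in one part.
    edge_contribution = 0.0
    for edge in hyperedges:
        d = len(edge)
        c = Counter(part_of[v] for v in edge).most_common(1)[0][1]
        if c > d / 2:
            edge_contribution += wdc(d, c)

    # Volume of a part = sum of the degrees of its vertices.
    volume = [0] * len(parts)
    for edge in hyperedges:
        for v in edge:
            volume[part_of[v]] += 1
    total = sum(volume)

    # Degree tax: binomial probability that c of d endpoints land in a part.
    degree_tax = 0.0
    for d, n_d in Counter(len(e) for e in hyperedges).items():
        for c in range(d // 2 + 1, d + 1):
            for vol in volume:
                p = vol / total
                degree_tax += n_d * wdc(d, c) * comb(d, c) * p**c * (1 - p) ** (d - c)

    return (edge_contribution - degree_tax) / m


# Hypothetical example: two disjoint edges, each forming its own part
toy_edges = [["a", "b"], ["c", "d"]]
q_split = modularity_sketch(toy_edges, [{"a", "b"}, {"c", "d"}])
q_single = modularity_sketch(toy_edges, [{"a", "b", "c", "d"}])
print(q_split, q_single)
```

On hyperedges of size 2 this reduces to ordinary graph modularity, which gives a convenient way to check the sketch on small graphs.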


In [6]:
%%time
## Cluster the hypergraph (with Kumar's algorithm)
KU = hmod.kumar(H, verbose=True)


pass completed, max edge weight difference: 0.4896602658788774
pass completed, max edge weight difference: 0.24534711964549483
pass completed, max edge weight difference: 0.1237370753323486
pass completed, max edge weight difference: 0.060298498276710986
pass completed, max edge weight difference: 0.03037912358444117
pass completed, max edge weight difference: 0.019591026587887743
pass completed, max edge weight difference: 0.010293189931068439
pass completed, max edge weight difference: 0.0031758139463318596
CPU times: user 8.1 s, sys: 23.3 ms, total: 8.12 s
Wall time: 8.12 s


In [7]:
%%time
## Compute qH's
print('qH-linear:',hmod.modularity(H, KU, wdc=hmod.linear))
print('qH-majority:',hmod.modularity(H, KU, wdc=hmod.majority))
print('qH-strict:',hmod.modularity(H, KU, wdc=hmod.strict))


qH-linear: 0.3658125350155276
qH-majority: 0.38097480164587144
qH-strict: 0.32741539570353195
CPU times: user 44.2 ms, sys: 6.39 ms, total: 50.6 ms
Wall time: 49.3 ms


In [8]:
%%time
## try improving the selected qH (here linear) via a simple heuristic
KU_ls = hmod.last_step(H, KU, wdc=hmod.linear, verbose=True)

initial qH: 0.3658125350155276
110 moves, new qH: 0.3858984258713437
18 moves, new qH: 0.38853167207781947
CPU times: user 24.2 s, sys: 108 ms, total: 24.3 s
Wall time: 24.2 s
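
The log above shows passes of vertex moves, each followed by a new qH, stopping once a pass no longer helps much. A generic greedy refinement of that kind can be sketched as follows. This is an illustration under stated assumptions, not the `last_step` implementation: `greedy_refine` is a hypothetical name, it re-evaluates the full quality function for every candidate move (simple but slow), and the toy quality function is ordinary graph modularity on a seven-edge graph rather than the hypergraph modularity used in this notebook.

```python
def greedy_refine(parts, quality, delta=0.01):
    """Move single vertices between parts; stop when a pass gains less than delta."""
    parts = [set(p) for p in parts]
    current = quality(parts)
    while True:
        start = current
        for v in sorted(set().union(*parts)):
            source = next(i for i, p in enumerate(parts) if v in p)
            best_target, best_q = source, current
            for target in range(len(parts)):
                if target == source:
                    continue
                # Tentatively move v, score the partition, then undo the move.
                parts[source].remove(v)
                parts[target].add(v)
                q = quality(parts)
                parts[target].remove(v)
                parts[source].add(v)
                if q > best_q:
                    best_target, best_q = target, q
            if best_target != source:
                parts[source].remove(v)
                parts[best_target].add(v)
                current = best_q
        if current - start < delta:
            break
    return [p for p in parts if p], current


# Toy data (hypothetical): two triangles joined by the edge (c, d)
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]


def graph_modularity(parts):
    """Ordinary graph modularity of a partition of the toy graph."""
    m = len(EDGES)
    q = 0.0
    for part in parts:
        inside = sum(1 for u, v in EDGES if u in part and v in part)
        vol = sum((u in part) + (v in part) for u, v in EDGES)
        q += inside / m - (vol / (2 * m)) ** 2
    return q


initial = [{"a", "b", "c", "d"}, {"e", "f"}]  # vertex d starts in the wrong part
initial_q = graph_modularity(initial)
refined, refined_q = greedy_refine(initial, graph_modularity)
print(initial_q, refined_q, refined)
```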


In [9]:
%%time
## Compute qH with current HNX function
print('qH-linear:',hmod.modularity(H, KU_ls, wdc=hmod.linear))
print('qH-majority:',hmod.modularity(H, KU_ls, wdc=hmod.majority))
print('qH-strict:',hmod.modularity(H, KU_ls, wdc=hmod.strict))

qH-linear: 0.38853167207781947
qH-majority: 0.41795392766262496
qH-strict: 0.32015322871426566
CPU times: user 43.9 ms, sys: 8 ms, total: 51.9 ms
Wall time: 49.1 ms
