# Testing modularity code for HyperNetX

This notebook tests a rewrite of the `hypergraph_modularity` module that takes advantage of the new data structure in HNX 2.
The code can be installed from the following forked repo (branch `modularity`):

```
pip install git+https://github.com/ftheberge/HyperNetX.git@modularity#egg=hypernetx 
```


# Updates to HNX2.0 modularity module

### Unchanged functions:

- dict2part(D)
- part2dict(A)
- linear(d, c)
- majority(d, c)
- strict(d, c)
- two_section(HG)
- kumar(HG, delta=0.01, verbose=False)
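
For orientation, the small helpers in this list can be approximated in a few lines of plain Python. The sketch below is written from the behaviour we expect of these functions, not copied from the library: the partition helpers convert between a vertex-to-label dictionary and a list of vertex sets, and the three weight functions score a hyperedge of size `d` whose largest part holds `c` of its vertices. The `_sketch` suffixes mark every name as a hypothetical stand-in.

```python
def dict2part_sketch(D):
    """Turn {vertex: label} into a list of sets, one per label 0..max(label)."""
    parts = [set() for _ in range(max(D.values()) + 1)]
    for vertex, label in D.items():
        parts[label].add(vertex)
    return parts


def part2dict_sketch(A):
    """Turn a list of vertex sets into {vertex: index of its set}."""
    return {vertex: i for i, part in enumerate(A) for vertex in part}


def linear_sketch(d, c):
    """c/d when strictly more than half of the d vertices agree, else 0."""
    return c / d if c > d / 2 else 0


def majority_sketch(d, c):
    """1 when strictly more than half of the d vertices agree, else 0."""
    return 1 if c > d / 2 else 0


def strict_sketch(d, c):
    """1 only when all d vertices agree, else 0."""
    return 1 if c == d else 0


partition = dict2part_sketch({"a": 0, "b": 1, "c": 0})
round_trip = part2dict_sketch(partition)
# Weights for a hyperedge of size 4 with 3 vertices in the same part.
weights_4_3 = [f(4, 3) for f in (linear_sketch, majority_sketch, strict_sketch)]
print(weights_4_3)  # [0.75, 1, 0]
```

The three weight functions differ only in how much credit a partially split hyperedge receives, which is why the three qH values reported later in the notebook differ for the same partition.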

### No longer required

- precompute_attributes(H)
- _compute_partition_probas(HG, A)
- _degree_tax(HG, Pr, wdc)
- _edge_contribution(HG, A, wdc)
- _delta_ec(HG, P, v, a, b, wdc)
- _bin_ppmf(d, c, p)
- _delta_dt(HG, P, v, a, b, wdc)

### New versions of existing functions

- modularity(HG, A, wdc=linear)
- last_step(HG, L, wdc=linear, delta=0.01, verbose=False)

### New (hidden) functions

- _last_step_unweighted
- _last_step_weighted


In [9]:
import pandas as pd
import numpy as np
import igraph as ig
import hypernetx as hnx
import hypernetx.algorithms.hypergraph_modularity as hmod ## some of these functions were rewritten
import pickle
import matplotlib.pyplot as plt
%matplotlib inline
from collections import Counter
import shutup
shutup.mute_warnings()
print('HNX version:',hnx.__version__)
Datadir = "data/"

HNX version: 2.0.4


# Experiment with h-ABCD hypergraphs

We generated 4 h-ABCD hypergraphs with the following parameters:

* -n 1000 -d 2.5,5,50 -c 1.5,50,200 -x 0.5 -q 0.0,0.4,0.3,0.2,0.1 -w :**linear** -s 1234 -o linear_1000
* same as above, with **strict** and **majority** in place of **linear**
* -n 1000 -d 2.5,5,50 -c 1.5,50,200 -x 0.5 -q 0.0,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1 -w :linear -s 1234 -o linear_large_edges_1000


In [12]:
## pick one of the 4 examples
file_vertex_labels = Datadir+'linear_1000_assign.txt'
file_hyperedges = Datadir+'linear_1000_he.txt'

In [13]:
## read data, build hypergraph
with open(file_hyperedges, 'r') as file:
    # Read all the lines of the file into a list
    lines = file.readlines()
hyperedges = [x.rstrip('\n').split(',') for x in lines]

## for testing purposes - add edges of size 1, multi-edges
#hyperedges.extend([['1'],['2'],['3']])
#hyperedges.extend([['1','1'],['2'],['3','3']])

with open(file_vertex_labels, 'r') as file:
    # Read all the lines of the file into a list
    vertex_labels = np.array([int(y) for y in file.read().splitlines()])

H = hnx.Hypergraph(hyperedges)
## optional - add random edge weights for testing purposes
#for e in H.edges:
#    H.edges[e].weight = np.random.choice(10)+1
H.shape    

(1000, 3385)

In [16]:
%%time
## Cluster the 2-section graph (with Louvain)
G = hmod.two_section(H)
G.vs['louvain'] = G.community_multilevel(weights='weight').membership
ML = hmod.dict2part({v['name']:v['louvain'] for v in G.vs})


CPU times: user 399 ms, sys: 6.47 ms, total: 406 ms
Wall time: 435 ms
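
To make the 2-section step concrete, here is a minimal sketch of how pairwise weights could be derived from hyperedges. It assumes the weighting in which a hyperedge of size d and weight w adds w/(d-1) to each pair of its vertices, so that vertex degrees are preserved. That weighting is an assumption about `two_section`, and the helper name and `Counter` representation are illustrative only; the library function returns an igraph graph.

```python
from collections import Counter
from itertools import combinations


def two_section_weights_sketch(hyperedges, weights=None):
    """Pairwise weights of a 2-section graph.

    Assumes each hyperedge of size d >= 2 and weight w adds w / (d - 1)
    to every pair of distinct vertices it contains.
    """
    pair_weight = Counter()
    for i, edge in enumerate(hyperedges):
        members = sorted(set(edge))  # ignore repeated vertices
        d = len(members)
        if d < 2:  # edges of size 1 create no pairs
            continue
        w = 1.0 if weights is None else weights[i]
        for u, v in combinations(members, 2):
            pair_weight[(u, v)] += w / (d - 1)
    return pair_weight


pw = two_section_weights_sketch([["1", "2", "3"], ["1", "2"], ["4"]])
print(pw[("1", "2")])  # 0.5 from the triple plus 1.0 from the pair = 1.5
```

In this toy example vertex `"1"` has total pair weight 1.5 + 0.5 = 2, which matches its degree in the hypergraph.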


In [17]:
%%time
## Compute qH's
print('qH-linear:',hmod.modularity(H, ML, wdc=hmod.linear))
print('qH-majority:',hmod.modularity(H, ML, wdc=hmod.majority))
print('qH-strict:',hmod.modularity(H, ML, wdc=hmod.strict))


qH-linear: 0.3709337930881103
qH-majority: 0.3833705858261428
qH-strict: 0.3384402033530553
CPU times: user 105 ms, sys: 6.93 ms, total: 112 ms
Wall time: 106 ms
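
As a reading aid, the quantity being computed can be sketched in plain Python for an unweighted hypergraph. The sketch follows the usual form of hypergraph modularity, an edge contribution minus a degree tax, and assumes each vertex appears at most once per hyperedge. It is not the module's implementation, and `modularity_sketch` and `strict_sketch` are hypothetical names.

```python
from collections import Counter
from math import comb


def strict_sketch(d, c):
    """Stand-in weight function: 1 only when all d vertices agree."""
    return 1 if c == d else 0


def modularity_sketch(hyperedges, parts, wdc=strict_sketch):
    """Hypergraph modularity of `parts` for an unweighted hypergraph."""
    label = {v: i for i, part in enumerate(parts) for v in part}
    m = len(hyperedges)

    # Edge contribution: reward edges with a strict majority in one part.
    ec = 0.0
    for e in hyperedges:
        d = len(e)
        c = max(Counter(label[v] for v in e).values())
        if c > d / 2:
            ec += wdc(d, c)

    # Share of the total volume (sum of degrees) held by each part.
    degree = Counter(v for e in hyperedges for v in e)
    vol = sum(degree.values())
    shares = [sum(degree[v] for v in part) / vol for part in parts]

    # Degree tax: expected contribution if each vertex slot of an edge
    # fell into a part independently, with probability equal to its share.
    dt = 0.0
    for d, n_d in Counter(len(e) for e in hyperedges).items():
        for c in range(d // 2 + 1, d + 1):
            binom = sum(comb(d, c) * p**c * (1 - p) ** (d - c) for p in shares)
            dt += n_d * wdc(d, c) * binom

    return (ec - dt) / m


toy_edges = [["a", "b", "c"], ["a", "b"], ["d", "e"], ["c", "d"]]
toy_parts = [{"a", "b", "c"}, {"d", "e"}]
q_toy = modularity_sketch(toy_edges, toy_parts)
print(round(q_toy, 6))  # 0.25
```

In the toy example three of the four hyperedges lie inside one part (edge contribution 3) and the degree tax is 2, giving (3 - 2) / 4 = 0.25.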


In [18]:
%%time
## Cluster the hypergraph (with Kumar's algorithm)
KU = hmod.kumar(H, verbose=True)


pass completed, max edge weight difference: 0.4896602658788774
pass completed, max edge weight difference: 0.2458714918759232
pass completed, max edge weight difference: 0.12054098966026587
pass completed, max edge weight difference: 0.06374076809453472
pass completed, max edge weight difference: 0.03346057976366322
pass completed, max edge weight difference: 0.017357367060561298
pass completed, max edge weight difference: 0.00992660635155096
CPU times: user 13.2 s, sys: 64.4 ms, total: 13.3 s
Wall time: 13.4 s


In [6]:
%%time
## Compute qH's
print('qH-linear:',hmod.modularity(H, KU, wdc=hmod.linear))
print('qH-majority:',hmod.modularity(H, KU, wdc=hmod.majority))
print('qH-strict:',hmod.modularity(H, KU, wdc=hmod.strict))


qH-linear: 0.3818650183516174
qH-majority: 0.39332336234821413
qH-strict: 0.3517127330702746
CPU times: user 93.5 ms, sys: 6.18 ms, total: 99.7 ms
Wall time: 94.9 ms


In [7]:
%%time
## try to improve the selected qH (linear here) with a simple heuristic
KU_ls = hmod.last_step(H, KU, wdc=hmod.linear, verbose=True)

initial qH: 0.3818650183516174
94 moves, new qH: 0.39606423386553097
29 moves, new qH: 0.39903083601492056
CPU times: user 42 s, sys: 511 ms, total: 42.5 s
Wall time: 42.8 s
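
The log above reports vertex moves per pass, which suggests a greedy local search. The sketch below shows a generic version of that idea: visit each vertex, try the labels of its neighbours, keep the move that most improves a quality function, and stop when a full pass gains less than `delta`. It is not necessarily what `last_step` does internally. The function names are hypothetical, and `toy_quality` is a deliberately simple stand-in for qH.

```python
def local_moves_sketch(nodes, hyperedges, labels, quality, delta=0.01):
    """Greedy one-vertex-at-a-time relabelling while `quality` improves."""
    labels = dict(labels)
    # Two vertices are neighbours when they share at least one hyperedge.
    neighbours = {v: set() for v in nodes}
    for e in hyperedges:
        for v in e:
            neighbours[v].update(u for u in e if u != v)

    q = quality(hyperedges, labels)
    while True:
        q_start = q
        for v in nodes:
            best_label, best_q = labels[v], q
            candidates = {labels[u] for u in neighbours[v]} - {labels[v]}
            for c in sorted(candidates):
                q_trial = quality(hyperedges, {**labels, v: c})
                if q_trial > best_q:
                    best_label, best_q = c, q_trial
            labels[v], q = best_label, best_q
        if q - q_start < delta:  # the pass gained too little, stop
            return labels, q


def toy_quality(hyperedges, labels):
    """Fraction of hyperedges whose vertices all share one label."""
    inside = sum(len({labels[v] for v in e}) == 1 for e in hyperedges)
    return inside / len(hyperedges)


edges = [[1, 2, 3], [3, 4], [4, 5, 6]]
start = {1: 0, 2: 0, 3: 0, 4: 1, 5: 2, 6: 2}
final, q_final = local_moves_sketch(sorted(start), edges, start, toy_quality)
print(final[4], round(q_final, 3))  # 0 0.667
```

Recomputing the quality from scratch for every trial move, as this sketch does, is the expensive part of such a search.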


In [8]:
%%time
## Compute qH with current HNX function
print('qH-linear:',hmod.modularity(H, KU_ls, wdc=hmod.linear))
print('qH-majority:',hmod.modularity(H, KU_ls, wdc=hmod.majority))
print('qH-strict:',hmod.modularity(H, KU_ls, wdc=hmod.strict))

qH-linear: 0.39903083601492056
qH-majority: 0.42453589938853936
qH-strict: 0.3384679383387232
CPU times: user 108 ms, sys: 4.6 ms, total: 113 ms
Wall time: 110 ms
