## Using the following packages:

* pip install python-igraph
* pip install partition-igraph
* pip install hypernetx


In [None]:
import numpy as np  ## used below for the random partition
import pandas as pd
import igraph as ig
import partition_igraph
import hypernetx as hnx
import pickle

## import the hypergraph functions -- HNX version
get_ipython().run_line_magic('run', './H_functions.py')


# Summary of functions for HNX hypergraphs

### Build hypergraph and pre-compute key quantities

We build the hypergraph HG using:

```python
HG = hnx.Hypergraph(dict(enumerate(Edges)))
```

where 'Edges' is a list of sets. Edges are indexed by 0-based integers,
so we represent nodes as strings to keep node ids and edge ids distinct.
For example, Edges[0] = {'0','2'}.
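
To see why string node labels help, the short sketch below builds the same edge dictionary without HNX. The `Edges` values are made up for illustration, and `edge_dict`, `edge_ids` and `node_ids` are names introduced only for this example.

```python
# Hypothetical toy edge list: each hyperedge is a set of node labels (strings).
Edges = [{'0', '2'}, {'1', '2', '3'}, {'0', '3'}]

# enumerate() assigns the integer ids 0, 1, 2, ... to the hyperedges.
edge_dict = dict(enumerate(Edges))

# Edge ids are ints and node ids are strs, so the two can never collide.
edge_ids = set(edge_dict)
node_ids = set().union(*edge_dict.values())
print(edge_dict[0] == {'0', '2'})   # True
print(edge_ids & node_ids)          # set()
```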

Once the HNX hypergraph is built, the following function is called to 
compute node strengths, d-degrees and binomial coefficients:

```python
HNX_precompute(HG)
```
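
As a rough illustration of these quantities, here is a minimal pure-Python sketch. It is not the code in `H_functions.py`. It assumes that a node's strength is the total weight of its hyperedges, and it reads "d-degrees" as the total weight of hyperedges of each size d. The name `precompute_sketch` is hypothetical.

```python
from collections import defaultdict
from math import comb

def precompute_sketch(edges, weights=None):
    """Return (strength, d_weights, binom) for a list of hyperedges (sets)."""
    if weights is None:
        weights = [1] * len(edges)       # unit weights when none are given
    strength = defaultdict(int)          # node -> total weight of its hyperedges
    d_weights = defaultdict(int)         # size d -> total weight of size-d hyperedges
    for e, w in zip(edges, weights):
        d_weights[len(e)] += w
        for v in e:
            strength[v] += w
    # binom[d][c] = "d choose c" for every edge size d that occurs
    binom = {d: [comb(d, c) for c in range(d + 1)] for d in d_weights}
    return dict(strength), dict(d_weights), binom

E = [{'A','B'},{'A','C'},{'A','B','C'},{'A','D','E','F'},{'D','F'},{'E','F'}]
strength, d_weights, binom = precompute_sketch(E)
print(strength['A'], d_weights)   # 4 {2: 4, 3: 1, 4: 1}
```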

### Partitions

We use two representations for partitions: a list of sets (the parts) or a dictionary.
These functions map from one to the other:

```python
dict2part(D)
part2dict(A)
```
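
The two conversions could be sketched as follows. These are illustrative stand-ins with hypothetical names (`dict2part_sketch`, `part2dict_sketch`), assuming the dictionary maps each node to the index of its part; the versions in `H_functions.py` may differ in detail.

```python
def dict2part_sketch(D):
    """Dictionary {node: part label} -> list of sets."""
    groups = {}
    for node, label in D.items():
        groups.setdefault(label, set()).add(node)
    return list(groups.values())

def part2dict_sketch(A):
    """List of sets -> dictionary {node: index of its part}."""
    return {node: i for i, part in enumerate(A) for node in part}

A = [{'A', 'B', 'C'}, {'D', 'E', 'F'}]
D = part2dict_sketch(A)
print(D['A'], D['F'])              # 0 1
print(dict2part_sketch(D) == A)    # True
```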

### H-modularity

The function to compute H-modularity for HG w.r.t. partition A (list of sets covering the vertices):

```python
HNX_modularity(HG, A, wcd=linear)
```

where 'wcd' is the weight function (default = 'linear'). Other choices are 'strict'
and 'majority', or any user-supplied function with the following format:

```python
def linear(d,c):
    return c/d if c>d/2 else 0
```

where d is the edge size and c, with d >= c > d/2, is the number of nodes in the majority class.
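
For comparison, the three named weights might look like this side by side. The 'strict' and 'majority' definitions below are assumed from the hypergraph modularity literature rather than copied from `H_functions.py`, and the `_sketch` suffix marks all three as illustrative.

```python
def linear_sketch(d, c):
    return c / d if c > d / 2 else 0    # reward grows with the majority share

def majority_sketch(d, c):
    return 1 if c > d / 2 else 0        # any strict majority counts fully

def strict_sketch(d, c):
    return 1 if c == d else 0           # only edges fully inside one part count

# An edge of size d=4 with c=3 nodes in the largest part:
print(linear_sketch(4, 3), majority_sketch(4, 3), strict_sketch(4, 3))   # 0.75 1 0
```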

### Two-section graph

Build the random-walk based 2-section graph given some hypergraph HG:

```python
G = HNX_2section(HG)
```

where G is an igraph Graph.
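
One common way to define such a graph, assumed here and not confirmed against `HNX_2section`, is to replace each hyperedge of size d and weight w by a clique whose edges each carry weight w/(d-1). With that choice, a node's weighted degree in the graph equals its strength in the hypergraph. The function name `two_section_sketch` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def two_section_sketch(edges, weights=None):
    """Return {(u, v): weight} for a weighted 2-section of a hypergraph."""
    if weights is None:
        weights = [1] * len(edges)
    pair_weight = defaultdict(float)
    for e, w in zip(edges, weights):
        d = len(e)
        if d < 2:
            continue                     # singletons create no graph edge
        for u, v in combinations(sorted(e), 2):
            pair_weight[(u, v)] += w / (d - 1)
    return dict(pair_weight)

E = [{'A','B'},{'A','C'},{'A','B','C'},{'A','D','E','F'},{'D','F'},{'E','F'}]
W2 = two_section_sketch(E)
print(W2[('A', 'B')])    # 1.5
```

Here the pair ('A','B') gets 1 from the hyperedge {'A','B'} and 1/2 from {'A','B','C'}.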

### Clustering: Kumar algorithm

Given hypergraph HG, compute a partition of the vertices as per Kumar's algorithm described in [1].

```python
K = HNX_Kumar(HG, delta=.01)
```

where delta is the convergence stopping criterion. The partition is returned as a dictionary.

[1] Kumar T., Vaidyanathan S., Ananthapadmanabhan H., Parthasarathy S., Ravindran B. (2020) *A New Measure of Modularity in Hypergraphs: Theoretical Insights and Implications for Effective Clustering*. In: Cherifi H., Gaito S., Mendes J., Moro E., Rocha L. (eds) Complex Networks and Their Applications VIII. COMPLEX NETWORKS 2019. Studies in Computational Intelligence, vol 881. Springer, Cham. https://doi.org/10.1007/978-3-030-36687-2_24


### Clustering: Simple qH-based algorithm

Given hypergraph HG and initial partition L,
compute a partition of the vertices as per the Last-Step algorithm described in [2].

```python
A = HNX_LastStep(HG, L, wdc=linear, delta = .01)
```

where 'wcd' is the weight function (default = 'linear') and delta is the convergence stopping criterion.
The returned partition is a list of sets.

[2] B. Kaminski, P. Pralat and F. Théberge, *Community Detection Algorithm Using Hypergraph Modularity*, to appear in the proceedings of Complex Networks 2020, Springer.


# Toy example

In [None]:
## build a hypergraph from a list of sets (the hyperedges)
## using 'enumerate', edges will have integer IDs
E = [{'A','B'},{'A','C'},{'A','B','C'},{'A','D','E','F'},{'D','F'},{'E','F'}]
HG = hnx.Hypergraph(dict(enumerate(E)))
hnx.draw(HG)


In [None]:
## compute node strength (add unit weight if none), d-degrees, binomial coefficients
HNX_precompute(HG)


In [None]:
## the edges (unit weights added by default)
HG.edges.elements


In [None]:
## the nodes (here strength = degree since all weights are 1)
HG.nodes.elements


In [None]:
HG.d_weights


In [None]:
## compute modularity qH for the following partitions:
A1 = [{'A','B','C'},{'D','E','F'}]
A2 = [{'B','C'},{'A','D','E','F'}]
A3 = [{'A','B','C','D','E','F'}]
A4 = [{'A'},{'B'},{'C'},{'D'},{'E'},{'F'}]

print('linear:',HNX_modularity(HG,A1),HNX_modularity(HG,A2),HNX_modularity(HG,A3),HNX_modularity(HG,A4))
print('strict:',HNX_modularity(HG,A1,strict),HNX_modularity(HG,A2,strict),HNX_modularity(HG,A3,strict),HNX_modularity(HG,A4,strict))
print('majority:',HNX_modularity(HG,A1,majority),HNX_modularity(HG,A2,majority),HNX_modularity(HG,A3,majority),HNX_modularity(HG,A4,majority))
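
To make the quantity printed above more concrete, here is a minimal pure-Python sketch of a hypergraph modularity of this kind: the weighted contribution of edges with a majority in one part, minus its expected value under a binomial model based on part volumes. This is our reading of the general form in [2], not the `HNX_modularity` implementation, and the names `modularity_sketch` and `linear_w` are hypothetical. It assumes the parts cover all vertices.

```python
from math import comb

def linear_w(d, c):
    return c / d if c > d / 2 else 0

def modularity_sketch(edges, parts, wcd=linear_w, weights=None):
    """Edge contribution minus degree tax, divided by total edge weight."""
    if weights is None:
        weights = [1] * len(edges)
    vol = [0.0] * len(parts)     # volume of each part (sum of node strengths)
    d_weights = {}               # edge size d -> total weight of size-d edges
    edge_contrib = 0.0
    for e, w in zip(edges, weights):
        d = len(e)
        d_weights[d] = d_weights.get(d, 0) + w
        counts = [len(e & p) for p in parts]
        for i, k in enumerate(counts):
            vol[i] += k * w
        c = max(counts)          # size of the largest class within this edge
        if c > d / 2:
            edge_contrib += w * wcd(d, c)
    frac = [x / sum(vol) for x in vol]
    # degree tax: expected contribution if the d endpoints were drawn
    # independently, each landing in a part with probability frac
    degree_tax = 0.0
    for d, wd in d_weights.items():
        for c in range(d // 2 + 1, d + 1):
            for p in frac:
                degree_tax += wd * wcd(d, c) * comb(d, c) * p**c * (1 - p)**(d - c)
    return (edge_contrib - degree_tax) / sum(weights)

E = [{'A','B'},{'A','C'},{'A','B','C'},{'A','D','E','F'},{'D','F'},{'E','F'}]
A1 = [{'A','B','C'},{'D','E','F'}]
A3 = [{'A','B','C','D','E','F'}]
A4 = [{'A'},{'B'},{'C'},{'D'},{'E'},{'F'}]
print(modularity_sketch(E, A1), modularity_sketch(E, A3), modularity_sketch(E, A4))
```

Under this definition the single-part partition scores exactly 0, and the all-singletons partition scores below 0 because no edge has a majority class.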


In [None]:
## 2-section graph
G = HNX_2section(HG)
G.vs['label'] = G.vs['name']
ig.plot(G,bbox=(0,0,250,250))


In [None]:
## 2-section clustering with ECG
G.vs['community'] = G.community_ecg().membership
dict2part({v['name']:v['community'] for v in G.vs})


In [None]:
## Clustering with Kumar's algorithm
dict2part(HNX_Kumar(HG))

In [None]:
## hypergraph clustering -- start from partition A4 defined above
print('start from:',A4)
A = HNX_LastStep(HG,A4)
print('final partition:',A)


# Game of Thrones scenes hypergraph

REF: https://github.com/jeffreylancaster/game-of-thrones

We built a hypergraph from the Game of Thrones scenes with the following elements:

* **Nodes** are characters in the series
* **Hyperedges** are groups of characters appearing in the same scene(s)
* **Hyperedge weights** are total scene(s) duration in seconds involving those characters

We kept hyperedges with at least 2 characters.
Moreover, we discarded characters with degree below 5.

We saved the following:

* *Edges*: list of sets where the nodes are 0-based integers represented as strings
* *Names*: dictionary; mapping of nodes to character names
* *Weights*: list; hyperedge weights (in same order as Edges)


In [None]:
## load the GoT dataset
with open("../Data/GoT.pkl", "rb") as f:
    Edges, Names, Weights = pickle.load(f)

## Build weighted hypergraph 

In [None]:
## Nodes are represented as strings from '0' to 'n-1'
HG = hnx.Hypergraph(dict(enumerate(Edges)))
## add edge weights
for e in HG.edges:
    HG.edges[e].weight = Weights[e]
## add full names
for v in HG.nodes:
    HG.nodes[v].name = Names[v]
## pre-compute required quantities for modularity and clustering
HNX_precompute(HG)

### Modularity (qH) on a random partition

For a random partition, qH should be close to 0 and can be negative.

In [None]:
## generate a random partition into K parts to compare results
K = 5
V = list(HG.nodes)
p = np.random.choice(K, size=len(V))
RandPart = dict2part({V[i]:p[i] for i in range(len(V))})
## compute qH
HNX_modularity(HG, RandPart)

### Get the 2-section graph (with igraph) and cluster with Louvain


In [None]:
## build 2-section
G = HNX_2section(HG)
## Louvain algorithm
ML = G.community_multilevel(weights='weight')
G.vs['louvain'] = ML.membership
part = dict2part({v['name']:v['louvain'] for v in G.vs})
## Compute qH
print(HNX_modularity(HG, part))

### Cluster with Kumar's algorithm


In [None]:
## run Kumar's algorithm, get partition
KU = HNX_Kumar(HG)
G.vs['kumar'] = [KU[v['name']] for v in G.vs]
## Compute qH
print(HNX_modularity(HG, dict2part(KU)))

### Cluster with simple H-based (Last Step) Algorithm

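The Last-Step algorithm requires an initial partition. We can use the one from the Louvain algorithm on the 2-section graph or the one from Kumar's algorithm; here we start from the Louvain partition.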

In [None]:
## Louvain partition already computed
part = dict2part({v['name']:v['louvain'] for v in G.vs})
## H-based last step
LS = HNX_LastStep(HG, part)
## Compute qH
HNX_modularity(HG, LS)


### Example: top nodes in cluster with Daenerys Targaryen


In [None]:
## Node index for Daenerys Targaryen (invert the Names mapping)
inv_map = {v: k for k, v in Names.items()}
DT = inv_map['Daenerys Targaryen']
## Index of the cluster containing that node
DT_part = part2dict(LS)[DT]
## Build dataframe: all nodes in that cluster, with their strength
L = []
for n in LS[DT_part]:
    L.append([Names[n],HG.nodes[n].strength])
D = pd.DataFrame(L, columns=['character','strength'])
## Top 5 characters by strength
D.sort_values(by='strength',ascending=False).head(5)