## All-or-nothing Modularity using β and γ parameters

There are two ways to call the Louvain code for all-or-nothing modularity.
One works directly with an intensity function Ω, and is recommended. In this notebook, we illustrate another approach, using parameters β and γ.

Tuning γ has the effect of modifying the sizes of the clusters.

Tuning β allows one to specify which hyperedge sizes are most relevant.

**Load HyperModularity package**

For now, this requires activating the package's local environment.
Once the package is registered, this will be replaced with a simple `using HyperModularity` statement.

In [50]:
using Revise
using Pkg; Pkg.activate("../.")
using HyperModularity

[32m[1mActivating[22m[39m environment at `~/GitHubRepos/Working-directory/HyperModularity/Project.toml`


**Load dataset**

In [7]:
dataset = "contact-high-school-classes"
maxsize = 5	# max hyperedge size
minsize = 2	# min hyperedge size
return_labels = true
H, L = read_hypergraph_data(dataset,maxsize,minsize,return_labels)

# In many cases it is convenient to have the hypergraph stored as an edgelist and weights vector
EdgeList, weights = HyperModularity.hyperedge_formatting(H);

**Simple Version**

Use the simplest version of all-or-nothing Louvain. This does not require you to set an intensity function Ω or the parameters β and γ. Instead, an intensity function is implicitly learned from an initial clustering. Select one of the following options for the input "startclusters" (the initial clustering):

* "singletons": learn Ω from the clustering in which every node is its own singleton cluster
* "cliquelouvain": learn Ω from the clustering obtained by performing clique expansion and running graph Louvain
* "starlouvain": learn Ω from the clustering obtained by performing star expansion and running graph Louvain

This can be used as a first step in finding a clustering. For best performance, alternate between learning parameters β and γ and finding an updated clustering vector Z.


In [8]:
start = "cliquelouvain"
gamma_res = 3.0 # can additionally toggle the resolution parameter for expansion + louvain initializer
Z = Simple_AON_Louvain(H,startclusters = start; gamma = gamma_res);

One step of all-or-nothing HyperLouvain

Louvain Iteration 1
Louvain Iteration 2
Main loop took 0.008133888244628906 seconds
One step of all-or-nothing HyperLouvain

Louvain Iteration 1
No nodes moved clusters
Main loop took 0.00012111663818359375 seconds


**Explicitly Setting β and γ**

You can also run all-or-nothing Louvain after setting only the parameter vectors γ and β, which weight the volume part and the cut part of the objective, respectively. See the paper for details, in particular the relationship with the intensity function Ω.

Optional parameters:

* **maxits** -- maximum iterations of main step of Louvain. Default = 100
* **verbose** -- if true, print out algorithm progress. Default = true
* **clusterpenalty** -- extra penalty on number of clusters; encourages fewer clusters. Default = 0.0
* **Z0** -- warm start clustering, on which Louvain improves modularity. Default is all singletons
* **randflag** -- if true, scan order of nodes is random. Default = false


In [110]:
β, γ = HyperModularity.Kaminski_default(H) # uses special case as defined by Kaminski et al. 2020
Z = AON_Louvain(H,β,γ; maxits = 20);
omega = HyperModularity.betagamma_to_omega(β, γ)
modu,likeli = modularity_aon(H,Z,omega;likelihood = true)
nk = length(unique(Z))
println("\nModularity = $modu, Loglikelihood = $likeli, $nk clusters")

One step of all-or-nothing HyperLouvain

Louvain Iteration 1
Louvain Iteration 2
Louvain Iteration 3
Main loop took 0.015101194381713867 seconds
One step of all-or-nothing HyperLouvain

Louvain Iteration 1
Louvain Iteration 2
Main loop took 0.00031495094299316406 seconds
One step of all-or-nothing HyperLouvain

Louvain Iteration 1
No nodes moved clusters
Main loop took 9.703636169433594e-5 seconds

Modularity = -313.0816501824064, Loglikelihood = -160.20520872218805, 7 clusters
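
Because γ mainly influences cluster sizes, it can help to look at the size of each cluster rather than only the cluster count. The following is a minimal sketch in plain Julia; `cluster_sizes` and `Z_toy` are illustrative names that are not part of HyperModularity, and the labels are toy values rather than output from the dataset above.

```julia
# Illustrative helper (not a HyperModularity function):
# count how many nodes fall in each cluster of a label vector.
function cluster_sizes(Z::AbstractVector{<:Integer})
    sizes = Dict{Int,Int}()
    for z in Z
        sizes[z] = get(sizes, z, 0) + 1
    end
    return sizes
end

Z_toy = [1, 1, 2, 3, 3, 3]      # toy labels, not taken from the dataset
sizes = cluster_sizes(Z_toy)
println(sort(collect(sizes)))   # cluster label => number of nodes
```

Calling `cluster_sizes(Z)` on the clustering vector returned by `AON_Louvain` should work the same way, assuming `Z` is a vector of integer labels as the use of `unique(Z)` above suggests.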


## Full coordinate ascent framework

The best way to use all-or-nothing Louvain is to alternate between learning β and γ and updating the clustering based on new parameters to greedily maximize the log likelihood.

**Store hypergraph information in alternate format**.
First we store the hypergraph as an edge list, edge weights, and degree vector. This will speed up later computations when repeatedly learning updated parameters β and γ.

In [87]:
d,n,kmax,e2n,n2e,weights,edge_lengths = HyperModularity.alternate_hypergraph_storage(H);

The full framework obtains an initial clustering from a graph projection + Louvain.
The coordinate ascent framework alternates between learning parameters β and γ from a clustering vector,
and then computing an updated clustering vector from these parameters using the Louvain-style algorithm.

This code is still experimental, and there are a couple of caveats to note:

* The modularity and likelihood values computed based on β and γ are scaled differently from modularity values computed when using an intensity function Ω. For the purposes of telling when another iteration improves modularity or likelihood, this should not matter.
* **Numerical issues**: NaN may be returned for modularity or likelihood when the learned β, γ, and omega values contain zero or infinite entries.
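
One way to cope with NaN values when comparing iterations is to ignore any non-finite score and keep the best finite one. The sketch below is plain Julia and makes no assumptions about the package internals; `best_finite` is a hypothetical helper, and the example scores mirror the kind of output where early iterations report NaN.

```julia
# Hypothetical helper (not part of HyperModularity): return the index and value
# of the largest finite score, skipping NaN and ±Inf entries.
# Returns (0, -Inf) if no entry is finite.
function best_finite(scores::AbstractVector{<:Real})
    best_idx = 0
    best_val = -Inf
    for (i, s) in enumerate(scores)
        if isfinite(s) && s > best_val
            best_val = s
            best_idx = i
        end
    end
    return best_idx, best_val
end

# Example: log-likelihoods per iteration, with NaN in the early iterations.
scores = [NaN, NaN, NaN, -202.7169851738345, -202.7169851738345]
idx, val = best_finite(scores)
println("best iteration = $idx, loglikelihood = $val")
```

Ties are resolved in favor of the earliest iteration because the comparison is strict. Storing the clustering vector alongside each score would let you recover the corresponding `Z`.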

In [108]:
randflag = false
verbose = false
maxits = 100
Z = CliqueExpansionModularity(H);
β, γ, omega = learn_omega_aon(e2n,weights,Z,kmax,d,n)

for i ∈ 1:5
    AllClusterings = AON_Louvain(n2e,e2n,weights,d,edge_lengths,β,γ,kmax,randflag,maxits,verbose)
    Z = AllClusterings[:,end]; 
    modu,likeli = modularity_aon(H,Z,omega;likelihood = true)
    β, γ, omega = learn_omega_aon(e2n,weights,Z,kmax,d,n)
    nk = length(unique(Z))
    println("modularity = $modu, loglikelihood = $likeli, $nk clusters")
end

modularity = NaN, loglikelihood = NaN, 6 clusters
modularity = NaN, loglikelihood = NaN, 7 clusters
modularity = NaN, loglikelihood = NaN, 9 clusters
modularity = -352.4153728037049, loglikelihood = -202.7169851738345, 9 clusters
modularity = -352.4153728037049, loglikelihood = -202.7169851738345, 9 clusters


In [111]:
# The clustering obtained with this approach closely matches the ground-truth labels L
mutualInformation(Z, L, true)

0.9295180058447912
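
To make the comparison with ground-truth labels concrete, here is a minimal plain-Julia sketch of a normalized mutual information score between two label vectors. It is an illustration only: `nmi_sketch` is a hypothetical name, and the normalization used here (dividing by the mean of the two entropies) is an assumption that may differ from the convention used by the package's `mutualInformation`.

```julia
# Illustrative sketch (not the package's implementation).
# Normalized mutual information between two labelings of the same nodes.
# Assumption: normalize by the arithmetic mean of the two entropies.
function nmi_sketch(a::AbstractVector, b::AbstractVector)
    length(a) == length(b) || throw(ArgumentError("label vectors must have equal length"))
    isempty(a) && throw(ArgumentError("label vectors must be non-empty"))
    n = length(a)
    joint = Dict{Tuple{eltype(a),eltype(b)},Int}()   # co-occurrence counts
    ca = Dict{eltype(a),Int}()                       # cluster sizes in a
    cb = Dict{eltype(b),Int}()                       # cluster sizes in b
    for (x, y) in zip(a, b)
        joint[(x, y)] = get(joint, (x, y), 0) + 1
        ca[x] = get(ca, x, 0) + 1
        cb[y] = get(cb, y, 0) + 1
    end
    entropy(counts) = -sum(c / n * log(c / n) for c in values(counts))
    mi = sum(c / n * log(c * n / (ca[x] * cb[y])) for ((x, y), c) in joint)
    h = (entropy(ca) + entropy(cb)) / 2
    return h == 0 ? 1.0 : mi / h    # both labelings trivial: treat as identical
end

same  = nmi_sketch([1, 1, 2, 2], [5, 5, 7, 7])   # same partition, relabeled
indep = nmi_sketch([1, 1, 2, 2], [1, 2, 1, 2])   # independent partitions
println("relabeled: $same, independent: $indep")
```

A score near 1 means the two labelings describe almost the same partition, regardless of how the clusters are numbered; a score near 0 means they carry almost no information about each other.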