In this notebook we generate a *dyadic* graph (every edge joins exactly two nodes) and then attempt to cluster it.

In [49]:
## Generate a graph
using StatsBase
using Combinatorics

include("jl/omega.jl")
include("jl/HSBM.jl")
include("jl/hypergraph_louvain.jl")
include("jl/inference.jl")

# parameters

n_ = 50
n = 2*n_
Z = vcat(repeat([1],n_), repeat([2], n_))
ϑ = dropdims(ones(1,n) + rand(1,n), dims = 1)

# defining group intensity function Ω
μ = mean(ϑ)


# because of the way the code is structured, we need to allow kmin = 1, 
# but we set Ω = 0 for all size-1 edges below. 

kmax = 2
kmin = 1

fk = k->(2 .*μ*k)^(-k)
fp = x->harmonicMean(x)^10

Ω_dict = Dict{Vector{Int64}, Float64}()

for k = kmin:kmax, p in partitions(k)
    Ω_dict[p] = fk(sum(p))*fp(p)/10
end

for p in keys(Ω_dict)
    if sum(p) == 1
        Ω_dict[p] = 0
    end
end

Ω = buildΩ(Ω_dict; by_size=true);
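
To make the shape of the intensity function concrete, here is a minimal standalone sketch of the values that end up in `Ω_dict` for size-2 edges. It rests on several assumptions not confirmed by this notebook: that `harmonicMean` (defined in the included files) is the ordinary harmonic mean, that `μ` is close to 1.5 (the expected mean of `1 + rand()`), and that the partition `[2]` means both endpoints share a group while `[1, 1]` means they do not. The names `harmonic_mean`, `μ_sketch`, `fk_sketch`, `fp_sketch` and `Ω_sketch` are hypothetical.

```julia
# Assumption: harmonicMean in the included files is the ordinary harmonic mean.
harmonic_mean(x) = length(x) / sum(1 ./ x)

μ_sketch = 1.5   # assumed value: the expected mean of 1 + rand()

fk_sketch = k -> (2 * μ_sketch * k)^(-k)   # size-dependent factor
fp_sketch = p -> harmonic_mean(p)^10       # partition-dependent factor

# The integer partitions of 2, written out by hand.
# Assumed reading: [2] = endpoints in one group, [1, 1] = endpoints in different groups.
Ω_sketch = Dict(p => fk_sketch(sum(p)) * fp_sketch(p) / 10 for p in ([2], [1, 1]))

for (p, ω) in Ω_sketch
    println("Ω($p) = $ω")
end
println("within/between ratio = ", Ω_sketch[[2]] / Ω_sketch[[1, 1]])
```

Under these assumptions the two size-2 entries differ only through the partition factor, so the exponent 10 in `fp` is what controls how strongly within-group edges are favoured.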

In [50]:
H = sampleSBM(Z, ϑ, Ω; kmax=kmax, kmin = kmin)
# number of edges
l = length(H.E[2])
# proportion of edges in same cluster
c = mean([Z[e[1]] == Z[e[2]] for e in keys(H.E[2])]) 

println("The graph has $l edges and $(round(100*c, digits=1)) % of them are within-cluster.")

The graph has 854 edges and 95.9 % of them are within-cluster.
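
As a sanity check on the within-cluster computation, the same expression can be run on a tiny hand-built example. This is only a sketch: it assumes the size-2 edges are stored as a dictionary keyed by node-index vectors, as the call `keys(H.E[2])` suggests, and the names `Z_toy`, `E_toy` and `c_toy` are hypothetical.

```julia
using Statistics  # for mean

# Six nodes in two clusters of three.
Z_toy = [1, 1, 1, 2, 2, 2]

# Four edges keyed by their endpoints; only [3, 4] crosses the two clusters.
E_toy = Dict([1, 2] => 1, [2, 3] => 1, [4, 5] => 1, [3, 4] => 1)

# Fraction of edges whose endpoints carry the same label.
c_toy = mean(Z_toy[e[1]] == Z_toy[e[2]] for e in keys(E_toy))

println("$(round(100 * c_toy, digits = 1)) % of toy edges are within-cluster.")
```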


In [51]:
# Encouraging that this does indeed tend to decrease. I don't think it is required to
# decrease monotonically (need to check), so heuristically this looks OK.

Ω̂ = buildΩ(estimateΩ(H, Z); by_size=true)

Z_ = copy(Z)

for i = 1:5
    Z_ = HyperLouvain(H,kmax,Ω̂)
    Ω̂  = buildΩ(estimateΩ(H, Z_); by_size=true)
    println("The log-likelihood of the Louvain partition is $(round(logLikelihood(H, Z_, Ω̂),digits=3)).")
end


Louvain Iteration 1
No nodes moved clusters
The log-likelihood of the Louvain partition is -2744.315.

Louvain Iteration 1
No nodes moved clusters
The log-likelihood of the Louvain partition is -2744.315.

Louvain Iteration 1
No nodes moved clusters
The log-likelihood of the Louvain partition is -2744.315.

Louvain Iteration 1
No nodes moved clusters
The log-likelihood of the Louvain partition is -2744.315.

Louvain Iteration 1
No nodes moved clusters
The log-likelihood of the Louvain partition is -2744.315.
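
Every pass above reports the same log-likelihood with no nodes moved, which suggests the alternation has reached a fixed point. One possible way to avoid the redundant passes is to stop as soon as the partition stops changing. The sketch below is generic and uses toy stand-ins: `alternate`, `toy_cluster` and `toy_estimate` are hypothetical names, with the two callables playing the roles of `HyperLouvain` and `estimateΩ`. It compares label vectors directly, so it assumes the clustering step is deterministic and does not permute labels between passes.

```julia
# Generic alternating scheme: `cluster` maps parameters to a partition and
# `estimate` maps a partition to parameters. Both are hypothetical stand-ins.
function alternate(cluster, estimate, Z0; maxiter = 5)
    Z = copy(Z0)
    θ = estimate(Z)
    for i in 1:maxiter
        Z_new = cluster(θ)
        if Z_new == Z
            return Z, θ, i   # fixed point: further passes would repeat this result
        end
        Z = Z_new
        θ = estimate(Z)
    end
    return Z, θ, maxiter
end

# Toy stand-ins: the "parameter" is the number of clusters, and clustering
# always returns the same two-block partition.
toy_estimate(Z) = length(unique(Z))
toy_cluster(θ) = [1, 1, 2, 2]

Z_fix, θ_fix, iters = alternate(toy_cluster, toy_estimate, [1, 1, 2, 2])
println("Stopped after $iters iteration(s) with partition $Z_fix")
```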


In [52]:
Zsing = collect(1:n)

# likelihoods with true parameters

println("The log-likelihood of the true partition is $(round(logLikelihood(H, Z, Ω, ϑ),digits=3)).")
println("The log-likelihood of the singleton partition is $(round(logLikelihood(H, Zsing, Ω, ϑ),digits=3)).")

The log-likelihood of the true partition is -2233.149.
The log-likelihood of the singleton partition is -4697.921.
