# Generating random hypergraphs

Here we show how to use HyperGraphs.jl to generate random hypergraphs.

Note that `- 1` appears several times in the examples below. Julia arrays are indexed from `1`, but `0` is a valid degree and cardinality value, so the entry at index `i` of a count vector refers to the value `i - 1`.

There are two obvious ways to generate random hypergraphs; this can be done either
- by randomly populating `m` hyperedges with `k` vertices, with each `k` being randomly drawn from a distribution that describes the probability of observing hyperedges of some given cardinality; or
- by iterating over all possible combinations of vertices (i.e. elements of the power set of `V`), where each combination appears as the vertex set of a hyperedge with some probability (either the same probability for every combination, or a different probability for each one).

The latter is far more computationally demanding, since a set of `n` vertices has `2^n` subsets, so we focus on the former here. This is similar to the approach taken in
> Schawe, Hendrik, and Laura Hernández. 2022. ‘Higher Order Interactions Destroy Phase Transitions in Deffuant Opinion Dynamics Model’. _Communications Physics_ **5**(1): 1–9. https://doi.org/10.1038/s42005-022-00807-4.

There the authors note that iterating over all possible hyperedges is not feasible, choosing instead to randomly draw the number of `k`-hyperedges to create.

In [1]:
using HyperGraphs

## 1. Generating random `k`-uniform hypergraphs

We start with the simple case of a `k`-uniform hypergraph, where each hyperedge has the same, fixed, cardinality `k`. Note that in the case where `k = 2` we recover a random graph. The parameters are:
- `n` the number of vertices,
- `m` the number of hyperedges,
- `k` the cardinality of each hyperedge.

In [2]:
n, m, k = 20, 7, 3

(20, 7, 3)

Given `n` and assuming vertices are integers, we can get the vertex set `V`.

In [3]:
V = 1:n

1:20

It is then straightforward to create random hyperedges by randomly drawing `k` vertices in `V`, `m` times:

In [4]:
es = [HyperEdge(rand(V, k)) for _ in 1:m]

7-element Vector{HyperEdge{Int64}}:
 HyperEdge{Int64}([17, 12, 1])
 HyperEdge{Int64}([19, 4, 11])
 HyperEdge{Int64}([16, 17, 8])
 HyperEdge{Int64}([20, 20, 8])
 HyperEdge{Int64}([1, 15, 5])
 HyperEdge{Int64}([14, 6, 15])
 HyperEdge{Int64}([9, 13, 10])

These hyperedges may then be used to create a hypergraph, and we can check that the resulting hypergraph is `k`-uniform.

In [5]:
x = HyperGraph(es)
iskuniform(x, k)

true

Note that hyperedges are generated with very few assumptions; in particular, multisets are allowed, since `rand(V, k)` samples with replacement and the same vertex in `V` may be drawn several times (as in `HyperEdge{Int64}([20, 20, 8])` above).
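If multisets are not wanted, one option is to sample vertices without replacement. The sketch below is an assumption rather than part of HyperGraphs.jl: it uses `randperm` from Julia's `Random` standard library and a hypothetical helper `distinct_vertices`, and it returns plain vectors that could then be passed to `HyperEdge` as in the cell above.

```julia
using Random

n, m, k = 20, 7, 3
V = 1:n

# Hypothetical helper: draw k distinct vertices from V by taking the
# first k entries of a random permutation of its indices.
distinct_vertices(V, k) = V[randperm(length(V))[1:k]]

vs = [distinct_vertices(V, k) for _ in 1:m]

# Every vertex set has exactly k distinct members.
all(length(unique(v)) == k for v in vs)
```

Permuting all of `V` is wasteful when `k` is much smaller than `n`, but it keeps the sketch free of extra dependencies.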

## 2. Generating non-uniform random hypergraphs

Here we focus on the more general case of non-uniform random hypergraphs, which arises as a generalisation of the approach above.

A useful reference is
> Dewar, Megan, John Healy, Xavier Pérez-Giménez, Paweł Prałat, John Proos, Benjamin Reiniger, and Kirill Ternovsky. 2018. ‘Subhypergraphs in Non-Uniform Random Hypergraphs’. _ArXiv:1703.07686 [Math]_, March. http://arxiv.org/abs/1703.07686.

There, each hyperedge of each cardinality appears in the random hypergraph with some probability. As discussed above, we depart from this approach because it is too demanding to iterate over all possible combinations of vertices. Instead we use a vector describing cardinality counts, i.e. how many hyperedges of cardinality `k` we will randomly draw (in a similar way as above).

The parameters are:
- `n` the number of vertices,
- `ks` a vector describing cardinality counts.

In [6]:
n, ks = 20, [0, 0, 5, 2]

(20, [0, 0, 5, 2])

Note that `ks` now encodes both cardinality information and number of hyperedges `m`:
- the entry at index `i` of `ks` is the number of hyperedges of cardinality `i - 1`; the offset of `1` allows for a cardinality of `0`;
- the sum of entries of `ks` gives `m` as defined in the example above.

Explicitly, the vector `ks` defined as above means we are asking for `5` hyperedges of cardinality `2` and `2` hyperedges of cardinality `3`, thus a total of `m = sum(ks) = 7` hyperedges. This is simply a generalisation of the example above, where the equivalent `ks` would have been `ks = [0, 0, 0, 7]`.
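The encoding can be unpacked with a few lines of Base Julia. This is only an illustrative sketch: the named tuples and the helper `uniform_ks` are hypothetical and not part of HyperGraphs.jl.

```julia
ks = [0, 0, 5, 2]

# Entry i of ks counts the hyperedges of cardinality i - 1.
counts = [(cardinality = i - 1, count = c) for (i, c) in enumerate(ks)]

m = sum(ks)  # total number of hyperedges

# Hypothetical helper: the ks vector of a k-uniform hypergraph with m hyperedges.
uniform_ks(k, m) = [c == k ? m : 0 for c in 0:k]

uniform_ks(3, 7) == [0, 0, 0, 7]
```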

Again, we can get `V` by doing

In [7]:
V = 1:n

1:20

We can then generate random hyperedges by doing

In [8]:
es = Vector{HyperEdge{Int64}}()
for (k, m) in enumerate(ks)  # entry k of ks holds the count m for cardinality k - 1
    append!(es, [HyperEdge(rand(V, k-1)) for _ in 1:m])
end
es

7-element Vector{HyperEdge{Int64}}:
 HyperEdge{Int64}([17, 16])
 HyperEdge{Int64}([18, 19])
 HyperEdge{Int64}([16, 7])
 HyperEdge{Int64}([1, 15])
 HyperEdge{Int64}([17, 2])
 HyperEdge{Int64}([3, 1, 5])
 HyperEdge{Int64}([9, 7, 20])

The `for` loop iterates over the indices `k` of `ks` and, for each one, generates `m` hyperedges of cardinality `k - 1`, where `m` is the corresponding entry of `ks`.

These hyperedges may be used to generate a hypergraph, and we can check that the resulting hypergraph has as many hyperedges as the sum of `ks`, and that the cardinality counts correspond to the ones asked for.

In [9]:
x = HyperGraph(es)
(nhe(x) == length(es) == sum(ks), cardinality_counts(x) == ks)

(true, true)

Alternatively, `ks` may be drawn from a discrete distribution whose support is the set of possible cardinality values.

We illustrate this with a simple uniform distribution; we only show how to generate `ks` this way, since a random hypergraph arising from that `ks` may be generated in the same way as above.

In [10]:
using Distributions
d = DiscreteUniform(0, 3)
ks = [rand(d) for k in 0:3]

4-element Vector{Int64}:
 2
 2
 3
 3

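Here, cardinalities range from `0` to `3`, and so does the support of our distribution `d`. Note that each draw from `d` is used as a count: the entry at index `i` of `ks` is the number of hyperedges of cardinality `i - 1`, so the total number of hyperedges `sum(ks)` is itself random (`10` in the output above).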
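Another way to use a distribution over cardinalities is to fix the number of hyperedges `m`, draw one cardinality per hyperedge, and tally the draws into `ks`. A minimal sketch under that assumption, using Base `rand` over the range `0:3` in place of `DiscreteUniform(0, 3)`; the helper name `tally_cardinalities` is hypothetical.

```julia
m = 7        # number of hyperedges, fixed in advance
kmax = 3     # largest cardinality allowed

# One cardinality per hyperedge, uniform over 0:kmax.
cards = rand(0:kmax, m)

# Hypothetical helper: entry i of the result counts the draws equal to i - 1.
tally_cardinalities(cards, kmax) = [count(==(c), cards) for c in 0:kmax]

ks = tally_cardinalities(cards, kmax)

sum(ks) == m
```

With this variant `sum(ks)` always equals `m`, whereas the individual counts vary from run to run.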

This generalises random `k`-uniform hypergraphs: multiplying `ks` elementwise by the probability mass function of a Dirac distribution located at some `k` zeroes every entry except the one for cardinality `k`.

In [11]:
k = 3
ks .* Int.(pdf.(Dirac(k), 0:3))

4-element Vector{Int64}:
 0
 0
 0
 3
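The same mask can be written without `Distributions`, since the probability mass function of a Dirac distribution at `k` is `1` at `k` and `0` elsewhere. A small sketch in Base Julia, reusing the `ks` values shown above:

```julia
ks = [2, 2, 3, 3]
k = 3

# Indicator of cardinality k over the cardinalities 0:3.
mask = Int.((0:3) .== k)

ks .* mask   # keeps only the count for cardinality k
```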

## 3. Closing remarks: some words on a more general approach to generating non-uniform random hypergraphs

We have discussed above how iterating over every possible combination of vertices is too computationally demanding. Here we illustrate how it would be done, using a simple case that does not require too much compute time. This approach does not scale efficiently, but we hope this simple example provides some intuition for a more general way of generating random hypergraphs.

In [12]:
n, m, k = 20, 7, 3
V = 1:n

1:20

We can use the `powerset` function from `Combinatorics` to get all possible hyperedge vertex sets (not counting multisets).

In [13]:
using Combinatorics
collect(powerset(V))

1048576-element Vector{Vector{Int64}}:
 []
 [1]
 [2]
 [3]
 [4]
 [5]
 [6]
 [7]
 [8]
 [9]
 [10]
 [11]
 [12]
 ⋮
 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20]
 [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
 [1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
 [1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
 [1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
 [1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
 [1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
 [1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
 [1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
 [1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
 [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
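To see why this does not scale, it helps to count candidates before enumerating them. The sketch below uses only Base Julia; `big` guards against integer overflow for larger `n`.

```julia
n, k = 20, 3

# Number of subsets of an n-element vertex set (size of the power set).
n_subsets = big(2)^n

# Number of subsets of size exactly k.
n_ksubsets = binomial(n, k)

(n_subsets, n_ksubsets)   # (1048576, 1140)
```

Each additional vertex doubles the size of the power set, so `n = 30` already gives more than a billion candidate vertex sets.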

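Rather than filtering the power set, we can use `combinations` from `Combinatorics` to enumerate only the subsets of a given size. The cell below draws from them to generate a `3`-uniform hypergraph with `7` hyperedges. Since `rand` samples with replacement, the same vertex set may be drawn more than once.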

In [14]:
vs = rand(collect(combinations(V, 3)), 7)
es = HyperEdge.(vs)
x = HyperGraph(es)

HyperGraph{Int64}([3, 9, 20, 2, 17, 8, 10, 14, 1, 4, 6, 5, 11, 15, 7, 19], HyperEdge{Int64}[HyperEdge{Int64}([3, 9, 20]), HyperEdge{Int64}([2, 3, 17]), HyperEdge{Int64}([8, 10, 14]), HyperEdge{Int64}([1, 4, 6]), HyperEdge{Int64}([4, 5, 10]), HyperEdge{Int64}([3, 11, 15]), HyperEdge{Int64}([5, 7, 19])])
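For completeness, the per-combination approach described in the introduction can be sketched for a small vertex set. This is an illustration under stated assumptions, not part of HyperGraphs.jl: the inclusion probability `p` is the same for every subset, the helper `random_vertex_sets` is hypothetical, subsets are encoded as bitmasks so that no extra package is needed, and the result is a list of plain vectors that could be passed to `HyperEdge.` as in the cell above.

```julia
# Hypothetical helper: visit every subset of 1:n once and keep it with
# probability p. Bit v - 1 of `mask` records whether vertex v is in the subset.
function random_vertex_sets(n, p)
    kept = Vector{Vector{Int}}()
    for mask in 0:(2^n - 1)
        if rand() < p
            push!(kept, [v for v in 1:n if (mask >> (v - 1)) & 1 == 1])
        end
    end
    return kept
end

vs = random_vertex_sets(10, 0.01)   # 2^10 = 1024 candidate subsets
```

The loop visits all `2^n` subsets, including the empty one, which is why this is only practical for small `n`.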