# A Latent Space Model for Hypergraphs

* Let $G=(V,E)$ be a hypergraph, where $E$ is the collection of hyperedges.
* Let $H$ be the collection of admissible sets of nodes (e.g., all sets of fewer than 10 nodes, if we only consider hyperedges of size less than 10).
* $G$ is modeled as a collection of random variables $\{X_e: e\in H\}$, where $X_e$ is the number of times hyperedge $e$ appears in $G$.
* $X_e\sim Poisson(\lambda_e)$, which means we allow multi-hyperedges.
* The rate $\lambda_e$ depends on the features of the nodes in $e$: $$\lambda_e=|e|^\alpha\sum_k\prod_{i\in e}\theta_{ik},$$ where $\theta_i=(\theta_{ik})_k$ is the latent feature vector for node $i$ and the exponent $\alpha$ controls how the rate scales with hyperedge size.
* Assume $X_e$'s are independent of each other given $\theta=(\theta_i)$.
* The distribution of $G$ is given by $$p(G|\theta)=\prod_{e\in H} p(x_e|\theta).$$
* Take a Bayesian approach to estimate $\theta$ and sample $\theta$ from the posterior using MCMC.
    * Propose $\theta'$.
    * Draw $G'$ given $\theta'$.
        * Start from $G$.
        * Every step pick $e\in H$ at random.
        * Draw $X_e$ from $Poisson(\lambda_e(\theta'))$.
    * Move to $\theta'$ with probability $\rho$; otherwise stay at $\theta$.
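
The list does not say how $\rho$ is computed. As a sketch of one consistent choice, matching the ratio that the code in cell `In [4]` accumulates, assume a prior density $\pi(\theta)$ and a symmetric proposal for $\theta'$ (both are assumptions made here, as is the notation $x_e$, $x'_e$ for the counts of $e$ in $G$ and $G'$). The Poisson factors $e^{-\lambda_e}$ and $x_e!$ then cancel between numerator and denominator:

$$\rho=\min\left\{1,\ \frac{\pi(\theta')\,p(G|\theta')\,p(G'|\theta)}{\pi(\theta)\,p(G|\theta)\,p(G'|\theta')}\right\}=\min\left\{1,\ \frac{\pi(\theta')}{\pi(\theta)}\prod_{e\in H}\left(\frac{\lambda_e(\theta)}{\lambda_e(\theta')}\right)^{x'_e-x_e}\right\}.$$

Only hyperedges whose count differs between $G$ and $G'$ contribute to the product, so it suffices to track the changed hyperedges.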

In [1]:
import numpy as np
import random
from collections import Counter
from scipy.stats import truncnorm

#### Initialization

In [2]:
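E = Counter([(1,2),(2,3),(3,4,5)]) # observed hyperedges (hyperedge -> multiplicity)
M = len(E) # number of distinct hyperedges
V = list(range(6)) # list of nodes
N = len(V) # number of nodes
K = 3 # dimension of the latent space
averageSize = 2 # Poisson mean of the size of a randomly drawn node set
#theta = np.random.dirichlet([1.0/K]*K,size=N).T
sigma = 0.5 # scale of the truncated normal prior and proposal
# Prior: each theta[k,i] is normal(1/K, sigma) truncated to [0,1];
# truncnorm takes its bounds in standardized units
lo, hi = (0-1.0/K)/sigma, (1-1.0/K)/sigma
theta = truncnorm.rvs(lo, hi, 1.0/K, sigma, size=(K,N))
piTheta = truncnorm.pdf(theta, lo, hi, 1.0/K, sigma).prod() # prior density of theta
alpha = 1 # exponent on the hyperedge size in the rate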

In [3]:
def Lambda(e,theta,alpha):
    # Poisson rate of hyperedge e: |e|^alpha * sum_k prod_{i in e} theta[k,i]
    return theta[:,list(e)].prod(axis=1).sum()*(len(e)**alpha)

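def proposeG(givenE,averageSize):
    # Propose a single update (e, X): with probability 1/2 resample an existing
    # hyperedge, otherwise resample a random node set that is not in givenE.
    # Reads theta, alpha, V and N from the enclosing (global) scope.
    if random.random()<0.5: # choose an edge
        e=random.choice(list(givenE.keys()))
    else: # choose a nonedge
        while True:
            n=np.random.poisson(averageSize)
            e=tuple(sorted(random.sample(V,min(max(n,1),N))))
            if e not in givenE:
                break
    X=np.random.poisson(Lambda(e,theta,alpha))
    return (e,X)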
    
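def sampleG(theta,alpha,E):
    # Starting from E, redraw the counts of 100 randomly chosen node sets under
    # theta. Returns only the hyperedges whose count was changed (hyperedge -> new count).
    change=Counter()
    for _ in range(100):
        # Pick a random node set whose size is Poisson(averageSize), clipped to [1, N]
        n=np.random.poisson(averageSize)
        e=tuple(sorted(random.sample(V,min(max(n,1),N))))
        lambdae=Lambda(e,theta,alpha)
        if lambdae==0:
            X=0
        else:
            X=np.random.poisson(lambdae)
        # Record X if it differs from the observed count, or if e was already redrawn
        if e in change or E[e]!=X:
            change[e]=X
    
    return change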
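
As a quick sanity check of the rate formula that `Lambda` implements, the sketch below evaluates $\lambda_e$ by hand-checkable arithmetic on a toy latent matrix and draws one Poisson count from it. The helper name `hyperedge_rate`, the toy values in `theta_toy`, and the seed are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def hyperedge_rate(e, theta, alpha):
    """lambda_e = |e|^alpha * sum_k prod_{i in e} theta[k, i]."""
    return (len(e) ** alpha) * theta[:, list(e)].prod(axis=1).sum()

# Hypothetical toy values: K = 2 latent dimensions (rows), N = 3 nodes (columns).
theta_toy = np.array([[0.5, 0.2, 0.4],
                      [0.1, 0.3, 0.6]])

# Pair (0, 1): 2 * (0.5*0.2 + 0.1*0.3)
rate_pair = hyperedge_rate((0, 1), theta_toy, alpha=1)
# Triple (0, 1, 2): 3 * (0.5*0.2*0.4 + 0.1*0.3*0.6)
rate_triple = hyperedge_rate((0, 1, 2), theta_toy, alpha=1)

# One draw of the multiplicity X_e for the pair; 0 means the hyperedge is absent.
rng = np.random.default_rng(0)
x_pair = rng.poisson(rate_pair)
```

Because every $\theta_{ik}$ lies in $[0,1]$, each product shrinks as nodes are added, so with $\alpha=1$ larger hyperedges tend to get smaller rates unless the nodes share a strong latent dimension.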

#### Simulate $\theta$

In [4]:
for it in range(100):
    # Propose new theta: each entry is drawn from a normal centred at its
    # current value with scale sigma, truncated to [0,1]
    #thetaP=np.array([ np.random.dirichlet(theta[:,i]) for i in xrange(theta.shape[1])]).T
    thetaP=np.array([ [truncnorm.rvs((0-t)/sigma,(1-t)/sigma,t,sigma) for t in row] for row in theta])
    # Prior density of the proposed theta
    piThetaP=truncnorm.pdf(thetaP,(0-1.0/K)/sigma,(1-1.0/K)/sigma,1.0/K,sigma).prod()
    # Sample G' from new theta (EP holds only the hyperedges whose count changed)
    EP=sampleG(thetaP,alpha,E)
    # Calculate transition probability
    rho=1.0
    for e in EP:
        lambdae=Lambda(e,theta,alpha)
        lambdaep=Lambda(e,thetaP,alpha)
        if lambdae==0 or lambdaep==0:
            rho=0
            break
        rho*=(lambdae/lambdaep)**(EP[e]-E[e])
    rho*=piThetaP/piTheta
    # Move to the proposal with probability min(1, rho)
    if random.random()<rho:
        theta=thetaP
        piTheta=piThetaP

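
The transition probability above multiplies many small factors (one prior density per entry of $\theta$ and one rate ratio per changed hyperedge), which can underflow or overflow for larger $N$ and $K$. A minimal sketch of the same ratio accumulated in log space follows. The names `rate` and `log_accept_ratio`, the toy inputs, and the seed are assumptions introduced for illustration, and this is a numerical variant of the computation, not the notebook's own code.

```python
import numpy as np
from collections import Counter

def rate(e, theta, alpha):
    # lambda_e = |e|^alpha * sum_k prod_{i in e} theta[k, i]
    return (len(e) ** alpha) * theta[:, list(e)].prod(axis=1).sum()

def log_accept_ratio(E, EP, theta, thetaP, alpha, log_prior, log_priorP):
    """Log of (pi(theta')/pi(theta)) * prod_e (lambda_e/lambda'_e)^(x'_e - x_e)."""
    log_rho = log_priorP - log_prior
    for e, x_new in EP.items():
        lam, lamP = rate(e, theta, alpha), rate(e, thetaP, alpha)
        if lam == 0 or lamP == 0:
            return -np.inf  # mirrors the rho = 0 rejection in the loop above
        log_rho += (x_new - E[e]) * (np.log(lam) - np.log(lamP))
    return log_rho

# Hypothetical toy inputs: K = 2, N = 3, equal log-prior values for both states.
theta_cur = np.full((2, 3), 0.5)   # rate of (0, 1) is 2 * (0.25 + 0.25) = 1.0
theta_new = np.full((2, 3), 0.25)  # rate of (0, 1) is 2 * (0.0625 + 0.0625) = 0.25
E_obs = Counter({(0, 1): 1})       # observed count of hyperedge (0, 1)
E_aux = Counter({(0, 1): 3})       # redrawn count of the same hyperedge

log_rho = log_accept_ratio(E_obs, E_aux, theta_cur, theta_new, 1, 0.0, 0.0)

# Accept when log(u) < log_rho, which is equivalent to u < rho.
rng = np.random.default_rng(0)
accept = np.log(rng.random()) < log_rho
```

In the notebook's setting the two log-prior arguments could be obtained with `truncnorm.logpdf(...).sum()` in place of the product of `truncnorm.pdf` values.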