# Hybrid-Membership Latent Distance Model (HM-LDM)

Python 3.8.3 and PyTorch 1.9.0 implementation of the Hybrid-Membership Latent Distance Model (HM-LDM).

## Description

A central aim of modeling complex networks is to accurately embed networks in order to detect structures and predict link and node properties. Latent space models (LSMs) have become prominent frameworks for embedding networks; the latent distance model (LDM) and eigenmodel (LEM) are the most widely used LSM specifications. For latent community detection, the embedding space of LDMs has been endowed with a clustering model, whereas LEMs have been constrained to part-based, non-negative matrix factorization (NMF) inspired representations promoting community discovery.

We reconcile LSMs with latent community detection by constraining the LDM representation to the D-simplex, forming the Hybrid-Membership Latent Distance Model (HM-LDM). We show that for sufficiently large simplex volumes this can be achieved without loss of expressive power, whereas by extending the model to squared Euclidean distances we recover the LEM formulation with constraints promoting part-based representations akin to NMF. Importantly, by systematically reducing the volume of the simplex, the model becomes unique and ultimately leads to hard assignments of nodes to simplex corners.

We demonstrate experimentally how the proposed HM-LDM admits accurate node representations in regimes ensuring identifiability and valid community extraction. Notably, HM-LDM naturally reconciles soft and hard community detection with network embeddings through a simple continuous optimization procedure on a volume-constrained simplex, which admits the systematic investigation of trade-offs between hard and mixed-membership community detection.
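The core construction above can be sketched in a few lines of PyTorch. This is an illustrative sketch, not the repository's implementation: the function names (`simplex_embeddings`, `edge_rate`) and the zero random effects are assumptions for the example; the simplex constraint is realized here via a softmax reparameterization scaled by `delta`, and the distance term uses the L2 norm raised to the power `p` (`p=2` corresponds to the squared Euclidean case discussed above).

```python
import torch

def simplex_embeddings(weights, delta=1.0):
    """Map unconstrained weights onto the delta-scaled simplex via softmax.

    Each row sums to delta, so every node embedding lies on a simplex whose
    volume is controlled by delta; shrinking delta pushes nodes toward hard
    assignments at the simplex corners.
    """
    return delta * torch.softmax(weights, dim=1)

def edge_rate(z, gamma, i, j, p=2):
    """Edge rate of a latent distance model with random effects:
    lambda_ij = exp(gamma_i + gamma_j - ||z_i - z_j||_2 ** p)."""
    dist = torch.norm(z[i] - z[j], p=2, dim=-1) ** p
    return torch.exp(gamma[i] + gamma[j] - dist)

# Toy usage: 5 nodes embedded in a D = 3 dimensional simplex.
torch.manual_seed(0)
W = torch.randn(5, 3)                # unconstrained node weights
Z = simplex_embeddings(W, delta=2.0) # rows sum to delta = 2.0
gamma = torch.zeros(5)               # random effects, zero for the sketch
rate = edge_rate(Z, gamma, torch.tensor([0]), torch.tensor([1]), p=1)
```

Because `gamma` is zero and the distance is non-negative, the resulting rate always lies in (0, 1] in this toy setup.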

### Unipartite network examples based on the inferred HM-LDM memberships for the AstroPh and Facebook networks

*(Figures: AstroPh (p=2), AstroPh (p=1), Facebook (p=2), Facebook (p=1))*

### A Bipartite Example with a Drug-Gene Network

*(Figures: Drug-Gene (p=2), Drug-Gene (p=1))*

## Installation

```
pip install -r requirements.txt
```

Our PyTorch implementation uses the `pytorch_sparse` package. Installation guidelines can be found at the corresponding GitHub repository.

## Learning identifiable graph representations with HM-LDM

Run:

```
python main.py
```

Optional arguments:

- `--epochs`: number of epochs for training (default: 10K)
- `--scaling_epochs`: number of epochs for learning the initial scale of the random effects (default: 2K)
- `--cuda`: CUDA training (default: True)
- `--LP`: perform link prediction (default: True)
- `--D`: dimensionality of the embeddings (default: 8)
- `--lr`: learning rate for the Adam optimizer (default: 0.1)
- `--p`: L2-norm power (default: 1)
- `--dataset`: dataset to apply HM-LDM to (default: grqc)
- `--sample_percentage`: fraction of the network to sample; must be less than or equal to 1 (default: 1)
- `--delta_sq`: delta^2 hyperparameter controlling the volume of the simplex
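For example, the flags above can be combined as follows. The `--delta_sq` value shown here is illustrative, since no default is listed:

```shell
# Train 8-dimensional embeddings on the default grqc dataset with
# squared Euclidean distances (p=2) and an illustrative simplex volume.
python main.py --dataset grqc --D 8 --p 2 --delta_sq 1 --epochs 10000 --lr 0.1
```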

## CUDA Implementation

The code has been primarily developed and optimized for GPU-enabled environments.

## References

N. Nakis, A. Celikkanat, and M. Mørup, *Hybrid-Membership Latent Distance Model*, 11th International Conference on Complex Networks and their Applications (CNA 2022).