yunbum-kook/icdm20-hyperff

HyperFF: Model For Generating Realistic Hypergraphs

Source code for (1) analysis of real-world hypergraphs and (2) our model HyperFF for generating realistic hypergraphs, described in the paper Evolution of Real-world Hypergraphs: Patterns and Models without Oracles, Yunbum Kook, Jihoon Ko and Kijung Shin, IEEE ICDM 2020.

In this work, we (A) establish structural and dynamical patterns of real-world hypergraphs and (B) devise a stochastic model for generating realistic hypergraphs. Specifically:

(A) Establishment of structural and dynamical patterns in real-world hypergraphs

  • Structural: Heavy-tailed distributions of degrees, hyperedge sizes, intersection sizes, and singular values of incidence matrices.
  • Dynamical: Diminishing overlaps of hyperedges, densification, and shrinking diameter.
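As a rough illustration of three of the structural quantities listed above, the sketch below computes degree, hyperedge-size, and pairwise intersection-size distributions for a toy hypergraph given as a list of node tuples. It is not taken from this repository: the function names and the toy data are hypothetical, and treating "intersection sizes" as sizes of non-empty pairwise hyperedge intersections is an assumption that may differ from the paper's exact definition.

```python
from collections import Counter
from itertools import combinations


def degree_distribution(hyperedges):
    """Map each degree value to the number of nodes with that degree.

    The degree of a node is the number of hyperedges containing it.
    """
    degree = Counter(node for edge in hyperedges for node in set(edge))
    return Counter(degree.values())


def size_distribution(hyperedges):
    """Map each hyperedge size to the number of hyperedges of that size."""
    return Counter(len(set(edge)) for edge in hyperedges)


def intersection_size_distribution(hyperedges):
    """Map each non-zero pairwise intersection size to its frequency.

    Assumption: only pairs of hyperedges sharing at least one node count.
    This is quadratic in the number of hyperedges, so it suits toy inputs only.
    """
    sets = [set(edge) for edge in hyperedges]
    return Counter(len(a & b) for a, b in combinations(sets, 2) if a & b)


# Hypothetical toy hypergraph: four hyperedges over nodes 1..5.
toy = [(1, 2, 3), (2, 3), (3, 4, 5), (1, 5)]

degrees = degree_distribution(toy)
sizes = size_distribution(toy)
overlaps = intersection_size_distribution(toy)
```

On real datasets these counters would typically be plotted on log-log axes to inspect whether the distributions are heavy-tailed.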

(B) Stochastic model HyperFF (Hyper Forest Fire) for hypergraph generation with the following merits

  • Realistic: It exhibits all seven observed patterns and the five structural patterns reported in the previous study.
  • Self-contained: It does not rely on oracles or external information, and it is parameterized by just two scalars.
  • Emergent: Its simple and interpretable mechanisms on individual nodes non-trivially produce the examined patterns at the macroscopic level.

Requirements

To install the required packages, run the following command in your terminal:

pip install -r requirements.txt

Datasets

Download the datasets from the links in the table, unzip them, and place the unzipped folders under the "./data" folder, so that the directory hierarchy looks like this, for example:

data
  |__contact-high-school
  |__email-Eu-full
src

The datasets used in the paper are collected by Austin R. Benson and listed as follows:

Name (short name)                   #Nodes      #Edges      Description         Download
contact-high-school (contact)       327         172,035     Social Interaction  Link
email-Eu-full (email)               1,005       235,263     Email               Link
tags-ask-ubuntu (tags)              3,029       271,233     Q&A                 Link
NDC-substances-full (substances)    5,556       112,919     Drug                Link
threads-math-sx (threads)           176,445     719,792     Q&A                 Link
coauth-DBLP-full (coauth)           1,924,991   3,700,067   Coauthorship        Link
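Before running the analysis, it may help to confirm that every dataset folder sits where the hierarchy above expects it. The following is a minimal sketch rather than part of this repository: the helper name missing_datasets is hypothetical, and it assumes each unzipped folder keeps the dataset name listed in the table.

```python
from pathlib import Path

# Folder names as listed in the dataset table.
DATASETS = [
    "contact-high-school",
    "email-Eu-full",
    "tags-ask-ubuntu",
    "NDC-substances-full",
    "threads-math-sx",
    "coauth-DBLP-full",
]


def missing_datasets(data_dir="data", names=DATASETS):
    """Return the dataset names that have no folder under data_dir."""
    root = Path(data_dir)
    return [name for name in names if not (root / name).is_dir()]


if __name__ == "__main__":
    # Report which datasets still need to be downloaded and unzipped.
    for name in missing_datasets():
        print(f"missing: data/{name}")
```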

Reproduction of Analysis and HyperFF

After downloading the required datasets, run main.py from your terminal. Its usage is as follows:

main.py [-h] [-p BURNING] [-q EXPANDING] [-n NODES] dataset

positional arguments:
  dataset               Select dataset for analysis

optional arguments:
  -h, --help            show this help message and exit
  -p BURNING, --burning BURNING
                        Select the burning probability p (if the target dataset is 'model')
  -q EXPANDING, --expanding EXPANDING
                        Select the expanding probability q (if the target dataset is 'model')
  -n NODES, --nodes NODES
                        Select the number of nodes n (if the target dataset is 'model')

For example, python main.py substances reproduces the results for the NDC-substances-full (substances) dataset. To reproduce the results in the paper generated by HyperFF, run python main.py -p 0.51 -q 0.2 -n 10000 model.
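To run the analysis on several datasets in one go, the commands can be assembled programmatically. The sketch below is not part of this repository: build_command and run_all are hypothetical helpers, it must be run from the directory containing main.py, and it assumes the short names in the dataset table (contact, email, tags, threads, coauth) are accepted as the dataset argument in the same way substances is.

```python
import subprocess
import sys

# Short names from the dataset table; only "substances" is confirmed
# by the example in this README, the rest are assumed to work alike.
SHORT_NAMES = ["contact", "email", "tags", "substances", "threads", "coauth"]


def build_command(dataset, p=None, q=None, n=None):
    """Build the argument list for one main.py invocation.

    The -p, -q and -n options are passed only for the 'model' dataset,
    matching the help text of main.py.
    """
    cmd = [sys.executable, "main.py"]
    if dataset == "model":
        cmd += ["-p", str(p), "-q", str(q), "-n", str(n)]
    cmd.append(dataset)
    return cmd


def run_all(names=SHORT_NAMES):
    """Run main.py on each dataset in turn, stopping at the first failure."""
    for name in names:
        subprocess.run(build_command(name), check=True)
    # HyperFF with the parameter values quoted in this README.
    subprocess.run(build_command("model", p=0.51, q=0.2, n=10000), check=True)
```

Calling run_all() executes the datasets sequentially. The larger ones, such as coauth, may take considerably longer than the rest.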

Terms and Conditions

If you use this code as part of any published research, please consider acknowledging our IEEE ICDM 2020 paper.

@inproceedings{kook2020hyperff,
  title={Evolution of Real-world Hypergraphs: Patterns and Models without Oracles},
  author={Kook, Yunbum and Ko, Jihoon and Shin, Kijung},
  booktitle={IEEE International Conference on Data Mining (ICDM)},
  year={2020},
}
