Graph-Skeleton

This repository contains the code for "Graph-Skeleton: ~1% Nodes are Sufficient to Represent Billion-Scale Graph", accepted at WWW 2024.

Overview

In this paper, we focus on a common challenge in graph mining applications: compressing the massive number of background nodes when only a small set of target nodes needs to be classified, in order to ease the storage and model deployment difficulties on very large graphs. The proposed Graph-Skeleton first fetches the essential background nodes under the guidance of structural connectivity and feature correlation, then condenses the information of the background nodes with three different condensation strategies. The generated skeleton graph is highly informative and friendly to storage and graph model deployment.

Install

conda install py-boost
pip install boxprint
pip install colors.py

Data Download

The current code demo is based on the DGraph-Fin dataset. Please download DGraphFin.zip, move it into the datasets directory, unzip it, and organize the files as follows:

.
--datasets
   └─DGraphFin
     └─dgraphfin.npz
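
Before running any experiment, it can help to confirm that the file sits where the layout above expects it. The following is a minimal sketch, not part of the repository: the helper name summarize_npz is hypothetical, and it makes no assumption about which arrays the archive contains, it only lists what it finds.

```python
from pathlib import Path

import numpy as np


def summarize_npz(path):
    """Return {array_name: shape} for every array stored in an .npz archive."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Expected dataset file at {path}")
    # np.load on an .npz returns a lazy archive; arrays are read on access.
    with np.load(path) as archive:
        return {name: archive[name].shape for name in archive.files}


if __name__ == "__main__":
    # Path taken from the directory layout above, relative to the repo root.
    dataset = Path("datasets/DGraphFin/dgraphfin.npz")
    if dataset.is_file():
        for name, shape in summarize_npz(dataset).items():
            print(f"{name}: shape={shape}")
    else:
        print(f"{dataset} not found; check the directory layout")
```

Run it from the repository root; if the file is missing, the script reports that instead of failing inside a training run.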

Preliminary Studies

You can run the scripts in the folder ./preliminary_exploration to reproduce the exploration studies (Section 2 [Empirical Analysis]) in our paper. The edge-cutting type is controlled by the hyper-parameter --cut, which supports random cut (cut_random), cut T-T (tt), cut T-B (tb), and cut B-B (bb).

cd preliminary_exploration
python dgraph/gnn_mini_batch.py --cut tt
python dgraph/gnn_mini_batch.py --cut cut_random --rd_ratio 0.5
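
To make the four cut types concrete, here is a small illustrative sketch. It is not the code in gnn_mini_batch.py. It assumes that T-T, T-B, and B-B denote edges between two target nodes, between a target and a background node, and between two background nodes, and that --rd_ratio is the fraction of edges removed at random. The function name cut_edges and its signature are hypothetical.

```python
import numpy as np


def cut_edges(edge_index, is_target, cut, rd_ratio=0.5, seed=0):
    """Drop one class of edges from a (2, E) edge array.

    edge_index: integer array of shape (2, E), one column per edge.
    is_target:  boolean array of shape (N,), True for target nodes.
    cut:        "tt", "tb", "bb", or "cut_random".
    rd_ratio:   assumed meaning: probability of dropping each edge
                when cut == "cut_random".
    """
    src, dst = edge_index
    # Number of endpoints of each edge that are target nodes: 0, 1, or 2.
    target_ends = is_target[src].astype(int) + is_target[dst].astype(int)
    if cut == "tt":
        drop = target_ends == 2
    elif cut == "tb":
        drop = target_ends == 1
    elif cut == "bb":
        drop = target_ends == 0
    elif cut == "cut_random":
        rng = np.random.default_rng(seed)
        drop = rng.random(edge_index.shape[1]) < rd_ratio
    else:
        raise ValueError(f"unknown cut type: {cut!r}")
    return edge_index[:, ~drop]


if __name__ == "__main__":
    # Toy graph: nodes 0 and 1 are targets, nodes 2 and 3 are background.
    edges = np.array([[0, 0, 2, 1],
                      [1, 2, 3, 3]])
    targets = np.array([True, True, False, False])
    print(cut_edges(edges, targets, "tt"))  # edge 0-1 removed
```

Comparing downstream accuracy after each kind of cut is one way to see which class of edges matters most for target classification.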

Graph-Skeleton Compression

Compile Graph-Skeleton

cd skeleton_compress
mkdir build
cd build
cmake ..
make
cd ..

Graph-Skeleton Generation

You can run the script ./skeleton_compress/skeleton_compression.py to generate skeleton graphs. Note that in our original paper the hyper-parameter --d is set to [2,1]; you can modify d to change the node fetching distance. Setting different values of the hyper-parameter --cut selects which condensation strategy (i.e., $\alpha$, $\beta$, or $\gamma$) is used.

python skeleton_compression.py
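
As a rough intuition for what a node fetching distance bounds, the sketch below runs a generic multi-source breadth-first search that collects every node within a given number of hops of any target node. This is only an illustration of hop-limited fetching. It is not the compiled Graph-Skeleton routine, it does not model how the two entries of d are used, and the function name and the adjacency-dict input format are hypothetical.

```python
from collections import deque


def nodes_within_hops(adjacency, targets, max_hops):
    """Multi-source BFS.

    adjacency: dict mapping node -> iterable of neighbour nodes.
    targets:   iterable of starting (target) nodes.
    Returns {node: hop distance to the nearest target} for all nodes
    whose distance is at most max_hops.
    """
    dist = {t: 0 for t in targets}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


if __name__ == "__main__":
    # Path graph 0-1-2-3-4 with node 0 as the only target.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(nodes_within_hops(path, [0], max_hops=2))
```

A larger hop limit keeps more background nodes around each target, which trades compression for retained context.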

Graph-Skeleton Downstream Target Classification

You can run the scripts in the folder ./skeleton_test to deploy the generated skeleton graphs on downstream GNNs for the target node classification task.

cd skeleton_test/dgraph
python gnn_mini_batch.py --cut skeleton_gamma
