GitHub - chenzhao/light-dist-gnn

A lightweight distributed GNN library for full batch node property prediction.

Features/Changelog

Complete refactoring of CAGNET.
Distributed utilities such as logging and timers.
Node feature cached training.
Partitioned graph cache on disk.
More datasets. Most large graphs from PyG, DGL, and OGB are supported.
Training depends only on PyTorch.
Distributed GAT training.
Latest PyTorch version supported.
CSR graph supported.
Half precision training supported.
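
As background for the CSR item above: CSR stores a graph as a row-pointer array plus a column-index array, so the neighbors of a node sit in one contiguous slice. Below is a minimal pure-Python sketch of converting a COO edge list to CSR. The function name `coo_to_csr` and the toy graph are hypothetical and are not taken from this repository, whose own implementation may differ.

```python
def coo_to_csr(num_nodes, rows, cols):
    """Convert a COO edge list (rows[i] -> cols[i]) into CSR arrays.

    Hypothetical helper for illustration only; not part of light-dist-gnn.
    """
    # Count the edges that start at each row (source node).
    counts = [0] * num_nodes
    for r in rows:
        counts[r] += 1

    # Prefix-sum the counts to obtain the row pointers.
    indptr = [0] * (num_nodes + 1)
    for i in range(num_nodes):
        indptr[i + 1] = indptr[i] + counts[i]

    # Scatter each column index into the next free slot of its row.
    indices = [0] * len(cols)
    next_slot = indptr[:-1].copy()
    for r, c in zip(rows, cols):
        indices[next_slot[r]] = c
        next_slot[r] += 1
    return indptr, indices


# Toy graph with 3 nodes and edges 0->1, 0->2, 1->2, 2->0.
indptr, indices = coo_to_csr(3, [0, 0, 1, 2], [1, 2, 2, 0])
print(indptr, indices)
```

The neighbors of node `i` are then `indices[indptr[i]:indptr[i + 1]]`.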

Getting started

Set up a clean environment.

conda create --name gnn
conda activate gnn

Install PyTorch (needed for training) and the other libraries (needed only for downloading datasets).

# CUDA 10:
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch-lts
conda install -c dglteam dgl-cuda10.2
conda install pyg -c pyg -c conda-forge
pip install ogb

# CUDA 11:
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch-lts -c nvidia
conda install -c dglteam dgl-cuda11.1
conda install pyg -c pyg -c conda-forge
pip install ogb
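
After installation, it can help to confirm that the packages are visible to the active environment. The following is a small sketch that uses only the standard library; the helper name `check_packages` is hypothetical, and the import names listed (`torch`, `dgl`, `torch_geometric`, `ogb`) are the usual ones for PyTorch, DGL, PyG, and OGB.

```python
import importlib.util


def check_packages(names=("torch", "dgl", "torch_geometric", "ogb")):
    """Return {name: bool} telling whether each top-level package can be found.

    Hypothetical helper for illustration only; not part of light-dist-gnn.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

This only checks that each package can be located; it does not verify that the installed CUDA build matches your driver.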

Compile and install spmm (optional; requires a CUDA development environment).

cd spmm_cpp
python setup.py install

Prepare datasets (edit the code according to your needs).

# This may take a while.
python prepare_data.py

Train.

python main.py

Experiments for Sancus: Staleness-Aware Communication-Avoiding Full-Graph Decentralized Training in Large-Scale Graph Neural Networks

Follow the steps in Getting started.
Check the dataset, epoch count, and number of GPUs in main.py.
Check the model settings in dist_train.py.
Check the cache methods in models.
Run and see the result.
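
The paper title refers to staleness-aware communication avoidance. As a generic illustration of that idea only, the sketch below reuses a cached value until it is older than a fixed bound and only then fetches a fresh copy. The class name `BoundedStalenessCache`, the `max_staleness` parameter, and the `fetch` callback are all hypothetical; this is not the caching logic in models, which should be read from the source.

```python
class BoundedStalenessCache:
    """Reuse a cached value until it is more than `max_staleness` epochs old.

    Hypothetical illustration of bounded staleness; not code from light-dist-gnn.
    """

    def __init__(self, max_staleness):
        self.max_staleness = max_staleness
        self.value = None
        self.last_refresh = None  # epoch of the most recent fetch
        self.refreshes = 0        # number of (expensive) fetches performed

    def get(self, epoch, fetch):
        # Fetch when nothing is cached yet or the cached copy is too old.
        too_old = (
            self.last_refresh is None
            or epoch - self.last_refresh > self.max_staleness
        )
        if too_old:
            self.value = fetch()  # stands in for a communication step
            self.last_refresh = epoch
            self.refreshes += 1
        return self.value


cache = BoundedStalenessCache(max_staleness=2)
for epoch in range(10):
    features = cache.get(epoch, fetch=lambda: f"features@{epoch}")
print(cache.refreshes)
```

With a bound of 2, the ten epochs above trigger a fetch only at epochs 0, 3, 6, and 9; every other epoch reuses the cached copy.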

Contact

Contact chenzhao@ust.hk with any problems.

