Stochastic Training for Graph Convolutional Networks

Dependencies

  • MXNet nightly build
pip install mxnet --pre

Neighbor Sampling & Skip Connection

cora: test accuracy ~83% with --num-neighbors 2, ~84% by training on the full graph

DGLBACKEND=mxnet python examples/mxnet/sampling/train.py --model gcn_ns --dataset cora --self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000

citeseer: test accuracy ~69% with --num-neighbors 2, ~70% by training on the full graph

DGLBACKEND=mxnet python examples/mxnet/sampling/train.py --model gcn_ns --dataset citeseer --self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000

pubmed: test accuracy ~78% with --num-neighbors 3, ~77% by training on the full graph

DGLBACKEND=mxnet python examples/mxnet/sampling/train.py --model gcn_ns --dataset pubmed --self-loop --num-neighbors 3 --batch-size 1000 --test-batch-size 5000

reddit: test accuracy ~91% with --num-neighbors 3 and --batch-size 1000, ~93% by training on the full graph

DGLBACKEND=mxnet python examples/mxnet/sampling/train.py --model gcn_ns --dataset reddit-self-loop --num-neighbors 3 --batch-size 1000 --test-batch-size 5000 --n-hidden 64
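To make the role of --num-neighbors concrete, here is a minimal pure-Python sketch of layer-wise neighbor sampling. It is an illustration only, not the code in gcn_ns_sc.py: the function names, the toy adjacency lists, and the uniform sampling without replacement are all assumptions made for this example.

```python
import random


def sample_neighbors(adj, seeds, num_neighbors, rng):
    """For each seed node, keep at most `num_neighbors` uniformly sampled neighbors."""
    sampled = {}
    for v in seeds:
        nbrs = adj[v]
        if len(nbrs) <= num_neighbors:
            sampled[v] = list(nbrs)
        else:
            sampled[v] = rng.sample(nbrs, num_neighbors)
    return sampled


def sample_layers(adj, batch, num_neighbors, num_layers, rng):
    """Build one sampled neighborhood per layer, from the output layer back to the input."""
    layers = []
    frontier = list(batch)
    for _ in range(num_layers):
        sampled = sample_neighbors(adj, frontier, num_neighbors, rng)
        layers.append(sampled)
        # The next frontier holds every node whose features the layer below must produce.
        frontier = sorted(set(frontier) | {u for nbrs in sampled.values() for u in nbrs})
    return layers


# Hypothetical toy graph: adjacency lists that already include self-loops.
adj = {
    0: [0, 1, 2, 3],
    1: [1, 0, 2],
    2: [2, 0, 1, 3],
    3: [3, 0, 2],
}
rng = random.Random(0)
layers = sample_layers(adj, batch=[0], num_neighbors=2, num_layers=2, rng=rng)
for depth, sampled in enumerate(layers):
    print("layer", depth, sampled)
```

With a small num_neighbors, the number of nodes touched per mini-batch grows much more slowly with depth than it would if every neighbor were expanded, which is the trade-off behind the accuracy figures above.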

Control Variate & Skip Connection

cora: test accuracy ~84% with --num-neighbors 1, ~84% by training on the full graph

DGLBACKEND=mxnet python examples/mxnet/sampling/train.py --model gcn_cv --dataset cora --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000

citeseer: test accuracy ~69% with --num-neighbors 1, ~70% by training on the full graph

DGLBACKEND=mxnet python examples/mxnet/sampling/train.py --model gcn_cv --dataset citeseer --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000

pubmed: test accuracy ~79% with --num-neighbors 1, ~77% by training on the full graph

DGLBACKEND=mxnet python examples/mxnet/sampling/train.py --model gcn_cv --dataset pubmed --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000

reddit: test accuracy ~93% with --num-neighbors 1 and --batch-size 1000, ~93% by training on the full graph

DGLBACKEND=mxnet python examples/mxnet/sampling/train.py --model gcn_cv --dataset reddit-self-loop --num-neighbors 1 --batch-size 10000 --test-batch-size 5000 --n-hidden 64
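The control-variate idea can be sketched in a few lines of pure Python. This is a simplified illustration under stated assumptions, not the code in gcn_cv_sc.py: activations are scalars, aggregation is a plain mean, and the names h, h_hist, and cv_aggregate are hypothetical. The estimator adds the exact mean of stored (historical) activations to a sampled mean of the difference between current and historical activations.

```python
import random


def mean(values):
    return sum(values) / len(values)


def full_aggregate(h, nbrs):
    """Exact neighbor mean over all neighbors."""
    return mean([h[u] for u in nbrs])


def ns_aggregate(h, sampled):
    """Plain neighbor-sampling estimate: mean over the sampled neighbors only."""
    return mean([h[u] for u in sampled])


def cv_aggregate(h, h_hist, nbrs, sampled):
    """Control-variate estimate: exact mean of the stored history
    plus a sampled mean of the (current - history) correction."""
    history_term = mean([h_hist[u] for u in nbrs])
    correction = mean([h[u] - h_hist[u] for u in sampled])
    return history_term + correction


# Hypothetical scalar activations and slightly stale historical copies.
h = {0: 1.0, 1: 2.0, 2: 4.0, 3: 8.0}
h_hist = {0: 0.9, 1: 2.1, 2: 3.8, 3: 8.3}
nbrs = [0, 1, 2, 3]

exact = full_aggregate(h, nbrs)
rng = random.Random(0)
trials = 1000
ns_err = 0.0
cv_err = 0.0
for _ in range(trials):
    sampled = [rng.choice(nbrs)]  # one sampled neighbor, as with --num-neighbors 1
    ns_err += abs(ns_aggregate(h, sampled) - exact)
    cv_err += abs(cv_aggregate(h, h_hist, nbrs, sampled) - exact)

print("exact neighbor mean:", exact)
print("mean abs error, neighbor sampling:", ns_err / trials)
print("mean abs error, control variate:", cv_err / trials)
```

When the historical activations are close to the current ones, the sampled correction term is small, so even a single sampled neighbor gives a low-variance estimate. That is consistent with the commands above using --num-neighbors 1.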

Control Variate & GraphSAGE-mean

Following Control Variate, we use the mean-pooling GraphSAGE architecture (GraphSAGE-mean), with two linear layers and layer normalization in each graph convolution layer.

reddit: test accuracy 96.1% with --num-neighbors 1 and --batch-size 1000, compared with ~96.2% reported in Control Variate with --num-neighbors 2 and --batch-size 1000

DGLBACKEND=mxnet python examples/mxnet/sampling/train.py --model graphsage_cv --batch-size 1000 --test-batch-size 5000 --n-epochs 50 --dataset reddit --num-neighbors 1 --n-hidden 128 --dropout 0.2 --weight-decay 0
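As a rough picture of one such layer, the sketch below mean-pools neighbor features, applies two linear maps, and layer-normalizes the result. It is not the code in graphsage_cv.py. Several details are assumptions made for illustration: the node's own features are concatenated with the pooled neighbor features, the two linear layers are applied in sequence with a ReLU between them, layer normalization has no learned scale or shift, and all weights and names are hypothetical.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def relu(x):
    return [max(0.0, xi) for xi in x]


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mu = sum(x) / len(x)
    var = sum((xi - mu) ** 2 for xi in x) / len(x)
    return [(xi - mu) / math.sqrt(var + eps) for xi in x]


def sage_mean_layer(h_self, h_nbrs, W1, W2):
    """One GraphSAGE-mean style layer (assumed layout, see the text above)."""
    dim = len(h_self)
    # Mean pooling over the neighbor feature vectors.
    pooled = [sum(h[i] for h in h_nbrs) / len(h_nbrs) for i in range(dim)]
    # Assumption: combine self and neighborhood features by concatenation.
    combined = h_self + pooled
    hidden = relu(matvec(W1, combined))  # first linear layer
    out = matvec(W2, hidden)             # second linear layer
    return layer_norm(out)


# Hypothetical inputs: 2-dimensional features, two neighbors.
h_self = [1.0, 2.0]
h_nbrs = [[0.0, 4.0], [2.0, 0.0]]
W1 = [[0.5, 0.0, 0.0, 0.0],
      [0.0, 0.5, 0.0, 0.0],
      [0.0, 0.0, -1.0, 1.0]]
W2 = [[1.0, 0.0, 0.0],
      [0.0, 1.0, 1.0]]

out = sage_mean_layer(h_self, h_nbrs, W1, W2)
print(out)
```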

Multi-Process Training

Start the graph store server, which loads the reddit dataset and is configured for four workers.

python3 examples/mxnet/sampling/run_store_server.py --dataset reddit --num-workers 4

Then launch four workers to train GraphSAGE on the reddit dataset.

python3 ../incubator-mxnet/tools/launch.py -n 4 -s 1 --launcher local python3 examples/mxnet/sampling/multi_process_train.py --model graphsage_cv --batch-size 1000 --test-batch-size 5000 --n-epochs 1 --graph-name reddit --num-neighbors 1 --n-hidden 128 --dropout 0.2 --weight-decay 0
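The two commands above both mention the worker count, so it can help to generate them from a single setting. The sketch below only builds and prints the command lines and does not start anything. The helper names are hypothetical, the paths and flags are copied from the commands above, and it assumes that --num-workers on the store server should match -n passed to launch.py.

```python
import shlex

NUM_WORKERS = 4  # single source of truth for both commands (assumption: they must agree)


def store_server_cmd(dataset, num_workers):
    """Argument list for the graph store server."""
    return [
        "python3", "examples/mxnet/sampling/run_store_server.py",
        "--dataset", dataset,
        "--num-workers", str(num_workers),
    ]


def worker_launch_cmd(graph_name, num_workers):
    """Argument list that starts the training workers through MXNet's launch.py."""
    train = [
        "python3", "examples/mxnet/sampling/multi_process_train.py",
        "--model", "graphsage_cv",
        "--batch-size", "1000",
        "--test-batch-size", "5000",
        "--n-epochs", "1",
        "--graph-name", graph_name,
        "--num-neighbors", "1",
        "--n-hidden", "128",
        "--dropout", "0.2",
        "--weight-decay", "0",
    ]
    launcher = [
        "python3", "../incubator-mxnet/tools/launch.py",
        "-n", str(num_workers), "-s", "1", "--launcher", "local",
    ]
    return launcher + train


server = store_server_cmd("reddit", NUM_WORKERS)
workers = worker_launch_cmd("reddit", NUM_WORKERS)
print(shlex.join(server))
print(shlex.join(workers))
```

To actually run them, the same lists can be passed to subprocess.Popen, starting the server first as in the order given above.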