We provide scripts to prepare DLR and GNN datasets in the datagen folder.
Due to limited time and disk space, the preprocessing of embeddings is not included.
In this artifact, the embeddings are initialized without loading the correct embedding values from the dataset.
This only affects the numerical correctness of training/inference, and does not affect the computation workflow.
By default, GNN datasets are placed in /datasets_gnn:
tree /datasets_gnn -L 2
/datasets_gnn
├── data-raw # original downloaded dataset
│   ├── com-friendster
│   ├── com-friendster.tar.zst
│   ├── mag240m_kddcup2021
│   ├── mag240m_kddcup2021.zip
│   ├── papers100M-bin
│   └── papers100M-bin.zip
├── gnnlab # converted dataset for GNNLab
│   ├── com-friendster
│   ├── mag240m-homo
│   └── papers100M-undir
└── wholegraph # converted dataset for UGache and WholeGraph
    ├── com_friendster
    ├── mag240m_homo
    └── ogbn_papers100M
Run the following commands to download and process GNN datasets:
# run the following commands in the GNN container
cd /ugache/datagen/gnn
python friendster.py
python mag240M.py
python papers100M.py
Besides downloading ~300GB of raw data, preprocessing may take around 1 hour.
The final datasets in gnnlab and wholegraph occupy 130GB, while the data-raw directory occupies up to 600GB.
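Because the download and conversion need several hundred gigabytes, it can help to check free space before starting. The following is a minimal sketch, not part of the artifact: the helper name `has_enough_space` is hypothetical, and it treats the GB figures quoted above as GiB, which is only an approximation.

```python
import shutil

GIB = 1024 ** 3


def has_enough_space(path, required_gib):
    """Return True if the filesystem holding `path` has at least `required_gib` GiB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gib * GIB


# 600 (data-raw) + 130 (gnnlab and wholegraph), taken from the figures quoted above.
# "." is a placeholder; point it at the filesystem that will hold /datasets_gnn.
required_gib = 600 + 130
print("enough space:", has_enough_space(".", required_gib))
```

The same check applies to the DLR datasets with a larger threshold.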
By default, DLR datasets are placed in /datasets_dlr:
tree /datasets_dlr -L 2
/datasets_dlr
├── data-raw # original downloaded dataset
│   └── criteo_tb
└── processed # converted dataset
    ├── criteo_tb
    └── syn_a12_s100_c800m
Since there is no permanent URL for the Criteo TB dataset, please download it manually from ailab.criteo.com or aliyun, and place day_0.gz through day_23.gz under /datasets_dlr/data-raw/criteo_tb/.
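Since the files are placed by hand, a quick completeness check before preprocessing may save a failed run. This is a sketch under assumptions: the helper `missing_criteo_days` is hypothetical and only checks that the 24 file names listed above exist, not that their contents are intact.

```python
from pathlib import Path
from typing import List


def missing_criteo_days(raw_dir: str, num_days: int = 24) -> List[str]:
    """Return the expected day_<i>.gz file names that are absent from `raw_dir`."""
    root = Path(raw_dir)
    expected = [f"day_{i}.gz" for i in range(num_days)]
    return [name for name in expected if not (root / name).is_file()]


missing = missing_criteo_days("/datasets_dlr/data-raw/criteo_tb")
if missing:
    print("missing files:", ", ".join(missing))
else:
    print("all 24 day files are present")
```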
Then, run the following commands to process DLR datasets:
# run the following commands in the DLR container
cd /ugache/datagen/dlr
python syn.py
cd criteo
bash criteo.sh
Depending on your network, downloading and preprocessing the full Criteo TB dataset may take up to 24 hours and consume around 2TB of disk space. The final dataset in processed occupies 700GB.
Our experiments are automated with scripts. Each figure in our paper is treated as one experiment and has its own subdirectory in ugache/eval. The scripts automatically run the experiment, save the logs to files, parse the output data from those files, and plot the corresponding figure.
tree /ugache/eval -L 2
/ugache/eval
├── dlr
│   ├── figure11-4v100
│   ├── figure11-8a100
│   ├── figure11-8v100
│   ├── figure12-4v100
│   ├── figure12-8a100
│   ├── figure12-8v100
│   └── figure16
└── gnn
    ├── figure11-4v100
    ├── figure11-8a100
    ├── figure11-8v100
    ├── figure12-4v100
    ├── figure12-8a100
    ├── figure12-8v100
    ├── figure13
    ├── figure14
    └── figure15
We provide one-click scripts to reproduce the results on a multi-GPU server. These scripts simply chain the commands from the following "Reproducing single figure" section.
$ cd /ugache/eval/gnn # GNN tests in the gnn folder should be run in the gnn container
# for DLR tests, enter /ugache/eval/dlr in the dlr container
$ bash run-all-4v100.sh # run the script that matches your platform: run-all-(4v100,8v100,8a100).sh
In each figure* folder, execute the following commands. Take dlr/figure11-4v100 for example:
# tests in dlr folder should be run in dlr container
$ cd /ugache/eval/dlr/figure11-4v100
$ make run
$ make plot
$ ls data*
data.dat data.eps
$ cat data.dat
short_app policy_impl dataset_short step.train
dlrm SOK CR 0.005778
dlrm HPS CR 0.004299
dlrm UGache CR 0.002626
dcn SOK CR 0.007870
dcn HPS CR 0.006381
dcn UGache CR 0.004722
dlrm SOK SYN 0.014536
dlrm HPS SYN 0.018224
dlrm UGache SYN 0.008524
dcn SOK SYN 0.047721
dcn HPS SYN 0.046759
dcn UGache SYN 0.037482
The make run command runs all tests and saves the logs to the run-logs folder.
The make plot command first parses the logs in the run-logs folder to produce a data.dat file, then plots the corresponding figure to data.eps.
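The data.dat file can also be inspected programmatically, for example to compute how much faster UGache is than each baseline. The sketch below is illustrative only: the helpers `load_step_times` and `speedup_over` are hypothetical, and it assumes data.dat is whitespace-separated with the four column names shown in the listing above.

```python
from collections import defaultdict


def load_step_times(text):
    """Parse data.dat text into {(app, dataset): {policy: seconds_per_step}}."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header, body = rows[0], rows[1:]
    col = {name: i for i, name in enumerate(header)}
    table = defaultdict(dict)
    for row in body:
        key = (row[col["short_app"]], row[col["dataset_short"]])
        table[key][row[col["policy_impl"]]] = float(row[col["step.train"]])
    return dict(table)


def speedup_over(table, baseline, system="UGache"):
    """Return {(app, dataset): baseline_time / system_time} where both are present."""
    return {
        key: times[baseline] / times[system]
        for key, times in table.items()
        if baseline in times and system in times
    }


# Three rows copied from the listing above; in practice, read data.dat instead.
SAMPLE = """\
short_app policy_impl dataset_short step.train
dlrm SOK CR 0.005778
dlrm HPS CR 0.004299
dlrm UGache CR 0.002626
"""

table = load_step_times(SAMPLE)
for baseline in ("SOK", "HPS"):
    for (app, dataset), ratio in speedup_over(table, baseline).items():
        print(f"{app}/{dataset}: UGache is {ratio:.2f}x faster than {baseline}")
```

To use a real result, replace `SAMPLE` with the contents of data.dat, for example `open("data.dat").read()`.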
Each figure folder contains a runner.py file, and make run is simply an alias of python runner.py.
The Python script iterates over all configurations, generates a command for each configuration, and runs it via os.system.
You may execute python runner.py -m to see the commands it generates and run a single configuration manually.
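The pattern described above can be sketched as follows. This is not the artifact's runner.py: the configuration lists, the log file naming, and the `echo` placeholder command are all assumptions made for illustration, and only the overall shape (iterate configurations, build one command each, run via os.system, print instead when -m is given) follows the description.

```python
import argparse
import itertools
import os


def build_commands(apps, policies, datasets):
    """Generate one shell command per (app, policy, dataset) configuration."""
    cmds = []
    for app, policy, dataset in itertools.product(apps, policies, datasets):
        log = f"run-logs/{app}_{policy}_{dataset}.log"  # hypothetical log naming
        # `echo` stands in for the real training command, which is not shown here.
        cmds.append(f"echo app={app} policy={policy} dataset={dataset} > {log}")
    return cmds


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("-m", dest="mock", action="store_true",
                        help="print the generated commands instead of running them")
    args = parser.parse_args(argv)

    # Hypothetical configuration space, loosely modeled on the data.dat columns.
    cmds = build_commands(["dlrm", "dcn"], ["SOK", "HPS", "UGache"], ["CR", "SYN"])
    if not args.mock:
        os.makedirs("run-logs", exist_ok=True)
    for cmd in cmds:
        if args.mock:
            print(cmd)
        else:
            os.system(cmd)
    return cmds


# Dry run: list the 12 generated commands without executing any of them.
main(["-m"])
```

Printing the commands first makes it easy to copy one line and rerun a single failing configuration by hand.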
We recommend the eps-preview extension in vscode to quickly preview eps figures.
We also provide the original log files used in our paper submission in the run-logs-paper folder.
You may run make plot-paper to plot the figures directly from these log files, which reproduces the figures in the paper without running all tests.