Below is an outline of how to reproduce our solution for the CHAMPS competition. If you run into any trouble with the setup or code, or have any questions, please contact us at Zico.Kolter@us.bosch.com
Copyright 2019 Robert Bosch GmbH
Code authors: Zico Kolter, Shaojie Bai, Devin Wilmott, Mordechai Kornbluth, Jonathan Mailoa, part of Bosch Research (CR).
config/: Configuration files
data/: Raw data
models/: Saved models
processed/: Processed data
src/: Source code for preprocessing, training, and predicting.
submission/: Directory for the actual predictions
Hardware (The following specs were used to create the original solution)
The models were trained on a variety of machines, each running a Linux OS:
- 5 machines had 4 GPUs, each an NVIDIA GeForce RTX 2080 Ti
- 2 machines had 1 NVIDIA Tesla V100 GPU with 32 GB memory
- 6 machines had 1 NVIDIA Tesla V100 GPU with 16 GB memory
- Python 3.5+
- CUDA 10.1
- NVIDIA APEX (currently only available through its repo)
Python packages are detailed separately in
Note: Though listed in
rdkit is not available with pip. We strongly suggest installing rdkit via conda:
conda install -c rdkit rdkit
We use only the
structures.csv files of the competition. They should be (unzipped and) placed in the data/ directory.
All of the commands below are executed from the
python pipeline_pre.py 1 (This could take 1-2 hours)
python pipeline_pre.py 2
(You may need to change the permissions of the .csv files to use the two scripts above via
Model Build - There are three options to produce the solution.
- Very fast prediction: run predictor.py fast, which uses the precomputed results for ensembling.
- Ordinary prediction: run predictor.py, which uses the precomputed checkpoints for predicting and ensembling.
- Re-train models: run train.py to train a new model from scratch. See train.py -h for allowed arguments, and the config files for each model for the arguments used.
The config/models.json file contains the following important keys:
- names: List of the names we will ensemble
- output file: The name of the ensembled output file
- num atom types, bond types, triplet types, quad types: These are arguments to pass to the GraphTransformer instantiator. Note that in the default setting, quadruplet information is not used by GTs.
- model_dir: The directory in models/ associated with each model. Each directory must have
  - GraphTransformer class (and any modules it needs);
  - config file with the kwargs to instantiate the
  - [MODEL_NAME].ckpt that can be loaded via load_state_dict(torch.load('[MODEL_NAME].ckpt').state_dict()) (to avoid PyTorch version conflict).
Notes on (Pre-trained) Model Loading
All pretrained models are stored in models/. However, different models may have slightly different architectures (e.g., some GT models are followed by a 2-layer grouped residual network, while others have only one residual block). The training script (train.py), when initiated without the --debug flag, will automatically create a log folder in CHAMPS-GT/ that contains the code for the GT used. When loading the model, use the graph_transformer.py in that log folder (instead of the default one in
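One way to make sure the logged graph_transformer.py is the one that gets used, rather than whichever copy happens to be on the import path, is to load it by file path. The following is a minimal sketch using only the Python standard library. The helper name load_module_from_path, the module name, and the placeholder path in the usage comment are illustrative assumptions and are not part of the repository.

```python
import importlib.util
from pathlib import Path


def load_module_from_path(path, module_name="logged_graph_transformer"):
    """Import a Python source file from an explicit path and return the module.

    `module_name` is an arbitrary label chosen for this sketch; it only has to
    differ from modules that are already imported.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    # Build an import spec that points at this exact file.
    spec = importlib.util.spec_from_file_location(module_name, str(path))
    module = importlib.util.module_from_spec(spec)
    # Execute the file's code inside the new module object.
    spec.loader.exec_module(module)
    return module


# Hypothetical usage (the log folder name depends on your training run):
# gt_module = load_module_from_path("<log folder>/graph_transformer.py")
# GraphTransformer = gt_module.GraphTransformer
```

If the logged file imports sibling modules from the same folder, that folder would also need to be on sys.path before the call.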
Notes on Model Training
When trained from scratch, the default parameters should lead to a model achieving a score of around -3.06 to -3.07. Using the --debug flag prevents the program from creating a log folder.
Notes on Saving Memory
What if you get a CUDA out of memory error? We suggest a few solutions:
- If you have a multi-GPU machine, use the --multi_gpu flag, and tune the --gpu0_bsz flag (which controls the minibatch size passed to GPU device 0). For instance, on a 4-GPU machine, you can run python train.py [...] --batch_size 47 --multi_gpu --gpu0_bsz 11, which assigns a batch size of 12 to each of GPUs 1, 2, and 3 and a batch size of 11 to GPU 0.
- Use the --fp16 option, which applies NVIDIA APEX's mixed precision training.
- Use the --batch_chunk option, which chunks a larger batch into a few smaller (equal) shares. The gradients from the smaller minibatches will accumulate, so the effective batch size is still the same as
- Use fewer --n_layer, or smaller
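The batch-size arithmetic behind the --gpu0_bsz and --batch_chunk suggestions can be sketched in plain Python. This is an illustration only, not the code in train.py: the function names are hypothetical, the rule for spreading a non-divisible remainder across GPUs is an assumption, and --batch_chunk is assumed to mean the number of equal chunks. The toy model is a one-parameter least-squares fit, used only to show that averaging the gradients of equal chunks reproduces the full-batch gradient.

```python
def split_batch(batch_size, n_gpus, gpu0_bsz):
    """Per-GPU minibatch sizes: GPU 0 gets gpu0_bsz, the others share the rest."""
    if n_gpus < 2:
        raise ValueError("need at least 2 GPUs to split a batch")
    if not 0 <= gpu0_bsz <= batch_size:
        raise ValueError("gpu0_bsz must be between 0 and batch_size")
    rest = batch_size - gpu0_bsz
    base, extra = divmod(rest, n_gpus - 1)
    # Assumption: leftover examples go to the first few remaining GPUs.
    others = [base + (1 if i < extra else 0) for i in range(n_gpus - 1)]
    return [gpu0_bsz] + others


def grad_mse(w, xs, ys):
    """Gradient w.r.t. w of the mean squared error of the model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, batch_chunk):
    """Average the gradients of batch_chunk equal chunks of the batch."""
    if len(xs) % batch_chunk != 0:
        raise ValueError("batch must divide evenly into batch_chunk chunks")
    size = len(xs) // batch_chunk
    total = 0.0
    for start in range(0, len(xs), size):
        chunk_grad = grad_mse(w, xs[start:start + size], ys[start:start + size])
        # Scaling each chunk by 1 / batch_chunk makes the sum a full-batch mean.
        total += chunk_grad / batch_chunk
    return total


if __name__ == "__main__":
    # The example from the text: 47 examples on 4 GPUs with --gpu0_bsz 11.
    print(split_batch(47, 4, 11))  # [11, 12, 12, 12]

    xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
    ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.2, 7.8]
    full = grad_mse(0.3, xs, ys)
    chunked = accumulated_grad(0.3, xs, ys, batch_chunk=4)
    # The two gradients agree up to floating-point rounding.
    print(abs(full - chunked) < 1e-9)
```

Each chunk needs only a fraction of the activation memory of the full batch, which is why accumulation lowers peak memory without changing the gradient that the optimizer sees.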