Peiyun Hu, Zack Lipton, Anima Anandkumar, Deva Ramanan
- mxnet-cu90mkl==0.12.1 (or mxnet==0.12.1, mxnet-cu90==0.12.1)
- opencv-python
- numpy
- nltk
- tqdm
Since MXNet is more efficient when data is nicely serialized, we first prepare each dataset as a record file. We include scripts that convert data from the original format into record files. Downloading is automated for cifar10 and cifar100. For tinyimagenet200, please download it from ImageNet's official website after logging in. Please refer to the scripts under script/data for more details.
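For a rough idea of what those scripts produce, here is a minimal sketch of packing a list of (image path, class id) pairs into an indexed .rec/.idx pair with MXNet's recordio API. The function name, label layout, and example paths are assumptions for illustration; the actual conversion lives in the scripts under script/data.

```python
import cv2
import mxnet as mx


def pack_split(items, rec_path, idx_path):
    """Pack (image_path, class_id) pairs into an indexed .rec/.idx pair."""
    record = mx.recordio.MXIndexedRecordIO(idx_path, rec_path, 'w')
    for i, (img_path, label) in enumerate(items):
        img = cv2.imread(img_path)  # BGR image as a numpy array
        header = mx.recordio.IRHeader(flag=0, label=float(label), id=i, id2=0)
        packed = mx.recordio.pack_img(header, img, quality=95, img_fmt='.jpg')
        record.write_idx(i, packed)
    record.close()


# e.g. pack_split(train_items, 'data/cifar10/train.rec', 'data/cifar10/train.idx')
```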
Once data preparation is done, we expect to find a ./data directory with the following structure:
data
├── cifar10
│   ├── cifar-10-batches-py
│   ├── test.idx
│   ├── test.lst
│   ├── test.rec
│   ├── train.idx
│   ├── train.lst
│   └── train.rec
├── cifar100
│   ├── cifar-100-python
│   ├── test.idx
│   ├── test.lst
│   ├── test.rec
│   ├── train.idx
│   ├── train.lst
│   └── train.rec
└── tinyimagenet200
    ├── test
    ├── test.idx
    ├── test.lst
    ├── test.rec
    ├── train
    ├── train.idx
    ├── train.lst
    ├── train.rec
    ├── val
    ├── val.idx
    ├── val.lst
    ├── val.rec
    ├── wnids.txt
    └── words.txt
We need to convert a set of class labels into a set of binary questions based on how the class labels can be grouped together. Please refer to questions.py for how binary questions are constructed for cifar10, cifar100, and tinyimagenet200 based on WordNet.
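To make the grouping idea concrete, here is a minimal sketch that collects each class label's WordNet hypernyms with nltk, so that every hypernym synset defines one binary question ("does the true label fall under this synset?"). The function name and the naive first-sense lookup are assumptions for illustration; questions.py is the authoritative construction.

```python
from collections import defaultdict

from nltk.corpus import wordnet as wn  # requires: nltk.download('wordnet')


def build_binary_questions(class_names):
    """Map each WordNet hypernym to the subset of classes falling under it.

    Each hypernym that groups some (but not all) classes defines a binary
    question: "does the true label belong to this subset?"
    """
    questions = defaultdict(set)
    for name in class_names:
        synset = wn.synsets(name, pos=wn.NOUN)[0]  # naively take the first sense
        # every ancestor on the hypernym closure groups this class with others
        for ancestor in synset.closure(lambda s: s.hypernyms()):
            questions[ancestor.name()].add(name)
    # keep only groupings that actually split the label set
    return {q: sorted(cls) for q, cls in questions.items()
            if 0 < len(cls) < len(class_names)}


# e.g. for a few cifar10-style labels:
# build_binary_questions(['dog', 'cat', 'airplane', 'ship', 'truck'])
```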
The code is written so that it can be configured for many variants of our model. However, the number of parameters and the effort of configuring them can look daunting. To help with this, we include a script that shows how configuration is done for all variants of our model. Please refer to script/experiment/train_all_variants.py.
usage: train_all_variants.py [-h] [--dryrun] [--dataset DATASET]
                             [--gpus GPUS [GPUS ...]] [--run-id RUN_ID]

optional arguments:
  -h, --help            show this help message and exit
  --dryrun              switch for a dry run
  --dataset DATASET     [cifar10, cifar100, tinyimagenet200]
  --gpus GPUS [GPUS ...]
                        the list of gpu ids to cycle over
  --run-id RUN_ID       which cached random seed to use
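For orientation, here is a minimal sketch of how such a driver might enumerate variant configurations and cycle the launched runs over the given GPU ids. The flag names match the usage above, but the variant names, the train.py entry point, and its arguments are assumptions for illustration, not the script's actual contents.

```python
import argparse
import itertools
import subprocess

parser = argparse.ArgumentParser()
parser.add_argument('--dryrun', action='store_true', help='switch for a dry run')
parser.add_argument('--dataset', default='cifar10',
                    help='[cifar10, cifar100, tinyimagenet200]')
parser.add_argument('--gpus', type=int, nargs='+', default=[0],
                    help='the list of gpu ids to cycle over')
parser.add_argument('--run-id', type=int, default=0,
                    help='which cached random seed to use')
args = parser.parse_args()

# hypothetical variant names; the real script enumerates our model's variants
variants = ['baseline', 'alpf', 'alpf+reweighting']

for variant, gpu in zip(variants, itertools.cycle(args.gpus)):
    cmd = ['python', 'train.py',        # assumed entry point
           '--dataset', args.dataset,
           '--variant', variant,        # assumed flag
           '--gpu', str(gpu),
           '--run-id', str(args.run_id)]
    print(' '.join(cmd))
    if not args.dryrun:
        subprocess.check_call(cmd)
```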
I was using MXNet 0.12.1 when developing this code base. In later versions (e.g. the latest 1.3.0), there is a bug in mxnet/image/image.py that I am not sure how to fix. Until I find a way around it, MXNet 0.12.1 is recommended.
Please refer to this pull request created by Abhay Mittal (@abhaymittal). Here is a short description:
- The division train_size / 10 on this line needs to be cast to int, otherwise MXNet throws an error (see the sketch after this list).
- In the MXNet code, I had to change the next_sample method to use len(self.seq) instead of self.num_image.
- Finally, in the last batch, MXNet was adding padding by reusing images from the beginning of the dataset, so the value of self.cur would reset and go backwards (code). For example, if the cursor was initially at 12200 and the whole sequence had length 12300 with a batch size of 200, the cursor would reset to 100 by taking 100 images from the beginning of the dataset. I solved this by storing the initial value of the cursor and then taking the images from that point.
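To illustrate only the first point, with a hypothetical variable name and context: in Python 3, / is true division and yields a float where MXNet expects an integer.

```python
train_size = 45000

# hypothetical context: Python 3 true division yields a float (4500.0),
# which MXNet rejects wherever an integer count/index is expected
val_size = train_size / 10

# the fix described in the pull request: cast the result to int
val_size = int(train_size / 10)   # or equivalently: train_size // 10
```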
If you have any questions, please email me at peiyunh@cs.cmu.edu.