This document explains how the code in this repository can be used to produce the results reported in the following paper:
Deep Learning on Small Datasets without Pre-Training using Cosine Loss.
Björn Barz and Joachim Denzler.
IEEE Winter Conference on Applications of Computer Vision (WACV), 2020.
The following results are reported in Table 2 of the paper:
| Loss Function | CUB | NAB | Cars | Flowers | MIT 67 Scenes | CIFAR-100 |
|---|---|---|---|---|---|---|
| cross entropy | 51.9% | 59.4% | 78.2% | 67.3% | 44.3% | 77.0% |
| cross entropy + label smoothing | 55.9% | 68.3% | 78.1% | 66.8% | 38.7% | 77.5% |
| cosine loss | 67.6% | 71.7% | 84.3% | 71.1% | 51.5% | 75.3% |
| cosine loss + cross entropy | 68.0% | 71.9% | 85.0% | 70.6% | 52.7% | 76.4% |
The scripts require the following software:
- Python >= 3.5
- numpy
- numexpr
- keras >= 2.2.0
- tensorflow (we used v1.8)
- sklearn
- scipy
- pillow
The following datasets have been used in the paper:
- Caltech UCSD Birds-200-2011 (CUB)
- North American Birds (NAB-large)
- Stanford Cars (Cars)
- Oxford Flowers-102 (Flowers)
- MIT 67 Indoor Scenes (MIT67Scenes)
- CIFAR-100 (CIFAR-100)
The names in parentheses specify the dataset names that can be passed to the scripts mentioned below.
In the example Python script calls below, replace $DS with the name of the dataset (see above),
$DSROOT with the path to that dataset, and $LR with the maximum learning rate for SGDR.
To save the model after training has completed, add --model_dump followed by the name of the file to which the model definition and weights should be written.
python learn_classifier.py \
--dataset $DS --data_root $DSROOT --sgdr_max_lr $LR \
--architecture resnet-50 --batch_size 96 \
--gpus 4 --read_workers 16 --queue_size 32 --gpu_merge

For label smoothing, add --label_smoothing 0.1.
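To illustrate what a smoothing factor of 0.1 does to the training targets, here is a minimal sketch of the common label smoothing formulation, in which the one-hot target is mixed with a uniform distribution over all classes. The function name `smooth_labels` is hypothetical, and it is an assumption that `--label_smoothing` follows exactly this formulation.

```python
def smooth_labels(class_index, num_classes, smoothing=0.1):
    """Return a smoothed target distribution for a single sample.

    Assumed formulation: (1 - smoothing) * one_hot + smoothing / num_classes.
    """
    off_value = smoothing / num_classes
    on_value = 1.0 - smoothing + off_value
    return [on_value if k == class_index else off_value
            for k in range(num_classes)]


# Example: class 2 out of 5 classes with a smoothing factor of 0.1.
targets = smooth_labels(class_index=2, num_classes=5, smoothing=0.1)
print(targets)
```

The smoothed targets still sum to one, so they can be used with the usual categorical cross-entropy.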
python learn_image_embeddings.py \
--dataset $DS --data_root $DSROOT --sgdr_max_lr $LR \
--embedding onehot --architecture resnet-50 --batch_size 96 \
--gpus 4 --read_workers 16 --queue_size 32 --gpu_merge

For the combined cosine + cross-entropy loss, add --cls_weight 0.1.
To use semantic embeddings instead of one-hot vectors, pass a path to one of the embedding files in the embeddings directory to --embedding instead of onehot.
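As a rough illustration of the two objectives selected by these options, the following sketch computes the cosine loss between a feature vector and a one-hot target, and a combined loss that adds a weighted softmax cross-entropy term. It is a plain-Python sketch for a single sample, not the code used by `learn_image_embeddings.py`. All function names are hypothetical, and it is an assumption that `--cls_weight` acts as the multiplier of the cross-entropy term and that the classification logits come from a separate layer.

```python
import math


def cosine_loss(features, target):
    """One minus the cosine similarity between two vectors."""
    dot = sum(f * t for f, t in zip(features, target))
    norm_f = math.sqrt(sum(f * f for f in features))
    norm_t = math.sqrt(sum(t * t for t in target))
    # Guard against division by zero for all-zero vectors.
    return 1.0 - dot / max(norm_f * norm_t, 1e-12)


def softmax_cross_entropy(logits, class_index):
    """Negative log-probability of the true class under a softmax."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[class_index]


def combined_loss(features, logits, class_index, cls_weight=0.1):
    """Cosine loss plus cls_weight times cross-entropy (assumed form)."""
    target = [1.0 if k == class_index else 0.0 for k in range(len(features))]
    return (cosine_loss(features, target)
            + cls_weight * softmax_cross_entropy(logits, class_index))


# A feature vector pointing exactly at class 1 has zero cosine loss.
print(cosine_loss([0.0, 3.0, 0.0], [0.0, 1.0, 0.0]))
print(combined_loss([0.0, 3.0, 0.0], [0.0, 0.0, 0.0], class_index=1))
```

Because the cosine loss depends only on the direction of the feature vector, scaling the features does not change it. With semantic embeddings, the one-hot target would be replaced by the embedding of the true class.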
For the CIFAR-100 dataset, use the following parameters:
python learn_classifier.py \
--dataset CIFAR-100 --data_root $DSROOT --sgdr_max_lr $LR \
--architecture resnet-110-wfc --batch_size 100
python learn_image_embeddings.py \
--dataset CIFAR-100 --data_root $DSROOT --sgdr_max_lr $LR \
--embedding onehot --architecture resnet-110-wfc --batch_size 100

For each dataset and loss function, we tuned the learning rate individually by wrapping the training script calls in a bash loop like the following (shown here for training with the cosine loss on CIFAR-100):
for LR in 2.5 1.0 0.5 0.1 0.05 0.01 0.005 0.001; do
echo $LR
python learn_image_embeddings.py \
--dataset CIFAR-100 --data_root $DSROOT --sgdr_max_lr $LR \
--embedding onehot --architecture resnet-110-wfc --batch_size 100 \
2>/dev/null | grep -oP "val_(prob_)?acc: \K([0-9.]+)" | sort -n | tail -n 1
done

The following table lists the values for --sgdr_max_lr that led to the best results.
| Loss | CUB | NAB | Cars | Flowers | MIT 67 Scenes | CIFAR-100 |
|---|---|---|---|---|---|---|
| cross entropy | 0.05 | 0.05 | 1.0 | 1.0 | 0.05 | 0.1 |
| cross entropy + label smoothing | 0.05 | 0.1 | 1.0 | 0.1 | 1.0 | 0.1 |
| cosine loss (one-hot) | 0.5 | 0.5 | 1.0 | 0.5 | 2.5 | 0.05 |
| cosine loss + cross entropy (one-hot) | 0.5 | 0.5 | 0.5 | 0.5 | 2.5 | 0.1 |
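The tuning loop above picks the best validation accuracy of each run with `grep`, `sort -n`, and `tail -n 1`. If you prefer to post-process saved training logs in Python, the same extraction can be sketched as follows. The regular expression mirrors the one passed to `grep`; the function name and the sample log lines are made up for illustration and do not reproduce actual output of the training scripts.

```python
import re

# Same pattern as in the grep call: matches "val_acc: X" and "val_prob_acc: X".
VAL_ACC_PATTERN = re.compile(r"val_(?:prob_)?acc: ([0-9.]+)")


def best_validation_accuracy(log_text):
    """Return the highest validation accuracy found in a log, or None."""
    values = [float(match) for match in VAL_ACC_PATTERN.findall(log_text)]
    return max(values) if values else None


# Hypothetical log excerpt, for illustration only.
sample_log = (
    "Epoch 1/2 - loss: 0.9 - val_loss: 0.8 - val_acc: 0.4100\n"
    "Epoch 2/2 - loss: 0.7 - val_loss: 0.6 - val_prob_acc: 0.5350\n"
)
print(best_validation_accuracy(sample_log))
```

Taking the maximum over all epochs corresponds to `sort -n | tail -n 1` in the shell pipeline.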
To experiment with differently sized variants of the CUB dataset, download the modified image list files and unzip the obtained archive into the root directory of your CUB dataset.
For training, specify the dataset name as CUB-subX, where X is the number of samples per class.
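If you want to launch several subset experiments from a script, the dataset name and the command line can be assembled programmatically. The sketch below only builds and prints the command; it uses the options shown earlier in this document and omits the multi-GPU options. The helper names, the data path, the subset size of 10, and the learning rate of 0.5 are placeholders, and which values of X are available depends on the image list files you downloaded.

```python
import shlex


def cub_subset_name(samples_per_class):
    """Build the dataset name for a CUB subset, e.g. CUB-sub10."""
    return "CUB-sub{}".format(samples_per_class)


def build_training_command(dataset, data_root, max_lr):
    """Assemble the argument list for training with the cosine loss."""
    return [
        "python", "learn_image_embeddings.py",
        "--dataset", dataset,
        "--data_root", data_root,
        "--sgdr_max_lr", str(max_lr),
        "--embedding", "onehot",
        "--architecture", "resnet-50",
        "--batch_size", "96",
    ]


# Placeholder values: adjust the path, subset size, and learning rate.
command = build_training_command(cub_subset_name(10), "/path/to/CUB", 0.5)
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list can be passed to `subprocess.run` unchanged, which avoids shell quoting problems with paths that contain spaces.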
