In ECCV 2020 [pdf]
- TensorFlow 1.3
- Python 2.7
To train and test the model with this code, the dataset must be in TensorFlow's tfrecord format. The tfrecord files contain:
- images
- annotations (i.e., labels)
- CNN prediction probabilities (from a base CNN trained with sigmoid cross-entropy on the same dataset).
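As a rough illustration of the three items listed above, the sketch below bundles them for one image into a plain Python dict, as one might do before serialization. The field names (`image`, `labels`, `cnn_probs`), the multi-hot label encoding, the helper `make_record`, and the three-word vocabulary are all assumptions made for illustration; the actual layout is defined by the scripts in data_utils.

```python
# Hypothetical sketch: field names, the multi-hot encoding, and the tiny
# vocabulary are assumptions, not the repository's actual record layout.

def make_record(image_bytes, label_names, cnn_probs, vocabulary):
    """Bundle the three per-image items into one dict prior to serialization."""
    if len(cnn_probs) != len(vocabulary):
        raise ValueError("expected one CNN probability per vocabulary label")
    index = {name: i for i, name in enumerate(vocabulary)}
    # Multi-hot vector: 1 where the label is annotated for this image, else 0.
    multi_hot = [0] * len(vocabulary)
    for name in label_names:
        multi_hot[index[name]] = 1
    return {
        "image": image_bytes,                        # raw encoded image bytes
        "labels": multi_hot,                         # ground-truth annotations
        "cnn_probs": [float(p) for p in cnn_probs],  # base CNN sigmoid outputs
    }


vocabulary = ["person", "dog", "car"]  # illustrative label set
record = make_record(b"\xff\xd8", ["person", "car"], [0.9, 0.1, 0.7], vocabulary)
print(record["labels"])
```

In a real conversion, each such bundle would be written out as one serialized example in a tfrecord file.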
To convert a dataset to tfrecord format, run:
python data_utils/download_and_convert_data.py --dataset_name=coco --dataset_dir=../data/coco
Please refer to the scripts in data_utils for details.
The dataset split we used is given in the data folder. The images can be obtained from the respective dataset pages. The ResNet features and prediction probabilities used to create the tfrecord files can be downloaded from the following:
To train or test the model, run:
python runner.py \
--train_dir=${TRAIN_DIR} \
--dataset_dir=${DATASET_DIR} \
--dataset_name=coco \
--model=lstm_sem_multi_order \
--dim_hidden=512 \
--dim_embed=256 \
--prev2out=False \
--ctx2out=False \
--run_opt=test \
--batch_size=32 \
--eval_batch_size=100 \
--loss=sigmoid
Here, run_opt is either 'train' or 'test', and dataset_dir is the directory that contains the dataset .tfrecord files.
The same command is provided in the bash script run.sh.
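To switch between training and testing without editing the command by hand, the flags above can be assembled programmatically. The following is a minimal sketch: the helper `build_runner_command` and the example paths are hypothetical, while the flag names and values are copied from the command above.

```python
# Hypothetical helper: assembles the runner.py invocation shown above.
# Only the flags listed in this README are used; the paths are placeholders.

def build_runner_command(run_opt, train_dir, dataset_dir):
    """Return the runner.py command as an argument list for train or test."""
    if run_opt not in ("train", "test"):
        raise ValueError("run_opt must be 'train' or 'test'")
    flags = [
        ("train_dir", train_dir),
        ("dataset_dir", dataset_dir),
        ("dataset_name", "coco"),
        ("model", "lstm_sem_multi_order"),
        ("dim_hidden", 512),
        ("dim_embed", 256),
        ("prev2out", False),
        ("ctx2out", False),
        ("run_opt", run_opt),
        ("batch_size", 32),
        ("eval_batch_size", 100),
        ("loss", "sigmoid"),
    ]
    return ["python", "runner.py"] + ["--{}={}".format(k, v) for k, v in flags]


cmd = build_runner_command("train", "/tmp/mornn_train", "../data/coco")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.call(cmd)` from the repository root to launch the run.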
If you find our work useful in your research, please consider citing us:
@inproceedings{ayu2020mornn,
title = {Recurrent Image Annotation With Explicit Inter-Label Dependencies},
author = {Ayushi Dutta and Yashaswi Verma and C.V. Jawahar},
booktitle = {European Conference on Computer Vision (ECCV)},
year = {2020}
}
Email (first author): ayushi.dutta@alumni.iiit.ac.in