We recommend the following dependencies.
```python
import nltk
nltk.download('punkt')  # Punkt tokenizer models
```
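As a quick check that `punkt` is available, you can tokenize a sample caption with NLTK's word tokenizer, which relies on the Punkt models; the sentence below is only an example, not taken from the dataset:

```python
from nltk.tokenize import word_tokenize

# word_tokenize uses the Punkt models downloaded above
caption = "A man riding a wave on top of a surfboard."
print(word_tokenize(caption.lower()))
# ['a', 'man', 'riding', 'a', 'wave', 'on', 'top', 'of', 'a', 'surfboard', '.']
```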
Download the dataset files. We use the splits produced by Andrej Karpathy. The raw images can be downloaded from their original sources here, here, and here.
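If you work from the raw images, the Karpathy splits are commonly distributed as one JSON file per dataset. The filename and fields below (e.g. `dataset_coco.json`, `split`, `sentences`) are assumptions based on that common format, not files shipped with this repository:

```python
import json

# Hypothetical path and layout of the Karpathy split file.
with open("dataset_coco.json") as f:
    data = json.load(f)

train_imgs = [img for img in data["images"] if img["split"] == "train"]
print(len(train_imgs), "training images")
print(train_imgs[0]["filename"], "|", train_imgs[0]["sentences"][0]["raw"])
```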
The precomputed image features are extracted from the raw images with the bottom-up attention model from here. The image features for the training, validation, and test sets should each be merged, in order, into a single `.npy` file. More details about the image feature extraction can also be found in SCAN (https://github.com/kuanghuei/SCAN).
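A minimal sketch of that merge step, assuming the bottom-up attention extractor wrote one feature array per image (the per-image filenames, directory names, and output path below are hypothetical):

```python
import numpy as np

def merge_features(image_ids, feature_dir, out_path):
    # Stack per-image region features, in the split's image order, into one
    # array of shape (num_images, num_regions, feat_dim) and save it as .npy.
    feats = [np.load(f"{feature_dir}/{img_id}.npy") for img_id in image_ids]
    np.save(out_path, np.stack(feats, axis=0))

# One merged file per split, e.g.:
# merge_features(train_ids, "bottom_up_features", "train_ims.npy")
```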
The data files can be found in SCAN (we use the same dataset splits as theirs):
```bash
wget https://scanproject.blob.core.windows.net/scan-data/data_no_feature.zip
```
Place `data_no_feature.zip` in the `data` directory.
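If the archive also needs to be unpacked under `data` (an assumption; adjust to the layout your setup expects), a Python equivalent of the usual `unzip` step is:

```python
import os
import zipfile

# Extract data_no_feature.zip into the data directory (assumed layout).
os.makedirs("data", exist_ok=True)
with zipfile.ZipFile("data_no_feature.zip") as zf:
    zf.extractall("data")
print(sorted(os.listdir("data")))
```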
Then run the tuning script for MS-COCO:

```bash
./script/tune_coco.sh
```