# ICAP_TOOL

`icap_tool` is a command-line utility written in Python that lets you flexibly choose the dataset size, the model types, and the number of training epochs.

`icap_tool` has three main subcommands: train, predict and evaluate.

In [3]:
!python icap_tool.py --help

Usage: icap_tool.py [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  evaluate
  predict
  train


## 'Train' subcommand

1) Dataset selection<br/>
By default, the command uses all 30k images of the Flickr30k dataset and splits them into an 80% training set and a 20% test set.<br/>
However, you can use part of the data for a quick run. For example, the "-d Flickr30k_8000_0.8" option uses only 8000 images, with an 80%:20% split between train and test.

2) Model selection<br/>
There are 4 models available: cascaded_encoder_decoder, merged_encoder_decoder, encoder_decoder_with_attention and encoder_decoder_with_transformer. On the command line they are selected by the short names cascade, merge, attention and transformer.<br/>
You can select a single model, or multiple models separated by commas without spaces. For example, "-m transformer,merge" will run training for the merge model and the transformer model.

3) Epoch selection<br/>
You can specify a comma-separated list of epoch counts. Training runs up to the largest value, and the trained weights are saved at each listed epoch for later use in prediction and evaluation. For example, "-e 30,50,100" will run 100 epochs in total and save the weights at the end of epochs 30, 50 and 100.
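
The three option formats above can be illustrated with a small parsing sketch. This is not the code used inside `icap_tool.py` (its implementation is not shown here); `DatasetSpec`, `parse_dataset`, `parse_models` and `parse_epochs` are hypothetical helpers that only mirror the behaviour described in the text, and the `name_count_ratio` layout of the dataset string is inferred from the "Flickr30k_8000_0.8" example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DatasetSpec:
    """Hypothetical container for a parsed -d/--dataset value."""
    name: str                    # dataset name, e.g. "Flickr30k"
    image_count: Optional[int]   # None means "use every image"
    train_ratio: float           # fraction of images used for training


def parse_dataset(spec: str, default_ratio: float = 0.8) -> DatasetSpec:
    # Assumed layout: "<name>[_<image count>[_<train ratio>]]"
    parts = spec.split("_")
    name = parts[0]
    image_count = int(parts[1]) if len(parts) > 1 else None
    train_ratio = float(parts[2]) if len(parts) > 2 else default_ratio
    return DatasetSpec(name, image_count, train_ratio)


def parse_models(spec: str) -> List[str]:
    # Comma-separated model names, no spaces, e.g. "transformer,merge"
    return [name for name in spec.split(",") if name]


def parse_epochs(spec: str) -> Tuple[int, List[int]]:
    # Returns (total epochs to run, epochs at which weights are saved)
    checkpoints = sorted({int(e) for e in spec.split(",") if e})
    return checkpoints[-1], checkpoints


dataset = parse_dataset("Flickr30k_8000_0.8")
n_train = round(dataset.image_count * dataset.train_ratio)  # 6400 images
n_test = dataset.image_count - n_train                      # 1600 images

models = parse_models("transformer,merge")
total_epochs, save_points = parse_epochs("30,50,100")
print(dataset, n_train, n_test, models, total_epochs, save_points)
```

With the example values from the text, 8000 images at a 0.8 ratio give 6400 training and 1600 test images, and "30,50,100" yields a 100-epoch run with three save points.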

In [4]:
!python icap_tool.py train --help

Usage: icap_tool.py train [OPTIONS]

Options:
  -d, --dataset TEXT  Dataset name  [default: Flickr30k]
  -m, --models TEXT   List of models  [default:
                      transformer,attention,merge,cascade]
  -e, --epochs TEXT   Save weights after given number of epochs  [default:
                      10,20,30,40,50]
  --help              Show this message and exit.


In [None]:
# In a Jupyter notebook, you can call the train function directly
from icap_tool import train
train('Flickr30k', 'transformer,attention,merge,cascade', '10,20,30,40,50')

## 'Predict' subcommand

The predict command generates captions for the test images using the weights saved by the 'train' command above.<br/>
You can specify the dataset, models and epochs in the same way as for the 'train' command.<br/>
The generated captions are saved to files and used later for evaluation.

In [5]:
!python icap_tool.py predict --help

Usage: icap_tool.py predict [OPTIONS]

Options:
  -d, --dataset TEXT   Dataset name  [default: Flickr30k]
  -m, --models TEXT    List of models  [default:
                       transformer,attention,merge,cascade]
  -e, --epochs TEXT    Load pretrained weights after given number of epochs
                       [default: 10,20,30,40,50]
  -n, --count INTEGER  Number of images to predict. (0 means all test images)
                       [default: 0]
  --help               Show this message and exit.


In [None]:
# In a Jupyter notebook, you can call the predict function directly
from icap_tool import predict
predict('Flickr30k', 'transformer,attention,merge,cascade', '10,20,30,40,50', 0)

## 'Evaluate' subcommand

The evaluate command calculates the BLEU, ROUGE and METEOR scores by comparing the captions generated by the models with the reference captions supplied with the original images.<br/>
The evaluation scores are saved to files per model and per epoch.<br/>
The '-c' option averages the individual evaluation results and consolidates them into a single file.
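
As a rough illustration of what the consolidation step does, the sketch below averages per-image scores for each (model, epoch) run and pivots them into one table with a column per model. It is an assumption-laden sketch, not the tool's implementation: `consolidate` is a hypothetical helper, the layout of the per-run score files is not documented here, and the numbers are made up purely to exercise the code.

```python
import pandas as pd


def consolidate(scores_by_run: dict) -> pd.DataFrame:
    """Average each run's scores and build one metric/epoch/model table.

    scores_by_run maps (model, epoch) to a DataFrame with one row per
    image and one column per metric (an assumed layout).
    """
    rows = []
    for (model, epoch), scores in scores_by_run.items():
        means = scores.mean(numeric_only=True)  # one mean per metric
        for metric, value in means.items():
            rows.append({"metric": metric, "epoch": epoch,
                         "model": model, "score": value})
    long_form = pd.DataFrame(rows)
    # One row per (metric, epoch), one column per model
    table = long_form.pivot_table(index=["metric", "epoch"],
                                  columns="model", values="score")
    table.columns.name = None
    return table.reset_index()


# Made-up per-image scores for two runs (illustrative values only)
runs = {
    ("merge", 10): pd.DataFrame({"BLEU-1": [0.5, 0.7], "BLEU-2": [0.2, 0.4]}),
    ("transformer", 10): pd.DataFrame({"BLEU-1": [0.6, 0.8], "BLEU-2": [0.3, 0.5]}),
}

table = consolidate(runs)
print(table.to_string())
```

The resulting frame has `metric` and `epoch` columns followed by one column per model, which is the same shape as the consolidated file loaded later in this notebook.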

In [6]:
!python icap_tool.py evaluate --help

Usage: icap_tool.py evaluate [OPTIONS]

Options:
  -d, --dataset TEXT  Dataset name  [default: Flickr30k]
  -m, --models TEXT   List of models  [default:
                      transformer,attention,merge,cascade]
  -e, --epochs TEXT   Load pretrained weights after given number of epochs
                      [default: 10,20,30,40,50]
  -c, --consolidate   Create consolidated evaluation table
  --help              Show this message and exit.


In [None]:
# In a Jupyter notebook, you can call the evaluate function directly
from icap_tool import evaluate
evaluate('Flickr30k', 'transformer,attention,merge,cascade', '10,20,30,40,50', False)

In [10]:
evaluate('Flickr30k', 'transformer,attention,merge,cascade', '10,20,30,40,50', True)

Loading prebuilt vocabulary ./workspace/Flickr30k-vocab.pkl ... completed
Loading caption sequences ./workspace/Flickr30k-caption_sequences.pkl ... completed
Loading prebuilt embedding matrix ./workspace/Flickr30k-embedding_matrix_fasttext.pkl ... completed
Building image features ./workspace/Flickr30k-vgg16-no_include_top ... 30000/30000 processed. 100% completed
completed
Loading eval_scores ./workspace/Flickr30k-transformer_model/eval_scores-10.csv
Loading eval_scores ./workspace/Flickr30k-transformer_model/eval_scores-20.csv
Loading eval_scores ./workspace/Flickr30k-transformer_model/eval_scores-30.csv
Loading eval_scores ./workspace/Flickr30k-transformer_model/eval_scores-40.csv
Loading eval_scores ./workspace/Flickr30k-transformer_model/eval_scores-50.csv
Loading prebuilt vocabulary ./workspace/Flickr30k-vocab.pkl ... completed
Loading caption sequences ./workspace/Flickr30k-caption_sequences.pkl ... completed
Loading prebuilt embedding matrix ./workspace/Flickr30k-embedding_matr

In [12]:
import pandas as pd

df = pd.read_csv('workspace/Flickr30k-eval.csv')
print(df.to_string()) 

     metric  epoch  cascade   merge  attention  transformer
0    BLEU-1     10   0.5335  0.5261     0.5248       0.5368
1    BLEU-1     20   0.5186  0.5243     0.5068       0.5395
2    BLEU-1     30   0.5180  0.5097     0.5064       0.5405
3    BLEU-1     40   0.5262  0.5024     0.4987       0.5279
4    BLEU-1     50   0.5109  0.5159     0.4912       0.5245
5    BLEU-2     10   0.2669  0.2681     0.2698       0.2911
6    BLEU-2     20   0.2443  0.2651     0.2512       0.2931
7    BLEU-2     30   0.2668  0.2532     0.2465       0.2917
8    BLEU-2     40   0.2697  0.2544     0.2359       0.2833
9    BLEU-2     50   0.2639  0.2572     0.2290       0.2783
10   BLEU-3     10   0.0789  0.1002     0.1156       0.1306
11   BLEU-3     20   0.0888  0.1049     0.1027       0.1291
12   BLEU-3     30   0.1054  0.1036     0.0974       0.1310
13   BLEU-3     40   0.1066  0.1067     0.0904       0.1256
14   BLEU-3     50   0.1066  0.1071     0.0865       0.1211
15   BLEU-4     10   0.0232  0.0358     