# Train the first set of one-vs-rest (OVR) classifiers

We train a set of classifiers that are used to compute $ClassSim$, the measure defined in our paper for estimating the similarity between classes.  
These classifiers are also used for classification tasks.  
The modules used in this notebook are defined in the *models* directory.

## Set up

In [1]:
import os
import sys

import numpy as np

import pandas as pd
import glob

import warnings
warnings.filterwarnings('ignore')

In [2]:
BASE_MODEL_PATH="trained_model"
%mkdir -p $BASE_MODEL_PATH

In [3]:
from models.modelutils import ModelCompiler

Using TensorFlow backend.


In [4]:
compiler = ModelCompiler(BASE_MODEL_PATH)

In [5]:
from models.processor import create_generators

TRAIN_DATAGEN, VALID_DATAGEN = create_generators()

Load category information and all image paths.

In [6]:
from models.modelutils import dir2filedict_sorted, split_fdict

In [7]:
trdict = dir2filedict_sorted("data_fgvc/train")

In [8]:
valdict = dir2filedict_sorted("data_fgvc/valid")

In [9]:
categories = [str(i) for i in range(0, 100)]

In [10]:
categories[99]

'99'

In [11]:
valdict['0'][0:5]

['data_fgvc/valid/0/0062781.jpg',
 'data_fgvc/valid/0/0113201.jpg',
 'data_fgvc/valid/0/0450014.jpg',
 'data_fgvc/valid/0/0602177.jpg',
 'data_fgvc/valid/0/0716386.jpg']

Here is the expected output.   
The corresponding output must be identical across {*train.ipynb*, *classifier_similarity.ipynb*, *train_multiclass_classifier.ipynb*, *train_second.ipynb*}. 

['data_fgvc/valid/0/0062781.jpg',  
 'data_fgvc/valid/0/0113201.jpg',  
 'data_fgvc/valid/0/0450014.jpg',  
 'data_fgvc/valid/0/0602177.jpg',  
 'data_fgvc/valid/0/0716386.jpg']
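
The helper `dir2filedict_sorted` is defined in the *models* directory and its source is not shown in this notebook. Judging only from the outputs above, it appears to map each category directory name to a sorted list of image paths, which is what makes the listing reproducible across notebooks. The following is a minimal sketch under that assumption; `dir2filedict_sorted_sketch` is a hypothetical stand-in, not the actual implementation.

```python
import os


def dir2filedict_sorted_sketch(basedir):
    """Map each sub-directory name of basedir to a sorted list of its file paths.

    Hypothetical stand-in for models.modelutils.dir2filedict_sorted.
    """
    fdict = {}
    for cat in sorted(os.listdir(basedir)):
        catdir = os.path.join(basedir, cat)
        if not os.path.isdir(catdir):
            continue  # skip stray files next to the category directories
        fdict[cat] = sorted(
            os.path.join(catdir, name)
            for name in os.listdir(catdir)
            if os.path.isfile(os.path.join(catdir, name))
        )
    return fdict
```

Sorting matters here because `os.listdir` returns entries in arbitrary order, so an unsorted listing could differ between machines or runs.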

In [12]:
len(trdict['10'])

60

## Train classifiers

In [13]:
from models.one_vs_all import OneVsAllModelTrainer
from models.modelutils import split_files

In [14]:
trainer = OneVsAllModelTrainer(TRAIN_DATAGEN, VALID_DATAGEN)

In [15]:
def train_one_category(cat, epoch=5):
    model_path = "{}/modelfgcv_{}".format(BASE_MODEL_PATH, cat)
    model = compiler.generate_compiled_model(model_path)
    
    trainer.set_model(model)
    trainer.set_savepath(model_path)
    
    true_train, false_train = split_files(cat, trdict)
    true_valid, false_valid = split_files(cat, valdict)
    
    trainer.set_dataset_files(true_train, false_train, true_valid, false_valid)
    trainer.train_model(eachepochs=epoch, hard_coded_steps_per_epoch=(100, 10))
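
For a one-vs-rest classifier, the files of the target category act as positive examples and the files of every other category act as negatives. The source of `split_files` is not shown here, so the following is only a sketch of that idea, assuming the file dictionary maps category names to lists of paths; `split_files_sketch` and the toy dictionary are hypothetical.

```python
def split_files_sketch(target_cat, fdict):
    """Return (positives, negatives) for a one-vs-rest classifier of target_cat.

    Hypothetical stand-in for models.modelutils.split_files.
    """
    positives = list(fdict[target_cat])
    negatives = [
        path
        for cat in sorted(fdict)  # fixed order keeps the split reproducible
        if cat != target_cat
        for path in fdict[cat]
    ]
    return positives, negatives


# Toy example with three categories.
toy = {"0": ["0/a.jpg", "0/b.jpg"], "1": ["1/c.jpg"], "2": ["2/d.jpg"]}
pos, neg = split_files_sketch("0", toy)
print(pos)  # ['0/a.jpg', '0/b.jpg']
print(neg)  # ['1/c.jpg', '2/d.jpg']
```

With 100 categories, the negative set is far larger than the positive set, so the real trainer presumably balances the two when drawing batches; that detail is not visible in this notebook.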

In [16]:
train_one_category(categories[0])

Epoch 1/5
Epoch 00001: saving model to trained_model/modelfgcv_0-01-0.938.h5
 - 123s - loss: 0.2806 - acc: 0.8756 - val_loss: 0.1811 - val_acc: 0.9375
Epoch 2/5
Epoch 00002: saving model to trained_model/modelfgcv_0-02-0.944.h5
 - 79s - loss: 0.1019 - acc: 0.9632 - val_loss: 0.2272 - val_acc: 0.9437
Epoch 3/5
Epoch 00003: saving model to trained_model/modelfgcv_0-03-0.825.h5
 - 78s - loss: 0.0428 - acc: 0.9881 - val_loss: 0.4716 - val_acc: 0.8250
Epoch 4/5
Epoch 00004: saving model to trained_model/modelfgcv_0-04-0.825.h5
 - 79s - loss: 0.0433 - acc: 0.9875 - val_loss: 0.6187 - val_acc: 0.8250
Epoch 5/5
Epoch 00005: saving model to trained_model/modelfgcv_0-05-0.906.h5
 - 78s - loss: 0.0491 - acc: 0.9813 - val_loss: 0.2660 - val_acc: 0.9062
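
In the log above, validation accuracy peaks at epoch 2 (0.9437) and then fluctuates, so the last checkpoint is not necessarily the best one. The saved file names appear to follow the pattern `<prefix>-<epoch>-<val_acc>.h5`. Assuming that pattern holds, the checkpoint with the highest validation accuracy can be picked from the file names alone; `best_checkpoint` below is a hypothetical helper, not part of the *models* package.

```python
import glob
import os
import re

# Assumed file name suffix: "-<epoch>-<val_acc>.h5", e.g. "-02-0.944.h5".
CKPT_PATTERN = re.compile(r"-(\d+)-(\d+\.\d+)\.h5$")


def best_checkpoint(model_prefix):
    """Return the checkpoint path with the highest val_acc, or None if none exist."""
    candidates = []
    for path in glob.glob(glob.escape(model_prefix) + "-*.h5"):
        match = CKPT_PATTERN.search(os.path.basename(path))
        if match:
            epoch, val_acc = int(match.group(1)), float(match.group(2))
            candidates.append((val_acc, epoch, path))
    if not candidates:
        return None
    # Highest val_acc wins; ties are broken in favour of the later epoch.
    return max(candidates)[2]
```

For category 0, `best_checkpoint("trained_model/modelfgcv_0")` would select the epoch-2 file if the files listed in the log are present on disk.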


In [None]:
for cat in categories:
    train_one_category(cat)