# Octopod Ensemble Model Training Pipeline

As the fourth (and final) step of this tutorial, we will train an ensemble model using the image and text models we've already trained.

This notebook was run on an AWS p3.2xlarge

In [1]:
%load_ext autoreload

%autoreload 2

In [2]:
import sys
sys.path.append('../../')

In [3]:
import joblib
import pandas as pd

import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torch.utils.data import DataLoader
from transformers import AdamW, BertTokenizer, get_cosine_schedule_with_warmup

from octopod.learner import MultiTaskLearner, MultiInputMultiTaskLearner
from octopod.dataloader import MultiDatasetLoader
from octopod.ensemble import OctopodEnsembleDataset, BertResnetEnsembleForMultiTaskClassification

## Load in train and validation datasets

First we load in the csv's we created in Step 1.
Remember to change the path if you stored your data somewhere other than the default.

In [4]:
TRAIN_COLOR_DF = pd.read_csv('data/color_swatches/color_train.csv')

In [5]:
VALID_COLOR_DF = pd.read_csv('data/color_swatches/color_valid.csv')

In [6]:
TRAIN_PATTERN_DF = pd.read_csv('data/pattern_swatches/pattern_train.csv')

In [7]:
VALID_PATTERN_DF = pd.read_csv('data/pattern_swatches/pattern_valid.csv')

You will most likely have to alter this to however big your batches can be on your machine

In [8]:
batch_size = 16

In [9]:
bert_tok = BertTokenizer.from_pretrained(
    'bert-base-uncased',
    do_lower_case=True
)

max_seq_length = 128 

In [10]:
color_train_dataset = OctopodEnsembleDataset(
    text_inputs=TRAIN_COLOR_DF['complex_color'],
    img_inputs=TRAIN_COLOR_DF['image_locs'],
    y=TRAIN_COLOR_DF['simple_color_cat'],
    tokenizer=bert_tok,
    max_seq_length=max_seq_length,
    transform='train',
    crop_transform='train'

)
color_valid_dataset = OctopodEnsembleDataset(
    text_inputs=VALID_COLOR_DF['complex_color'],
    img_inputs=VALID_COLOR_DF['image_locs'],
    y=VALID_COLOR_DF['simple_color_cat'],
    tokenizer=bert_tok,
    max_seq_length=max_seq_length,
    transform='val',
    crop_transform='val'

)

pattern_train_dataset = OctopodEnsembleDataset(
    text_inputs=VALID_PATTERN_DF['fake_text'],
    img_inputs=VALID_PATTERN_DF['image_locs'],
    y=VALID_PATTERN_DF['pattern_type_cat'],
    tokenizer=bert_tok,
    max_seq_length=max_seq_length,
    transform='train',
    crop_transform='train'

)
pattern_valid_dataset = OctopodEnsembleDataset(
    text_inputs=VALID_PATTERN_DF['fake_text'],
    img_inputs=VALID_PATTERN_DF['image_locs'],
    y=VALID_PATTERN_DF['pattern_type_cat'],
    tokenizer=bert_tok,
    max_seq_length=max_seq_length,
    transform='val',
    crop_transform='val'
)

We then put the datasets into a dictionary of dataloaders.

Each task is a key.

In [11]:
train_dataloaders_dict = {
    'color': DataLoader(color_train_dataset, batch_size=batch_size, shuffle=True, num_workers=2),
    'pattern': DataLoader(pattern_train_dataset, batch_size=batch_size, shuffle=True, num_workers=2),
}
valid_dataloaders_dict = {
    'color': DataLoader(color_valid_dataset, batch_size=batch_size, shuffle=False, num_workers=2),
    'pattern': DataLoader(pattern_valid_dataset, batch_size=batch_size, shuffle=False, num_workers=2),
}

In [12]:
TrainLoader = MultiDatasetLoader(loader_dict=train_dataloaders_dict)
len(TrainLoader)

23

In [13]:
ValidLoader = MultiDatasetLoader(
    loader_dict=valid_dataloaders_dict,
    shuffle=False
)
len(ValidLoader)

9

Create Model and Learner
===

Since the image model could potentially have multiple Resnets for different subsets of tasks, we need to create an `image_task_dict` that splits up the tasks grouped by the Resnet they use.

In [14]:
image_task_dict = {
    'color_pattern': {
        'color': TRAIN_COLOR_DF['simple_color_cat'].nunique(),
        'pattern': TRAIN_PATTERN_DF['pattern_type_cat'].nunique()
    }  
}

We still need to create the `new_task_dict` for the learner.

In [15]:
new_task_dict = {
    'color': TRAIN_COLOR_DF['simple_color_cat'].nunique(),
    'pattern': TRAIN_PATTERN_DF['pattern_type_cat'].nunique()
}

In [16]:
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)

cuda:0


We first initialize the model by setting up the right shape with the image_task_dict.

In [17]:
model = BertResnetEnsembleForMultiTaskClassification(
    image_task_dict=image_task_dict
)

We then load in the existing models by specifying the folder where the models live and their id's.

In [18]:
resnet_model_id_dict = {
    'color_pattern': 'IMAGE_MODEL1'
}

In [19]:
model.load_core_models(
    folder='models/',
    bert_model_id='TEXT_MODEL1',
    resnet_model_id_dict=resnet_model_id_dict
)

We've set some helper methods that will freeze the core bert and resnets for you if you only want to train the new layers. As with all other aspects of training, this is likely to require some experimentation to determine what works for your problem.

You will likely need to explore different values in this section to find some that work
for your particular model.

In [20]:
model.freeze_bert()
model.freeze_resnets()

lr_last = 1e-3
lr_main = 1e-5

lr_list = [
    {'params': model.bert.parameters(), 'lr': lr_main},
    {'params': model.dropout.parameters(), 'lr': lr_main},   
    {'params': model.image_resnets.parameters(), 'lr': lr_main},
    {'params': model.image_dense_layers.parameters(), 'lr': lr_main},
    {'params': model.ensemble_layers.parameters(), 'lr': lr_last},
    {'params': model.classifiers.parameters(), 'lr': lr_last},
]

optimizer = optim.Adam(lr_list)

exp_lr_scheduler = lr_scheduler.StepLR(optimizer, step_size= 4, gamma= 0.1)

In [21]:
loss_function_dict = {'color': 'categorical_cross_entropy', 'pattern': 'categorical_cross_entropy'}
metric_function_dict = {'color': 'multi_class_acc', 'pattern': 'multi_class_acc'}

In [22]:
learn = MultiInputMultiTaskLearner(model, TrainLoader, ValidLoader, new_task_dict, loss_function_dict, metric_function_dict)

Train Model
===

As your model trains, you can see some output of how the model is performing overall and how it is doing on each individual task.

In [23]:
learn.fit(
    num_epochs=10,
    scheduler=exp_lr_scheduler,
    step_scheduler_on_batch=False,
    optimizer=optimizer,
    device=device,
    best_model=True
)

train_loss,val_loss,color_train_loss,color_val_loss,color_multi_class_accuracy,pattern_train_loss,pattern_val_loss,pattern_multi_class_accuracy,time
0.255081,0.057677,0.216981,0.027293,0.907407,0.747331,0.188935,0.52,00:03
0.181422,0.025467,0.159878,0.013501,0.898148,0.459776,0.077156,0.6,00:03
0.216635,0.018584,0.188308,0.014491,0.907407,0.582625,0.036265,0.8,00:03
0.164598,0.026894,0.154966,0.014742,0.898148,0.289053,0.079393,0.48,00:03
0.16765,0.024614,0.163939,0.012435,0.907407,0.21559,0.077229,0.52,00:03
0.159685,0.02267,0.153458,0.01131,0.925926,0.240137,0.071746,0.56,00:03
0.117357,0.022851,0.111142,0.009109,0.925926,0.197653,0.082217,0.48,00:03
0.118633,0.023313,0.112404,0.011147,0.898148,0.199106,0.075873,0.52,00:03
0.135601,0.021864,0.122261,0.010357,0.935185,0.307941,0.071577,0.56,00:03
0.164267,0.023728,0.159604,0.011967,0.935185,0.224515,0.074537,0.52,00:03


Epoch 2 best model saved with loss of 0.018583763390779495


Ideally the ensemble would perform better than either the image or text model alone, but our performance is probably suffering due to this being synthetic data.

Checking validation data
===

We provide a method on the learner called `get_val_preds`, which makes predictions on the validation data. You can then use this to analyze your model's performance in more detail.

In [24]:
pred_dict = learn.get_val_preds(device)

In [25]:
pred_dict

{'color': {'y_true': array([1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0,
         0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1,
         0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0,
         1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1,
         0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1]),
  'y_pred': array([[0.01048604, 0.989514  ],
         [0.00740035, 0.9925997 ],
         [0.04395323, 0.9560468 ],
         [0.00924578, 0.99075425],
         [0.01007592, 0.98992413],
         [0.28866163, 0.7113384 ],
         [0.00713036, 0.9928697 ],
         [0.00621065, 0.9937894 ],
         [0.9387955 , 0.06120452],
         [0.03254075, 0.96745926],
         [0.78577715, 0.21422282],
         [0.6562214 , 0.34377858],
         [0.00641926, 0.9935807 ],
         [0.00988812, 0.9901119 ],
         [0.814158  , 0.18584193],
         [0.02132111, 0.9786789 ],
         [0.00856477, 0.9914352 ],
     

Save/Export Model
===

The ensemble model can also be saved or exported.

In [26]:
model.save(folder='models/', model_id='ENSEMBLE_MODEL1')

In [27]:
model.export(folder='models/', model_id='ENSEMBLE_MODEL1')