As the second step of this tutorial, we will train an image model. This step can be run in parallel with Step 3 (training the text model).

This notebook was run on an AWS p3.2xlarge instance.

# Octopod Image Model Training Pipeline

In [23]:
%load_ext autoreload

%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [24]:
import sys
sys.path.append('../../')

In [25]:
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torch.utils.data import Dataset, DataLoader

Note: for images, we use the `MultiInputMultiTaskLearner` because each example is sent to the model as two inputs: the full image and a center crop of that image.
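
To make "center crop" concrete, here is a minimal sketch of how a centered crop box can be computed from an image size. The function name `center_crop_box` and the sizes are hypothetical and chosen only for illustration; this is not Octopod's own cropping code, and the crop size Octopod uses may differ.

```python
def center_crop_box(width, height, crop_width, crop_height):
    """Return (left, top, right, bottom) for a crop centered in the image."""
    if crop_width > width or crop_height > height:
        raise ValueError("crop must fit inside the image")
    left = (width - crop_width) // 2
    top = (height - crop_height) // 2
    return left, top, left + crop_width, top + crop_height


# Hypothetical sizes: a 400x300 image and a 224x224 crop.
box = center_crop_box(400, 300, 224, 224)
print(box)
```

The returned tuple follows the (left, upper, right, lower) convention that Pillow's `Image.crop` accepts.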

In [26]:
from octopod import MultiInputMultiTaskLearner, MultiDatasetLoader
from octopod.vision.dataset import OctopodImageDataset
from octopod.vision.models import ResnetForMultiTaskClassification

## Load in train and validation datasets

First, we load the CSV files we created in Step 1.
Remember to change the path if you stored your data somewhere other than the default.

In [27]:
TRAIN_COLOR_DF = pd.read_csv('data/color_swatches/color_train.csv')

In [28]:
VALID_COLOR_DF = pd.read_csv('data/color_swatches/color_valid.csv')

In [29]:
TRAIN_PATTERN_DF = pd.read_csv('data/pattern_swatches/pattern_train.csv')

In [30]:
VALID_PATTERN_DF = pd.read_csv('data/pattern_swatches/pattern_valid.csv')

You will most likely need to adjust the batch size to the largest value that fits in memory on your machine.

In [31]:
batch_size = 64

We use the `OctopodImageDataset` class to create train and valid datasets for each task.

See the Octopod documentation for information about the transformations.

In [32]:
color_train_dataset = OctopodImageDataset(
    x=TRAIN_COLOR_DF['image_locs'],
    y=TRAIN_COLOR_DF['simple_color_cat'],
    transform='train',
    crop_transform='train'
)
color_valid_dataset = OctopodImageDataset(
    x=VALID_COLOR_DF['image_locs'],
    y=VALID_COLOR_DF['simple_color_cat'],
    transform='val',
    crop_transform='val'
)

pattern_train_dataset = OctopodImageDataset(
    x=TRAIN_PATTERN_DF['image_locs'],
    y=TRAIN_PATTERN_DF['pattern_type_cat'],
    transform='train',
    crop_transform='train'
)
pattern_valid_dataset = OctopodImageDataset(
    x=VALID_PATTERN_DF['image_locs'],
    y=VALID_PATTERN_DF['pattern_type_cat'],
    transform='val',
    crop_transform='val'
)

We then put the datasets into a dictionary of dataloaders.

Each task is a key.

In [33]:
train_dataloaders_dict = {
    'color': DataLoader(color_train_dataset, batch_size=batch_size, shuffle=True, num_workers=2),
    'pattern': DataLoader(pattern_train_dataset, batch_size=batch_size, shuffle=True, num_workers=2),
}
valid_dataloaders_dict = {
    'color': DataLoader(color_valid_dataset, batch_size=batch_size, shuffle=False, num_workers=8),
    'pattern': DataLoader(pattern_valid_dataset, batch_size=batch_size, shuffle=False, num_workers=8),
}

The dictionary of dataloaders is then put into an instance of the Octopod `MultiDatasetLoader` class.

In [34]:
TrainLoader = MultiDatasetLoader(loader_dict=train_dataloaders_dict)
len(TrainLoader)

8

In [35]:
ValidLoader = MultiDatasetLoader(loader_dict=valid_dataloaders_dict, shuffle=False)
len(ValidLoader)

3
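
The lengths printed above appear to be counted in batches rather than samples. As a rough check, the sketch below counts the batches a single loader yields when the last partial batch is kept, which is the default `DataLoader` behavior. The helper `batches_per_epoch` is hypothetical, and the idea that `MultiDatasetLoader` sums the batch counts of its task loaders is an assumption drawn from these outputs, not something confirmed from the Octopod source.

```python
import math


def batches_per_epoch(num_samples, batch_size):
    """Number of batches a loader yields when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)


batch_size = 64
# 108 is the number of labels printed in the color `y_true` array in the
# "Validate model" section of this notebook.
color_valid_batches = batches_per_epoch(108, batch_size)
print(color_valid_batches)
```

Under that assumption, the color task contributes 2 of the 3 validation batches, which would leave 1 batch for the pattern task.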

We need to create a dictionary of the tasks and the number of unique values so that we can create our model. This is a `new_task_dict` because we are training new tasks from scratch, but we could potentially have a mix of new and pretrained tasks. See the Octopod documentation for more details.

In [36]:
new_task_dict = {
    'color': TRAIN_COLOR_DF['simple_color_cat'].nunique(),
    'pattern': TRAIN_PATTERN_DF['pattern_type_cat'].nunique(),
}

In [37]:
new_task_dict

{'color': 2, 'pattern': 2}

In [38]:
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)

cuda:0


Create Model and Learner
===

These are completely new tasks so we use `new_task_dict`. If we had already trained a model on some tasks, we would use `pretrained_task_dict`.

Because these are new tasks, we set `load_pretrained_resnet=True` to start from the pretrained ResNet weights provided by Torch.

In [39]:
model = ResnetForMultiTaskClassification(
    new_task_dict=new_task_dict,
    load_pretrained_resnet=True
)

You will likely need to experiment with the learning rates and scheduler settings in this section to find values that work for your particular model.

In [40]:
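# Use a smaller learning rate for the pretrained ResNet body than for the
# dense layers and the new task classifiers.
lr_last = 1e-2
lr_main = 1e-4

optimizer = optim.Adam([
    {'params': model.resnet.parameters(), 'lr': lr_main},
    {'params': model.dense_layers.parameters(), 'lr': lr_last},
    {'params': model.new_classifiers.parameters(), 'lr': lr_last},
])

# Multiply every learning rate by 0.1 after each 4 scheduler steps.
exp_lr_scheduler = lr_scheduler.StepLR(optimizer, step_size=4, gamma=0.1)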
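
To see what this schedule does to the two learning rates, the sketch below evaluates the closed form of a step decay, `base_lr * gamma ** (epoch // step_size)`, in plain Python. The helper `step_lr` is hypothetical, and the sketch assumes the scheduler is stepped once per epoch over 10 epochs, as in the `fit` call later in this notebook.

```python
def step_lr(base_lr, epoch, step_size=4, gamma=0.1):
    """Learning rate at a given 0-based epoch under a step decay schedule."""
    return base_lr * gamma ** (epoch // step_size)


lr_main, lr_last = 1e-4, 1e-2
schedule = [
    (epoch, step_lr(lr_main, epoch), step_lr(lr_last, epoch))
    for epoch in range(10)
]
for epoch, main, last in schedule:
    print(f"epoch {epoch}: lr_main={main:.0e}, lr_last={last:.0e}")
```

Both groups keep their starting rates for epochs 0 to 3, drop by a factor of 10 for epochs 4 to 7, and drop again for epochs 8 and 9.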

In [41]:
loss_function_dict = {'color': 'categorical_cross_entropy', 'pattern': 'categorical_cross_entropy'}
metric_function_dict = {'color': 'multi_class_acc', 'pattern': 'multi_class_acc'}

In [42]:
learn = MultiInputMultiTaskLearner(model, TrainLoader, ValidLoader, new_task_dict, loss_function_dict, metric_function_dict)

Train model
===

As your model trains, you can see some output of how the model is performing overall and how it is doing on each individual task.

In [43]:
learn.fit(
    num_epochs=10,
    scheduler=exp_lr_scheduler,
    step_scheduler_on_batch=False,
    optimizer=optimizer,
    device=device,
    best_model=True
)

| train_loss | val_loss | color_train_loss | color_val_loss | color_multi_class_accuracy | pattern_train_loss | pattern_val_loss | pattern_multi_class_accuracy | time |
|---|---|---|---|---|---|---|---|---|
| 0.52756 | 0.025808 | 0.455579 | 0.018895 | 0.824074 | 0.837555 | 0.055673 | 0.68 | 00:04 |
| 0.316734 | 0.011392 | 0.220964 | 0.004098 | 0.953704 | 0.729182 | 0.042902 | 0.64 | 00:04 |
| 0.242027 | 0.010949 | 0.17333 | 0.00457 | 0.916667 | 0.537883 | 0.038504 | 0.6 | 00:04 |
| 0.182427 | 0.014214 | 0.168406 | 0.004074 | 0.907407 | 0.242809 | 0.058019 | 0.68 | 00:04 |
| 0.153758 | 0.012729 | 0.124627 | 0.002537 | 0.972222 | 0.279215 | 0.056759 | 0.52 | 00:04 |
| 0.12859 | 0.012524 | 0.110009 | 0.002901 | 0.981481 | 0.208611 | 0.054095 | 0.6 | 00:04 |
| 0.098957 | 0.015342 | 0.087208 | 0.004767 | 0.907407 | 0.149558 | 0.061024 | 0.6 | 00:04 |
| 0.095887 | 0.011757 | 0.081883 | 0.002274 | 0.953704 | 0.156199 | 0.052724 | 0.6 | 00:04 |
| 0.101653 | 0.009056 | 0.100016 | 0.002722 | 0.944444 | 0.108702 | 0.036416 | 0.64 | 00:04 |
| 0.095422 | 0.011112 | 0.090029 | 0.002989 | 0.953704 | 0.118647 | 0.046202 | 0.6 | 00:04 |


Epoch 8 best model saved with loss of 0.00905553251504898
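
The message above suggests that `best_model=True` keeps the weights from the epoch with the lowest overall `val_loss`, with epochs counted from 0. That reading comes from the log itself rather than from the Octopod documentation, so treat it as an assumption. A minimal check against the `val_loss` column of the training log:

```python
# Values copied from the `val_loss` column of the training log above.
val_losses = [
    0.025808, 0.011392, 0.010949, 0.014214, 0.012729,
    0.012524, 0.015342, 0.011757, 0.009056, 0.011112,
]

# Index of the smallest validation loss, counting epochs from 0.
best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
print(best_epoch, val_losses[best_epoch])
```

The index found this way agrees with the epoch number and rounded loss in the saved-model message.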


If you run the above cell and see an error like: 

```python
RuntimeError: DataLoader worker (pid X) is killed by signal: Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit.
```

Try setting `num_workers=0` for each `DataLoader` in `train_dataloaders_dict` and `valid_dataloaders_dict`, so that data is loaded in the main process instead of in worker processes.

Validate model
===

We provide a method on the learner called `get_val_preds`, which makes predictions on the validation data. You can then use this to analyze your model's performance in more detail.

In [44]:
pred_dict = learn.get_val_preds(device)

In [45]:
pred_dict

{'color': {'y_true': array([1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0,
         0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1,
         0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0,
         1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1,
         0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1]),
  'y_pred': array([[8.63048699e-05, 9.99913692e-01],
         [1.23549262e-02, 9.87645090e-01],
         [1.56733469e-04, 9.99843240e-01],
         [3.93248983e-02, 9.60675061e-01],
         [3.09351861e-04, 9.99690652e-01],
         [1.47890067e-02, 9.85211015e-01],
         [2.85775837e-04, 9.99714196e-01],
         [1.99814723e-03, 9.98001873e-01],
         [9.55912054e-01, 4.40879054e-02],
         [6.96805291e-05, 9.99930263e-01],
         [8.37490857e-01, 1.62509143e-01],
         [5.89722931e-01, 4.10277128e-01],
         [6.11390977e-04, 9.99388576e-01],
         [1.05856903e-04, 9.99894142e-01]
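
As one example of a more detailed analysis, the sketch below computes accuracy from a `y_true` / `y_pred` pair by taking the highest-scoring class in each row. The helpers `argmax` and `accuracy` are hypothetical and are not part of Octopod; the sample values are rows 8 to 11 of the color output above, rounded to four decimals.

```python
def argmax(row):
    """Index of the largest value in a sequence of class scores."""
    return max(range(len(row)), key=lambda i: row[i])


def accuracy(y_true, y_pred):
    """Fraction of rows whose highest-scoring class matches the label."""
    hits = sum(
        int(argmax(scores) == int(label))
        for label, scores in zip(y_true, y_pred)
    )
    return hits / len(y_true)


# Rows 8-11 of the color output above, rounded to four decimals.
sample = {
    'y_true': [0, 1, 0, 1],
    'y_pred': [
        [0.9559, 0.0441],
        [0.0001, 0.9999],
        [0.8375, 0.1625],
        [0.5897, 0.4103],
    ],
}
sample_accuracy = accuracy(sample['y_true'], sample['y_pred'])
print(sample_accuracy)
```

The same function should also accept the arrays directly, for example `accuracy(pred_dict['color']['y_true'], pred_dict['color']['y_pred'])`, since it only relies on indexing and iteration.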

Save/Export Model
===

Once we are happy with the training results, we can save or export the model with the `save` or `export` method.

See the docs for the difference between `save` and `export`.

We will need the saved model later for the ensemble model.

In [46]:
model.save(folder='models/', model_id='IMAGE_MODEL1')

In [47]:
model.export(folder='models/', model_id='IMAGE_MODEL1')

Now that we have an image model, we can move to `Step3_train_text_model`.