## EfficientNets (Hyperparameter Tuning)

In this notebook we search for the best hyperparameters for fine-tuning our model.

***Note: Make sure you have trained the base model before fine-tuning***

In [1]:
import sys
import os
import pickle
import sagemaker
from sagemaker.session import Session

sys.path.append('../source')
session = Session()
bucket = session.default_bucket()
role = sagemaker.get_execution_role()

Load the metadata

In [2]:
root_dir = '../data/mit_indoor_67/metadata/'
efficientnet = 'efficientnet-b3'  # Version of the base model
metadata_file = root_dir + efficientnet.replace("-", "_") + ".pkl"
# Use a context manager so the file handle is closed after loading
with open(metadata_file, 'rb') as f:
    metadata = pickle.load(f)
metadata

{'train': 's3://sagemaker-us-east-2-194071253362/mit_indoor_67/processed/train/efficientnet_b3',
 'val': 's3://sagemaker-us-east-2-194071253362/mit_indoor_67/processed/val/efficientnet_b3',
 'test': 's3://sagemaker-us-east-2-194071253362/mit_indoor_67/processed/test/efficientnet_b3/indoor67_test.pkl'}

Define output_path, source directory and dependencies

In [3]:
prefix = 'mit_indoor_67'
output_path = os.path.join('s3://', bucket, prefix)
print('model artefacts will be saved to: {}'.format(output_path))

model artefacts will be saved to: s3://sagemaker-us-east-2-194071253362/mit_indoor_67


In [4]:
source_dir = '../source'
dependencies = ['../source/dataset', '../source/utils']

The training script:

In [5]:
!pygmentize ../source/main.py

[34mimport[39;49;00m [04m[36margparse[39;49;00m
[34mimport[39;49;00m [04m[36mjson[39;49;00m
[34mimport[39;49;00m [04m[36mos[39;49;00m
[34mimport[39;49;00m [04m[36mpandas[39;49;00m [34mas[39;49;00m [04m[36mpd[39;49;00m
[34mimport[39;49;00m [04m[36mnumpy[39;49;00m [34mas[39;49;00m [04m[36mnp[39;49;00m
[34mimport[39;49;00m [04m[36mtorch[39;49;00m
[34mimport[39;49;00m [04m[36mtorch[39;49;00m[04m[36m.[39;49;00m[04m[36moptim[39;49;00m [34mas[39;49;00m [04m[36moptim[39;49;00m
[34mimport[39;49;00m [04m[36mtorch[39;49;00m[04m[36m.[39;49;00m[04m[36mnn[39;49;00m [34mas[39;49;00m [04m[36mnn[39;49;00m
[34mimport[39;49;00m [04m[36mtime[39;49;00m
[34mimport[39;49;00m [04m[36mcopy[39;49;00m
[34mimport[39;49;00m [04m[36msubprocess[39;49;00m
[34mfrom[39;49;00m [04m[36mglob[39;49;00m [34mimport[39;49;00m glob
[34mfrom[39;49;00m [04m[36mPIL[39;49;00m [34mimport[39;49;00m Image
[34mfrom[39;49;00m [04m[3

Set job name

In [6]:
from time import gmtime, strftime
job_name = "{}-hpo-{}".format(efficientnet, strftime("%m%d-%H%M%S", gmtime()))
print(job_name)

efficientnet-b3-hpo-1128-125011
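
The compact `%m%d-%H%M%S` timestamp keeps the name short, which matters because SageMaker restricts tuning job names. The sketch below assumes a 32-character limit and an alphanumeric-plus-hyphen pattern (please verify both against the current API documentation); `make_tuning_job_name` is a hypothetical helper, not part of this project.

```python
import re
from time import gmtime, strftime

# Assumed limits for SageMaker tuning job names (check the current API docs).
MAX_TUNING_JOB_NAME_LEN = 32
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9])*$")


def make_tuning_job_name(model_name, now=None):
    """Build '<model>-hpo-<MMDD-HHMMSS>' and validate it against the assumed limits."""
    stamp = strftime("%m%d-%H%M%S", now if now is not None else gmtime())
    name = "{}-hpo-{}".format(model_name, stamp)
    if len(name) > MAX_TUNING_JOB_NAME_LEN:
        raise ValueError("job name too long ({} chars): {}".format(len(name), name))
    if not NAME_PATTERN.match(name):
        raise ValueError("job name has invalid characters: {}".format(name))
    return name


print(make_tuning_job_name("efficientnet-b3"))
```

With `efficientnet-b3` the name is 31 characters long, so a longer prefix or a full date stamp would exceed the assumed limit.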


Set instance details, PyTorch framework version and hyperparameters:


In [7]:
instance_type = 'ml.p3.2xlarge'
instance_count = 1
framework_version = '1.6.0'
entry_point = 'main.py'
py_version = 'py3'
hyperparameters = {
                    'model' : 'EfficientNet-b3',
                    'epochs': 20, # since AdamW converges quickly we only set max. epoch to 20
                    'batch-size' : 32,
                    'blocks-unfrozen' : 0, 
                    'sampling' : 'subsetrandom',
                    'lr' : 1e-4,
                    'workers' : 7,
                    'optimizer' : 'adamw',
                    'dropout' : 0.5,
                    'weight-decay' : 0.01,
                    'patience' : 3  # set lower patience to implement early stopping
                    } 
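
In script mode, SageMaker passes each hyperparameter to the entry point as a `--key value` command-line argument. The listing of `main.py` above is truncated, so the parser below is only a sketch of how those keys could be read with `argparse`; the function name `build_parser` and the defaults are assumptions that mirror the dictionary above, not the project's actual code.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the hyperparameter keys used in this notebook."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", type=str, default="EfficientNet-b3")
    parser.add_argument("--epochs", type=int, default=20)
    parser.add_argument("--batch-size", type=int, default=32)
    parser.add_argument("--blocks-unfrozen", type=int, default=0)
    parser.add_argument("--sampling", type=str, default="subsetrandom")
    parser.add_argument("--lr", type=float, default=1e-4)
    parser.add_argument("--workers", type=int, default=7)
    parser.add_argument("--optimizer", type=str, default="adamw")
    parser.add_argument("--dropout", type=float, default=0.5)
    parser.add_argument("--weight-decay", type=float, default=0.01)
    parser.add_argument("--patience", type=int, default=3)
    return parser


# Simulate the command line that the training container would build.
args = build_parser().parse_args(["--blocks-unfrozen", "4", "--lr", "0.001"])
print(args.blocks_unfrozen, args.lr, args.batch_size)
```

Note that `argparse` converts hyphens to underscores, so `--blocks-unfrozen` is read as `args.blocks_unfrozen`.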

Instantiate the estimator

In [8]:
from sagemaker.pytorch import PyTorch

# initial attempt
estimator = PyTorch(entry_point=entry_point,
                    source_dir=source_dir,
                    dependencies=dependencies,
                    role=role,
                    instance_count=instance_count,
                    instance_type=instance_type,
                    framework_version=framework_version,
                    py_version=py_version,
                    output_path=output_path,
                    sagemaker_session=session,
                    hyperparameters=hyperparameters
                   )

Create Hyperparameter Tuner

We do not know in advance how many layers (blocks) to unfreeze, or which learning rate, weight decay (L2 penalty) and dropout rate work best, so we use SageMaker's Hyperparameter Tuner to search over them. The objective is to minimize the cross-entropy loss on the validation set.
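
The learning rate and weight decay ranges used in this notebook span three orders of magnitude (1e-4 to 1e-1). The sketch below uses plain random sampling, not SageMaker's Bayesian strategy, to illustrate why the choice of scale matters for such ranges. In the SageMaker SDK, `ContinuousParameter` accepts a `scaling_type` argument (for example `'Logarithmic'`); whether setting it explicitly helps here is an assumption to be tested, not a result from this project.

```python
import math
import random


def sample_linear(low, high, rng):
    """Draw uniformly on the raw scale."""
    return rng.uniform(low, high)


def sample_log(low, high, rng):
    """Draw uniformly on the log10 scale, then map back to the raw scale."""
    return 10 ** rng.uniform(math.log10(low), math.log10(high))


rng = random.Random(0)
n = 10_000
low, high = 1e-4, 1e-1

linear = [sample_linear(low, high, rng) for _ in range(n)]
log = [sample_log(low, high, rng) for _ in range(n)]

# Fraction of draws that land in the lowest decade, [1e-4, 1e-3).
frac_linear = sum(x < 1e-3 for x in linear) / n
frac_log = sum(x < 1e-3 for x in log) / n
print("linear: {:.3f}, log: {:.3f}".format(frac_linear, frac_log))
```

On a linear scale roughly 1% of draws fall below 1e-3, while on a log scale about a third do, so small learning rates are explored far more evenly.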

In [9]:
from sagemaker.tuner import IntegerParameter, CategoricalParameter, ContinuousParameter, HyperparameterTuner

hyperparameter_ranges = {
                         'blocks-unfrozen' : IntegerParameter(0, 26),
                         'weight-decay' : ContinuousParameter(1e-4, 1e-1),
                         'dropout' : ContinuousParameter(0.1, 0.9),
                         'lr' : ContinuousParameter(1e-4, 1e-1)
                         }
max_jobs = 8
max_parallel_jobs = 1
objective_metric_name = 'Validation Loss'
objective_type = 'Minimize'  # Objective
metric_definitions = [{'Name': 'Validation Loss',
                       'Regex': 'MIN VAL LOSS: ([0-9\\.]+)'}]  # Minimum Validation Loss in a particular training job
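
SageMaker applies the `Regex` to the training job's log output and reads the metric from the first capture group. A minimal sketch of what the pattern above captures, using hypothetical log lines (the real format depends on what `main.py` prints):

```python
import re

# Same pattern as the metric definition above.
pattern = re.compile(r"MIN VAL LOSS: ([0-9\.]+)")

# Hypothetical log lines, for illustration only.
log_lines = [
    "Epoch 3/20 train loss: 1.2041",
    "MIN VAL LOSS: 0.8347",
]

for line in log_lines:
    match = pattern.search(line)
    if match:
        print(float(match.group(1)))

# Caveat: a value printed in scientific notation is only partly captured.
partial = pattern.search("MIN VAL LOSS: 1e-05").group(1)
print(partial)
```

Because the character class only allows digits and dots, the training script should print the loss in fixed-point notation (for example with `{:.4f}`) for the metric to be parsed correctly.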

In [10]:
tuner = HyperparameterTuner(estimator,
                            objective_metric_name,
                            hyperparameter_ranges,
                            metric_definitions,
                            max_jobs=max_jobs,
                            max_parallel_jobs=max_parallel_jobs,
                            objective_type=objective_type)

In [11]:
# fine-tune on top of the base model (feature extractor)
base_model_job = '' # BASE JOB NAME HERE
base_model = os.path.join(output_path, base_model_job, 'output', 'model.tar.gz')

print(base_model)

s3://sagemaker-us-east-2-194071253362/mit_indoor_67/mitindoor67-efficientnet-b3-base-2020-11-28-07-08-59/output/model.tar.gz


Fit the training and validation data

In [12]:
tuner.fit({
    'train': metadata['train'],
    'val' : metadata['val'],
    'base' : base_model},
    job_name=job_name,
    wait=False)
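
Because `wait=False` returns immediately, the results have to be collected once the tuning job finishes. The SDK offers `tuner.best_training_job()` for this. The sketch below shows the same selection logic on plain dictionaries; the field names are assumed to follow the boto3 `list_training_jobs_for_hyper_parameter_tuning_job` response, and the job names and values are hypothetical, not results from this notebook.

```python
def best_job(summaries, minimize=True):
    """Return the completed job summary with the best final objective value."""
    scored = [
        s for s in summaries
        if s.get("TrainingJobStatus") == "Completed"
        and "FinalHyperParameterTuningJobObjectiveMetric" in s
    ]
    if not scored:
        return None

    def objective(summary):
        return summary["FinalHyperParameterTuningJobObjectiveMetric"]["Value"]

    return min(scored, key=objective) if minimize else max(scored, key=objective)


# Hypothetical summaries, for illustration only.
summaries = [
    {"TrainingJobName": "hypothetical-job-001",
     "TrainingJobStatus": "Completed",
     "TunedHyperParameters": {"lr": "0.001", "dropout": "0.3"},
     "FinalHyperParameterTuningJobObjectiveMetric":
         {"MetricName": "Validation Loss", "Value": 0.91}},
    {"TrainingJobName": "hypothetical-job-002",
     "TrainingJobStatus": "Completed",
     "TunedHyperParameters": {"lr": "0.0005", "dropout": "0.5"},
     "FinalHyperParameterTuningJobObjectiveMetric":
         {"MetricName": "Validation Loss", "Value": 0.74}},
    {"TrainingJobName": "hypothetical-job-003",
     "TrainingJobStatus": "Failed",
     "TunedHyperParameters": {"lr": "0.1", "dropout": "0.9"}},
]

best = best_job(summaries)
print(best["TrainingJobName"], best["TunedHyperParameters"])
```

Failed or stopped jobs are skipped because they may not report a final objective value.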

Once we have identified the optimal hyperparameters and trained the final model, we can move on to [testing our model](./ENindoor67-LocalTesting.ipynb).