# Hyperparameter Optimization

1. [Introduction](#Introduction)
2. [Prerequisites and Preprocessing](#Prerequisites-and-Preprocessing)
3. [Training parameters for HPO](#Training-parameters-for-HPO)
4. [Plot training and validation accuracies](#Plot-training-and-validation-accuracies)


## Introduction
***

Welcome to our end-to-end example of hyperparameter tuning with the BlazingText (word2vec) algorithm. In this demo, we will use the hyperparameter optimization (HPO) feature of Amazon SageMaker to train 2 models, with the learning rate and batch size chosen by the Bayesian strategy.

To get started, we need to set up the environment with a few prerequisite steps for permissions and configuration.

## Prerequisites and Preprocessing
***
### Permissions and environment variables

Before launching this notebook, please run notebook #1.

In [None]:
user='user1'
my_bucket='marc-stationf-sagemaker'

In [None]:
%%time
import boto3
from sagemaker import get_execution_role

# IAM role that SageMaker assumes to run the training jobs
role = get_execution_role()

bucket = my_bucket  # customize to your bucket

containers = {'us-west-2': '433757028032.dkr.ecr.us-west-2.amazonaws.com/blazingtext:latest',
              'us-east-1': '811284229777.dkr.ecr.us-east-1.amazonaws.com/blazingtext:latest',
              'us-east-2': '825641698319.dkr.ecr.us-east-2.amazonaws.com/blazingtext:latest',
              'eu-west-1': '685385470294.dkr.ecr.eu-west-1.amazonaws.com/blazingtext:latest'}

training_image = containers[boto3.Session().region_name]
print(training_image)
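
The dictionary above lists only four regions, so the lookup fails with a bare `KeyError` anywhere else. As a minimal sketch, assuming a hypothetical helper named `resolve_training_image` (it is not part of boto3 or the SageMaker SDK), the lookup could report the supported regions instead:

```python
def resolve_training_image(region, containers):
    """Return the image URI for `region`, or fail with a readable message.

    Hypothetical helper for this notebook; not a boto3 or SageMaker function.
    """
    try:
        return containers[region]
    except KeyError:
        supported = ", ".join(sorted(containers))
        raise ValueError(
            "No BlazingText image listed for region {!r}; "
            "supported regions: {}".format(region, supported)
        ) from None


# Illustrative mapping with a single entry copied from the cell above
example_containers = {
    "eu-west-1": "685385470294.dkr.ecr.eu-west-1.amazonaws.com/blazingtext:latest",
}
print(resolve_training_image("eu-west-1", example_containers))
```

In the notebook this would be called as `resolve_training_image(boto3.Session().region_name, containers)`.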

## Training parameters for HPO
***

### Static Hyperparameters

First, we define the static hyperparameters shared by both models.

In [None]:
mode = "batch_skipgram"
epochs = 5
min_count = 5
sampling_threshold = 0.0001
#learning_rate = 0.05
window_size = 5
vector_dim = 100
negative_samples = 5
#batch_size = 11  # = (2*window_size + 1) (Preferred. Used only if mode is batch_skipgram); left out here because it is tuned by HPO
evaluation = True  # Perform similarity evaluation on WS-353 dataset at the end of training
subwords = False
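
The SageMaker API takes hyperparameter values as strings, which is why the training parameters further down wrap each value in `str()`. A small sketch of doing that conversion in one place, assuming a hypothetical helper `to_static_hyperparameters` introduced only for illustration:

```python
def to_static_hyperparameters(**params):
    """Convert hyperparameter values to strings, skipping any that are None."""
    return {name: str(value) for name, value in params.items() if value is not None}


# Same values as the cell above; learning_rate and batch_size are left to HPO
static_hyperparameters = to_static_hyperparameters(
    mode="batch_skipgram",
    epochs=5,
    min_count=5,
    sampling_threshold=0.0001,
    window_size=5,
    vector_dim=100,
    negative_samples=5,
    evaluation=True,
    subwords=False,
)
print(static_hyperparameters)
```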

### Hyperparameters used for HPO

Now we define the hyperparameters tuned with the Bayesian strategy:

* Learning rate
* Batch size

In [None]:
import time
tuning_job_prefix_name = 'blazin-hpo-' + user
timestamp = time.strftime('-%H-%M-%S', time.gmtime())
tuning_job_name = tuning_job_prefix_name + timestamp

print(tuning_job_name)

tuning_job_config = {
    "ParameterRanges": {
      "ContinuousParameterRanges": [
        {
          "MaxValue": "0.01",
          "MinValue": "0.005",
          "Name": "learning_rate",
        }
      ],
      "IntegerParameterRanges": [
        {
          "MaxValue": "32",
          "MinValue": "8",
          "Name": "batch_size",
        }
      ]
    },
    "ResourceLimits": {
      "MaxNumberOfTrainingJobs": 2,
      "MaxParallelTrainingJobs": 2
    },
    "Strategy": "Bayesian",
    "HyperParameterTuningJobObjective": {
      "MetricName": "train:mean_rho",
      "Type": "Maximize"
    }
  }
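
Mistakes in this configuration only surface when the API call is rejected, so a quick local check can save a round trip. The sketch below assumes a hypothetical helper `check_tuning_job`; it verifies the 32-character limit on tuning job names, that each range has `MinValue` below `MaxValue`, and that the parallel job count does not exceed the total.

```python
def check_tuning_job(name, config):
    """Raise ValueError if the job name or the tuning config looks invalid.

    Hypothetical helper; it covers only a few obvious checks.
    """
    if len(name) > 32:
        raise ValueError(
            "Tuning job name is {} characters; the limit is 32".format(len(name))
        )
    ranges = config["ParameterRanges"]
    for kind in ("ContinuousParameterRanges", "IntegerParameterRanges"):
        for rng in ranges.get(kind, []):
            low, high = float(rng["MinValue"]), float(rng["MaxValue"])
            if low >= high:
                raise ValueError(
                    "{}: MinValue {} must be below MaxValue {}".format(rng["Name"], low, high)
                )
    limits = config["ResourceLimits"]
    if limits["MaxParallelTrainingJobs"] > limits["MaxNumberOfTrainingJobs"]:
        raise ValueError("MaxParallelTrainingJobs exceeds MaxNumberOfTrainingJobs")


# Trimmed copy of the configuration above, used only to exercise the check
example_config = {
    "ParameterRanges": {
        "ContinuousParameterRanges": [
            {"MaxValue": "0.01", "MinValue": "0.005", "Name": "learning_rate"}
        ],
        "IntegerParameterRanges": [
            {"MaxValue": "32", "MinValue": "8", "Name": "batch_size"}
        ],
    },
    "ResourceLimits": {"MaxNumberOfTrainingJobs": 2, "MaxParallelTrainingJobs": 2},
}
check_tuning_job("blazin-hpo-user1-12-34-56", example_config)
```

Note that with both resource limits set to 2, the two training jobs start together, so the Bayesian strategy has no completed result to learn from. Lowering `MaxParallelTrainingJobs` or raising `MaxNumberOfTrainingJobs` would let later jobs benefit from earlier ones.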

### Training Params

Now we create the training job definition that SageMaker uses for each training job launched by the tuning job.

In [None]:
%%time
import time
import boto3

s3 = boto3.client('s3')
# create unique job name
job_name_prefix = 'blazin-training-' + user
timestamp = time.strftime('-%Y-%m-%d-%H-%M-%S', time.gmtime())
job_name = job_name_prefix + timestamp
training_params = \
{
    # specify the training docker image
    "AlgorithmSpecification": {
        "TrainingImage": training_image,
        "TrainingInputMode": "File"
    },
    "RoleArn": role,
    "OutputDataConfig": {
        "S3OutputPath": 's3://{}/{}/output'.format(bucket, job_name_prefix)
    },
    "ResourceConfig": {
        "InstanceCount": 2,
        "InstanceType": "ml.c4.2xlarge",
        "VolumeSizeInGB": 50
    },
    "StaticHyperParameters": {
        "mode": mode,
        "epochs": str(epochs),
        "min_count": str(min_count),
        "sampling_threshold": str(sampling_threshold),
        "window_size": str(window_size),
        "vector_dim": str(vector_dim),
        "negative_samples": str(negative_samples),
        "evaluation": str(evaluation),
        "subwords": str(subwords)      
    },
    "StoppingCondition": {
        "MaxRuntimeInSeconds": 360000
    },
    # Training data should be inside a subdirectory called "train"
    # Validation data should be inside a subdirectory called "validation"
    # The algorithm currently only supports FullyReplicated mode (where data is copied onto each machine)
    "InputDataConfig": [
        {
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri":'s3://sagemaker-eu-west-1-542104878797/sagemaker/DEMO-blazingtext-text8/train',
                    "S3DataDistributionType": "FullyReplicated"
                }
            },
            "ContentType": "application/x-recordio",
            "CompressionType": "None"
        },
    ]
}
print('Training job name: {}'.format(job_name))
print('\nInput Data Location: {}'.format(training_params['InputDataConfig'][0]['DataSource']['S3DataSource']))

### Create Training Job for HPO

In [None]:
# create the Amazon SageMaker training job
sagemaker = boto3.client(service_name='sagemaker')
#sagemaker.create_training_job(**training_params)

sagemaker.create_hyper_parameter_tuning_job(HyperParameterTuningJobName = tuning_job_name,
                                            HyperParameterTuningJobConfig = tuning_job_config,
                                            TrainingJobDefinition = training_params)

In [None]:
tuning_info = sagemaker.describe_hyper_parameter_tuning_job(HyperParameterTuningJobName=tuning_job_name)
status = tuning_info['HyperParameterTuningJobStatus']
print("Tuning job status: " + status)
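
The cell above reports the status once, right after creation, so it will normally show `InProgress`. A minimal polling sketch follows, assuming a hypothetical helper `wait_for_tuning_job` and treating `Completed`, `Failed` and `Stopped` as the terminal statuses; the polling interval and cap are arbitrary choices.

```python
import time

# Statuses after which the tuning job will not change any more
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_tuning_job(client, name, poll_seconds=60, max_polls=120):
    """Poll the tuning job until it reaches a terminal status or max_polls is hit.

    `client` is a boto3 SageMaker client (or any object exposing
    describe_hyper_parameter_tuning_job). Returns the last status seen.
    """
    status = None
    for attempt in range(max_polls):
        response = client.describe_hyper_parameter_tuning_job(
            HyperParameterTuningJobName=name
        )
        status = response["HyperParameterTuningJobStatus"]
        if status in TERMINAL_STATUSES:
            break
        if attempt < max_polls - 1:
            time.sleep(poll_seconds)
    return status


# Usage in this notebook (needs AWS credentials, so it is left commented out):
# status = wait_for_tuning_job(sagemaker, tuning_job_name)
```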

#### Follow HPO

In [None]:
client = boto3.client('logs')

lgn='/aws/sagemaker/TrainingJobs'

response = client.describe_log_streams(
    logGroupName=lgn,
    logStreamNamePrefix='blazin-hpo-' + user,
    orderBy='LogStreamName',
    descending=True,
    limit=50
)
logstreams = response['logStreams']

response = sagemaker.describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuning_job_name
)

hpoName = response['HyperParameterTuningJobName']

for logstream in logstreams:
    if hpoName in logstream['logStreamName']:
        print(logstream['logStreamName'])
        logN = client.get_log_events(logGroupName=lgn, logStreamName=logstream['logStreamName'])
        events = logN['events']
        for event in events:
            if '#mean_rho' in event['message']:
                print(event['message'])
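
The loop above reads one page of events per stream and filters on the client side. As an alternative sketch, CloudWatch Logs can filter on the server with `filter_log_events` and a quoted filter pattern; the helper name `iter_metric_events` is hypothetical, and the pattern quoting is an assumption worth verifying against your own logs.

```python
def iter_metric_events(logs_client, log_group, stream_prefix, term="#mean_rho"):
    """Yield (stream name, message) for events containing `term`.

    Follows `nextToken` so that results spread over several pages are not lost.
    """
    kwargs = {
        "logGroupName": log_group,
        "logStreamNamePrefix": stream_prefix,
        # Terms with non-alphanumeric characters are wrapped in double quotes
        "filterPattern": '"{}"'.format(term),
    }
    while True:
        page = logs_client.filter_log_events(**kwargs)
        for event in page.get("events", []):
            yield event["logStreamName"], event["message"]
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token


# Usage in this notebook (needs AWS credentials, so it is left commented out):
# for stream, message in iter_metric_events(client, lgn, tuning_job_name):
#     print(stream, message)
```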

### Best Training Job Configuration

In [None]:
response = sagemaker.describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuning_job_name
)
Status = response['HyperParameterTuningJobStatus']
print("HPO Status : " + Status)

# 'BestTrainingJob' may be missing or empty until a training job has completed
BTJ = response.get('BestTrainingJob')
if not BTJ:
    raise RuntimeError("No best training job yet; re-run this cell once a training job has completed.")

btjName = BTJ['TrainingJobName']
print(btjName)
for key, value in BTJ['TunedHyperParameters'].items():
    print(key + " : " + value)
print("RHO Mean : " + str(BTJ['FinalHyperParameterTuningJobObjectiveMetric']['Value']))
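
To compare every training job rather than only the best one, the tuning job's training jobs can be listed with their tuned values and objective metric. The sketch below assumes a hypothetical helper `summarize_tuning_jobs` built on the boto3 call `list_training_jobs_for_hyper_parameter_tuning_job`; it reads a single page of results, which is enough for the two jobs configured here.

```python
def summarize_tuning_jobs(sm_client, tuning_job_name):
    """Return (job name, status, tuned hyperparameters, objective value) tuples.

    Jobs are requested in descending order of the objective metric, so the
    best job comes first when the objective type is Maximize.
    """
    response = sm_client.list_training_jobs_for_hyper_parameter_tuning_job(
        HyperParameterTuningJobName=tuning_job_name,
        SortBy="FinalObjectiveMetricValue",
        SortOrder="Descending",
    )
    rows = []
    for job in response["TrainingJobSummaries"]:
        # The metric is absent for jobs that have not produced one yet
        metric = job.get("FinalHyperParameterTuningJobObjectiveMetric", {})
        rows.append(
            (
                job["TrainingJobName"],
                job["TrainingJobStatus"],
                job.get("TunedHyperParameters", {}),
                metric.get("Value"),
            )
        )
    return rows


# Usage in this notebook (needs AWS credentials, so it is left commented out):
# for row in summarize_tuning_jobs(sagemaker, tuning_job_name):
#     print(row)
```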

#### S3 path to the best training job's model

In [None]:
print('s3://{}/{}/output/{}/output/model.tar.gz'.format(bucket, job_name_prefix, btjName))