# Optimize models using Automatic Model Tuning
In this lab you will apply the random search strategy of Automated Hyperparameter Tuning to train a BERT-based natural language processing (NLP) classifier. The model analyzes customer feedback and classifies the messages into positive (1), neutral (0), and negative (-1) sentiments.

Amazon SageMaker supports Automated Hyperparameter Tuning. It runs multiple training jobs on the training dataset using the hyperparameter ranges specified by the user. Then it chooses the combination of hyperparameters that leads to the best model candidate. The choice is based on the objective metric, for example maximizing the validation accuracy.

To choose hyperparameter combinations, SageMaker supports two types of tuning strategies: random and Bayesian. This capability can be extended further by providing an implementation of a custom tuning strategy as a Docker container.
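
To make the random strategy concrete, here is a minimal sketch in plain Python. It is not SageMaker's implementation: the ranges are illustrative, and `toy_validation_accuracy` is a hypothetical stand-in for a real training job. It only shows the idea of sampling combinations independently and keeping the one with the best objective metric.

```python
import random


def sample_configuration(rng):
    """Draw one hyperparameter combination uniformly at random."""
    return {
        # continuous range, sampled on a linear scale (illustrative bounds)
        "learning_rate": rng.uniform(0.00001, 0.00005),
        # categorical choice: only the listed values are possible
        "train_batch_size": rng.choice([128, 256]),
    }


def toy_validation_accuracy(config):
    """Hypothetical stand-in for a training job; NOT a real model."""
    # Peaks at learning_rate = 3e-5 and slightly prefers the smaller batch size.
    lr_penalty = abs(config["learning_rate"] - 0.00003) * 5000
    batch_bonus = 0.01 if config["train_batch_size"] == 128 else 0.0
    return 0.75 - lr_penalty + batch_bonus


def random_search(max_jobs, seed=42):
    """Evaluate max_jobs random combinations and return the best (metric, config)."""
    rng = random.Random(seed)
    trials = []
    for _ in range(max_jobs):
        config = sample_configuration(rng)
        trials.append((toy_validation_accuracy(config), config))
    # Objective type "Maximize": keep the trial with the highest metric.
    return max(trials, key=lambda trial: trial[0])


best_accuracy, best_config = random_search(max_jobs=10)
print(best_accuracy, best_config)
```

Because every combination is sampled independently, all trials could run in parallel. A Bayesian strategy, by contrast, uses the results of earlier trials to pick later ones.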

<img src="c3w1/images/hpt.png" width="70%" align="center"> 

In this lab you will perform the following three steps:

<img src="c3w1/images/sagemaker_hpt.png" width="50%" align="center"> 

In [2]:
#emli_notes: the S3 bucket must be in the same region as the training job.
#For a project, create a bucket with the name of that project; use SageMaker search (sm search) to list all training jobs that had an S3 URI with this bucket, which shows the results of all previous jobs.

# please ignore warning messages during the installation
!pip install --disable-pip-version-check -q sagemaker==2.35.0
!conda install -q -y pytorch==1.6.0 -c pytorch
!pip install --disable-pip-version-check -q transformers==3.5.1

import boto3, sagemaker, pandas as pd, botocore

config = botocore.config.Config(user_agent_extra='dlai-pds/c3/w1')

# low-level service client of the boto3 session
sm = boto3.client(service_name='sagemaker', config=config)

sess = sagemaker.Session(sagemaker_client=sm)

bucket = sess.default_bucket()
role = sagemaker.get_execution_role()
region = sess.boto_region_name

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

<a name='c3w1-1.'></a>
## 1. Configure dataset and Hyperparameter Tuning Job (HPT)

In [None]:
#Configure dataset, Upload the data to the S3 bucket
processed_train_data_s3_uri = 's3://{}/transformed/data/sentiment-train/'.format(bucket)
processed_validation_data_s3_uri = 's3://{}/transformed/data/sentiment-validation/'.format(bucket)
processed_test_data_s3_uri = 's3://{}/transformed/data/sentiment-test/'.format(bucket)
!aws s3 cp --recursive ./data/sentiment-train $processed_train_data_s3_uri
!aws s3 cp --recursive ./data/sentiment-validation $processed_validation_data_s3_uri
!aws s3 cp --recursive ./data/sentiment-test $processed_test_data_s3_uri
!aws s3 ls --recursive $processed_train_data_s3_uri

from sagemaker.inputs import TrainingInput

data_channels = {
    'train': TrainingInput(s3_data=processed_train_data_s3_uri),
    'validation': TrainingInput(s3_data=processed_validation_data_s3_uri)}
#There is no need to create a test data channel: the test data is used later at the evaluation stage and does not need to be wrapped in sagemaker.inputs.TrainingInput.


In [7]:
#Configure Hyperparameter Tuning Job

#configure static hyperparameters
max_seq_length=128 # maximum number of input tokens passed to BERT model
freeze_bert_layer=False # specifies the depth of training within the network
epochs=3; train_steps_per_epoch=50; validation_batch_size=64; validation_steps_per_epoch=50; seed=42

train_instance_count=1; train_instance_type='ml.c5.9xlarge'; train_volume_size=256; input_mode='File'; run_validation=True

#Some of these will be passed into the PyTorch estimator and tuner in the hyperparameters argument. Let's set up the dictionary for that:
hyperparameters_static={
    'freeze_bert_layer': freeze_bert_layer, 'max_seq_length': max_seq_length, 'epochs': epochs,
    'train_steps_per_epoch': train_steps_per_epoch, 'validation_batch_size': validation_batch_size,
    'validation_steps_per_epoch': validation_steps_per_epoch, 'seed': seed, 'run_validation': run_validation}

#Configure hyperparameter ranges to explore in the Tuning Job. The values of the ranges typically come from prior experience, research papers, or other models similar to the task you are trying to do.
from sagemaker.tuner import IntegerParameter, ContinuousParameter, CategoricalParameter

hyperparameter_ranges = {
    # continuous variable type: the tuning job explores values within the range
    'learning_rate': ContinuousParameter(0.00001, 0.00005, scaling_type='Linear'),
    # categorical variable type: the tuning job explores only the listed values
    'train_batch_size': CategoricalParameter([128, 256]),
}

#Set up evaluation metrics: Choose loss and accuracy as the evaluation metrics. The regular expressions Regex will capture the values of metrics that the algorithm will emit.
metric_definitions = [
     {'Name': 'validation:loss', 'Regex': 'val_loss: ([0-9.]+)'}, {'Name': 'validation:accuracy', 'Regex': 'val_acc: ([0-9.]+)'},]
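
The sketch below shows how these regular expressions pull metric values out of a training log line. The log line is hypothetical; the real format depends on what `train.py` prints. `extract_metrics` is an illustrative helper, not part of the SageMaker SDK.

```python
import re

metric_definitions = [
    {"Name": "validation:loss", "Regex": "val_loss: ([0-9.]+)"},
    {"Name": "validation:accuracy", "Regex": "val_acc: ([0-9.]+)"},
]

# Hypothetical log line (assumption): adjust to the actual output of train.py.
log_line = "epoch: 1 val_loss: 0.8421 val_acc: 0.6953"


def extract_metrics(line, definitions):
    """Return {metric name: float} for every definition whose regex matches."""
    found = {}
    for definition in definitions:
        match = re.search(definition["Regex"], line)
        if match:
            # group(1) is the part captured by the parentheses in the regex
            found[definition["Name"]] = float(match.group(1))
    return found


metrics = extract_metrics(log_line, metric_definitions)
print(metrics)
```

The capture group `([0-9.]+)` matches digits and dots only, so the training script has to print the metric as a plain decimal number directly after the `val_loss: ` or `val_acc: ` prefix.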


## 2. Run Tuning Job

In [None]:
#Prepare the PyTorch model to run as a SageMaker Training Job:
from sagemaker.pytorch import PyTorch as PyTorchEstimator # Note: indeed, it is not compulsory to rename the PyTorch estimator, but this is useful for code clarity, especially when a few modules of 'sagemaker.pytorch' are used

estimator = PyTorchEstimator(
    entry_point='train.py', source_dir='src', role=role, instance_count=train_instance_count,instance_type=train_instance_type, 
    volume_size=train_volume_size, py_version='py3', framework_version='1.6.0',
    hyperparameters=hyperparameters_static, metric_definitions=metric_definitions,input_mode=input_mode,)

#Launch the Hyperparameter Tuning Job: hyperparameter tuning search strategies: {https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-how-it-works.html}
from sagemaker.tuner import HyperparameterTuner

tuner = HyperparameterTuner(
    estimator=estimator, hyperparameter_ranges=hyperparameter_ranges, metric_definitions=metric_definitions,
    strategy="Random", # the strategy determines how hyperparameter values are selected from the ranges; 'Bayesian' is the other built-in strategy described above
    objective_type='Maximize', objective_metric_name='validation:accuracy',
    max_jobs=2, # maximum total number of training jobs (and therefore hyperparameter combinations) to run within the tuning job
    max_parallel_jobs=2, # maximum number of training jobs to run in parallel. This is often set lower than max_jobs with the Bayesian search strategy, so that the tuner can learn from a smaller set of completed training jobs and then apply Bayesian methods to determine the hyperparameters for the next set. Bayesian methods can improve hyperparameter-tuning performance in some cases.
    early_stopping_type='Auto') # refer to https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-early-stopping.html

#Launch the SageMaker Hyper-Parameter Tuning (HPT) Job:
tuner.fit(inputs=data_channels, # train and validation input
          include_cls_metadata=False, # to be set as false if the algorithm cannot handle unknown hyperparameters
          wait=False) # do not wait for the job to complete before continuing

#Check the Tuning Job status via the link below:
tuning_job_name = tuner.latest_tuning_job.job_name
print(tuning_job_name)
from IPython.display import display, HTML
display(HTML('<b>Review <a target="_blank" href="https://console.aws.amazon.com/sagemaker/home?region={}#/hyper-tuning-jobs/{}">Hyper-Parameter Tuning Job</a></b>'.format(region, tuning_job_name)))
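
Because the job is launched with `wait=False`, the notebook continues immediately. Besides the console link, the status can also be polled from code. The sketch below assumes a boto3 SageMaker client such as `sm` and uses its `describe_hyper_parameter_tuning_job` call. `FakeSageMakerClient` is a hypothetical stand-in so that the example runs without AWS access.

```python
import time

TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_tuning_job(sm_client, job_name, poll_seconds=60, sleep=time.sleep):
    """Poll until the tuning job reaches a terminal status and return it."""
    while True:
        response = sm_client.describe_hyper_parameter_tuning_job(
            HyperParameterTuningJobName=job_name
        )
        status = response["HyperParameterTuningJobStatus"]
        print("Tuning job status: {}".format(status))
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)


class FakeSageMakerClient:
    """Hypothetical stand-in for the boto3 client, for illustration only."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def describe_hyper_parameter_tuning_job(self, HyperParameterTuningJobName):
        return {"HyperParameterTuningJobStatus": next(self._statuses)}


fake_client = FakeSageMakerClient(["InProgress", "InProgress", "Completed"])
final_status = wait_for_tuning_job(
    fake_client, "example-job", poll_seconds=0, sleep=lambda seconds: None
)
```

With a real client the call would look like `wait_for_tuning_job(sm, tuning_job_name)`.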

In [None]:
#Wait until the Tuning Job has completed:
#%%time
tuner.wait()
#The results of the HPT job are available through the analytics of the tuner object. Its dataframe function converts the results directly into a dataframe. Explore the results here:
import time
time.sleep(10) # slight delay to allow the analytics to be calculated

df_results = tuner.analytics().dataframe()
df_results.shape
df_results.sort_values('FinalObjectiveValue', ascending=False)

#When training and tuning at scale, it is important to continuously monitor and use the right compute resources. While you have the flexibility of choosing different compute options, how do you choose the specific instance types and sizes to use? There is no standard answer. It comes down to understanding the workload and running empirical tests to determine the best compute resources for the training.
#Training Jobs emit CloudWatch metrics for resource utilization; review them via the link below:
from IPython.display import display, HTML
display(HTML('<b>Review Training Jobs of the <a target="_blank" href="https://console.aws.amazon.com/sagemaker/home?region={}#/hyper-tuning-jobs/{}">Hyper-Parameter Tuning Job</a></b>'.format(region, tuning_job_name)))
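
The ranking of the results can also be sketched without pandas. The rows below are hypothetical and reuse two column names from the results dataframe. The sketch assumes that a job without a final metric shows up with a missing value (`None` or NaN), which should be dropped before ranking.

```python
import math

# Hypothetical rows (assumption): names and values are made up for illustration.
results = [
    {"TrainingJobName": "job-001", "FinalObjectiveValue": 0.71},
    {"TrainingJobName": "job-002", "FinalObjectiveValue": float("nan")},
    {"TrainingJobName": "job-003", "FinalObjectiveValue": 0.74},
]


def rank_candidates(rows, objective_type="Maximize"):
    """Sort rows from best to worst, dropping rows without a final metric."""
    scored = [
        row for row in rows
        if row["FinalObjectiveValue"] is not None
        and not math.isnan(row["FinalObjectiveValue"])
    ]
    return sorted(
        scored,
        key=lambda row: row["FinalObjectiveValue"],
        # Maximize: highest value first; Minimize: lowest value first
        reverse=(objective_type == "Maximize"),
    )


ranked = rank_candidates(results)
print(ranked[0]["TrainingJobName"])
```

Passing `objective_type="Minimize"` reverses the order, which would fit an objective such as the validation loss.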

## 3. Evaluate the results

In [None]:
#Evaluate the best candidate. You can also paste the HPT job name into the HPO_Analyze_Tuning_Job_results file in HPTuning from the SageMaker examples and analyze the results there.
best_candidate = df_results.sort_values('FinalObjectiveValue', ascending=False).iloc[0]
best_candidate_training_job_name = best_candidate['TrainingJobName']
print('Best candidate Training Job name: {}'.format(best_candidate_training_job_name))
best_candidate_accuracy = best_candidate['FinalObjectiveValue']
print('Best candidate accuracy result: {}'.format(best_candidate_accuracy))

#use the function describe_training_job of the service client to get some more information about the best candidate. The result is in dictionary format. 
best_candidate_description = sm.describe_training_job(TrainingJobName=best_candidate_training_job_name)
best_candidate_training_job_name2 = best_candidate_description['TrainingJobName']
print('Training Job name: {}'.format(best_candidate_training_job_name2))

#Pull the Tuning Job and Training Job Amazon Resource Names (ARNs) from the best candidate training job description.
best_candidate_tuning_job_arn = best_candidate_description['TuningJobArn']
best_candidate_training_job_arn = best_candidate_description['TrainingJobArn']
print('Best candidate Tuning Job ARN: {}'.format(best_candidate_tuning_job_arn))
print('Best candidate Training Job ARN: {}'.format(best_candidate_training_job_arn))

#Pull the path of the best candidate model in the S3 bucket. You will need it later to set up the Processing Job for the evaluation.
model_tar_s3_uri = sm.describe_training_job(TrainingJobName=best_candidate_training_job_name)['ModelArtifacts']['S3ModelArtifacts']
print(model_tar_s3_uri)

#Evaluation with test dataset: To perform model evaluation, use a scikit-learn-based Processing Job.
from sagemaker.sklearn.processing import SKLearnProcessor

processing_instance_type = "ml.c5.2xlarge"
processing_instance_count = 1

processor = SKLearnProcessor(
    framework_version="0.23-1",role=role,instance_type=processing_instance_type, 
    instance_count=processing_instance_count, max_runtime_in_seconds=7200,)

from sagemaker.processing import ProcessingInput, ProcessingOutput

processor.run(
    code="src/evaluate_model_metrics.py",
    inputs=[
        ProcessingInput(input_name="model-tar-s3-uri", source=model_tar_s3_uri, destination="/opt/ml/processing/input/model/"),
        ProcessingInput(input_name="evaluation-data-s3-uri", source=processed_test_data_s3_uri, destination="/opt/ml/processing/input/data/",),],
    outputs=[ProcessingOutput(s3_upload_mode="EndOfJob", output_name="metrics", source="/opt/ml/processing/output/metrics"),],
    arguments=["--max-seq-length", str(max_seq_length)],
    logs=True, wait=False,)

#pull the Processing Job name:
scikit_processing_job_name = processor.jobs[-1].describe()["ProcessingJobName"]
print('Processing Job name: {}'.format(scikit_processing_job_name))

#Pull the Processing Job status:
scikit_processing_job_status = processor.jobs[-1].describe()['ProcessingJobStatus']
print('Processing job status: {}'.format(scikit_processing_job_status))

#Review the created Processing Job in the AWS console.
from IPython.display import display, HTML
display(HTML('<b>Review <a target="_blank" href="https://console.aws.amazon.com/sagemaker/home?region={}#/processing-jobs/{}">Processing Job</a></b>'.format(region, scikit_processing_job_name)))


In [None]:
#Review the CloudWatch Logs. Wait for about 5 minutes.
from IPython.display import display, HTML
display(HTML('<b>Review <a target="_blank" href="https://console.aws.amazon.com/cloudwatch/home?region={}#logStream:group=/aws/sagemaker/ProcessingJobs;prefix={};streamFilter=typeLogStreamPrefix">CloudWatch Logs</a> after about 5 minutes</b>'.format(region, scikit_processing_job_name)))

In [None]:
#After the Processing Job has completed, you can also review the output in the S3 bucket.
from IPython.display import display, HTML
display(HTML('<b>Review <a target="_blank" href="https://s3.console.aws.amazon.com/s3/buckets/{}/{}/?region={}&tab=overview">S3 output data</a> after the Processing Job has completed</b>'.format(bucket, scikit_processing_job_name, region)))

#Monitor the Processing Job:
from pprint import pprint
running_processor = sagemaker.processing.ProcessingJob.from_processing_name(processing_job_name=scikit_processing_job_name, sagemaker_session=sess)
processing_job_description = running_processor.describe()
pprint(processing_job_description)

In [None]:
#Wait for the Processing Job to complete.
#%%time
running_processor.wait(logs=False)

#Inspect the processed output data: Get the S3 bucket location of the output metrics:
processing_job_description = running_processor.describe()
output_config = processing_job_description["ProcessingOutputConfig"]
for output in output_config["Outputs"]:
    if output["OutputName"] == "metrics":
        processed_metrics_s3_uri = output["S3Output"]["S3Uri"]
print(processed_metrics_s3_uri)
!aws s3 ls $processed_metrics_s3_uri/

#The test accuracy can be pulled from the evaluation.json file:
import json
from pprint import pprint
metrics_json = sagemaker.s3.S3Downloader.read_file("{}/evaluation.json".format(processed_metrics_s3_uri))
print('Test accuracy: {}'.format(json.loads(metrics_json)))

#Copy image with the confusion matrix generated during the model evaluation into the folder generated.
!aws s3 cp $processed_metrics_s3_uri/confusion_matrix.png ./generated/
import time
time.sleep(10) # Slight delay for our notebook to recognize the newly-downloaded file

#Show and review the confusion matrix:
#%%html
<img src='./generated/confusion_matrix.png'>
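
The cell above prints the whole content of `evaluation.json`. To pull out only the accuracy value, something like the sketch below could be used. The nested structure `{"metrics": {"accuracy": {"value": ...}}}` and the number in it are assumptions made for illustration. The real keys are whatever `src/evaluate_model_metrics.py` writes, so adjust them accordingly.

```python
import json

# Hypothetical file content (assumption): check the real evaluation.json first.
example_metrics_json = '{"metrics": {"accuracy": {"value": 0.7419}}}'


def read_accuracy(raw_json):
    """Return the accuracy as a float, or None if the assumed keys are absent."""
    document = json.loads(raw_json)
    try:
        return float(document["metrics"]["accuracy"]["value"])
    except (KeyError, TypeError):
        # The file does not follow the assumed structure.
        return None


print("Test accuracy: {}".format(read_accuracy(example_metrics_json)))
```

Returning `None` instead of raising makes it easier to spot a mismatch between the assumed and the actual file structure.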