# Hyperparameter Optimization

**SageMaker Studio Kernel**: Data Science

In this exercise you will:
 - Run SageMaker HPO jobs using a custom script Estimator

***

## Setup
Here we'll import some libraries and define some variables. You can also take a look at the scripts that were previously created for preparing the data and training our model.

In [None]:
import boto3
import logging
import sagemaker
from sagemaker.tensorflow import TensorFlow
from sagemaker.tuner import ContinuousParameter, HyperparameterTuner

In [None]:
logging.basicConfig(level=logging.INFO)
LOGGER = logging.getLogger(__name__)

In [None]:
sagemaker_client = boto3.client("sagemaker")
s3_client = boto3.client("s3")

***

### Global configurations

Configuration variables used for processing, training, and model registration.

In [None]:
region = boto3.session.Session().region_name
role_name = "mlops-sagemaker-execution-role"
role = "arn:aws:iam::{}:role/{}".format(boto3.client('sts').get_caller_identity().get('Account'), role_name)

kms_account_id = boto3.client('sts').get_caller_identity().get('Account')

kms_alias = "ml-kms"

bucket_name = ""  # set this to the name of your S3 bucket before running the next cells

In [None]:
boto_session = boto3.Session(region_name=region)

sagemaker_client = boto_session.client("sagemaker")
runtime_client = boto_session.client("sagemaker-runtime")

sagemaker_session = sagemaker.session.Session(
    boto_session=boto_session,
    sagemaker_client=sagemaker_client,
    sagemaker_runtime_client=runtime_client,
    default_bucket=bucket_name
)

In [None]:
kms_key = "arn:aws:kms:{}:{}:alias/{}".format(region, kms_account_id, kms_alias)

***

## Part 1/2: Define Estimator

#### Compress source code for installing additional Python modules

In [None]:
!pygmentize ./../algorithms/training/src/train.py

In [None]:
! ./../algorithms/buildspec.sh training $bucket_name

#### Define input variables

In [None]:
training_artifact_path = "artifact/training"
training_artifact_name = "sourcedir.tar.gz"
training_output_files_path = "models"
training_framework_version = "2.4"
training_python_version = "py37"
training_instance_count = 1
training_instance_type = "ml.c5.4xlarge"
training_hyperparameters = {
    "epochs": 5,
    "batch_size": 100,
    "input_file": "processed_data.csv"
}
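
In script mode, SageMaker passes each entry of `hyperparameters` to the entry point as a command-line argument (for example `--epochs 5`), and the tuner later supplies the tuned `learning_rate` the same way. The sketch below shows how a training script could read these values with `argparse`. It is an assumption about how `train.py` might look, not its actual content: the helper name `parse_hyperparameters` and the default values are hypothetical.

```python
import argparse


def parse_hyperparameters(argv=None):
    """Parse hyperparameters passed by SageMaker as command-line arguments.

    Hypothetical helper: the real train.py may parse its arguments differently.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, default=5)
    parser.add_argument("--batch_size", type=int, default=100)
    parser.add_argument("--input_file", type=str, default="processed_data.csv")
    # Supplied by the tuner for each trial; the default is an illustrative assumption.
    parser.add_argument("--learning_rate", type=float, default=1e-3)
    # parse_known_args ignores any extra arguments the platform may add (such as --model_dir).
    args, _ = parser.parse_known_args(argv)
    return args


args = parse_hyperparameters(
    ["--epochs", "5", "--batch_size", "100", "--learning_rate", "0.01"]
)
print(args.epochs, args.batch_size, args.learning_rate, args.input_file)
```

Using `parse_known_args` instead of `parse_args` keeps the script from failing when it receives an argument it does not declare.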

#### Create Estimator

Let's create the TensorFlow Estimator that defines the training job.

In [None]:
estimator = TensorFlow(
    entry_point="train.py",
    framework_version=training_framework_version,
    py_version=training_python_version,
    source_dir="s3://{}/{}/{}".format(bucket_name,
                                      training_artifact_path,
                                      training_artifact_name
                                      ),
    output_path="s3://{}/{}".format(bucket_name,
                                    training_output_files_path),
    hyperparameters=training_hyperparameters,
    enable_sagemaker_metrics=True,
    metric_definitions=[
        {
            'Name': 'Test loss',
            'Regex': 'Test loss:.* ([0-9\\.]+)'
        },
        {
            'Name': 'Test accuracy',
            'Regex': 'Test accuracy:.* ([0-9\\.]+)'
        }
    ],
    role=role,
    instance_count=training_instance_count,
    instance_type=training_instance_type,
    output_kms_key=kms_key
)
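
Each entry in `metric_definitions` pairs a metric name with a regular expression whose first capture group is read as the metric value from the training logs. The sketch below applies the same two patterns locally with Python's `re` module to show what they capture. The log lines are hypothetical, since the real output depends on what `train.py` prints, and `extract_metrics` is an illustrative helper rather than part of the SageMaker SDK.

```python
import re

# Same patterns as in the Estimator's metric_definitions.
metric_definitions = [
    {"Name": "Test loss", "Regex": "Test loss:.* ([0-9\\.]+)"},
    {"Name": "Test accuracy", "Regex": "Test accuracy:.* ([0-9\\.]+)"},
]


def extract_metrics(log_line, definitions):
    """Return {metric name: float} for every definition whose regex matches."""
    found = {}
    for definition in definitions:
        match = re.search(definition["Regex"], log_line)
        if match:
            found[definition["Name"]] = float(match.group(1))
    return found


# Hypothetical log lines, for illustration only.
sample_logs = ["Test loss: 0.3521", "Test accuracy: 0.8764", "Epoch 1/5"]
parsed = {}
for line in sample_logs:
    parsed.update(extract_metrics(line, metric_definitions))
print(parsed)
```

Note that the character class `[0-9\.]+` only covers digits and dots, so a value printed in scientific notation such as `3.2e-05` would be captured as `3.2`. If the script can print such values, the pattern would need to be extended.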

***

## Part 2/2: Run HPO job

#### Define HPO parameters

In [None]:
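# Range of values the tuner may try for each tuned hyperparameter
hyperparameter_ranges = {
    "learning_rate": ContinuousParameter(1e-5, 1e-1)
}

# Regular expressions used to read the validation metrics from the training logs
metric_definitions = [
    {
        'Name': 'Val loss',
        'Regex': 'val_loss:.* ([0-9\\.]+)'
    },
    {
        'Name': 'Val accuracy',
        'Regex': 'val_accuracy:.* ([0-9\\.]+)'
    }
]

# The objective metric name must match one of the names in metric_definitions
objective_metric_name = "Val accuracy"
objective_type = "Maximize"

#### Create HyperparameterTuner

In [None]:
tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name=objective_metric_name,
    hyperparameter_ranges=hyperparameter_ranges,
    metric_definitions=metric_definitions,
    max_jobs=6,
    max_parallel_jobs=3,
    objective_type=objective_type,
    strategy="Random"
)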
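
With `strategy="Random"`, each of the trials draws its `learning_rate` independently from the configured range. The sketch below imitates that idea with Python's `random` module. It is not SageMaker's internal implementation, and the helper names are hypothetical. It compares a linear draw with a logarithmic draw over the same bounds.

```python
import math
import random

LOW, HIGH = 1e-5, 1e-1  # same bounds as the learning_rate range


def sample_linear(rng):
    """Draw uniformly between LOW and HIGH."""
    return rng.uniform(LOW, HIGH)


def sample_log(rng):
    """Draw uniformly in log10 space, then map back to the original scale."""
    return 10 ** rng.uniform(math.log10(LOW), math.log10(HIGH))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
linear_draws = [sample_linear(rng) for _ in range(6)]  # 6 trials, like max_jobs=6
log_draws = [sample_log(rng) for _ in range(6)]

print("linear:", ["{:.5f}".format(v) for v in linear_draws])
print("log:   ", ["{:.5f}".format(v) for v in log_draws])
```

Over a range that spans four orders of magnitude, a linear draw lands above `1e-2` about 90% of the time, so small learning rates are rarely tried. `ContinuousParameter` accepts a `scaling_type` argument (for example `"Logarithmic"`) if a more even spread across magnitudes is wanted.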

In [None]:
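# processing_output_files_path is not defined in this notebook; set it to the
# S3 prefix that holds the processed train/test data before running this cell.
tuner.fit(
    inputs={
        "train": "s3://{}/{}/train".format(
            bucket_name,
            processing_output_files_path
        ),
        "test": "s3://{}/{}/test".format(
            bucket_name,
            processing_output_files_path
        )
    }
)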
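
Once the tuning job has finished, the trials can be compared on the objective metric. The sketch below does this in plain Python over a list of summaries shaped like the `TrainingJobSummaries` entries returned by the boto3 call `list_training_jobs_for_hyper_parameter_tuning_job`. The job names and metric values are made-up placeholders, not results from this notebook, and `pick_best_job` is a hypothetical helper.

```python
def pick_best_job(summaries, objective_type="Maximize"):
    """Return the summary with the best final objective metric value."""
    # Jobs that failed or were stopped early may not report the objective metric.
    completed = [
        s for s in summaries
        if "FinalHyperParameterTuningJobObjectiveMetric" in s
    ]
    if not completed:
        raise ValueError("no training job reported the objective metric")

    def metric_value(summary):
        return summary["FinalHyperParameterTuningJobObjectiveMetric"]["Value"]

    chooser = max if objective_type == "Maximize" else min
    return chooser(completed, key=metric_value)


# Placeholder summaries (hypothetical names and values, for illustration only).
example_summaries = [
    {
        "TrainingJobName": "hpo-example-001",
        "TunedHyperParameters": {"learning_rate": "0.0312"},
        "FinalHyperParameterTuningJobObjectiveMetric": {
            "MetricName": "Val accuracy",
            "Value": 0.81,
        },
    },
    {
        "TrainingJobName": "hpo-example-002",
        "TunedHyperParameters": {"learning_rate": "0.0008"},
        "FinalHyperParameterTuningJobObjectiveMetric": {
            "MetricName": "Val accuracy",
            "Value": 0.88,
        },
    },
    {
        # No objective metric: stands in for a job that did not complete.
        "TrainingJobName": "hpo-example-003",
        "TunedHyperParameters": {"learning_rate": "0.0950"},
    },
]

best = pick_best_job(example_summaries, objective_type="Maximize")
print(best["TrainingJobName"], best["TunedHyperParameters"])
```

In the SageMaker Python SDK, `tuner.best_training_job()` returns the name of the best training job directly once the tuning job has completed.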

We have just seen how to run Amazon SageMaker HPO jobs to identify the right combination of hyperparameters for our ML algorithm. Now we are ready to execute our end-to-end workflow using an Amazon SageMaker Pipeline.

 > [SageMaker-Pipeline](./07-SageMaker-Pipeline-Training.ipynb)