## Task 2: Tune a model

A data scientist's typical machine learning (ML) process is composed of five major steps: preparing the data, training the model, tuning the model, deploying the model to an endpoint, and evaluating the model in production to retrain as needed. In the previous labs, you completed the data preparation and model training steps. In this lab, you complete model tuning. You can complete each of these steps in SageMaker Studio with access to powerful Jupyter notebook instances, built-in algorithms, model training, and model deployment within the service. 

During the training and tuning portions of the process, you typically work with data and feed it into a model where you evaluate the model’s prediction against the expected result. You keep a portion of your input data, the testing data, away from the training and validation data used to train and tune the model. In the next lab, you use the testing data to examine the behavior of your model on data it has never seen. For this lab, you use the training and validation data to tune your model. You tune the model by adjusting your hyperparameter configurations. The goal of these adjustments is to incrementally improve your output metrics.
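
As a minimal sketch of the hold-out idea described above, the following splits row indices into training, validation, and testing sets. The 70/20/10 ratio, the seed, and the function name `split_indices` are illustrative assumptions; the previous labs may have split the data differently.

```python
import random


def split_indices(n_rows, train_frac=0.7, validation_frac=0.2, seed=0):
    """Shuffle row indices and cut them into train, validation, and test parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable

    n_train = round(n_rows * train_frac)
    n_validation = round(n_rows * validation_frac)

    train = indices[:n_train]
    validation = indices[n_train:n_train + n_validation]
    test = indices[n_train + n_validation:]  # whatever remains is held out
    return train, validation, test


train_idx, validation_idx, test_idx = split_indices(1000)
print(len(train_idx), len(validation_idx), len(test_idx))
```

The testing indices never overlap with the other two sets, which is what lets them stand in for data that the model has never seen.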

Amazon SageMaker provides automatic model tuning, also known as hyperparameter tuning, to find the best version of a model in an efficient manner. SageMaker hyperparameter tuning runs many training jobs on a dataset by using the specified algorithm and hyperparameter ranges. It then chooses the hyperparameter values that result in the best performing model, as determined by your chosen metric. You specify an ML model to tune, your objective metric, and the hyperparameters to use, and SageMaker hyperparameter tuning finds the best version of the model in a cost-effective way.

### Task 2.1: Set up the environment

Before you start tuning your model, install any necessary dependencies.

In [1]:
# Install matplotlib, pin bokeh to 2.4.2, and clear the notebook namespace
%pip install matplotlib
%pip uninstall bokeh -y
%pip install bokeh==2.4.2
%reset -f

# Import dependencies
import boto3
import numpy as np
import pandas as pd
import sagemaker
import bokeh
import bokeh.io

from sagemaker.inputs import TrainingInput
from pprint import pprint
from sagemaker import image_uris
from sagemaker.session import Session
from sagemaker.tuner import IntegerParameter, CategoricalParameter, ContinuousParameter, HyperparameterTuner
from sagemaker.xgboost.estimator import XGBoost
from time import strftime
from bokeh.models import HoverTool
from bokeh.plotting import figure, show

sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()
region = boto3.Session().region_name
sess = boto3.Session()
sm = sess.client('sagemaker')

Note: you may need to restart the kernel to use updated packages.
Collecting bokeh==2.4.2
  Downloading bokeh-2.4.2-py3-none-any.whl.metadata (14 kB)
Downloading bokeh-2.4.2-py3-none-any.whl (18.5 MB)
Installing collected packages: bokeh
Successfully installed bokeh-2.4.2
Note: you may need to restart the kernel to use updated packages.




sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml


Next, import the dataset.

In [2]:
# Locate the lab data bucket and define the dataset paths
s3 = boto3.resource('s3')
for candidate in s3.buckets.all():
    if 'labdatabucket' in candidate.name:
        bucket = candidate.name
print("Bucket: ", bucket)
prefix = 'scripts/data'
output_path = 's3://{}/{}/output'.format(bucket, prefix)

train_path = f"s3://{bucket}/{prefix}/train/adult_data_processed_train.csv"
validation_path = f"s3://{bucket}/{prefix}/validation/adult_data_processed_validation.csv"

train_input = TrainingInput(train_path, content_type='text/csv')
validation_input = TrainingInput(validation_path, content_type='text/csv')

print(f'Training path: {train_path}')
print(f'Validation path: {validation_path}')

create_date = strftime("%m%d%H%M")
container = image_uris.retrieve(framework='xgboost',region=boto3.Session().region_name,version='1.2-1')
run_name = 'lab-3-run-{}'.format(create_date)
run_tags = [{'Key': 'lab-3', 'Value': 'lab-3-run'}]
job_name = 'lab-3-job-{}'.format(create_date)

Bucket:  labdatabucket-us-west-2-545022455
Training path: s3://labdatabucket-us-west-2-545022455/scripts/data/train/adult_data_processed_train.csv
Validation path: s3://labdatabucket-us-west-2-545022455/scripts/data/validation/adult_data_processed_validation.csv


You have successfully imported the libraries and data you need to start tuning a model.

### Task 2.2: Configure an estimator object

In this task, you configure an estimator object that is identical to the one that you used in the previous lab. For a tuning job, the key difference is how you configure the hyperparameters.

In [3]:
xgb_model = sagemaker.estimator.Estimator(
    container,
    role, 
    instance_count = 1, 
    instance_type ='ml.m5.xlarge',
    output_path = output_path,
    sagemaker_session = sagemaker_session,
    enable_sagemaker_metrics = True,
    tags = run_tags
)

You have successfully configured an estimator object.

### Task 2.3: Configure a hyperparameter tuner

Selecting the right hyperparameter values for a machine learning model can be difficult. The correct answer depends on the algorithm and the data. Some algorithms have many tunable hyperparameters, some are very sensitive to the values selected, and most have a nonlinear relationship between model fit and hyperparameter values. Amazon SageMaker automatic model tuning helps by automating the hyperparameter tuning process.

To use SageMaker automatic model tuning, you specify a range, or a list of possible values, for each hyperparameter that you choose to tune. SageMaker automatic model tuning runs multiple training jobs with various hyperparameter settings. It then evaluates the results of each job based on a specified objective metric and selects the hyperparameter settings for future attempts based on previous results. For each tuning job, you specify a maximum number of training jobs, and the tuning completes when that number has been reached.

Refer to [Perform Automatic Model Tuning with SageMaker](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html) for more information about automatic model tuning. 
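
The loop described above (try a configuration, score it, keep the best, stop at a job limit) can be sketched in plain Python. This is a simple random search over a made-up objective, not SageMaker's own algorithm: `toy_objective`, `random_search`, and the two ranges are hypothetical, and a Bayesian strategy would additionally use earlier results to choose later candidates.

```python
import random


def toy_objective(eta, max_depth):
    """Hypothetical stand-in for a validation metric; it peaks at eta=0.3, max_depth=4."""
    return 1.0 - (eta - 0.3) ** 2 - 0.01 * (max_depth - 4) ** 2


def random_search(max_jobs, seed=0):
    """Sample max_jobs configurations at random and track the best score seen."""
    rng = random.Random(seed)
    best = None
    history = []
    for _ in range(max_jobs):
        candidate = {
            "eta": rng.uniform(0.0, 1.0),      # continuous range
            "max_depth": rng.randint(1, 10),   # integer range, inclusive
        }
        score = toy_objective(**candidate)
        history.append((candidate, score))
        if best is None or score > best[1]:
            best = (candidate, score)
    return best, history


best, history = random_search(max_jobs=12)
print(best)
```

The search stops after `max_jobs` evaluations, mirroring how a tuning job completes once its maximum number of training jobs has been reached.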

The hyperparameter ranges that you need to set are as follows:
- **alpha**: L1 regularization term on weights. Increasing this value makes the model more conservative, which reduces the risk of overfitting. The trade-off is that the model can become less sensitive to the class of interest and fit your training dataset less closely.
- **eta**: Step size shrinkage used in updates to prevent overfitting. After each boosting step, you can directly get the weights of new features. The eta parameter shrinks the feature weights to make the boosting process more conservative.
- **max_depth**: Maximum depth of a tree. Increasing the depth might result in overfitting and decreasing the depth might result in underfitting.
- **min_child_weight**: Minimum sum of instance weight needed in a child. If the tree partition step results in a leaf node with the sum of instance weight less than min_child_weight, the building process gives up further partitioning.
- **num_round**: The number of rounds (trees) used for boosting. Increasing the number of trees can improve model accuracy but also increases the risk of overfitting.

Refer to [XGBoost Hyperparameters](https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost_hyperparameters.html) for more information about XGBoost hyperparameters.
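
To build intuition for how **eta** and **num_round** interact, here is a toy sketch rather than XGBoost itself. It assumes an idealized booster in which each round fits the remaining residual exactly and then adds only an `eta` fraction of that correction. The function name `boosted_prediction` is hypothetical.

```python
def boosted_prediction(target, eta, num_round):
    """Idealized boosting on a single value: each round closes an eta fraction of the gap."""
    prediction = 0.0
    for _ in range(num_round):
        residual = target - prediction   # what the next "tree" would try to fit
        prediction += eta * residual     # shrink the correction by eta
    return prediction


# A large step reaches the target quickly; a small step needs more rounds.
print(boosted_prediction(1.0, eta=1.0, num_round=1))
print(boosted_prediction(1.0, eta=0.5, num_round=2))
print(boosted_prediction(1.0, eta=0.1, num_round=10))
```

In this simplified picture a smaller eta moves more cautiously per round, which is why lower eta values are usually paired with a larger num_round.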

In [4]:
hyperparameter_ranges = {
    'alpha': ContinuousParameter(0, 2),
    'eta': ContinuousParameter(0, 1),
    'max_depth': IntegerParameter(1, 10),
    'min_child_weight': ContinuousParameter(1, 10),
    'num_round': IntegerParameter(100, 1000)
}

An objective metric is the value that the hyperparameter tuner tries to optimize. In this case, you are trying to maximize the validation area under the curve (AUC). AUC measures the ability of the model to predict a higher score for positive examples than for negative examples. Because it is independent of the score cut-off, you can get a sense of the prediction accuracy of your model from the AUC metric without picking a threshold. The AUC metric returns a decimal value from 0 to 1. A model with an AUC of 0.50 is no better than a coin flip because it represents random chance, whereas a "perfect" model has a score of 1.0. The higher the AUC, the better your model can distinguish between positive and negative examples. Values near 0 are unusual and typically indicate a problem with the data.
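
One way to read this definition is that AUC equals the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half. The sketch below computes AUC that way on a few made-up labels and scores. The function name `pairwise_auc` and the sample values are illustrative and are not taken from the lab dataset.

```python
def pairwise_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(pairwise_auc(labels, scores))  # 3 of the 4 pairs are ranked correctly -> 0.75
```

No threshold appears anywhere in the calculation, which is why AUC is independent of the score cut-off.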

In [5]:
objective_metric_name = 'validation:auc'
objective_type='Maximize'

You use an estimator object to obtain configuration information for training jobs that are created as the result of a hyperparameter tuning job. The HyperparameterTuner parameters that you need are as follows:
- **estimator**: An estimator object that has been initialized with the required configuration. There does not need to be a training job associated with this instance.
- **objective_metric_name**: Name of the metric for evaluating training jobs.
- **hyperparameter_ranges**: Dictionary of parameter ranges. These parameter ranges can be one of three types: Continuous, Integer, or Categorical. The keys of the dictionary are the names of the hyperparameter, and the values are the appropriate parameter range class to represent the range.
- **objective_type**: The type of the objective metric for evaluating training jobs. This value can be either 'Minimize' or 'Maximize' (default: 'Maximize').
- **max_jobs**: Maximum total number of training jobs to start for the hyperparameter tuning job. The default value is unspecified for the ‘Grid’ strategy, and the default value is 1 for all other strategies (default: None).
- **max_parallel_jobs**: Maximum number of parallel training jobs to start (default: 1).
- **early_stopping_type**: Specifies whether early stopping is enabled for the job. Can be either 'Auto' or 'Off' (default: 'Off'). If set to 'Off', early stopping will not be attempted. If set to 'Auto', early stopping of some training jobs might happen, but is not guaranteed to.

In the following code, you tell the tuner to run at most 12 training jobs (max_jobs) and at most four of them concurrently (max_parallel_jobs). Both of these parameters keep your cost and training time under control.

In [6]:
tuner = HyperparameterTuner(
    estimator = xgb_model,
    objective_metric_name = objective_metric_name,
    hyperparameter_ranges = hyperparameter_ranges,
    objective_type = objective_type,
    max_jobs=12,
    max_parallel_jobs=4,
    early_stopping_type='Auto',
)
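
As a rough back-of-the-envelope check on these limits, and assuming (simplistically) that jobs run in full batches of similar duration, the values above imply a minimum number of sequential batches. The variable name `min_batches` is illustrative.

```python
import math

max_jobs = 12          # same value as in the tuner configuration
max_parallel_jobs = 4  # same value as in the tuner configuration

# Lower bound on the number of sequential batches if every batch is full.
min_batches = math.ceil(max_jobs / max_parallel_jobs)
print(min_batches)  # 3
```

Lower parallelism gives a Bayesian search more completed results to learn from before it picks the next candidates, at the cost of a longer wall-clock time.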

You have successfully configured a hyperparameter tuner.

### Task 2.4: Run a hyperparameter tuning job

Now that you have configured your estimator object and hyperparameter tuner, you are ready to start tuning the model. To start model tuning, call the tuner's fit() method with the training and validation datasets. The fit() method starts the tuning job, which takes approximately 5–6 minutes to run. If you set `wait=True`, the fit() method displays progress and waits until the tuning job is complete.

In [7]:
tuner.fit(
    {
        "train": train_input,
        "validation": validation_input
    },
    job_name=job_name,
    wait=True
)

......................................................!


<i aria-hidden="true" class="fas fa-clipboard-check" style="color:#18ab4b"></i> **Expected output:** If the estimator and hyperparameter ranges configuration are correct and the tuning job is started correctly, you should see the following output:

```plain
************************
**** EXAMPLE OUTPUT ****
************************
No finished training job found associated with this estimator. Please make sure this estimator is only used for building workflow config
......................................................................!
```

<i aria-hidden="true" class="fas fa-sticky-note" style="color:#563377"></i> **Note:** The "*No finished training job found associated with this estimator*" warning is expected. This message comes from the SDK. It warns you that the model data of the estimator is being referenced before the estimator has been run.

You have successfully run a hyperparameter tuning job.

### Task 2.5: Evaluate the models and select one as a candidate for deployment

After you launch a tuning job, you can see its progress by calling the *describe_hyper_parameter_tuning_job* API. The output is a JSON object that contains information about the current state of the tuning job. You can call *list_training_jobs_for_hyper_parameter_tuning_job* to see a detailed list of the training jobs that the tuning job launched.
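
The listing call is not used elsewhere in this notebook, so here is a hedged sketch of how it might be wrapped. The helper `top_training_jobs` and the `StubSageMakerClient` class are hypothetical. The stub only mimics the shape of the boto3 response so that the example runs without AWS credentials, and its values are copied from the sample output in this notebook.

```python
def top_training_jobs(sm_client, tuning_job_name, max_results=3):
    """Return (name, status, objective value) for the best-scoring training jobs."""
    response = sm_client.list_training_jobs_for_hyper_parameter_tuning_job(
        HyperParameterTuningJobName=tuning_job_name,
        SortBy="FinalObjectiveMetricValue",
        SortOrder="Descending",
        MaxResults=max_results,
    )
    rows = []
    for summary in response["TrainingJobSummaries"]:
        # Jobs that have not reported a metric yet have no objective entry.
        metric = summary.get("FinalHyperParameterTuningJobObjectiveMetric", {})
        rows.append(
            (summary["TrainingJobName"], summary["TrainingJobStatus"], metric.get("Value"))
        )
    return rows


class StubSageMakerClient:
    """Hypothetical stand-in for boto3's SageMaker client (offline demonstration only)."""

    def list_training_jobs_for_hyper_parameter_tuning_job(self, **kwargs):
        return {
            "TrainingJobSummaries": [
                {
                    "TrainingJobName": "lab-3-job-04211302-007-dfb390b1",
                    "TrainingJobStatus": "Completed",
                    "FinalHyperParameterTuningJobObjectiveMetric": {
                        "MetricName": "validation:auc",
                        "Value": 0.9198099970817566,
                    },
                },
                {
                    "TrainingJobName": "lab-3-job-04211302-008-355d579b",
                    "TrainingJobStatus": "Stopped",
                    "FinalHyperParameterTuningJobObjectiveMetric": {
                        "MetricName": "validation:auc",
                        "Value": 0.91681,
                    },
                },
            ]
        }


rows = top_training_jobs(StubSageMakerClient(), "example-tuning-job")
for row in rows:
    print(row)
```

In the notebook, you would pass the real client and job name instead, for example `top_training_jobs(sm, job_name)`.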

In [8]:
# Print the number of completed training jobs
tuning_job_result = sm.describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=job_name
)

status = tuning_job_result["HyperParameterTuningJobStatus"]
if status != "Completed":
    print("Reminder: the tuning job has not been completed.")

job_count = tuning_job_result["TrainingJobStatusCounters"]["Completed"]
print("%d training jobs have completed" % job_count)

objective = tuning_job_result["HyperParameterTuningJobConfig"]["HyperParameterTuningJobObjective"]
is_minimize = objective["Type"] != "Maximize"
objective_name = objective["MetricName"]

8 training jobs have completed


In [9]:
# Get the best training job
if tuning_job_result.get("BestTrainingJob", None):
    print("Best model found so far:")
    pprint(tuning_job_result["BestTrainingJob"])
else:
    print("No training jobs have reported results yet.")

Best model found so far:
{'CreationTime': datetime.datetime(2025, 4, 21, 13, 6, 10, tzinfo=tzlocal()),
 'FinalHyperParameterTuningJobObjectiveMetric': {'MetricName': 'validation:auc',
                                                 'Value': 0.9198099970817566},
 'ObjectiveStatus': 'Succeeded',
 'TrainingEndTime': datetime.datetime(2025, 4, 21, 13, 6, 53, tzinfo=tzlocal()),
 'TrainingJobArn': 'arn:aws:sagemaker:us-west-2:456938221590:training-job/lab-3-job-04211302-007-dfb390b1',
 'TrainingJobName': 'lab-3-job-04211302-007-dfb390b1',
 'TrainingJobStatus': 'Completed',
 'TrainingStartTime': datetime.datetime(2025, 4, 21, 13, 6, 14, tzinfo=tzlocal()),
 'TunedHyperParameters': {'alpha': '0.09884944250910954',
                          'eta': '0.36057580384812604',
                          'max_depth': '2',
                          'min_child_weight': '1.9352716507829695',
                          'num_round': '506'}}


You can list the hyperparameters and objective metrics of all training jobs and pick the training job with the best objective metric.

In [10]:
# Print the tuning metrics
tuner = sagemaker.HyperparameterTuningJobAnalytics(job_name)

full_df = tuner.dataframe()

if len(full_df) > 0:
    df = full_df[full_df["FinalObjectiveValue"] > -float("inf")]
    if len(df) > 0:
        df = df.sort_values("FinalObjectiveValue", ascending=is_minimize)
        print("Number of training jobs with valid objective: %d" % len(df))
        print({"lowest": min(df["FinalObjectiveValue"]), "highest": max(df["FinalObjectiveValue"])})
        pd.set_option("display.max_colwidth", None)  # Don't truncate TrainingJobName
    else:
        print("No training jobs have reported valid results yet.")

df

Number of training jobs with valid objective: 12
{'lowest': 0.5, 'highest': 0.9198099970817566}


| | alpha | eta | max_depth | min_child_weight | num_round | TrainingJobName | TrainingJobStatus | FinalObjectiveValue | TrainingStartTime | TrainingEndTime | TrainingElapsedTimeSeconds |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 0.098849 | 0.3605758 | 2.0 | 1.935272 | 506.0 | lab-3-job-04211302-007-dfb390b1 | Completed | 0.91981 | 2025-04-21 13:06:14+00:00 | 2025-04-21 13:06:53+00:00 | 39.0 |
| 4 | 0.595835 | 0.1027631 | 6.0 | 2.955053 | 401.0 | lab-3-job-04211302-008-355d579b | Stopped | 0.91681 | 2025-04-21 13:06:27+00:00 | 2025-04-21 13:06:58+00:00 | 31.0 |
| 9 | 0.403326 | 0.06134075 | 6.0 | 2.498098 | 761.0 | lab-3-job-04211302-003-fe9ded4f | Stopped | 0.91412 | 2025-04-21 13:04:03+00:00 | 2025-04-21 13:05:56+00:00 | 113.0 |
| 10 | 1.869521 | 0.505585 | 1.0 | 3.748675 | 416.0 | lab-3-job-04211302-002-62b93a40 | Completed | 0.90829 | 2025-04-21 13:04:02+00:00 | 2025-04-21 13:05:46+00:00 | 104.0 |
| 2 | 1.437782 | 0.2826593 | 6.0 | 7.505319 | 417.0 | lab-3-job-04211302-010-79cd83c8 | Completed | 0.90614 | 2025-04-21 13:07:12+00:00 | 2025-04-21 13:07:51+00:00 | 39.0 |
| 3 | 0.729641 | 0.6064024 | 6.0 | 3.268158 | 204.0 | lab-3-job-04211302-009-43727dc5 | Completed | 0.89148 | 2025-04-21 13:07:01+00:00 | 2025-04-21 13:07:35+00:00 | 34.0 |
| 11 | 1.776487 | 0.6370456 | 9.0 | 5.482653 | 394.0 | lab-3-job-04211302-001-4b9cc9f7 | Completed | 0.8889 | 2025-04-21 13:04:00+00:00 | 2025-04-21 13:05:44+00:00 | 104.0 |
| 1 | 2.0 | 1.0 | 6.0 | 2.289012 | 630.0 | lab-3-job-04211302-011-9723d39c | Stopped | 0.88541 | 2025-04-21 13:07:15+00:00 | 2025-04-21 13:07:58+00:00 | 43.0 |
| 8 | 0.741501 | 0.1477106 | 10.0 | 2.945973 | 893.0 | lab-3-job-04211302-004-fadc1a15 | Completed | 0.88103 | 2025-04-21 13:04:14+00:00 | 2025-04-21 13:06:13+00:00 | 119.0 |
| 6 | 0.309124 | 0.7571507 | 8.0 | 1.43299 | 177.0 | lab-3-job-04211302-006-cc1e0ccf | Completed | 0.85754 | 2025-04-21 13:06:12+00:00 | 2025-04-21 13:06:46+00:00 | 34.0 |


You can also see how the objective metric changes over time as the tuning job progresses. With the Bayesian strategy, you should expect to see a general trend toward better results, but the progress is not steady because the algorithm must balance exploring new areas of the parameter space against exploiting known good areas. The trend can give you a sense of whether the number of training jobs is sufficient for the complexity of your search space.

In [11]:
# Plot the objective metric results against time
bokeh.io.output_notebook()

df = df.sort_values(by=['TrainingStartTime'], ascending=True)

# x = df['TrainingStartTime'].to_numpy()
x = df['TrainingStartTime']
y = df['FinalObjectiveValue'].to_numpy()

p = figure(
    title="Final Objective Value over Time", 
    width=900, height=400, 
    x_axis_label="TrainingStartTime",
    y_axis_label="FinalObjectiveValue",
    x_axis_type="datetime"
)

# add hover tool 
hover = HoverTool(tooltips=[
    ('FinalObjectiveValue', '@y'),
    ('TrainingStartTime', "@x{%T}")
    ], formatters={'@x': 'datetime'}) 
p.add_tools(hover) 


# p.circle(source=df, x="TrainingStartTime", y="FinalObjectiveValue")
p.line(x,y,color='green',line_width=2)
p.circle(x, y, fill_color ="red", line_color ="green", size=8) 

show(p)
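
Alongside the plot, a "best so far" series makes the overall trend easier to judge than the raw points do. The sketch below hard-codes the FinalObjectiveValue column from the sample results table shown earlier, ordered by TrainingStartTime. The variable names are illustrative, and your own run will produce different numbers.

```python
from itertools import accumulate

# FinalObjectiveValue from the sample results table, ordered by TrainingStartTime.
objective_by_start_time = [
    0.8889, 0.90829, 0.91412, 0.88103, 0.85754,
    0.91981, 0.91681, 0.89148, 0.90614, 0.88541,
]

# Running maximum: the best validation AUC seen up to and including each job.
best_so_far = list(accumulate(objective_by_start_time, max))
print(best_so_far)
```

In this sample the running best improves over the first three jobs and once more at the sixth job, while the individual results keep fluctuating as the search explores.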

Now that you have finished a tuning job, you might want to know the correlation between your objective metric and individual hyperparameters that you have selected to tune. Having that insight helps you decide whether it makes sense to adjust search ranges for certain hyperparameters and start another tuning job. For example, if you see a positive trend between objective metric and a numerical hyperparameter, you probably want to set a higher tuning range for that hyperparameter in your next tuning job.

The following cell draws a graph for each hyperparameter to show its correlation with your objective metric.

In [12]:
# Plot each of the hyperparameter ranges with the objective metric results
ranges = tuner.tuning_ranges
figures = []
for hp_name, hp_range in ranges.items():
    categorical_args = {}

    x = df[hp_name].to_numpy()
    y = df['FinalObjectiveValue'].to_numpy()

    # determine line of best fit
    par = np.polyfit(x, y, 1, full=True)
    slope=par[0][0]
    intercept=par[0][1]
    y_predicted = [slope*i + intercept  for i in x]        


    p = figure(
        width=500,
        height=500,
        title="Objective vs %s" % hp_name,
        x_axis_label=hp_name,
        y_axis_label=objective_name,
        **categorical_args,
    )

    # add hover tool 
    hover = HoverTool(tooltips=[
        ('FinalObjectiveValue', '@y'),
        (hp_name, '@x')
    ]) 
    p.add_tools(hover) 

    p.circle(source=df, x=hp_name, y="FinalObjectiveValue")
    p.line(x,y_predicted,color='green', line_width=2)
    figures.append(p)


show(bokeh.layouts.Column(*figures))

Take a moment to view the charts. Which hyperparameters have a positive correlation with the objective? You can estimate which graph has a positive correlation by using a line of best fit through the points in the graph. If the line of best fit slopes up, the hyperparameter and objective have a positive correlation. In these graphs, a line of best fit has been added. 

The hyperparameters do not operate in isolation in these training jobs. Each training job adjusted several hyperparameters at the same time. You can isolate hyperparameters to view the direct correlation with the target, but remember that each training job's result is a combination of several hyperparameter adjustments.
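
The direction of each best-fit line can also be checked numerically with a Pearson correlation coefficient, whose sign matches the sign of the slope. The sketch below uses the max_depth and FinalObjectiveValue columns from the sample results table shown earlier. The helper `pearson` is illustrative, and with only ten points the result is a weak indication at best.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Columns copied from the sample results table (same row order in both lists).
max_depth = [2, 6, 6, 1, 6, 6, 9, 6, 10, 8]
objective = [0.91981, 0.91681, 0.91412, 0.90829, 0.90614,
             0.89148, 0.8889, 0.88541, 0.88103, 0.85754]

r = pearson(max_depth, objective)
print(f"max_depth vs objective: r = {r:.2f}")
```

For this sample the coefficient is negative, which matches a downward-sloping best-fit line and might suggest trying a lower max_depth range in a follow-up tuning job.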

### Task 2.6: View the model artifacts

SageMaker stores the model artifacts in your S3 bucket. To find the location of one of the model artifacts, follow these steps:

1. Choose the bucket icon from the left menu bar.

1. In the list of buckets, open the Amazon S3 bucket that contains **labdatabucket** in its name.

1. Navigate to any one of the **scripts/data/output/lab-3-job-.../output** subfolders.

You see a model artifact **model.tar.gz** in the subfolder. This is one of the models that you created with your SageMaker Estimator by calling the tuner.fit() method.
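
The folder layout above can also be expressed as an S3 URI. The sketch below assumes the layout `<output_path>/<training job name>/output/model.tar.gz` that the steps describe. The helper `model_artifact_uri` is hypothetical, and the bucket and job names are taken from the sample output in this notebook. For an authoritative location, the boto3 `describe_training_job` response reports it under `ModelArtifacts` → `S3ModelArtifacts`.

```python
def model_artifact_uri(output_path, training_job_name):
    """Build the expected S3 URI of a training job's model artifact."""
    return f"{output_path.rstrip('/')}/{training_job_name}/output/model.tar.gz"


# Values copied from the sample output earlier in this notebook.
example_output_path = "s3://labdatabucket-us-west-2-545022455/scripts/data/output"
example_job_name = "lab-3-job-04211302-007-dfb390b1"

artifact_uri = model_artifact_uri(example_output_path, example_job_name)
print(artifact_uri)
```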


### Cleanup

You have completed this notebook. To move to the next part of the lab, do the following:

- Close this notebook file.
- Return to the lab session and continue with the **Conclusion**.