# Amazon SageMaker Canvas fine-tune foundation model

This notebook was automatically generated for Amazon SageMaker Canvas model **New model 2024-6-24 11:24:22 PM**. The
notebook lets you select a candidate base model, inspect and modify its hyperparameters, and perform
fine-tuning. Depending on the candidate base model, fine-tuning is performed by either Amazon SageMaker Autopilot
or Amazon Bedrock. The notebook also lets you deploy the fine-tuned model and run inference on it.

---

## Contents

1. [Setup](#1.-Setup)
1. [Candidate selection](#2.-Select-text-generation-candidate-to-train)
1. [Fine-tune the selected candidate](#3.-Fine-tune-the-selected-candidate)
1. [Model metrics](#4.-Model-metrics)
1. [Deploy & run inference on the fine-tuned model](#5.-Deploy-&-run-Inference-on-the-fine-tuned-model)
1. [Clean up](#6.-Clean-up)

---

## 1. Setup

Before executing the notebook, there are some initial steps required for setup.

In [None]:
!pip install --upgrade sagemaker --quiet
!pip install --upgrade --force-reinstall boto3
!pip install --upgrade pandas

Here, we use the execution role associated with the current notebook instance as the
AWS account role with SageMaker and Bedrock access. The role must have the necessary permissions,
including access to your data in S3. We also initialize the SageMaker and Bedrock clients for later use.

In [None]:
import os
import sagemaker, boto3, json
from sagemaker.session import Session
from pprint import pprint
from datetime import datetime
import pandas as pd
import time

sagemaker_session = Session()
aws_role = sagemaker_session.get_caller_identity_arn()
aws_region = boto3.Session().region_name

bedrock_runtime_client = boto3.client('bedrock-runtime')
bedrock_client = boto3.client('bedrock')
sagemaker_client = boto3.client("sagemaker")
sagemaker_runtime_client = boto3.client('sagemaker-runtime')

## 2. Select text generation candidate to train

All the base models chosen as part of SageMaker Canvas model setup are included as candidates here. A base model can
be a JumpStart model or a Bedrock proprietary model. JumpStart models are fine-tuned through Amazon
SageMaker Autopilot, while Bedrock proprietary models are fine-tuned through Amazon Bedrock.

A candidate is selected by default. You can also update the selected candidate.

In [None]:
candidates = [
    {
        "baseModelType": "bedrock",
        "baseModelName": "amazon.titan-text-express-v1",
        "hyperparameters": {
            "epochCount": "3",
            "batchSize": "1",
            "learningRate": "0.00001",
            "learningRateWarmupSteps": "5"
        }
    },
]

<div class="alert alert-info"> 💡 <strong> Available Knobs</strong>
To select a different candidate, change the index into <code>candidates</code>.
</div>

In [None]:
selected_candidate = candidates[0]
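
When the list holds several candidates, picking one by index can be error-prone. The following is a minimal sketch of selecting by base model name instead. The helper `select_candidate_by_name` and the stand-in list `example_candidates` are hypothetical and not part of the generated notebook.

```python
def select_candidate_by_name(candidates, base_model_name):
    """Return the first candidate whose baseModelName matches, or raise ValueError."""
    for candidate in candidates:
        if candidate["baseModelName"] == base_model_name:
            return candidate
    available = [c["baseModelName"] for c in candidates]
    raise ValueError(f"{base_model_name!r} not found; available: {available}")


# Stand-in list with the same shape as `candidates` above (hyperparameters omitted).
example_candidates = [
    {"baseModelType": "bedrock", "baseModelName": "amazon.titan-text-express-v1"},
]
chosen = select_candidate_by_name(example_candidates, "amazon.titan-text-express-v1")
print(chosen["baseModelType"])
```

In the notebook itself, the same call could be made with the real `candidates` list and the result assigned to `selected_candidate`.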

## 3. Fine-tune the selected candidate

Fine-tuning refers to the process of customizing a foundation model on custom data to improve its performance for a specific task or domain.

<div class="alert alert-info"> 💡 <strong> Available Knobs</strong>

The recommended hyperparameter values for the selected candidate can be overridden with custom values if required. Uncomment the code and adjust the values. A brief description of each hyperparameter is included below.

1. **Epoch count**: The number of complete passes the model makes over the training set. More epochs let the model fit the training data more closely but increase the risk of overfitting. Fewer epochs reduce that risk but may leave the model underfitted.
1. **Batch size**: The number of training examples the model processes in each training step. A larger batch size usually trains faster but requires more memory. A smaller batch size requires less memory but takes more steps to cover the training set.
1. **Learning rate**: Controls how large each weight update is during training. A higher learning rate makes the model learn more quickly, but it can make training unstable or overshoot a good solution. A lower learning rate is more stable but converges more slowly.
1. **Learning rate warmup steps**: The number of training steps over which the learning rate is gradually increased from a small initial value to its target value. Warmup helps stabilize the early part of training, when the model has not yet learned the basic structure of the data. Any later reduction of the learning rate is a separate decay schedule and is not controlled by this parameter.

</div>
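
To make the warmup and step-count descriptions above concrete, here is a small sketch. It assumes a linear warmup, which is one common choice; the notebook does not state which schedule the selected base model uses, and the function names and the figure of 100 training examples are illustrative only.

```python
def warmup_learning_rate(step, target_lr, warmup_steps):
    """Linearly ramp the learning rate up to target_lr over warmup_steps steps."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return target_lr
    # Steps are counted from 0, so step 0 already uses a small non-zero rate.
    return target_lr * (step + 1) / warmup_steps


def total_training_steps(num_examples, batch_size, epochs):
    """Number of weight updates: batches per epoch (rounded up) times epochs."""
    steps_per_epoch = -(-num_examples // batch_size)  # ceiling division
    return steps_per_epoch * epochs


# Illustrative values only; 100 training examples is an assumption.
schedule = [warmup_learning_rate(s, target_lr=1e-5, warmup_steps=5) for s in range(7)]
print(schedule)
print(total_training_steps(num_examples=100, batch_size=1, epochs=3))
```

With a batch size of 1 and 3 epochs, every training example produces three weight updates, so 5 warmup steps cover only a very small fraction of training.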

In [None]:
# Hyperparameter values must be strings, matching the defaults in `candidates`.
# selected_candidate["hyperparameters"]["epochCount"] = "3"
# selected_candidate["hyperparameters"]["batchSize"] = "1"
# selected_candidate["hyperparameters"]["learningRate"] = "0.00001"
# selected_candidate["hyperparameters"]["learningRateWarmupSteps"] = "5"
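
Both fine-tuning APIs take the hyperparameters as a map of string keys to string values, so an override such as `3` instead of `"3"` is rejected when the request is built. A minimal sketch of a check that could be run before submitting the job follows. The helper `check_hyperparameters_are_strings` is hypothetical and not part of the generated notebook.

```python
def check_hyperparameters_are_strings(hyperparameters):
    """Raise TypeError if any hyperparameter value is not a string."""
    bad = {
        name: type(value).__name__
        for name, value in hyperparameters.items()
        if not isinstance(value, str)
    }
    if bad:
        raise TypeError(f"Hyperparameter values must be strings, got: {bad}")
    return hyperparameters


# Same shape as selected_candidate["hyperparameters"] in this notebook.
example_hyperparameters = {
    "epochCount": "3",
    "batchSize": "1",
    "learningRate": "0.00001",
    "learningRateWarmupSteps": "5",
}
check_hyperparameters_are_strings(example_hyperparameters)
print("All hyperparameter values are strings")
```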

Here we set up the fine-tuning job input depending on the candidate's model type.

In [None]:
if selected_candidate["baseModelType"] == "jumpstart":
    request = {
        # AutoML job names may contain only letters, digits, and hyphens (at most 32 characters).
        "AutoMLJobName": f"New-model-2024-6-24-11-24-22-PM-{datetime.now():%Y-%m-%d-%H-%M-%S}"[0:31],
        "AutoMLProblemTypeConfig": {
            "TextGenerationJobConfig": {
                "BaseModelName": selected_candidate["baseModelName"],
                "TextGenerationHyperParameters": selected_candidate["hyperparameters"],
            }
        },
        "RoleArn": aws_role,
        "AutoMLJobInputDataConfig": [
            {
                "ChannelType": "training",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": None
                    }
                }
            },
            {
                "ChannelType": "validation",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": None
                    }
                }
            }
        ],
        "OutputDataConfig": {
            "S3OutputPath": None
        },
        "Tags": [{'Key': 'sagemaker:is-canvas-resource', 'Value': 'True'}, {'Key': 'sagemaker:is-canvas-genai-resource', 'Value': 'True'}, {'Key': 'sagemaker:is-created-from-canvas-notebook', 'Value': 'True'}],
    }
elif selected_candidate["baseModelType"] == "bedrock":
    request = {
        "baseModelIdentifier": selected_candidate["baseModelName"],
        "trainingDataConfig": {
            "s3Uri": "s3://sagemaker-us-west-2-302187232084/Canvas/lungile/Datasets/5528a963-1e68-4d7a-a15e-d6800584cc8a/1719264293.837319/bedrock/training/part-00000-7f3133c2-7d54-49c0-b770-28710f5d0497-c000.jsonl"
        },
        # Bedrock model and job names may not contain spaces or colons.
        "customModelName": f"New-model-2024-6-24-11-24-22-PM-{datetime.now():%Y-%m-%d-%H-%M-%S}",
        "hyperParameters": selected_candidate["hyperparameters"],
        "jobName": f"New-model-2024-6-24-11-24-22-PM-{datetime.now():%Y-%m-%d-%H-%M-%S}",
        "roleArn": aws_role,
        "outputDataConfig": {
            "s3Uri": "s3://sagemaker-us-west-2-302187232084/Training/"
        },
        "validationDataConfig": {
            "validators": [
                {
                    "s3Uri": "s3://sagemaker-us-west-2-302187232084/Canvas/lungile/Datasets/5528a963-1e68-4d7a-a15e-d6800584cc8a/1719264293.837319/bedrock/validation/part-00000-ab0d3f02-92c6-4af9-9f2c-96e9cfb3648f-c000.jsonl"
                }
            ]
        },
        "customModelTags": [{'key': 'sagemaker:is-canvas-resource', 'value': 'True'}, {'key': 'sagemaker:is-canvas-genai-resource', 'value': 'True'}, {'key': 'sagemaker:is-created-from-canvas-notebook', 'value': 'True'}],
        "jobTags": [{'key': 'sagemaker:is-canvas-resource', 'value': 'True'}, {'key': 'sagemaker:is-canvas-genai-resource', 'value': 'True'}, {'key': 'sagemaker:is-created-from-canvas-notebook', 'value': 'True'}],
    }
else:
    raise ValueError("Unknown model type")

pprint(request)
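
The job and model names in the request are derived from the Canvas model name, which can contain spaces and colons. The sketch below shows one way to turn a display name into a conservative resource name. It assumes that only letters, digits, and single hyphens are safe; the exact pattern and length limit differ per API, so check the API reference for the call you are making. The helper `sanitize_resource_name` is hypothetical.

```python
import re


def sanitize_resource_name(name, max_length=63):
    """Replace every run of characters other than letters and digits with one hyphen."""
    cleaned = re.sub(r"[^a-zA-Z0-9]+", "-", name).strip("-")
    # Truncate, then drop a trailing hyphen that truncation may leave behind.
    return cleaned[:max_length].rstrip("-")


print(sanitize_resource_name("New model 2024-6-24 11:24:22 PM"))
print(sanitize_resource_name("New model 2024-6-24 11:24:22 PM", max_length=10))
```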

Now we launch the fine-tuning job.

In [None]:
if selected_candidate["baseModelType"] == "jumpstart":
    response = sagemaker_client.create_auto_ml_job_v2(**request)
elif selected_candidate["baseModelType"] == "bedrock":
    response = bedrock_client.create_model_customization_job(**request)

Here we query the job status. The job must complete before we can proceed with the remaining sections of the notebook.

In [None]:
def wait_fine_tuning_job_to_complete():
    print("Waiting for fine-tuning job to complete")
    while True:
        if selected_candidate["baseModelType"] == "jumpstart":
            job_details = sagemaker_client.describe_auto_ml_job_v2(
                    AutoMLJobName=request['AutoMLJobName'])
            job_status = job_details["AutoMLJobStatus"]

        elif selected_candidate["baseModelType"] == "bedrock":
            # Look the job up by its job name; the custom model name is a separate field.
            job_details = bedrock_client.get_model_customization_job(
                    jobIdentifier=request["jobName"]
            )
            job_status = job_details["status"]

        if job_status in ["Completed", "Failed", "Stopped"]:
            print(f"\nJob finished with status: {job_status}")
            break

        print(".", end="", flush=True)
        time.sleep(120)
    return job_details, job_status

job_details, job_status = wait_fine_tuning_job_to_complete()

print(f'Base model type: {selected_candidate["baseModelType"]}')
print(f'Job Status: {job_status}')
#pprint(f'Complete Job details: {job_details}')
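
The loop above polls until the job reaches a terminal status and never gives up. As a sketch of an alternative, the helper below adds an overall timeout and takes the status lookup as a callable, so it can be tried without any AWS calls. The name `poll_until_terminal`, its parameters, and the six-hour default are assumptions for illustration, not part of the generated notebook.

```python
import time


def poll_until_terminal(get_status, terminal_states, interval_seconds=120.0,
                        timeout_seconds=6 * 3600.0, sleep=time.sleep,
                        clock=time.monotonic):
    """Call get_status() until it returns a terminal state or the timeout expires."""
    deadline = clock() + timeout_seconds
    while True:
        status = get_status()
        if status in terminal_states:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Still in status {status!r} after {timeout_seconds} seconds")
        sleep(interval_seconds)


# Demo with canned statuses instead of a real fine-tuning job.
fake_statuses = iter(["InProgress", "InProgress", "Completed"])
final_status = poll_until_terminal(
    get_status=lambda: next(fake_statuses),
    terminal_states={"Completed", "Failed", "Stopped"},
    interval_seconds=0,
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(final_status)
```

In the notebook, `get_status` could wrap the `describe_auto_ml_job_v2` or `get_model_customization_job` call shown above and return the status field.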

## 4. Model metrics

In this section, we fetch the training and validation metrics for the fine-tuning job. A brief description of each metric is included below.

1. **Training perplexity** measures how well the language model predicts the next word in a sequence, given the words already seen, on the training data. A lower perplexity indicates better prediction.
1. **Validation perplexity** is the same measure computed on a held-out validation dataset. A lower perplexity indicates that the model predicts unseen text better.
1. **Training loss** measures how well the model fits the training data during fine-tuning. The lower the training loss, the more closely the model fits that data.
1. **Validation loss** measures how well the fine-tuned model performs on a held-out dataset that was not used for training. The lower the validation loss, the better the model generalizes.
1. **ROUGE (Recall-Oriented Understudy for Gisting Evaluation)** evaluates the quality of generated summaries by measuring their overlap with reference summaries. This metric is emitted only for JumpStart models.
1. **BLEU (Bilingual Evaluation Understudy)** compares a candidate translation of a text to one or more reference translations. This metric is emitted only for JumpStart models.
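
Perplexity and loss are closely related: when the loss is the mean cross-entropy per token measured with natural logarithms, perplexity is its exponential. The sketch below works through that relationship on made-up probabilities. Whether the metrics files report loss in exactly this form is an assumption, and the function names are illustrative.

```python
import math


def mean_cross_entropy(token_probabilities):
    """Average negative log-probability the model assigned to the correct tokens."""
    return -sum(math.log(p) for p in token_probabilities) / len(token_probabilities)


def perplexity_from_loss(mean_loss):
    """Perplexity is the exponential of the mean cross-entropy loss."""
    return math.exp(mean_loss)


# Made-up example: the model gives each correct token a probability of 0.5.
example_loss = mean_cross_entropy([0.5, 0.5, 0.5, 0.5])
example_perplexity = perplexity_from_loss(example_loss)
print(example_loss, example_perplexity)
```

A perplexity of 2 here means the model is, on average, as uncertain as if it were choosing between two equally likely tokens.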

In [None]:
def display_metrics(training_metrics_s3_path, validation_metrics_s3_path):
    print("Training Metrics\n")
    print(f'S3 Path: {training_metrics_s3_path}\n')
    df = pd.read_csv(training_metrics_s3_path)
    print(df.head(5))
    print("\n\n")
    print("Validation Metrics\n")
    print(f'S3 Path: {validation_metrics_s3_path}\n')
    df = pd.read_csv(validation_metrics_s3_path)
    print(df.head(5))

def get_best_candidate_training_step_name(sagemaker_autopilot_job_details):
    candidate_steps = sagemaker_autopilot_job_details["BestCandidate"]["CandidateSteps"]
    for candidate_step in candidate_steps:
        if candidate_step["CandidateStepType"] == "AWS::SageMaker::TrainingJob":
            return candidate_step["CandidateStepName"]

    raise ValueError("No training step found for the best candidate")

def display_bedrock_fine_tuned_model_metrics():
    # Look the job up by its job name; the custom model name is a separate field.
    job_details = bedrock_client.get_model_customization_job(
        jobIdentifier=request["jobName"]
    )
    training_metrics_s3_path = (
        f'{job_details["outputDataConfig"]["s3Uri"]}'
        f'model-customization-job-{job_details["jobArn"].split("/")[-1]}'
        '/training_artifacts/step_wise_training_metrics.csv'
    )
    validation_metrics_s3_path = (
        f'{job_details["outputDataConfig"]["s3Uri"]}'
        f'model-customization-job-{job_details["jobArn"].split("/")[-1]}'
        '/validation_artifacts/post_fine_tuning_validation/validation/validation_metrics.csv'
    )

    display_metrics(training_metrics_s3_path, validation_metrics_s3_path)


def display_autopilot_fine_tuned_model_metrics():
    job_details = sagemaker_client.describe_auto_ml_job_v2(
        AutoMLJobName=request['AutoMLJobName']
    )
    training_step_name = get_best_candidate_training_step_name(job_details)
    training_metrics_s3_path = (
        f'{job_details["OutputDataConfig"]["S3OutputPath"]}'
        f'{request["AutoMLJobName"]}'
        f'/{training_step_name}'
        '/train_metrics.csv'
    )
    validation_metrics_s3_path = (
        f'{job_details["OutputDataConfig"]["S3OutputPath"]}'
        f'{request["AutoMLJobName"]}'
        f'/{training_step_name}'
        '/validation_metrics.csv'
    )

    display_metrics(training_metrics_s3_path, validation_metrics_s3_path)


if selected_candidate["baseModelType"] == "jumpstart":
    display_autopilot_fine_tuned_model_metrics()
elif selected_candidate["baseModelType"] == "bedrock":
    display_bedrock_fine_tuned_model_metrics()

## 5. Deploy & run Inference on the fine-tuned model

We now want to use the model to perform inference. For a model fine-tuned through Amazon Bedrock, we will create a provisioned model for inference. A model fine-tuned through SageMaker Autopilot will be deployed to a real-time SageMaker endpoint for performing inference.

To begin with, we will define a few useful functions here.

In [None]:
def create_provisioned_model_throughput(modelId):
    # Provisioned model names may not contain spaces or colons.
    provisioned_model_name = f"New-model-2024-6-24-11-24-22-PM-provisioned-{datetime.now():%Y-%m-%d-%H-%M-%S}"
    response = bedrock_client.create_provisioned_model_throughput(
        modelUnits=1,
        provisionedModelName=provisioned_model_name,
        modelId=modelId,
        tags=[{'key': 'sagemaker:is-canvas-resource', 'value': 'True'}, {'key': 'sagemaker:is-canvas-genai-resource', 'value': 'True'}, {'key': 'sagemaker:is-created-from-canvas-notebook', 'value': 'True'}],
    )
    pprint(f"Created provisioned model {response}")
    return response["provisionedModelArn"]


def create_sagemaker_model(automl_job_details):
    inference_containers = automl_job_details["BestCandidate"]["InferenceContainerDefinitions"]["GPU"]
    # SageMaker resource names may contain only letters, digits, and hyphens.
    model_name = f"New-model-2024-6-24-11-24-22-PM-sm-model-{datetime.now():%Y-%m-%d-%H-%M-%S}"
    response = sagemaker_client.create_model(
        ModelName=model_name,
        ExecutionRoleArn=aws_role,
        Containers=inference_containers,
    )
    pprint(f"Created Sagemaker model {response}")
    return model_name


def create_sagemaker_endpoint_config(sagemaker_model_name):
    # SageMaker resource names may contain only letters, digits, and hyphens.
    endpoint_config_name = f"New-model-2024-6-24-11-24-22-PM-ec-{datetime.now():%Y-%m-%d-%H-%M-%S}"
    production_variant_config = dict(
        InstanceType="ml.g5.24xlarge",
        InitialInstanceCount=1,
        ModelName=sagemaker_model_name,
        InitialVariantWeight=1.0,
        ModelDataDownloadTimeoutInSeconds=3600,
        ContainerStartupHealthCheckTimeoutInSeconds=3600,
        VariantName=f"production-variant-{datetime.now():%Y-%m-%d-%H-%M-%S}",
    )

    response = sagemaker_client.create_endpoint_config(
        EndpointConfigName=endpoint_config_name,
        ProductionVariants=[production_variant_config],
        Tags=[{'Key': 'sagemaker:is-canvas-resource', 'Value': 'True'}, {'Key': 'sagemaker:is-canvas-genai-resource', 'Value': 'True'}, {'Key': 'sagemaker:is-created-from-canvas-notebook', 'Value': 'True'}],
    )
    pprint(f"Created endpoint config {response}")
    return endpoint_config_name

Here, we create the resources for inference.

In [None]:
sagemaker_endpoint_name = None
bedrock_provisioned_model_id = None

if selected_candidate["baseModelType"] == "jumpstart":
    job_details = sagemaker_client.describe_auto_ml_job_v2(
        AutoMLJobName=request['AutoMLJobName']
    )
    sm_model_name = create_sagemaker_model(job_details)
    sm_endpoint_config = create_sagemaker_endpoint_config(sm_model_name)
    # SageMaker resource names may contain only letters, digits, and hyphens.
    sagemaker_endpoint_name = f"New-model-2024-6-24-11-24-22-PM-sm-endpoint-{datetime.now():%Y-%m-%d-%H-%M-%S}"

    response = sagemaker_client.create_endpoint(
        EndpointName=sagemaker_endpoint_name,
        EndpointConfigName=sm_endpoint_config,
        Tags=[{'Key': 'sagemaker:is-canvas-resource', 'Value': 'True'}, {'Key': 'sagemaker:is-canvas-genai-resource', 'Value': 'True'}, {'Key': 'sagemaker:is-created-from-canvas-notebook', 'Value': 'True'}],
    )
    pprint(f"\nCreated Endpoint: {response}")
elif selected_candidate["baseModelType"] == "bedrock":
    bedrock_provisioned_model_id = create_provisioned_model_throughput(
        request['customModelName'])

Here we describe the inference resource to check whether it is in service. The status should be `InService` before you proceed with the rest of the notebook.

In [None]:
def wait_for_inference_resource_in_service():
    print("Waiting for inference resource to be ready")
    while True:
        if sagemaker_endpoint_name:
            response = sagemaker_client.describe_endpoint(
                EndpointName=sagemaker_endpoint_name)
            status = response["EndpointStatus"]
        elif bedrock_provisioned_model_id:
            response = bedrock_client.get_provisioned_model_throughput(
                provisionedModelId=bedrock_provisioned_model_id
            )
            status = response["status"]

        if status in ["InService", "Failed"]:
            print(f"\nInference resource creation completed with status: {status}")
            break
        print(".", end="", flush=True)
        time.sleep(30)
    # Return the latest describe response so the caller can inspect it.
    return response


response = wait_for_inference_resource_in_service()
pprint(response)

We can now send data to the endpoint to get inferences in real time. This step invokes the endpoint with included sample data.

In [None]:
def build_sagemaker_input_payload():
    # base model name is defined in https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-llms-finetuning-models.html#autopilot-llms-finetuning-supported-llms
    base_model_name = selected_candidate["baseModelName"]
    if "MPT" in base_model_name:
        return {
            "text_inputs": "What is AWS?",
            "max_length": 50,
        }
    elif "Falcon" in base_model_name:
        return {
            "inputs": "What is AWS",
            "parameters": {
                "max_new_tokens": 50,
            }
        }
    elif any(model in base_model_name for model in ["Dolly", "Flan"]):
        return {
            "text_inputs": "What is AWS",
            "max_new_tokens": 50,
        }
    else:
        raise ValueError("Unrecognized base model name")


def invoke_sagemaker_endpoint():
    body = json.dumps(build_sagemaker_input_payload()).encode('utf-8')
    response = sagemaker_runtime_client.invoke_endpoint(
        Body=body,
        EndpointName=sagemaker_endpoint_name,
        Accept='application/json',
        ContentType='application/json',
    )
    # Decode the response bytes so the JSON prints as readable text.
    print(response.get('Body').read().decode('utf-8'))


def invoke_bedrock_provisioned_model():
    body = json.dumps(
        {"inputText": "What is AWS?"}
    )

    response = bedrock_runtime_client.invoke_model(
        body=body,
        modelId=bedrock_provisioned_model_id,
        accept='application/json',
        contentType='application/json'
    )
    # Decode the response bytes so the JSON prints as readable text.
    print(response.get('body').read().decode('utf-8'))


if selected_candidate["baseModelType"] == "jumpstart":
    invoke_sagemaker_endpoint()
elif selected_candidate["baseModelType"] == "bedrock":
    invoke_bedrock_provisioned_model()
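
The Bedrock call above prints the raw response body. As a sketch of how the generated text could be pulled out of it, the helper below parses a Titan Text style response, assuming the body is JSON with a `results` list whose items carry an `outputText` field. The sample body is made up for illustration, and other base models use different response shapes.

```python
import json


def extract_titan_output_text(raw_body):
    """Join the outputText of every result in a Titan Text style response body."""
    payload = json.loads(raw_body)
    return "".join(result["outputText"] for result in payload["results"])


# Made-up response body with the assumed shape; not real model output.
sample_body = json.dumps({
    "inputTextTokenCount": 4,
    "results": [
        {"tokenCount": 3, "outputText": "Example answer.", "completionReason": "FINISH"}
    ],
}).encode("utf-8")

print(extract_titan_output_text(sample_body))
```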

## 6. Clean up

Here we clean up the inference resources.

In [None]:
# Report each deletion separately so a stale `response` from an earlier cell is never printed.
if bedrock_provisioned_model_id:
    response = bedrock_client.delete_provisioned_model_throughput(
        provisionedModelId=bedrock_provisioned_model_id
    )
    pprint(f"Completed clean-up: {response}")

if sagemaker_endpoint_name:
    response = sagemaker_client.delete_endpoint(
        EndpointName=sagemaker_endpoint_name
    )
    pprint(f"Completed clean-up: {response}")
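
For a SageMaker deployment, the cell above deletes the endpoint but leaves the endpoint configuration and the model in the account. The sketch below shows one way to remove all three in order. The helper `delete_sagemaker_inference_resources` and the `RecordingClient` stand-in are hypothetical; the stand-in only records calls so the example runs without AWS access.

```python
def delete_sagemaker_inference_resources(client, endpoint_name,
                                         endpoint_config_name, model_name):
    """Delete the endpoint first, then its configuration, then the model."""
    client.delete_endpoint(EndpointName=endpoint_name)
    client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
    client.delete_model(ModelName=model_name)


class RecordingClient:
    """Stand-in for a boto3 SageMaker client that records calls instead of making them."""

    def __init__(self):
        self.calls = []

    def delete_endpoint(self, EndpointName):
        self.calls.append(("delete_endpoint", EndpointName))

    def delete_endpoint_config(self, EndpointConfigName):
        self.calls.append(("delete_endpoint_config", EndpointConfigName))

    def delete_model(self, ModelName):
        self.calls.append(("delete_model", ModelName))


demo_client = RecordingClient()
delete_sagemaker_inference_resources(
    demo_client, "example-endpoint", "example-endpoint-config", "example-model"
)
print(demo_client.calls)
```

In the notebook, the real `sagemaker_client` could be passed together with `sagemaker_endpoint_name`, `sm_endpoint_config`, and `sm_model_name` from the deployment cell.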