## Links

- [SageMaker Text Generation Demo](https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/jumpstart_text_generation/Amazon_JumpStart_Text_Generation.ipynb)
- [SageMaker Sentence Pair Classification Demo](https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/jumpstart_sentence_pair_classification/Amazon_JumpStart_Sentence_Pair_Classification.ipynb) (the demo used in this walkthrough)
- [Fine-tune and host Hugging Face BERT models on Amazon SageMaker](https://aws.amazon.com/blogs/machine-learning/fine-tune-and-host-hugging-face-bert-models-on-amazon-sagemaker/)
- [SageMaker Built-in NLP models](https://sagemaker.readthedocs.io/en/stable/algorithms/text/sentence_pair_classification_hugging_face.html)

Before running the code, you will need an AWS account with SageMaker set up.

### Set up

In [1]:
!pip install sagemaker ipywidgets --upgrade --quiet

To train and host on Amazon SageMaker, we need to set up and authenticate the use of AWS services. Here, we use **the execution role associated with the current notebook instance** as the AWS account role with SageMaker access. It has the necessary permissions, including access to your data in S3. (To check the execution role, go to AWS SageMaker -> Notebook (left-side nav) -> choose your notebook instance -> under Permissions and encryption, see IAM role ARN.)

In [2]:
# permissions and environment variables
import sagemaker, boto3, json
from sagemaker import get_execution_role

aws_role = get_execution_role()
aws_region = boto3.Session().region_name
sess = sagemaker.Session()
aws_role, aws_region, sess

('arn:aws:iam::544669270813:role/service-role/AmazonSageMaker-ExecutionRole-20230307T225001',
 'us-east-1',
 <sagemaker.session.Session at 0x7fef90daf280>)
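The first element of the output above is the execution role ARN. As a minimal sketch of how such an ARN is laid out, the helper below splits one into its parts. `parse_role_arn` and `RoleArn` are hypothetical names that are not part of the SageMaker SDK, and the account ID and role name in the example are placeholders.

```python
from typing import NamedTuple


class RoleArn(NamedTuple):
    partition: str
    account_id: str
    role_path: str
    role_name: str


def parse_role_arn(arn: str) -> RoleArn:
    """Split an IAM role ARN into its parts (illustrative helper only)."""
    # General ARN layout: arn:partition:service:region:account-id:resource
    prefix, partition, service, _region, account_id, resource = arn.split(":", 5)
    if prefix != "arn" or service != "iam" or not resource.startswith("role/"):
        raise ValueError(f"not an IAM role ARN: {arn!r}")
    # The resource has the form "role/<optional path>/<role name>".
    *path_parts, role_name = resource.split("/")[1:]
    role_path = "/" + "/".join(path_parts) + "/" if path_parts else "/"
    return RoleArn(partition, account_id, role_path, role_name)


# Placeholder account ID and role name, not real values.
example = parse_role_arn(
    "arn:aws:iam::123456789012:role/service-role/AmazonSageMaker-ExecutionRole-EXAMPLE"
)
print(example.account_id, example.role_path, example.role_name)
```

The region field is empty for IAM ARNs because IAM is a global service, which is why the two consecutive colons appear after `iam`.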

### Select a pretrained model

In [3]:
model_id = "huggingface-spc-bert-base-uncased"

[Optional] Select a different JumpStart model. Here, we download the JumpStart `models_manifest.json` file from the JumpStart S3 bucket, keep only the Sentence Pair Classification models, and select one of them.

In [4]:
from ipywidgets import Dropdown

# download JumpStart model_manifest file.
boto3.client("s3").download_file(
    f"jumpstart-cache-prod-{aws_region}", "models_manifest.json", "models_manifest.json"
)
with open("models_manifest.json", "rb") as json_file:
    model_list = json.load(json_file)

# Keep only the Sentence Pair Classification models from the manifest list.
spc_models_all_versions = [
    model["model_id"] for model in model_list if "-spc-" in model["model_id"]
]
# Drop duplicate ids (one entry per model version) while preserving order.
spc_models = list(dict.fromkeys(spc_models_all_versions))

# display the model-ids in a dropdown to select a model for inference.
model_dropdown = Dropdown(
    options=spc_models,
    value=model_id,
    description="Select a model",
    style={"description_width": "initial"},
    layout={"width": "max-content"},
)

In [5]:
# choose model for inference
display(model_dropdown)

Dropdown(description='Select a model', index=3, layout=Layout(width='max-content'), options=('huggingface-spc-…
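The filtering step can be tried offline without downloading anything. The sketch below runs the same selection on a small, made-up manifest. The entries, the `example-...` model ids, and the helper name `select_spc_models` are assumptions for illustration; the real `models_manifest.json` is much larger and its entries may carry more fields.

```python
# Made-up manifest entries for illustration only.
sample_manifest = [
    {"model_id": "huggingface-spc-bert-base-uncased", "version": "1.0.0"},
    {"model_id": "huggingface-spc-bert-base-uncased", "version": "1.1.0"},
    {"model_id": "example-ic-some-image-model", "version": "1.0.0"},
    {"model_id": "example-spc-another-model", "version": "1.0.0"},
]


def select_spc_models(manifest):
    """Return unique Sentence Pair Classification model ids in first-seen order."""
    spc_ids = (entry["model_id"] for entry in manifest if "-spc-" in entry["model_id"])
    # dict.fromkeys keeps insertion order, so it de-duplicates without reordering.
    return list(dict.fromkeys(spc_ids))


sample_spc_models = select_spc_models(sample_manifest)
print(sample_spc_models)
```

Each model id appears once per published version in the manifest, which is why de-duplication is needed before the ids are shown in the dropdown.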

In [6]:
# model_version="*" fetches the latest version of the model
model_id, model_version = model_dropdown.value, "*"
model_id, model_version

('huggingface-spc-bert-base-uncased', '*')

### Run inference on the pre-trained model

#### Retrieve JumpStart Artifacts & deploy an endpoint

Fetching the model artifacts from SageMaker and deploying the endpoint takes a few minutes.

In [8]:
from sagemaker import image_uris, model_uris, script_uris
from sagemaker.model import Model
from sagemaker.predictor import Predictor
from sagemaker.utils import name_from_base

# model_version="*" fetches the latest version of the model.
infer_model_id, infer_model_version = model_dropdown.value, "*"

endpoint_name = name_from_base(f"jumpstart-example-{infer_model_id}")

inference_instance_type = "ml.m5.xlarge"

# Retrieve the inference docker container uri.
deploy_image_uri = image_uris.retrieve(
    region=None,
    framework=None,
    image_scope="inference",
    model_id=infer_model_id,
    model_version=infer_model_version,
    instance_type=inference_instance_type,
)
# Retrieve the inference script uri.
deploy_source_uri = script_uris.retrieve(
    model_id=infer_model_id, model_version=infer_model_version, script_scope="inference"
)
# Retrieve the base model uri.
base_model_uri = model_uris.retrieve(
    model_id=infer_model_id, model_version=infer_model_version, model_scope="inference"
)
# Create the SageMaker model instance. Passing predictor_cls=Predictor makes deploy() return a Predictor,
# which is what lets us run inference through the SageMaker API.
model = Model(
    image_uri=deploy_image_uri,
    source_dir=deploy_source_uri,
    model_data=base_model_uri,
    entry_point="inference.py",
    role=aws_role,
    predictor_cls=Predictor,
    name=endpoint_name,
)
# deploy the Model.
base_model_predictor = model.deploy(
    initial_instance_count=1,
    instance_type=inference_instance_type,
    endpoint_name=endpoint_name,
)

----!

#### Example input sentences for inference

In [9]:
sentence_pair1 = [
    "How many octaves does Beyonce have?",
    "Beyoncé's vocal range spans four octaves.",
]
sentence_pair2 = [
    "How many octaves does Beyonce have?",
    "While another critic says she is a "
    "Vocal acrobat, being able to sing long and complex melismas and vocal runs effortlessly, and in key.",
]
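The query code in the next section sends each pair as a JSON array encoded to UTF-8 bytes. A minimal sketch of that encoding step, using only the standard library, shows what the request body looks like and that it round-trips cleanly.

```python
import json

sentence_pair = [
    "How many octaves does Beyonce have?",
    "Beyoncé's vocal range spans four octaves.",
]

# Serialize the pair to a JSON array, then encode it as UTF-8 bytes.
payload = json.dumps(sentence_pair).encode("utf-8")
print(payload)

# json.dumps escapes non-ASCII characters by default (ensure_ascii=True),
# so the "é" travels as the escape sequence \u00e9 and the payload is pure ASCII.
assert payload.isascii()

# Decoding on the receiving side restores the original strings.
assert json.loads(payload.decode("utf-8")) == sentence_pair
```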

#### Query endpoint and parse response

In [12]:
newline, bold, unbold = "\n", "\033[1m", "\033[0m"

def query_endpoint(encoded_text):
    response = base_model_predictor.predict(
        encoded_text, {"ContentType": "application/list-text", "Accept": "application/json;verbose"}
    )
    return response


def parse_response(query_response):
    model_predictions = json.loads(query_response)
    probabilities, labels, predicted_label = (
        model_predictions["probabilities"],
        model_predictions["labels"],
        model_predictions["predicted_label"],
    )
    return probabilities, labels, predicted_label


for sentence_pair in [sentence_pair1, sentence_pair2]:
    query_response = query_endpoint(json.dumps(sentence_pair).encode("utf-8"))
    probabilities, labels, predicted_label = parse_response(query_response)
    print(
        f"Inference:{newline}"
        f"Input text: '{sentence_pair}'{newline}"
        f"Model prediction: {probabilities}{newline}"
        f"Labels: {labels}{newline}"
        f"Predicted Label: {bold}{predicted_label}{unbold}{newline}"
    )

Inference:
Input text: '['How many octaves does Beyonce have?', "Beyoncé's vocal range spans four octaves."]'
Model prediction: [2.8040263652801514, -3.48968768119812]
Labels: ['entail', 'no_entail']
Predicted Label: entail

Inference:
Input text: '['How many octaves does Beyonce have?', 'While another critic says she is a Vocal acrobat, being able to sing long and complex melismas and vocal runs effortlessly, and in key.']'
Model prediction: [-2.672070264816284, 3.6076502799987793]
Labels: ['entail', 'no_entail']
Predicted Label: no_entail
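The values printed under "Model prediction" fall outside the range [0, 1], so they look like raw scores (logits) rather than normalized probabilities, even though the response key is named `probabilities`. Assuming that reading is right, a softmax turns them into probabilities. The sketch below works on a mock response rebuilt from the first output above, so it needs no endpoint.

```python
import json
import math

# Mock response body, rebuilt from the first output printed above.
mock_response = json.dumps(
    {
        "probabilities": [2.8040263652801514, -3.48968768119812],
        "labels": ["entail", "no_entail"],
        "predicted_label": "entail",
    }
)


def softmax(scores):
    """Map raw scores to probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


prediction = json.loads(mock_response)
probs = softmax(prediction["probabilities"])
for label, p in zip(prediction["labels"], probs):
    print(f"{label}: {p:.4f}")
```

The label with the largest score also has the largest probability, so the predicted label is unchanged; the softmax only makes the confidence easier to read.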



#### Clean up the endpoint

In [13]:
# Delete the SageMaker endpoint and the attached resources
base_model_predictor.delete_model()
base_model_predictor.delete_endpoint()

### Fine-tune the pre-trained model on a custom dataset

Previously, we saw how to run inference on a pre-trained model that had been fine-tuned on the QNLI dataset. Next, we discuss how a model can be fine-tuned on a custom dataset.

The Text Embedding model can be fine-tuned on any sentence pair classification dataset, in the same way that the model available for inference was fine-tuned on the QNLI dataset. The model available for fine-tuning attaches a binary classification layer to the Text Embedding model and initializes the layer parameters to random values.
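To make the idea of a randomly initialized binary classification layer concrete, here is a minimal pure-Python sketch of a linear layer on top of an embedding vector. It is not the JumpStart training script: the function names, the Gaussian initialization with standard deviation 0.02, and the random stand-in embedding are all assumptions chosen for illustration.

```python
import random


def init_linear_head(embedding_dim, num_classes=2, scale=0.02, seed=0):
    """Create randomly initialized weights and zero biases for a linear layer."""
    rng = random.Random(seed)
    weights = [
        [rng.gauss(0.0, scale) for _ in range(embedding_dim)] for _ in range(num_classes)
    ]
    biases = [0.0] * num_classes
    return weights, biases


def classify(embedding, weights, biases):
    """Return one score per class: weights @ embedding + biases."""
    return [
        sum(w * x for w, x in zip(row, embedding)) + b for row, b in zip(weights, biases)
    ]


# Stand-in for the embedding of one sentence pair (768 is the hidden size of
# BERT base; the values here are random, not real model output).
rng = random.Random(1)
embedding = [rng.uniform(-1.0, 1.0) for _ in range(768)]

weights, biases = init_linear_head(embedding_dim=768)
scores = classify(embedding, weights, biases)
print(scores)  # two arbitrary scores: an untrained head carries no signal yet
```

Fine-tuning then adjusts these weights (and, unless only the top layer is trained, the embedding model underneath) so that the two scores separate the classes.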

#### Retrieve JumpStart Training artifacts

In [14]:
from sagemaker import image_uris, model_uris, script_uris, hyperparameters

model_id, model_version = model_dropdown.value, "*"
training_instance_type = "ml.p3.2xlarge"

# Retrieve the docker image
train_image_uri = image_uris.retrieve(
    region=None,
    framework=None,
    model_id=model_id,
    model_version=model_version,
    image_scope="training",
    instance_type=training_instance_type,
)
# Retrieve the training script
train_source_uri = script_uris.retrieve(
    model_id=model_id, model_version=model_version, script_scope="training"
)
# Retrieve the pre-trained model tarball to further fine-tune
train_model_uri = model_uris.retrieve(
    model_id=model_id, model_version=model_version, model_scope="training"
)

#### Set Training parameters

In [25]:
# Sample training data is available in this bucket
training_data_bucket = f"jumpstart-cache-prod-{aws_region}"
# For a quick demonstration of training, we have created a random subset of the QNLI dataset.
# For the complete QNLI dataset, replace "QNLI-tiny" with "QNLI" in the line below.
training_data_prefix = "training-datasets/QNLI-tiny/"

training_dataset_s3_path = f"s3://{training_data_bucket}/{training_data_prefix}"

output_bucket = sess.default_bucket()
output_prefix = "jumpstart-example-spc-training"

s3_output_location = f"s3://{output_bucket}/{output_prefix}/output"

In [26]:
from sagemaker import hyperparameters

# Retrieve the default hyper-parameters for fine-tuning the model
hyperparameters = hyperparameters.retrieve_default(model_id=model_id, model_version=model_version)

# [Optional] Override default hyperparameters with custom values
hyperparameters["batch-size"] = "64"
print(hyperparameters)

{'epochs': '3', 'adam-learning-rate': '2e-05', 'batch-size': '64', 'reinitialize-top-layer': 'Auto', 'train-only-top-layer': 'False'}


#### Train with Automatic Model Tuning
Amazon SageMaker automatic model tuning, also known as hyperparameter tuning, finds the best version of a model by running many training jobs on your dataset using the algorithm and ranges of hyperparameters that you specify. It then chooses the hyperparameter values that result in a model that performs the best, as measured by a metric that you choose. We will use a HyperparameterTuner object to interact with Amazon SageMaker hyperparameter tuning APIs.
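The objective metric is read from the training logs by applying a regular expression to each log line. As a small sketch, the code below applies the Hugging Face pattern from the metric definitions in the next cell to a made-up log line; the line itself is an assumption about what such output might look like, not a captured log.

```python
import re

# Same pattern as the "huggingface" entry in the metric definitions.
eval_accuracy_pattern = re.compile(r"'eval_accuracy': ([0-9\.]+)")

# Made-up log line in the style of a printed evaluation dictionary.
sample_log_line = "{'eval_loss': 0.41, 'eval_accuracy': 0.8125, 'epoch': 3.0}"

match = eval_accuracy_pattern.search(sample_log_line)
# The single capture group holds the numeric value of the metric.
eval_accuracy = float(match.group(1)) if match else None
print(eval_accuracy)
```

If the pattern never matches a log line, the tuner has no metric to compare jobs on, so it is worth checking the regex against real training output before launching a long tuning job.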

In [28]:
from sagemaker.tuner import ContinuousParameter

# Use AMT for tuning and selecting the best model
use_amt = True

# Define objective metric per framework, based on which the best model will be selected.
metric_definitions_per_model = {
    "tensorflow": {
        "metrics": [{"Name": "val_accuracy", "Regex": "val_accuracy: ([0-9\\.]+)"}],
        "type": "Maximize",
    },
    "huggingface": {
        "metrics": [{"Name": "eval_accuracy", "Regex": "'eval_accuracy': ([0-9\\.]+)"}],
        "type": "Maximize",
    },
}

# You can select from the hyperparameters supported by the model and configure the ranges of values to search when training the optimal model.
# See https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-define-ranges.html
hyperparameter_ranges = {
    "adam-learning-rate": ContinuousParameter(0.000001, 0.001, scaling_type="Logarithmic")
}

# Increase the total number of training jobs run by AMT for potentially higher accuracy (at the cost of longer training time).
max_jobs = 6
# Raise the number of parallel training jobs to reduce total training time, within your account limits.
# If max_parallel_jobs equals max_jobs, every job starts at once, so Bayesian search
# cannot learn from earlier results and behaves like random search.
max_parallel_jobs = 2
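The learning-rate range above spans three orders of magnitude, which is why it uses logarithmic scaling. The sketch below compares uniform draws on a linear scale with uniform draws on a log scale over the same bounds. It only illustrates the effect of the scale; it is not how AMT picks values, since its Bayesian strategy does not sample uniformly. The helper name `sample_log_uniform` is hypothetical.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)
low, high = 1e-6, 1e-3  # same bounds as the adam-learning-rate range above

linear_draws = [rng.uniform(low, high) for _ in range(10_000)]
log_draws = [sample_log_uniform(low, high, rng) for _ in range(10_000)]

# Fraction of draws that land in the lowest decade, [1e-6, 1e-5).
linear_share = sum(x < 1e-5 for x in linear_draws) / len(linear_draws)
log_share = sum(x < 1e-5 for x in log_draws) / len(log_draws)
print(f"linear: {linear_share:.3f}, logarithmic: {log_share:.3f}")
```

On a linear scale almost every draw falls in the top decade, so small learning rates are barely explored; on a log scale each decade receives roughly a third of the draws.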

#### Start Training
We start by creating the estimator object with all the required assets and then launch the training job.

In [29]:
from sagemaker.estimator import Estimator
from sagemaker.utils import name_from_base
from sagemaker.tuner import HyperparameterTuner

training_job_name = name_from_base(f"jumpstart-example-{model_id}-transfer-learning")

# Create SageMaker Estimator instance
spc_estimator = Estimator(
    role=aws_role,
    image_uri=train_image_uri,
    source_dir=train_source_uri,
    model_uri=train_model_uri,
    entry_point="transfer_learning.py",
    instance_count=1,
    instance_type=training_instance_type,
    max_run=360000,
    hyperparameters=hyperparameters,
    output_path=s3_output_location,
    base_job_name=training_job_name,
)

if use_amt:
    metric_definitions = next(
        value for key, value in metric_definitions_per_model.items() if model_id.startswith(key)
    )

    hp_tuner = HyperparameterTuner(
        spc_estimator,
        metric_definitions["metrics"][0]["Name"],
        hyperparameter_ranges,
        metric_definitions["metrics"],
        max_jobs=max_jobs,
        max_parallel_jobs=max_parallel_jobs,
        objective_type=metric_definitions["type"],
        base_tuning_job_name=training_job_name,
    )

    # Launch a SageMaker Tuning job to search for the best hyperparameters
    hp_tuner.fit({"training": training_dataset_s3_path})
else:
    # Launch a SageMaker Training job by passing s3 path of the training data
    spc_estimator.fit({"training": training_dataset_s3_path}, logs=True)

No finished training job found associated with this estimator. Please make sure this estimator is only used for building workflow config


ResourceLimitExceeded: An error occurred (ResourceLimitExceeded) when calling the CreateHyperParameterTuningJob operation: The account-level service limit 'ml.p3.2xlarge for training job usage' is 1 Instances, with current utilization of 0 Instances and a request delta of 2 Instances. Please contact AWS support to request an increase for this limit.
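The error above reports that the tuning job asked for 2 `ml.p3.2xlarge` instances while the account limit is 1. Besides requesting a limit increase, one way around it is to lower `max_parallel_jobs` so the request fits the quota. The helper below is a hypothetical sketch of that calculation; it takes the limit as a plain number (read here from the error message) and does not query AWS.

```python
def cap_parallel_jobs(requested_parallel_jobs, instances_per_job, instance_limit):
    """Largest number of concurrent jobs that fits within an instance quota."""
    if instances_per_job < 1:
        raise ValueError("instances_per_job must be at least 1")
    affordable = instance_limit // instances_per_job
    if affordable < 1:
        raise ValueError(
            f"quota of {instance_limit} instance(s) cannot fit a single job "
            f"that needs {instances_per_job}"
        )
    return min(requested_parallel_jobs, affordable)


# Values from this run: 2 parallel jobs requested, instance_count=1 per job,
# and an account limit of 1 ml.p3.2xlarge training instance (per the error).
safe_parallel_jobs = cap_parallel_jobs(
    requested_parallel_jobs=2, instances_per_job=1, instance_limit=1
)
print(safe_parallel_jobs)
```

With a single job running at a time the tuning job takes longer, but each new job can use the results of all earlier ones.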

#### Deploy & run inference on the fine-tuned model

In [None]:
inference_instance_type = "ml.m5.xlarge"

# Retrieve the inference docker container uri
deploy_image_uri = image_uris.retrieve(
    region=None,
    framework=None,
    image_scope="inference",
    model_id=model_id,
    model_version=model_version,
    instance_type=inference_instance_type,
)
# Retrieve the inference script uri
deploy_source_uri = script_uris.retrieve(
    model_id=model_id, model_version=model_version, script_scope="inference"
)

endpoint_name = name_from_base(f"jumpstart-example-FT-{model_id}-")

# Use the estimator from the previous step to deploy to a SageMaker endpoint
finetuned_predictor = (hp_tuner if use_amt else spc_estimator).deploy(
    initial_instance_count=1,
    instance_type=inference_instance_type,
    entry_point="inference.py",
    image_uri=deploy_image_uri,
    source_dir=deploy_source_uri,
    endpoint_name=endpoint_name,
)

In [18]:
sentence_pair1 = [
    "How many octaves does Beyonce have?",
    "Beyoncé's vocal range spans four octaves.",
]
sentence_pair2 = [
    "How many octaves does Beyonce have?",
    "While another critic says she is a "
    "Vocal acrobat, being able to sing long and complex melismas and vocal runs effortlessly, and in key.",
]

Next, we query the fine-tuned model, parse the response, and print the predictions.

In [None]:
newline, bold, unbold = "\n", "\033[1m", "\033[0m"


def query_endpoint(encoded_text):
    response = finetuned_predictor.predict(
        encoded_text, {"ContentType": "application/list-text", "Accept": "application/json;verbose"}
    )
    return response


def parse_response(query_response):
    model_predictions = json.loads(query_response)
    probabilities, labels, predicted_label = (
        model_predictions["probabilities"],
        model_predictions["labels"],
        model_predictions["predicted_label"],
    )
    return probabilities, labels, predicted_label


for sentence_pair in [sentence_pair1, sentence_pair2]:
    query_response = query_endpoint(json.dumps(sentence_pair).encode("utf-8"))
    probabilities, labels, predicted_label = parse_response(query_response)
    print(
        f"Inference:{newline}"
        f"Input text: '{sentence_pair}'{newline}"
        f"Model prediction: {probabilities}{newline}"
        f"Labels: {labels}{newline}"
        f"Predicted Label: {bold}{predicted_label}{unbold}{newline}"
    )

#### Next, we clean up the deployed endpoint

In [None]:
# Delete the SageMaker endpoint and the attached resources
finetuned_predictor.delete_model()
finetuned_predictor.delete_endpoint()