Copyright (C) 2022 Intel Corporation
 
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
 
http://www.apache.org/licenses/LICENSE-2.0
 
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions
and limitations under the License.
 

SPDX-License-Identifier: Apache-2.0

# General Description

Version: 1.1 Date: Oct 28, 2022

This notebook outlines how to use a cloud training platform, Amazon SageMaker, with Intel CPUs, Intel-optimized TensorFlow, and Intel Neural Compressor. A BERT model is fine-tuned with the Hugging Face framework, and the workflow produces a quantized (INT8) BERT model.

Users may adapt parts of the code to suit their own purposes.

# Prerequisite

1. Set up the AWS credentials (e.g., with `aws configure`) in:
    i. the container
    ii. the Docker host environment
    
2. Set the notebook kernel to use 'Python 3 (ipykernel)'

# Step 0: Specify the AWS information (Optional)

Users may specify the AWS information in the ./config/sagemaker_config.yaml file to pre-fill the values required by the workflow. Alternatively, they may fill in the necessary fields when executing the steps.

In [None]:
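import os
import yaml
# Step 0 is optional: if the file is missing, fill in the fields in Step 2 instead
config_path = './config/sagemaker_config.yaml'
read_from_yaml = os.path.exists(config_path)
if read_from_yaml:
    with open(config_path) as f:
        config_dict = yaml.safe_load(f)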
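
The cell above loads the YAML file into `config_dict`, and Step 2 later reads the `role` and `training_image_uri` keys from it. As a minimal sketch, the loaded dictionary could be checked before the training step starts. The helper name `missing_config_keys` and the placeholder role value are illustrative assumptions, not part of the original workflow.

```python
# Hypothetical helper: 'role' and 'training_image_uri' are the only keys
# that this notebook reads from the config file.
REQUIRED_KEYS = ("role", "training_image_uri")

def missing_config_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in the loaded config."""
    config = config or {}  # yaml.safe_load returns None for an empty file
    return [key for key in required if not config.get(key)]

# Example with a placeholder value (not a real AWS identifier)
example_config = {"role": "AmazonSageMaker-ExecutionRole-xxxxxxxxxxxxxx"}
print(missing_config_keys(example_config))
```

An empty list means both values are present, so the manual fields in Step 2 are not needed.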

# Step 1: Build a custom docker image for training

1. Copy the contents of "../src/sagemaker_training_container" to a location outside the Docker container.
2. Modify the AWS credentials in build_and_push.sh. Pay attention to the region, account number, and algorithm_name, and to any firewall restrictions.
3. Run build_and_push.sh to build the custom Docker image for training.

Note: Users may edit "train.py" to change the training task, use a different BERT model, or change the behavior of Intel Neural Compressor.
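
The region, account number, and algorithm_name mentioned above together determine the image URI that Step 2 expects. The sketch below shows how the three values combine in the standard Amazon ECR URI format. It assumes build_and_push.sh pushes the image under this name with the `latest` tag. The function name and the account number `123456789012` are placeholders for illustration only.

```python
def ecr_image_uri(account_id, region, algorithm_name, tag="latest"):
    """Compose an ECR image URI from its parts (standard ECR naming scheme)."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{algorithm_name}:{tag}"

# Placeholder account number; replace with your own values
print(ecr_image_uri("123456789012", "us-west-2", "sagemaker-inteltf-huggingface-inc"))
```

The resulting string is the kind of value expected for `training_image_uri` in the config file or for `image_uri` in Step 2.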

# Step 2: Run the training code using SageMaker
Users may change the instance type and the number of cluster nodes used for training.

To do so, change the two variables 'target_instance_type' and 'num_of_nodes'.

List of EC2 instances: https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-available-instance-types.html




In [None]:
import sagemaker
from sagemaker.tensorflow import TensorFlow

sess = sagemaker.Session()
sagemaker_session_bucket=None
if sagemaker_session_bucket is None and sess is not None:
    sagemaker_session_bucket = sess.default_bucket()

if read_from_yaml:
    role = config_dict['role']
    image_uri = config_dict['training_image_uri']
else:
    role = '' #e.g.: AmazonSageMaker-ExecutionRole-xxxxxxxxxxxxxx
    image_uri = '' #e.g.: xxxxxxxxxxxxx.dkr.ecr.us-west-2.amazonaws.com/sagemaker-inteltf-huggingface-inc:latest
    
sess = sagemaker.Session(default_bucket=sagemaker_session_bucket)

print(f"sagemaker role arn: {role}")
print(f"sagemaker bucket: {sess.default_bucket()}")
print(f"sagemaker session region: {sess.boto_region_name}")

# Users may change the following two parameters to create the cluster they need
target_instance_type = 'ml.c5.18xlarge'
num_of_nodes = 3

# Training with Horovod: https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/using_tf.html#training-with-horovod
# entry_point is only a dummy Python file needed to enable this API. To change the
# training behavior, edit train.py and rebuild the Docker image by running build_and_push.sh.
tensorflow_estimator = TensorFlow(
    entry_point='../src/sagemaker_training_container/dummy.py',
    instance_type=target_instance_type,
    instance_count=num_of_nodes,
    image_uri=image_uri,  # e.g.: xxxxxxxxxxxxx.dkr.ecr.us-west-2.amazonaws.com/sagemaker-inteltf-huggingface-inc:latest
    role=role,
    hyperparameters={
        'epochs': 5,
        'train_batch_size': 32,
        'model_name': 'bert-base-uncased',
    },
    #script_mode=True,
    distribution={"mpi": {"enabled": True}},
)
tensorflow_estimator.fit()
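
The `hyperparameters` dictionary above has to be consumed by the training script inside the container. This notebook does not show how train.py does that. One common pattern, sketched below as an assumption, is to parse the values as `--name value` command-line arguments with `argparse`. The function name `parse_hyperparameters` is hypothetical, and the defaults simply mirror the values used above.

```python
import argparse

def parse_hyperparameters(argv=None):
    """Parse the three hyperparameters used in this notebook from CLI-style arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, default=5)
    parser.add_argument("--train_batch_size", type=int, default=32)
    parser.add_argument("--model_name", type=str, default="bert-base-uncased")
    # parse_known_args tolerates extra arguments that the platform may append
    args, _unknown = parser.parse_known_args(argv)
    return args

# Simulated argument list; a real training script would pass argv=None to read sys.argv
args = parse_hyperparameters(["--epochs", "3", "--model_name", "bert-base-uncased"])
print(args.epochs, args.train_batch_size, args.model_name)
```

Arguments that are not supplied fall back to their defaults, so `train_batch_size` stays at 32 in this example.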

# Step 3: Convert the quantized model into SavedModel format

- Download the S3 model artifact manually
      Open the Amazon SageMaker console and click "Training Jobs". Find the training job with the same name as the one above and download its S3 model artifact (model.tar.gz). Extract model.tar.gz and upload ptq_model.pb, the post-training-quantized model, to the same directory as this notebook.
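
Once model.tar.gz has been downloaded, the extraction step can also be scripted. The sketch below uses only the Python standard library and searches the archive by file name, because the location of ptq_model.pb inside the archive is not stated here. The function name `extract_file` is a hypothetical helper, not part of the original workflow.

```python
import os
import shutil
import tarfile

def extract_file(archive_path, file_name, dest_dir="."):
    """Copy the first archive member whose base name is file_name into dest_dir."""
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile() and os.path.basename(member.name) == file_name:
                target = os.path.join(dest_dir, file_name)
                # Stream the member to disk instead of loading it all into memory
                with archive.extractfile(member) as src, open(target, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                return target
    raise FileNotFoundError(f"{file_name} not found in {archive_path}")
```

Calling `extract_file("./model.tar.gz", "ptq_model.pb")` would place the file next to this notebook, which is where the conversion cell looks for it.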

In [None]:
# Deployment using TensorFlowModel provided by SageMaker:
# convert the quantized pb model into SavedModel format
import tensorflow.compat.v1 as tf
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import tag_constants
tf.disable_v2_behavior()

def convert_pb_to_savedmodel(pb_model_path, output_dir):
    # Read the pb model (tf already refers to the compat.v1 module here)
    graph_def = tf.GraphDef()
    with open(pb_model_path, 'rb') as f:
        graph_def.ParseFromString(f.read())

    #Save the BERT pb model into SavedModel
    builder, sigs = tf.saved_model.builder.SavedModelBuilder(output_dir), {}
    with tf.Session(graph=tf.Graph()) as sess:
        tf.import_graph_def(graph_def, name="")
        g = tf.get_default_graph()
        in1 = g.get_tensor_by_name('attention_mask:0')
        in2 = g.get_tensor_by_name('input_ids:0')
        in3 = g.get_tensor_by_name('token_type_ids:0')
        out = g.get_tensor_by_name('Identity:0')
        sigs[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY] = \
            tf.saved_model.signature_def_utils.predict_signature_def(
                {"attention_mask": in1, "input_ids": in2,  "token_type_ids": in3}, {"Identity": out})

        builder.add_meta_graph_and_variables(sess,
                                            [tag_constants.SERVING],
                                            signature_def_map=sigs)
        builder.save()
    return

#Retrieve the pb from the SageMaker Training Job
pb_model_path = "./ptq_model.pb"
savedmodel_output_dir = "./ptq_model_savedmodel/saved_model/1/"

convert_pb_to_savedmodel(pb_model_path, savedmodel_output_dir)
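
The output path above ends in a numeric directory (`1`), which follows the versioned layout that TensorFlow Serving uses for SavedModels. As a quick sanity check after the conversion, the sketch below lists the numeric version directories that actually contain a `saved_model.pb` file. The helper name `savedmodel_versions` is an illustrative assumption.

```python
import os

def savedmodel_versions(model_base_dir):
    """List numeric version subdirectories that contain a saved_model.pb file."""
    versions = []
    for name in os.listdir(model_base_dir):
        candidate = os.path.join(model_base_dir, name, "saved_model.pb")
        if name.isdigit() and os.path.isfile(candidate):
            versions.append(name)
    # Sort numerically so that "10" comes after "2"
    return sorted(versions, key=int)
```

After the conversion cell has run, `savedmodel_versions("./ptq_model_savedmodel/saved_model")` would be expected to include `"1"`.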

# Step 4: Compress the quantized model into a tar.gz archive

In [None]:
!tar -czf ./model.tar.gz ./ptq_model_savedmodel/
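
Where the `tar` command is not available, roughly the same archive can be produced with Python's `tarfile` module. The sketch below is an alternative under that assumption, not the original method. The function names are hypothetical, and member names in the resulting archive lack the leading `./` that the shell command produces.

```python
import os
import tarfile

def make_tar_gz(source_dir, archive_path):
    """Write source_dir (recursively) into a gzip-compressed tar archive."""
    arcname = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source_dir, arcname=arcname)
    return archive_path

def list_tar_gz(archive_path):
    """Return the member names of a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()
```

For example, `make_tar_gz("./ptq_model_savedmodel", "./model.tar.gz")` followed by `list_tar_gz("./model.tar.gz")` shows whether `saved_model.pb` ended up in the archive before it is uploaded.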

# Step 5: Upload the quantized model to S3

In [None]:
from sagemaker.session import Session

model_data = Session().upload_data(path="./model.tar.gz", key_prefix="model")
print(f"model uploaded to: {model_data}")
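
`upload_data` returns the S3 URI of the uploaded file. For a single file, this has the form `s3://{bucket}/{key_prefix}/{file name}`. The sketch below reconstructs that URI and splits it back into bucket and key, which can be handy when the location is needed by other tools. The function names and the bucket name `example-bucket` are placeholders for illustration.

```python
import posixpath
from urllib.parse import urlparse

def expected_s3_uri(bucket, key_prefix, local_path):
    """Build the S3 URI expected for a single uploaded file."""
    file_name = posixpath.basename(local_path)
    return f"s3://{bucket}/{key_prefix}/{file_name}"

def split_s3_uri(uri):
    """Split an s3:// URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")

# Placeholder bucket name; the real one is the session's default bucket
uri = expected_s3_uri("example-bucket", "model", "./model.tar.gz")
print(uri)
print(split_s3_uri(uri))
```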