# PaddlePaddle BYOS

## Pre-requisites

This notebook shows how to use the SageMaker Python SDK to run your code in a local container before deploying to SageMaker's managed training or hosting environments. This can speed up iterative testing and debugging while using the same familiar Python SDK interface. Just change your estimator's `instance_type` to `local` (or `local_gpu` if you're using an ml.p2 or ml.p3 notebook instance).

In order to use this feature you'll need to install docker-compose (and nvidia-docker if training with a GPU).

**Note, you can only run a single local notebook at one time.**
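
Before switching to local mode, it can help to confirm that the tools listed above are actually on your `PATH`. The following is a minimal sketch, assuming the tools are installed as standalone executables named `docker`, `docker-compose`, and `nvidia-docker`. Newer installations may expose them differently (for example as a Docker plugin), in which case this check will report them as missing. The helper names `find_tools` and `missing_tools` are illustrative and not part of the SageMaker SDK.

```python
import shutil

# Executables local mode relies on (assumed names); nvidia-docker only matters for 'local_gpu'.
REQUIRED_TOOLS = ["docker", "docker-compose"]
GPU_TOOLS = ["nvidia-docker"]


def find_tools(names):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(use_gpu=False):
    """Return the names of the required executables that are not on PATH."""
    names = REQUIRED_TOOLS + (GPU_TOOLS if use_gpu else [])
    return [name for name, path in find_tools(names).items() if path is None]


missing = missing_tools(use_gpu=False)
print("Missing tools:", missing or "none")
```

If the list is not empty, run the setup script in the next cell (or install the tools manually) before continuing.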

In [None]:
# !/bin/bash ./utils/setup.sh

## Overview

The **SageMaker Python SDK** helps you deploy your models for training and hosting in optimized, production-ready containers in SageMaker. The SageMaker Python SDK is easy to use, modular, extensible and compatible with TensorFlow, MXNet, PyTorch and Chainer. This tutorial uses the SDK's **PyTorch estimator** to run PaddleNLP training scripts (`finetune.py` and `run_seq2struct.py`) on the DuUIE data, first in local mode and then on a GPU training instance.

### Set up the environment

This notebook was created and tested on a single ml.p2.xlarge notebook instance.

Let's start by specifying:

- The S3 bucket and prefix that you want to use for training and model data. This should be within the same region as the notebook instance, training, and hosting.
- The IAM role ARN used to give training and hosting access to your data. See the documentation for how to create these. Note: if more than one role is required for notebook instances, training, and/or hosting, replace `sagemaker.get_execution_role()` with the appropriate full IAM role ARN string(s).

In [3]:
import os
import sagemaker

sagemaker_session = sagemaker.Session()

bucket = sagemaker_session.default_bucket()
prefix = 'sagemaker/DEMO-PaddleNLP-DuUIE'

role = sagemaker.get_execution_role()

In [None]:
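# import shutil
# import subprocess

# instance_type = 'local'

# # Switch to GPU local mode only if nvidia-smi is installed and runs successfully.
# # Checking shutil.which first avoids a FileNotFoundError on CPU-only instances.
# if shutil.which('nvidia-smi') and subprocess.call('nvidia-smi') == 0:
#     instance_type = 'local_gpu'

# print("Instance type = " + instance_type)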

### Upload the data
We use the `sagemaker.Session.upload_data` function to upload our datasets to an S3 location. The return value identifies that location; we pass it to the training job later as `inputs`.
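
To make the shape of that location concrete, here is a small sketch of how the S3 URI for a directory upload is composed from a bucket and a key prefix, matching the `s3://<bucket>/<prefix>` value printed later in this notebook. The helpers `s3_uri` and `split_s3_uri` are hypothetical and only illustrate the string format; they do not call S3, and `example-bucket` is a placeholder for the value returned by `sagemaker_session.default_bucket()`.

```python
def s3_uri(bucket, key_prefix):
    """Compose the S3 URI for a directory uploaded under key_prefix."""
    return "s3://{}/{}".format(bucket, key_prefix.strip("/"))


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    if not uri.startswith("s3://"):
        raise ValueError("not an S3 URI: {!r}".format(uri))
    bucket, _, key = uri[len("s3://"):].partition("/")
    return bucket, key


# 'example-bucket' is a placeholder bucket name.
data_location = s3_uri("example-bucket", "sagemaker/DEMO-PaddleNLP-DuUIE")
inputs = {"training": data_location}
print(inputs)
```

The dictionary key (`training` here) names the input channel that the training job will read from.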

In [None]:
# base_dir = 'file:///home/ec2-user/SageMaker/paddlenlp_sagemaker/data/'
# inputs = {'training': base_dir}
# print(inputs)

## Script Functions

SageMaker invokes the main function defined within your training script for training. When you deploy the trained model to an endpoint, `model_fn()` is called to load it. `model_fn()` and the other functions listed below are called to serve predictions on SageMaker.

### [Predicting Functions](https://github.com/aws/sagemaker-pytorch-containers/blob/master/src/sagemaker_pytorch_container/serving.py)
* model_fn(model_dir) - loads your model.
* input_fn(serialized_input_data, content_type) - deserializes the request data and passes it to predict_fn.
* output_fn(prediction_output, accept) - serializes predictions from predict_fn.
* predict_fn(input_data, model) - calls a model on data deserialized in input_fn.

`model_fn()` is the only function without a default implementation; you must provide it to use PyTorch on SageMaker.
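
The sketch below shows how the four functions fit together for a JSON request. It is not the `infer.py` used by this notebook, which is not shown here. `EchoLengthModel` is a hypothetical stand-in that only returns the length of each input text, so the example runs without any model files; a real `model_fn` would load the trained weights from `model_dir`.

```python
import json


class EchoLengthModel:
    """Hypothetical stand-in for a trained model: reports the length of each text."""

    def predict(self, texts):
        return [{"text": t, "length": len(t)} for t in texts]


def model_fn(model_dir):
    """Load the model. A real script would read weights from model_dir."""
    return EchoLengthModel()


def input_fn(serialized_input_data, content_type):
    """Deserialize the request body into a list of strings."""
    if content_type != "application/json":
        raise ValueError("Unsupported content type: {}".format(content_type))
    data = json.loads(serialized_input_data)
    return [data] if isinstance(data, str) else list(data)


def predict_fn(input_data, model):
    """Run the model on the deserialized input."""
    return model.predict(input_data)


def output_fn(prediction_output, accept):
    """Serialize the predictions for the response."""
    if accept != "application/json":
        raise ValueError("Unsupported accept type: {}".format(accept))
    return json.dumps(prediction_output, ensure_ascii=False)


# Simulate one request through the same chain the serving container follows.
model = model_fn("/opt/ml/model")
body = json.dumps(["原告：张三"])
response = output_fn(predict_fn(input_fn(body, "application/json"), model), "application/json")
print(response)
```

Keeping deserialization, prediction, and serialization in separate functions lets you change the request format without touching the model code.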

## Create a training job using the sagemaker.PyTorch estimator

The `PyTorch` class allows us to run our training function on SageMaker. We need to configure it with our training script, an IAM role, the number of training instances, and the training instance type. For local training with a GPU, we could set this to "local_gpu". In this case, `instance_type` was set above based on whether you're running a GPU instance.

After we've constructed our `PyTorch` object, we fit it using the data we uploaded to S3. Even though we're in local mode, using S3 as our data source makes sense because it maintains consistency with how SageMaker's distributed, managed training ingests data.


In [None]:
# from sagemaker.pytorch import PyTorch

# # git_config = {'repo': 'https://github.com/PaddlePaddle/PaddleNLP.git', 'branch': 'develop'}

# hyperparameters = {'train_path': '/opt/ml/input/data/training/train.txt', 
#                    'dev_path': '/opt/ml/input/data/training/dev.txt', 
#                    'save_dir': '/opt/ml/model', 
#                    'learning_rate': 1e-5, 
#                    'batch_size': 16, 
#                    'max_seq_len':512, 
#                    'num_epochs': 100, 
#                    'model': 'uie-base',
#                    'seed': 1000,
#                    'logging_steps': 10,
#                    'valid_steps': 100,
#                    'device': 'gpu'}

# estimator = PyTorch(entry_point='finetune.py',
#                             source_dir='./',
# #                             source_dir='model_zoo/uie/',
# #                             git_config=git_config,
#                             role=role,
#                             hyperparameters=hyperparameters,
#                             framework_version='1.9.1',
#                             py_version='py38',
#                             script_mode=True,
#                             instance_count=1,  # 1 or 2 or ...
#                             instance_type=instance_type)

# estimator.fit(inputs)
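
In script mode, SageMaker passes each entry of `hyperparameters` to the entry point as a command-line argument (`--key value`) and exposes locations such as the model directory and input channels through environment variables (`SM_MODEL_DIR`, `SM_CHANNEL_TRAINING`). The following is a minimal sketch of how a training script might read them. It is an illustration only, not the actual `finetune.py`, and it declares just a subset of the hyperparameters shown above.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse a subset of the hyperparameters, falling back to SageMaker's default paths."""
    channel_dir = os.environ.get("SM_CHANNEL_TRAINING", "/opt/ml/input/data/training")
    parser = argparse.ArgumentParser()
    parser.add_argument("--train_path", type=str, default=os.path.join(channel_dir, "train.txt"))
    parser.add_argument("--save_dir", type=str, default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"))
    parser.add_argument("--learning_rate", type=float, default=1e-5)
    parser.add_argument("--batch_size", type=int, default=16)
    parser.add_argument("--num_epochs", type=int, default=100)
    parser.add_argument("--device", choices=["cpu", "gpu"], default="gpu")
    # Ignore hyperparameters that this sketch does not declare.
    args, _unknown = parser.parse_known_args(argv)
    return args


# Simulate the arguments SageMaker would pass for a few of the hyperparameters.
args = parse_args(["--learning_rate", "1e-5", "--batch_size", "16",
                   "--num_epochs", "100", "--device", "gpu", "--seed", "1000"])
print(args.learning_rate, args.batch_size, args.num_epochs, args.device)
```

Anything the script writes under the model directory is packaged into `model.tar.gz` when the job finishes.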

## SageMaker Training using GPU instance

In [5]:
WORK_DIRECTORY = '/home/ec2-user/SageMaker/DuUIE/data/'

# data_location = sagemaker_session.upload_data(WORK_DIRECTORY, key_prefix=prefix)
data_location = 's3://sagemaker-us-east-1-579019700964/sagemaker/DEMO-PaddleNLP-DuUIE'

inputs = {'training': data_location}

print(inputs)

{'training': 's3://sagemaker-us-east-1-579019700964/sagemaker/DEMO-PaddleNLP-DuUIE'}


In [7]:
# !aws s3 cp --recursive config $data_location/config
# !aws s3 cp --recursive uie-char-small $data_location/uie-char-small

upload: config/.ipynb_checkpoints/multi-task-duuie-checkpoint.yaml to s3://sagemaker-us-east-1-579019700964/sagemaker/DEMO-PaddleNLP-DuUIE/config/.ipynb_checkpoints/multi-task-duuie-checkpoint.yaml
upload: config/multi-task-duuie.yaml to s3://sagemaker-us-east-1-579019700964/sagemaker/DEMO-PaddleNLP-DuUIE/config/multi-task-duuie.yaml
upload: uie-char-small/vocab.txt to s3://sagemaker-us-east-1-579019700964/sagemaker/DEMO-PaddleNLP-DuUIE/uie-char-small/vocab.txt
upload: uie-char-small/model_config.json to s3://sagemaker-us-east-1-579019700964/sagemaker/DEMO-PaddleNLP-DuUIE/uie-char-small/model_config.json
upload: uie-char-small/tokenizer_config.json to s3://sagemaker-us-east-1-579019700964/sagemaker/DEMO-PaddleNLP-DuUIE/uie-char-small/tokenizer_config.json
upload: uie-char-small/model_state.pdparams to s3://sagemaker-us-east-1-579019700964/sagemaker/DEMO-PaddleNLP-DuUIE/uie-char-small/model_state.pdparams


In [10]:
# !aws s3 ls $data_location/

                           PRE .ipynb_checkpoints/
                           PRE config/
                           PRE duuie/
                           PRE duuie_pre/
                           PRE seen_schema/
                           PRE uie-char-small/
2022-06-09 07:53:20  102605243 duuie.zip
2022-06-09 07:53:20    1979357 duuie_test_a.json
2022-06-09 07:53:20     798571 duuie_test_a.zip
2022-06-09 07:53:20       6846 seen_schema.zip
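
Each object under the S3 prefix is copied into the training container under `/opt/ml/input/data/<channel>/`, keeping its path relative to the prefix. That is why the hyperparameters in the next cell point at `/opt/ml/input/data/training/config/...` and `/opt/ml/input/data/training/uie-char-small`. The helper below is a hypothetical illustration of that mapping; it only manipulates strings and does not access S3.

```python
import posixpath


def container_path(s3_key, key_prefix, channel="training"):
    """Map an S3 object key under key_prefix to its path inside the training container."""
    base = key_prefix.strip("/") + "/"
    if not s3_key.startswith(base):
        raise ValueError("{!r} is not under prefix {!r}".format(s3_key, key_prefix))
    return posixpath.join("/opt/ml/input/data", channel, s3_key[len(base):])


prefix = "sagemaker/DEMO-PaddleNLP-DuUIE"
config_path = container_path(prefix + "/config/multi-task-duuie.yaml", prefix)
vocab_path = container_path(prefix + "/uie-char-small/vocab.txt", prefix)
print(config_path)
print(vocab_path)
```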


In [None]:
from sagemaker.pytorch import PyTorch

hyperparameters = {'multi_task_config': '/opt/ml/input/data/training/config/multi-task-duuie.yaml',
                   'negative_keep': 1.0,
                   'do_train': '',
                   'metric_for_best_model': 'all-task-ave',
                   'model_name_or_path': '/opt/ml/input/data/training/uie-char-small',
                   'num_train_epochs': 10,
                   'per_device_train_batch_size': 16,  # 32
                   'per_device_eval_batch_size': 128,  # 256
                   'output_dir': '/opt/ml/model/duuie_multi_task_b32_lr5e-4',
                   'logging_dir': '/opt/ml/output/duuie_multi_task_b32_lr5e-4_log',
                   'learning_rate': 5e-4,
                   'overwrite_output_dir': '',
                   'gradient_accumulation_steps': 1,
                   'device': 'gpu'}

instance_type = 'ml.p3.2xlarge'  # 'ml.p3.2xlarge' or 'ml.p3.8xlarge' or ...

# git_config = {'repo': 'https://github.com/PaddlePaddle/PaddleNLP.git', 'branch': 'develop'}

estimator = PyTorch(entry_point='run_seq2struct.py',
                    source_dir='./',
                    # source_dir='model_zoo/uie/',
                    # git_config=git_config,
                    role=role,
                    hyperparameters=hyperparameters,
                    framework_version='1.9.1',
                    py_version='py38',
                    script_mode=True,
                    instance_count=1,  # 1 or 2 or ...
                    instance_type=instance_type)

estimator.fit(inputs)

2022-06-09 10:16:25 Starting - Starting the training job...
2022-06-09 10:16:51 Starting - Preparing the instances for training
ProfilerReport-1654769785: InProgress
.........
2022-06-09 10:18:15 Downloading - Downloading input data...
2022-06-09 10:18:50 Training - Downloading the training image..........................
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2022-06-09 10:23:05,414 sagemaker-training-toolkit INFO     Imported framework sagemaker_pytorch_container.training
2022-06-09 10:23:05,440 sagemaker_pytorch_container.training INFO     Block until all host DNS lookups succeed.
2022-06-09 10:23:05,448 sagemaker_pytorch_container.training INFO     Invoking user training script.
2022-06-09 10:23:06,074 sagemaker-training-toolkit INFO     Installing dependencies from requirements.txt:
/opt/conda/bin/python -m pip install -r requirements.txt
Colle

In [None]:
training_job_name = estimator.latest_training_job.name
# training_job_name = 'pytorch-training-2022-06-07-03-39-32-658'
print(training_job_name)

## Deploy the trained model to prepare for predictions

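The `deploy()` method creates an endpoint that serves prediction requests in real time. Here the endpoint runs on an `ml.m5.xlarge` instance; set `instance_type` to `local` to host it in a local container instead.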

In [None]:
!rm -rf model.tar.gz
!rm -rf model_*
!rm -rf inference.*
!aws s3 cp s3://$bucket/$training_job_name/output/model.tar.gz .
!tar -xvf model.tar.gz

In [None]:
!cp inference.* model/
!cd model && tar -czvf ../model-inference.tar.gz *

!aws s3 cp model-inference.tar.gz s3://$bucket/$training_job_name/output/model-inference.tar.gz
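
The repackaging step above can also be done from Python with the standard `tarfile` module. This is a sketch under stated assumptions: the helper `repack_model` is hypothetical, and the file names in the demo (`model_state.pdparams`, `inference.pdmodel`) are placeholders written to a temporary directory rather than the real training output. Like the shell `*` glob, it skips hidden files and places the remaining files at the root of the archive.

```python
import os
import shutil
import tarfile
import tempfile


def repack_model(model_dir, extra_files, output_path):
    """Copy extra_files into model_dir, then archive its contents at the tarball root."""
    for path in extra_files:
        shutil.copy(path, model_dir)
    with tarfile.open(output_path, "w:gz") as tar:
        for name in sorted(os.listdir(model_dir)):
            if name.startswith("."):
                continue  # mirror the shell glob, which skips hidden files
            tar.add(os.path.join(model_dir, name), arcname=name)
    return output_path


# Demo on throwaway files; the names are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = os.path.join(tmp, "model")
    os.makedirs(model_dir)
    with open(os.path.join(model_dir, "model_state.pdparams"), "w") as f:
        f.write("weights")
    extra = os.path.join(tmp, "inference.pdmodel")
    with open(extra, "w") as f:
        f.write("graph")
    archive = repack_model(model_dir, [extra], os.path.join(tmp, "model-inference.tar.gz"))
    with tarfile.open(archive) as tar:
        members = sorted(tar.getnames())

print(members)
```

The resulting archive would still need to be uploaded to S3 (for example with `aws s3 cp`, as above) before it can be used as `model_data`.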

In [None]:
# instance_type = 'local'
instance_type = 'ml.m5.xlarge'

# predictor = estimator.deploy(initial_instance_count=1, instance_type=instance_type)

from sagemaker.pytorch.model import PyTorchModel

pytorch_model = PyTorchModel(model_data='s3://{}/{}/output/model-inference.tar.gz'.format(bucket, training_job_name), role=role,
                             entry_point='infer.py', framework_version='1.9.1', py_version='py38')

predictor = pytorch_model.deploy(instance_type=instance_type, initial_instance_count=1)

## Invoking the endpoint

In [1]:
import sagemaker
from sagemaker.predictor import Predictor
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

sagemaker_session = sagemaker.Session()
endpointName = 'pytorch-inference-2022-06-07-08-04-23-851'

predictor = Predictor(endpointName, sagemaker_session=sagemaker_session, serializer=JSONSerializer(), deserializer=JSONDeserializer())

In [3]:
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

predictor.serializer = JSONSerializer()
predictor.deserializer = JSONDeserializer()

texts = ['"北京市海淀区人民法院\n民事判决书\n(199x)建初字第xxx号\n原告：张三。\n委托代理人李四，北京市 A律师事务所律师。\n被告：B公司，法定代表人王五，开发公司总经理。\n委托代理人赵六，北京市 C律师事务所律师。"', 
         '原告赵六，2022年5月29日生\n委托代理人孙七，深圳市C律师事务所律师。\n被告周八，1990年7月28日出生\n委托代理人吴九，山东D律师事务所律师']

outputs = predictor.predict(texts)
print('outputs: ', outputs)

outputs:  [{'行业': [{'text': '律师', 'start': 53, 'end': 55, 'probability': 0.5180757641792297}, {'text': '开发公司', 'start': 77, 'end': 81, 'probability': 0.5001093149185181}], '地域': [{'text': '北京市', 'start': 48, 'end': 51, 'probability': 0.6983951330184937}, {'text': '北京市', 'start': 1, 'end': 4, 'probability': 0.5646492838859558}], '组织形式': [{'text': '开发公司', 'start': 77, 'end': 81, 'probability': 0.4969618618488312}, {'text': '法院', 'start': 9, 'end': 11, 'probability': 0.5218775868415833}], '商号': [{'text': '建初', 'start': 24, 'end': 26, 'probability': 0.6158205270767212}]}, {'地域': [{'text': '山东', 'start': 64, 'end': 66, 'probability': 0.8744303584098816}, {'text': '深圳市', 'start': 25, 'end': 28, 'probability': 0.9532861113548279}], '组织形式': [{'text': '事务所', 'start': 31, 'end': 34, 'probability': 0.5747163891792297}], '商号': [{'text': 'C', 'start': 28, 'end': 29, 'probability': 0.5181689858436584}]}]
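
Each result is a dictionary that maps a label (for example `地域`, region) to a list of spans with `text`, `start`, `end`, and `probability` fields. Many of the spans above have probabilities close to 0.5, so you may want to filter them before using the results. The following sketch keeps only spans at or above a threshold and sorts them by position. The threshold of 0.6 is an arbitrary assumption for illustration, and `filter_spans` is a hypothetical helper, not part of PaddleNLP or the SageMaker SDK.

```python
def filter_spans(result, min_probability=0.6):
    """Keep spans at or above min_probability, sorted by start offset; drop empty labels."""
    filtered = {}
    for label, spans in result.items():
        kept = sorted(
            (span for span in spans if span["probability"] >= min_probability),
            key=lambda span: span["start"],
        )
        if kept:
            filtered[label] = kept
    return filtered


# Copy of the second result printed above.
sample = {
    "地域": [
        {"text": "山东", "start": 64, "end": 66, "probability": 0.8744303584098816},
        {"text": "深圳市", "start": 25, "end": 28, "probability": 0.9532861113548279},
    ],
    "组织形式": [{"text": "事务所", "start": 31, "end": 34, "probability": 0.5747163891792297}],
    "商号": [{"text": "C", "start": 28, "end": 29, "probability": 0.5181689858436584}],
}

confident = filter_spans(sample, min_probability=0.6)
print([span["text"] for span in confident["地域"]])
```

With the real endpoint you would apply it to every element of `outputs`, for example `[filter_spans(r) for r in outputs]`.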


## Clean-up

Delete the endpoint when you're finished. A hosted endpoint incurs charges for as long as it is running, and in local mode you can only run one local endpoint at a time.

In [None]:
# estimator.delete_endpoint()
predictor.delete_endpoint()