# Huggingface Sagemaker-sdk - Run a batch transform inference job with 🤗 Transformers


1. [Introduction](#Introduction)  
2. [Run Batch Transform after training a model](#Run-Batch-Transform-after-training-a-model)  
3. [Run Batch Transform Inference Job with a fine-tuned model using `jsonl`](#Run-Batch-Transform-Inference-Job-with-a-fine-tuned-model-using-jsonl)   

## Introduction

Welcome to this getting started guide. We will use the new Hugging Face Inference DLCs and the Amazon SageMaker Python SDK to run batch transform inference jobs with 🤗 Transformers models. 
In the first example, we create a batch transform job from a model trained with the Hugging Face estimator.
In the second example, we run a batch transform job on a `jsonl` input file with a fine-tuned model from the [Hub](https://huggingface.co/models).

## Run Batch Transform after training a model 
_not included in the notebook_

After you train a model, you can use [Amazon SageMaker Batch Transform](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-batch.html) to perform inference with the model. In Batch Transform you provide your inference data as an S3 URI, and SageMaker takes care of downloading it, running the predictions, and uploading the results to S3 afterwards. You can find more documentation for Batch Transform [here](https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html).

If you trained the model using the **HuggingFace estimator**, you can invoke the `transformer()` method to create a transform job for a model based on the training job.

```python
batch_job = huggingface_estimator.transformer(
    instance_count=1,
    instance_type='ml.c5.2xlarge',
    strategy='SingleRecord')


batch_job.transform(
    data='s3://s3-uri-to-batch-data',
    content_type='application/json',    
    split_type='Line')
```
For more details about what can be specified here, see [API docs](https://sagemaker.readthedocs.io/en/stable/overview.html#sagemaker-batch-transform).
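
Batch Transform writes one result object per input object and names it after the input file with `.out` appended. The helper below is a minimal sketch of that naming rule for a single input object. The function name and the bucket paths are made up for illustration and are not part of the SageMaker SDK.

```python
import posixpath


def expected_output_uri(input_uri: str, output_path: str) -> str:
    """Return the S3 URI where the result for one input object is expected.

    Assumes a single input object; the result keeps the input file name
    and gets the `.out` suffix.
    """
    file_name = posixpath.basename(input_uri)
    return output_path.rstrip("/") + "/" + file_name + ".out"


# hypothetical bucket and prefixes, for illustration only
print(expected_output_uri(
    "s3://my-bucket/batch/input/data.jsonl",
    "s3://my-bucket/batch/output",
))
```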



In [1]:
# !pip install "sagemaker>=2.48.0" "datasets==1.11" --upgrade

# Run Batch Transform Inference Job with a fine-tuned model using `jsonl`

## Data Pre-Processing

In this example we use the provided `tweet_data.csv` as the dataset. The `csv` contains ~1800 tweets about different airlines in a single column, `"inputs"`. To use this `csv` we need to convert it into a `jsonl` file and upload it to S3. Because of the complex structure of text, only `jsonl` files are supported for batch transform. As pre-processing we remove the `@` at the beginning of each tweet to get the names/identities correct.

_**NOTE**: While preprocessing you need to make sure that your `inputs` fit the `max_length`._
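
One way to keep inputs short during pre-processing is to cap their length before writing them out. The sketch below counts whitespace-separated words, which is only a rough proxy for the model's token count. The helper name and the limit of 5 are assumptions for illustration, and an accurate check would use the model's own tokenizer.

```python
def truncate_words(text: str, max_words: int) -> str:
    """Keep at most `max_words` whitespace-separated words of `text`.

    Word count is only an approximation of the tokenizer's `max_length`.
    """
    words = text.split()
    return " ".join(words[:max_words])


# hypothetical limit, chosen only to make the effect visible
print(truncate_words("VirginAmerica it was amazing, and arrived an hour early.", 5))
```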

In [1]:
import csv
import json
import sagemaker
from sagemaker.s3 import S3Uploader,s3_path_join

# get the s3 bucket
sess = sagemaker.Session()
role = sagemaker.get_execution_role()
sagemaker_session_bucket = sess.default_bucket()

# dataset files
dataset_csv_file = "tweet_data.csv"
dataset_json_file = "tweet_data.json"
data_json = {"instances": []}

with open(dataset_csv_file, "r") as infile, open(dataset_json_file, "w") as outfile:
    reader = csv.DictReader(infile)
    for row in reader:
        # remove @ so that names/identities are read as plain words
        data_json["instances"].append({"inputs": row["inputs"].replace("@", "")})
    # write all records as a single JSON object followed by a newline
    json.dump(data_json, outfile)
    outfile.write("\n")

# S3 locations for the input data, the job output, and the model artifact
input_s3_path = s3_path_join("s3://", sagemaker_session_bucket, "batch_transform/input")
output_s3_path = s3_path_join("s3://", sagemaker_session_bucket, "batch_transform/output")
model_url = s3_path_join("s3://", sagemaker_session_bucket, "batch_transform/model")
# upload the input file to S3
s3_file_uri = S3Uploader.upload(dataset_json_file, input_s3_path)

print(f"{dataset_json_file} uploaded to {s3_file_uri}")

tweet_data.json uploaded to s3://sagemaker-us-west-2-706553727873/batch_transform/input/tweet_data.json


The created file looks like this

```json
{"inputs": "VirginAmerica What dhepburn said."}
{"inputs": "VirginAmerica plus you've added commercials to the experience... tacky."}
{"inputs": "VirginAmerica I didn't today... Must mean I need to take another trip!"}
{"inputs": "VirginAmerica it's really aggressive to blast obnoxious \"entertainment\"...."}
{"inputs": "VirginAmerica and it's a really big bad thing about it"}
{"inputs": "VirginAmerica seriously would pay $30 a flight for seats that didn't h...."}
{"inputs": "VirginAmerica yes, nearly every time I fly VX this \u201cear worm\u201d won\u2019t go away :)"}
{"inputs": "VirginAmerica Really missed a prime opportunity for Men Without ..."}
{"inputs": "virginamerica Well, I didn't\u2026but NOW I DO! :-D"}
{"inputs": "VirginAmerica it was amazing, and arrived an hour early. You're too good to me."}
{"inputs": "VirginAmerica did you know that suicide is the second leading cause of death among teens 10-24"}
{"inputs": "VirginAmerica I &lt;3 pretty graphics. so much better than minimal iconography. :D"}
{"inputs": "VirginAmerica This is such a great deal! Already thinking about my 2nd trip ..."}
....
```
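
For reference, a file in the line-per-record layout shown above can be produced by writing one JSON object per CSV row. This is a minimal sketch: `csv_to_jsonl` is a hypothetical helper, and the two rows in the in-memory demo are illustrative rather than read from `tweet_data.csv`.

```python
import csv
import io
import json


def csv_to_jsonl(csv_file, jsonl_file, column="inputs"):
    """Write one {"inputs": ...} JSON object per line for each CSV row.

    Returns the number of records written.
    """
    reader = csv.DictReader(csv_file)
    count = 0
    for row in reader:
        text = row[column].replace("@", "")  # same @ removal as above
        jsonl_file.write(json.dumps({"inputs": text}) + "\n")
        count += 1
    return count


# in-memory demo with two illustrative rows
sample_csv = io.StringIO(
    'inputs\n'
    '"@VirginAmerica What @dhepburn said."\n'
    '"@VirginAmerica it was amazing, and arrived an hour early."\n'
)
out = io.StringIO()
n = csv_to_jsonl(sample_csv, out)
print(n)
print(out.getvalue())
```

With real files, pass handles opened with `open(..., newline="")` for the CSV and `open(..., "w")` for the output instead of the `StringIO` objects.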

## Create Inference Transformer to run the batch job

We use the [twitter-roberta-base-sentiment](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment) model to run our batch transform job. This is a RoBERTa-base model trained on ~58M tweets and fine-tuned for sentiment analysis with the TweetEval benchmark.


In [4]:
# package the pre-trained model files into .tar.gz format
!tar -cvzf model.tar.gz model_token/*

model_token/config.json
model_token/pytorch_model.bin
model_token/special_tokens_map.json
model_token/tokenizer.json
model_token/tokenizer_config.json
model_token/vocab.txt


In [7]:
# upload pre-trained model to s3 bucket
print(f"Uploading Model to {model_url}")
model_uri = S3Uploader.upload('model.tar.gz',model_url)
print(f"Uploaded model to {model_uri}")

Uploading Model to s3://sagemaker-us-west-2-706553727873/batch_transform/model
Uploaded model to s3://sagemaker-us-west-2-706553727873/batch_transform/model/model.tar.gz
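
The model class in the next cell points at a custom `batchinference.py` entry point that is not shown in this notebook. The sketch below is only a guess at the shape such a script might take, using the `input_fn` and `output_fn` handler overrides of the Hugging Face inference toolkit. The `instances` key comes from the input file created earlier and the `predictions` key from the job output shown later; everything else is an assumption.

```python
import json


def input_fn(input_data, content_type):
    """Parse a {"instances": [{"inputs": ...}, ...]} payload into a list of texts."""
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    payload = json.loads(input_data)
    return [item["inputs"] for item in payload["instances"]]


def output_fn(prediction, accept):
    """Serialize predictions under a "predictions" key (assumed layout)."""
    return json.dumps({"predictions": prediction})


# local smoke test with a made-up payload and a made-up prediction
texts = input_fn('{"instances": [{"inputs": "hello"}, {"inputs": "world"}]}', "application/json")
print(texts)
print(output_fn([{"label": "joy", "score": 0.9}], "application/json"))
```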


In [None]:
from sagemaker.huggingface.model import HuggingFaceModel

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   source_dir='code', # directory containing the custom inference script
   entry_point='batchinference.py', # custom inference script
   model_data=model_uri, # S3 URI of the model.tar.gz uploaded above
   role=role, # iam role with permissions to create an Endpoint
   transformers_version="4.6", # transformers version used
   pytorch_version="1.7", # pytorch version used
   py_version='py36', # python version used
)

# create Transformer to run our batch job
batch_job = huggingface_model.transformer(
    instance_count=1,
    instance_type='ml.p3.2xlarge',
    output_path=output_s3_path, # S3 path where the job output is saved
    strategy='MultiRecord')

# starts batch transform job and uses s3 data as input
batch_job.transform(
    data=input_s3_path,
    content_type='application/json',    
    split_type='None')

....................................
2021-09-05T21:37:06.498:[sagemaker logs]: MaxConcurrentTransforms=1, MaxPayloadInMB=6, BatchStrategy=MULTI_RECORD
2021-09-05 21:37:03,530 [INFO ] main com.amazonaws.ml.mms.ModelServer - 
MMS Home: /opt/conda/lib/python3.6/site-packages
Current directory: /
Temp directory: /home/model-server/tmp
Number of GPUs: 1
Number of CPUs: 8
Max heap size: 12944 M
Python executable: /opt/conda/bin/python3.6
Config file: /etc/sagemaker-mms.properties
Inference address: http://0.0.0.0:8080
Management address: http://0.0.0.0:8080
Model Store: /.sagemaker/mms/models
Initial Models: ALL
Log dir: /logs

In [9]:
import json
from sagemaker.s3 import S3Downloader, s3_path_join
from ast import literal_eval
# creating s3 uri for result file -> input file + .out
output_file = "tweet_data.json.out"
output_path = s3_path_join(output_s3_path, output_file)

# download file
S3Downloader.download(output_path, '.')

batch_transform_result = []
with open(output_file) as f:
    for line in f:
        # converts jsonline array to normal array
        line = "[" + line.replace("[", "").replace("]", ",") + "]"
        # extend instead of overwrite so results from every line are kept
        batch_transform_result.extend(literal_eval(line))
        
# print results 
print(batch_transform_result[:20])

['{"predictions": {"text": {"score": "0.5271208", "label": "joy"}, {"score": "0.47287917", "label": "anger"},}, {"text": {"score": "0.61746234", "label": "joy"}, {"score": "0.38253772", "label": "anger"},}, {"text": {"score": "0.56283057", "label": "joy"}, {"score": "0.43716943", "label": "anger"},}, {"text": {"score": "0.5812586", "label": "joy"}, {"score": "0.41874143", "label": "anger"},}, {"text": {"score": "0.5777793", "label": "anger"}, {"score": "0.42222065", "label": "joy"},}, {"text": {"score": "0.53390485", "label": "joy"}, {"score": "0.46609515", "label": "anger"},}, {"text": {"score": "0.5462722", "label": "anger"}, {"score": "0.4537278", "label": "joy"},}, {"text": {"score": "0.6382487", "label": "joy"}, {"score": "0.36175132", "label": "anger"},}, {"text": {"score": "0.55565697", "label": "joy"}, {"score": "0.44434294", "label": "anger"},}, {"text": {"score": "0.52727044", "label": "joy"}, {"score": "0.47272956", "label": "anger"},}, {"text": {"score": "0.5550806", "label