# Huggingface Sagemaker-sdk - Run a batch transform inference job with 🤗 Transformers


Welcome to this getting started guide. We will use the new Hugging Face Inference DLCs and the Amazon SageMaker Python SDK to run inference with a transformer model.
In this example we run a batch transform job on SageMaker using a trained Hugging Face Transformer model.

## Run Batch Transform after training a model 
_not included in the notebook_

After you train a model, you can use [Amazon SageMaker Batch Transform](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-batch.html) to perform inferences with the model. In Batch Transform you provide your inference data as an S3 URI, and SageMaker takes care of downloading it, running the predictions, and uploading the results back to S3. You can find more documentation for Batch Transform [here](https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html).

If you trained the model using the **HuggingFace estimator**, you can invoke the `transformer()` method to create a transform job for a model based on the training job.

```python
batch_job = huggingface_estimator.transformer(
    instance_count=1,
    instance_type='ml.c5.2xlarge',
    strategy='SingleRecord')


batch_job.transform(
    data='s3://s3-uri-to-batch-data',
    content_type='application/json',    
    split_type='Line')
```
For more details about what can be specified here, see [API docs](https://sagemaker.readthedocs.io/en/stable/overview.html#sagemaker-batch-transform).
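To make the two settings used above more concrete, here is a minimal conceptual sketch of what `split_type='Line'` combined with `strategy='SingleRecord'` amounts to: the input file is split on newlines and each line becomes the body of one inference request. This is only an illustration in plain Python, not SageMaker's implementation; the function name `iter_single_records`, the sample payload, and the choice to skip blank lines are assumptions made for the example.

```python
from typing import Iterator


def iter_single_records(payload: str) -> Iterator[str]:
    """Yield one request body per non-empty line of a JSON Lines payload.

    Conceptual stand-in for split_type='Line' + strategy='SingleRecord';
    skipping blank lines is a simplification made for this sketch.
    """
    for line in payload.splitlines():
        record = line.strip()
        if record:
            yield record


# Hypothetical two-record JSON Lines payload.
example_payload = '{"inputs": "first tweet"}\n{"inputs": "second tweet"}\n'
request_bodies = list(iter_single_records(example_payload))
print(len(request_bodies))
```

Each element of `request_bodies` corresponds to one record that the model would receive as a separate request.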



In [None]:
!pip install "sagemaker>=2.48.0" --upgrade
# !pip install torch -q
!pip install transformers -q

In [None]:
import sagemaker
sagemaker.__version__

# Run Batch Transform Inference Job with a fine-tuned model using `jsonl`

## Data Pre-Processing

In this example we are using an excerpt of the provided `test.csv` from the first session today. The `csv` contains ~1000 tweets in 2 columns, `"text"` and `"label"`. To use this `csv` we need to convert it into a `jsonl` file and upload it to S3. Due to the complex structure of text, only `jsonl` files are supported for batch transform. As pre-processing we remove the `@anonymized_account` mentions from each tweet.

_**NOTE**: While preprocessing you need to make sure that your `inputs` fit the `max_length`._
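A rough sketch of such a pre-check follows. It flags inputs whose token count exceeds a limit. Both helper names are hypothetical, and whitespace splitting is only a stand-in for the model's tokenizer: a real subword tokenizer typically produces more tokens, so pass your own counting function via `count_tokens` for an accurate check. The limit of 4 is purely illustrative.

```python
from typing import Callable, Iterable, List


def count_whitespace_tokens(text: str) -> int:
    """Stand-in token counter (assumption): counts whitespace-separated words."""
    return len(text.split())


def find_too_long(
    inputs: Iterable[str],
    max_length: int,
    count_tokens: Callable[[str], int] = count_whitespace_tokens,
) -> List[int]:
    """Return the indices of inputs whose token count exceeds max_length."""
    return [i for i, text in enumerate(inputs) if count_tokens(text) > max_length]


# Illustrative inputs; the second one has six whitespace tokens.
sample_inputs = ["short tweet", "one two three four five six"]
too_long = find_too_long(sample_inputs, max_length=4)
print(too_long)
```

Rows reported here could then be shortened or dropped before the `jsonl` file is written.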

In [17]:
import csv
import json
import sagemaker
from sagemaker.s3 import S3Uploader,s3_path_join

# get the s3 bucket
sess = sagemaker.Session()
role = sagemaker.get_execution_role()
sagemaker_session_bucket = sess.default_bucket()

# dataset files
dataset_csv_file = "test.csv"
dataset_json_file = "transform_dataset.json"

with open(dataset_csv_file, "r", newline="", encoding="utf8") as infile, open(dataset_json_file, "w", encoding="utf8") as outfile:
    reader = csv.DictReader(infile)
    for row in reader:
        # strip the anonymized account mentions and keep only the text as "inputs"
        row["inputs"] = row["text"].replace("@anonymized_account", "").strip()
        del row["label"]
        del row["text"]
        # write one JSON object per line (JSON Lines)
        json.dump(row, outfile, ensure_ascii=False)
        outfile.write("\n")

# uploads a given file to S3.
input_s3_path = s3_path_join("s3://",sagemaker_session_bucket,"batch_transform/input")
output_s3_path = s3_path_join("s3://",sagemaker_session_bucket,"batch_transform/output")
s3_file_uri = S3Uploader.upload(dataset_json_file,input_s3_path)


The created file looks like this:

```json
{"inputs": "Spoko, jak im Duda z Morawieckim zamówią po pięć piw to wszystko będzie ok."}
{"inputs": "Ale on tu nie miał szans jej zagrania, a ta 'proba' to czysta prowizorka."}
{"inputs": "No czy Prezes nie miał racji, mówiąc,ze to są zdradzieckie mordy? No czy nie miał racji?😁😁"}
{"inputs": "Przecież to nawet nie jest przewrotka 😂"}
{"inputs": "Owszem podatki tak. Ale nie w takich okolicznościach. Czemu Małysza odpalili z teamu Orlen?"}
{"inputs": "skąd wiesz jaki Skendija ma budżet skoro mówisz że jest bogatsza ? Tylko dwóch zawodników ponoć dobrze zarabia."}
{"inputs": "Z tego, co widzę, to kibice Widzewa mają szczęście, że trwa mundial. Dzięki temu ogólnopolska szydera jest tylko z Argentyny i Messiego."}
{"inputs": "Na utrzymanie własnej armii 2% PKB, tyle że teraz to jedna wielka ściema"}
{"inputs": "Przypomnijcie mi ze muszę jeszcze suszarkę spakować"}
....
```
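Before uploading, it can be useful to confirm that every line of the file has this shape. The following is a minimal sketch of such a check, assuming each record must be a JSON object with a string `inputs` field; the function name `find_invalid_lines` and the sample text are made up for illustration.

```python
import json
from typing import List


def find_invalid_lines(jsonl_text: str) -> List[int]:
    """Return 1-based numbers of lines that are not a JSON object with a string 'inputs'."""
    invalid = []
    for number, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are ignored in this sketch
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            invalid.append(number)
            continue
        if not isinstance(record, dict) or not isinstance(record.get("inputs"), str):
            invalid.append(number)
    return invalid


# Hypothetical sample: line 2 lacks the "inputs" key, line 3 is not JSON.
sample = '{"inputs": "a valid record"}\n{"text": "missing inputs key"}\nnot json\n'
invalid_lines = find_invalid_lines(sample)
print(invalid_lines)
```

To check the real file, read it with `open(dataset_json_file, encoding="utf8").read()` and pass the text to `find_invalid_lines`.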

## Create Inference Transformer to run the batch job

We use the [polish_cyberbullying_bert_base](https://colab.research.google.com/drive/1KlpbVkAo1cTX3u5mrI7Bj-VfaX4sUtuR?usp=sharing) model trained in the earlier session today by Dogu. If you haven't attended it, you can redo the session with this notebook: https://colab.research.google.com/drive/1KlpbVkAo1cTX3u5mrI7Bj-VfaX4sUtuR?usp=sharing
Alternatively, use the model as it is provided on Amazon S3.

**using the S3 provided model**

In [3]:
# download the model from S3 and unzip
!wget https://pai-marketplace21-tutorials.s3.amazonaws.com/practical-nlp/models/polish_cyberbullying_bert_base.tgz
!tar zxvf polish_cyberbullying_bert_base.tgz

# package pre-trained model into .tar.gz format
!cd polish_cyberbullying_bert_base && tar zcvf model.tar.gz * 
!mv polish_cyberbullying_bert_base/model.tar.gz ./model.tar.gz

--2021-09-30 09:19:36--  https://pai-marketplace21-tutorials.s3.amazonaws.com/practical-nlp/models/polish_cyberbullying_bert_base.tgz
Resolving pai-marketplace21-tutorials.s3.amazonaws.com (pai-marketplace21-tutorials.s3.amazonaws.com)... 52.216.131.131
Connecting to pai-marketplace21-tutorials.s3.amazonaws.com (pai-marketplace21-tutorials.s3.amazonaws.com)|52.216.131.131|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 434163450 (414M) [application/x-gzip]
Saving to: 'polish_cyberbullying_bert_base.tgz.1'


2021-09-30 09:19:58 (18.8 MB/s) - 'polish_cyberbullying_bert_base.tgz.1' saved [434163450/434163450]

x polish_cyberbullying_bert_base/
x polish_cyberbullying_bert_base/added_tokens.json
x polish_cyberbullying_bert_base/special_tokens_map.json
x polish_cyberbullying_bert_base/config.json
x polish_cyberbullying_bert_base/tokenizer.json
x polish_cyberbullying_bert_base/merges.txt
x polish_cyberbullying_bert_base
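
The packaging step above places the model files at the root of `model.tar.gz` rather than inside a sub-folder. As a sketch, the same layout can be produced with Python's standard `tarfile` module; the helper name `package_model` is hypothetical, and unlike the shell glob `*` it also includes hidden files.

```python
import tarfile
from pathlib import Path


def package_model(model_dir: str, archive_path: str = "model.tar.gz") -> str:
    """Write every entry of model_dir into a gzip tarball at the archive root."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for entry in sorted(Path(model_dir).iterdir()):
            # arcname drops the parent directory, so files sit at the archive root
            tar.add(str(entry), arcname=entry.name)
    return archive_path
```

For the model downloaded above, this would be called as `package_model("polish_cyberbullying_bert_base")`, writing `model.tar.gz` into the current directory.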

**using the Hugging Face model**

In [None]:
### Download Hugging Face Pretrained Model
# from transformers import AutoModelForSequenceClassification, AutoTokenizer
# MODEL = 'cardiffnlp/twitter-roberta-base-sentiment'
# model = AutoModelForSequenceClassification.from_pretrained(MODEL)
# tokenizer = AutoTokenizer.from_pretrained(MODEL)
# model.save_pretrained('my_model')
# tokenizer.save_pretrained('my_model')

In [None]:
# package the inference script and pre-trained model into .tar.gz format
# !cd my_model && tar zcvf model.tar.gz * 
# !mv my_model/model.tar.gz ./model.tar.gz

**upload the model to Amazon S3**

In [None]:
# upload pre-trained model to s3 bucket
model_url = s3_path_join("s3://",sagemaker_session_bucket,"batch_transform/model")
print(f"Uploading Model to {model_url}")
model_uri = S3Uploader.upload('model.tar.gz',model_url)
print(f"Uploaded model to {model_uri}")

## Create Batch Transform job

In [None]:
from sagemaker.huggingface.model import HuggingFaceModel

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    model_data=model_uri, # S3 URI of the packaged model.tar.gz
    role=role, # IAM role with permissions to create the model and run the transform job
    transformers_version="4.6", # transformers version used
    pytorch_version="1.7", # pytorch version used
    name="prosus-workshop-bert-batch", # model and batch name
    py_version='py36', # python version used
)

# create Transformer to run our batch job
batch_job = huggingface_model.transformer(
    instance_count=1,
    instance_type='ml.g4dn.xlarge',
    output_path=output_s3_path, # S3 prefix where the results are written (separate from the input prefix)
    strategy='SingleRecord')

# starts batch transform job and uses s3 data as input
batch_job.transform(
    data=input_s3_path,
    content_type='application/json',    
    split_type='Line')

Use batch transform when you:

* Want to get inferences for an entire dataset and index them to serve inferences in real time
* Don't need a persistent endpoint that applications (for example, web or mobile apps) can call to get inferences
* Don't need the subsecond latency that SageMaker hosted endpoints provide

You can also use batch transform to preprocess your data before using it to train a new model or generate inferences.
The following diagram shows the workflow of a batch transform job:

![batch-transform](../../imgs/batch-transform-v2.png)

In [None]:
from ast import literal_eval
from sagemaker.s3 import S3Downloader, s3_path_join

# S3 URI of the result file: input file name + .out
output_file = f"{dataset_json_file}.out"
output_path = s3_path_join(output_s3_path, output_file)

# download the result file into the current directory
S3Downloader.download(output_path, '.')

batch_transform_result = []
with open(output_file) as f:
    for line in f:
        # a line can hold several JSON arrays back to back; merge them into one list literal
        line = "[" + line.replace("[", "").replace("]", ",") + "]"
        # accumulate instead of overwriting, so results from every line are kept
        batch_transform_result.extend(literal_eval(line))

# print the first 20 results
print(batch_transform_result[:20])
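
To inspect the predictions next to the tweets they belong to, the results can be paired with the input records. The sketch below assumes that the output contains exactly one prediction per input line, in the same order, and that each prediction is a dict such as `{"label": ..., "score": ...}`; those assumptions, the helper name `pair_inputs_with_predictions`, and the sample values are illustrative and not taken from the job output.

```python
import json
from typing import Dict, List


def pair_inputs_with_predictions(jsonl_text: str, predictions: List[Dict]) -> List[Dict]:
    """Merge each input text with the prediction at the same position."""
    inputs = [json.loads(line)["inputs"] for line in jsonl_text.splitlines() if line.strip()]
    if len(inputs) != len(predictions):
        raise ValueError(
            f"expected {len(inputs)} predictions, got {len(predictions)}"
        )
    return [{"inputs": text, **prediction} for text, prediction in zip(inputs, predictions)]


# Hypothetical inputs and predictions for illustration only.
sample_jsonl = '{"inputs": "first tweet"}\n{"inputs": "second tweet"}\n'
sample_predictions = [
    {"label": "LABEL_0", "score": 0.9},
    {"label": "LABEL_1", "score": 0.8},
]
paired = pair_inputs_with_predictions(sample_jsonl, sample_predictions)
print(paired[0])
```

The length check raises early if the two files do not line up, which is safer than silently pairing the wrong records.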