# Huggingface Sagemaker-sdk - Run a batch transform inference job with 🤗 Transformers


1. [Introduction](#Introduction)  
2. [Run Batch Transform after training a model](#Run-Batch-Transform-after-training-a-model)  
3. [Run Batch Transform Inference Job with a fine-tuned model using `jsonl`](#Run-Batch-Transform-Inference-Job-with-a-fine-tuned-model-using-jsonl)   
4. [Run Batch Transform Inference Job with a fine-tuned model using `csv`](#Run-Batch-Transform-Inference-Job-with-a-fine-tuned-model-using-csv)  

## Introduction

Welcome to this getting started guide. We will use the new Hugging Face Inference DLCs and the Amazon SageMaker Python SDK to deploy two transformer models for inference.
In the first example we deploy a trained Hugging Face Transformer model to SageMaker for inference.
In the second example we directly deploy one of the 10,000+ Hugging Face Transformers from the [Hub](https://huggingface.co/models) to Amazon SageMaker for inference.

## Run Batch Transform after training a model 
_not included in the notebook_

After you train a model, you can use [Amazon SageMaker Batch Transform](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-batch.html) to perform inference with the model. In Batch Transform you provide your inference data as an S3 URI, and SageMaker takes care of downloading it, running the predictions, and uploading the results to S3 afterwards. You can find more documentation for Batch Transform [here](https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html).

If you trained the model using the **HuggingFace estimator**, you can invoke the `transformer()` method to create a transform job for a model based on the training job.

```python
batch_job = huggingface_estimator.transformer(
    instance_count=1,
    instance_type='ml.c5.2xlarge',
    strategy='SingleRecord')


batch_job.transform(
    data='s3://s3-uri-to-batch-data',
    content_type='application/json',    
    split_type='Line')
```
For more details about what can be specified here, see [API docs](https://sagemaker.readthedocs.io/en/stable/overview.html#sagemaker-batch-transform).
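
To make the `strategy='SingleRecord'` and `split_type='Line'` settings above more concrete, here is a small local simulation. It is not SageMaker's actual implementation: `split_into_requests` is a hypothetical helper, and the sketch only assumes that each line of the input file is sent to the model as its own request.

```python
import json
from typing import List


def split_into_requests(payload: str) -> List[str]:
    """Hypothetical helper: mimic split_type='Line' with strategy='SingleRecord'.

    Each non-empty line of the payload becomes its own request body.
    """
    return [line for line in payload.splitlines() if line.strip()]


# two records in JSON Lines form, one JSON document per line
payload = '{"inputs": "great movie"}\n{"inputs": "terrible plot"}\n'
requests = split_into_requests(payload)

# every request body is a standalone JSON document
records = [json.loads(body) for body in requests]
print(len(requests), records[0]["inputs"])
```

This is why the input file has to contain exactly one complete JSON document per line when `split_type='Line'` is used.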



# Run Batch Transform Inference Job with a fine-tuned model using `jsonl`

```python
!pip install "sagemaker>=2.48.0" "datasets==1.11" --upgrade
```

## Data Pre-Processing

In this example we use the `datasets` library to load a dataset and pre-process it into the `.jsonl` format, which makes it compatible with batch transform. You could also provide a `.csv` file instead of a `.jsonl` file, but the tasks `"zero-shot-classification"` and `"table-question-answering"` are currently not supported with `csv` because of their nested input structure.
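
To illustrate why nested inputs fit `.jsonl` but not `.csv`, here is a minimal sketch. The `parameters` / `candidate_labels` layout shown for `zero-shot-classification` is an assumption about the expected request shape, and `to_jsonl` is a hypothetical helper, not part of any library.

```python
import json
from typing import Dict, Iterable


def to_jsonl(records: Iterable[Dict]) -> str:
    """Hypothetical helper: serialize records as one JSON document per line."""
    return "".join(json.dumps(record) + "\n" for record in records)


# flat input, as used for text-classification in this notebook
flat = {"inputs": "I loved this movie."}

# nested input (assumed shape for zero-shot-classification): the list of
# labels sits inside a nested object, which has no natural representation
# as a single CSV cell
nested = {
    "inputs": "I loved this movie.",
    "parameters": {"candidate_labels": ["positive", "negative"]},
}

jsonl = to_jsonl([flat, nested])
print(jsonl)
```

Each record stays on a single line regardless of nesting, so a line-based split still yields complete records.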

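```python
import json

import sagemaker
from datasets import load_dataset
from sagemaker.s3 import S3Uploader

local_file = "imdb_input.jsonl"

# assumption: upload to the session's default bucket; replace with your own bucket if needed
sess = sagemaker.Session()
bucket = sess.default_bucket()

# load the test split of the imdb dataset with `datasets`
dataset = load_dataset("imdb", split="test")

# write one JSON document per line: {"inputs": "<review text>"}
with open(local_file, "w") as jsonl_file:
    for sample in dataset:
        json.dump({"inputs": sample["text"]}, jsonl_file)
        jsonl_file.write("\n")

# upload the file under the given S3 prefix and keep the returned S3 URI
s3_file_uri = S3Uploader.upload(local_file, f"s3://{bucket}/batch_transform/input")
```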

## Create Inference Transformer to run the batch job

```python
import sagemaker
from sagemaker.huggingface.model import HuggingFaceModel

# assumption: an execution role is available (e.g. in a SageMaker notebook);
# otherwise set `role` to the ARN of a suitable IAM role
role = sagemaker.get_execution_role()

# Hub model configuration. <https://huggingface.co/models>
hub = {
    'HF_MODEL_ID': 'distilbert-base-uncased-finetuned-sst-2-english',
    'HF_TASK': 'text-classification'
}

# create Hugging Face Model class
huggingface_model = HuggingFaceModel(
    env=hub,  # configuration for loading the model from the Hub
    role=role,  # IAM role with the SageMaker permissions needed for the job
    transformers_version="4.6",  # transformers version used
    pytorch_version="1.7",  # pytorch version used
    py_version='py36',  # python version used
)

# create Transformer to run our batch job
batch_job = huggingface_model.transformer(
    instance_count=1,
    instance_type='ml.c5.2xlarge',
    strategy='SingleRecord')
```

```python
# start the batch transform job with the S3 data as input
batch_job.transform(
    data=s3_file_uri,
    content_type='application/json',
    split_type='Line')
```

```python
# get the S3 location of the results
batch_job.output_path
```
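
Batch Transform stores the predictions for each input file in an object named after that file with an `.out` suffix, under the job's output path. A minimal sketch for deriving that location follows; `result_uri` is a hypothetical helper and the bucket and job names are placeholders, not values from this notebook.

```python
def result_uri(output_path: str, input_uri: str) -> str:
    """Hypothetical helper: build the S3 URI of the `.out` file for one input file."""
    # keep only the file name of the input object
    input_name = input_uri.rstrip("/").rsplit("/", 1)[-1]
    return f"{output_path.rstrip('/')}/{input_name}.out"


# placeholder values; in the notebook these would be
# batch_job.output_path and s3_file_uri
example = result_uri(
    "s3://my-bucket/my-transform-job",
    "s3://my-bucket/batch_transform/input/input.jsonl",
)
print(example)
```

The resulting URI could then be passed to a downloader such as `sagemaker.s3.S3Downloader` to fetch the predictions.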