# Fine-tuning Hugging Face DistilBERT model for multi-label text classification on Amazon SageMaker

 ( *Distributed training, data parallelism or model parallelism, is optional. It can be turned on by uncommenting relevant sections.* )

## Introduction

In this notebook, you will use the Hugging Face `transformers` and `datasets` libraries with Amazon SageMaker to fine-tune a pre-trained transformer on a multi-label text classification dataset.

You will then deploy the resulting model for inference using a SageMaker endpoint.

This notebook is modified from [this example.](https://github.com/aws-samples/finetune-deploy-bert-with-amazon-sagemaker-for-hugging-face/blob/main/finetune-distilbert.ipynb)

### The model

You'll be using an offshoot of [BERT](https://arxiv.org/abs/1810.04805) called [DistilBERT](https://arxiv.org/abs/1910.01108) that is smaller, and so faster and cheaper for both training and inference. A pre-trained model is available in the [`transformers`](https://huggingface.co/transformers/) library from [Hugging Face](https://huggingface.co/).

### The data

The [Twitter multi-label dataset](https://huggingface.co/maxpe/twitter-roberta-base-jun2022_sem_eval_2018_task_1) is used. A single tweet can have multiple labels, for example:

```
"text": "I'm very happy I fine-tune a hugging face DistilBert model for multi-label classification"
"label": ["joy", "optimism"]
```
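
To make the target format concrete, the sketch below turns a label list like the one above into a multi-hot vector. The label order is the one hard-coded in the training script later in this notebook; the helper name `to_multi_hot` is hypothetical and used only for illustration.

```python
# Label order as hard-coded in the training script later in this notebook.
LABELS = [
    "anger", "anticipation", "disgust", "fear", "joy", "love",
    "optimism", "pessimism", "sadness", "surprise", "trust",
]


def to_multi_hot(active_labels, all_labels=LABELS):
    """Return a list of floats with 1.0 at the position of every active label."""
    active = set(active_labels)
    unknown = active - set(all_labels)
    if unknown:
        raise ValueError(f"Unknown labels: {sorted(unknown)}")
    return [1.0 if name in active else 0.0 for name in all_labels]


vector = to_multi_hot(["joy", "optimism"])
print(vector)
```

Each position is an independent yes/no decision, which is what distinguishes multi-label from multi-class classification.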




## Setup
### Dependencies
Install the dependencies required.

In [None]:
!pip install -qq "sagemaker>=2.48.0" --upgrade
!pip install -qq sagemaker-huggingface-inference-toolkit 
!pip install -qq ipywidgets
!pip install -qq watermark 
!pip install -qq "seaborn>=0.11.0"
!pip install -qq transformers "datasets[s3]==1.18.4"

If you are running this in a SageMaker environment, make sure to restart the kernel via the dropdown menu at the top after you've installed the above dependencies.

### Development environment 

In [None]:
import sagemaker
from sagemaker.pytorch import PyTorch
from sagemaker.debugger import ProfilerConfig, DebuggerHookConfig, Rule, ProfilerRule, rule_configs
import sagemaker.huggingface
from sagemaker.huggingface import HuggingFace
import transformers
from transformers import AutoTokenizer
from datasets import load_dataset

import numpy as np
import pandas as pd
import seaborn as sns
from pylab import rcParams
import matplotlib.pyplot as plt
from matplotlib import rc
from textwrap import wrap

import boto3
import pprint
import time
import json
import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"

In [None]:
sns.set(style='whitegrid', palette='muted', font_scale=1.2)
rcParams['figure.figsize'] = 17, 8

## Set up SageMaker session and bucket

In [None]:
sess = sagemaker.Session()
sagemaker_session_bucket=None
if sagemaker_session_bucket is None and sess is not None:
    # set to default bucket if a bucket name is not given
    sagemaker_session_bucket = sess.default_bucket()

role = sagemaker.get_execution_role()
sess = sagemaker.Session(default_bucket=sagemaker_session_bucket)

print(f"sagemaker role arn: {role}")
print(f"sagemaker bucket: {sess.default_bucket()}")
print(f"sagemaker session region: {sess.boto_region_name}")

In [None]:
# Run this cell only if you are NOT using a SageMaker/Studio notebook.
# It looks up the execution role by name; replace the role name with the one in your account.
iam = boto3.client('iam')
role = iam.get_role(
        RoleName='AmazonSageMaker-ExecutionRole')['Role']['Arn']

# Data preparation

The data preparation is straightforward because you're using the `datasets` library to download and preprocess the `sem_eval_2018_task_1` dataset directly from Hugging Face. After preprocessing, the dataset will be uploaded to the `sagemaker_session_bucket` to be used within the training job.

In [None]:
dataset = load_dataset("sem_eval_2018_task_1", "subtask5.english" )

In [None]:
dataset['train'][12]

Create a list that contains the labels, and create two dictionaries that map labels to integers and back.

In [None]:
labels = [label for label in dataset['train'].features.keys() if label not in ['ID', 'Tweet']]
id2label = {idx:label for idx, label in enumerate(labels)}
label2id = {label:idx for idx, label in enumerate(labels)}
labels

### Preparing the dataset to be used with PyTorch

Specify the model to use and load its tokenizer.

In [None]:
from transformers import AutoTokenizer
import numpy as np
model_name = 'distilbert-base-uncased'
tokenizer = AutoTokenizer.from_pretrained(model_name)

Preprocessing data 
[reference](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/BERT/Fine_tuning_BERT_(and_friends)_for_multi_label_text_classification.ipynb#scrollTo=AFWlSsbZaRLc)
1. Tokenize the text feature
2. Add the labels. For multi-label text classification, the labels form a matrix of shape (batch_size, num_labels). This should be a tensor of floats rather than integers; otherwise PyTorch's `BCEWithLogitsLoss` (which the model will use) will complain, as explained [here](https://discuss.pytorch.org/t/multi-label-binary-classification-result-type-float-cant-be-cast-to-the-desired-output-type-long/117915/3).
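
Why floats? `BCEWithLogitsLoss` treats each label as an independent binary problem and compares a sigmoid probability with a target between 0 and 1. The dependency-free sketch below computes the same kind of loss by hand with mean reduction. It is an illustration under that assumption, not PyTorch's implementation, and `bce_with_logits` is a hypothetical name.

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over all (example, label) pairs.

    Uses the numerically stable form max(x, 0) - x*y + log(1 + exp(-|x|)).
    """
    total, count = 0.0, 0
    for row_logits, row_targets in zip(logits, targets):
        for x, y in zip(row_logits, row_targets):
            total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
            count += 1
    return total / count


# One tweet, three labels: the targets are floats, one per label.
loss = bce_with_logits([[0.0, 2.0, -2.0]], [[1.0, 1.0, 0.0]])
print(round(loss, 4))
```

A logit of 0 with a positive target contributes `log(2)`, the loss of an undecided prediction; confident correct predictions contribute much less.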

In [None]:

def preprocess_data(examples):
  # take a batch of texts
  text = examples["Tweet"]
  # encode them
  encoding = tokenizer(text, padding="max_length", truncation=True, max_length=128)
  # add labels
  labels_batch = {k: examples[k] for k in examples.keys() if k in labels}
  # create numpy array of shape (batch_size, num_labels)
  labels_matrix = np.zeros((len(text), len(labels)))
  # fill numpy array
  for idx, label in enumerate(labels):
    labels_matrix[:, idx] = labels_batch[label]

  encoding["labels"] = labels_matrix.tolist()
  
  return encoding
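
With `batched=True`, `examples` is a dictionary of columns, with one list per feature. The sketch below reproduces the label-matrix step of `preprocess_data` on a tiny hand-made batch without NumPy. The two tweets and their label values are made up for illustration.

```python
# A hand-made batch in the column-oriented layout that `map(batched=True)` passes in.
examples = {
    "Tweet": ["so excited for tomorrow", "this is awful"],
    "anger": [False, True],
    "joy": [True, False],
    "optimism": [True, False],
}
label_names = ["anger", "joy", "optimism"]

# Transpose the label columns into one row of floats per tweet.
labels_matrix = [
    [float(examples[name][row]) for name in label_names]
    for row in range(len(examples["Tweet"]))
]
print(labels_matrix)
```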

In [None]:
encoded_dataset = dataset.map(preprocess_data, batched=True, remove_columns=dataset['train'].column_names)

In [None]:
encoded_dataset['train'][0].keys()

In [None]:
encoded_dataset.set_format("torch")

In [None]:
train_dataset = encoded_dataset['train']
test_dataset = encoded_dataset['validation']

### Save data to S3

In [None]:
import botocore
from datasets.filesystems import S3FileSystem

# Upload to S3
s3 = S3FileSystem()
s3_prefix = f'samples/datasets/twitter'
training_input_path = f's3://{sess.default_bucket()}/{s3_prefix}/train'
train_dataset.save_to_disk(training_input_path,fs=s3)
test_input_path = f's3://{sess.default_bucket()}/{s3_prefix}/test'
test_dataset.save_to_disk(test_input_path,fs=s3)

print(f'Uploaded training data to {training_input_path}')
print(f'Uploaded testing data to {test_input_path}')

# Fine-tuning & starting SageMaker Training Job

To create a SageMaker training job, you need a `HuggingFace` Estimator. The Estimator handles end-to-end Amazon SageMaker training and deployment tasks. In an Estimator, you define which fine-tuning script should be used as `entry_point`, which `instance_type` should be used, and which `hyperparameters` are passed in.
```python
huggingface_estimator = HuggingFace(
    entry_point="train.py",
    instance_type="ml.p3.2xlarge",
    instance_count=1,
    role=role,
    transformers_version="4.17.0",
    pytorch_version="1.10.2",
    py_version="py38",
    hyperparameters=hyperparameters,
    compiler_config=compiler_config,  # the compiler configuration used in the training job
    disable_profiler=True,
    debugger_hook_config = False, 
    max_run=36000,  # expected max run in seconds
)
```
When you create a SageMaker training job, SageMaker starts and manages all the required compute instances with the `huggingface` container, uploads the provided fine-tuning script `train.py`, and downloads the data from the `sagemaker_session_bucket` into the container's local storage at `/opt/ml/input/data`. It then starts the training job by running a command like the following:
```
/opt/conda/bin/python train.py --epochs 3 --eval_batch_size 24 --learning_rate 3e-05 --model_name distilbert-base-uncased --train_batch_size 24
```

The `hyperparameters` you define in the `HuggingFace` estimator are passed in as named arguments. The training script expects the model name (`--model_name`), which it uses to retrieve both the model and its tokenizer.
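
As a rough illustration of that mapping (not SageMaker's actual implementation), the sketch below flattens a hyperparameter dictionary into the argument list a script would receive and reads it back with `argparse`. The helper name `to_cli_args` is hypothetical.

```python
import argparse


def to_cli_args(hyperparameters):
    """Flatten {"name": value} into ["--name", "value", ...], sorted by name."""
    args = []
    for name in sorted(hyperparameters):
        args += [f"--{name}", str(hyperparameters[name])]
    return args


hyperparameters = {
    "epochs": 3,
    "train_batch_size": 24,
    "model_name": "distilbert-base-uncased",
}
cli_args = to_cli_args(hyperparameters)

# The training script reads the values back with argparse.
parser = argparse.ArgumentParser()
parser.add_argument("--epochs", type=int, default=3)
parser.add_argument("--train_batch_size", type=int, default=32)
parser.add_argument("--model_name", type=str)
parsed, _ = parser.parse_known_args(cli_args)

print(cli_args)
print(parsed.epochs, parsed.train_batch_size, parsed.model_name)
```

Because every value arrives as a string, the `type=` argument of each `add_argument` call decides how it is converted.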

SageMaker provides other useful properties about the training environment through various environment variables, including the following:

* `SM_MODEL_DIR`: A string that represents the path where the training job writes the model artifacts to. After training, artifacts in this directory are uploaded to S3 for model hosting.

* `SM_NUM_GPUS`: An integer representing the number of GPUs available to the host.

* `SM_CHANNEL_XXXX:` A string that represents the path to the directory that contains the input data for the specified channel. For example, if you specify two input channels in the HuggingFace estimator’s fit call, named `train` and `test`, the environment variables `SM_CHANNEL_TRAIN` and `SM_CHANNEL_TEST` are set.
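
The training script in this notebook reads these variables with `os.environ[...]`, which raises a `KeyError` when the script runs outside a training container. A minimal sketch of one way to keep the same arguments usable locally, assuming hypothetical fallback paths such as `./model` and `./data/train`:

```python
import argparse
import os


def parse_args(argv=None):
    """Read SageMaker paths from the environment, with local fallbacks."""
    parser = argparse.ArgumentParser()
    # The fallback values are placeholders for local experiments.
    parser.add_argument("--model_dir", default=os.environ.get("SM_MODEL_DIR", "./model"))
    parser.add_argument("--n_gpus", default=os.environ.get("SM_NUM_GPUS", "0"))
    parser.add_argument("--training_dir", default=os.environ.get("SM_CHANNEL_TRAIN", "./data/train"))
    parser.add_argument("--test_dir", default=os.environ.get("SM_CHANNEL_TEST", "./data/test"))
    args, _ = parser.parse_known_args(argv)
    return args


args = parse_args([])
print(args.model_dir, args.n_gpus, args.training_dir, args.test_dir)
```

Inside a training job the environment variables are set, so the fallbacks are never used.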

### Create training script

In [None]:
# train.py is modified from https://github.com/aws-samples/finetune-deploy-bert-with-amazon-sagemaker-for-hugging-face/blob/main/scripts/train.py
# https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/BERT/Fine_tuning_BERT_(and_friends)_for_multi_label_text_classification.ipynb#scrollTo=797b2WHJqUgZ

In [None]:

%%writefile train.py

"""
Training script for Hugging Face SageMaker Estimator
"""
import logging
import sys
import argparse
import os
from transformers import AutoModelForSequenceClassification, AutoTokenizer
#from transformers import Trainer, TrainingArguments
from datasets import load_from_disk
from sklearn.metrics import f1_score, roc_auc_score, accuracy_score
import torch
import numpy as np

# for model parallelism, need to use sagemaker specific trainer
from transformers.sagemaker import SageMakerTrainingArguments as TrainingArguments
from transformers.sagemaker import SageMakerTrainer as Trainer

if __name__ == "__main__":

    parser = argparse.ArgumentParser()

    # hyperparameters sent by the client are passed as command-line arguments to the script.
    parser.add_argument("--epochs", type=int, default=3)
    parser.add_argument("--train_batch_size", type=int, default=32)
    parser.add_argument("--eval_batch_size", type=int, default=32)
    parser.add_argument("--warmup_steps", type=int, default=500)
    parser.add_argument("--model_name", type=str)
    parser.add_argument("--learning_rate", type=float, default=5e-5)

    # Data, model, and output directories
    parser.add_argument("--output_data_dir", type=str, default=os.environ["SM_OUTPUT_DATA_DIR"])
    parser.add_argument("--model_dir", type=str, default=os.environ["SM_MODEL_DIR"])
    parser.add_argument("--n_gpus", type=str, default=os.environ["SM_NUM_GPUS"])
    parser.add_argument("--training_dir", type=str, default=os.environ["SM_CHANNEL_TRAIN"])
    parser.add_argument("--test_dir", type=str, default=os.environ["SM_CHANNEL_TEST"])

    args, _ = parser.parse_known_args()
    # is needed for Amazon SageMaker Training Compiler
    os.environ["GPU_NUM_DEVICES"] = args.n_gpus
    
    # Set up logging
    logger = logging.getLogger(__name__)

    logging.basicConfig(
        level=logging.getLevelName("INFO"),
        handlers=[logging.StreamHandler(sys.stdout)],
        format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    )

    # load datasets
    train_dataset = load_from_disk(args.training_dir)
    test_dataset = load_from_disk(args.test_dir)

    logger.info(f" loaded train_dataset length is: {len(train_dataset)}")
    logger.info(f" loaded test_dataset length is: {len(test_dataset)}")

    # https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_fscore_support.html
    # https://engineering.freeagent.com/2021/09/15/fine-tuning-bert-for-multiclass-categorisation-with-amazon-sagemaker/
  
    def multi_label_metrics(predictions, labels, threshold=0.5):
        # first, apply sigmoid on predictions which are of shape (batch_size, num_labels)
        sigmoid = torch.nn.Sigmoid()
        probs = sigmoid(torch.Tensor(predictions))
        # next, use threshold to turn them into integer predictions
        y_pred = np.zeros(probs.shape)
        y_pred[np.where(probs >= threshold)] = 1
        # finally, compute metrics
        y_true = labels
        f1_micro_average = f1_score(y_true=y_true, y_pred=y_pred, average='micro')
        roc_auc = roc_auc_score(y_true, y_pred, average = 'micro')
        accuracy = accuracy_score(y_true, y_pred)
        # return as dictionary
        metrics = {'f1': f1_micro_average,
                   'roc_auc': roc_auc,
                   'accuracy': accuracy}
        return metrics
    
    def compute_metrics(eval_pred):
        preds, labels = eval_pred
        print('labels:',labels, labels.shape)
        print('preds:',preds, preds.shape)
        
        result = multi_label_metrics(
            predictions=preds, 
            labels=labels)
        return result

    # labels is a list
    labels = train_dataset[0]["labels"]
    num_labels = len(labels)

    label2id={'anger': 0,
              'anticipation': 1,
              'disgust': 2,
              'fear': 3,
              'joy': 4,
              'love': 5,
              'optimism': 6,
              'pessimism': 7,
              'sadness': 8,
              'surprise': 9,
              'trust': 10}
    id2label={0: 'anger',
              1: 'anticipation',
              2: 'disgust',
              3: 'fear',
              4: 'joy',
              5: 'love',
              6: 'optimism',
              7: 'pessimism',
              8: 'sadness',
              9: 'surprise',
              10: 'trust'}
   

    # download model and tokenizer from model hub
    model = AutoModelForSequenceClassification.from_pretrained(args.model_name, problem_type="multi_label_classification", num_labels=num_labels,id2label=id2label,
                                                           label2id=label2id)
    tokenizer = AutoTokenizer.from_pretrained(args.model_name)


    # define training args
    training_args = TrainingArguments(
        output_dir=args.model_dir,
        num_train_epochs=args.epochs,
        per_device_train_batch_size=args.train_batch_size,
        per_device_eval_batch_size=args.eval_batch_size,
        warmup_steps=args.warmup_steps,
        evaluation_strategy="epoch",
        save_strategy="epoch",
        save_total_limit=2,
        logging_dir=f"{args.output_data_dir}/logs",
        learning_rate=float(args.learning_rate),
        #load_best_model_at_end=True,
        #metric_for_best_model="f1",
    )

    # create Trainer instance
    trainer = Trainer(
        model=model,
        args=training_args,
        compute_metrics=compute_metrics,
        train_dataset=train_dataset,
        eval_dataset=test_dataset,
        tokenizer=tokenizer,
    )

    # train model
    trainer.train()

    # evaluate model
    eval_result = trainer.evaluate(eval_dataset=test_dataset)

    # write the eval results to a file that can be accessed later in the S3 output
    with open(os.path.join(args.output_data_dir, "eval_results.txt"), "w") as writer:
        print("***** Eval results *****")
        for key, value in sorted(eval_result.items()):
            writer.write(f"{key} = {value}\n")

    # save the model to model_dir; SageMaker uploads this directory to S3 after training
    trainer.save_model(args.model_dir)
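
The `multi_label_metrics` function in `train.py` applies a sigmoid to the logits, thresholds at 0.5, and computes micro-averaged F1 and accuracy. For multi-label indicator arrays, `accuracy_score` is exact-match (subset) accuracy: a row counts only if every label is right. The dependency-free sketch below works through those two metrics on made-up logits. It is meant to show the arithmetic and is not a replacement for the scikit-learn calls.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def micro_f1_and_subset_accuracy(logits, labels, threshold=0.5):
    """Threshold sigmoid(logits), then pool TP/FP/FN over every label."""
    tp = fp = fn = exact = 0
    for row_logits, row_labels in zip(logits, labels):
        row_pred = [1 if sigmoid(x) >= threshold else 0 for x in row_logits]
        exact += int(row_pred == list(row_labels))
        for p, y in zip(row_pred, row_labels):
            tp += int(p == 1 and y == 1)
            fp += int(p == 1 and y == 0)
            fn += int(p == 0 and y == 1)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return f1, exact / len(labels)


# Two made-up examples with three labels each.
logits = [[2.0, -1.0, 0.5], [-0.3, 1.2, -2.0]]
labels = [[1, 0, 0], [0, 1, 1]]
f1, accuracy = micro_f1_and_subset_accuracy(logits, labels)
print(f1, accuracy)
```

Here two of three predicted labels are correct and one true label is missed, so micro-F1 is 2/3 while subset accuracy is 0 because neither row matches exactly. This gap is typical for multi-label tasks, which is why F1 is usually the more informative metric.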


## Creating an Estimator and start a training job

Name your training job so you can follow it:

In [None]:
import datetime
# timestamp in the form YYYY-MM-DD-HH-MM-SS
current_time = datetime.datetime.now().strftime("%Y-%m-%d-%H-%M-%S")
training_job_name=f'finetune-{model_name}-{current_time}'
print( training_job_name )
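
SageMaker restricts which names a training job can have, and the same name is reused for the endpoint later in this notebook. The sketch below builds the name and fails early if it looks invalid. The constraint it encodes (at most 63 characters, alphanumerics and hyphens only) is an assumption based on the `CreateTrainingJob` API reference, and `make_job_name` is a hypothetical helper.

```python
import datetime
import re

# Assumed constraint: alphanumerics and hyphens, starting and ending
# with an alphanumeric character, at most 63 characters in total.
JOB_NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9])*")
MAX_JOB_NAME_LENGTH = 63


def make_job_name(prefix, model_name, now=None):
    """Build '<prefix>-<model>-<timestamp>' and reject names that look invalid."""
    now = now or datetime.datetime.now()
    name = f"{prefix}-{model_name}-{now.strftime('%Y-%m-%d-%H-%M-%S')}"
    if len(name) > MAX_JOB_NAME_LENGTH or not JOB_NAME_PATTERN.fullmatch(name):
        raise ValueError(f"Invalid training job name: {name!r}")
    return name


job_name = make_job_name("finetune", "distilbert-base-uncased")
print(job_name, len(job_name))
```

A check like this matters mostly when `model_name` contains a slash or underscore, as many model identifiers on the Hub do.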

In [None]:
from sagemaker.huggingface import HuggingFace, TrainingCompilerConfig

# initialize the Amazon SageMaker Training Compiler
compiler_config=TrainingCompilerConfig()

In [None]:
# Batch size limits
# https://docs.aws.amazon.com/sagemaker/latest/dg/training-compiler-support.html#training-compiler-supported-frameworks-pytorch
# Using p3.2xlarge for bert-base-uncased, batch limits:  16 or 24 (compiler)

In [None]:
hyperparameters={'epochs': 3,
                 'train_batch_size': 24,
                 'eval_batch_size': 24,   
                 'model_name': model_name,
                 'learning_rate': 3e-5,
                 }

In [None]:
# emit metrics to cloudwatch
metric_definitions=[
    {'Name': 'eval_loss', 'Regex': r"'eval_loss': ([0-9]+(\.|e-)[0-9]+),?"},
    {'Name': 'eval_accuracy', 'Regex': r"'eval_accuracy': ([0-9]+(\.|e-)[0-9]+),?"},
    {'Name': 'eval_f1', 'Regex': r"'eval_f1': ([0-9]+(\.|e-)[0-9]+),?"},
    {'Name': 'eval_runtime', 'Regex': r"'eval_runtime': ([0-9]+(\.|e-)[0-9]+),?"},
    {'Name': 'eval_samples_per_second', 'Regex': r"'eval_samples_per_second': ([0-9]+(\.|e-)[0-9]+),?"},
    {'Name': 'epoch', 'Regex': r"'epoch': ([0-9]+(\.|e-)[0-9]+),?"}]
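
Before launching a job, it can help to check a metric pattern against a sample log line. The sketch below does this for `eval_loss` with a pattern of the same shape as the ones above. The log line is made up, in the dictionary style that the Trainer prints.

```python
import re

# Same shape as the patterns above, written as a raw string with the dot escaped.
EVAL_LOSS = re.compile(r"'eval_loss': ([0-9]+(\.|e-)[0-9]+),?")

# A made-up log line for illustration.
log_line = "{'eval_loss': 0.3127, 'eval_f1': 0.6841, 'epoch': 1.0}"

match = EVAL_LOSS.search(log_line)
eval_loss = float(match.group(1))
print(eval_loss)
```

The first capture group holds the metric value.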

#### Distributed training (Optional)
https://sagemaker.readthedocs.io/en/v2.59.8/frameworks/huggingface/sagemaker.huggingface.html

The `distribution` parameter is a dictionary that describes how to run distributed training (default: `None`). Currently, the following are supported: distributed training with parameter servers, SageMaker Distributed (SMD) Data and Model Parallelism, and MPI. SMD Model Parallelism can only be used with MPI. To enable a parameter server, use the following setup:
```
{
    "parameter_server": {
        "enabled": True
    }
}
```
To enable MPI:
```
{
    "mpi": {
        "enabled": True
    }
}
```
To enable SMDistributed Data Parallel or Model Parallel:
```
{
    "smdistributed": {
        "dataparallel": {
            "enabled": True
        },
        "modelparallel": {
            "enabled": True,
            "parameters": {}
        }
    }
}
```

### Change to the train.py if using parallelism



[Distributed Training: Data-Parallel](https://huggingface.co/transformers/v4.4.2/sagemaker.html#prepare-a-transformers-fine-tuning-script)

You can use the SageMaker Data Parallelism Library out of the box for distributed training. Data parallelism is built directly into the `Trainer`, so if your train.py uses the Trainer API, you only need to define the `distribution` parameter in the HuggingFace Estimator.

[Distributed Training: Model-Parallel](https://huggingface.co/transformers/v4.4.2/sagemaker.html#prepare-a-transformers-fine-tuning-script)

You can also use the SageMaker Model Parallelism Library out of the box for distributed training. The Trainer API is extended by `SageMakerTrainer` to use the model parallelism library, so you only have to change the imports in your train.py to `from transformers.sagemaker import SageMakerTrainingArguments as TrainingArguments` and `from transformers.sagemaker import SageMakerTrainer as Trainer`.

[Example Notebook](https://github.com/huggingface/notebooks/blob/main/sagemaker/04_distributed_training_model_parallelism/sagemaker-notebook.ipynb)

After the adjustments in train.py, you need to extend the distribution configuration in the HuggingFace Estimator. For detailed information about the adjustments, take a look [here.](https://sagemaker.readthedocs.io/en/stable/api/training/smd_model_parallel_general.html?highlight=modelparallel#required-sagemaker-python-sdk-parameters)

[Model parallelism basics](https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-intro.html)

In [None]:
# configuration for running training on smdistributed Data Parallel
distribution_dataparallel = {'smdistributed':{'dataparallel':{ 'enabled': True }}}


*Please note smdataparallel supports these instance types: `ml.p3.16xlarge`, `ml.p3dn.24xlarge`, `ml.p4d.24xlarge`, `ml.p4de.24xlarge`.*

In [None]:
# configuration for running training on smdistributed Model Parallel
# https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-intro.html
mpi_options = {
    "enabled" : True,
    "processes_per_host" : 1,
}
smp_options = {
    "enabled":True,
    "parameters": {
        "microbatches": 1,
        "placement_strategy": "spread",
        "pipeline": "interleaved",
        "optimize": "speed",
        "partitions": 2,
        "ddp": True,
    }
}

distribution_modelparallel={
    "smdistributed": {"modelparallel": smp_options},
    "mpi": mpi_options
}

*Please note smmodelparallel does not support SageMaker training compiler. 5/10/2023*

fp16 is not enabled for model parallelism in this notebook, because enabling it produced an error.
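
Instead of commenting and uncommenting configuration, the choice between the three setups can be gathered in one place. The sketch below is one way to do that. `make_distribution` is a hypothetical helper, the instance list is copied from the smdataparallel note above, and the model-parallel parameters mirror the cell above.

```python
# Instance types listed in the note above as supported by smdataparallel.
SMDATAPARALLEL_INSTANCES = (
    "ml.p3.16xlarge", "ml.p3dn.24xlarge", "ml.p4d.24xlarge", "ml.p4de.24xlarge",
)


def make_distribution(mode, instance_type, partitions=2):
    """Return the `distribution` dict for the estimator, or None for a plain job."""
    if mode is None:
        return None
    if mode == "dataparallel":
        if instance_type not in SMDATAPARALLEL_INSTANCES:
            raise ValueError(f"{instance_type} is not listed for smdataparallel")
        return {"smdistributed": {"dataparallel": {"enabled": True}}}
    if mode == "modelparallel":
        smp_parameters = {
            "microbatches": 1,
            "placement_strategy": "spread",
            "pipeline": "interleaved",
            "optimize": "speed",
            "partitions": partitions,
            "ddp": True,
        }
        return {
            "smdistributed": {"modelparallel": {"enabled": True, "parameters": smp_parameters}},
            "mpi": {"enabled": True, "processes_per_host": 1},
        }
    raise ValueError(f"Unknown mode: {mode!r}")


distribution = make_distribution("modelparallel", "ml.p3.2xlarge")
print(distribution)
```

The returned value can be passed as the `distribution` argument of the estimator; `None` corresponds to a job without distributed training.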

In [None]:
# instance configurations
instance_type='ml.p3.2xlarge'
instance_count=2
instance_volume_size=60

### Define SageMaker Hugging Face Estimator 

In [None]:

# https://github.com/aws/deep-learning-containers/blob/master/available_images.md
# https://docs.aws.amazon.com/sagemaker/latest/dg/training-compiler-support.html#training-compiler-supported-frameworks
# 05/10/2023

huggingface_estimator = HuggingFace(
    entry_point="train.py",
    instance_type=instance_type,
    instance_count=instance_count,
    role=role,
    transformers_version="4.17.0",
    pytorch_version="1.10.2",
    py_version="py38",
    volume_size=instance_volume_size,
    hyperparameters=hyperparameters,
    #compiler_config=compiler_config,  # the compiler does not work with distributed model parallel.
    disable_profiler=True,
    debugger_hook_config = False, 
    metric_definitions=metric_definitions,
    max_run=36000,  # expected max run in seconds
    distribution= distribution_modelparallel, #comment out if not using distributed model parallel

)

Start the training job using the estimator's `fit` method:

In [None]:
huggingface_estimator.fit({'train': training_input_path, 'test': test_input_path}, job_name=training_job_name)

Wait for the training to finish. Training takes approximately 10 mins to complete.

## Training metrics
You can now display the training metrics.

In [None]:
from sagemaker import TrainingJobAnalytics

# Captured metrics can be accessed as a Pandas dataframe
df = TrainingJobAnalytics(training_job_name=training_job_name).dataframe()
df.head(10)

# Deploy Endpoint

The Hugging Face inference container lets you use the `transformers` `pipelines` features through the endpoint.

The request format is modeled on the [🤗  Accelerated Inference API](https://api-inference.huggingface.co/docs/python/html/detailed_parameters.html): your inputs go in the `inputs` key, and any additional supported `pipelines` parameters go in the `parameters` key.
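
A minimal sketch of such a request body follows. It assumes that the container forwards `parameters` as keyword arguments to the text-classification pipeline, and that `return_all_scores` is accepted by the pipeline version in the image. Neither assumption has been verified against this specific container.

```python
import json

payload = {
    "inputs": "Tiller and breezy should do a collab album.",
    # Assumption: forwarded as keyword arguments to the pipeline call.
    "parameters": {"return_all_scores": True},
}

# The predictor serializes the dictionary to JSON before sending it.
body = json.dumps(payload)
print(body)
```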



You can now proceed and create an endpoint with the trained model.

In [None]:
predictor = huggingface_estimator.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge", endpoint_name=training_job_name)

In [None]:
data = {"inputs": "Tiller and breezy should do a collab album. Rapping and singing prolly be fire"}

# request
predictor.predict(data)

The default inference image returns only the label with the highest score. To return multiple labels with scores, you can modify the `inference.py` within the image. Alternatively, you can package a custom `inference.py` with the model artifact and deploy that to an endpoint, so that the endpoint can predict multiple labels.

References:

https://huggingface.co/docs/sagemaker/inference

https://discuss.huggingface.co/t/inference-toolkit-init-and-default-template-for-custom-inference/10469

https://github.com/philschmid/sample-custom-inference-sagemaker-huggingface/blob/master/code/inference.py

https://github.com/huggingface/notebooks/blob/main/sagemaker/23_stable_diffusion_inference/sagemaker-notebook.ipynb

https://github.com/huggingface/notebooks/blob/main/sagemaker/17_custom_inference_script/code/inference.py

## (Optional) Create new inference.py and deploy to endpoint

#### Get model artifact from S3 

In [None]:
import boto3

sm_client = boto3.client('sagemaker')

#sm_client.describe_training_job(TrainingJobName=training_job_name)

In [None]:
response=sm_client.describe_training_job(TrainingJobName=training_job_name)
model_url=response['ModelArtifacts']['S3ModelArtifacts']
model_url

In [None]:
!aws s3 cp $model_url model.tar.gz

#### Create custom inference.py

In [None]:
!mkdir code

In [None]:
%%writefile code/inference.py
import os
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import numpy as np
from sagemaker_huggingface_inference_toolkit import decoder_encoder
import torch

id2label = {
    0: "anger",
    1: "anticipation",
    2: "disgust",
    3: "fear",
    4: "joy",
    5: "love",
    6: "optimism",
    7: "pessimism",
    8: "sadness",
    9: "surprise",
    10: "trust",
}

def model_fn(model_dir):
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    return {"model": model, "tokenizer": tokenizer}


def input_fn(input_data, content_type):
    decoded_input_data = decoder_encoder.decode(input_data, content_type)
    return decoded_input_data


def predict_fn(data, model):
    inputs = model["tokenizer"](
        data["inputs"],
        return_tensors="pt",
        max_length=128,
        padding="max_length",
        truncation=True,
    )
    with torch.no_grad():
        # pass tensors by keyword so each one reaches the right argument
        logits = model["model"](**inputs).logits

    # independent per-label probabilities for the single input text
    probs = torch.sigmoid(logits.squeeze(0)).cpu().numpy()

    # keep every label whose probability reaches the 0.5 threshold
    predicted_labels = [
        id2label[idx] for idx, prob in enumerate(probs) if prob >= 0.5
    ]

    return predicted_labels
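
The handler above returns label names only. If the scores are also of interest, the post-processing step could be extended as in the dependency-free sketch below. `labels_with_scores` is a hypothetical helper and the logits are made up.

```python
import math

id2label = {
    0: "anger", 1: "anticipation", 2: "disgust", 3: "fear", 4: "joy", 5: "love",
    6: "optimism", 7: "pessimism", 8: "sadness", 9: "surprise", 10: "trust",
}


def labels_with_scores(logits, id2label, threshold=0.5):
    """Return every label whose sigmoid score reaches the threshold, best first."""
    scored = []
    for idx, logit in enumerate(logits):
        score = 1.0 / (1.0 + math.exp(-logit))
        if score >= threshold:
            scored.append({"label": id2label[idx], "score": round(score, 4)})
    return sorted(scored, key=lambda item: item["score"], reverse=True)


# Made-up logits for one tweet, one value per label.
logits = [-3.0, -1.0, -2.5, -4.0, 2.0, 0.1, 1.0, -2.0, -3.0, -1.5, -0.5]
result = labels_with_scores(logits, id2label)
print(result)
```

Returning a list of dictionaries keeps the response JSON-serializable.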

#### Pack model artifact with new inference.py and upload to S3

In [None]:
from pathlib import Path    
import random

# create model dir
model_tar = Path(f"model-{random.getrandbits(16)}")
model_tar.mkdir(exist_ok=True)

In [None]:
from distutils.dir_util import copy_tree
# copy code/ to model dir
copy_tree("code/", str(model_tar.joinpath("code")))

In [None]:
# extract the trained model into the directory created above
!tar -xvf model.tar.gz -C {model_tar}

In [None]:
import tarfile
import os

# helper to create the model.tar.gz
def compress(tar_dir=None,output_file="model.tar.gz"):
    parent_dir=os.getcwd()
    os.chdir(tar_dir)
    with tarfile.open(os.path.join(parent_dir, output_file), "w:gz") as tar:
        for item in os.listdir('.'):
          print(item)
          tar.add(item, arcname=item)    
    os.chdir(parent_dir)
            

In [None]:
compress(str(model_tar))
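
The Hugging Face inference toolkit looks for the custom handler at `code/inference.py` at the top level of the archive, so it is worth checking the layout before uploading. The sketch below builds a toy archive in a temporary directory, adding top-level items the same way as `compress`, and lists its members. The file contents are placeholders.

```python
import tarfile
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Toy model directory with a placeholder config and a custom handler.
    root = Path(tmp) / "model"
    (root / "code").mkdir(parents=True)
    (root / "config.json").write_text("{}")
    (root / "code" / "inference.py").write_text("# custom handlers\n")

    # Add each top-level item under its own name, without the parent directory.
    archive = Path(tmp) / "model.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for item in sorted(root.iterdir()):
            tar.add(item, arcname=item.name)

    with tarfile.open(archive, "r:gz") as tar:
        names = sorted(tar.getnames())

print(names)
```

The same `getnames()` check can be run on the real `model.tar.gz` to confirm that `code/inference.py` sits next to the model files rather than inside a nested folder.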

In [None]:
from sagemaker.s3 import S3Uploader
# upload model.tar.gz to s3
s3_model_uri=S3Uploader.upload(local_path="model.tar.gz", desired_s3_uri=f"s3://{sess.default_bucket()}/samples/datasets/twitter/bert-model")

print(f"model uploaded to: {s3_model_uri}")

In [None]:
!aws s3 cp $s3_model_uri model2.tar.gz

#### Deploy the custom Hugging Face Model to Amazon SageMaker

In [None]:
from sagemaker.huggingface.model import HuggingFaceModel

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   model_data=s3_model_uri,       # path to your model and script
   role=role,                    # iam role with permissions to create an Endpoint
   transformers_version="4.17.0",  # transformers version used
   pytorch_version="1.10.2",        # pytorch version used
   py_version='py38',            # python version used
)

# deploy the endpoint
new_predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge"
    )

In [None]:
data = {"inputs": "I'm very happy I fine-tune a hugging face DistilBert model for multi-label classification"}

# request
new_predictor.predict(data)

### Cleanup
After you are finished experimenting with this notebook, run the following cell to delete the models and endpoints.

In [None]:
predictor.delete_model()
predictor.delete_endpoint()
new_predictor.delete_model()
new_predictor.delete_endpoint()

### References:
 
https://github.com/aws-samples/finetune-deploy-bert-with-amazon-sagemaker-for-hugging-face/blob/main/finetune-distilbert.ipynb

https://github.com/huggingface/notebooks/blob/main/sagemaker/01_getting_started_pytorch/sagemaker-notebook.ipynb

https://github.com/huggingface/notebooks/blob/main/sagemaker/15_training_compiler/sagemaker-notebook.ipynb


https://github.com/DhavalTaunk08/NLP_scripts/blob/master/Transformers_multilabel_distilbert.ipynb

https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/BERT/Fine_tuning_BERT_(and_friends)_for_multi_label_text_classification.ipynb#scrollTo=bIH9NP0MZ6-O

https://engineering.freeagent.com/2021/09/15/fine-tuning-bert-for-multiclass-categorisation-with-amazon-sagemaker/

https://www.alexanderjunge.net/blog/til-multi-label-automodelforsequenceclassification/

https://jesusleal.io/2021/04/21/Longformer-multilabel-classification/
