# Working with Hugging Face models

<img align="left" width="130" src="https://raw.githubusercontent.com/PacktPublishing/Amazon-SageMaker-Cookbook/master/Extra/cover-small-padded.png"/>

This notebook contains the code for one of the recipes in the book [Machine Learning with Amazon SageMaker Cookbook: 80 proven recipes for data scientists and developers to perform ML experiments and deployments](https://www.amazon.com/Machine-Learning-Amazon-SageMaker-Cookbook/dp/1800567030).

### How to do it...

In [None]:
!mkdir -p scripts

In [None]:
g = "raw.githubusercontent.com"
p = "PacktPublishing"
a = "Amazon-SageMaker-Cookbook"
mc = "master/Chapter09"

path = f"https://{g}/{p}/{a}/{mc}/scripts"

In [None]:
!wget -P scripts {path}/setup.py
!wget -P scripts {path}/train.py
!wget -P scripts {path}/inference.py
!wget -P scripts {path}/requirements.txt

In [None]:
!mkdir -p tmp

In [None]:
g = "raw.githubusercontent.com"
p = "PacktPublishing"
a = "Amazon-SageMaker-Cookbook"
mc = "master/Chapter09"

path = f"https://{g}/{p}/{a}/{mc}/files"

In [None]:
!wget -P tmp {path}/synthetic.train.txt

In [None]:
!wget -P tmp {path}/synthetic.validation.txt

In [None]:
# Replace the placeholder with the name of an existing S3 bucket you can write to
s3_bucket = "<insert S3 bucket name here>"
prefix = "chapter09"

In [None]:
s3_train_data = 's3://{}/{}/input/{}'.format(
    s3_bucket, 
    prefix, 
    "synthetic.train.txt"
)
s3_validation_data = 's3://{}/{}/input/{}'.format(
    s3_bucket, 
    prefix, 
    "synthetic.validation.txt"
)
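
The two URIs above follow the pattern `s3://<bucket>/<prefix>/input/<file>`. As a minimal sketch, the same construction can be wrapped in a small helper that also catches the case where the bucket placeholder was never replaced. The helper name `build_s3_uri` is hypothetical and not part of the recipe or of the SageMaker SDK.

```python
def build_s3_uri(bucket: str, *parts: str) -> str:
    """Join a bucket name and key parts into an s3:// URI.

    Hypothetical helper for illustration; the recipe itself uses str.format.
    """
    # Fail early if the "<insert S3 bucket name here>" placeholder is still in place
    if not bucket or "<" in bucket or ">" in bucket:
        raise ValueError("Replace the placeholder with a real S3 bucket name")
    # Strip stray slashes so parts never produce "//" inside the key
    key = "/".join(part.strip("/") for part in parts if part)
    return f"s3://{bucket}/{key}"


# "my-example-bucket" is a made-up bucket name used only for this demonstration
print(build_s3_uri("my-example-bucket", "chapter09", "input", "synthetic.train.txt"))
```

Failing before the `aws s3 cp` step tends to be easier to diagnose than an upload error that mentions a malformed bucket name.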

In [None]:
!aws s3 cp tmp/synthetic.train.txt {s3_train_data}

In [None]:
!aws s3 cp tmp/synthetic.validation.txt {s3_validation_data}

In [None]:
import sagemaker

# IAM role attached to this notebook; SageMaker assumes it for training and hosting
role = sagemaker.get_execution_role()
session = sagemaker.Session()

In [None]:
from sagemaker.huggingface import HuggingFace

hyperparameters = {
    'epochs': 1,
    'train_batch_size': 32,
    'model_name':'distilbert-base-uncased'
}
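
In script mode, SageMaker passes each entry of the `hyperparameters` dictionary to the entry point script as a command-line argument of the form `--key value`. The sketch below shows how a training script could read the three values defined above with `argparse`. It is an illustration only: the function name `parse_args` and the defaults are assumptions, and the recipe's actual `train.py` may parse its arguments differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the hyperparameters that SageMaker forwards as CLI arguments.

    Hypothetical sketch; not taken from the recipe's train.py.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument("--train_batch_size", type=int, default=32)
    parser.add_argument("--model_name", type=str, default="distilbert-base-uncased")
    # parse_known_args ignores any extra arguments the container may add
    args, _unknown = parser.parse_known_args(argv)
    return args


# Simulate the arguments that correspond to the hyperparameters dictionary
args = parse_args([
    "--epochs", "1",
    "--train_batch_size", "32",
    "--model_name", "distilbert-base-uncased",
])
print(args.epochs, args.train_batch_size, args.model_name)
```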

In [None]:
estimator = HuggingFace(
    entry_point='train.py',
    source_dir='./scripts',
    instance_type='ml.p3.2xlarge',
    instance_count=1,
    role=role,
    transformers_version='4.4',
    pytorch_version='1.6',
    py_version='py36',
    hyperparameters=hyperparameters
)

In [None]:
from sagemaker.inputs import TrainingInput

train_data = TrainingInput(s3_train_data)
validation_data = TrainingInput(s3_validation_data)

data_channels = {
    'train': train_data, 
    'valid': validation_data
}
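
Each key in `data_channels` becomes a channel name. Inside the training container, the data for a channel is mounted under `/opt/ml/input/data/<channel>`, and the SageMaker framework containers expose that path through an environment variable named `SM_CHANNEL_<CHANNEL>` in upper case. The sketch below shows how a script could resolve the `train` and `valid` channels. The helper `channel_dir` and the `fake_env` dictionary are hypothetical and exist only so the example runs outside a training job.

```python
import os


def channel_dir(channel: str, env=None) -> str:
    """Return the local directory for a SageMaker input channel.

    Hypothetical helper; falls back to the conventional mount point when
    the SM_CHANNEL_* variable is not set.
    """
    env = os.environ if env is None else env
    return env.get(f"SM_CHANNEL_{channel.upper()}", f"/opt/ml/input/data/{channel}")


# Stand-in for the environment of a running training job (assumed values)
fake_env = {
    "SM_CHANNEL_TRAIN": "/opt/ml/input/data/train",
    "SM_CHANNEL_VALID": "/opt/ml/input/data/valid",
}

for name in ("train", "valid"):
    print(name, "->", channel_dir(name, fake_env))
```

The channel names must match whatever the training script expects, so renaming `'valid'` here would also require a change in the script.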

In [None]:
%%time

estimator.fit(data_channels)

In [None]:
from sagemaker.pytorch.model import PyTorchModel

model_data = estimator.model_data

model = PyTorchModel(
    model_data=model_data, 
    role=role, 
    source_dir="scripts",
    entry_point='inference.py', 
    framework_version='1.6.0',
    py_version="py3"
)
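
The `entry_point` script of a `PyTorchModel` can override the handler functions `model_fn`, `input_fn`, `predict_fn`, and `output_fn` that the PyTorch serving container calls for each request. As a rough sketch of the request and response side only, the functions below decode a JSON body with a `text` field (the shape used by the test payloads in this notebook) and encode the result back to JSON. This is not the recipe's `inference.py`, whose implementation may differ; the model loading and prediction steps are left out on purpose.

```python
import json

JSON_CONTENT_TYPE = "application/json"


def input_fn(request_body, request_content_type=JSON_CONTENT_TYPE):
    """Decode the request body and return the text to classify."""
    if request_content_type != JSON_CONTENT_TYPE:
        raise ValueError(f"Unsupported content type: {request_content_type}")
    payload = json.loads(request_body)  # accepts str or bytes
    return payload["text"]


def output_fn(prediction, accept=JSON_CONTENT_TYPE):
    """Encode the prediction as a JSON string."""
    if accept != JSON_CONTENT_TYPE:
        raise ValueError(f"Unsupported accept type: {accept}")
    return json.dumps(prediction)


# Simulate the body that a JSON serializer would send to the endpoint
body = json.dumps({"text": "This tastes bad. I hate this place."})
print(input_fn(body))
```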

In [None]:
%%time

predictor = model.deploy(
    instance_type='ml.m5.xlarge', 
    initial_instance_count=1
)

In [None]:
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

predictor.serializer = JSONSerializer()
predictor.deserializer = JSONDeserializer()

In [None]:
test_data = {
    "text": "This tastes bad. I hate this place."
}

predictor.predict(test_data)

In [None]:
test_data = {
    "text": "Very delicious. I would recommend this to my friends"
}

predictor.predict(test_data)

In [None]:
# Delete the endpoint once you are done so that it stops incurring charges
predictor.delete_endpoint()
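
If a prediction call raises an exception before the cleanup cell runs, the endpoint keeps running. One possible safeguard, sketched below, is a context manager that always calls `delete_endpoint()` on exit. The names `temporary_endpoint` and `FakePredictor` are hypothetical; `FakePredictor` is a stand-in that mimics only the two predictor methods used in this notebook so the example runs without AWS access.

```python
from contextlib import contextmanager


@contextmanager
def temporary_endpoint(predictor):
    """Yield the predictor and delete its endpoint on exit, even after errors."""
    try:
        yield predictor
    finally:
        predictor.delete_endpoint()


class FakePredictor:
    """Stand-in for a deployed predictor (illustration only)."""

    def __init__(self):
        self.deleted = False

    def predict(self, data):
        # Echo the input instead of calling a real endpoint
        return {"echo": data}

    def delete_endpoint(self):
        self.deleted = True


fake = FakePredictor()
with temporary_endpoint(fake) as p:
    p.predict({"text": "Very delicious. I would recommend this to my friends"})
print("endpoint deleted:", fake.deleted)
```

With a real predictor, the object returned by `model.deploy(...)` would take the place of `FakePredictor()`.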