# Train and evaluate a Sentence Transformer model in AWS SageMaker
# Sentence transformer model for news in Spanish

This notebook is a simple tutorial on how to train or fine-tune a sentence transformer model for the Spanish language. Our pretrained model is [Bertin](https://huggingface.co/bertin-project/bertin-roberta-base-spanish), a BERT-based (RoBERTa) model for Spanish. For this exercise we use SentenceTransformers, a Python framework for state-of-the-art sentence, text and image embeddings. The initial work is described in the paper [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://arxiv.org/abs/1908.10084).

This framework provides an easy method to compute dense vector representations for sentences, paragraphs, and images. The models are based on transformer networks such as BERT, RoBERTa and XLM-RoBERTa, and achieve state-of-the-art performance in various tasks. Text is embedded in a vector space such that similar texts are close to each other and can be found efficiently using cosine similarity. The framework provides a large list of pretrained models for more than 100 languages. Some are general-purpose models, while others produce embeddings for specific use cases. Pretrained models can be loaded by just passing the model name: `SentenceTransformer('model_name')`. You can then fine-tune your own sentence embedding model to get task-specific sentence embeddings, with various options to choose from so that the embeddings suit your specific task.
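To make the idea of "similar text is close" concrete, here is a minimal sketch of cosine similarity in plain Python. The three-dimensional vectors are made-up toy values, not real model output; an actual sentence embedding from this model has 768 dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy "embeddings" (hypothetical values chosen for illustration only)
news_a = [0.9, 0.1, 0.0]
news_b = [0.8, 0.2, 0.1]
recipe = [0.0, 0.1, 0.9]

sim_close = cosine_similarity(news_a, news_b)
sim_far = cosine_similarity(news_a, recipe)
print(f"news_a vs news_b: {sim_close:.3f}")
print(f"news_a vs recipe: {sim_far:.3f}")
```

Vectors pointing in nearly the same direction score close to 1, while unrelated ones score close to 0.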

You can use Sentence Transformer for:

- Computing Sentence Embeddings
- Semantic Textual Similarity
- Clustering
- Paraphrase Mining
- Translated Sentence Mining
- Semantic Search
- Retrieve & Re-Rank
- Text Summarization
- Multilingual Image Search, Clustering & Duplicate Detection

The following steps are explained:

1. Create an Experiment and Trial to keep track of our experiments

2. Load the training data to our training instance

3. Create the scripts to train our custom model, a Transformer

4. Create an Estimator to train our model in a Hugging Face PyTorch container in script mode

5. Create metric definitions to keep track of them in SageMaker

6. Download the trained model to make predictions

7. Resume training using the latest checkpoint from a previous training

1. [Introduction](#Introduction)  
2. [Development Environment and Permissions](#Set-up-the-environment)
    1. [Installation](#Installation)  
    2. [Development environment](#Development-environment)  
    3. [Permissions](#Permissions)
3. [Processing](#Preprocessing)   
    1. [Tokenization](#Tokenization)  
    2. [Uploading data to sagemaker_session_bucket](#Uploading-data-to-sagemaker_session_bucket)  
4. [Fine-tuning & starting Sagemaker Training Job](#Fine-tuning-\&-starting-Sagemaker-Training-Job)  
    1. [Creating an Estimator and start a training job](#Creating-an-Estimator-and-start-a-training-job)  
    2. [Estimator Parameters](#Estimator-Parameters)   
    3. [Download fine-tuned model from s3](#Download-fine-tuned-model-from-s3)
    4. [Attach to old training job to an estimator](#Attach-to-old-training-job-to-an-estimator)  
5. [_Coming soon_:Push model to the Hugging Face hub](#Push-model-to-the-Hugging-Face-hub)

# Introduction

_**NOTE: You can run this demo in Sagemaker Studio, your local machine or Sagemaker Notebook Instances**_

# Development Environment and Permissions 

## Installation

_*Note:* we only install the required libraries from Hugging Face and AWS. You also need PyTorch or TensorFlow if you don't have one of them installed already._

In [2]:
#!pip install sagemaker transformers datasets[s3] --upgrade


# Set up the environment

Let's start by setting up the environment:

In [1]:
import os

## Define global variables and parameters

In [2]:
#
# Specify your bucket name
bucket_name = 'edumunozsala-ml-sagemaker'
# Set the training data folder in S3
training_folder = r'cc-news-es/train'
# Set the validation data folder in S3
validation_folder = r'cc-news-es/test'
# Set the test data folder in S3
test_folder = r'cc-news-es/test'

# Set the output folder in S3
output_folder = r'sentence-transformer-spanish'
# Set the checkpoint in S3 folder for our model 
ckpt_folder = r'sentence-transformer-spanish/ckpt'

training_data_uri = r's3://' + bucket_name + r'/' + training_folder
validation_data_uri = r's3://' + bucket_name + r'/' + validation_folder
test_data_uri = r's3://' + bucket_name + r'/' + test_folder

output_data_uri = r's3://' + bucket_name + r'/' + output_folder
ckpt_data_uri = r's3://' + bucket_name + r'/' + ckpt_folder
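The string concatenation above can also be wrapped in a small helper. This is only a sketch: `s3_uri` and `example_bucket` are hypothetical names introduced here for illustration, not part of the SageMaker SDK.

```python
def s3_uri(bucket, *parts):
    """Join a bucket name and key segments into an s3:// URI."""
    # Drop empty segments and stray slashes so we never emit '//' inside the key
    segments = [p.strip("/") for p in parts if p and p.strip("/")]
    return "s3://" + "/".join([bucket.strip("/")] + segments)


example_bucket = "my-example-bucket"  # hypothetical bucket name
train_uri = s3_uri(example_bucket, "cc-news-es/train")
ckpt_uri = s3_uri(example_bucket, "sentence-transformer-spanish", "ckpt")
print(train_uri)
print(ckpt_uri)
```

Stripping slashes in one place avoids subtle mismatches such as a double slash in the key when a folder variable already ends with `/`.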


## Create the Session and Permissions to run the jobs in SageMaker

_If you are going to use SageMaker in a local environment, you need access to an IAM Role with the required permissions for SageMaker. You can find more about it [here](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html)._

In [3]:
import sagemaker

sess = sagemaker.Session()
# sagemaker session bucket -> used for uploading data, models and logs
# sagemaker will automatically create this bucket if it does not exist
# Specify your bucket name
sagemaker_session_bucket= bucket_name
if sagemaker_session_bucket is None and sess is not None:
    # set to default bucket if a bucket name is not given
    sagemaker_session_bucket = sess.default_bucket()

role = sagemaker.get_execution_role()
sess = sagemaker.Session(default_bucket=sagemaker_session_bucket)
region = sess.boto_session.region_name
print(role)
print(region)

print(f"sagemaker role arn: {role}")
print(f"sagemaker bucket: {sess.default_bucket()}")
print(f"sagemaker session region: {sess.boto_region_name}")

arn:aws:iam::223817798831:role/service-role/AmazonSageMaker-ExecutionRole-20200708T194212
us-east-1
sagemaker role arn: arn:aws:iam::223817798831:role/service-role/AmazonSageMaker-ExecutionRole-20200708T194212
sagemaker bucket: edumunozsala-ml-sagemaker
sagemaker session region: us-east-1


## Preprocessing and splitting the datasets

We are using `LeoCordoba/CC-NEWS-ES-titles`, a dataset of news in Spanish. After preprocessing, the dataset is uploaded to our `sagemaker_session_bucket` (the `cc-news-es/train` and `cc-news-es/test` folders defined above) to be used within our training job.

## Setting the model and training parameters

Now it is time to set the model and training parameters. They will be passed to the training script as hyperparameters in a later section.

In [4]:
TRAIN_BATCH_SIZE = 16   # input batch size for training (default: 64)
TRAIN_EPOCHS = 3       # number of epochs to train (default: 10)
#MAX_LEN = 256           # Max length for product description

# Fine-tuning & starting Sagemaker Training Job

To create a SageMaker training job we need a `HuggingFace` Estimator. The Estimator handles end-to-end Amazon SageMaker training and deployment tasks. In an Estimator we define which fine-tuning script is used as `entry_point`, which `instance_type` is used, which `hyperparameters` are passed in, and so on:



```python
huggingface_estimator = HuggingFace(entry_point='train.py',
                            source_dir='./scripts',
                            base_job_name='huggingface-sdk-extension',
                            instance_type='ml.p3.2xlarge',
                            instance_count=1,
                            transformers_version='4.4',
                            pytorch_version='1.6',
                            py_version='py36',
                            role=role,
                            hyperparameters = {'epochs': 1,
                                               'train_batch_size': 32,
                                               'model_name':'distilbert-base-uncased'
                                                })
```

When we create a SageMaker training job, SageMaker takes care of starting and managing all the required EC2 instances for us with the `huggingface` container, uploads the provided fine-tuning script `train.py`, and downloads the data from our `sagemaker_session_bucket` into the container at `/opt/ml/input/data`. Then it starts the training job by running:

```bash
/opt/conda/bin/python train.py --epochs 1 --model_name distilbert-base-uncased --train_batch_size 32
```

The `hyperparameters` you define in the `HuggingFace` estimator are passed in as named arguments. 

SageMaker provides useful properties about the training environment through various environment variables, including the following:

* `SM_MODEL_DIR`: A string that represents the path where the training job writes the model artifacts to. After training, artifacts in this directory are uploaded to S3 for model hosting.

* `SM_NUM_GPUS`: An integer representing the number of GPUs available to the host.

* `SM_CHANNEL_XXXX`: A string that represents the path to the directory that contains the input data for the specified channel. For example, if you specify two input channels in the HuggingFace estimator's fit call, named `train` and `test`, the environment variables `SM_CHANNEL_TRAIN` and `SM_CHANNEL_TEST` are set.
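A training script typically picks up both kinds of input with `argparse`: the hyperparameters as command-line flags and the environment variables as defaults. The sketch below is an assumption about how a `train.py` could do this, not the actual script used in this notebook. The flag names follow the hyphenated keys used later in this notebook's `hyperparameters` dictionary, and the local fallback paths are hypothetical.

```python
import argparse
import os


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # Hyperparameters arrive as command-line flags
    parser.add_argument("--epochs", type=int, default=3)
    parser.add_argument("--train-batch-size", type=int, default=16)
    parser.add_argument("--model-name", type=str,
                        default="bertin-project/bertin-roberta-base-spanish")
    # SageMaker environment variables, with hypothetical fallbacks for local runs
    parser.add_argument("--model-dir", type=str,
                        default=os.environ.get("SM_MODEL_DIR", "./model"))
    parser.add_argument("--n-gpus", type=int,
                        default=int(os.environ.get("SM_NUM_GPUS", "0")))
    parser.add_argument("--train-dir", type=str,
                        default=os.environ.get("SM_CHANNEL_TRAIN", "./data/train"))
    parser.add_argument("--test-dir", type=str,
                        default=os.environ.get("SM_CHANNEL_TEST", "./data/test"))
    # parse_known_args tolerates flags that this sketch does not define
    args, _ = parser.parse_known_args(argv)
    return args


args = parse_args(["--epochs", "1", "--train-batch-size", "32"])
print(args.epochs, args.train_batch_size, args.model_dir)
```

`argparse` converts hyphens to underscores, so `--train-batch-size` is read back as `args.train_batch_size`.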


To run your training job locally you can define `instance_type='local'`, or `instance_type='local_gpu'` for GPU usage. _Note: this does not work within SageMaker Studio._


In [8]:
!pygmentize ./scripts/train.py


*** Error while highlighting:
pkg_resources.VersionConflict: (jedi 0.18.0 (/opt/conda/lib/python3.6/site-packages), Requirement.parse('jedi<=0.17.2,>=0.10'))
   (file "/opt/conda/lib/python3.6/site-packages/pkg_resources/__init__.py", line 782, in resolve)
*** If this is a bug you want to report, please rerun with -v.


## Creating an Estimator and start a training job

In [8]:
!pip install sagemaker --upgrade


In [5]:
from sagemaker.huggingface import HuggingFace

# Set the model name to fine tune
model_name = "bertin-project/bertin-roberta-base-spanish"
# Set the trained model name
trained_model_name = "bertin-sts-cc-news-es"
    
# hyperparameters, which are passed into the training job
hyperparameters={'epochs': TRAIN_EPOCHS,
                 'train-batch-size': TRAIN_BATCH_SIZE,
                 'model-name': model_name,
                 'text-column': 'text',
                 'target-column': 'output_text',
                 #'max-source': MAX_LEN,
                 #'num-examples': 3000,
                 'trained-model-name': trained_model_name,
                 }

In [6]:
output_data_uri,training_data_uri

('s3://edumunozsala-ml-sagemaker/sentence-transformer-spanish',
 's3://edumunozsala-ml-sagemaker/cc-news-es/train')

In [7]:
huggingface_estimator = HuggingFace(entry_point='train.py',
                            source_dir='./scripts',
                            instance_type='ml.p2.xlarge',
                            #instance_type='local',
                            instance_count=1,
                            role=role,
                            output_path=output_data_uri,
                            code_location=output_data_uri,
                            #checkpoint_s3_uri = ckpt_data_uri, #output_data_uri,
                            #checkpoint_local_path= '/opt/ml/model',
                            transformers_version='4.6',
                            pytorch_version='1.7',
                            py_version='py36',
                            hyperparameters = hyperparameters)
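The overview lists metric definitions as one of the steps, but the estimator above does not set any. SageMaker estimators accept a `metric_definitions` list of dictionaries with `Name` and `Regex` keys, and each regex is applied to the training logs. The sketch below shows how such definitions could look and checks them locally against a sample line. The metric names, the regexes and the log line are all assumptions, and they would need to match whatever `train.py` actually prints.

```python
import re

# Hypothetical metric definitions: names and regexes must match the script's real log format
metric_definitions = [
    {"Name": "train:loss", "Regex": r"train_loss=([0-9.]+)"},
    {"Name": "eval:cosine_pearson", "Regex": r"cosine_pearson=([0-9.]+)"},
]


def extract_metrics(log_line, definitions):
    """Return {metric name: value} for every regex that matches the line."""
    found = {}
    for definition in definitions:
        match = re.search(definition["Regex"], log_line)
        if match:
            found[definition["Name"]] = float(match.group(1))
    return found


sample_line = "epoch=1 train_loss=0.4213 cosine_pearson=0.81"  # made-up log line
metrics = extract_metrics(sample_line, metric_definitions)
print(metrics)
```

Once the regexes match the real logs, the list could be passed to the estimator as `metric_definitions=metric_definitions`.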

In [8]:
# starting the train job with our uploaded datasets as input
huggingface_estimator.fit({'train': training_data_uri, 'validation': validation_data_uri, 'test': test_data_uri}, wait=False)

# Upload the model to Hugging Face Hub

The code in this section is adapted from a post on the Hugging Face blog:
https://huggingface.co/blog/sagemaker-distributed-training-seq2seq

First install git-lfs for model upload.

In [41]:
!curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.rpm.sh | sudo bash
!sudo yum install git-lfs -y
!git lfs install

/bin/sh: 1: sudo: not found
/bin/sh: 1: sudo: not found
git: 'lfs' is not a git command. See 'git --help'.

Did you mean this?
	log


Next, we are going to upload our trained model to huggingface.co. To upload a model you need to create an account on huggingface.co.

We can download our model from Amazon S3 and unzip it using the following snippet.

In [23]:
# Define the model folder
# S3_uri_model= huggingface_estimator.model_data
S3_uri_model= 's3://edumunozsala-ml-sagemaker/sentence-transformer-spanish/huggingface-pytorch-training-2022-08-22-12-56-59-852/output/'
# Define the final model name
mymodel_name='bertin-sts-cc-news-es'
# Dataset name
dataset_name='LeoCordoba/CC-NEWS-ES-titles'

In [22]:
import os
import tarfile
from sagemaker.s3 import S3Downloader

local_path = mymodel_name

os.makedirs(local_path, exist_ok = True)

# download model from S3
S3Downloader.download(
    s3_uri=S3_uri_model, # s3 uri where the trained model is located
    local_path=local_path, # local path where *.tar.gz will be saved
    sagemaker_session=sess # sagemaker session used for training the model
)

# extract the model archive, then delete it
with tarfile.open(f"{local_path}/model.tar.gz", "r:gz") as tar:
    tar.extractall(path=local_path)
os.remove(f"{local_path}/model.tar.gz")

Once the model is in our local filesystem, we can load it into a `SentenceTransformer` and test it on a couple of simple sentence pairs.

In [25]:
!pip install sentence-transformers



In [30]:
from sentence_transformers import SentenceTransformer, util

# Create a Sentence transformer model using the saved model we trained on Sagemaker
model = SentenceTransformer(local_path)

INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: bertin-sts-cc-news-es
INFO:sentence_transformers.SentenceTransformer:Use pytorch device: cpu


In [31]:
model

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
)
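The `Pooling` module printed above has `pooling_mode_mean_tokens` set to `True`, which means the sentence embedding is the average of the token embeddings. The following is a minimal pure-Python sketch of that idea under the usual assumption that padded positions are excluded through an attention mask. It uses toy two-dimensional vectors instead of the 768-dimensional ones produced by the model, and it is not the library's own implementation.

```python
def mean_pooling(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                totals[i] += value
    # max(count, 1) guards against division by zero for an all-padding input
    return [t / max(count, 1) for t in totals]


tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]  # last row stands for a padding token
mask = [1, 1, 0]
sentence_embedding = mean_pooling(tokens, mask)
print(sentence_embedding)  # [2.0, 3.0]
```

Only the first two token vectors contribute, so the padding row does not distort the result.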

In [32]:
sentences1 = ["Me gusta el lenguaje Python por que puedo hacer aplicaciones de IA", "El partido fue una historica decepción para los aficionados locales"]   
sentences2 = ["Conocer Python me permite desarrollar cuadros de mando analiticos", "La decepción no está en nuestro vocabulario"]

# encode list of sentences to get their embeddings
embedding1 = model.encode(sentences1, show_progress_bar=False )
embedding2 = model.encode(sentences2, show_progress_bar=False)

# compute similarity scores of two embeddings
cosine_scores = util.pytorch_cos_sim(embedding1, embedding2)

for i in range(len(sentences1)):
    for j in range(len(sentences2)):
        print("Sentence 1:", sentences1[i])
        print("Sentence 2:", sentences2[j])
        print("Similarity Score:", cosine_scores[i][j].item())
        print()

Sentence 1: Me gusta el lenguaje Python por que puedo hacer aplicaciones de IA
Sentence 2: Conocer Python me permite desarrollar cuadros de mando analiticos
Similarity Score: 0.6908174753189087

Sentence 1: Me gusta el lenguaje Python por que puedo hacer aplicaciones de IA
Sentence 2: La decepción no está en nuestro vocabulario
Similarity Score: 0.2870871424674988

Sentence 1: El partido fue una historica decepción para los aficionados locales
Sentence 2: Conocer Python me permite desarrollar cuadros de mando analiticos
Similarity Score: 0.13976731896400452

Sentence 1: El partido fue una historica decepción para los aficionados locales
Sentence 2: La decepción no está en nuestro vocabulario
Similarity Score: 0.37149864435195923



First we need to log in to the Hugging Face Hub:

In [37]:
!pip install ipywidgets


In [38]:
!huggingface-cli login


        To login, `huggingface_hub` now requires a token generated from https://huggingface.co/settings/token.
        (Deprecated, will be removed in v0.3.0) To login with username and password instead, interrupt with Ctrl+C.
        
Username: 

In [39]:
from huggingface_hub import notebook_login

notebook_login()


And finally we save the model to the hub, using `save_to_hub`


In [43]:
!curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.rpm.sh | sudo bash
!sudo apt-get install git-lfs
!git lfs install

/bin/sh: 1: sudo: not found
/bin/sh: 1: sudo: not found
git: 'lfs' is not a git command. See 'git --help'.

Did you mean this?
	log


In [44]:
!sudo apt-get update -y
!curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
!sudo apt-get install git-lfs git -y

/bin/sh: 1: sudo: not found
/bin/sh: 1: sudo: not found
/bin/sh: 1: sudo: not found


In [40]:
model.save_to_hub(
    mymodel_name, 
    #organization="embedding-data",
    train_datasets=[dataset_name],
    exist_ok=True, 
    )

INFO:sentence_transformers.SentenceTransformer:Create repository and clone it if it exists


OSError: Looks like you do not have git-lfs installed, please install. You can install from https://git-lfs.github.com/. Then run `git lfs install` (you only have to do this once).

In [9]:
from getpass import getpass

hf_username = "edumunozsala" # your username on huggingface.co
hf_email = "edumunozsala@gmail.com" # email used for commit
repository_name = mymodel_name
password = getpass("Enter your password:") # creates a prompt for entering password


Enter your password:········


In [12]:
repository_name,  local_path

('RuPERTa_base_sentiment_analysis_es', 'RuPERTa_base_sentiment_analysis_es')

After we have our unzipped model and model card located in `local_path`, we can either use the `huggingface_hub` SDK to create a repository and upload it to huggingface.co, or just go to https://huggingface.co/new, create a new repository and upload it.

In [14]:
from huggingface_hub import HfApi, Repository

# get hf token
token = HfApi().login(username=hf_username, password=password)

# create repository
repo_url = HfApi().create_repo(token=token, name=repository_name, exist_ok=True)

# create a Repository instance
model_repo = Repository(use_auth_token=token,
                        clone_from=repo_url,
                        local_dir=local_path,
                        git_user=hf_username,
                        git_email=hf_email)

# push model to the hub
model_repo.push_to_hub()

[Repository] local_dir is not empty, so let's try to pull the remote over a non-empty folder.


'https://huggingface.co/edumunozsala/RuPERTa_base_sentiment_analysis_es/commit/b10e6247a58877f88863fbdc703ff5a848e7978e'

After we uploaded our model we can access it at https://huggingface.co/{hf_username}/{repository_name}:

In [15]:
print(f"https://huggingface.co/{hf_username}/{repository_name}")

https://huggingface.co/edumunozsala/RuPERTa_base_sentiment_analysis_es


## Deploying the endpoint

To deploy our endpoint, we call `deploy()` on our HuggingFace estimator object, passing in our desired number of instances and instance type.

In [None]:
predictor = huggingface_estimator.deploy(1,"ml.g4dn.xlarge")

Then, we use the returned predictor object to call the endpoint.

In [None]:
sentiment_input= {"inputs":"I love using the new Inference DLC."}

predictor.predict(sentiment_input)

Finally, we delete the endpoint again.

In [12]:
predictor.delete_endpoint()

# Extras

### Estimator Parameters

In [3]:
# container image used for training job
print(f"container image used for training job: \n{huggingface_estimator.image_uri}\n")

# s3 uri where the trained model is located
print(f"s3 uri where the trained model is located: \n{huggingface_estimator.model_data}\n")

# latest training job name for this estimator
print(f"latest training job name for this estimator: \n{huggingface_estimator.latest_training_job.name}\n")



NameError: name 'huggingface_estimator' is not defined

In [None]:
# access the logs of the training job
huggingface_estimator.sagemaker_session.logs_for_job(huggingface_estimator.latest_training_job.name)

### Attach to old training job to an estimator 

In SageMaker you can attach an old training job to an estimator to continue training, get results, etc.

In [32]:
from sagemaker.estimator import Estimator

# job which is going to be attached to the estimator
old_training_job_name=''

In [34]:
# attach old training job
huggingface_estimator_loaded = Estimator.attach(old_training_job_name)

# get model output s3 from training job
huggingface_estimator_loaded.model_data


2021-01-15 19:31:50 Starting - Preparing the instances for training
2021-01-15 19:31:50 Downloading - Downloading input data
2021-01-15 19:31:50 Training - Training image download completed. Training in progress.
2021-01-15 19:31:50 Uploading - Uploading generated training model
2021-01-15 19:31:50 Completed - Training job completed


's3://philipps-sagemaker-bucket-eu-central-1/huggingface-sdk-extension-2021-01-15-19-14-13-725/output/model.tar.gz'