## Project Description

Imagine you are developing a deep learning system for sentiment analysis of product reviews for a newly established online beauty product retailer. The goal is to help the company make informed inventory decisions: which products to keep and which to remove from stock. Keen to improve customer satisfaction, the company has been monitoring comments on its website and has paid annotators to label their sentiment. You are handed a dataset of 80,000 customer reviews, each labeled 0 for negative sentiment or 1 for positive sentiment. After extensive effort and refinement, you train and deploy a classifier that predicts sentiment from online comments, and you excitedly report 86% accuracy on a held-out test set to your bosses. To your disappointment, management is not satisfied and insists on at least 90% accuracy before rolling out the model more widely.
You suspect that some annotators made labeling errors that are holding back your model. With newfound confidence, you turn to confident learning to find and fix the mislabeled examples in the dataset before retraining.
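
To make the idea of flagging suspicious labels concrete, here is a minimal pure-Python sketch of the thresholding step used in confident learning. It is a simplification rather than the method used in this project's `main.py`, which is not shown here. The function name `find_label_issues` and the toy probabilities are illustrative assumptions, and in practice the probabilities should be out-of-sample predictions (for example from cross-validation).

```python
from collections import defaultdict


def find_label_issues(labels, pred_probs):
    """Return indices of examples whose given label looks wrong.

    labels: list of int class ids (0 = negative, 1 = positive).
    pred_probs: one list of per-class probabilities per example.
    """
    # Per-class threshold: the average probability the model assigns to
    # class k over the examples that annotators labeled as k.
    sums, counts = defaultdict(float), defaultdict(int)
    for given, probs in zip(labels, pred_probs):
        sums[given] += probs[given]
        counts[given] += 1
    thresholds = {k: sums[k] / counts[k] for k in counts}

    issues = []
    for index, (given, probs) in enumerate(zip(labels, pred_probs)):
        # Classes the model is confident about for this example.
        confident = [k for k in thresholds if probs[k] >= thresholds[k]]
        if not confident:
            continue  # the model is unsure, so we do not flag the example
        likely = max(confident, key=lambda k: probs[k])
        if likely != given:
            issues.append(index)
    return issues


# Toy data (hypothetical): the third review is labeled positive,
# but the model is confident that it is negative.
labels = [1, 1, 1, 0, 0, 0]
pred_probs = [
    [0.1, 0.9],
    [0.2, 0.8],
    [0.9, 0.1],
    [0.8, 0.2],
    [0.9, 0.1],
    [0.7, 0.3],
]
print(find_label_issues(labels, pred_probs))
```

Flagged examples can then be reviewed, relabeled, or dropped before retraining. Libraries such as cleanlab implement the full confident learning procedure.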

First, we prepare the environment for AWS SageMaker operations by setting up clients and retrieving essential configuration details like the default S3 bucket, execution role, and AWS region. 

In [1]:
import logging
import boto3
import sagemaker
import pandas as pd
import json
import botocore
from botocore.exceptions import ClientError

config = botocore.config.Config(user_agent_extra='dlai-pds/c2/w3')

# low-level service client of the boto3 session
sm = boto3.client(service_name='sagemaker', 
                  config=config)

sm_runtime = boto3.client('sagemaker-runtime',
                          config=config)

sess = sagemaker.Session(sagemaker_client=sm,
                         sagemaker_runtime_client=sm_runtime)

bucket = sess.default_bucket()
role = sagemaker.get_execution_role()
region = sess.boto_region_name

print(bucket)

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/ec2-user/.config/sagemaker/config.yaml
sagemaker-us-east-1-729090654745


We then configure the data source for a training job in SageMaker, defining where the training data is located (in this case, an S3 bucket) and the nature of the data.

In [2]:
from sagemaker.inputs import TrainingInput

# TODO: set the path to the train data
train_data = TrainingInput(
    "s3://{}/data/".format(bucket),
    content_type='application/x-sagemaker-training-data'
)


Next, we create a PyTorch estimator that configures the SageMaker training job. The job runs the provided entry point script on the specified instance type and writes the trained model to the specified S3 path. The entry point script `main.py` contains the main steps that need to be completed in this project.

In [3]:
from sagemaker.pytorch import PyTorch

# TODO: create the estimator
estimator = PyTorch(
    entry_point="main.py",
    source_dir="./",
    base_job_name="sagemaker-script-mode",
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    framework_version="2.1",
    py_version="py310",
    dependencies=["requirements.txt"],
    # S3 location where SageMaker uploads model.tar.gz and output.tar.gz
    output_path="s3://{}/output".format(bucket),
    environment={'PYTHONPATH': 'src'}
)

The following cell describes a ModelCheckpoint callback that saves the best model (the one with the lowest development loss) during the SageMaker training job. The checkpoint is written to `/opt/ml/model/` inside the training container; when the job finishes, SageMaker packages that directory into model.tar.gz and uploads it to the S3 output path.

In [4]:
# Save the best model during training by specifying the checkpoint directory
# (Note: /opt/ml/model/ is a directory inside the training container; SageMaker uploads its contents to the S3 output path)
model_checkpoint = {
    'ModelCheckpoint': {
        'monitor': 'dev_loss',
        'dirpath': '/opt/ml/model/',
        'filename': 'best_model',
        'save_top_k': 1,
        'mode': 'min'
    }
}

# Pass the ModelCheckpoint settings to the training job as a hyperparameter
# (note: `_hyperparameters` is a private attribute of the estimator)
estimator._hyperparameters['callbacks'] = [model_checkpoint]
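
As an illustration of how an entry point could consume these settings, the sketch below reads them from the `SM_HPS` environment variable, which the SageMaker training toolkit populates with the job's hyperparameters as JSON. Whether `main.py` actually works this way is not shown here, so treat this as an assumption. The helper names `load_checkpoint_config` and `is_improvement`, the simulated environment, and the loss values are hypothetical.

```python
import json


def load_checkpoint_config(environ):
    """Return the ModelCheckpoint settings found in SM_HPS, or None."""
    hyperparameters = json.loads(environ.get("SM_HPS", "{}"))
    for callback in hyperparameters.get("callbacks", []):
        if "ModelCheckpoint" in callback:
            return callback["ModelCheckpoint"]
    return None


def is_improvement(current, best, mode="min"):
    """True when `current` beats `best` under the checkpoint's mode."""
    if best is None:
        return True
    return current < best if mode == "min" else current > best


# Simulated container environment (assumption: values arrive JSON-decoded).
fake_env = {"SM_HPS": json.dumps({"callbacks": [{"ModelCheckpoint": {
    "monitor": "dev_loss",
    "dirpath": "/opt/ml/model/",
    "filename": "best_model",
    "save_top_k": 1,
    "mode": "min",
}}]})}
ckpt = load_checkpoint_config(fake_env)

# With save_top_k=1 and mode="min", only epochs that lower dev_loss
# would overwrite the saved checkpoint.
best_loss = None
saved_epochs = []
for epoch, dev_loss in enumerate([0.62, 0.48, 0.51, 0.45]):  # made-up losses
    if is_improvement(dev_loss, best_loss, ckpt["mode"]):
        best_loss = dev_loss
        saved_epochs.append(epoch)  # a real script would write the file here
print(ckpt["dirpath"], saved_epochs)
```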


Starting the training process: 

In [5]:
# TODO: train the model
estimator.fit({"train": train_data})


INFO:sagemaker.image_uris:image_uri is not presented, retrieving image_uri based on instance_type, framework etc.
INFO:sagemaker:Creating training-job with name: sagemaker-script-mode-2024-05-02-18-19-20-792


2024-05-02 18:19:30 Starting - Starting the training job...
2024-05-02 18:19:30 Pending - Training job waiting for capacity............
2024-05-02 18:21:58 Pending - Preparing the instances for training...
2024-05-02 18:22:29 Downloading - Downloading input data......
2024-05-02 18:23:05 Downloading - Downloading the training image...............
2024-05-02 18:25:46 Training - Training image download completed. Training in progress...
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2024-05-02 18:26:13,531 sagemaker-training-toolkit INFO     Imported framework sagemaker_pytorch_container.training
2024-05-02 18:26:13,547 sagemaker-training-toolkit INFO     No Neurons detected (normal if no neurons installed)
2024-05-02 18:26:13,559 sagemaker_pytorch_container.training INFO     Block until all host DNS lookups succeed.
2024-05-02 18:26:13,561 sagemaker_pytorch_container.trainin

## Model Deployment

We need to copy the training artifacts, i.e., output.tar.gz and model.tar.gz, from the corresponding S3 bucket to the current working directory.

In [8]:
#TODO: copy the training artifacts from the S3 bucket to the current working directory
!aws s3 cp s3://sagemaker-us-east-1-729090654745/output/sagemaker-script-mode-2024-05-02-18-19-20-792/output/output.tar.gz ./
!aws s3 cp s3://sagemaker-us-east-1-729090654745/output/sagemaker-script-mode-2024-05-02-18-19-20-792/output/model.tar.gz ./


download: s3://sagemaker-us-east-1-729090654745/output/sagemaker-script-mode-2024-05-02-18-19-20-792/output/output.tar.gz to ./output.tar.gz
download: s3://sagemaker-us-east-1-729090654745/output/sagemaker-script-mode-2024-05-02-18-19-20-792/output/model.tar.gz to ./model.tar.gz
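
The S3 paths above follow the pattern `s3://<bucket>/output/<training job name>/output/<archive>`, so they can be built rather than typed by hand. The helper below is a small sketch of that; `artifact_uris` is a hypothetical name, and the pattern is inferred from the paths in this notebook rather than taken from SageMaker documentation.

```python
def artifact_uris(bucket, job_name, prefix="output"):
    """Build the S3 URIs of the two archives produced by a training job."""
    base = f"s3://{bucket}/{prefix}/{job_name}/output"
    return {name: f"{base}/{name}" for name in ("model.tar.gz", "output.tar.gz")}


uris = artifact_uris(
    "sagemaker-us-east-1-729090654745",
    "sagemaker-script-mode-2024-05-02-18-19-20-792",
)
for name, uri in uris.items():
    print(name, "->", uri)
```

In the notebook, `bucket` and `estimator.latest_training_job.name` could supply the two arguments, and `estimator.model_data` should already hold the model.tar.gz URI.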


We can decompress the training artifacts into `extracted_training_artifacts` for further exploration.

In [9]:
# tar -C requires the target directory to exist
!mkdir -p extracted_training_artifacts
!tar -xzf output.tar.gz -C extracted_training_artifacts


tar: Ignoring unknown extended header keyword `LIBARCHIVE.creationtime'
tar: Ignoring unknown extended header keyword `LIBARCHIVE.creationtime'
tar: Ignoring unknown extended header keyword `LIBARCHIVE.creationtime'
tar: Ignoring unknown extended header keyword `LIBARCHIVE.creationtime'


In [10]:
!tar -xzf model.tar.gz -C extracted_training_artifacts


tar: Ignoring unknown extended header keyword `LIBARCHIVE.creationtime'
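
The same extraction can be done from Python with the standard `tarfile` module, which also makes it easy to list what each archive contains. This is a sketch of an alternative to the shell commands above; `extract_archive` is a hypothetical helper name.

```python
import os
import tarfile


def extract_archive(archive_path, dest_dir="extracted_training_artifacts"):
    """Extract a .tar.gz archive into dest_dir and return its member names."""
    os.makedirs(dest_dir, exist_ok=True)  # create the target directory if missing
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest_dir)
        return sorted(archive.getnames())


# Only runs when the archives have already been downloaded.
for archive_name in ("output.tar.gz", "model.tar.gz"):
    if os.path.exists(archive_name):
        print(archive_name, extract_archive(archive_name))
```

Only extract archives from sources you trust, since `extractall` writes whatever paths the archive declares.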


We then create an endpoint 'sentiment-analysis-endpoint-2' and deploy the model to that endpoint.

In [14]:
# TODO: deploy the trained model
predictor = estimator.deploy(initial_instance_count = 1,
                             instance_type = 'ml.m4.xlarge',
                             endpoint_name = 'sentiment-analysis-endpoint-2')

INFO:sagemaker:Repacking model artifact (s3://sagemaker-us-east-1-729090654745/output/sagemaker-script-mode-2024-05-02-18-19-20-792/output/model.tar.gz), script artifact (s3://sagemaker-us-east-1-729090654745/sagemaker-script-mode-2024-05-02-18-19-20-792/source/sourcedir.tar.gz), and dependencies (['requirements.txt']) into single tar.gz file located at s3://sagemaker-us-east-1-729090654745/sagemaker-script-mode-2024-05-02-18-39-53-136/model.tar.gz. This may take some time depending on model size...
INFO:sagemaker:Creating model with name: sagemaker-script-mode-2024-05-02-18-39-53-136
INFO:sagemaker:Creating endpoint-config with name sentiment-analysis-endpoint-2
INFO:sagemaker:Creating endpoint with name sentiment-analysis-endpoint-2


-------!
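
Once the endpoint is in service, it can be queried through the SageMaker runtime client's `invoke_endpoint` call. The sketch below shows one way to wrap that call. The JSON request and response shapes (`{"text": ...}` and `{"label": ...}`) are assumptions, because they depend on the inference code packaged with the model, which is not shown here. A stub client stands in for the real one so the example runs offline.

```python
import io
import json


def classify_review(runtime_client, endpoint_name, review_text):
    """Send one review to the endpoint and return the decoded JSON response."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"text": review_text}),  # payload schema is an assumption
    )
    return json.loads(response["Body"].read())


class StubRuntime:
    """Offline stand-in for boto3.client('sagemaker-runtime')."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        self.last_request = {
            "EndpointName": EndpointName,
            "ContentType": ContentType,
            "Body": Body,
        }
        # Hypothetical response: label 1 means positive sentiment.
        return {"Body": io.BytesIO(json.dumps({"label": 1}).encode("utf-8"))}


stub = StubRuntime()
result = classify_review(stub, "sentiment-analysis-endpoint-2", "Love this serum!")
print(result)
```

In the notebook, the `sm_runtime` client created in the first cell could be passed in place of the stub. When the endpoint is no longer needed, `predictor.delete_endpoint()` removes it so that it stops incurring charges.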