## Configuring SageMaker Local Mode

Let’s train a simple model in local mode and then deploy an inference endpoint locally.

We start by installing all required dependencies for local mode:

In [None]:
! pip install 'sagemaker[local]' --upgrade

We then configure the SageMaker local runtime. Note that we use the `LocalSession` class to tell the SageMaker SDK that we want to provision resources locally.

In [None]:
import boto3
from sagemaker.local import LocalSession

sagemaker_local_session = LocalSession()
sagemaker_local_session.config = {'local': {'local_code': True}}
account = boto3.client('sts').get_caller_identity().get('Account')
role = f"arn:aws:iam::{account}:role/service-role/AmazonSageMaker-ExecutionRole-<YOUR_ROLE_ID>" 

In this notebook we intend to use the public PyTorch image from the SageMaker ECR repository. For this, we need to store credentials so that the Docker daemon can pull images.

In [None]:
# Log in to the SageMaker ECR registry that hosts the Deep Learning Containers, so SageMaker can pull images in local mode
!aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 763104351884.dkr.ecr.us-east-1.amazonaws.com
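The login command above hard-codes `us-east-1`. As a minimal sketch, the same command can be composed for another region. This assumes the Deep Learning Containers account ID from the command above also applies in your region, which holds for many standard regions but not all, so check the AWS registry list first. `ecr_registry` and `ecr_login_command` are hypothetical helper names, not part of any SDK.

```python
# Account ID copied from the login command above; some regions use a different one.
DLC_ACCOUNT_ID = "763104351884"


def ecr_registry(region: str, account_id: str = DLC_ACCOUNT_ID) -> str:
    """Build the ECR registry hostname for a standard (non-China) AWS region."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def ecr_login_command(region: str, account_id: str = DLC_ACCOUNT_ID) -> str:
    """Return the shell command that logs Docker in to the registry."""
    registry = ecr_registry(region, account_id)
    return (
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {registry}"
    )


print(ecr_login_command("us-east-1"))
```

The printed string can be pasted into a terminal or run from a notebook cell with a leading `!`.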

Now we need to decide whether to use a GPU (if available) or the CPU (the default choice). The following code snippet checks whether `nvidia-smi` runs successfully, which indicates that a CUDA-compatible device is available (the “local_gpu” value); if not, it defaults to the CPU (the “local” value).

In [None]:
import os
import subprocess

instance_type = "local"

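try:
    # nvidia-smi exits with 0 when an NVIDIA driver and GPU are present
    if subprocess.call("nvidia-smi") == 0:
        # Set type to GPU if one is present
        instance_type = "local_gpu"
except OSError:
    # nvidia-smi is not installed, so stay on the CPU
    pass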

print("Instance type = " + instance_type)
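The same check can be wrapped in a small function, which keeps the `nvidia-smi` output out of the cell and makes the logic easy to test. This is a sketch only: `detect_instance_type` and its `which`/`run` parameters are hypothetical names introduced here so that the lookup and the subprocess call can be swapped out in tests.

```python
import shutil
import subprocess


def detect_instance_type(which=shutil.which, run=subprocess.run) -> str:
    """Return "local_gpu" if nvidia-smi exists and exits with 0, else "local"."""
    # Skip the subprocess call entirely when the binary is not on PATH.
    if which("nvidia-smi") is None:
        return "local"
    try:
        result = run(
            ["nvidia-smi"],
            stdout=subprocess.DEVNULL,  # discard the GPU status table
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # The binary exists but could not be executed.
        return "local"
    return "local_gpu" if result.returncode == 0 else "local"


instance_type = detect_instance_type()
print(f"Instance type = {instance_type}")
```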

Once we have chosen the local device, we configure and run the SageMaker training job. The SageMaker Python SDK performs the following operations automatically:
- pulls the appropriate PyTorch image from the public ECR repository.
- generates a docker-compose YAML file with the appropriate volume mount points to access code and training data.
- starts the Docker container with the `train` command.
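To make the second step more concrete, here is an illustrative sketch of the kind of Compose definition involved. It is not the file the SDK actually writes: the service name, image tag, and host directory are hypothetical placeholders, and the real file contains more settings that vary by SDK version. Only the `/opt/ml/...` mount points follow the usual SageMaker container layout.

```python
import json


def sketch_compose(image: str, host_dir: str, command: str = "train") -> dict:
    """Build a simplified, illustrative Compose definition for one training container."""
    service = {
        "image": image,
        "command": command,  # the container is started with the train command
        "volumes": [
            # host path : container path
            f"{host_dir}/input:/opt/ml/input",    # training data and configuration
            f"{host_dir}/model:/opt/ml/model",    # where the script saves the model
            f"{host_dir}/output:/opt/ml/output",  # failure reports and other output
        ],
    }
    return {"services": {"algo-1": service}}  # "algo-1" is a placeholder name


compose = sketch_compose(
    image="763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:1.9.0-cpu-py38",
    host_dir="/tmp/sagemaker-local",  # hypothetical temporary directory
)
# JSON is a subset of YAML, so this output is also a valid YAML document.
print(json.dumps(compose, indent=2))
```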

SageMaker prints the output of the docker compose command and the STDOUT/STDERR of the training container to the Jupyter cell.

In [None]:
from sagemaker.pytorch import PyTorch
import os

# Configure a PyTorch Estimator (no training happens yet)
pytorch_estimator = PyTorch(
                        sagemaker_session=sagemaker_local_session,
                        entry_point=f'{os.getcwd()}/sources/cifar10.py',
                        role=role,
                        instance_type=instance_type,
                        instance_count=1,
                        base_job_name="test",
                        framework_version="1.9.0",
                        py_version="py38",
                        hyperparameters={
                            "epochs": 1,
                            "batch-size": 16
                            }
                        )

pytorch_estimator.fit()

After the training job has finished, let’s see how to deploy the trained model to a local real-time endpoint. Note that by default we train for only a single epoch, so don’t expect great results!
You can deploy the inference container locally just by calling the `deploy()` method on your estimator:


In [None]:
pytorch_estimator.deploy(initial_instance_count=1, instance_type=instance_type)
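When you plan to call the endpoint from a separate script or terminal, it can help to confirm first that the local container is answering. SageMaker inference containers expose a `/ping` health check next to `/invocations`. The sketch below polls it, assuming the container listens on `127.0.0.1:8080`, the same address used for the test request later in this notebook. `ping_ok` and `wait_for_endpoint` are hypothetical helper names, not part of the SageMaker SDK.

```python
import time
import urllib.error
import urllib.request


def ping_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET request to the health-check URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_for_endpoint(
    url: str = "http://127.0.0.1:8080/ping",  # assumed local endpoint address
    attempts: int = 30,
    delay: float = 1.0,
    probe=ping_ok,
    sleep=time.sleep,
) -> bool:
    """Poll the health check until it succeeds or the attempts run out."""
    for _ in range(attempts):
        if probe(url):
            return True
        sleep(delay)
    return False


# Example usage (blocks for up to attempts * delay seconds):
# ready = wait_for_endpoint()
```

The `probe` and `sleep` parameters exist only so the polling loop can be exercised without a running container.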

Once the endpoint is deployed, the SageMaker SDK starts sending the output of the model server to the Jupyter cell. You can also observe the container logs in the Docker client UI or via the terminal command `docker logs CONTAINER_ID`.

To test the endpoint, we first download the CIFAR10 dataset locally and then pick one of the test images.

In [None]:
# Download the CIFAR10 test set and normalize each image to the [-1, 1] range

import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
)

testset = torchvision.datasets.CIFAR10(
    root="./",
    train=False,
    download=True,
    transform=transform,
)



We can now send a test image and observe in the Docker logs how our inference script handles the request.

In [None]:
import requests
import json 


# Serialize the first test image (a 3x32x32 float tensor) to raw bytes
payload = testset[0][0].numpy().tobytes()
url = 'http://127.0.0.1:8080/invocations'
content_type = 'application/x-npy'
accept_type = "application/json"
headers = {'content-type': content_type, 'accept': accept_type}

response = requests.post(url, data=payload, headers=headers)
print(f"Model response: {json.loads(response.content)[0]}")
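The request above sends raw tensor bytes under the `application/x-npy` content type. Whether that works depends on how the inference script, which is not shown here, decodes the request body. If the script reads the body with `np.load`, it expects a real NPY file with a header, and the payload could be built as in the following sketch. A random array stands in for `testset[0][0].numpy()`, and `to_npy_bytes` and `from_npy_bytes` are hypothetical helper names.

```python
import io

import numpy as np


def to_npy_bytes(array: np.ndarray) -> bytes:
    """Serialize an array to the NPY file format (header plus data)."""
    buffer = io.BytesIO()
    np.save(buffer, array, allow_pickle=False)
    return buffer.getvalue()


def from_npy_bytes(payload: bytes) -> np.ndarray:
    """Decode NPY bytes back into an array, as a server using np.load would."""
    return np.load(io.BytesIO(payload), allow_pickle=False)


# Stand-in for one normalized CIFAR10 image: 3 channels, 32x32 pixels.
image = np.random.rand(3, 32, 32).astype(np.float32)

payload = to_npy_bytes(image)
restored = from_npy_bytes(payload)

print(f"raw bytes: {image.nbytes}, NPY bytes: {len(payload)}")
print(f"round trip preserved shape and dtype: {restored.shape}, {restored.dtype}")
```

Unlike `tobytes()`, the NPY format carries the shape and dtype with the data, so the receiving side does not need to know them in advance.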
