## Setup Hosting Container

For production workloads, we recommend building a custom MME (multi-model endpoint) container with the Stable Diffusion base model and custom packages pre-installed. This has two advantages over the alternative approach, which extends the container on the fly with a setup model:

1) Multi-instance scalability: as of today, there is no control over the placement of the setup model behind MME endpoints. Therefore, when you scale to multiple instances, it is not possible to guarantee that the base Stable Diffusion model and conda environment are preloaded on each instance. A custom container preloads all shared components and ensures they are available on every instance.

2) Improved cold start: without a custom container, every model installs the conda environment the first time it is invoked, which adds unnecessary overhead. With a custom container, the packages are installed directly into the container image. This shaves 10-20 s off a model's cold start and removes the redundancy of installing the same conda packages for each model.

This notebook is tested on a `ml.g4dn.2xlarge` SageMaker notebook instance using a `conda_pytorch_p310` kernel. **DO NOT use SageMaker Studio**

In [48]:
!pip install -Uq nvidia-pyindex 
!pip install -Uq tritonclient[http]
!pip install -Uq sagemaker ipywidgets pillow numpy transformers accelerate diffusers

In [21]:
import boto3
import sagemaker
from sagemaker import get_execution_role

import tritonclient.http as httpclient
from tritonclient.utils import *
import time
from PIL import Image
import numpy as np

# variables
s3_client = boto3.client("s3")

# sagemaker variables
role = get_execution_role()
sm_client = boto3.client(service_name="sagemaker")
runtime_sm_client = boto3.client("sagemaker-runtime")
sagemaker_session = sagemaker.Session(boto_session=boto3.Session())
region = sagemaker_session.boto_region_name
account = sagemaker_session.account_id()
bucket = sagemaker_session.default_bucket()

prefix = "stable-diffusion-dreambooth"

### Import and Save Stable Diffusion Model

In [5]:
import diffusers
import torch 

pipeline = diffusers.StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1",
                                                             cache_dir='hf_cache',
                                                             torch_dtype=torch.float16,
                                                             revision="fp16")


vae/diffusion_pytorch_model.safetensors not found



In [6]:
sd_dir = 'stable_diff'
pipeline.save_pretrained(sd_dir)
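
Before packaging the saved pipeline, it can help to confirm that the directory looks complete. The following is a minimal sketch, not part of the original notebook: the helper name `missing_pipeline_entries` and the list of expected entries are assumptions based on what a diffusers Stable Diffusion pipeline directory usually contains, so adjust the list if your pipeline differs.

```python
from pathlib import Path

# Assumed layout: entries a diffusers Stable Diffusion pipeline directory
# usually contains. This list is illustrative, not authoritative.
EXPECTED_ENTRIES = [
    "model_index.json",
    "unet",
    "vae",
    "text_encoder",
    "tokenizer",
    "scheduler",
]


def missing_pipeline_entries(model_dir, expected=EXPECTED_ENTRIES):
    """Return the expected entries that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).exists()]
```

Calling `missing_pipeline_entries(sd_dir)` after `save_pretrained` should return an empty list if all of the assumed components were written.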

In [7]:
import os
import tarfile

sd_tar = f"docker/{sd_dir}.tar.gz"

def make_tarfile(output_filename, source_dir):
    with tarfile.open(output_filename, "w:gz") as tar:
        tar.add(source_dir, arcname=os.path.basename(source_dir))

make_tarfile(sd_tar, sd_dir)
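
Because `make_tarfile` passes `arcname=os.path.basename(source_dir)`, the archive should contain a single top-level folder named after the model directory. Since the Dockerfile extracts the archive into `/home/`, the model would then land in `/home/stable_diff`. The helper below is a small sketch for checking that layout; the name `top_level_entries` is hypothetical and not part of the original notebook.

```python
import tarfile


def top_level_entries(tar_path):
    """Return the sorted top-level names stored in a tar archive."""
    with tarfile.open(tar_path, "r:*") as tar:
        # Member names look like "stable_diff/unet/config.json";
        # keep only the first path component of each.
        return sorted({member.name.split("/")[0] for member in tar.getmembers()})
```

For the archive built above, `top_level_entries(sd_tar)` is expected to return a list with the single entry `stable_diff`.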

### Extend SageMaker Managed Triton Container

In [71]:
# account mapping for SageMaker Triton Image
account_id_map = {
    "us-east-1": "785573368785",
    "us-east-2": "007439368137",
    "us-west-1": "710691900526",
    "us-west-2": "301217895009",
    "eu-west-1": "802834080501",
    "eu-west-2": "205493899709",
    "eu-west-3": "254080097072",
    "eu-north-1": "601324751636",
    "eu-south-1": "966458181534",
    "eu-central-1": "746233611703",
    "ap-east-1": "110948597952",
    "ap-south-1": "763008648453",
    "ap-northeast-1": "941853720454",
    "ap-northeast-2": "151534178276",
    "ap-southeast-1": "324986816169",
    "ap-southeast-2": "355873309152",
    "cn-northwest-1": "474822919863",
    "cn-north-1": "472730292857",
    "sa-east-1": "756306329178",
    "ca-central-1": "464438896020",
    "me-south-1": "836785723513",
    "af-south-1": "774647643957",
}



region = boto3.Session().region_name
if region not in account_id_map:
    # Raising a plain string is a TypeError in Python 3; raise a real exception instead
    raise ValueError(f"Unsupported region: {region}")

base = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
mme_triton_image_uri = (
    "{account_id}.dkr.ecr.{region}.{base}/sagemaker-tritonserver:22.12-py3".format(
        account_id=account_id_map[region], region=region, base=base
    )
)
triton_account_id = account_id_map[region]
mme_triton_image_uri

'301217895009.dkr.ecr.us-west-2.amazonaws.com/sagemaker-tritonserver:22.12-py3'
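
The URI logic above can also be wrapped in a small function, which makes it easier to reuse and to test for unsupported regions. This is a sketch under assumptions: the function name `triton_image_uri` and the two-entry `sample_ids` dictionary are illustrative only, with the account IDs copied from the mapping above.

```python
def triton_image_uri(region, account_ids, tag="22.12-py3"):
    """Build the SageMaker Triton image URI for a region, or raise if unsupported."""
    if region not in account_ids:
        raise ValueError(f"Unsupported region: {region}")
    # China regions use a different DNS suffix
    domain = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
    return f"{account_ids[region]}.dkr.ecr.{region}.{domain}/sagemaker-tritonserver:{tag}"


# Small excerpt of the account mapping, for illustration only
sample_ids = {"us-west-2": "301217895009", "cn-north-1": "472730292857"}
print(triton_image_uri("us-west-2", sample_ids))
```

For `us-west-2` this prints the same URI as the cell output above.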

Preview the Dockerfile

In [72]:
!cat docker/Dockerfile

ARG BASE_IMAGE

FROM $BASE_IMAGE

#Install any additional libraries
RUN echo "Adding stable diffusion base model to Docker image"
RUN mkdir -p /home/
COPY stable_diff.tar.gz /tmp/
# Install tar
RUN apt-get update && apt-get install -y tar

# Untar the file
RUN tar -xzf /tmp/stable_diff.tar.gz -C /home/

RUN rm /tmp/stable_diff.tar.gz

RUN echo "Install required packages"
COPY requirements.txt /tmp/ 

RUN pip install -r /tmp/requirements.txt


Build the new container image

In [73]:
# Change this var to change the name of new container image
new_image_name = f"sagemaker-tritonserver-{prefix}-prod"

In [74]:
%%capture build_output
!cd docker && bash build_and_push.sh "$new_image_name" "latest" "$mme_triton_image_uri" "$region" "$account" "$triton_account_id"
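
The contents of `build_and_push.sh` are not shown in this notebook. As a rough, hypothetical sketch of what a script with these six arguments commonly does, the function below assembles the shell commands without running them: log in to both ECR registries, make sure the target repository exists, build with the `BASE_IMAGE` build argument that the Dockerfile declares, and push. The function name, the order of steps, and the exact commands are assumptions, not the actual script.

```python
def sketch_build_commands(image_name, tag, base_image, region, account, base_account):
    """Return (commands, target_uri) for a hypothetical build-and-push flow."""
    domain = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
    own_registry = f"{account}.dkr.ecr.{region}.{domain}"
    base_registry = f"{base_account}.dkr.ecr.{region}.{domain}"
    target_uri = f"{own_registry}/{image_name}:{tag}"

    login = (
        "aws ecr get-login-password --region {r} "
        "| docker login --username AWS --password-stdin {reg}"
    )
    commands = [
        # Log in to the registry hosting the base Triton image (needed to pull it)
        login.format(r=region, reg=base_registry),
        # Log in to our own registry (needed to push)
        login.format(r=region, reg=own_registry),
        # Create the target repository only if it does not exist yet
        f"aws ecr describe-repositories --region {region} --repository-names {image_name} "
        f"|| aws ecr create-repository --region {region} --repository-name {image_name}",
        # The Dockerfile declares ARG BASE_IMAGE, so pass it at build time
        f"docker build -t {target_uri} --build-arg BASE_IMAGE={base_image} .",
        f"docker push {target_uri}",
    ]
    return commands, target_uri
```

Printing the returned commands is a cheap way to review what a build would do before running anything against ECR.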

In [75]:
print(build_output)
if 'Error response from daemon' in str(build_output):    
    raise SystemExit('\n\n!!There was an error with the container build!!')
else:
    extended_triton_image_uri = str(build_output).strip().split('\n')[-1]

https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
Sending build context to Docker daemon  508.6MB
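
The check above only looks for one specific error string, so a build that fails in another way could leave an arbitrary log line in `extended_triton_image_uri`. A stricter variant, sketched below, validates that the last non-empty line actually looks like an ECR image URI. It assumes, as the cell above does, that the build script prints the image URI as its final line; the helper name and the regular expression are illustrative assumptions.

```python
import re

# Assumed shape of an ECR image URI:
# <12-digit account>.dkr.ecr.<region>.amazonaws.com[.cn]/<repository>:<tag>
ECR_URI_PATTERN = re.compile(
    r"^\d{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com(\.cn)?/[\w./-]+:[\w.-]+$"
)


def image_uri_from_build_output(output_text):
    """Return the last non-empty output line if it looks like an ECR image URI."""
    lines = [line.strip() for line in output_text.splitlines() if line.strip()]
    if not lines or not ECR_URI_PATTERN.match(lines[-1]):
        raise ValueError("Build output does not end with an ECR image URI")
    return lines[-1]
```

In the notebook this could be called as `image_uri_from_build_output(str(build_output))`.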

Store the new container image URI from ECR

In [76]:
%store extended_triton_image_uri
extended_triton_image_uri

Stored 'extended_triton_image_uri' (str)


'376678947624.dkr.ecr.us-west-2.amazonaws.com/sagemaker-tritonserver-stable-diffusion-dreambooth-prod:latest'