# Build and push a custom container to Amazon ECR using Amazon SageMaker AI Studio v2

In this notebook, we build a custom container and push it to Amazon ECR.

## Prepare the SGLang SageMaker container

SageMaker AI makes extensive use of Docker containers for build and runtime tasks. Using containers, you can train machine learning algorithms and deploy models quickly and reliably at any scale. See [this link](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html#your-algorithms-inference-code-run-image) to understand how SageMaker AI runs your inference image.

- For model inference, SageMaker AI runs the container as:
```
docker run image serve
```

- You can provide your entrypoint script in `exec` form to tell the container how to run the inference process, for example:
```
ENTRYPOINT ["python", "inference.py"]
```

- When deploying ML models, one option is to archive and compress the model artifacts into a `tar.gz` file and provide the S3 path of that archive as the `ModelDataUrl` in the [`CreateModel`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModel.html) API request. SageMaker AI copies the archive from the S3 location and decompresses it into the `/opt/ml/model` directory before your container starts, for use by your inference code. For large models, however, SageMaker AI lets you [deploy uncompressed models](https://docs.aws.amazon.com/sagemaker/latest/dg/large-model-inference-uncompressed.html). This example uses the uncompressed DeepSeek R1 Distilled Llama 70B model.

- To receive inference requests, the container must run a web server listening on port `8080` that accepts `POST` requests on `/invocations` and `GET` requests on `/ping`.
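
As a rough sketch of the uncompressed option described in the list above, the container definition passed to `CreateModel` can point at an S3 prefix instead of a `tar.gz` archive. The helper name `uncompressed_container` and the example values are hypothetical and not part of this notebook; the `ModelDataSource` field names follow the `CreateModel` API.

```python
def uncompressed_container(image_uri: str, s3_prefix: str) -> dict:
    """Build a CreateModel container definition for an uncompressed model.

    Hypothetical helper for illustration only.
    """
    if not s3_prefix.startswith("s3://"):
        raise ValueError("s3_prefix must be an s3:// URI")
    # A key name prefix used with S3DataType="S3Prefix" ends with a forward slash
    if not s3_prefix.endswith("/"):
        s3_prefix += "/"
    return {
        "Image": image_uri,
        "ModelDataSource": {
            "S3DataSource": {
                "S3Uri": s3_prefix,
                "S3DataType": "S3Prefix",
                # The string "None" tells SageMaker AI the artifacts are not compressed
                "CompressionType": "None",
            }
        },
    }
```

The returned dictionary would typically be passed as `PrimaryContainer` to `boto3.client("sagemaker").create_model(...)`, together with a model name and an execution role ARN of your own.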

If you already have a Docker image, see the instructions for [adapting your own inference container for SageMaker AI](https://docs.aws.amazon.com/sagemaker/latest/dg/adapt-inference-container.html). Note also that the containers provided by SageMaker AI automatically implement a web server that responds to `/invocations` and `/ping` (health check) requests. You can find more about the [prebuilt SageMaker AI Docker images for deep learning in the SageMaker documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/pre-built-containers-frameworks-deep-learning.html).
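
To make the endpoint contract concrete, here is a minimal sketch of a server that answers `GET /ping` and `POST /invocations`, written with the Python standard library only. It is not the server the SGLang container runs; the class name `InferenceHandler`, the helper `make_server`, and the echo "model" are placeholders for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class InferenceHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that shows only the SageMaker AI endpoint contract."""

    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Health check: respond with 200 when the container is ready to serve
        if self.path == "/ping":
            self._send_json(200, {"status": "ok"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        if self.path != "/invocations":
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length) or b"{}")
        # Placeholder "model": echo the request back to the caller
        self._send_json(200, {"echo": request})

    def log_message(self, format, *args):
        pass  # keep the example quiet; real servers should log requests


def make_server(port=8080):
    """Create (but do not start) a server bound to the given port."""
    return ThreadingHTTPServer(("0.0.0.0", port), InferenceHandler)
```

Calling `make_server().serve_forever()` would start listening on port `8080`. A real inference container replaces the echo with calls into the model server.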



## Step 1: Setup

Fetch and import dependencies.

Container preparation:

- Enable Docker access in your Studio domain.
- Install Docker in your Studio environment.
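
Docker access is a domain-level setting. As a hedged sketch, assuming you have permission to update the domain, the request for the SageMaker `UpdateDomain` API can be built as follows. The helper name `docker_access_update` and the domain ID shown in the usage note are placeholders, not values from this notebook.

```python
def docker_access_update(domain_id: str, enabled: bool = True) -> dict:
    """Build keyword arguments for the SageMaker UpdateDomain call.

    Hypothetical helper: it only assembles the request and does not call AWS.
    """
    return {
        "DomainId": domain_id,
        "DomainSettingsForUpdate": {
            "DockerSettings": {
                "EnableDockerAccess": "ENABLED" if enabled else "DISABLED"
            }
        },
    }
```

With `boto3` installed and credentials configured, the call would look like `boto3.client("sagemaker").update_domain(**docker_access_update("d-xxxxxxxxxxxx"))`, where the domain ID is your own.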

## Build and push

### Check if Docker is installed

In [None]:
# Check whether Docker is available in SageMaker Studio
import subprocess
import sys
import os

def check_docker():
    """Return True if the Docker CLI is installed and runnable, otherwise False."""
    try:
        result = subprocess.run(['docker', '--version'], capture_output=True, text=True)
    except FileNotFoundError:
        # The docker binary is not on PATH
        print("Docker is not installed")
        return False
    if result.returncode == 0:
        print(f"Docker already installed: {result.stdout.strip()}")
        return True
    # The binary exists but the version check failed
    print(f"Docker check failed: {result.stderr.strip()}")
    return False

# Check if Docker is installed
check_docker()

Docker already installed: Docker version unknown-version, build unknown-commit


True

**Run the next cell only if Docker is not installed.**

In [None]:
!install_docker_sagemaker.sh

### Build and push the container

Change the values of `TAG` (the tag applied to your ECR image) and `SRC_TAG` (the tag of the upstream `lmsysorg/sglang` base image) as needed.

In [None]:
%%writefile ./build.sh
#!/bin/bash

# Build and Push Custom SGLang Container to Amazon ECR

set -e

# Configuration
AWS_REGION=${AWS_DEFAULT_REGION:-us-east-1}
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
REPOSITORY_NAME="sglang"
TAG=v0.5.4
SRC_TAG=v0.5.4.post1
IMAGE_URI="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${REPOSITORY_NAME}:${TAG}"

echo "Repository: ${REPOSITORY_NAME}"
echo "Image URI: ${IMAGE_URI}"
echo "Region: ${AWS_REGION}"

# Step 1: Create ECR repository if it doesn't exist
echo "📦 Creating ECR repository if it doesn't exist..."
aws ecr describe-repositories --repository-names ${REPOSITORY_NAME} --region ${AWS_REGION} 2>/dev/null || \
aws ecr create-repository --repository-name ${REPOSITORY_NAME} --region ${AWS_REGION}

# Step 2: Get ECR login token
echo "🔐 Logging into Amazon ECR..."
aws ecr get-login-password --region ${AWS_REGION} | docker login --username AWS --password-stdin ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com

# Step 3: Build the Docker image
echo "🔨 Building Docker image for SGLang..."
docker build . --tag ${REPOSITORY_NAME}:${TAG} --file Dockerfile --build-arg BASE_IMAGE=lmsysorg/sglang:${SRC_TAG}

# Step 4: Tag the image for ECR
echo "🏷️   Tagging image for ECR..."
docker tag ${REPOSITORY_NAME}:${TAG} ${IMAGE_URI}

# Step 5: Push the image to ECR
echo "⬆️  Pushing image to ECR..."
docker push ${IMAGE_URI}

echo "✅ Successfully built and pushed custom SGLang container!"
echo "Image URI: ${IMAGE_URI}"
echo ""
echo "You can now use this image URI in your SageMaker deployment:"
echo "image_uri = \"${IMAGE_URI}\""

# Optional: Clean up local images to save space
# Prompt only when stdin is a terminal, so the script does not stall or abort in a notebook cell
REPLY=""
if [ -t 0 ]; then
    read -p "🗑️   Clean up local Docker images? (y/N): " -n 1 -r || true
    echo
fi
if [[ $REPLY =~ ^[Yy]$ ]]; then
    echo "🧹 Cleaning up local images..."
    docker rmi ${REPOSITORY_NAME}:${TAG} ${IMAGE_URI}
    echo "Local images cleaned up"
fi

echo "🎉 Build and push completed successfully!"

In [None]:
!bash ./build.sh
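
Later deployment cells need the same image URI that the script prints. A small sketch for rebuilding it in Python, mirroring the `IMAGE_URI` line in `build.sh`, is shown below. The function name `ecr_image_uri` is hypothetical, and the defaults simply repeat the `REPOSITORY_NAME` and `TAG` values used above.

```python
def ecr_image_uri(account_id: str, region: str,
                  repository: str = "sglang", tag: str = "v0.5.4") -> str:
    """Compose the ECR image URI in the same format as build.sh."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Example with a placeholder account ID; substitute your own values
image_uri = ecr_image_uri("123456789012", "us-east-1")
print(image_uri)
```

In practice the account ID can be looked up with `boto3.client("sts").get_caller_identity()["Account"]`, which is the Python counterpart of the `aws sts get-caller-identity` call in the script.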