# Proposed solution

In this repo we propose to ***extend*** the existing Hugging Face Deep Learning Containers (DLCs): we pull a DLC image from the AWS-hosted ECR registry and build a simple Dockerfile on top of it that installs a newer version of `transformers` (4.24.0 in this notebook).

Note that in this notebook we only extend the Inference container, but the same also works for the [Training DLCs](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-training-containers).

## Writing Dockerfile
We write the Dockerfile. First we use the existing DLC as the base image (the available images are listed [here](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-inference-containers)), and then we add a `pip install` command to upgrade the `transformers` library.

In [1]:
%%writefile Dockerfile
FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04
RUN pip install --upgrade 'transformers==4.24.0'

Writing Dockerfile
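
If the base image or the target version changes often, the same two-line Dockerfile can be generated from variables instead of edited by hand. The following is a minimal shell sketch of that idea, not part of the original notebook: the function name `render_dockerfile` and the variable names are hypothetical, and the values are the ones used in the cell above.

```shell
#!/bin/sh
# Hypothetical helper: print a Dockerfile that extends a base DLC image
# and pins `transformers` to the requested version.
render_dockerfile() {
    base_image=$1
    transformers_version=$2
    printf 'FROM %s\n' "${base_image}"
    printf "RUN pip install --upgrade 'transformers==%s'\n" "${transformers_version}"
}

# Same base image as in the cell above.
base_image=763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04

# Print to stdout for inspection.
render_dockerfile "${base_image}" 4.24.0
```

To use the result, redirect the output to a file, for example `render_dockerfile "${base_image}" 4.24.0 > Dockerfile`.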


We change into the directory that contains the Dockerfile:

In [3]:
%cd ~/SageMaker/sm-extend-container

/home/ec2-user/SageMaker/sm-extend-container


This is an adaptation of the official [tutorial](https://docs.aws.amazon.com/sagemaker/latest/dg/prebuilt-containers-extend.html) on extending pre-built containers. The script below builds an image under a name of our choosing and pushes it to ECR in our own AWS account.

**Make sure that the role you're using to run this script has the IAM privileges required to write to ECR.** To learn more about IAM for ECR, see https://docs.aws.amazon.com/AmazonECR/latest/userguide/security-iam.html. This notebook was tested with the AdministratorAccess policy attached to the SageMaker execution role.
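
For readers who prefer something narrower than AdministratorAccess, the sketch below prints an IAM policy document covering the ECR calls the script makes (login, describe/create repository, push, and pulling the base DLC image). This is an assumption about a reasonable minimum, not a policy taken from or tested with the original notebook. The helper name `print_ecr_policy` and the account ID `111122223333` are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print an IAM policy for the build-and-push script.
# The action list is an assumption and has not been verified against
# the notebook; review it before attaching it to a role.
print_ecr_policy() {
    account=$1
    region=$2
    repo=$3
    cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ecr:GetAuthorizationToken",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ecr:DescribeRepositories",
        "ecr:CreateRepository",
        "ecr:BatchCheckLayerAvailability",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload",
        "ecr:PutImage"
      ],
      "Resource": "arn:aws:ecr:${region}:${account}:repository/${repo}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage"
      ],
      "Resource": "arn:aws:ecr:us-east-1:763104351884:repository/*"
    }
  ]
}
EOF
}

# Placeholder account ID and region; substitute your own values.
print_ecr_policy 111122223333 us-east-1 huggingface-pytorch-inference-extended
```

The third statement targets the registry that hosts the base DLC image referenced in the Dockerfile, since building the image requires pulling from it.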

In [None]:
%%sh

# Specify a name and a tag
algorithm_name=huggingface-pytorch-inference-extended
tag=1.10.2-transformers4.24.0-gpu-py38-cu113-ubuntu20.04

account=$(aws sts get-caller-identity --query Account --output text)


# Get the region defined in the current configuration (default to us-west-2 if none defined)
region=$(aws configure get region)
region=${region:-us-west-2}

fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:${tag}"

# If the repository doesn't exist in ECR, create it.
if ! aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

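# Log Docker in to our own registry (the push target)
aws ecr get-login-password --region "${region}" | docker login --username AWS --password-stdin "${account}.dkr.ecr.${region}.amazonaws.com"

# Log Docker in to the registry hosting the base DLC (needed to pull the FROM image)
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 763104351884.dkr.ecr.us-east-1.amazonaws.com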

# Build the docker image locally with the image name and then push it to ECR
# with the full name.

docker build -t "${algorithm_name}" .
docker tag "${algorithm_name}" "${fullname}"

docker push "${fullname}"
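
The tag chosen in the script repeats the `transformers` version pinned in the Dockerfile, so the two can drift apart when one is updated. As a hedged sketch (the helper names `extended_tag` and `ecr_image_uri` are hypothetical, and `111122223333` is a placeholder account ID), the tag can be derived from the base image tag, and the full image URI composed in the same `account.dkr.ecr.region.amazonaws.com/name:tag` form the script uses:

```shell
#!/bin/sh
# Hypothetical helper: replace the "transformersX.Y.Z" segment of a DLC tag
# with the given version. Arguments: base tag, new transformers version.
extended_tag() {
    printf '%s\n' "$1" | sed -E "s/transformers[0-9]+(\.[0-9]+)*/transformers$2/"
}

# Hypothetical helper: compose the full ECR image URI.
# Arguments: account ID, region, repository name, tag.
ecr_image_uri() {
    printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$1" "$2" "$3" "$4"
}

# Tag of the base image used in the Dockerfile.
base_tag=1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04
tag=$(extended_tag "${base_tag}" 4.24.0)
echo "${tag}"

# Placeholder account ID and region; substitute your own values.
ecr_image_uri 111122223333 us-east-1 huggingface-pytorch-inference-extended "${tag}"
```

The resulting URI is the value to pass as the image when deploying the extended container.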