# [Module 3.4.1] Build the Train Docker Image and Publish to ECR

This notebook builds a Docker image and publishes it to Amazon ECR (Elastic Container Registry). <br>
The steps are listed below. To learn more about BYOC (Bring Your Own Container), see the BYOC Reference section below.

- Copy the train script so that the Dockerfile can use it
- Review the Dockerfile definition
    - It pulls and uses the built-in TensorFlow 2.1.1-gpu Docker image
    - It contains the command that installs transformers
- Publish the built Docker image to ECR

---
This notebook takes about 2 minutes to run.

---

## BYOC Reference:
- Get Started: Build Your Custom Training Container with Amazon SageMaker
    - https://docs.aws.amazon.com/sagemaker/latest/dg/build-container-to-train-script-get-started.html
- **Recommended: Building your own algorithm container: BYOC containers (training, inference)**
    - https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb
    
    
- Built-in Container Image
    - https://github.com/aws/deep-learning-containers/blob/master/available_images.md

## Copy the Train Script into the Docker Folder

In [5]:
bert_train_file = "tf_script_bert_tweet.py"
! cp {bert_train_file} "train_container/"
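
The `cp` shell command above does not stop the notebook if the script or the target folder is missing. As a minimal sketch, the same copy could be done in Python with explicit checks. The helper name `copy_train_script` is hypothetical and is not part of the original notebook.

```python
import shutil
from pathlib import Path


def copy_train_script(script: str, container_dir: str = "train_container") -> Path:
    """Copy the training script into the Docker build context folder.

    Raises an error early if the script or the folder does not exist,
    instead of failing later during `docker build`.
    """
    src = Path(script)
    dst_dir = Path(container_dir)
    if not src.is_file():
        raise FileNotFoundError(f"training script not found: {src}")
    if not dst_dir.is_dir():
        raise NotADirectoryError(f"build context folder not found: {dst_dir}")
    # shutil.copy returns the path of the newly created file
    return Path(shutil.copy(src, dst_dir))
```

In this notebook it would be called as `copy_train_script("tf_script_bert_tweet.py")`, assuming the current directory holds both the script and the `train_container` folder.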

## Dockerfile in the train_container Folder
<font color="red">**You must edit the Dockerfile to match your Region.**</font><br>
The current Dockerfile is set to 'ap-northeast-2'.
If your Region is 'us-west-2', click the train_container folder, open the Dockerfile,
and change the FROM line as shown below. <br>

FROM 763104351884.dkr.ecr.us-west-2.amazonaws.com/tensorflow-training:2.1.1-gpu-py36-cu101-ubuntu18.04
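
Instead of editing the file by hand, the Region in the `FROM` line could be rewritten programmatically. The following is a minimal sketch, not part of the original notebook: the function name `set_base_image_region` is hypothetical, and it assumes the target Region hosts the base image under the same registry account ID, which does not hold for every Region (check the Built-in Container Image list in the BYOC Reference).

```python
import re

# Matches an ECR registry host such as
# 763104351884.dkr.ecr.ap-northeast-2.amazonaws.com
# and captures the parts before and after the Region segment.
_ECR_HOST = re.compile(r"(\d{12}\.dkr\.ecr\.)([a-z0-9-]+)(\.amazonaws\.com)")


def set_base_image_region(dockerfile_text: str, region: str) -> str:
    """Return the Dockerfile text with the Region of each FROM line's ECR host replaced."""
    lines = []
    for line in dockerfile_text.splitlines():
        if line.lstrip().upper().startswith("FROM "):
            # Keep the account ID and domain, swap only the Region segment
            line = _ECR_HOST.sub(lambda m: m.group(1) + region + m.group(3), line)
        lines.append(line)
    trailing_newline = "\n" if dockerfile_text.endswith("\n") else ""
    return "\n".join(lines) + trailing_newline
```

To apply it, one could read `train_container/Dockerfile`, pass its text and the target Region (for example `"us-west-2"`) to the function, and write the result back to the same file.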

In [6]:
! pygmentize train_container/Dockerfile


# Pull the existing pre-built TF2.1-gpu image
FROM 763104351884.dkr.ecr.ap-northeast-2.amazonaws.com/tensorflow-training:2.1.1-gpu-py36-cu101-ubuntu18.04

# Install transformers
RUN pip install transformers==2.8.0


ENV PYTHONUNBUFFERED=TRUE
ENV PYTHONDONTWRITEBYTECODE=TRUE

ENV PATH="/opt/ml/code:${PATH}"

# Copy training code
COPY tf_script_bert_tweet.py /opt/ml/code/
 
WORKDIR /opt/ml/code

ENV SAGEMAKER_PROGRAM tf_script_bert_tweet.py


## Push the Train Docker Image to ECR (Elastic Container Registry)

In [7]:
import os
os.environ['train_container_name']= "bert2tweet"

In [8]:
%%sh
cd train_container

# The name of our algorithm
algorithm_name=$train_container_name


account=$(aws sts get-caller-identity --query Account --output text)

# Get the region defined in the current configuration (default to us-west-2 if none defined)
region=$(aws configure get region)
region=${region:-us-west-2}

echo $region

fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"

# If the repository doesn't exist in ECR, create it.

aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

# Get the login command from ECR and execute it directly.
# Note: `aws ecr get-login` exists only in AWS CLI v1; it was removed in v2.
$(aws ecr get-login --region ${region} --no-include-email)

# Log in to the AWS Deep Learning Containers registry (763104351884) in order to
# pull the tensorflow-training:2.1.1-gpu base image
$(aws ecr get-login --registry-ids 763104351884 --region ${region} --no-include-email)

# Build the docker image locally with the image name and then push it to ECR
# with the full name.

docker build  -t ${algorithm_name} . --build-arg REGION=${region}
docker tag ${algorithm_name} ${fullname}

docker push ${fullname}

ap-northeast-2
Login Succeeded
Login Succeeded
Sending build context to Docker daemon  201.7kB
Step 1/8 : FROM 763104351884.dkr.ecr.ap-northeast-2.amazonaws.com/tensorflow-training:2.1.1-gpu-py36-cu101-ubuntu18.04
 ---> edb9e75607cd
Step 2/8 : RUN pip install transformers==2.8.0
 ---> Using cache
 ---> d3d2a62e4e68
Step 3/8 : ENV PYTHONUNBUFFERED=TRUE
 ---> Using cache
 ---> 3f59ae2dc8f4
Step 4/8 : ENV PYTHONDONTWRITEBYTECODE=TRUE
 ---> Using cache
 ---> 7d663fa5b7b1
Step 5/8 : ENV PATH="/opt/ml/code:${PATH}"
 ---> Using cache
 ---> 2e856f64c9fc
Step 6/8 : COPY tf_script_bert_tweet.py /opt/ml/code/
 ---> 0428f7974671
Step 7/8 : WORKDIR /opt/ml/code
 ---> Running in d5ee3d4166b1
Removing intermediate container d5ee3d4166b1
 ---> 1eddc5e6a53b
Step 8/8 : ENV SAGEMAKER_PROGRAM tf_script_bert_tweet.py
 ---> Running in e01f7b1a22ae
Removing intermediate container e01f7b1a22ae
 ---> 9e2496216d59
Successfully built 9e2496216d59
Successfully tagged bert2tweet:latest
The push refers to repositor

https://docs.docker.com/engine/reference/commandline/login/#credentials-store
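
The shell cell above assembles the full image name from the account ID, Region, and repository name, and logs in to ECR before pushing. As a rough Python sketch of those two pieces: the function names `ecr_image_uri` and `decode_ecr_token` are hypothetical, and the token format (base64 of `user:password`) is what the ECR `GetAuthorizationToken` API returns.

```python
import base64
from typing import Tuple


def ecr_image_uri(account: str, region: str, repository: str, tag: str = "latest") -> str:
    """Build the full ECR image name, mirroring `fullname` in the shell cell."""
    return f"{account}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def decode_ecr_token(authorization_token: str) -> Tuple[str, str]:
    """Split a base64-encoded ECR authorization token into (user, password)."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    user, _, password = decoded.partition(":")
    return user, password


# "123456789012" is a placeholder account ID used only for illustration
print(ecr_image_uri("123456789012", "ap-northeast-2", "bert2tweet"))
```

Assuming boto3 is available (it is not used in the original notebook), the account ID could come from `boto3.client("sts").get_caller_identity()["Account"]` and the token from `boto3.client("ecr").get_authorization_token()["authorizationData"][0]["authorizationToken"]`. The decoded password could then be passed to `docker login` over standard input.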

