# Running LightGBM in a SageMaker Custom Container

https://dev.classmethod.jp/articles/sagemaker-container-image-lightgbm/

* Build the custom container
* SageMaker training job (local mode)
* SageMaker training job
* Deploy the endpoint
* Run inference
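
The training, deployment, and inference steps in this list can be sketched with the SageMaker Python SDK. This is a rough outline rather than the referenced article's code: it assumes SDK v2 parameter names (`image_uri`, `instance_count`), and the helper names, the `ml.m5.xlarge` instance type, and the channel name `training` are illustrative choices. Passing `instance_type="local"` selects local mode, which runs the container through Docker on the notebook instance.

```python
def estimator_settings(image_uri, role, local=True):
    """Return keyword arguments for sagemaker.estimator.Estimator.

    instance_type="local" selects local mode; any ml.* type (here
    ml.m5.xlarge, an arbitrary choice) runs a managed training job.
    """
    return {
        "image_uri": image_uri,
        "role": role,
        "instance_count": 1,
        "instance_type": "local" if local else "ml.m5.xlarge",
    }


def train_and_deploy(image_uri, role, train_data, local=True):
    """Train with the custom image, then deploy an endpoint from the result.

    train_data is typically a file:// path in local mode and an s3:// URI
    for a managed training job.
    """
    # Imported here so the settings helper works without the SDK installed.
    from sagemaker.estimator import Estimator

    settings = estimator_settings(image_uri, role, local)
    estimator = Estimator(**settings)
    estimator.fit({"training": train_data})
    # In this sketch the endpoint uses the same instance type as training.
    return estimator.deploy(
        initial_instance_count=1,
        instance_type=settings["instance_type"],
    )
```

The predictor returned by `deploy` exposes a `predict` method for sending requests to the endpoint; the payload format depends on what the container's `serve` program accepts.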


In [12]:
!cat container/Dockerfile

# Build an image that can do training and inference in SageMaker
# This is a Python 3 image that uses the nginx, gunicorn, flask stack
# for serving inferences in a stable way.

FROM ubuntu:18.04

MAINTAINER Amazon AI <sage-learner@amazon.com>


RUN apt -y update && apt install -y --no-install-recommends \
    wget \
    python3-distutils \
    nginx \
    ca-certificates \
    libgomp1 \
    && apt clean

# Here we get all python packages.
# pip leaves the install caches populated, which uses a significant amount of
# space. Removing them keeps the image smaller, which reduces start up time.
RUN wget https://bootstrap.pypa.io/get-pip.py && python3 get-pip.py && \
    pip install wheel numpy scipy scikit-learn pandas lightgbm flask gevent gunicorn && \
    rm -rf /root/.cache

# Set some environment variables. PYTHONUNBUFFERED keeps Python from buffering o

In [13]:
%run ./container/build_and_push.sh

SyntaxError: invalid syntax (build_and_push.sh, line 7)
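
The `SyntaxError` above is expected: `%run` executes its target as a Python program, so the first shell-only line of `build_and_push.sh` fails to parse. In a notebook the script can be run with `!bash ./container/build_and_push.sh` or from a `%%sh` cell instead. A minimal sketch of the same idea in plain Python follows; the helper name `run_shell_script` is hypothetical, and it assumes `bash` is available on the PATH.

```python
import subprocess
from pathlib import Path


def run_shell_script(path, *args):
    """Run a shell script with bash and return its standard output as text."""
    script = Path(path)
    if not script.is_file():
        raise FileNotFoundError(f"no such script: {script}")
    # check=True raises CalledProcessError on a non-zero exit status,
    # which is the same error type a failing %%sh cell reports.
    result = subprocess.run(
        ["bash", str(script), *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```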

In [9]:
!which sh

/usr/bin/sh


In [16]:
%%sh

# Algorithm name
algorithm_name=sagemaker-lightgbm

# Make the entry-point files executable
chmod +x container/src/train
chmod +x container/src/serve

# Get the AWS account ID
account=$(aws sts get-caller-identity --query Account --output text)

# Region name
#region='ap-northeast-1'
region='us-west-2'

# Full image name (ECR repository URI plus tag)
fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"

# Create the ECR repository if it does not exist yet
aws --region ${region} ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws --region ${region} ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

# Get the ECR login command and log in
$(aws ecr get-login --region ${region} --no-include-email)


# Build the container image (the build context "." must be the directory that contains the Dockerfile)
docker build -t ${algorithm_name} .
docker tag ${algorithm_name} ${fullname}

# Push the image to the ECR repository
docker push ${fullname}

Login Succeeded
The push refers to repository [805433377179.dkr.ecr.us-west-2.amazonaws.com/sagemaker-lightgbm]


https://docs.docker.com/engine/reference/commandline/login/#credentials-store

unable to prepare context: unable to evaluate symlinks in Dockerfile path: lstat /home/ec2-user/SageMaker/aws_distributed_training/tabular_data/lightgbm_sm_trainingjob/Dockerfile: no such file or directory
Error response from daemon: No such image: sagemaker-lightgbm:latest
An image does not exist locally with the tag: 805433377179.dkr.ecr.us-west-2.amazonaws.com/sagemaker-lightgbm


CalledProcessError: Command 'b'\n# ...(the %%sh cell above, echoed as escaped bytes)...\ndocker push ${fullname}\n'' returned non-zero exit status 1.
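
The build fails because `docker build` is given `.` as its build context, which resolves to the notebook's working directory, while the Dockerfile shown earlier lives in `container/`. Since no image gets built, the `docker tag` and `docker push` steps fail as well. One possible fix is to pass `container` as the build context, or to `cd container` first. The sketch below assembles the image name and the three Docker commands with that change. The helper names are hypothetical, and nothing is executed until the commands are handed to `subprocess.run`.

```python
from pathlib import Path


def ecr_image_uri(account, region, name, tag="latest"):
    """Return the full ECR image name used by `docker tag` and `docker push`."""
    return f"{account}.dkr.ecr.{region}.amazonaws.com/{name}:{tag}"


def docker_commands(account, region, name, context_dir="container"):
    """Return the build, tag, and push commands as argument lists.

    Raises FileNotFoundError when context_dir has no Dockerfile, which is
    the condition behind the "unable to prepare context" error.
    """
    context = Path(context_dir)
    if not (context / "Dockerfile").is_file():
        raise FileNotFoundError(f"no Dockerfile in {context}")
    uri = ecr_image_uri(account, region, name)
    return [
        ["docker", "build", "-t", name, str(context)],
        ["docker", "tag", name, uri],
        ["docker", "push", uri],
    ]
```

After logging in to ECR, looping over the returned lists with `subprocess.run(cmd, check=True)` would run the three steps in order and stop at the first failure.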