### Download Data

* Use the COCO 2017 dataset.
* To save time, use the validation set as the training set.
* Arrange the folders to match the layout that a bring-your-own-container training job expects.
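
The folder layout in the last bullet can be sketched in Python. This is a minimal sketch, assuming the usual SageMaker convention that each input channel appears inside the container at `/opt/ml/input/data/<channel>`. The helper name `make_channel_dir` is hypothetical, and the demo writes to a temporary directory rather than `~/SageMaker`.

```python
import tempfile
from pathlib import Path


def make_channel_dir(root, channel):
    """Create <root>/input/data/<channel> and return its path.

    `root` stands in for the directory that is later mounted at /opt/ml.
    """
    channel_dir = Path(root) / "input" / "data" / channel
    # parents=True builds the intermediate folders; exist_ok=True makes reruns safe
    channel_dir.mkdir(parents=True, exist_ok=True)
    return channel_dir


# Demonstrate in a throwaway directory (assumption: channel is named "coco")
demo_root = Path(tempfile.mkdtemp())
coco_dir = make_channel_dir(demo_root, "coco")
print(coco_dir.relative_to(demo_root).as_posix())
```

When `~/SageMaker` is mounted at `/opt/ml`, as in the local `nvidia-docker run` commands later in this notebook, `~/SageMaker/input/data/coco` becomes `/opt/ml/input/data/coco` inside the container.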

In [None]:
%%bash 
cd ~/SageMaker
# -p creates both levels and does not fail if they already exist
mkdir -p input/data
cd input/data 
wget -O coco.zip https://tinyurl.com/yhtbxr6q 
unzip coco.zip 

### Build AlphaPose container

* See the [installation instructions](https://github.com/catwhiskers/AlphaPose/blob/master/docs/INSTALL.md).
* Pin the package dependency `pycocotools==2.0.2a1`.
* Download the pretrained models used for inference (YOLOv3 and FastPose).


In [2]:
%%bash 
cd container 
./build_and_push.sh 

230755935769
us-west-2
Login Succeeded
Login Succeeded
1.15.5-gpu-py37-cu100-ubuntu18.04: Pulling from tensorflow-training
Digest: sha256:b9bf85ad4689c7a728ba47866fdfe868ee7c7bcdb0417b3939df4cbc7053a5bc
Status: Image is up to date for 763104351884.dkr.ecr.us-west-2.amazonaws.com/tensorflow-training:1.15.5-gpu-py37-cu100-ubuntu18.04
763104351884.dkr.ecr.us-west-2.amazonaws.com/tensorflow-training:1.15.5-gpu-py37-cu100-ubuntu18.04
Sending build context to Docker daemon  4.608kB
Step 1/4 : ARG BASE_IMG=${BASE_IMG}
Step 2/4 : FROM ${BASE_IMG}
 ---> 53fefee0e5ac
Step 3/4 : RUN apt-get update -y && apt-get install ffmpeg libsm6 libxext6 -y
 ---> Using cache
 ---> 5242af8b2332
Step 4/4 : RUN pip install nibabel opencv-python matplotlib keras==2.3.1
 ---> Using cache
 ---> a08698b5d41b
Successfully built a08698b5d41b
Successfully tagged dunet:latest
The push refers to repository [230755935769.dkr.ecr.us-west-2.amazonaws.com/dunet]
156f3779f1f9: Preparing
c62bd00454db: Preparing
85d8371500dd:

https://docs.docker.com/engine/reference/commandline/login/#credentials-store



In [6]:
import boto3 
client = boto3.client("sts")
account_id = client.get_caller_identity()["Account"]
image_uri = "{}.dkr.ecr.{}.amazonaws.com/dunet".format(account_id, "us-west-2")
image_uri

'230755935769.dkr.ecr.us-west-2.amazonaws.com/dunet'
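
The URI above follows the pattern `<account>.dkr.ecr.<region>.amazonaws.com/<repository>`. As a minimal sketch, the string assembly can be factored into a small helper. The name `ecr_image_uri` is hypothetical, the account ID in the example is a placeholder, and the sketch assumes the standard `amazonaws.com` domain.

```python
def ecr_image_uri(account_id, region, repository, tag=None):
    """Build an ECR image URI from its parts.

    Assumes the standard AWS partition (domain amazonaws.com).
    """
    uri = "{}.dkr.ecr.{}.amazonaws.com/{}".format(account_id, region, repository)
    # Append a tag only when one is given; without it Docker defaults to "latest"
    if tag:
        uri = "{}:{}".format(uri, tag)
    return uri


# Placeholder account ID, not a real account
example_uri = ecr_image_uri("123456789012", "us-west-2", "dunet")
print(example_uri)
```

Passing the region as a parameter, instead of hard-coding `"us-west-2"`, keeps the same helper usable if the notebook runs in another region.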

### Run inference with the Docker image
* See the [inference instructions](https://github.com/catwhiskers/AlphaPose).

In [None]:
!nvidia-docker run -it -v /home/ec2-user/SageMaker/:/opt/ml  --entrypoint='' 230755935769.dkr.ecr.us-west-2.amazonaws.com/alphapose-byos  python scripts/demo_inference.py --cfg configs/coco/resnet/256x192_res50_lr1e-3_1x.yaml --checkpoint pretrained_models/fast_res50_256x192.pth --indir examples/demo/ --outdir /opt/ml/demo --save_img

### Train locally with the Docker image

In [None]:
!nvidia-docker run -it -v /home/ec2-user/SageMaker/:/opt/ml 230755935769.dkr.ecr.us-west-2.amazonaws.com/alphapose-byos 

### Use SageMaker Training Jobs 

In [None]:
import sagemaker
from sagemaker import get_execution_role
role = get_execution_role()
sagemaker_session = sagemaker.Session()
bucket = sagemaker_session.default_bucket()
prefix = "alphapose"

#### Upload data to S3

In [None]:
!cd ~/SageMaker/input/data/ && aws s3 cp --recursive coco s3://{bucket}/{prefix}/coco

#### Define S3 input and output paths

In [None]:
coco_data = "s3://{}/{}/coco/".format(bucket, prefix)
outpath = "s3://{}/{}/output/".format(bucket, prefix)
repositoryUri = "230755935769.dkr.ecr.us-west-2.amazonaws.com/alphapose-byos"

#### Define the job name

In [None]:
from datetime import datetime
now = datetime.now()
timestamp = datetime.timestamp(now)
job_name = "alphapose-{}".format(str(int(timestamp))) 
job_name 
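
SageMaker training job names must be unique within an account and region, which is why a timestamp is appended. They are also limited to 63 characters drawn from letters, digits, and hyphens. The following is a minimal sketch of a helper that builds the name and checks it before submission. The name `make_job_name` is hypothetical, and the pattern is my reading of the `CreateTrainingJob` constraints, so treat it as an assumption to verify against the API reference.

```python
import re
import time

# Assumed constraint: starts with an alphanumeric, then alphanumerics
# optionally preceded by hyphens; at most 63 characters overall.
_JOB_NAME_RE = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}")


def make_job_name(prefix, epoch_seconds=None):
    """Return '<prefix>-<unix timestamp>' or raise ValueError if it is invalid."""
    if epoch_seconds is None:
        epoch_seconds = time.time()
    name = "{}-{}".format(prefix, int(epoch_seconds))
    if len(name) > 63 or not _JOB_NAME_RE.fullmatch(name):
        raise ValueError("invalid training job name: {!r}".format(name))
    return name


# Fixed timestamp so the example is reproducible
print(make_job_name("alphapose", 1600000000))
```

Underscores are a common cause of rejected names; `make_job_name("alpha_pose")` raises `ValueError` under the assumed pattern.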

#### Submit the training job

In [None]:
coco_input = sagemaker.inputs.TrainingInput(coco_data)

In [None]:
from sagemaker.pytorch import PyTorch

# The custom AlphaPose image (image_uri) is used as the training container
estimator = PyTorch(
    entry_point='scripts/train.py',
    role=role,
    image_uri=repositoryUri,
    source_dir='.',
    instance_count=1,
    instance_type='ml.p3.8xlarge',
    framework_version='1.6.0',
    py_version='py3',
    sagemaker_session=sagemaker_session,
    volume_size=100,             # EBS volume size in GB
    debugger_hook_config=False   # turn off the SageMaker Debugger hook
)


In [None]:
estimator.fit(inputs={"coco":coco_input}, job_name=job_name)
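
The key `"coco"` in `inputs` is the channel name. Inside the training container the channel's data is made available under `/opt/ml/input/data/coco`, and containers built on the SageMaker training toolkit also expose that path in the `SM_CHANNEL_COCO` environment variable. The following is a minimal sketch of how a training script might locate the data. The helper `resolve_channel_dir` is hypothetical and is not taken from AlphaPose's `scripts/train.py`.

```python
import os


def resolve_channel_dir(channel, environ=None):
    """Return the directory holding the data for a SageMaker input channel.

    Prefers SM_CHANNEL_<NAME> when it is set, otherwise falls back to the
    conventional /opt/ml/input/data/<channel> mount point.
    """
    environ = os.environ if environ is None else environ
    env_key = "SM_CHANNEL_{}".format(channel.upper())
    return environ.get(env_key, "/opt/ml/input/data/{}".format(channel))


# With an empty environment the conventional path is returned
print(resolve_channel_dir("coco", environ={}))
```

The same fallback path is what the local `nvidia-docker run` command produces when `~/SageMaker` is mounted at `/opt/ml`, so a script written this way can run unchanged in both settings.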