
## Bring Your Own Algorithm to SageMaker


### Training
a. [Bring Your Own Container](#byoc)

b. [Training locally](#local_train)

c. [Trigger remote training job](#remote_train)





### BYOC (Bring Your Own Container) for CascadeTabNet
<a name="byoc"></a>


* Prepare the necessary variables: use `boto3` to get the region and account ID, which are needed later to construct the ECR URI.

In [1]:
import boto3 

session = boto3.session.Session()
region = session.region_name
client = boto3.client("sts")
account_id = client.get_caller_identity()["Account"]
algorithm_name = "cascade-tab-net"

#### Three elements of a bring-your-own container
* `build_and_push.sh` builds the image and pushes it to ECR
* `Dockerfile` defines the training and serving environment
* `code/train` and `code/serve` define the entry points of our container
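
SageMaker starts a training container by running the image with the argument `train`, so `code/train` must be an executable that the container can find. The sketch below is not the repository's actual `code/train`; it only illustrates the standard `/opt/ml` directory layout such a script usually reads from and writes to. The helper names `sagemaker_paths` and `load_hyperparameters` are hypothetical.

```python
import json
import posixpath


def sagemaker_paths(channel, prefix="/opt/ml"):
    """Return the standard in-container paths for one input channel."""
    return {
        # Data for a channel named in estimator.fit(inputs={channel: ...})
        "data": posixpath.join(prefix, "input", "data", channel),
        # Hyperparameters passed to the job, serialized as JSON
        "hyperparameters": posixpath.join(
            prefix, "input", "config", "hyperparameters.json"
        ),
        # Everything written here is packed into model.tar.gz after training
        "model": posixpath.join(prefix, "model"),
        # Writing an error message here marks the job as failed
        "failure": posixpath.join(prefix, "output", "failure"),
    }


def load_hyperparameters(path):
    """Read the hyperparameter file; SageMaker passes every value as a string."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


paths = sagemaker_paths("icdar_table_cells_dataset")
print(paths["data"])  # /opt/ml/input/data/icdar_table_cells_dataset
```

A real entry point would then call the training code with `paths["data"]` as the dataset root and save checkpoints under `paths["model"]`.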

In [16]:
%%bash 
cd docker 
./build_and_push.sh

230755935769
us-west-2
Login Succeeded
Login Succeeded
1.4.0-gpu-py36-cu101-ubuntu16.04: Pulling from pytorch-training
Digest: sha256:0a352ccd7298c8039bea2d6dff533fd91df758dd7b42b08191db4cc9271fdff5
Status: Image is up to date for 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training:1.4.0-gpu-py36-cu101-ubuntu16.04
763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training:1.4.0-gpu-py36-cu101-ubuntu16.04
Sending build context to Docker daemon  6.656kB
Step 1/19 : ARG BASE_IMG=${BASE_IMG}
Step 2/19 : FROM ${BASE_IMG}
 ---> 7c08530cf40c
Step 3/19 : RUN pip install -q mmcv terminaltables
 ---> Using cache
 ---> d973e0ac87e8
Step 4/19 : RUN git clone --branch v1.2.0 'https://github.com/open-mmlab/mmdetection.git'
 ---> Using cache
 ---> 6133b9597a09
Step 5/19 : WORKDIR /mmdetection
 ---> Using cache
 ---> a7bae8fc407f
Step 6/19 : ENV FORCE_CUDA="1"
 ---> Using cache
 ---> f86ae3b27d1d
Step 7/19 : RUN pip install -r requirements/optional.txt
 ---> Using cache
 ---> e6c59f8c94

https://docs.docker.com/engine/reference/commandline/login/#credentials-store




In [3]:
!cat docker/Dockerfile

ARG BASE_IMG=${BASE_IMG}
FROM ${BASE_IMG} 
RUN pip install -q mmcv terminaltables
RUN git clone --branch v1.2.0 'https://github.com/open-mmlab/mmdetection.git'
WORKDIR /mmdetection
ENV FORCE_CUDA="1"
RUN pip install -r requirements/optional.txt
RUN python setup.py install 
RUN python setup.py develop
RUN pip install -r requirements.txt 
RUN pip install pillow==6.2.1 mmcv==0.4.3 pycocotools
COPY download_model.sh . 
RUN ./download_model.sh 




In [4]:
!cat docker/build_and_push.sh

#!/bin/bash

# The name of our algorithm
algorithm_name=cascade-tab-net

#cd container
# get information - account and region, required by ECR https://aws.amazon.com/ecr/
account=$(aws sts get-caller-identity --query Account --output text)
echo $account
# Get the region defined in the current configuration (default to us-west-2 if none defined)
region=$(aws configure get region)
region=${region:-us-west-2}
echo $region


# derive fullname of docker image 
fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"

# If the repository doesn't exist in ECR, create it.

aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1
if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

# Get the login command from ECR and execute it directly
$(aws ecr get-login --region ${region} --no-include-email)

# Get the login command from ECR in order to pull down the SageMaker 

* Construct the image URI from `account_id`, `region`, and `algorithm_name`

In [17]:
image_uri=f"{account_id}.dkr.ecr.{region}.amazonaws.com/{algorithm_name}"
image_uri

'230755935769.dkr.ecr.us-west-2.amazonaws.com/cascade-tab-net'
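
The URI above carries no tag, so Docker resolves it to `latest`, which is the tag `build_and_push.sh` pushes. As a small sketch, the helper below makes the tag explicit and checks the account ID format. The function name `ecr_image_uri` and the placeholder account ID are illustrative, and the `amazonaws.com` suffix assumes a standard commercial AWS region.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build a fully qualified ECR image URI with an explicit tag."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("expected a 12-digit AWS account ID")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Placeholder account ID, not a real account
print(ecr_image_uri("123456789012", "us-west-2", "cascade-tab-net"))
# 123456789012.dkr.ecr.us-west-2.amazonaws.com/cascade-tab-net:latest
```

Pinning a tag other than `latest` makes it easier to tell which image a given training job used.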

* Prepare the variables and objects needed for training

In [18]:
import sagemaker 
session = sagemaker.session.Session()
bucket = session.default_bucket()

In [19]:
from sagemaker import get_execution_role

role = get_execution_role()
print(role)

s3_path = f"s3://{bucket}/data/icdar_table_cells_dataset"
s3_path

arn:aws:iam::230755935769:role/SageMakerExecutionRoleMLOps


's3://sagemaker-us-west-2-230755935769/data/icdar_table_cells_dataset'

### Dataset Description

The dataset used here was manually labeled by the CascadeTabNet team and is available at this [link](https://drive.google.com/drive/folders/1mNDbbhu-Ubz87oRDjdtLA4BwQwwNOO-G).

In [9]:
# Upload the local dataset folder to the S3 prefix defined above
!aws s3 cp --recursive ~/SageMaker/icdar_table_cells_dataset $s3_path

upload: ../icdar_table_cells_dataset/chunk_images/cTDaR_t10003.jpg to s3://sagemaker-us-west-2-230755935769/data/icdar_table_cells_dataset/chunk_images/cTDaR_t10003.jpg
...
upload: ../icdar_table_cells_dataset/orig_chunk/cTDaR_t10045.xml to s3://sagemaker-us-west-2-230755935769/data/icdar_table_cells_dataset/orig_chunk/cTDaR_t10045.xml
...
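
As the log shows, `aws s3 cp --recursive` keeps each file's path relative to the source folder and appends it to the destination prefix. The sketch below reproduces that mapping in plain Python so the target keys can be inspected before anything is uploaded. `plan_uploads` is a hypothetical helper, and the bucket name in the usage example is a placeholder.

```python
import tempfile
from pathlib import Path


def plan_uploads(local_dir, s3_prefix):
    """Map every file under local_dir to the S3 URI it would be copied to."""
    local_dir = Path(local_dir)
    prefix = s3_prefix.rstrip("/")
    plan = []
    for path in sorted(local_dir.rglob("*")):
        if path.is_file():
            # Keep the path relative to the dataset root, with forward slashes
            key = path.relative_to(local_dir).as_posix()
            plan.append((str(path), f"{prefix}/{key}"))
    return plan


# Usage with a throwaway directory standing in for the dataset folder
with tempfile.TemporaryDirectory() as tmp:
    for sub, name in [("chunk_images", "a.jpg"), ("orig_chunk", "a.xml")]:
        (Path(tmp) / sub).mkdir()
        (Path(tmp) / sub / name).touch()
    for local, remote in plan_uploads(tmp, "s3://my-bucket/data/icdar_table_cells_dataset"):
        print(remote)
```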

### Train the model in a Docker container from the terminal
<a name="local_train"></a>

* Start the container in interactive mode
```
IMAGE_ID=$(sudo docker images --filter=reference=cascade-tab-net --format "{{.ID}}")
nvidia-docker run -it -v "$PWD:$PWD" --shm-size=4096m "$IMAGE_ID" bash
```
* Train the model as described in README.md
```
cd /home/ec2-user/SageMaker/CascadeTabNet/
python /mmdetection/tools/train.py Config/cascade_mask_rcnn_hrnetv2p_w32_20e_smnb.py

cd "/home/ec2-user/SageMaker/CascadeTabNet/Table Structure Recognition/"
python main.py
```
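
The `Table Structure Recognition` directory name contains spaces, so it has to be quoted in the shell. If the two steps are driven from Python instead, passing the directory as `cwd` avoids shell quoting altogether. The following is only a sketch of that idea: `training_commands` is a hypothetical helper, and the paths are the ones used in the commands above.

```python
import shlex


def training_commands(repo_dir):
    """Return (argv, cwd) pairs for the two training steps."""
    return [
        (
            ["python", "/mmdetection/tools/train.py",
             "Config/cascade_mask_rcnn_hrnetv2p_w32_20e_smnb.py"],
            repo_dir,
        ),
        (["python", "main.py"], f"{repo_dir}/Table Structure Recognition"),
    ]


for argv, cwd in training_commands("/home/ec2-user/SageMaker/CascadeTabNet"):
    # Print the equivalent shell command with safe quoting
    print("cd", shlex.quote(cwd), "&&", " ".join(shlex.quote(a) for a in argv))
    # Inside the container, run it with:
    # subprocess.run(argv, cwd=cwd, check=True)
```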

In [26]:
from datetime import datetime
now = datetime.now()
timestamp = datetime.timestamp(now)
job_name = "cascadetabnet-{}".format(str(int(timestamp))) 
job_name

'cascadetabnet-1627990041'
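
Training job names must be unique within an account and region, which is why a timestamp is appended. They are also restricted to alphanumeric characters and hyphens, up to 63 characters. The sketch below wraps the cell above in a helper that checks the name before it is submitted. `make_job_name` is a hypothetical helper, and the pattern reflects my reading of the `CreateTrainingJob` constraints, so verify it against the API reference.

```python
import re
import time

# Assumed constraint on TrainingJobName: alphanumerics and hyphens, max 63 chars
_JOB_NAME_RE = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}$")


def make_job_name(prefix, now=None):
    """Build '<prefix>-<unix timestamp>' and validate it as a job name."""
    ts = int(time.time() if now is None else now)
    name = f"{prefix}-{ts}"
    if not _JOB_NAME_RE.match(name):
        raise ValueError(f"invalid training job name: {name!r}")
    return name


print(make_job_name("cascadetabnet", now=1627990041))  # cascadetabnet-1627990041
```

A prefix such as `cascade_tab_net` would be rejected here because underscores are not allowed.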

### Start SageMaker Training Job
<a name="remote_train"></a>
* SageMaker training jobs can run either locally or remotely

In [27]:
mode = 'remote'
if mode == 'local':
    # Local mode runs the container on this machine
    csess = sagemaker.local.LocalSession()
    instance_type = 'local_gpu'
else:
    # Remote mode launches a managed training instance
    csess = session
    instance_type = 'ml.p3.8xlarge'

print(csess)
estimator = sagemaker.estimator.Estimator(
    role=role,
    image_uri=image_uri,
    instance_count=1,
    instance_type=instance_type,
    sagemaker_session=csess,
    volume_size=100,  # EBS volume size in GB
    debugger_hook_config=False,
)

<sagemaker.session.Session object at 0x7fdcb1aefe80>


In [None]:
estimator.fit(inputs={"icdar_table_cells_dataset":s3_path}, job_name=job_name)

2021-08-03 11:27:25 Starting - Starting the training job...
2021-08-03 11:27:49 Starting - Launching requested ML instances
ProfilerReport-1627990044: InProgress
.........
2021-08-03 11:29:09 Starting - Preparing the instances for training...
2021-08-03 11:29:55 Downloading - Downloading input data...
2021-08-03 11:30:16 Training - Downloading the training image....................
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2021-08-03 11:33:40,962 - mmdet - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.6.13 |Anaconda, Inc.| (default, Feb 23 2021, 21:15:04) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.243
GPU 0,1,2,3: Tesla V100-SXM2-16GB
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609

unexpected key in source state_dict: incre_modules.0.0.conv1.weight, incre_modules.0.0.bn1.weight, incre_modules.0.0.bn1.bias, incre_modules.0.0.bn1.running_mean, incre_modules.0.0.bn1.running_var, incre_modules.0.0.bn1.num_batches_tracked, incre_modules.0.0.conv2.weight, incre_modules.0.0.bn2.weight, incre_modules.0.0.bn2.bias, incre_modules.0.0.bn2.running_mean, incre_modules.0.0.bn2.running_var, incre_modules.0.0.bn2.num_batches_tracked, incre_modules.0.0.conv3.weight, incre_modules.0.0.bn3.weight, incre_modules.0.0.bn3.bias, incre_modules.0.0.bn3.running_mean, incre_modules.0.0.bn3.running_var, incre_modules.0.0.bn3.num_batches_tracked, incre_modules.0.0.downsample.0.weight, incre_modules.0.0.downsample.1.weight, incre_modules.0.0.downsample.1.bias, incre_modules.0.0.downsample.1.running_mean, incre_modules.0.0.downsample.1.running_var, incre_modules.0.0.downsample.1.num_batches_tracked, incre_modules.1.0.conv1.weight, incre_modules.1.0.bn1.weight, incre_modules.1.0.bn1.