# TensorFlow Object Detection API and AWS SageMaker

In this notebook, you will train and evaluate different models using the [TensorFlow Object Detection API](https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/) and [AWS SageMaker](https://aws.amazon.com/sagemaker/).

If you ever feel stuck, you can refer to this [tutorial](https://aws.amazon.com/blogs/machine-learning/training-and-deploying-models-using-tensorflow-2-with-the-object-detection-api-on-amazon-sagemaker/).

## Dataset

We are using the [Waymo Open Dataset](https://waymo.com/open/) for this project. The dataset has already been exported to the TFRecord format, following the layout described [here](https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html#create-tensorflow-records). The data is stored on [AWS S3](https://aws.amazon.com/s3/), the AWS object storage service. The images are saved at a resolution of 640x640.

In [13]:
%%capture
%pip install tensorflow_io sagemaker -U

In [14]:
import os
import sagemaker
from sagemaker.estimator import Estimator
from framework import CustomFramework

Save the IAM role in a variable called `role`. It will be needed when training the model.

In [15]:
role = sagemaker.get_execution_role()
print(role)

INFO:botocore.credentials:Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole


arn:aws:iam::406551161089:role/service-role/AmazonSageMaker-ExecutionRole-20230623T150985


In [16]:
# The train and val paths below are public S3 buckets created by Udacity for this project
inputs = {'train': 's3://cd2688-object-detection-tf2/train/', 
        'val': 's3://cd2688-object-detection-tf2/val/'} 

# Insert path of a folder in your personal S3 bucket to store tensorboard logs.
tensorboard_s3_prefix = 's3://object-detection-project-subodh/logs/'
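
Before launching a training job, it can help to confirm that each channel path is a well-formed S3 URI. The sketch below is a minimal check using only the standard library; `split_s3_uri` is a hypothetical helper that is not part of the notebook or of the SageMaker SDK, and it only inspects the string, so it does not verify that the bucket exists or is readable.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/prefix/' into (bucket, key_prefix).

    Hypothetical helper for illustration; it validates the string only.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not a valid S3 URI: {uri!r}")
    # urlparse keeps the leading slash on the path; S3 keys do not use it.
    return parsed.netloc, parsed.path.lstrip("/")


# Same channel paths as in the cell above.
inputs = {
    "train": "s3://cd2688-object-detection-tf2/train/",
    "val": "s3://cd2688-object-detection-tf2/val/",
}

channels = {name: split_s3_uri(uri) for name, uri in inputs.items()}
for name, (bucket, prefix) in channels.items():
    print(f"{name}: bucket={bucket} prefix={prefix}")
```

The same helper can be applied to `tensorboard_s3_prefix` to catch a missing `s3://` scheme before the job starts.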

## Container

To train the model, you will first need to build a [Docker](https://www.docker.com/) container with all the dependencies required by the TF Object Detection API. The code below does the following:
* clone the TensorFlow models repository
* get the exporter and training scripts from the repository
* build the Docker image and push it to ECR
* print the container name

In [17]:
%%bash
rm -rf docker/models
# clone the repo and get the scripts
git clone https://github.com/tensorflow/models.git docker/models

# get model_main and exporter_main files from TF2 Object Detection GitHub repository
cp docker/models/research/object_detection/exporter_main_v2.py source_dir 
cp docker/models/research/object_detection/model_main_tf2.py source_dir

Cloning into 'docker/models'...
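
The copy step in the cell above fails silently from the notebook's point of view if the repository layout changes. A possible Python equivalent is sketched below; `copy_training_scripts` is a hypothetical helper, and the default paths simply mirror the ones used in the `%%bash` cell. It raises an error if either script is missing instead of continuing.

```python
import shutil
from pathlib import Path

# Script names taken from the %%bash cell above.
SCRIPTS = ("exporter_main_v2.py", "model_main_tf2.py")


def copy_training_scripts(models_dir="docker/models", dest="source_dir"):
    """Copy the exporter and training scripts into `dest`.

    Hypothetical helper: assumes the TensorFlow models repository has
    already been cloned into `models_dir`.
    """
    src_dir = Path(models_dir) / "research" / "object_detection"
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Check everything up front so a partial copy never happens.
    missing = [name for name in SCRIPTS if not (src_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {src_dir}: {', '.join(missing)}")

    return [Path(shutil.copy(src_dir / name, dest_dir / name)) for name in SCRIPTS]
```

Calling `copy_training_scripts()` with no arguments reproduces the two `cp` commands and returns the destination paths.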


In [18]:
# Build and push the Docker image. This cell can be commented out after it has been run once.
# This will take around 10 mins.
image_name = 'tf2-object-detection'
!sh ./docker/build_and_push.sh $image_name

https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
Building image with name tf2-object-detection
Sending build context to Docker daemon  727.8MB
Step 1/17 : FROM tensorflow/tensorflow:2.9.0-gpu
 ---> c8d9ee2a0ff4
Step 2/17 : ARG DEBIAN_FRONTEND=noninteractive
 ---> Running in a9681712a473
Removing intermediate container a9681712a473
 ---> deb5e06241f8
Step 3/17 : RUN rm /etc/apt/sources.list.d/cuda.list
 ---> Running in 8be9d2a39a78
Removing intermediate container 8be9d2a39a78
 ---> 1c315ee67efb
Step 4/17 : RUN apt-key del 7fa2af80
 ---> Running in b9cec4568994
OK
Removing intermediate container b9cec4568994
 ---> 3d42ce43b6ba
Step 5/17 : RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub
 ---> Running in a7d63d32b3e2
Executing: /tmp/apt-key-gpghome.YGOegihApo/gpg.1.sh --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub

Get:3 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 gpg-wks-server amd64 2.2.19-3ubuntu2.2 [90.2 kB]
Get:4 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 gnupg-utils amd64 2.2.19-3ubuntu2.2 [481 kB]
Get:5 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 gpg-agent amd64 2.2.19-3ubuntu2.2 [232 kB]
Get:6 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 gpg amd64 2.2.19-3ubuntu2.2 [482 kB]
Get:7 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 gpgconf amd64 2.2.19-3ubuntu2.2 [124 kB]
Get:8 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 gnupg-l10n all 2.2.19-3ubuntu2.2 [51.7 kB]
Get:9 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 gnupg all 2.2.19-3ubuntu2.2 [259 kB]
Get:10 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 gpgsm amd64 2.2.19-3ubuntu2.2 [217 kB]
Get:11 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 gpgv amd64 2.2.19-3ubuntu2.2 [200 kB]
Get:12 http://archive.ubuntu.com/ubuntu focal-upda

Get:82 http://archive.ubuntu.com/ubuntu focal/main amd64 x11proto-dev all 2019.2-1ubuntu1 [594 kB]
Get:83 http://archive.ubuntu.com/ubuntu focal/main amd64 x11proto-core-dev all 2019.2-1ubuntu1 [2620 B]
Get:84 http://archive.ubuntu.com/ubuntu focal/main amd64 libxau-dev amd64 1:1.0.9-0ubuntu1 [9552 B]
Get:85 http://archive.ubuntu.com/ubuntu focal/main amd64 libxdmcp-dev amd64 1:1.1.3-0ubuntu1 [25.3 kB]
Get:86 http://archive.ubuntu.com/ubuntu focal/main amd64 xtrans-dev all 1.4.0-1 [68.9 kB]
Get:87 http://archive.ubuntu.com/ubuntu focal/main amd64 libpthread-stubs0-dev amd64 0.4-1 [5384 B]
Get:88 http://archive.ubuntu.com/ubuntu focal/main amd64 libxcb1-dev amd64 1.14-2 [80.5 kB]
Get:89 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libx11-dev amd64 2:1.6.9-2ubuntu1.5 [647 kB]
Get:90 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libglx-dev amd64 1.3.2-1~ubuntu0.20.04.2 [14.0 kB]
Get:91 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libgl-dev amd64 1.3.2

Selecting previously unselected package libdrm2:amd64.
Preparing to unpack .../007-libdrm2_2.4.107-8ubuntu1~20.04.2_amd64.deb ...
Unpacking libdrm2:amd64 (2.4.107-8ubuntu1~20.04.2) ...
Selecting previously unselected package libedit2:amd64.
Preparing to unpack .../008-libedit2_3.1-20191231-1_amd64.deb ...
Unpacking libedit2:amd64 (3.1-20191231-1) ...
Selecting previously unselected package libfido2-1:amd64.
Preparing to unpack .../009-libfido2-1_1.3.1-1ubuntu2_amd64.deb ...
Unpacking libfido2-1:amd64 (1.3.1-1ubuntu2) ...
Selecting previously unselected package libxau6:amd64.
Preparing to unpack .../010-libxau6_1%3a1.0.9-0ubuntu1_amd64.deb ...
Unpacking libxau6:amd64 (1:1.0.9-0ubuntu1) ...
Selecting previously unselected package libxdmcp6:amd64.
Preparing to unpack .../011-libxdmcp6_1%3a1.1.3-0ubuntu1_amd64.deb ...
Unpacking libxdmcp6:amd64 (1:1.1.3-0ubuntu1) ...
Selecting previously unselected package libxcb1:amd64.
Preparing to unpack .../012-libxcb1_1.14-2_amd64.deb ...
Unpacking lib

Selecting previously unselected package libxcb-xfixes0:amd64.
Preparing to unpack .../054-libxcb-xfixes0_1.14-2_amd64.deb ...
Unpacking libxcb-xfixes0:amd64 (1.14-2) ...
Selecting previously unselected package libxshmfence1:amd64.
Preparing to unpack .../055-libxshmfence1_1.3-1_amd64.deb ...
Unpacking libxshmfence1:amd64 (1.3-1) ...
Selecting previously unselected package libegl-mesa0:amd64.
Preparing to unpack .../056-libegl-mesa0_21.2.6-0ubuntu0.1~20.04.2_amd64.deb ...
Unpacking libegl-mesa0:amd64 (21.2.6-0ubuntu0.1~20.04.2) ...
Selecting previously unselected package libegl1:amd64.
Preparing to unpack .../057-libegl1_1.3.2-1~ubuntu0.20.04.2_amd64.deb ...
Unpacking libegl1:amd64 (1.3.2-1~ubuntu0.20.04.2) ...
Selecting previously unselected package libxcb-glx0:amd64.
Preparing to unpack .../058-libxcb-glx0_1.14-2_amd64.deb ...
Unpacking libxcb-glx0:amd64 (1.14-2) ...
Selecting previously unselected package libxfixes3:amd64.
Preparing to unpack .../059-libxfixes3_1%3a5.0.3-2_amd64.deb 

Selecting previously unselected package libxcb-randr0:amd64.
Preparing to unpack .../099-libxcb-randr0_1.14-2_amd64.deb ...
Unpacking libxcb-randr0:amd64 (1.14-2) ...
Selecting previously unselected package libxslt1.1:amd64.
Preparing to unpack .../100-libxslt1.1_1.1.34-4ubuntu0.20.04.1_amd64.deb ...
Unpacking libxslt1.1:amd64 (1.1.34-4ubuntu0.20.04.1) ...
Selecting previously unselected package mesa-vulkan-drivers:amd64.
Preparing to unpack .../101-mesa-vulkan-drivers_21.2.6-0ubuntu0.1~20.04.2_amd64.deb ...
Unpacking mesa-vulkan-drivers:amd64 (21.2.6-0ubuntu0.1~20.04.2) ...
Selecting previously unselected package python3-soupsieve.
Preparing to unpack .../102-python3-soupsieve_1.9.5+dfsg-1_all.deb ...
Unpacking python3-soupsieve (1.9.5+dfsg-1) ...
Selecting previously unselected package python3-bs4.
Preparing to unpack .../103-python3-bs4_4.8.2-1_all.deb ...
Unpacking python3-bs4 (4.8.2-1) ...
Selecting previously unselected package python3-ply.
Preparing to unpack .../104-python3-ply

Setting up libxcb1-dev:amd64 (1.14-2) ...
Setting up gpg-wks-client (2.2.19-3ubuntu2.2) ...
Setting up libxrender1:amd64 (1:0.9.10-1) ...
Setting up libgbm1:amd64 (21.2.6-0ubuntu0.1~20.04.2) ...
Setting up libdrm-radeon1:amd64 (2.4.107-8ubuntu1~20.04.2) ...
Setting up openssh-client (1:8.2p1-4ubuntu0.7) ...
Setting up libdrm-intel1:amd64 (2.4.107-8ubuntu1~20.04.2) ...
Setting up libgl1-mesa-dri:amd64 (21.2.6-0ubuntu0.1~20.04.2) ...
Setting up libx11-dev:amd64 (2:1.6.9-2ubuntu1.5) ...
Setting up libxext6:amd64 (2:1.3.4-0ubuntu1) ...
Setting up libcairo2:amd64 (1.16.0-4ubuntu1) ...
Setting up libxxf86vm1:amd64 (1:1.1.4-1build1) ...
Setting up libegl-mesa0:amd64 (21.2.6-0ubuntu0.1~20.04.2) ...
Setting up libxfixes3:amd64 (1:5.0.3-2) ...
Setting up libgdk-pixbuf2.0-0:amd64 (2.40.0+dfsg-3ubuntu0.4) ...
Setting up python3-cairocffi (0.9.0-4) ...
Setting up xauth (1:1.1-0ubuntu1) ...
Setting up libgdk-pixbuf2.0-bin (2.40.0+dfsg-3ubuntu0.4) ...
Setting up libegl1:amd64 (1.3.2-1~ubuntu0.20.04.2

     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.3/63.3 kB 15.8 MB/s eta 0:00:00
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Collecting oauth2client (from tf-models-official>=2.5.1->object-detection==0.1)
  Downloading oauth2client-4.1.3-py2.py3-none-any.whl (98 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 98.2/98.2 kB 25.8 MB/s eta 0:00:00
Collecting opencv-python-headless (from tf-models-official>=2.5.1->object-detection==0.1)
  Downloading opencv_python_headless-4.7.0.72-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (49.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 49.2/49.2 MB 39.0 MB/s eta 0:00:00
Collecting psutil>=5.4.3 (from tf-models-official>=2.5.1->object-detection==0.1)
  Downloading psutil-5.9.5-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (282 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 282.1/282.1 kB 65.1 MB/s eta 0:00:00
Collecting p

Collecting opencv-python>=4.1.0.25 (from lvis->object-detection==0.1)
  Downloading opencv_python-4.7.0.72-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (61.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.8/61.8 MB 37.7 MB/s eta 0:00:00
Collecting contourpy>=1.0.1 (from matplotlib->object-detection==0.1)
  Downloading contourpy-1.1.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (300 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 300.4/300.4 kB 60.5 MB/s eta 0:00:00
Collecting fonttools>=4.22.0 (from matplotlib->object-detection==0.1)
  Downloading fonttools-4.40.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.4/4.4 MB 104.0 MB/s eta 0:00:00
Collecting importlib-resources>=3.2.0 (from matplotlib->object-detection==0.1)
  Downloading importlib_resources-5.12.0-py3-none-any.whl (36 kB)
Collecting tensorflow-io-gcs-filesystem==0.32.0 (from tensorflow_io->object-detection==0.1)
  Downloading t

Collecting scikit-learn>=0.21.3 (from seqeval->tf-models-official>=2.5.1->object-detection==0.1)
  Downloading scikit_learn-1.2.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.8/9.8 MB 106.4 MB/s eta 0:00:00
Collecting typeguard<3.0.0,>=2.7 (from tensorflow-addons->tf-models-official>=2.5.1->object-detection==0.1)
  Downloading typeguard-2.13.3-py3-none-any.whl (17 kB)
Collecting array-record (from tensorflow-datasets->tf-models-official>=2.5.1->object-detection==0.1)
  Downloading array_record-0.4.0-py38-none-any.whl (3.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.0/3.0 MB 48.9 MB/s eta 0:00:00
Collecting click (from tensorflow-datasets->tf-models-official>=2.5.1->object-detection==0.1)
  Downloading click-8.1.3-py3-none-any.whl (96 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 96.6/96.6 kB 21.8 MB/s eta 0:00:00
Collecting etils[enp,epath]>=0.9.0 (from tensorflow-datasets->tf-models-official>=2.5.1->objec

  Building wheel for jax (pyproject.toml): finished with status 'done'
  Created wheel for jax: filename=jax-0.4.13-py3-none-any.whl size=1518707 sha256=4b1e4c0af1dc0d094623142e08de14341a8da9bdf071a446d73c270663e79304
  Stored in directory: /root/.cache/pip/wheels/46/d9/15/d2800d4089dc4c77299ac7513c6aa1036f5491edbd2bf6ba16
  Building wheel for docopt (setup.py): started
  Building wheel for docopt (setup.py): finished with status 'done'
  Created wheel for docopt: filename=docopt-0.6.2-py2.py3-none-any.whl size=13706 sha256=f245ac8eee4a1da1470ce4061090871010f534faacc23206999ff83197eb3dee
  Stored in directory: /root/.cache/pip/wheels/56/ea/58/ead137b087d9e326852a851351d1debf4ada529b6ac0ec4e8c
  Building wheel for promise (setup.py): started
  Building wheel for promise (setup.py): finished with status 'done'
  Created wheel for promise: filename=promise-2.3-py3-none-any.whl size=21485 sha256=9ee960ea31257027eb89c082c05364ac826ce8c87e896d5a5d6f1d8a8194b75b
  Stored in directory: /root/.

Collecting gevent (from sagemaker-training)
  Downloading gevent-22.10.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.5/6.5 MB 45.6 MB/s eta 0:00:00
Collecting inotify_simple==1.2.1 (from sagemaker-training)
  Downloading inotify_simple-1.2.1.tar.gz (7.9 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Collecting paramiko>=2.4.2 (from sagemaker-training)
  Downloading paramiko-3.2.0-py3-none-any.whl (224 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 224.2/224.2 kB 48.1 MB/s eta 0:00:00
Collecting protobuf<=3.20.3,>=3.9.2 (from sagemaker-training)
  Downloading protobuf-3.20.3-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.0/1.0 MB 109.9 MB/s eta 0:00:00
Collecting bcrypt>=3.2 (from paramiko>=2.4.2->sagemaker-training)
  Downloading bcrypt-4.0.1-cp36-abi3-manylinux_2_28_x86_64.whl (593 kB)
     ━━━━

40865fd: Pushing  1.498GB/3.605GB

40865fd: Pushed   3.641GB/3.605GB

To verify that the image was correctly pushed to the [Elastic Container Registry](https://aws.amazon.com/ecr/), you can look at it in the AWS console. The example below shows three different images pushed to ECR; in your account you should see only one, called `tf2-object-detection`.
![ECR Example](../data/example_ecr.png)


In [19]:
# display the container name
with open(os.path.join('docker', 'ecr_image_fullname.txt'), 'r') as f:
    # the first line holds the full image URI; strip the trailing newline
    container = f.readline().strip()

print(container)

406551161089.dkr.ecr.us-east-1.amazonaws.com/tf2-object-detection:20230629021217
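
The image name printed above packs the account ID, region, repository and tag into one string. As a quick sanity check, it can be split into its parts; the sketch below is a minimal example with a hypothetical `parse_ecr_image` helper, and the regular expression is an assumption based on the shape of the URI printed by this notebook, not an official AWS grammar.

```python
import re

# Pattern inferred from the URI printed above (assumption, not an AWS spec).
ECR_IMAGE_RE = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:]+):(?P<tag>.+)$"
)


def parse_ecr_image(uri):
    """Return a dict with account, region, repository and tag."""
    match = ECR_IMAGE_RE.match(uri.strip())
    if match is None:
        raise ValueError(f"unexpected ECR image URI: {uri!r}")
    return match.groupdict()


# Value printed by the cell above.
container = "406551161089.dkr.ecr.us-east-1.amazonaws.com/tf2-object-detection:20230629021217"
parts = parse_ecr_image(container)
print(parts["repository"], parts["region"], parts["tag"])
```

If the repository is not `tf2-object-detection`, or the region differs from the one your notebook instance runs in, the build script most likely pushed to a different place than expected.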


## Pre-trained model from model zoo

As is common practice, we do not train from scratch; instead we use a pretrained model from the TF Object Detection model zoo. You can find pretrained checkpoints [here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md). Because your time is limited for this project, we recommend experimenting only with the following models:
* SSD MobileNet V2 FPNLite 640x640	
* SSD ResNet50 V1 FPN 640x640 (RetinaNet50)	
* Faster R-CNN ResNet50 V1 640x640	
* EfficientDet D1 640x640	
* Faster R-CNN ResNet152 V1 640x640	

In the `%%bash` cell below, the SSD MobileNet V1 FPN 640x640 checkpoint is downloaded and extracted. Adjust this code if you want to experiment with other architectures.
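
When switching architectures, only the archive name changes in the download and extraction commands. The sketch below builds both commands from a model name so the name is written once; `checkpoint_commands` is a hypothetical helper, and the `20200711` release folder is taken from the single URL used in this notebook, so it is an assumption that may not hold for every model in the zoo. Copy the exact archive name and folder from the model zoo page.

```python
# Base URL taken from the download command used in this notebook.
BASE_URL = "http://download.tensorflow.org/models/object_detection/tf2"


def checkpoint_commands(model_name, release="20200711", dest="source_dir/checkpoint"):
    """Return the wget and tar commands for one model zoo archive.

    Hypothetical helper; `release` is an assumption and must match the
    folder shown in the model zoo link for the chosen model.
    """
    archive = f"/tmp/{model_name}.tar.gz"
    url = f"{BASE_URL}/{release}/{model_name}.tar.gz"
    return [
        f"wget -O {archive} {url}",
        # Keep only the checkpoint folder, dropping the two leading path levels.
        f"tar -zxvf {archive} --strip-components 2 "
        f"--directory {dest} {model_name}/checkpoint",
    ]


for command in checkpoint_commands("ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8"):
    print(command)
```

The printed commands can be pasted into a `%%bash` cell after the `rm -rf` and `mkdir` lines.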

In [39]:
%%bash
rm -rf /tmp/checkpoint
rm -rf source_dir/checkpoint
mkdir /tmp/checkpoint
mkdir source_dir/checkpoint
wget -O /tmp/ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8.tar.gz http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8.tar.gz
tar -zxvf /tmp/ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8.tar.gz --strip-components 2 --directory source_dir/checkpoint ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8/checkpoint

--2023-06-29 03:58:27--  http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8.tar.gz
Resolving download.tensorflow.org (download.tensorflow.org)... 172.253.115.128, 2607:f8b0:4004:c09::80
Connecting to download.tensorflow.org (download.tensorflow.org)|172.253.115.128|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 90453990 (86M) [application/x-tar]
Saving to: ‘/tmp/ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8.tar.gz’

     0K .......... .......... .......... .......... ..........  0% 9.57M 9s
    50K .......... .......... .......... .......... ..........  0% 18.8M 7s
   100K .......... .......... .......... .......... ..........  0% 18.5M 6s
   150K .......... .......... .......... .......... ..........  0% 18.2M 6s
   200K .......... .......... .......... .......... ..........  0% 55.9M 5s
   250K .......... .......... .......... .......... ..........  0% 53.2M 4s
   300K .......... .......... .......... ..

  5100K .......... .......... .......... .......... ..........  5%  114M 2s
  5150K .......... .......... .......... .......... ..........  5%  222M 2s
  5200K .......... .......... .......... .......... ..........  5%  314M 2s
  5250K .......... .......... .......... .......... ..........  5% 83.9M 2s
  5300K .......... .......... .......... .......... ..........  6% 80.9M 2s
  5350K .......... .......... .......... .......... ..........  6% 95.6M 1s
  5400K .......... .......... .......... .......... ..........  6%  195M 1s
  5450K .......... .......... .......... .......... ..........  6% 24.3M 2s
  5500K .......... .......... .......... .......... ..........  6% 80.3M 1s
  5550K .......... .......... .......... .......... ..........  6%  135M 1s
  5600K .......... .......... .......... .......... ..........  6%  324M 1s
  5650K .......... .......... .......... .......... ..........  6% 88.2M 1s
  5700K .......... .......... .......... .......... ..........  6%  186M 1s
  5750K ....

 10500K .......... .......... .......... .......... .......... 11% 92.3M 1s
 10550K .......... .......... .......... .......... .......... 11%  351M 1s
 10600K .......... .......... .......... .......... .......... 12%  328M 1s
 10650K .......... .......... .......... .......... .......... 12%  265M 1s
 10700K .......... .......... .......... .......... .......... 12%  193M 1s
 10750K .......... .......... .......... .......... .......... 12%  323M 1s
 10800K .......... .......... .......... .......... .......... 12% 9.31M 1s
 10850K .......... .......... .......... .......... .......... 12%  260M 1s
 10900K .......... .......... .......... .......... .......... 12%  371M 1s
 10950K .......... .......... .......... .......... .......... 12%  162M 1s
 11000K .......... .......... .......... .......... .......... 12%  308M 1s
 11050K .......... .......... .......... .......... .......... 12%  385M 1s
 11100K .......... .......... .......... .......... .......... 12%  263M 1s
 11150K ....

 15900K .......... .......... .......... .......... .......... 18%  307M 1s
 15950K .......... .......... .......... .......... .......... 18%  141M 1s
 16000K .......... .......... .......... .......... .......... 18%  264M 1s
 16050K .......... .......... .......... .......... .......... 18%  309M 1s
 16100K .......... .......... .......... .......... .......... 18%  288M 1s
 16150K .......... .......... .......... .......... .......... 18%  334M 1s
 16200K .......... .......... .......... .......... .......... 18%  141M 1s
 16250K .......... .......... .......... .......... .......... 18%  138M 1s
 16300K .......... .......... .......... .......... .......... 18%  194M 1s
 16350K .......... .......... .......... .......... .......... 18%  147M 1s
 16400K .......... .......... .......... .......... .......... 18%  194M 1s
 16450K .......... .......... .......... .......... .......... 18%  188M 1s
 16500K .......... .......... .......... .......... .......... 18%  124M 1s
 16550K ....

 21300K .......... .......... .......... .......... .......... 24%  346M 1s
 21350K .......... .......... .......... .......... .......... 24%  117M 1s
 21400K .......... .......... .......... .......... .......... 24%  227M 1s
 21450K .......... .......... .......... .......... .......... 24%  116M 1s
 21500K .......... .......... .......... .......... .......... 24%  170M 1s
 21550K .......... .......... .......... .......... .......... 24% 98.1M 1s
 21600K .......... .......... .......... .......... .......... 24% 62.0M 1s
 21650K .......... .......... .......... .......... .......... 24% 4.40M 1s
 21700K .......... .......... .......... .......... .......... 24%  343M 1s
 21750K .......... .......... .......... .......... .......... 24%  328M 1s
 21800K .......... .......... .......... .......... .......... 24%  323M 1s
 21850K .......... .......... .......... .......... .......... 24%  378M 1s
 21900K .......... .......... .......... .......... .......... 24%  348M 1s
 21950K ....

 26700K .......... .......... .......... .......... .......... 30%  265M 1s
 26750K .......... .......... .......... .......... .......... 30%  315M 1s
 26800K .......... .......... .......... .......... .......... 30%  222M 1s
 26850K .......... .......... .......... .......... .......... 30%  328M 1s
 26900K .......... .......... .......... .......... .......... 30%  388M 1s
 26950K .......... .......... .......... .......... .......... 30%  221M 1s
 27000K .......... .......... .......... .......... .......... 30%  320M 1s
 27050K .......... .......... .......... .......... .......... 30%  315M 1s
 27100K .......... .......... .......... .......... .......... 30%  200M 1s
 27150K .......... .......... .......... .......... .......... 30%  286M 1s
 27200K .......... .......... .......... .......... .......... 30%  274M 1s
 27250K .......... .......... .......... .......... .......... 30%  376M 1s
 27300K .......... .......... .......... .......... .......... 30%  358M 1s
 27350K ....

 32100K .......... .......... .......... .......... .......... 36%  376M 1s
 32150K .......... .......... .......... .......... .......... 36%  204M 1s
 32200K .......... .......... .......... .......... .......... 36%  132M 1s
 32250K .......... .......... .......... .......... .......... 36%  220M 1s
 32300K .......... .......... .......... .......... .......... 36%  213M 1s
 32350K .......... .......... .......... .......... .......... 36%  225M 1s
 32400K .......... .......... .......... .......... .......... 36%  206M 1s
 32450K .......... .......... .......... .......... .......... 36%  250M 1s
 32500K .......... .......... .......... .......... .......... 36%  202M 1s
 32550K .......... .......... .......... .......... .......... 36%  378M 1s
 32600K .......... .......... .......... .......... .......... 36%  303M 1s
 32650K .......... .......... .......... .......... .......... 37%  380M 1s
 32700K .......... .......... .......... .......... .......... 37%  390M 1s
 32750K ....

 37500K .......... .......... .......... .......... .......... 42%  384M 1s
 37550K .......... .......... .......... .......... .......... 42%  372M 1s
 37600K .......... .......... .......... .......... .......... 42%  212M 1s
 37650K .......... .......... .......... .......... .......... 42%  331M 1s
 37700K .......... .......... .......... .......... .......... 42%  397M 1s
 37750K .......... .......... .......... .......... .......... 42%  351M 1s
 37800K .......... .......... .......... .......... .......... 42%  225M 1s
 37850K .......... .......... .......... .......... .......... 42%  219M 1s
 37900K .......... .......... .......... .......... .......... 42%  286M 1s
 37950K .......... .......... .......... .......... .......... 43%  394M 1s
 38000K .......... .......... .......... .......... .......... 43%  286M 1s
 38050K .......... .......... .......... .......... .......... 43%  366M 1s
 38100K .......... .......... .......... .......... .......... 43%  317M 1s
 38150K ....

 42900K .......... .......... .......... .......... .......... 48%  387M 0s
 42950K .......... .......... .......... .......... .......... 48% 4.60M 0s
 43000K .......... .......... .......... .......... .......... 48% 49.9M 0s
 43050K .......... .......... .......... .......... .......... 48%  273M 0s
 43100K .......... .......... .......... .......... .......... 48%  161M 0s
 43150K .......... .......... .......... .......... .......... 48%  262M 0s
 43200K .......... .......... .......... .......... .......... 48%  215M 0s
 43250K .......... .......... .......... .......... .......... 49%  344M 0s
 43300K .......... .......... .......... .......... .......... 49%  386M 0s
 43350K .......... .......... .......... .......... .......... 49%  357M 0s
 43400K .......... .......... .......... .......... .......... 49%  305M 0s
 43450K .......... .......... .......... .......... .......... 49%  351M 0s
 43500K .......... .......... .......... .......... .......... 49%  353M 0s
 43550K ....

 48300K .......... .......... .......... .......... .......... 54% 54.4M 0s
 48350K .......... .......... .......... .......... .......... 54% 15.6M 0s
 48400K .......... .......... .......... .......... .......... 54%  271M 0s
 48450K .......... .......... .......... .......... .......... 54%  387M 0s
 48500K .......... .......... .......... .......... .......... 54%  288M 0s
 48550K .......... .......... .......... .......... .......... 55%  390M 0s
 48600K .......... .......... .......... .......... .......... 55%  207M 0s
 48650K .......... .......... .......... .......... .......... 55%  329M 0s
 48700K .......... .......... .......... .......... .......... 55%  239M 0s
 48750K .......... .......... .......... .......... .......... 55%  372M 0s
 48800K .......... .......... .......... .......... .......... 55%  198M 0s
 48850K .......... .......... .......... .......... .......... 55%  386M 0s
 48900K .......... .......... .......... .......... .......... 55%  353M 0s
 48950K ....

 53700K .......... .......... .......... .......... .......... 60%  197M 0s
 53750K .......... .......... .......... .......... .......... 60%  341M 0s
 53800K .......... .......... .......... .......... .......... 60%  278M 0s
 53850K .......... .......... .......... .......... .......... 61%  380M 0s
 53900K .......... .......... .......... .......... .......... 61%  393M 0s
 53950K .......... .......... .......... .......... .......... 61%  359M 0s
 54000K .......... .......... .......... .......... .......... 61%  288M 0s
 54050K .......... .......... .......... .......... .......... 61%  342M 0s
 54100K .......... .......... .......... .......... .......... 61%  392M 0s
 54150K .......... .......... .......... .......... .......... 61%  224M 0s
 54200K .......... .......... .......... .......... .......... 61%  296M 0s
 54250K .......... .......... .......... .......... .......... 61% 51.5M 0s
 54300K .......... .......... .......... .......... .......... 61%  243M 0s
 54350K ....

 59100K .......... .......... .......... .......... .......... 66%  214M 0s
 59150K .......... .......... .......... .......... .......... 67% 18.4M 0s
 59200K .......... .......... .......... .......... .......... 67% 83.3M 0s
 59250K .......... .......... .......... .......... .......... 67%  334M 0s
ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8/checkpoint/ckpt-0.data-00000-of-00001
ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8/checkpoint/checkpoint
ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8/checkpoint/ckpt-0.index


## Edit pipeline.config file

The [`pipeline.config`](source_dir/pipeline.config) in the `source_dir` folder should be updated when you experiment with different models. The different config files are available [here](https://github.com/tensorflow/models/tree/master/research/object_detection/configs/tf2).

> Note: The provided `pipeline.config` file works well with the `EfficientDet` model. You will need to modify it when working with other models.
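
As a minimal sketch of such an edit, the snippet below rewrites scalar fields in a config's text with a regular expression. The helper name `set_config_field` and the sample text are hypothetical, and the values `3` and `8` are only illustrative. The field names `num_classes` and `batch_size` are assumed to match the file you are editing, so check them against your own `pipeline.config`.

```python
import re


def set_config_field(config_text: str, field: str, value: str) -> str:
    """Replace every `field: <old>` line in the text with `field: <value>`."""
    # Match the field name at the start of a line (after indentation only),
    # and keep the indentation and the "field: " prefix untouched.
    pattern = re.compile(
        rf"^([ \t]*{re.escape(field)}[ \t]*:[ \t]*).*$", re.MULTILINE
    )
    updated, count = pattern.subn(lambda m: m.group(1) + value, config_text)
    if count == 0:
        raise KeyError(f"field {field!r} not found in config text")
    return updated


# Hypothetical, heavily shortened config text used only for illustration.
sample = """model {
  ssd {
    num_classes: 90
  }
}
train_config {
  batch_size: 64
}
"""

edited = set_config_field(sample, "num_classes", "3")
edited = set_config_field(edited, "batch_size", "8")
print(edited)
```

A plain text substitution like this keeps the rest of the file byte-for-byte identical, which makes the change easy to review in a diff. It does not validate the result against the protobuf schema.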

## Launch Training Job

Now that we have a dataset, a Docker image, and pretrained model weights, we can launch the training job. To do so, we create a [Sagemaker Framework](https://sagemaker.readthedocs.io/en/stable/frameworks/index.html), where we specify the container image, the name of the config file, the number of training steps, and so on.

The `run_training.sh` script does the following:
* train the model for `num_train_steps` 
* evaluate over the val dataset
* export the model

Several metrics are displayed during the evaluation phase, including the mean average precision (mAP). Use these metrics to quantify your model's performance and to compare it across iterations.
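
To compare runs, it can help to pull these metrics out of the log text. The sketch below assumes the evaluation output uses the COCO summary layout seen in this notebook's training output; the sample lines are copied from that output. `parse_coco_metrics` is a hypothetical helper, not part of the Object Detection API.

```python
import re

# One COCO summary line, e.g.
# " Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.195"
METRIC_LINE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\) @\[ IoU=([\d.:]+)\s*\| "
    r"area=\s*(\w+) \| maxDets=\s*(\d+) \] = (-?\d+\.\d+)"
)


def parse_coco_metrics(log_text):
    """Return one dict per complete COCO summary line found in the log text."""
    results = []
    for match in METRIC_LINE.finditer(log_text):
        kind, iou, area, max_dets, value = match.groups()
        results.append(
            {
                "kind": kind,            # "AP" or "AR"
                "iou": iou,              # e.g. "0.50:0.95"
                "area": area,            # "all", "small", "medium" or "large"
                "max_dets": int(max_dets),
                "value": float(value),
            }
        )
    return results


sample_log = """\
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.195
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.376
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.174
"""

metrics = parse_coco_metrics(sample_log)
for m in metrics:
    print(m["kind"], m["iou"], m["area"], m["value"])
```

Lines that are cut off before the numeric value are skipped rather than guessed at.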

You can also monitor the training progress by navigating to **Training -> Training Jobs** from the Amazon Sagemaker dashboard in the Web UI.
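
If you prefer to check the job status from code, one option is to poll the SageMaker `DescribeTrainingJob` API. The following is a sketch under a few assumptions: `wait_for_training_job` is a hypothetical helper, the client is expected to behave like `boto3.client("sagemaker")`, and the polling interval is arbitrary.

```python
import time

# Statuses after which a training job no longer changes.
TERMINAL_STATES = {"Completed", "Failed", "Stopped"}


def wait_for_training_job(sm_client, job_name, poll_seconds=60, sleep=time.sleep):
    """Poll a training job until it reaches a terminal status, then return it."""
    while True:
        desc = sm_client.describe_training_job(TrainingJobName=job_name)
        status = desc["TrainingJobStatus"]
        secondary = desc.get("SecondaryStatus", "n/a")
        print(f"{job_name}: {status} ({secondary})")
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)


# Usage (requires AWS credentials; the job name is the one printed by estimator.fit):
# import boto3
# wait_for_training_job(
#     boto3.client("sagemaker"),
#     "tf2-object-detection-2023-06-29-04-06-46-203",
# )
```

Passing the client and the `sleep` function as arguments keeps the helper easy to exercise with a stub, without touching AWS.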

In [41]:
tensorboard_output_config = sagemaker.debugger.TensorBoardOutputConfig(
    s3_output_path=tensorboard_s3_prefix,
    container_local_output_path='/opt/training/'
)

estimator = CustomFramework(
    role=role,
    image_uri=container,
    entry_point='run_training.sh',
    source_dir='source_dir/',
    hyperparameters={
        "model_dir":"/opt/training",        
        "pipeline_config_path": "ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8.config",
        "num_train_steps": "2000",    
        "sample_1_of_n_eval_examples": "1"
    },
    instance_count=1,
    instance_type='ml.m5.2xlarge',
    tensorboard_output_config=tensorboard_output_config,
    disable_profiler=True,
    base_job_name='tf2-object-detection'
)

estimator.fit(inputs)

INFO:botocore.credentials:Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole


Using provided s3_resource


INFO:sagemaker:Creating training-job with name: tf2-object-detection-2023-06-29-04-06-46-203


2023-06-29 04:06:49 Starting - Starting the training job...
2023-06-29 04:07:05 Starting - Preparing the instances for training......
2023-06-29 04:08:03 Downloading - Downloading input data...
2023-06-29 04:08:25 Training - Downloading the training image............
2023-06-29 04:10:51 Training - Training image download completed. Training in progress.....
2023-06-29 04:11:16,353 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2023-06-29 04:11:16,356 sagemaker-training-toolkit INFO     No Neurons detected (normal if no neurons installed)
2023-06-29 04:11:16,369 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2023-06-29 04:11:16,372 sagemaker-training-toolkit INFO     No Neurons detected (normal if no neurons installed)
2023-06-29 04:11:16,385 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2023-06-29 04:11:16,387 sagemaker-training-t

Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
W0629 04:11:28.332491 140603438024512 deprecation.py:364] From /usr/local/lib/python3.8/dist-packages/tensorflow/python/util/dispatch.py:1176: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
W0629 04:11:30.884773 140603438024512 deprecation.py:364] From /usr/local/lib/python3.8/dist-packages/tensorflow/python/util/dispatch.py:1176: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 i

INFO:tensorflow:Step 800 per-step time 10.007s
I0629 06:23:17.144958 140603438024512 model_lib_v2.py:705] Step 800 per-step time 10.007s
INFO:tensorflow:{'Loss/classification_loss': 0.20956759,
 'Loss/localization_loss': 0.23887336,
 'Loss/regularization_loss': 0.77658945,
 'Loss/total_loss': 1.2250304,
 'learning_rate': 0.023999799}
I0629 06:23:17.145213 140603438024512 model_lib_v2.py:708] {'Loss/classification_loss': 0.20956759,
 'Loss/localization_loss': 0.23887336,
 'Loss/regularization_loss': 0.77658945,
 'Loss/total_loss': 1.2250304,
 'learning_rate': 0.023999799}
INFO:tensorflow:Step 900 per-step time 10.000s
I0629 06:39:57.112847 140603438024512 model_lib_v2.py:705] Step 900 per-step time 10.000s
INFO:tensorflow:{'Loss/classification_loss': 0.19667624,
 'Loss/localization_loss': 0.25020242,
 'Loss/regularization_loss': 0.7752411,
 'Loss/total_loss': 1.2221198,
 'learning_rate': 0.025333151}
I0629 06:39:57.1131

Instructions for updating:
Use `tf.cast` instead.
W0629 09:43:18.530823 140337997973312 deprecation.py:364] From /usr/local/lib/python3.8/dist-packages/tensorflow/python/util/dispatch.py:1176: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0629 09:43:21.052653 140337997973312 module_wrapper.py:149] From /usr/local/lib/python3.8/dist-packages/object_detection/builders/optimizer_builder.py:124: The name tf.keras.optimizers.SGD is deprecated. Please use tf.keras.optimizers.legacy.SGD instead.
INFO:tensorflow:Waiting for new checkpoint at /opt/training
I0629 09:43:21.052998 140337997973312 checkpoint_utils.py:168] Waiting for new checkpoint at /opt/training
INFO:tensorflow:Found new checkpoint at /opt/training/ckpt-3
I0629 09:43:21.053523 140337997973312 checkpoint_utils.py:177] Found new checkpoint a

INFO:tensorflow:Timed-out waiting for a checkpoint.
I0629 09:52:47.123873 140337997973312 checkpoint_utils.py:231] Timed-out waiting for a checkpoint.
creating index...
index created!
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=38.69s).
Accumulating evaluation results...
DONE (t=1.84s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.195
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.376
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.174
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.090
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.493
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.611
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.

You should be able to see your training job in the AWS web console, as shown below:
![Training jobs example](../data/example_trainings.png)


## Improve on the initial model

Most likely, this initial experiment did not yield optimal results. However, you can make several changes to the `pipeline.config` file to improve the model. One obvious change is to improve the data augmentation strategy. The [`preprocessor.proto`](https://github.com/tensorflow/models/blob/master/research/object_detection/protos/preprocessor.proto) file lists the data augmentation methods available in the TF Object Detection API. Justify your choice of augmentations in the writeup.
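
As a rough illustration, augmentations are declared as `data_augmentation_options` blocks inside `train_config`. The sketch below only renders such blocks as text so you can paste them into the file. `render_augmentations` is a hypothetical helper, and the step and field names (`random_horizontal_flip`, `random_adjust_brightness`, `random_adjust_contrast`, `max_delta`, `min_delta`) are written from memory of `preprocessor.proto`, so verify them against that file. The numeric values are illustrative, not recommendations.

```python
def render_augmentations(options):
    """Render {step_name: {field: value}} as data_augmentation_options blocks."""
    blocks = []
    for step, fields in options.items():
        lines = ["  data_augmentation_options {", f"    {step} {{"]
        lines += [f"      {name}: {value}" for name, value in fields.items()]
        lines += ["    }", "  }"]
        blocks.append("\n".join(lines))
    return "\n".join(blocks)


snippet = render_augmentations(
    {
        "random_horizontal_flip": {},  # no fields: use the step's defaults
        "random_adjust_brightness": {"max_delta": 0.2},
        "random_adjust_contrast": {"min_delta": 0.8, "max_delta": 1.25},
    }
)
print(snippet)
```

Brightness and contrast changes are a plausible starting point for driving footage with varied lighting, but whether they help on this dataset is something to confirm with the evaluation metrics.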

Other options are also available:
* experiment with the optimizer: type of optimizer, learning rate, scheduler, etc.
* experiment with the architecture. The TF Object Detection API model zoo offers many architectures. Keep in mind that the `pipeline.config` file is specific to each architecture, so you will have to edit it.
* visualize results on the test frames using the `2_deploy_model` notebook available in this repository.
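
When experimenting with the scheduler, it helps to know what learning rate a given step receives. The sketch below implements a linear warmup followed by cosine decay. The default parameter values are assumptions, chosen because they reproduce the learning rates logged at steps 800 and 900 in the training output above; the actual values in your `pipeline.config` may differ, and the function is a simplified stand-in rather than the API's own implementation.

```python
import math


def cosine_decay_with_warmup(
    step,
    learning_rate_base=0.04,        # assumed peak learning rate
    total_steps=25000,              # assumed length of the full schedule
    warmup_learning_rate=0.013333,  # assumed learning rate at step 0
    warmup_steps=2000,              # assumed length of the linear warmup
):
    """Learning rate at `step`: linear warmup, then cosine decay towards zero."""
    if step < warmup_steps:
        slope = (learning_rate_base - warmup_learning_rate) / warmup_steps
        return warmup_learning_rate + slope * step
    # Fraction of the decay phase completed, clamped so the rate stays at zero
    # once total_steps is reached.
    progress = min(1.0, (step - warmup_steps) / (total_steps - warmup_steps))
    return 0.5 * learning_rate_base * (1.0 + math.cos(math.pi * progress))


for step in (0, 800, 900, 2000, 25000):
    print(step, round(cosine_decay_with_warmup(step), 6))
```

If these assumed values do match the config, a run with `num_train_steps` set to 2000 would spend its whole budget in the warmup phase, so shortening `warmup_steps` and `total_steps` to fit the run is one experiment worth considering.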

In the cell below, describe the approaches you experimented with, why you chose them, and what you would have done with more time and resources. Justify your choices using the TensorBoard visualizations (take screenshots and insert them in your writeup), the metrics on the evaluation set, and the animation you generated with [this tool](../2_run_inference/2_deploy_model.ipynb).

In [None]:
# your writeup goes here.