# WPILib ML Notebook


## Introduction

By using this notebook, you can train a TensorFlow Lite model for use on a Raspberry Pi and Google Coral USB Accelerator. We've designed this process to be as simple as possible. If you find an issue with this notebook, please create a new issue report on our [GitHub page](https://github.com/GrantPerkins/CoralSagemaker), where you downloaded this notebook.

### Training

1. Download the WPILIB dataset as a .tar file [here](https://github.com/GrantPerkins/CoralSagemaker/releases/download/v1/WPILib.tar)
2. Upload your .tar file to a new folder in an Amazon S3 bucket, or a brand new S3 bucket.
3. Create a new SageMaker notebook instance, and open the WPILib notebook.
4. Change estimator.fit() in the last code cell to use your new dataset, by specifying the folder in which the tar is stored.
5. Run the code block.
6. Training should take roughly 10 minutes and cost roughly \\$0.55 if using the GPU instance, or 45 minutes and cost roughly \\$0.45 if using the CPU instance. If you do not change anything in the notebook, other than the S3 location, it should absolutely not take longer than an hour.

## Notebook


In [10]:
%%sh
#!/usr/bin/env bash
docker system prune --force > /dev/null 2>&1

# The name of our algorithm
algorithm_name=wpi-gpu

cd container

chmod -R +x coral/

account=$(aws sts get-caller-identity --query Account --output text)

# Get the region defined in the current configuration (default to us-west-2 if none defined)
region=$(aws configure get region)
region=${region:us-east-1}

fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"

# If the repository doesn't exist in ECR, create it.

aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

# Get the login command from ECR and execute it directly
$(aws ecr get-login --region ${region} --no-include-email) 2> /dev/null

# Build the docker image locally with the image name and then push it to ECR
# with the full name.

docker build  -t ${algorithm_name} . --no-cache # > /dev/null 2>&1
docker tag ${algorithm_name} ${fullname} > /dev/null 2>&1

docker push ${fullname} > /dev/null 2>&1



#!/bin/bash

# By Grant Perkins, 2019

# Make directories used in training
mkdir /tensorflow/models/research/learn
mkdir /tensorflow/models/research/learn/ckpt

# Remove directories created during training
rm -rf /tensorflow/models/research/learn/train
rm -rf /tensorflow/models/research/learn/models

cd /tensorflow/models/research

# Download checkpoints for MobileNet v2 hosted by Google
echo "\nDownloading model"
./prepare_checkpoint_and_dataset.sh --network_type mobilenet_v2_ssd --train_whole_model false  > /dev/null 2>/dev/null

./tar_to_record.sh

cd /opt/ml/input/data/training
classes=$(python3 /tensorflow/models/research/labels.py)
cd /tensorflow/models/research

sed -i "s%NUM_CLASSES%${classes}%g" "./pipeline.config"

# Copy custom pipeline into docker
cp pipeline.config /tensorflow/models/research/learn/ckpt/pipeline.config

# Get number of epochs from SageMaker
TRAIN_STEPS=$(python3 hyper.py)

echo "Beginning training on Docker image"
./retrain_detection_model.sh --num_trainin


Could not connect to the endpoint URL: "https://api.ecr.1.amazonaws.com/"


This step runs the training instance (default for GPU is a ml.p3.2xlarge and for the default is CPU is an ml.c4.2xlarge), and begins training with the data specified in `fit()`

This section has lots of configurable values
You need to change `estimator.fit(...)`:to be the location of the data used for training. (the bucket you uploaded the .tar to) It should be in the format `"s3://BUCKET-NAME"`


In [6]:
from sagemaker.estimator import Estimator
from sagemaker import get_execution_role


# Uses GPU by default, change to false to use CPU
use_gpu = True

role = get_execution_role()

instance_type = None
algorithm_name = None

import boto3

client = boto3.client('sts')
account = client.get_caller_identity()['Account']

my_session = boto3.session.Session()
region = my_session.region_name

if not use_gpu:
    instance_type = 'ml.c4.2xlarge'
    algorithm_name = 'sagemaker-tf-wpi'
else:
    instance_type = 'ml.p3.2xlarge'
    algorithm_name = 'wpi-gpu'

# The number of epochs to train to. 500 is a safe number. With the default instance, it should take 45 minutes.
hyperparameters = {'epochs': 500}

ecr_image = '{}.dkr.ecr.{}.amazonaws.com/{}:latest'.format(account, region, algorithm_name)
print(ecr_image)

# The estimator object, using our notebook, training instance, the ECR image, and the specified training steps
estimator = Estimator(role=role,
                      train_instance_count=1,
                      train_instance_type=instance_type,
                      image_name=ecr_image,
                      hyperparameters=hyperparameters)

# Change this bucket if you want to train with your own data. The WPILib bucket contains thousands of high quality labeled images.
# s3://wpilib
estimator.fit("s3://wpilib")


249838237784.dkr.ecr.us-east-1.amazonaws.com/wpi-gpu:latest
2019-11-04 05:01:56 Starting - Starting the training job...
2019-11-04 05:01:57 Starting - Launching requested ML instances......
2019-11-04 05:03:04 Starting - Preparing the instances for training...
2019-11-04 05:03:55 Downloading - Downloading input data...
2019-11-04 05:04:09 Training - Downloading the training image............
2019-11-04 05:06:12 Training - Training image download completed. Training in progress.[31mDownloading model[0m
[31mSuccessfully created the TFRecords: /opt/ml/input/data/training/train.record[0m
[31mSuccessfully created the TFRecords: /opt/ml/input/data/training/eval.record[0m
[31mRecords generated.[0m
[31mBeginning training on Docker image[0m
[31m+ num_training_steps=500[0m
[31m+ [[ 4 -gt 0 ]][0m
[31m+ case "$1" in[0m
[31m+ num_training_steps=500[0m
[31m+ shift 2[0m
[31m+ [[ 2 -gt 0 ]][0m
[31m+ case "$1" in[0m
[31m+ num_eval_steps=1[0m
[31m+ shift 2[0m
[31m+ [[ 0 -gt 

KeyboardInterrupt: 

### Running inference on your trained model.

1. After running all of the code in this notebook, go to the training job in SageMaker, scroll to the bottom, and find the output S3 location
2. Download the the .tar file in the bucket, extract it, and get your .tflite file
3. FTP output.tflite into the directory SD_CARD:/home/pi.
4. Run the python script, using `python3 object_detection.py --team 190`


