##### *Copyright 2020 Google LLC*
*Licensed under the Apache License, Version 2.0 (the "License")*

In [None]:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Retrain a classification model for the AIY Vision Kit (with TF1)

In this tutorial, we'll use TensorFlow to retrain an image classification model (MobileNet) with a floors dataset, and compile it into a format that's compatible with the AIY Vision Bonnet (included in the AIY Vision Kit).

All the code examples on this page are executable from your web browser, but you must execute them in order. So although you can run each code block individually, we recommend that you run everything by selecting **Runtime > Run all** in the Colab toolbar. That allows the training to get started while you read the tutorial.

**Note:** Although the code on this page executes in the cloud, you **should not open this page on your AIY Vision Kit**. The Colab web app is quite complex and the Raspberry Pi Zero cannot run it effectively (it will be very slow). Instead, use your desktop computer to run the Colab tutorial and then transfer the files to the Raspberry Pi (as described below).

<a href="https://colab.research.google.com/github/google/aiyprojects-raspbian/blob/aiyprojects/tutorials/vision/aiy_retrain_classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"></a>
&nbsp;&nbsp;&nbsp;&nbsp;
<a href="https://github.com/google/aiyprojects-raspbian/blob/aiyprojects/tutorials/vision/aiy_retrain_classification.ipynb" target="_parent"><img src="https://img.shields.io/static/v1?logo=GitHub&label=&color=333333&style=flat&message=View%20on%20GitHub" alt="View in GitHub"></a>


## Import the required libraries

First, we need to remove the version of TensorFlow that's included with Google Colab by default, and replace it with TensorFlow 1.13.1, as required by the following training scripts.

In [None]:
! pip uninstall tensorflow tensorboard -y

In [None]:
! pip install -I absl-py==0.9 jupyter-client==6.1.5 tornado==5.1.0 folium==0.2.1 imgaug==0.2.5 tensorflow==1.13.1 tensorboard==1.13.1

The output above might say "You must restart the runtime", but you can ignore that. You **do not** need to restart.

In [None]:
import tensorflow as tf
assert tf.__version__.startswith('1.13.1')

## Prepare the training data

First let's download and organize the floors dataset we'll use to retrain the model (it contains 5 floor classes).

Pay attention to this part so you can reproduce it with your own image dataset. In particular, notice that the "floor_photos" directory contains one directory per class, each named after its class. (If you want to retrain the model with different photos, we'll discuss this more at the very end.)
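
To check that a dataset follows this layout, a small sketch like the following can count the images in each class directory. The helper name `count_images_per_class` and the list of file extensions are assumptions made for illustration; they are not part of the tutorial's scripts.

```python
import os

# Assumed set of image file extensions; adjust it to match your dataset.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')


def count_images_per_class(image_dir):
    """Return {class_name: image_count} for each subdirectory of image_dir."""
    counts = {}
    for entry in sorted(os.listdir(image_dir)):
        class_dir = os.path.join(image_dir, entry)
        if not os.path.isdir(class_dir):
            continue  # Skip stray files such as a LICENSE or README.
        counts[entry] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(IMAGE_EXTENSIONS))
    return counts


if __name__ == '__main__':
    # Path used in this tutorial, relative to the tensorflow-for-poets-2 folder.
    image_dir = 'tf_files/floor_photos'
    if os.path.isdir(image_dir):
        for label, count in count_images_per_class(image_dir).items():
            print('{}: {} images'.format(label, count))
```

A class directory that reports zero images usually means the photos ended up one level too deep or use an extension that is not in the list.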

In [None]:
! git clone https://github.com/googlecodelabs/tensorflow-for-poets-2

%cd tensorflow-for-poets-2

In [None]:
! curl https://media.githubusercontent.com/media/lxy1992/elevator_bot/floor/dataset/floors.tgz | tar xz -C tf_files

In [None]:
! ls tf_files/floor_photos

## Retrain the model

First specify the input image size (this is for both width and height; the model's input expects a square image) and the depth multiplier for the MobileNet model. 

Based on our testing, the only variations compatible with the Vision Bonnet's ML accelerator are the following:

+ Input size = 160x160, and depth multiplier = 0.5
+ Input size = 192x192, and depth multiplier = 1.0

In [None]:
IMAGE_SIZE='160'
MULTIPLIER='0.50'
%env ARCHITECTURE=mobilenet_{MULTIPLIER}_{IMAGE_SIZE}
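
If you plan to switch between the two supported variations, a guard like the sketch below can catch an unsupported combination before training starts. The helper `architecture_name` is hypothetical, and the `'1.0'` spelling of the full-width multiplier is an assumption modeled on the `'0.50'` string used above.

```python
# The two (input size, depth multiplier) pairs this tutorial lists as
# compatible with the Vision Bonnet. The '1.0' spelling is an assumption.
COMPATIBLE = {('160', '0.50'), ('192', '1.0')}


def architecture_name(image_size, multiplier):
    """Build the architecture string, rejecting unsupported combinations."""
    if (image_size, multiplier) not in COMPATIBLE:
        raise ValueError(
            'Unsupported combination: size={}, multiplier={}'.format(
                image_size, multiplier))
    return 'mobilenet_{}_{}'.format(multiplier, image_size)


print(architecture_name('160', '0.50'))
```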

Now start training the model with the floor photos:


In [None]:
! python scripts/retrain.py \
  --bottleneck_dir=tf_files/bottlenecks \
  --how_many_training_steps=500 \
  --model_dir=tf_files/models/ \
  --summaries_dir=tf_files/training_summaries/$ARCHITECTURE \
  --output_graph=tf_files/retrained_graph.pb \
  --output_labels=tf_files/retrained_labels.txt \
  --architecture=$ARCHITECTURE \
  --image_dir=tf_files/floor_photos

## Compile the model for the Vision Kit

The training script above creates a TensorFlow model that you can run on a CPU, but we want to run this on the AIY Vision Bonnet's ML accelerator (the Myriad 2450). So we need to compile the model for that chip.

First download the Vision Bonnet model compiler:

In [None]:
! curl -LO https://dl.google.com/dl/aiyprojects/vision/bonnet_model_compiler_latest.tgz

! tar -xzf bonnet_model_compiler_latest.tgz

Then compile the model:

In [None]:
! ./bonnet_model_compiler.par \
  --frozen_graph_path=tf_files/retrained_graph.pb \
  --output_graph_path=tf_files/retrained_graph.binaryproto \
  --input_tensor_name=input \
  --output_tensor_names=final_result \
  --input_tensor_size=160 \
  --debug

Don't worry if you see an error like this:

```
Check failed: toco::port::file::GetContents(FLAGS_frozen_graph_path, &frozen_graph_contents, toco::port::file::Options())
```

Just click the Play button on the code cell above to run it again. It should work this time.
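
If you would rather not re-run the cell by hand, the retry can be scripted. This is only a sketch: the helper `run_with_retry` is hypothetical, and it assumes the compiler exits with a non-zero status when the check above fails.

```python
import subprocess
import sys

# The same compiler invocation as the cell above, as an argument list.
COMPILE_COMMAND = [
    './bonnet_model_compiler.par',
    '--frozen_graph_path=tf_files/retrained_graph.pb',
    '--output_graph_path=tf_files/retrained_graph.binaryproto',
    '--input_tensor_name=input',
    '--output_tensor_names=final_result',
    '--input_tensor_size=160',
    '--debug',
]


def run_with_retry(command, attempts=2):
    """Run command until it exits with status 0; return the attempt number."""
    for attempt in range(1, attempts + 1):
        returncode = subprocess.call(command)
        if returncode == 0:
            return attempt
        print('Attempt {} failed with exit code {}'.format(attempt, returncode),
              file=sys.stderr)
    raise RuntimeError('Command failed after {} attempts'.format(attempts))
```

Calling `run_with_retry(COMPILE_COMMAND)` from the `tensorflow-for-poets-2` directory would then run the compiler at most twice.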

That's it. Your retrained model is ready to run on the Vision Kit.

## Download the model


The compiled model is saved into this Colab runtime's temporary storage. You can download it to your computer like this:

1. Open the **Files** tab in the left panel.
2. Expand the **tensorflow-for-poets-2** folder and then the **tf_files** folder.
3. Right-click on `retrained_graph.binaryproto` and select **Download**.
4. Also download `retrained_labels.txt`.

## Copy the files to the Raspberry Pi

First you need to get the Raspberry Pi's IP address so you can transfer the model files using SSH:

1. Log into the Raspberry Pi and follow the [instructions to enable remote SSH access](https://www.raspberrypi.org/documentation/remote-access/ssh/).
2. On the Pi, open the terminal and run this command:

   ```
   hostname -I
   ```

3. Write down the first IP address that's printed. For example, `192.168.86.24` (yours might be different).
4. Also make sure the Raspberry Pi is [connected to Wi-Fi](https://www.raspberrypi.org/documentation/configuration/wireless/desktop.md) (it must be on the same Wi-Fi network as your desktop computer).
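
`hostname -I` prints every address assigned to the Pi on one line, separated by spaces. If you want to pick out the first one in a script, a minimal sketch might look like this (the helper `first_ip_address` is hypothetical):

```python
import ipaddress


def first_ip_address(hostname_output):
    """Return the first address in `hostname -I` output, or None if empty."""
    fields = hostname_output.split()
    if not fields:
        return None
    # Raises ValueError if the first field is not a valid IP address.
    ipaddress.ip_address(fields[0])
    return fields[0]


# Example input; your addresses will be different.
print(first_ip_address('192.168.86.24 fd00::1234 \n'))
```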

Now go back to your desktop computer and transfer the files:
1. Open a terminal and navigate to where the model files were downloaded (usually the `$HOME/Downloads` directory):

  ```
  cd ~/Downloads
  ```

2. Using the Raspberry Pi's IP address, transfer the files using `scp`. For example:

  ```
  scp retrained_*  pi@192.168.86.24:/home/pi
  ```

3. If it says the authenticity of the host can't be established and asks "are you sure you want to continue?", type "yes" and press Enter.

4. When prompted, enter the password for your Pi. By default, the password is "raspberry" but you might have changed this already.

If successful, your terminal prints the files that it transferred:

```
retrained_graph.binaryproto      100% 2618KB   2.0MB/s   00:01    
retrained_labels.txt             100%   40     6.0KB/s   00:00 
```

## Run the model on the Vision Kit




1.  Log into the Raspberry Pi either via the desktop or with SSH from your desktop. If you use SSH then you can easily copy-paste the next command to run on the Pi. For example, you can SSH to the Pi like this (using the same IP address you got from above):

  ```
  ssh pi@192.168.86.24
  ```

2.  Double check that the transferred files are where you expect them:

  ```
  cd ~ && ls
  ```

  You should see `retrained_graph.binaryproto` and `retrained_labels.txt`.

3.  Make sure there are no other programs currently using the Pi Camera.

  For example, by default, the Vision Kit automatically runs the Joy Detector at startup. You can stop it with this command on the Pi:

  ```
  sudo systemctl stop joy_detection_demo
  ```

4.  Now run this command from the Raspberry Pi shell to test the retrained model that you transferred to the board:

  ```
  ~/AIY-projects-python/src/examples/vision/mobilenet_based_classifier.py \
      --model_path ~/retrained_graph.binaryproto \
      --label_path ~/retrained_labels.txt \
      --input_height 160 \
      --input_width 160 \
      --input_layer input \
      --output_layer final_result \
      --preview
  ```

It takes about 10 seconds for the model to load and start printing results. 

It will print the top classifications that the model detects based on what the camera sees. If you have a monitor connected to the Raspberry Pi, it also displays the camera preview with the classification printed at the top.

Try holding up images of the different floors that were included in the training dataset (the class names are listed in `retrained_labels.txt`). You can search for these floors online and point the Vision Kit at the photos on your screen to see the predictions.
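
The classes that the model can report are the ones written to `retrained_labels.txt` during training. As a small sketch, assuming that file holds one label per line, you could list them like this (the helper name `load_labels` is hypothetical):

```python
import os


def load_labels(label_path):
    """Read one class label per line, skipping blank lines."""
    with open(label_path) as label_file:
        return [line.strip() for line in label_file if line.strip()]


if __name__ == '__main__':
    # Location of the labels file after copying it to the Pi, as in this tutorial.
    label_path = os.path.expanduser('~/retrained_labels.txt')
    if os.path.exists(label_path):
        for index, label in enumerate(load_labels(label_path)):
            print(index, label)
```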

The predictions should be pretty good, but this was just a basic training example. With more experience with TensorFlow and some modifications to the training parameters used above, you can create a much more accurate model.

## Recap and next steps

In this tutorial Colab, you downloaded a bunch of floor photos and used them to retrain the MobileNet V1 image classification model.

To perform the training, we used the `retrain.py` Python script, which can quickly retrain an image classification model using a relatively small number of sample images. It does so by reusing most of the model's pre-trained weights, but replacing the top layers that perform the final classification and training new weights for that part of the model only.
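
To make the idea concrete, here is a toy sketch in plain Python. It is not what `retrain.py` actually executes: the feature vectors below are made-up stand-ins for the output of the frozen pre-trained layers, and only a small softmax layer on top of them is trained.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def compute_logits(weights, biases, vector):
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def train_top_layer(features, labels, num_classes, steps=200, learning_rate=0.5):
    """Fit a softmax classifier on fixed feature vectors with gradient descent."""
    num_features = len(features[0])
    weights = [[0.0] * num_features for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(steps):
        for vector, label in zip(features, labels):
            probabilities = softmax(compute_logits(weights, biases, vector))
            for k in range(num_classes):
                # Gradient of the cross-entropy loss with respect to logit k.
                error = probabilities[k] - (1.0 if k == label else 0.0)
                for j in range(num_features):
                    weights[k][j] -= learning_rate * error * vector[j]
                biases[k] -= learning_rate * error
    return weights, biases


def predict(weights, biases, vector):
    """Return the index of the highest-scoring class."""
    logits = compute_logits(weights, biases, vector)
    return max(range(len(logits)), key=logits.__getitem__)


# Made-up features for two classes; the "pre-trained" part is never updated.
FEATURES = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
LABELS = [0, 0, 1, 1]
weights, biases = train_top_layer(FEATURES, LABELS, num_classes=2)
print([predict(weights, biases, vector) for vector in FEATURES])
```

Because only the small top layer is updated, training needs far fewer images and far less time than training the whole network from scratch.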

To learn more about how this training script works, look at the source code for the [`retrain.py` script](https://github.com/googlecodelabs/tensorflow-for-poets-2/blob/master/scripts/retrain.py).




### Train with your own images

Using this script and all the code above, try using your own image dataset to train the model to identify different objects. To get good results, you should provide a few hundred sample images for each object you want to recognize. 

Just put the images for each object class into a folder that's named corresponding to that object (for example, see how the [floors dataset](http://download.tensorflow.org/example_images/floor_photos.tgz) is organized). 

Also be sure your images are resized appropriately; although they don't need to match the model's input size exactly, they should not be very high resolution (again, inspect the floors dataset for example image sizes).
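
When you downscale your own photos, keeping the aspect ratio avoids distorting them. The sketch below only computes the target dimensions; the helper `scaled_size` and the 500-pixel limit are assumptions chosen for illustration, not values taken from the tutorial.

```python
def scaled_size(width, height, max_side):
    """Return (width, height) scaled so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # Already small enough; leave it unchanged.
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 4000x3000 photo limited to 500 pixels on its longer side.
print(scaled_size(4000, 3000, 500))
```

You can pass the resulting size to whichever image library or tool you use for the actual resizing.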


### Build a program to run the model

A good place to start learning this part is to read the source code for the [`mobilenet_based_classifier.py` script](https://github.com/google/aiyprojects-raspbian/blob/aiyprojects/src/examples/vision/mobilenet_based_classifier.py) (used above to run the model).

You might want to copy that code and modify it so the program does what you want, such as turning on an LED or making a noise when the camera detects a specific object (try turning on the LED if the returned results include a specific label ID with a probability score higher than 0.5).

To execute an inference with each camera frame, this code uses the [`aiy.vision.inference`](https://aiyprojects.readthedocs.io/en/latest/aiy.vision.inference.html) API. So check out that API reference to better understand the example code and see what else you can do with the AIY Vision API.
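
As a starting point for the LED idea above, the decision logic can be written independently of the camera code. This sketch assumes the classification results arrive as `(label, score)` pairs; the helper name `detected` and the label names are hypothetical, so check the example script for the exact shape of the results it produces.

```python
def detected(results, target_label, threshold=0.5):
    """Return True if target_label appears with a score above threshold."""
    return any(label == target_label and score > threshold
               for label, score in results)


# Hypothetical results for one camera frame.
frame_results = [('first_floor', 0.72), ('lobby', 0.20)]
if detected(frame_results, 'first_floor'):
    print('Target detected: turn on the LED here')
```

Keeping this check in its own function makes it easy to test without the Vision Kit attached.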