## This notebook is adapted from https://github.com/microsoft/CameraTraps/blob/master/detection/megadetector_colab.ipynb

<a href="https://colab.research.google.com/github/microsoft/CameraTraps/blob/master/detection/megadetector_colab.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"/>
</a>

Link in case the above badge doesn't redirect you correctly: [Open in Colab](https://colab.research.google.com/github/microsoft/CameraTraps/blob/master/detection/megadetector_colab.ipynb)

This notebook replaces a previous example by [@louis030195](https://github.com/louis030195). Improvements: updated environment setup, MegaDetector model version and support for mounting Google Drive folders so you can process your own images here.

# Running MegaDetector on camera trap images using Google Colab
Put together by Alistair Stewart, Alice Springs, May 2020.
@alsnothome

For reference please read the [MegaDetector guide on GitHub](https://github.com/microsoft/CameraTraps/blob/master/megadetector.md) and check there for updates. Here we have roughly followed the steps for running under Linux.

This notebook is designed to load camera trap image files already uploaded onto Google Drive. If you don't have images already loaded onto Google Drive or just want to see a demo of MegaDetector in action, we also provide code to download some sample images.

The steps walk through copying all of the required model and helper files to the Colab runtime and installing the required packages. You can then connect to your Google Drive folder and process all of the images in a folder using the MegaDetector saved model. The output is saved in a JSON file, a text-based format described in this [section](https://github.com/microsoft/CameraTraps/tree/master/api/batch_processing#batch-processing-api-output-format) of the batch API user guide. The detections (as bounding boxes) can then be rendered on your images.

The Google Colab instance will only stay open for a maximum of 10-12 hours; after that it will close and any unsaved data will be lost. We recommend saving the JSON output and annotated images into your Google Drive folder for persistent storage.

## Set up the Colab instance to run on GPU processing


Navigate to Edit→Notebook Settings and select "GPU" from the Hardware Accelerator drop-down 
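
To confirm that a GPU is actually attached after changing the setting, a quick check along these lines may help. This is a minimal sketch that assumes the NVIDIA `nvidia-smi` tool is on the path of a GPU runtime; `gpu_visible` is a hypothetical helper name, not part of MegaDetector.

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if `nvidia-smi` is present and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False  # no NVIDIA driver tools on this runtime
    result = subprocess.run(
        ["nvidia-smi", "-L"],  # -L lists the attached GPUs, one per line
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU attached:", gpu_visible())
```

If this prints `False`, revisit the Hardware Accelerator setting before continuing.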

## Copy the model, install dependencies, set PYTHONPATH

Note: from here on you'll see a mix of code. Most cells are Linux system commands rather than Python. The system commands are prefixed by an exclamation mark `!`, which tells this notebook to execute them on the command line.

### Install TensorFlow v1

TensorFlow is already installed in Colab, but our scripts are not yet compatible with the newer version of TensorFlow. 

Please follow the next three steps in sequence and do not skip any steps :) If you were not able to follow these, you can reset the runtime by going to "Runtime" in the top menu and "Factory reset runtime".


1. Uninstall the existing version of TensorFlow (this doesn't affect your other Colabs, don't worry)


In [None]:
pip uninstall tensorflow

2. Install the older TensorFlow version using `pip`, with GPU processing by specifying `-gpu` and version number `1.13.1`. We also install the other required Python packages that are not already in Colab - `humanfriendly` and `jsonpickle`.

In [None]:
pip install tensorflow-gpu==1.13.1 humanfriendly jsonpickle

3. Importantly, you now need to **restart the runtime** of this Colab for it to start using the older version of TensorFlow that we just installed.

  Click on the "Runtime" option on the top menu, then "Restart runtime". After that, you can proceed with the rest of this notebook.

  Let's check that we have the right version of TensorFlow (1.13.1):

In [None]:
import tensorflow as tf
print(tf.__version__)
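
The visual check can also be turned into a hard stop, so that later cells do not run against the wrong TensorFlow by accident. The sketch below is one way to do that; `check_tf_version` is a hypothetical helper, and it takes the version string as an argument so it can be tried without TensorFlow installed.

```python
EXPECTED_TF_VERSION = "1.13.1"  # the version installed in step 2


def check_tf_version(found, expected=EXPECTED_TF_VERSION):
    """Raise a readable error if the loaded TensorFlow version is not the expected one."""
    if found != expected:
        raise RuntimeError(
            "TensorFlow {} is loaded but {} is expected; "
            "restart the runtime and rerun this cell.".format(found, expected)
        )
    return True


# In the notebook, after `import tensorflow as tf`:
# check_tf_version(tf.__version__)
```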

### Download the MegaDetector model file

Currently, v4.1 is available by direct download. The link can be found in the GitHub MegaDetector readme: MegaDetector v4.1, 2020.04.27 frozen model (.pb)

In [None]:
!wget -O /content/megadetector_v4_1_0.pb https://lilablobssc.blob.core.windows.net/models/camera_traps/megadetector/md_v4.1.0/md_v4.1.0.pb

### Clone the two required git repos
This will copy the latest version of the FFI Vietnam adaptations of the Microsoft AI for Earth "utilities" (`ai4eutils`) and "Camera Traps" repositories from GitHub. These make data handling and running the model easy. 

In [None]:
!git clone https://github.com/FFI-Vietnam/CameraTraps-FFIVietnamAdaptation
!git clone https://github.com/FFI-Vietnam/ai4eutils-FFIVietnamAdaptation

We'll also copy the Python scripts that run the model and produce visualization of results to the working directory.

In [None]:
!cp /content/CameraTraps-FFIVietnamAdaptation/detection/run_tf_detector_batch.py .
!cp /content/CameraTraps-FFIVietnamAdaptation/visualization/visualize_detector_output.py .

### Set `PYTHONPATH` to include `CameraTraps` and `ai4eutils`

Add cloned git folders to the `PYTHONPATH` environment variable so that we can import their modules from any working directory.


In [None]:
import os
os.environ['PYTHONPATH'] += ":/content/ai4eutils-FFIVietnamAdaptation"
os.environ['PYTHONPATH'] += ":/content/CameraTraps-FFIVietnamAdaptation"

!echo "PYTHONPATH: $PYTHONPATH"
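
Two details are worth keeping in mind here: `+=` on `os.environ['PYTHONPATH']` raises a `KeyError` if the variable is not already set, and changing the environment variable only affects processes started afterwards (such as the `!python ...` commands), not the `sys.path` of the running kernel. The sketch below handles both cases; `add_to_pythonpath` is a hypothetical helper name.

```python
import os
import sys


def add_to_pythonpath(*folders):
    """Append folders to PYTHONPATH (for child processes) and sys.path (for this kernel)."""
    current = os.environ.get("PYTHONPATH", "")  # empty string if the variable is unset
    parts = [p for p in current.split(os.pathsep) if p]
    for folder in folders:
        if folder not in parts:
            parts.append(folder)  # avoid duplicate entries when the cell is rerun
        if folder not in sys.path:
            sys.path.append(folder)
    os.environ["PYTHONPATH"] = os.pathsep.join(parts)
    return os.environ["PYTHONPATH"]


add_to_pythonpath(
    "/content/ai4eutils-FFIVietnamAdaptation",
    "/content/CameraTraps-FFIVietnamAdaptation",
)
```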

## Mount Google Drive in Colab
You can mount your Google Drive if you have sample images there to try MegaDetector on or want to save the results to your Google Drive.

Once you run the cell below, it will show a URL and a text box.

Visit that URL to choose the Google account where the images you want to process live. After you authenticate, an authorization code will be shown. Copy the authorization code to the text box here. 

Your Google Drive folders will then be mounted under `/content/drive` and can be viewed and navigated in the Files pane.

The method is described under this Colab code snippet: https://colab.research.google.com/notebooks/io.ipynb#scrollTo=u22w3BFiOveA. Never give out your account username and password. Read this Colab code snippet to understand how this connection is made and authenticated. There are other ways to connect your Google Drive or upload your data if you do not find this method suitable.

In [None]:
from google.colab import drive
drive.mount('/content/drive')
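
Once the drive is mounted, it can be useful to confirm that the image folder is visible and see how many images it holds before starting a long run. The following is a minimal sketch; the extension list and the `count_images` helper are assumptions made for illustration, not something MegaDetector defines.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")  # assumed set; adjust to your files


def count_images(folder, recursive=True):
    """Count image files in a folder, optionally descending into subfolders."""
    if not os.path.isdir(folder):
        raise FileNotFoundError("Folder not found: {}".format(folder))
    total = 0
    for root, dirs, files in os.walk(folder):
        total += sum(1 for name in files if name.lower().endswith(IMAGE_EXTENSIONS))
        if not recursive:
            break  # the first iteration is the top-level folder; stop there
    return total


# Example with the placeholder folder name used in this notebook:
# count_images('/content/drive/My Drive/[Image_Folder]')
```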

## MegaDetector batch processing

This step executes the Python script `run_tf_detector_batch.py` that we copied from the CameraTraps repo. It has three mandatory arguments and three optional ones:

1.   path to the MegaDetector saved model file.
2.   a folder containing images. If your images were already on Google Drive, replace `[Image_Folder]` with your folder name from Google Drive.
3.   the output JSON file location and name - replace `[Output_Folder]` with your folder name and `[output_file_name.json]` with your file name.
4.   option `--recursive` goes through all subfolders to find and process all images within.
5.   option `--checkpoint_frequency` specifies the number of inferences before saving to checkpoint file. The default is `20`, which means after running through `20` images, the model will save the progress in a checkpoint file so that it can be resumed in the next run.
6.   option `--resume_from_checkpoint` takes the path to a checkpoint file saved by an earlier run and resumes from it. This prevents rerunning everything from the beginning when the runtime is shut down.

You will need to change the image folder path and output file path, depending on your situation.
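
The arguments listed above can also be assembled in Python, which makes it easier to see which ones are positional and which are optional. This is only a sketch: `build_detector_command` is a hypothetical helper, and it uses nothing beyond the flags described in the list.

```python
import shlex


def build_detector_command(model_file, images_dir, output_file,
                           recursive=True, checkpoint_frequency=None,
                           resume_from_checkpoint=None):
    """Assemble the argument list for run_tf_detector_batch.py."""
    # The three mandatory arguments are positional and come first.
    cmd = ["python", "run_tf_detector_batch.py", model_file, images_dir, output_file]
    if recursive:
        cmd.append("--recursive")
    if checkpoint_frequency is not None:
        cmd += ["--checkpoint_frequency", str(checkpoint_frequency)]
    if resume_from_checkpoint is not None:
        cmd += ["--resume_from_checkpoint", resume_from_checkpoint]
    return cmd


cmd = build_detector_command(
    "megadetector_v4_1_0.pb",
    "/content/drive/My Drive/[Image_Folder]",
    "/content/drive/My Drive/[Output_Folder]/[output_file_name.json]",
    checkpoint_frequency=20,
)
print(" ".join(shlex.quote(part) for part in cmd))  # shell-safe preview of the command
```

Passing a list like this to `subprocess.run(cmd)` avoids quoting problems with paths that contain spaces, such as `My Drive`.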

In our experience the Colab system will take ~30 seconds to initialize and load the saved MegaDetector model. It will then iterate through all of the images in the folder specified. Processing initially takes a few seconds per image and usually settles to ~1 sec per image. That is ~60 images per minute or ~3600 images per hour.
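
These figures give a rough way to judge whether a folder will fit inside one Colab session. The sketch below simply applies the numbers quoted above (about 30 seconds of startup and about 1 second per image); both function names are hypothetical and real throughput will vary.

```python
def estimated_hours(n_images, seconds_per_image=1.0, startup_seconds=30.0):
    """Rough wall-clock estimate, in hours, for processing n_images."""
    return (startup_seconds + n_images * seconds_per_image) / 3600.0


def images_per_session(session_hours=10.0, seconds_per_image=1.0, startup_seconds=30.0):
    """Rough number of images that fit in one Colab session."""
    return int((session_hours * 3600.0 - startup_seconds) / seconds_per_image)


print(round(estimated_hours(50000), 1))  # hours for a hypothetical 50,000-image folder
print(images_per_session(10.0))          # images that fit in a 10-hour session
```

With these assumptions a 50,000-image folder needs roughly 13.9 hours, which is longer than a single session, so the checkpoint options become important for folders of that size.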

If you see the error "AssertionError: output_file specified needs to end with .json" then you haven't updated the output folder and file name in the code below properly.

In [None]:
# specify the image directory
images_dir = '/content/drive/My Drive/[Image_Folder]'

# choose a location for the output JSON file
output_file_path = '/content/drive/My Drive/[Output_Folder]/[output_file_name.json]'

# specify the location of the checkpoint file
# checkpoint_file_path = '/content/drive/My Drive/[Output_Folder]/[checkpoint_file_name.json]'
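
Before launching the batch script, the two paths can be checked for the most common mistakes, including the `.json` ending mentioned above. This is a minimal sketch under the assumption that a leftover `[` or `]` means a placeholder was not replaced; `check_paths` is a hypothetical helper.

```python
import os


def check_paths(images_dir, output_file_path):
    """Return a list of problems with the two paths; an empty list means they look usable."""
    problems = []
    for label, path in (("images_dir", images_dir), ("output_file_path", output_file_path)):
        if "[" in path or "]" in path:  # heuristic: template placeholder left in place
            problems.append("{} still contains a [placeholder]".format(label))
    if not os.path.isdir(images_dir):
        problems.append("images_dir does not exist or is not a folder")
    if not output_file_path.endswith(".json"):
        problems.append("output_file_path does not end with .json")
    output_dir = os.path.dirname(output_file_path)
    if output_dir and not os.path.isdir(output_dir):
        problems.append("the output folder does not exist")
    return problems


# In the notebook:
# for problem in check_paths(images_dir, output_file_path):
#     print("Check:", problem)
```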

In [None]:
# TEST
images_dir = '/content/drive/My Drive/FFI/MegaDetector Test/2021-07-26 Demo/156'
# choose a location for the output JSON file
output_file_path = '/content/drive/My Drive/FFI/MegaDetector Test/2021-07-26 Demo/156_output_file(1).json'
# checkpoint file from an earlier run, passed to --resume_from_checkpoint
checkpoint_file = '/content/drive/My Drive/FFI/MegaDetector Test/2021-07-26 Demo/checkpoint_20210819035609.json'

!python run_tf_detector_batch.py megadetector_v4_1_0.pb "$images_dir" "$output_file_path" --recursive --checkpoint_frequency 20 --resume_from_checkpoint "$checkpoint_file" 
# !python run_tf_detector_batch.py megadetector_v4_1_0.pb "$images_dir" "$output_file_path" --recursive --checkpoint_frequency 20

Here we pass the Python variables `images_dir` and `output_file_path` that you specified above to the shell command below using `$`. The double quotes are needed because these paths contain spaces. Keeping the output file path in a variable lets us refer to it later for visualization.

In [None]:
# !python run_tf_detector_batch.py megadetector_v4_1_0.pb "$images_dir" "$output_file_path" --recursive --checkpoint_frequency 20 --resume_from_checkpoint "$checkpoint_file_path" 
!python run_tf_detector_batch.py megadetector_v4_1_0.pb "$images_dir" "$output_file_path" --recursive --checkpoint_frequency 20
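
Once the script has finished, the output JSON can be summarized directly in Python. The sketch below assumes the layout described in the batch API output format linked earlier: a top-level `images` list whose entries carry a `detections` list with `category` and `conf` fields, plus a `detection_categories` mapping. Treat those field names as assumptions and check them against your own output file. `summarize_detections` and the `sample` data are made up for illustration.

```python
import json
from collections import Counter


def summarize_detections(results, threshold=0.8):
    """Count detections at or above threshold, grouped by category name."""
    categories = results.get("detection_categories", {})
    counts = Counter()
    for image in results.get("images", []):
        for det in image.get("detections") or []:  # tolerate a missing or null list
            if det["conf"] >= threshold:
                counts[categories.get(det["category"], det["category"])] += 1
    return dict(counts)


# Made-up data in the assumed format, only to show the call.
sample = {
    "detection_categories": {"1": "animal", "2": "person"},
    "images": [
        {"file": "a.jpg", "detections": [
            {"category": "1", "conf": 0.95, "bbox": [0.1, 0.2, 0.3, 0.4]},
            {"category": "2", "conf": 0.40, "bbox": [0.5, 0.5, 0.1, 0.1]},
        ]},
        {"file": "b.jpg", "detections": []},
    ],
}
print(summarize_detections(sample))

# In the notebook, after the batch script has finished:
# with open(output_file_path) as f:
#     print(summarize_detections(json.load(f)))
```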

## Visualize batch processing script outputs

Here we use the `visualize_detector_output.py` in the `visualization` folder of the Camera Traps repo to see the output of the MegaDetector visualized on our images. It will save images annotated with the results (original images will *not* be modified) to the `[Visualization_Folder]` you specify here.

The script takes a number of optional parameters that control the output image size and how many images are sampled (useful if you've processed a lot of images but only want to visualize the results on a few). Take a look at the `main()` function in the script to see what other parameters are available.

In [None]:
visualization_dir = '/content/[Visualization_Folder]'  # pick a location for annotated images

In [None]:
!python visualize_detector_output.py "$output_file_path" "$visualization_dir" --confidence 0.8 --images_dir "$images_dir" --output_image_width -1
print("Done")
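
For readers who want to draw or crop the boxes themselves, the main step is converting each box to pixel coordinates. The sketch below assumes that a box is stored as `[x_min, y_min, width, height]` in coordinates relative to the image size, as described in the batch API output format; verify this against your own JSON. `bbox_to_pixels` is a hypothetical helper and is not taken from `visualize_detector_output.py`.

```python
def bbox_to_pixels(bbox, image_width, image_height):
    """Convert a relative [x_min, y_min, width, height] box to pixel corners."""
    x_min, y_min, width, height = bbox
    left = int(round(x_min * image_width))
    top = int(round(y_min * image_height))
    right = int(round((x_min + width) * image_width))
    bottom = int(round((y_min + height) * image_height))
    return left, top, right, bottom


# A box covering the middle half of the width of a 1000 x 800 image:
print(bbox_to_pixels([0.25, 0.5, 0.5, 0.25], 1000, 800))
```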

In [None]:
# TEST
images_dir = '/content/drive/My Drive/FFI/MegaDetector Test/2021-07-26 Demo/156'
output_file_path = '/content/drive/My Drive/FFI/MegaDetector Test/2021-07-26 Demo/156_output_file(1).json'
visualization_dir = "/content/drive/My Drive/FFI/MegaDetector Test/2021-07-26 Demo/Annotations"  # pick a location for annotated images
!python visualize_detector_output.py "$output_file_path" "$visualization_dir" --confidence 0.7 --images_dir "$images_dir" --output_image_width -1
print("Done")
