<a href="https://colab.research.google.com/github/lu-lab/frcnn-all-in-one/blob/main/colab/Faster_R_CNN_inferencing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Faster R-CNN inferencing
---
This notebook lets you run inference with a pre-trained object detector on Google's GPU resources. Enable the GPU by going to Runtime -> Change runtime type and selecting "GPU" from the dropdown menu. This speeds up inferencing substantially, but note that Google limits how much of this GPU resource you can use. If you use it heavily, you may have to subscribe to a paid plan.

We recommend reading this notebook through **in full** before starting. Then, you can run the cells in this notebook. The cells in Step 0-Step 2 must be run **in order**. The actual inferencing, performed in Step 3, includes multiple cells, of which you can run any or all before moving to Step 4. 

**Note**: If the session disconnects, you will have to re-run all the cells leading up to inferencing.

### Step 0: Copy this notebook and download pre-trained models


---


First off, **save a copy of this notebook to your own Google Drive!** This is partly to protect your data and partly so you can save any changes you make to this notebook. We recommend putting it in its own folder, because we will be making a few sub-folders to organize the inputs and outputs of the network and the code.

At this point you should also **download the pre-trained model you'd like to test**. Once you've downloaded the model, extract it if needed. Then make a folder called 'model' in the same folder as this notebook in your Google Drive. For models created with Tensorflow 1, upload the label_map.pbtxt file and the frozen_inference_graph.pb file to that folder. For models created with Tensorflow 2, upload the label_map.pbtxt file and the 'saved_model' folder. The models we fine-tuned for our paper can be found [here](https://doi.org/10.6084/m9.figshare.13678705.v1), and were all created with Tensorflow 1. Here's a quick guide to which models from the paper we'd recommend trying for which imaging conditions:

*   The **developmental model** (referenced as worms-on-plate in the paper) tends to work best for imaging conditions where the worm is relatively small and on an agar plate, seeded or unseeded. 
*   The **egg model** (referenced as egg-finder in the paper) tends to work best for data where the worm and/or eggs are imaged at relatively high resolution.
*   The **aging model** is highly tuned for animals in a specific microfluidic device, and in our experience it does not generalize as well as the other models. However, if you believe your data look similar to the data we use with this model in our paper (see Figures 1 and 4), it may be worth a try. 


You can also find many pre-trained models [here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md) and [here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md) at the Tensorflow 1 and 2 model zoos. Just note that the lightweight tool for detection and visualization we built will only look at classes 1 and 2 of the model - any detections of other classes of object will be ignored!
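
To make the last point concrete, here is a minimal sketch of what "only classes 1 and 2" means in practice. The detection dictionaries and the `keep_supported_classes` helper are hypothetical and are not part of `inferencing_tools`; they only illustrate the filtering behaviour described above.

```python
# Hypothetical illustration, not the actual inferencing_tools implementation.
SUPPORTED_CLASSES = {1, 2}  # the two class ids the notebook's tools use

def keep_supported_classes(detections):
    """Return only the detections whose class id is 1 or 2."""
    return [d for d in detections if d["class"] in SUPPORTED_CLASSES]

# Made-up detections, e.g. from a model zoo model with many classes
detections = [
    {"class": 1, "score": 0.92},
    {"class": 17, "score": 0.88},
    {"class": 2, "score": 0.67},
]
kept = keep_supported_classes(detections)
print(kept)
```

If you use a model zoo model, check its label map to see which objects are assigned to ids 1 and 2 before interpreting the boxes.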

**Note**: Following this notebook will require several GB of space in your Google Drive, in addition to whatever space you may need for any image data you may want to perform inferencing on. 
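
Before moving on, it can help to confirm that the 'model' folder holds the files described in this step. The sketch below is one way to do that with the standard library; `detect_model_type` is a hypothetical helper (not part of this repository), and it assumes the file and folder names given above.

```python
import os

def detect_model_type(model_dir):
    """Guess whether model_dir holds a Tensorflow 1 or Tensorflow 2 model.

    Returns 'tf1' or 'tf2', or raises FileNotFoundError naming what is missing.
    """
    label_map = os.path.join(model_dir, "label_map.pbtxt")
    frozen_graph = os.path.join(model_dir, "frozen_inference_graph.pb")
    saved_model = os.path.join(model_dir, "saved_model")

    # Both model types need the label map
    if not os.path.isfile(label_map):
        raise FileNotFoundError("label_map.pbtxt not found in " + model_dir)
    if os.path.isfile(frozen_graph):
        return "tf1"
    if os.path.isdir(saved_model):
        return "tf2"
    raise FileNotFoundError(
        "Neither frozen_inference_graph.pb nor a saved_model folder found in "
        + model_dir
    )
```

Once Google Drive is mounted and the working directory is set in Step 1, calling `detect_model_type('./model')` should return `'tf1'` for the models from our paper.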


### Step 1: Connect to your Google Drive

---


The first code cell below will mount Google Drive, get files we need from the GitHub repository to run this notebook, and make a new folder that we will put inferencing results in. 

Before running the cell below, make sure to modify the path following the first '%cd' to the path this notebook is in! If you have spaces in the path, add a \ before the space. For example, the path 

```
/content/drive/My Drive/Colab Notebooks/Faster R-CNN/
```

becomes 
```
/content/drive/My\ Drive/Colab\ Notebooks/Faster\ R-CNN/
```
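
If your path is long, the escaping can be generated rather than typed by hand. This is only a small sketch; `escape_spaces` is a hypothetical helper name, and it handles spaces only, not other special characters.

```python
def escape_spaces(path):
    """Put a backslash before every space so the path can follow %cd."""
    return path.replace(" ", "\\ ")

print(escape_spaces("/content/drive/My Drive/Colab Notebooks/Faster R-CNN/"))
# prints /content/drive/My\ Drive/Colab\ Notebooks/Faster\ R-CNN/
```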



In [None]:
# First we need to mount Google drive and gather some dependencies...
from google.colab import drive 
drive.mount('/content/drive')
import os

# NOTE: Modify this path if needed
%cd /content/drive/My\ Drive/Colab\ Notebooks/Faster\ R-CNN\ inferencing/
working_dir = os.getcwd()

!git clone https://github.com/lu-lab/frcnn-all-in-one.git
%cd {working_dir}
!cp -a ./frcnn-all-in-one/colab/. .
!rm -r ./frcnn-all-in-one
# Since you already have the inferencing notebook in your google drive, delete the second copy that is in the repository
!rm Faster_R_CNN_inferencing.ipynb

# -p means no error is raised if this folder already exists
!mkdir -p inferencing-results


### Step 2: Install libraries

---

Next we install all the necessary Python libraries and packages to inference using a pre-trained model in this notebook. 

If you've already run this notebook before, you will already have the Tensorflow models repository in a folder called Tensorflow and you can prevent re-cloning that repo by using a # to comment out the line 
```!git clone https://github.com/tensorflow/models.git #```. However, every time you restart the notebook, you do need to re-install each of the libraries below, so don't eliminate anything else. 
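
As an alternative to commenting the line out by hand, one could check whether the repository folder is already present. The sketch below assumes the repository was cloned into `Tensorflow/models` inside your working directory, as the cell below does; `models_repo_present` is a hypothetical helper, not part of this notebook's code.

```python
import os

def models_repo_present(working_dir):
    """Return True if <working_dir>/Tensorflow/models/research already exists."""
    research_dir = os.path.join(working_dir, "Tensorflow", "models", "research")
    return os.path.isdir(research_dir)
```

In the install cell, the `!git clone` line could then be run only when `models_repo_present(working_dir)` returns `False`. The library installation lines still need to run in every new session.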

In [None]:
# You will need to re-install everything if you've restarted your session
%mkdir Tensorflow 
%cd ./Tensorflow
!git clone https://github.com/tensorflow/models.git #
%cd {working_dir}
!apt-get -qq install -y protobuf-compiler python-pil python-lxml python-tk 
%cd ./Tensorflow/models/research/
!protoc object_detection/protos/*.proto --python_out=.
%cd {working_dir}

%cd Tensorflow/models/research/
!cp object_detection/packages/tf2/setup.py .
!python -m pip install .
%cd {working_dir}

!pip install Cython==0.29.21
!pip install h5py==2.10.0
!pip install opencv-python-headless==4.4.0.44
!pip install scipy
!pip install tensorflow

### **Step 3: Test inferencing**

---



Now that we've set up the environment, we can use the pre-trained model to inference, that is, to predict bounding boxes on new images for the classes the model was trained on. Below are a few options for how to do this, depending on whether your input is a list of images, a folder of images, or a movie. This step will help you visualize detections and give you a qualitative idea of how well the model performs with your data. **You can run one or more of the cells below to inference with different types of input data.**

In each example below, we first get inferences for each image or frame in the list, folder, or movie and store them in an h5 file. This file type is convenient because the whole file need not be loaded into memory to access its contents. Then, we use the boxes and scores in the h5 file to draw boxes around where the model detects worms or eggs (or whatever classes are assigned to labels 1 and 2). For convenience, we also provide a converter to save h5 data to .csv files that can be read by Excel or Google Sheets. 

The inferencing for models created with Tensorflow 1 and Tensorflow 2 is a little different. All the examples below show how to inference using a Tensorflow 1 model, but you can follow the instructions below to modify a couple lines so that you can inference using a Tensorflow 2 model.

#### **Adapting each cell for Tensorflow 2 models**
All the examples below are designed for use with Tensorflow 1 models, since the models from our paper are Tensorflow 1 models. However, you can easily adapt this to Tensorflow 2 models by changing a few lines in each of the cells below. First, find the lines 

```
cnn = inferencing_tools.CNN_tf1(path_to_frozen_graph, path_to_labels, save_to_hdf, h5_file) #comment for Tensorflow 2 models
# cnn = inferencing_tools.CNN_tf2(path_to_frozen_graph, path_to_labels, save_to_hdf, h5_file) #uncomment for Tensorflow 2 models
```
To switch to Tensorflow 2, use a # to comment out the first of these two lines, and remove the # from the second line. 

Then, for Tensorflow 2 models, change the ```path_to_frozen_graph``` variable to  
```path_to_frozen_graph = './model/saved_model'```

assuming that your model is saved in the folder 'model'.

#### **Inferencing for a list of images**
First, let's see how the model performs by feeding in a list of image names. The images themselves are in the 'data' -> 'test_images' folder. If you've run the cells above, you'll see a file 'image_list.xlsx'. This file contains a list with the paths to the images you want to inference in a column, with 'filename' labelling the column in the first row. Once we gather all the detections for each image in the list, we'll superimpose boxes indicating where the objects are, with red boxes being worms and blue boxes being eggs (assuming worms and eggs correspond to classes 1 and 2). The images with boxes imposed will be displayed below the following cell, and they'll be saved in the 'inferencing-results' folder, with each image having the same name as the original image. 
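
Because `cv2.imread` returns `None` instead of raising an error when a file cannot be read, a mistyped path in the spreadsheet can be hard to spot. A small check like the following, run on the list of paths before inferencing, may help. It is only a sketch, and `find_missing_files` is a hypothetical helper rather than part of `inferencing_tools`.

```python
import os

def find_missing_files(paths):
    """Return the entries of `paths` that do not point to an existing file."""
    return [p for p in paths if not os.path.isfile(p)]
```

For example, after reading the spreadsheet in the cell below, `find_missing_files(test_image_paths)` should return an empty list if every listed image can be found.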

In [None]:
import os
import sys
import glob
import IPython
from IPython.display import Image, display
import cv2
import pandas as pd
from inference_code import inferencing_tools


path_to_frozen_graph = './model/frozen_inference_graph.pb'
# path to the label map that belongs to the model
path_to_labels = './model/label_map.pbtxt'
save_to_hdf = True
save_path = './inferencing-results'
h5_file = os.path.join(save_path, 'test_detections.h5')

# pull out the test image names
excel_filepath = './data/image_list.xlsx'
images = pd.read_excel(excel_filepath)
test_image_paths = images['filename']

# every inference you make with this 'cnn' object will be saved in the same h5 file
cnn = inferencing_tools.CNN_tf1(path_to_frozen_graph, path_to_labels, save_to_hdf, h5_file) #comment for Tensorflow 2 models
# cnn = inferencing_tools.CNN_tf2(path_to_frozen_graph, path_to_labels, save_to_hdf, h5_file) #uncomment for Tensorflow 2 models

for fn in test_image_paths:
    image_np = cv2.imread(fn)
    key = os.path.basename(fn)
    #inferencing happens in this call - the h5 file will store information using the 'key' variable as part of the key
    worm_boxes, egg_boxes = cnn.get_eggs_and_worms(image_np, key)

# Now that we have all the detections, label them on the test data and visualize the detections on each test image.
inferencing_tools.label_all_detections_from_h5(h5_file, test_image_paths.to_list(), save_path)
for image_name in glob.glob('./inferencing-results/*.png'): #assuming png
    display(Image(filename=image_name))

#### **Inferencing from a folder of images**
This is very similar to the example above. The difference is that here we detect all worms and eggs from .png or .jpg images in a folder ```image_dir```. Edit the ```image_dir``` variable below to lead to the folder with your data! The data must be on Google Drive.

Here we again save the boxes and scores to an h5 file in the 'inferencing-results' directory. We then overlay the detections on top of the original images and save them in the 'inferencing-results' folder with the same name as the original image. As before, red boxes are worms and blue boxes are eggs.
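
Note that the cell below keeps only file names that end in exactly `.jpg` or `.png`, so a file such as `worm.PNG` would be skipped, and its final display loop shows only `.png` results. If your files use mixed-case extensions, a case-insensitive variant along these lines could be used instead. This is a sketch; `list_images` is a hypothetical helper name.

```python
import os

def list_images(image_dir, extensions=(".jpg", ".png")):
    """Return sorted file names in image_dir whose extension matches, ignoring case."""
    wanted = tuple(ext.lower() for ext in extensions)
    return sorted(
        name for name in os.listdir(image_dir)
        if os.path.isfile(os.path.join(image_dir, name))
        and name.lower().endswith(wanted)
    )
```

The returned list could be assigned to `image_names` in place of the list comprehension in the cell below.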

In [None]:
import os
import sys
import glob
import IPython
from IPython.display import Image, display
import cv2
import pandas as pd
from inference_code import inferencing_tools

# set up inputs to the inferencer
path_to_frozen_graph = './model/frozen_inference_graph.pb'
path_to_labels = './model/label_map.pbtxt'
save_to_hdf = True
save_path = './inferencing-results'
h5_file = os.path.join(save_path, 'folder_detections.h5')

# every inference you make with this 'cnn' object will be saved in the same h5 file
cnn = inferencing_tools.CNN_tf1(path_to_frozen_graph, path_to_labels, save_to_hdf, h5_file) #comment for Tensorflow 2 models
# cnn = inferencing_tools.CNN_tf2(path_to_frozen_graph, path_to_labels, save_to_hdf, h5_file) #uncomment for Tensorflow 2 models

# take a look at qualitatively how well the model performs on images in the 'image_dir' folder
# edit the 'image_dir' variable below to reflect where your data is stored!
image_dir = './data/test_images'
ext_list = ['.jpg', '.png'] # if you have another file format, you can add it here. most formats should work.
image_names = [f for f in os.listdir(image_dir) if os.path.isfile(os.path.join(image_dir, f))
                             and f.endswith(tuple(ext_list))]

for fn in image_names:
    image_np = cv2.imread(os.path.join(image_dir, fn))
    #inferencing happens in this call - you can directly use the boxes this returns if you wish. They are also saved in the h5 file
    worm_boxes, egg_boxes = cnn.get_eggs_and_worms(image_np, fn)

# Now that we have all the detections, label them on the test data and visualize the detections on each test image.
image_paths = [os.path.join(image_dir, fn) for fn in image_names]
inferencing_tools.label_all_detections_from_h5(h5_file, image_paths, save_path)
for image_name in glob.glob('./inferencing-results/*.png'): 
    display(Image(filename=image_name))

#### **Inferencing from a movie**
Here's an example that detects both eggs and worms for each frame in a video with the path assigned to ```movie_path``` and saves the detections to an h5 file.  The movie with detections overlaid can then be created with the filename assigned to the ```save_file``` variable. The ```movie_path``` must lead to a movie uploaded to Google Drive.



In [None]:
import os
import sys
import glob
import cv2
import pandas as pd
from inference_code import inferencing_tools

path_to_frozen_graph = './model/frozen_inference_graph.pb'
# path to the label map that belongs to the model
path_to_labels = './model/label_map.pbtxt'
save_to_hdf = True
save_path = './inferencing-results'
h5_file = os.path.join(save_path, 'movie_detections.h5')

# every inference you make with this 'cnn' object will be saved in the same h5 file
cnn = inferencing_tools.CNN_tf1(path_to_frozen_graph, path_to_labels, save_to_hdf, h5_file) #comment for Tensorflow 2 models
#cnn = inferencing_tools.CNN_tf2(path_to_frozen_graph, path_to_labels, save_to_hdf, h5_file) #uncomment for Tensorflow 2 models

# detect and visualize detections from movie
# the input can be most video formats, the output file should be an mp4 file if
# you wish to view the video in this notebook.
movie_path = 'your-video-path.mp4'
save_file = './inferencing-results/test_nn.mp4'

vid = cv2.VideoCapture(movie_path)
idx = 1
while vid.isOpened():
    ret, image = vid.read()
    if ret:
        #inferencing happens in this call
        worm_boxes, egg_boxes = cnn.get_eggs_and_worms(image, idx)
    else:
        break
    print("Processing frame no %s" % idx)
    idx += 1
vid.release()

# if the input of the inferencing is a video, the output will be a video
inferencing_tools.label_all_detections_from_h5(h5_file, movie_path, save_file)

### **Step 4: Convert h5 file to csv**

---


Before running this cell, you **must** run one of the inferencing cells above, or you'll get an error. 

By replacing the path assigned to ```h5_file``` below with the path to your own h5 file, you can convert it to a csv file. As set up, the cell converts the file ```'./inferencing-results/folder_detections.h5'``` to the csv file ```'./inferencing-results/bounding_boxes_from_folder.csv'```. If you used the inferencing cell above that performs inferencing on a folder of images with the default naming, the cell below will work as is; otherwise, simply modify the ```h5_file``` variable. By default, ```csv_filepath``` points to the 'inferencing-results' folder, but feel free to rename or relocate this file as you like. 


In the resulting file, the 'frame' column is either the movie frame number or the image name. The 'xmin', 'xmax', 'ymin', and 'ymax' columns are all expressed as a proportion of the total size of the image or frame (e.g. the upper left corner of a box is at (xmin × image_width, ymin × image_height)). The 'class' column holds an integer that corresponds to the classes in your label_map.pbtxt file. If you used the same classes as in the annotation notebook, then class 1 is a worm and class 2 is an egg. Finally, the 'score' column indicates the confidence the model has in a detection, with scores close to 1 being highly confident and scores close to 0 having very low confidence. 
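
As an illustration of how these columns can be used, the sketch below parses rows in this layout, drops low-scoring detections, and converts the proportional coordinates to pixels. The example rows, the `boxes_in_pixels` helper, and the 0.5 score threshold are all made up for illustration; choose a threshold that suits your data.

```python
import csv
import io

def boxes_in_pixels(csv_text, image_width, image_height, min_score=0.5):
    """Parse detection rows and convert proportional boxes to pixel coordinates.

    Rows with a score below min_score are dropped.
    """
    boxes = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if float(row["score"]) < min_score:
            continue
        boxes.append({
            "frame": row["frame"],
            # int(float(...)) also copes with class ids written as '1.0'
            "class": int(float(row["class"])),
            # x values scale with the width, y values with the height
            "left": round(float(row["xmin"]) * image_width),
            "top": round(float(row["ymin"]) * image_height),
            "right": round(float(row["xmax"]) * image_width),
            "bottom": round(float(row["ymax"]) * image_height),
        })
    return boxes

# Made-up rows in the column layout described above
example = """frame,xmin,xmax,ymin,ymax,class,score
img1.png,0.25,0.75,0.5,1.0,1,0.9
img1.png,0.1,0.2,0.1,0.2,2,0.3
"""
print(boxes_in_pixels(example, image_width=640, image_height=480))
# prints [{'frame': 'img1.png', 'class': 1, 'left': 160, 'top': 240, 'right': 480, 'bottom': 480}]
```

To use this on a real file, read the csv produced by the cell below into a string and pass in the width and height of your own images.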





In [None]:
import os
import h5py
import pandas as pd
from inference_code import inferencing_tools

save_path = './inferencing-results'
h5_file = os.path.join(save_path, 'folder_detections.h5')
xmins = []
xmaxs = []
ymins = []
ymaxs = []
classes = []
scores = []
frame_no = []


with h5py.File(h5_file, 'r') as hf:
  keys = list(hf.keys())
  # look for the identifier at the end of the key - for movies this is the frame number, 
  # for images it is the image name
  frames = [txt.split('_')[-1] for txt in keys]
  for frame in frames:
    # get info about frame
    # 'class_ids' avoids shadowing Python's built-in id()
    xmin, ymin, xmax, ymax, class_ids, score = inferencing_tools.get_boxes_from_h5(frame, hf)
    xmins.extend(xmin)
    ymins.extend(ymin)
    xmaxs.extend(xmax)
    ymaxs.extend(ymax)
    classes.extend(class_ids)
    scores.extend(score)
    frame_no.extend([frame]*len(score))

data = {'frame': frame_no, 'xmin': xmins, 'xmax':xmaxs, 'ymin':ymins, 
        'ymax':ymaxs, 'class':classes, 'score':scores}
df = pd.DataFrame(data).drop_duplicates()

# save csv with detections to 'inferencing-results' directory
csv_filepath = './inferencing-results/bounding_boxes_from_folder.csv'
df.to_csv(csv_filepath, index=False)

### Step 5: Training (Optional)

---

If you're not satisfied with the pre-trained model's performance, it's straightforward to train a model by following the directions in our binder notebook [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/lu-lab/frcnn-all-in-one/HEAD) and colab training notebook [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lu-lab/frcnn-all-in-one/blob/main/colab/Faster_R_CNN_training.ipynb). 


