## Evaluate Fast-RCNN model directly from Python

This notebook demonstrates evaluation of a model trained by the Fast-RCNN implementation of CNTK.

For a full description of the model and the algorithm, please see the following tutorial: https://github.com/Microsoft/CNTK/wiki/Object-Detection-using-Fast-R-CNN 

Below, you will see sample code for:
1. Preparing the input data for the network (including image size adjustments)
2. Evaluation of the input data using the model
3. Processing the evaluation result and presenting the selected regions back on the image

Before running this notebook, please make sure that:
<ol>
<li>You have version >= 2.0 of CNTK installed. Installation instructions are available here: https://github.com/Microsoft/CNTK/wiki/Setup-CNTK-on-your-machine

The current assumption is that you have CNTK installed on a Windows machine under "c:\local\cntk". You can change the CNTK path by changing the "cntk_base_path" variable that is defined below.</li>

<li>You trained the Fast R-CNN example model for the grocery dataset according to the instructions in the [tutorial above](https://github.com/Microsoft/CNTK/wiki/Object-Detection-using-Fast-R-CNN).<br>
**Important**: This example works with the BrainScript model that supports version 2.0 of CNTK.
To use the BrainScript model, make sure to use the following fastrcnn.cntk configuration file: https://github.com/Microsoft/CNTK/blob/pkranen/frcnPythonApi/Examples/Image/Detection/FastRCNN/fastrcnn.cntk

The configuration uses the BrainScript version of AlexNet. Prior to running the A2_RunCntk.py script, download the model from https://www.cntk.ai/Models/AlexNet/AlexNetBS.model and place it under the "<i>C:\local\cntk\Examples\Image\PretrainedModels</i>" directory.</li>

<li>This notebook uses the CNTK Python APIs and should be run from the CNTK Python environment. The environment can be started by running the script: "c:\local\cntk\Scripts\cntkpy34.py".</li>
</ol>
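
The file prerequisites above can be checked before running the rest of the notebook. The following is a minimal sketch and not part of the CNTK tooling: `missing_prerequisites` is a hypothetical helper, and the relative paths are the ones mentioned in this notebook, assuming the default installation layout.

```python
import os

def missing_prerequisites(base_path, relative_paths):
    """Return the entries of relative_paths that do not exist under base_path."""
    return [p for p in relative_paths
            if not os.path.exists(os.path.join(base_path, p))]

# Paths taken from this notebook; adjust them if your layout differs.
required = [
    os.path.join("Examples", "Image", "PretrainedModels", "AlexNetBS.model"),
    os.path.join("Examples", "Image", "Detection", "FastRCNN"),
]

for path in missing_prerequisites(r"C:\local\cntk", required):
    print("Missing:", path)
```

An empty output means that both locations were found under the given base path.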

In [1]:
%matplotlib inline
# the above line enables us to draw the images inside the notebook

# path to the CNTK installation
cntk_base_path = r"C:\local\cntk"

## Load the model

In [None]:
from cntk import load_model
from os.path import join
frcnn_model = load_model(join(cntk_base_path, r"Examples/Image/Detection/FastRCNN/proc/grocery_2000/cntkFiles/Output/Fast-RCNN.model"))

## Load image and convert it to the network format

The image is loaded using OpenCV, and then resized according to the network input dimensions.

When resizing, we preserve the aspect ratio and pad the border areas with a constant value (114).
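
To make the resize-and-pad arithmetic concrete, here is a small sketch that computes only the geometry, without touching any pixel data. `padded_geometry` is a hypothetical helper introduced for illustration, and the image size in the example is made up. It assumes that the longer image side is scaled to fill the target and that the shorter side is centered.

```python
def padded_geometry(img_width, img_height, width, height):
    """Return (target_w, target_h, top, bottom, left, right) for an
    aspect-preserving resize into a width x height canvas."""
    if img_width > img_height:
        # landscape: fill the width, scale the height by the same factor
        target_w = width
        target_h = int(round(img_height * width / img_width))
    else:
        # portrait or square: fill the height, scale the width
        target_h = height
        target_w = int(round(img_width * height / img_height))

    # split the leftover space between the two sides
    top = max(0, int(round((height - target_h) / 2)))
    left = max(0, int(round((width - target_w) / 2)))
    bottom = height - top - target_h
    right = width - left - target_w
    return target_w, target_h, top, bottom, left, right

# a hypothetical 600 x 1200 portrait photo placed on a 1000 x 1000 canvas
print(padded_geometry(600, 1200, 1000, 1000))  # (500, 1000, 0, 0, 250, 250)
```

The photo is scaled to 500 x 1000 and receives 250 pixels of padding on the left and on the right.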

In [None]:
import cv2
import numpy as np
import matplotlib.pyplot as plt

image_height = 1000
image_width = 1000 

def resize_and_pad(img, width, height, pad_value=114):
    # port of the c++ code from CNTK: https://github.com/Microsoft/CNTK/blob/f686879b654285d06d75c69ee266e9d4b7b87bc4/Source/Readers/ImageReader/ImageTransformers.cpp#L316
    img_width = len(img[0])
    img_height = len(img)
    
    scale_w = img_width > img_height
    
    target_w = width
    target_h = height
    
    if scale_w:
        target_h = int(np.round(img_height * float(width) / float(img_width)))
    else:
        target_w = int(np.round(img_width * float(height) / float(img_height)))
        
    resized = cv2.resize(img, (target_w, target_h), interpolation=cv2.INTER_NEAREST)
    
    top = int(max(0, np.round((height - target_h) / 2)))
    left = int(max(0, np.round((width - target_w) / 2)))
    
    bottom = height - top - target_h
    right = width - left - target_w
    
    resized_with_pad = cv2.copyMakeBorder(resized, top, bottom, left, right, 
                                          cv2.BORDER_CONSTANT, value=[pad_value, pad_value, pad_value])
        
    # transpose(2,0,1) converts the image from HWC to the CHW layout which CNTK accepts
    model_arg_rep = np.ascontiguousarray(np.array(resized_with_pad, dtype=np.float32).transpose(2,0,1))
    
    return resized_with_pad, model_arg_rep

def load_image_and_scale(image_path, width, height, pad_value=114):
    img = cv2.imread(image_path)
    return resize_and_pad(img, width, height, pad_value), img

test_image_path = join(cntk_base_path, r"Examples/Image/DataSets/grocery/testImages/WIN_20160803_11_28_42_Pro.jpg")
(test_img, test_img_model_arg), original_img = load_image_and_scale(test_image_path, image_width, image_height)

plt.imshow(cv2.cvtColor(test_img, cv2.COLOR_BGR2RGB))
plt.axis("off")

## Define ROIs for testing

Current ROI data is taken from the example test file. 
ROIs can be produced using a sliding window or selective search. The given ROIs are in the format [x, y, w, h], expressed relative to the scaled and padded image.

The ROIs are padded with regions of [0,0,0,0] at the end to match the 2000 ROIs that the model expects.

We use a helper function to find where the padding starts and store the length of the actual list, without the padding, in order to use it later in the evaluation.
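
As an illustration of the padding convention, the sketch below pads a short ROI list to a fixed length and then recovers the number of real ROIs. `pad_rois` and `count_real_rois` are hypothetical helpers, and the ROI values are made up. The approach assumes that no genuine ROI is all zeros.

```python
def pad_rois(rois, total):
    """Append [0, 0, 0, 0] entries until the list holds `total` ROIs."""
    if len(rois) > total:
        raise ValueError("more ROIs than the model accepts")
    padding = [[0.0, 0.0, 0.0, 0.0] for _ in range(total - len(rois))]
    return rois + padding

def count_real_rois(padded):
    """Index of the first all-zero ROI, or len(padded) if there is none."""
    for i, roi in enumerate(padded):
        if all(v == 0 for v in roi):
            return i
    return len(padded)

rois = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.5, 0.25, 0.25]]
padded = pad_rois(rois, 2000)
print(len(padded), count_real_rois(padded))  # 2000 2
```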

As a final step, we also convert the coordinates of the ROIs back to the coordinates of the original image.
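
The coordinate conversion can be followed on a single ROI. This is a sketch under stated assumptions: `roi_to_original` is a hypothetical helper, the ROI is given as fractions of the padded image, the image was scaled so that its longer side fills the padded image, and results are rounded to the nearest pixel. The image size in the example is made up.

```python
def roi_to_original(roi, img_width, img_height, dest_width, dest_height):
    """Map a relative [x, y, w, h] ROI on the padded image to
    pixel [x1, y1, x2, y2] on the original image."""
    # one scale factor applies to both axes because the aspect ratio is preserved
    if img_width > img_height:
        scale = dest_width / img_width
    else:
        scale = dest_height / img_height

    # padding, in pixels of the padded image
    left = max(0, round((dest_width - round(img_width * scale)) / 2))
    top = max(0, round((dest_height - round(img_height * scale)) / 2))

    x, y, w, h = roi
    x1 = round((x * dest_width - left) / scale)
    y1 = round((y * dest_height - top) / scale)
    x2 = x1 + round(w * dest_width / scale)
    y2 = y1 + round(h * dest_height / scale)
    return [x1, y1, x2, y2]

# hypothetical 600 x 1200 image on a 1000 x 1000 canvas (250 px padding left and right)
print(roi_to_original([0.25, 0.0, 0.5, 1.0], 600, 1200, 1000, 1000))  # [0, 0, 600, 1200]
```

The ROI that covers exactly the unpadded area maps back to the full original image, which is a quick sanity check for the conversion.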

In [None]:
import numpy as np

#read the test rois from file and convert into a float matrix of 2000 X 4
with open('./test_rois.txt', 'r') as rois_file:
    rois_sample=rois_file.read().replace('\n', '')
rr = rois_sample.split(" ")
test_rois = np.array([[float(rr[i*4]), float(rr[i*4+1]), float(rr[i*4+2]), float(rr[i*4+3])] for i in range(int(len(rr)/4))])

# find where the padding begins in case we read the rois from a file or use some test data
# in case we generate the rois in the script we should know this in advance
def find_roi_padding(rois):
    # this could be made more efficient, e.g. with a binary search;
    # in any case, it's something that we should already know when we run the evaluation
    for i in range(len(rois)):
        roi = rois[i]
        if (roi[0] == 0.0 and roi[1] == 0.0 and roi[2] == 0.0 and roi[3] == 0.0):
            return i
    # no padding found: every ROI in the list is a real one
    return len(rois)

roi_padding_index = find_roi_padding(test_rois)

## Convert the ROIs back to the image coordinates
def to_original_rois(rois, img_width, img_height, dest_width, dest_height):
    # the longer image side fills the destination; the same scale applies to both axes
    scale_w = img_width > img_height

    target_h = dest_height
    target_w = dest_width
    if scale_w:
        scale = float(dest_width) / float(img_width)
        target_h = int(np.round(img_height * scale))
    else:
        scale = float(dest_height) / float(img_height)
        target_w = int(np.round(img_width * scale))

    # padding offsets, relative to the padded image size
    top = max(0, np.round((dest_height - target_h) / 2)) / float(dest_height)
    left = max(0, np.round((dest_width - target_w) / 2)) / float(dest_width)

    original_rois = []
    for r in rois:
        # remove the padding offset, then undo the scaling
        x1 = int((r[0] - left) * dest_width / scale)
        y1 = int((r[1] - top) * dest_height / scale)
        x2 = x1 + int(r[2] * dest_width / scale)
        y2 = y1 + int(r[3] * dest_height / scale)
        original_rois.append([x1, y1, x2, y2])
    return np.array(original_rois)
        
original_rois = to_original_rois(test_rois, original_img.shape[1], original_img.shape[0], image_width, image_height)

## Evaluate the sample
Prepare the data in the argument format that CNTK expects and run it through the model.

Process the result by trimming the padded ROIs, then calculate the predicted labels and probabilities.
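
The post-processing step can be sketched in plain Python. `softmax_rows` and `argmax_rows` below are illustrative stand-ins, not the `softmax2D` helper from `cntk_helpers`, whose implementation may differ. The scores are made up, and class 0 is taken to be the background class.

```python
import math

def softmax_rows(rows):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    result = []
    for row in rows:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result

def argmax_rows(rows):
    """Index of the largest value in each row (the predicted class)."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]

# two hypothetical ROIs scored against three classes (class 0 = background)
scores = [[2.0, 1.0, 0.1], [0.5, 3.0, 0.2]]
probs = softmax_rows(scores)
labels = argmax_rows(scores)
print(labels)  # [0, 1]
```

Because softmax preserves the ordering within a row, taking the argmax of the raw scores gives the same labels as taking it over the probabilities.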

In [None]:
import sys
sys.path.append(join(cntk_base_path, r"Examples/Image/Detection/FastRCNN"))
from cntk_helpers import softmax2D

# a dummy variable for labels that will be given as an input to the network but will be ignored
dummy_labels = np.zeros((2000,17))

# prepare the arguments
arguments = {
    frcnn_model.arguments[1]: [test_img_model_arg],
    frcnn_model.arguments[0]: [test_rois],
    frcnn_model.arguments[2] : [dummy_labels]
}

# run it through the model
output = frcnn_model.eval(arguments)

# take just the relevant part and cast to float64 to prevent overflow when doing softmax
rois_values = output[frcnn_model.outputs[2]][0][0][:roi_padding_index].astype(np.float64)

# get the prediction for each roi by taking the index with the maximal value in each row 
rois_labels_predictions = np.argmax(rois_values, axis=1)

# calculate the probabilities using softmax
rois_probs = softmax2D(rois_values) 

# print the number of ROIs that were detected as non-background
print("Number of detections: %d"%np.sum(rois_labels_predictions > 0))

### Merge overlapping regions using Non-Maxima Suppression
The code below merges the overlapping detected regions using the Non-Maxima Suppression algorithm that is implemented in the cntk_helpers module.
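
For intuition, here is a generic greedy, class-agnostic Non-Maxima Suppression in plain Python. It is a sketch of the general technique and not the `applyNonMaximaSuppression` helper from `cntk_helpers`, which may differ in details such as how labels are handled. `iou` and `greedy_nms` are hypothetical names, and the boxes are made up.

```python
def iou(a, b):
    """Intersection over union of two [x, y, w, h] boxes."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, threshold):
    """Keep the highest-scoring boxes, dropping any box whose IoU with an
    already kept box exceeds `threshold`. Returns the kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= threshold for k in kept):
            kept.append(i)
    return kept

boxes = [[0.10, 0.10, 0.30, 0.30],   # strong detection
         [0.12, 0.11, 0.30, 0.30],   # near-duplicate of the first box
         [0.60, 0.60, 0.20, 0.20]]   # separate object
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores, 0.1))  # [0, 2]
```

A low threshold such as 0.1 is aggressive: any box that overlaps a kept box by more than 10% IoU is discarded.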

In [None]:
from cntk_helpers import applyNonMaximaSuppression
nms_threshold = 0.1
non_padded_rois = test_rois[:roi_padding_index]
max_probs = np.amax(rois_probs, axis=1).tolist()
rois_prediction_indices = applyNonMaximaSuppression(nms_threshold, rois_labels_predictions, max_probs, non_padded_rois)
print("Indices of selected regions:", rois_prediction_indices)

## Visualize the results

As a final step, we use the OpenCV **rectangle** function to draw the selected regions on the original image.

In [None]:
rois_with_prediction = test_rois[rois_prediction_indices]
rois_prediction_labels = rois_labels_predictions[rois_prediction_indices]
rois_prediction_scores = rois_values[rois_prediction_indices]
original_rois_predictions = original_rois[rois_prediction_indices]

original_img_cpy = original_img.copy()
for roi in original_rois_predictions:
    (x1,y1,x2,y2) = roi
    cv2.rectangle(original_img_cpy, (x1, y1), (x2, y2), (0, 255, 0), 5)

plt.imshow(cv2.cvtColor(original_img_cpy, cv2.COLOR_BGR2RGB))
plt.axis("off")