# ROI Pooling Layer

![](https://deepsense.ai/wp-content/uploads/2017/02/roi_pooling-1.gif.pagespeed.ce.5V5mycIRNu.gif)
<center>Image taken from <a href="https://deepsense.ai/region-of-interest-pooling-explained/">here</a></center>
<br><br>

The last part that goes into making Faster R-CNN possible is the ROI Pooling layer! 

It might sound complex, but the concept is straightforward. First, we pre-define a fixed output size (e.g., 2x2, 5x5, or 7x7). After the CNN part of the network produces feature maps and the RPN produces ROI proposals, ROI pooling takes each ROI and projects its position and size onto the feature maps. Each projected region is then split into a grid of sub-sections matching the predefined size (e.g., 2x2), and max pooling is applied to each sub-section (that is, we keep its largest value). The result is a fixed-size output for every ROI, regardless of the ROI's original size.
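To make the splitting and pooling steps concrete, here is a minimal pure-Python sketch for a single-channel feature map. The function name `roi_max_pool`, the `(x1, y1, x2, y2)` ROI layout with exclusive end coordinates, and the integer bin edges are assumptions made for illustration; real implementations differ in how they round bin boundaries.

```python
def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one ROI of a single-channel feature map to a fixed size.

    feature_map: list of rows (H x W) of numbers.
    roi: (x1, y1, x2, y2) in feature-map cells; x2 and y2 are exclusive
         (assumed convention for this sketch).
    output_size: (rows, cols) of the pooled output.
    """
    x1, y1, x2, y2 = roi
    out_h, out_w = output_size
    roi_h, roi_w = y2 - y1, x2 - x1

    pooled = []
    for i in range(out_h):
        # Integer bin edges: bins have unequal sizes when roi_h is not
        # divisible by out_h. max(...) keeps every bin at least one cell tall.
        top = y1 + (i * roi_h) // out_h
        bottom = max(top + 1, y1 + ((i + 1) * roi_h) // out_h)
        row = []
        for j in range(out_w):
            left = x1 + (j * roi_w) // out_w
            right = max(left + 1, x1 + ((j + 1) * roi_w) // out_w)
            # Keep the largest value inside the current bin.
            row.append(max(feature_map[r][c]
                           for r in range(top, bottom)
                           for c in range(left, right)))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 5, 2, 0, 3, 1],
    [4, 2, 9, 1, 0, 2],
    [3, 8, 1, 7, 6, 0],
    [0, 1, 4, 2, 5, 9],
    [6, 3, 0, 8, 1, 4],
]

# A 3x5 ROI (rows 0-2, columns 1-5) pooled down to 2x2.
pooled = roi_max_pool(feature_map, roi=(1, 0, 6, 3), output_size=(2, 2))
print(pooled)  # [[5, 3], [9, 7]]
```

Note that the 3x5 region cannot be divided evenly into a 2x2 grid, so the bins end up with different sizes (1x2, 1x3, 2x2, and 2x3 cells), yet the output is always 2x2.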

**Learn more about ROI Pooling:**
- https://towardsdatascience.com/understanding-region-of-interest-part-1-roi-pooling-e4f5dd65bb44
- https://towardsdatascience.com/region-of-interest-pooling-f7c637f409af
- https://deepsense.ai/region-of-interest-pooling-explained/
- (Optional) https://kaushikpatnaik.github.io/annotated/papers/2020/07/04/ROI-Pool-and-Align-Pytorch-Implementation.html

### Steps:
1. Import dependencies
2. Define the RoIPooling layer

### Topics covered and learning objectives
- RoI Pooling layer

### Time estimates:
- Reading/Watching materials: 15min
- Exercises: 3-5min
<br><br>
- **Total**: ~20min

## Import dependencies

In [None]:
import tensorflow as tf
import tensorflow.keras.backend as K
from tensorflow.keras.layers import Layer, Input, Conv2D

from tests import test_ROI_Pooling

### Exercise 1: Complete the `call` method in the **RoIPooling** layer

**Tutorial: How to create a custom Keras layer**: 
- https://sparrow.dev/keras-custom-layer/
- https://faroit.com/keras-docs/2.0.1/layers/writing-your-own-keras-layers/

The `call` method accepts three inputs: the feature maps from the previous layer, the ROIs from the RPN, and `box_indexes` (for each ROI, the index of the image in the batch that it belongs to).

Your task is to perform RoI pooling using TensorFlow's `crop_and_resize` function. The complete documentation can be found [here](https://www.tensorflow.org/api_docs/python/tf/image/crop_and_resize).
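As a hint, `tf.image.crop_and_resize` expects each box as `[y1, x1, y2, x2]` in normalized coordinates, where 0.0 maps to the first row/column and 1.0 to the last one, plus one batch index per box. It also resamples each crop with bilinear interpolation by default, so it approximates max-pooling-based RoI pooling rather than reproducing it exactly. The sketch below shows one way to prepare those two inputs in plain Python. The helper name `to_crop_and_resize_inputs` and the per-image `(x1, y1, x2, y2)` pixel layout are hypothetical; the ROIs in your pipeline may already be normalized or ordered differently.

```python
def to_crop_and_resize_inputs(rois_per_image, height, width):
    """Flatten per-image pixel ROIs into normalized boxes and box indices.

    rois_per_image: one list per image in the batch, each holding
        (x1, y1, x2, y2) ROIs in feature-map pixel coordinates, with
        x2 and y2 inclusive (hypothetical layout, assumed for this sketch).
    height, width: spatial size of the feature maps.
    """
    boxes, box_indices = [], []
    for image_id, rois in enumerate(rois_per_image):
        for x1, y1, x2, y2 in rois:
            # Reorder to [y1, x1, y2, x2] and scale so that the last
            # row/column index (height - 1, width - 1) becomes 1.0.
            boxes.append([y1 / (height - 1), x1 / (width - 1),
                          y2 / (height - 1), x2 / (width - 1)])
            # Remember which image in the batch this box was taken from.
            box_indices.append(image_id)
    return boxes, box_indices


# Two images with 5x9 feature maps: two ROIs on image 0, one on image 1.
rois_per_image = [
    [(0, 0, 8, 4), (2, 1, 6, 3)],
    [(4, 2, 8, 4)],
]
boxes, box_indices = to_crop_and_resize_inputs(rois_per_image, height=5, width=9)
print(boxes)        # [[0.0, 0.0, 1.0, 1.0], [0.25, 0.25, 0.75, 0.75], [0.5, 0.5, 1.0, 1.0]]
print(box_indices)  # [0, 0, 1]
```

The flat `box_indices` list plays the same role as the `box_indexes` argument of the layer: entry `i` says which image the `i`-th box should be cropped from.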

In [None]:
class RoIPooling(Layer):
    
    def __init__(self, size=(7, 7)):
        """
        RoI Pooling layer
        
        Args:
            :param size (tuple): size of the pooling output 
        """
        
        # Initialize the base Layer first, then store the layer's own attributes
        super(RoIPooling, self).__init__()
        self.size = size

    def build(self, input_shape):
        """
        The build method is where a custom Keras layer defines its weights
        
        Args:
            :param input_shape (list): Shape of the input of the previous layer in a network
        """
        
        self.shape = input_shape
        super(RoIPooling, self).build(input_shape)

    def compute_output_shape(self, input_shape):
        """
        In case your layer modifies the shape of its input, 
        you should specify here the shape transformation logic. 
        This allows Keras to do automatic shape inference.
        
        Args:
            :param input_shape (list): Shape of the input of the previous layer in a network
            
        Source: Taken from article: https://dongjk.github.io/code/object+detection/keras/2018/06/10/Faster_R-CNN_step_by_step,_Part_II.html
        """
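        num_rois = input_shape[1][0]   # number of ROIs (first dimension of the rois input)
        pooled_height = self.size[0]
        pooled_width = self.size[1]
        channels = input_shape[0][3]   # channel dimension of the feature maps
        return (num_rois, pooled_height, pooled_width, channels)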
        
    def call(self, feature_maps, rois, box_indexes, **kwargs):
        """
        Main Layer's logic.
        
        Args:
            :param feature_maps: Feature maps generated by the CNN part of the network (the previous layer). 4-D tensor, example shape: [None, None, None, 512]
            :param rois: ROIs proposed by the RPN, with 4 coordinates each. Example shape: [None, 4]
            :param box_indexes:  A 1-D tensor of shape `[num_boxes]` with int32 values in `[0, batch)`. 
                                The value of `box_indexes[i]` specifies the image that the `i`-th box refers to.
                                
        Returns:
            x - pooled inputs with shape: [None, self.size[0], self.size[1], feature_maps.shape[-1]] -> Example: [None, 7, 7, 512]
        """
        
        # YOUR CODE HERE
        x = None
        return x

In [None]:
# RUN THIS CELL TO TEST YOUR CODE
test_ROI_Pooling(RoIPooling)

# What's next?

We have implemented only the most essential parts of Faster R-CNN rather than the whole network, so we suggest going through some blogs and implementations that cover it in much more depth. Keep in mind that, given their length and the training time involved, working through them might take several days to a week!


Links to follow:
- Implementation PT1: https://dongjk.github.io/code/object+detection/keras/2018/05/21/Faster_R-CNN_step_by_step,_Part_I.html
- Implementation PT2: https://dongjk.github.io/code/object+detection/keras/2018/06/10/Faster_R-CNN_step_by_step,_Part_II.html
- PyTorch: https://github.com/clemkoa/faster-rcnn
- Keras: https://github.com/you359/Keras-FasterRCNN


You have now covered the main parts of two-stage object detection algorithms, and it's time to switch gears and tackle some cutting-edge object detection networks!