# Faster R-CNN Overview
> Maintainer: Huong Nguyen (huong.nguyen@fansipan.io)

**Objectives**:
1. Why Faster R-CNN matters
2. Understanding the Faster R-CNN architecture

## 1. Why Faster R-CNN matters

A basic CNN can only solve the image classification problem, where each input image contains a single object such as a flower, a cat, or a dog.

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/52/Liliumbulbiferumflowertop.jpg/440px-Liliumbulbiferumflowertop.jpg" height="200"/>

What if an image contains more than one object? Because we don't know in advance how many objects are in the image, we cannot define an output layer of a suitable fixed size. Therefore, **a traditional CNN classifier cannot be used directly to solve the object detection problem.**

<img src="https://i1.wp.com/nttuan8.com/wp-content/uploads/2019/04/ex.png?w=655&ssl=1" height="250"/>


**R-CNN** (Regions with CNN features) was introduced to address this problem. So, how does R-CNN work? There are two steps:
- Step 1. Use the Selective Search (SS) algorithm to obtain about 2000 boxes (region proposals) that are likely to contain objects.
- Step 2. Classify the object in each box using CNN features.
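
The two steps above can be sketched as a simple loop. This is only an illustration of the control flow, not the original implementation: `toy_propose`, `toy_features`, and `toy_classify` are hypothetical stand-ins for Selective Search, the CNN feature extractor, and the classifier, and the "image" is a small grid of numbers.

```python
def rcnn_detect(image, propose, extract_features, classify):
    """R-CNN-style loop: propose boxes, then classify each box separately."""
    detections = []
    for box in propose(image):                 # Step 1: candidate boxes
        x1, y1, x2, y2 = box
        crop = [row[x1:x2] for row in image[y1:y2]]
        features = extract_features(crop)      # one feature pass per box
        label, score = classify(features)      # Step 2: classify the box
        detections.append((box, label, score))
    return detections


# Hypothetical stand-ins, used only to make the sketch runnable.
def toy_propose(image):
    # Selective Search would return about 2000 boxes; here we return two.
    return [(0, 0, 4, 4), (2, 2, 6, 6)]


def toy_features(crop):
    # Stand-in for CNN features: the mean pixel value of the crop.
    values = [v for row in crop for v in row]
    return sum(values) / len(values)


def toy_classify(feature):
    # Stand-in classifier: bright crops are "object", others "background".
    if feature > 0.5:
        return "object", feature
    return "background", 1 - feature


# 8x8 toy image with a bright 4x4 square in the middle.
image = [[1 if 2 <= x < 6 and 2 <= y < 6 else 0 for x in range(8)]
         for y in range(8)]
detections = rcnn_detect(image, toy_propose, toy_features, toy_classify)
```

Note that `extract_features` runs once per box, so with roughly 2000 proposals the feature extractor runs roughly 2000 times per image.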

<img src="https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/10/rcnn.png"/>

Because **R-CNN** is slow both to train and to run at detection time (the CNN is applied separately to each of the roughly 2000 boxes), **R-CNN cannot be used for real-time object detection.**

This is where **Fast R-CNN** comes in. Fast R-CNN still uses SS to obtain region proposals (boxes). The difference is that Fast R-CNN does not run the CNN on each of the 2000 boxes; instead, it passes the whole image through a CNN once to create a convolutional feature map. Then it maps each region proposal onto the corresponding area of the feature map.
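
Mapping a proposal from image coordinates to feature-map coordinates can be sketched as a division by the CNN's total stride. The function name `project_to_feature_map` is hypothetical, and the default stride of 16 is an assumption (it matches a VGG16-style backbone; other backbones differ).

```python
import math


def project_to_feature_map(box, stride=16):
    """Map an image-space box (x1, y1, x2, y2) to feature-map cells.

    `stride` is how many image pixels one feature-map cell covers
    (assumed to be 16 here). The box is rounded outward so that it
    still covers the whole proposal.
    """
    x1, y1, x2, y2 = box
    return (
        math.floor(x1 / stride),
        math.floor(y1 / stride),
        math.ceil(x2 / stride),
        math.ceil(y2 / stride),
    )


projected = project_to_feature_map((32, 48, 200, 120))
```

Because every proposal is mapped onto the same feature map, the expensive convolutional layers run once per image rather than once per box.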

<img src="https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/10/Fast-rcnn.png"/>

**Fast R-CNN** performs better than R-CNN, but **Fast R-CNN is still not fast enough on large datasets because SS remains a bottleneck**.

**Faster R-CNN** was proposed to remove this bottleneck. Instead of using SS to produce region proposals, it uses a CNN called the Region Proposal Network (RPN).
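
In the Faster R-CNN paper, the RPN scores a fixed set of reference boxes, called anchors, at every feature-map location; the paper's default is 3 scales and 3 aspect ratios, giving 9 anchors per location. The snippet below is a minimal sketch of generating those anchors for one location. The function name `make_anchors` and the convention that a ratio means height divided by width are assumptions made for this illustration.

```python
import math


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors (x1, y1, x2, y2) centered at image point (cx, cy).

    Each anchor has area scale * scale and height / width == ratio
    (the ratio convention is an assumption of this sketch).
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            width = scale / math.sqrt(ratio)
            height = scale * math.sqrt(ratio)
            anchors.append((cx - width / 2, cy - height / 2,
                            cx + width / 2, cy + height / 2))
    return anchors


anchors = make_anchors(0, 0)
```

The RPN then predicts, for each anchor, an objectness score and a box adjustment; the highest-scoring adjusted anchors become the region proposals.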

## 2. Faster R-CNN architecture

<img src="https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/10/Faster-rcnn.png" />

- Step 1. Pass the input image through a CNN to get a feature map.
- Step 2. Produce region proposals using the RPN.
- Step 3. Apply ROI pooling so that all region proposals have the same size.
- Step 4. Pass the pooled region proposals into fully connected layers to classify each object and predict its bounding box.
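
ROI pooling is the step that turns proposals of different sizes into fixed-size inputs for the fully connected layers. The sketch below shows the idea on a single-channel feature map stored as a list of lists; the function name `roi_max_pool`, the 2x2 output size, and the exact bin-rounding rule are assumptions for illustration (real implementations pool every channel and usually use a larger output such as 7x7).

```python
import math


def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool a region of `feature_map` into an out_h x out_w grid.

    `roi` is (x1, y1, x2, y2) in feature-map cells, with x2 and y2
    exclusive. The region is split into roughly equal bins and the
    maximum of each bin is kept.
    """
    x1, y1, x2, y2 = roi
    height, width = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_h):
        y_start = y1 + math.floor(i * height / out_h)
        y_end = y1 + math.ceil((i + 1) * height / out_h)
        row = []
        for j in range(out_w):
            x_start = x1 + math.floor(j * width / out_w)
            x_end = x1 + math.ceil((j + 1) * width / out_w)
            row.append(max(feature_map[y][x]
                           for y in range(y_start, y_end)
                           for x in range(x_start, x_end)))
        pooled.append(row)
    return pooled


# Toy 4x6 feature map whose values increase left-to-right, top-to-bottom.
feature_map = [[y * 6 + x for x in range(6)] for y in range(4)]
large = roi_max_pool(feature_map, (1, 0, 6, 3))   # 5x3 region
small = roi_max_pool(feature_map, (0, 0, 2, 2))   # 2x2 region
```

Both `large` and `small` come out as 2x2 grids even though the regions differ in size, which is what lets the same fully connected layers handle every proposal.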


**References**:
- [Object detection với Faster R-CNN](https://nttuan8.com/bai-11-object-detection-voi-faster-r-cnn/)
- [Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks](https://papers.nips.cc/paper/5638-faster-r-cnn-towards-real-time-object-detection-with-region-proposal-networks.pdf)
- [A Practical Implementation of the Faster R-CNN Algorithm for Object Detection (Part 2 – with Python codes)](https://www.analyticsvidhya.com/blog/2018/11/implementation-faster-r-cnn-python-object-detection/)