RPN-Faster-R-CNN

Faster R-CNN is an object-detection model that runs in near real time, at roughly 5-7 fps. Its Region Proposal Network (RPN) generates region proposals far faster than earlier methods such as Selective Search. Object detection is a computer-vision and image-processing task that deals with detecting instances of semantic objects of a certain class in digital images and videos. The goal is to accurately locate an object in a given picture and mark it with the appropriate category; in other words, object detection seeks to determine both where an object is and what it is. Solving this problem is not easy. Unlike the human eye, a computer processes an image as a two-dimensional array of pixels, and the size of an object, its orientation in space, its pose, and its location in the image can all vary greatly.

The Region Proposal Network (RPN) is the state-of-the-art method for generating region proposals for object detection. It has largely displaced earlier approaches such as Selective Search (SS), a greedy algorithm that proposes regions by merging image segments based on similarity in colour, texture, size and fill. SS was used in R-CNN and Fast R-CNN, but it takes about a second to propose regions for a single image, so it was the bottleneck preventing Fast R-CNN from running in real time. Replacing SS with an RPN turned Fast R-CNN into a real-time detector with roughly 5-10 fps output, removing that speed barrier. The RPN shares full-image convolutional features with the detection network, which makes the region proposals nearly cost-free. It is trained end to end for the region proposal task, and its proposals are fed to the Fast R-CNN detection head, which classifies the enclosed objects and refines the proposals at the same time. Fast R-CNN is used as the detector because it has a single-stage training pipeline that is much simpler to train and considerably faster in both training and testing.

Image classification networks have evolved enormously, from the single-layer perceptron to very deep models such as ResNet and VGG-19, and Faster R-CNN builds on them. In Faster R-CNN the initial convolutional layers are shared between the RPN and Fast R-CNN, so the two stages share computation and execution time drops markedly, which is what enables real-time object detection. In brief: a feature map is generated for the input image; the RPN operates on this feature map to generate region proposals; Region-of-Interest (RoI) pooling over each proposal produces a fixed-length RoI feature vector; and classification plus bounding-box regression on this vector produce the final bounding boxes together with the class of the enclosed object. With the very deep VGG-16 model, Faster R-CNN achieves state-of-the-art detection accuracy on PASCAL VOC 2007 (73.2% mAP) and 2012 (70.4% mAP) using 300 proposals per image. I implemented the code in Python using TensorFlow (low-level API).
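The sketch below illustrates this pipeline with plain TensorFlow ops: a small RPN head slides over a shared feature map to predict per-anchor objectness scores and box deltas, non-maximum suppression keeps the top proposals, and crop_and_resize stands in for RoI pooling. It is a minimal illustration assuming a TF 2.x environment; the layer widths, anchor count, thresholds and function names (`rpn_head`, `select_proposals`, `roi_pool`) are illustrative assumptions, not the exact configuration used in this repository.

```python
# Minimal sketch of the RPN head, proposal selection and RoI pooling steps.
# Sizes, anchor count and thresholds are illustrative assumptions.
import tensorflow as tf

NUM_ANCHORS = 9  # anchors per feature-map location (e.g. 3 scales x 3 ratios)

def rpn_head(feature_map):
    """Slide a 3x3 conv over the shared feature map and predict, for each
    anchor, an objectness score and 4 box-regression deltas."""
    channels = feature_map.shape[-1]
    w_shared = tf.Variable(tf.random.normal([3, 3, channels, 256], stddev=0.01))
    w_cls = tf.Variable(tf.random.normal([1, 1, 256, NUM_ANCHORS], stddev=0.01))
    w_reg = tf.Variable(tf.random.normal([1, 1, 256, 4 * NUM_ANCHORS], stddev=0.01))

    shared = tf.nn.relu(tf.nn.conv2d(feature_map, w_shared, strides=1, padding="SAME"))
    objectness = tf.nn.conv2d(shared, w_cls, strides=1, padding="SAME")   # (N, H, W, A)
    box_deltas = tf.nn.conv2d(shared, w_reg, strides=1, padding="SAME")   # (N, H, W, 4A)
    return objectness, box_deltas

def select_proposals(boxes, scores, max_proposals=300, iou_threshold=0.7):
    """Keep the top-scoring boxes after non-maximum suppression; the decoded
    anchor boxes are assumed to be in normalized [y1, x1, y2, x2] form."""
    keep = tf.image.non_max_suppression(boxes, scores, max_proposals, iou_threshold)
    return tf.gather(boxes, keep), tf.gather(scores, keep)

def roi_pool(feature_map, proposals, output_size=7):
    """Crop each proposal from the feature map and resize it to a fixed
    output_size x output_size grid (crop_and_resize approximates RoI pooling)."""
    box_indices = tf.zeros([tf.shape(proposals)[0]], dtype=tf.int32)  # single image
    return tf.image.crop_and_resize(feature_map, proposals, box_indices,
                                    [output_size, output_size])

# Illustrative wiring: a random tensor stands in for the backbone feature map.
feature_map = tf.random.normal([1, 38, 50, 512])
objectness, box_deltas = rpn_head(feature_map)
```

In the full model the feature map would come from the shared VGG-16 convolutional layers, and the decoded, NMS-filtered proposals (about 300 per image) would be passed through RoI pooling into the Fast R-CNN classification and bounding-box-regression heads.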
