## Fast R-CNN

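Fast R-CNN is an extension of R-CNN (Region-based Convolutional Neural Network) that addresses the original detector's main bottleneck: R-CNN runs the CNN separately on every region proposal, which makes both training and inference slow.

Fast R-CNN instead runs the CNN once over the whole image and uses a region of interest (RoI) pooling layer to extract a fixed-size feature vector for each proposed region from the shared feature map. A single network then performs both object classification and bounding box regression. Region proposals still come from an external method such as selective search; generating them inside the network with a region proposal network (RPN) is the change introduced by the follow-up model, Faster R-CNN.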
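
To make the RoI pooling idea concrete, here is a minimal NumPy sketch of max pooling one region of a feature map down to a fixed grid. The function name `roi_max_pool`, the end-exclusive `(y0, x0, y1, x1)` RoI convention, and the 2x2 output grid are assumptions chosen for illustration; this is not the layer as implemented in any particular library.

```python
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one RoI of a (H, W, C) feature map to a fixed (out_h, out_w, C) grid.

    roi is (y0, x0, y1, x1) in feature-map cells, end-exclusive (assumed convention).
    """
    y0, x0, y1, x1 = roi
    out_h, out_w = output_size
    region = feature_map[y0:y1, x0:x1]
    h, w = region.shape[:2]
    # Bin edges that split the region into out_h x out_w roughly equal cells.
    y_edges = np.linspace(0, h, out_h + 1).astype(int)
    x_edges = np.linspace(0, w, out_w + 1).astype(int)
    pooled = np.zeros((out_h, out_w, feature_map.shape[2]), dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Make every bin at least one cell wide so small RoIs still work.
            ys, ye = y_edges[i], max(y_edges[i + 1], y_edges[i] + 1)
            xs, xe = x_edges[j], max(x_edges[j + 1], x_edges[j] + 1)
            pooled[i, j] = region[ys:ye, xs:xe].max(axis=(0, 1))
    return pooled

# Two RoIs of different sizes produce outputs of the same shape.
fmap = np.arange(16).reshape(4, 4, 1)
print(roi_max_pool(fmap, (0, 0, 4, 4)).shape)
print(roi_max_pool(fmap, (0, 0, 3, 4)).shape)
```

Whatever the size of the proposal, the pooled output has the same shape, which is what lets the fully connected layers that follow accept every region.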

### Steps for Fast R-CNN:

1. Preprocessing: Load and preprocess the input image.

2. Feature Extraction: Pass the preprocessed image through a pre-trained CNN to extract feature maps.


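3. Region Proposal: Generate region proposals with an external method (e.g., selective search) run on the input image. Replacing this step with a region proposal network (RPN) that operates on the feature maps gives Faster R-CNN, which is the architecture of the model used in the code below.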

4. RoI Pooling: Apply RoI pooling to the feature maps to extract fixed-size features for each proposed region.

5. Classification and Regression: Pass the RoI-pooled features through fully connected layers for object classification and bounding box regression.

6. Non-Maximum Suppression: Apply non-maximum suppression to filter out redundant detections and produce the final set of object detections.

7. Post-processing: Optionally, perform additional post-processing steps such as filtering detections based on confidence scores or applying additional heuristics.
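
The non-maximum suppression step can be sketched in a few lines of NumPy. This is a plain greedy NMS written for illustration, assuming boxes in `[ymin, xmin, ymax, xmax]` form with positive area; the function names `iou` and `non_max_suppression` and the 0.5 threshold are assumptions, and detection frameworks normally ship their own implementation.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an (N, 4) array of boxes, all [ymin, xmin, ymax, xmax]."""
    ymin = np.maximum(box[0], boxes[:, 0])
    xmin = np.maximum(box[1], boxes[:, 1])
    ymax = np.minimum(box[2], boxes[:, 2])
    xmax = np.minimum(box[3], boxes[:, 3])
    # Clip at zero so non-overlapping boxes get an intersection of 0.
    inter = np.clip(ymax - ymin, 0, None) * np.clip(xmax - xmin, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]  # best-scoring box first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Drop every remaining box that overlaps the kept box too much.
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep

boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))  # → [0, 2]
```

The second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap either and is kept.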

#### Implementing Fast R-CNN using TensorFlow Object Detection API

Steps for Fast R-CNN implementation:

1. Install TensorFlow Object Detection API: First, you need to install the TensorFlow Object Detection API by following the installation instructions provided in the official documentation.

2. Download Pre-trained Model: Download a pre-trained model checkpoint from the TensorFlow 2 Detection Model Zoo. The Model Zoo does not list a plain Fast R-CNN model; the closest available architecture is Faster R-CNN, and the code below uses `faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8`.

3. Set Up Pipeline Configuration: Configure the object detection pipeline by creating a pipeline configuration file. This file specifies parameters such as the model architecture, input image size, class labels, and paths to the pre-trained model checkpoint and label map.

4. Load Pre-trained Model: Load the exported SavedModel with `tf.saved_model.load`. The returned object can be called directly on a batched image tensor.

5. Perform Inference: Use the loaded model to perform inference on input images or video frames. The model will detect objects in the input images and classify them into predefined classes.

6. Visualize Results: Visualize the detected objects and their bounding boxes on the input images or video frames.
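
As a rough illustration of steps 5 and 6, the sketch below filters a detection result by confidence before it is drawn. It assumes the output is a dictionary with the keys `detection_boxes`, `detection_classes` and `detection_scores`, each carrying a leading batch dimension, which is how the notebook code reads them; the helper name `filter_detections` and the stand-in values are made up for the example.

```python
import numpy as np

def filter_detections(detections, min_score=0.5):
    """Keep the detections of the first image whose score is at least min_score."""
    # Index [0] removes the batch dimension of a single-image batch.
    boxes = np.asarray(detections['detection_boxes'])[0]
    classes = np.asarray(detections['detection_classes'])[0].astype(int)
    scores = np.asarray(detections['detection_scores'])[0]
    keep = scores >= min_score
    return boxes[keep], classes[keep], scores[keep]

# Stand-in for a model output with two detections (values are invented).
fake_detections = {
    'detection_boxes': np.array([[[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.6, 0.6]]]),
    'detection_classes': np.array([[1.0, 3.0]]),
    'detection_scores': np.array([[0.9, 0.3]]),
}
boxes, classes, scores = filter_detections(fake_detections, min_score=0.5)
print(boxes, classes, scores)
```

Only the first detection survives the 0.5 threshold here, which mirrors what the `min_score_thresh` argument does in the visualization call used later.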

In [1]:
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
import matplotlib
from matplotlib import pyplot as plt
from PIL import Image

# Import the Object Detection API modules
from object_detection.utils import ops as utils_ops
# from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

# Select the interactive backend last: importing visualization_utils may switch
# matplotlib to the non-interactive 'Agg' backend.
matplotlib.use('TkAgg')  # or 'Qt5Agg'





In [2]:
def read_file(file_path):
    """Return the class labels listed one per line in file_path."""
    with open(file_path, 'rt') as fpt:
        class_labels = fpt.read().rstrip('\n').split('\n')

    return class_labels

In [3]:
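def create_pbtxt_file(file_path):
    """Write the labels in file_path to label_map.pbtxt and return that file name."""
    label_map = read_file(file_path)
    file_name = 'label_map.pbtxt'
    with open(file_name, 'w') as f:
        # Start ids at 1: the Object Detection API reserves id 0 for the background class.
        for idx, class_name in enumerate(label_map, start=1):
            f.write('item {\n')
            f.write(f'  id: {idx}\n')
            f.write(f"  name: '{class_name}'\n")
            f.write('}\n')

    return file_name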

In [4]:
def read_label_map(file_path):
    """Build a category index {id: {'id': id, 'name': name}} from a plain label list.

    This is the dictionary layout in which the visualization utility looks up class names.
    """
    label_map_path = create_pbtxt_file(file_path)

    label_map = {}
    class_id = None
    with open(label_map_path, 'r') as f:
        for line in f:
            line = line.strip()
            if line.startswith('item'):
                # Start of a new item block: forget the previous id
                class_id = None
            elif line.startswith('id'):
                # Parse the id as an int so it matches the model's integer class ids
                class_id = int(line.split(':', 1)[1])
            elif line.startswith('name') and class_id is not None:
                # Split only on the first ':' and strip the surrounding quotes
                class_name = line.split(':', 1)[1].strip()[1:-1]
                label_map[class_id] = {'id': class_id, 'name': class_name}
    return label_map

In [5]:
# PATH_TO_LABELS = r"D:\Transfer Learning Model\Labels\imagenet.shortnames.list"

# category_index = read_label_map(PATH_TO_LABELS)

In [6]:
# category_index

In [7]:
# Define the paths for the model and label map
PATH_TO_MODEL_DIR = r"D:\Transfer Learning Model\Faster RCNN\faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8\faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8\saved_model"
PATH_TO_LABELS = r"D:\Transfer Learning Model\Labels\imagenet.shortnames.list"

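# Load the pre-trained model once; the loaded object is callable on a batched image tensor
detection_model = tf.saved_model.load(PATH_TO_MODEL_DIR)
detect_fn = detection_model

# Load label map using the alternate method.
# Note: this checkpoint was trained on COCO (coco17), so an ImageNet label list
# will most likely produce wrong class names in the visualization.
category_index = read_label_map(PATH_TO_LABELS)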


In [8]:
# Perform inference on input image
image_path = r"C:\Users\varsha\Pictures\CV_IMG\1200.jpg"
image_np = tf.io.read_file(image_path)
image_np = tf.image.decode_jpeg(image_np, channels=3)

# Add a batch dimension: the model expects shape (1, height, width, 3)
input_tensor = tf.convert_to_tensor(image_np)
input_tensor = input_tensor[tf.newaxis, ...]

detections = detect_fn(input_tensor)
# print(detections['detection_boxes'][0].numpy())
# print(category_index)

# Visualize results. The boxes are drawn in place, so draw on a NumPy copy
# and display that copy rather than the original tensor.
image_with_boxes = image_np.numpy().copy()
vis_util.visualize_boxes_and_labels_on_image_array(
    image_with_boxes,
    detections['detection_boxes'][0].numpy(),
    detections['detection_classes'][0].numpy().astype(int),
    detections['detection_scores'][0].numpy(),
    category_index,
    use_normalized_coordinates=True,
    max_boxes_to_draw=200,
    min_score_thresh=0.5,
    agnostic_mode=False
)

# Display the results
plt.imshow(image_with_boxes)
plt.show()

In [None]:
# # Function to run inference on a single image
# def run_inference_for_single_image(model, image):
#     # Convert the image to a numpy array
#     image_np = np.array(image)
#     # Add a batch dimension to the image
#     input_tensor = tf.convert_to_tensor(image_np)
#     input_tensor = input_tensor[tf.newaxis, ...]
#     # Run inference
#     model_fn = model.signatures['serving_default']
#     output_dict = model_fn(input_tensor)
#     # Convert the model output tensors to numpy arrays
#     num_detections = int(output_dict.pop('num_detections'))
#     output_dict = {key: value[0, :num_detections].numpy() for key, value in output_dict.items()}
#     output_dict['num_detections'] = num_detections
#     # Handle models with masks
#     if 'detection_masks' in output_dict:
#         detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
#             output_dict['detection_masks'], output_dict['detection_boxes'],
#             image_np.shape[0], image_np.shape[1])
#         detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5,
#                                            tf.uint8)
#         output_dict['detection_masks_reframed'] = detection_masks_reframed.numpy()
#     return output_dict

# # Function to visualize the detection results
# def visualize(image_np, output_dict):
#     # Visualize the detection boxes and labels on the image
#     vis_util.visualize_boxes_and_labels_on_image_array(
#         image_np,
#         output_dict['detection_boxes'],
#         output_dict['detection_classes'],
#         output_dict['detection_scores'],
#         category_index,
#         instance_masks=output_dict.get('detection_masks_reframed', None),
#         use_normalized_coordinates=True,
#         line_thickness=8)
#     return image_np

# # Load an image
# PATH_TO_IMAGE = r"C:\Users\Pictures\CV_IMG\1200.jpg"
# image = Image.open(PATH_TO_IMAGE)

# # Run inference on the image
# output_dict = run_inference_for_single_image(detection_model, image)

# # Visualize the detection results
# image_np = visualize(np.array(image), output_dict)
# plt.figure(figsize=(12, 8))
# plt.imshow(image_np)
# plt.show()

In [None]:
import tensorflow as tf
import matplotlib.pyplot as plt
from object_detection.utils import visualization_utils as vis_utils

# Load the pre-trained model
detect_fn = tf.saved_model.load(r"D:\Transfer Learning Model\Faster RCNN\faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8\faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8\saved_model")

# Load the ImageNet labels
with open('imagenet.shortnames.list', 'r') as f:
    class_names = f.read().splitlines()

# The visualization helper expects {id: {'id': id, 'name': name}}, not a plain list.
# Ids are assumed to start at 1. The model was trained on COCO, so ImageNet names
# will most likely not match the predicted class ids.
category_index = {i: {'id': i, 'name': name} for i, name in enumerate(class_names, start=1)}

# Perform inference on input image
image_path = r"C:\Users\Pictures\CV_IMG\1200.jpg"
image_np = tf.io.read_file(image_path)
image_np = tf.image.decode_jpeg(image_np, channels=3)

input_tensor = tf.convert_to_tensor(image_np)
input_tensor = input_tensor[tf.newaxis, ...]

detections = detect_fn(input_tensor)

# Visualize results on a NumPy copy, since the boxes are drawn in place
image_with_boxes = image_np.numpy().copy()
vis_utils.visualize_boxes_and_labels_on_image_array(
    image_with_boxes,
    detections['detection_boxes'][0].numpy(),
    detections['detection_classes'][0].numpy().astype(int),
    detections['detection_scores'][0].numpy(),
    category_index,
    use_normalized_coordinates=True,
    max_boxes_to_draw=200,
    min_score_thresh=0.5,
    agnostic_mode=False
)

# Display the results
plt.imshow(image_with_boxes)
plt.show()

In [11]:
import cv2
import matplotlib.pyplot as plt

IMAGE_PATH = r"C:\Users\varsha\Pictures\CV_IMG\1200.jpg"

# OpenCV loads images as BGR; convert to RGB so matplotlib shows the right colors
img = cv2.imread(IMAGE_PATH)
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

<matplotlib.image.AxesImage at 0x1e055750a10>

In [None]:
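# cv2.resize takes dsize as (width, height), so this yields a 600-wide, 1024-tall image
img_n = cv2.resize(img, (600, 1024))
# Scale pixel values to [0, 1]; the result is a float64 array
img_n = img_n / 255.0
# img_n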

In [None]:
# plt.imshow(img_n)

In [None]:
# Rectangle(xy=(0.350221, 0.169153), width=0.524498, height=0.711739, angle=0)

In [None]:
def convert_boxes_to_original_form(boxes, image_shape):
    """Scale normalized boxes to pixel coordinates.

    boxes: array-like of shape (num_boxes, 4) with normalized [ymin, xmin, ymax, xmax] rows.
    image_shape: (height, width) of the original image.
    """
    height, width = image_shape
    original_boxes = []
    for box in boxes:
        ymin, xmin, ymax, xmax = box
        ymin = int(ymin * height)
        xmin = int(xmin * width)
        ymax = int(ymax * height)
        xmax = int(xmax * width)
        original_boxes.append([ymin, xmin, ymax, xmax])
    return original_boxes
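
The same conversion can also be written without the Python loop. The following is a small NumPy sketch of that idea; the name `denormalize_boxes` is an assumption, and it relies on the same `[ymin, xmin, ymax, xmax]` ordering and `(height, width)` shape as the function above.

```python
import numpy as np

def denormalize_boxes(boxes, image_shape):
    """Scale normalized [ymin, xmin, ymax, xmax] boxes to integer pixel coordinates."""
    height, width = image_shape
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    # y values scale with the height, x values with the width.
    scale = np.array([height, width, height, width], dtype=float)
    # astype(int) truncates toward zero, matching int() in the loop version.
    return (boxes * scale).astype(int)

print(denormalize_boxes([[0.169153, 0.350221, 0.711739, 0.524498]], (600, 600)))
# → [[101 210 427 314]]
```

Broadcasting applies the per-coordinate scale to every row at once, which is convenient when a model returns many boxes per image.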

In [None]:
# Example usage
normalized_boxes = [[0.169153, 0.350221, 0.711739, 0.524498]]  # Example normalized bounding box coordinates
image_shape = (600,600)  # Example original image shape (height, width)

original_boxes = convert_boxes_to_original_form(normalized_boxes, image_shape)

In [None]:
original_boxes

In [None]:
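# Pass the box as two corner points, (xmin, ymin) and (xmax, ymax), in pixels;
# a single 4-tuple would be read as (x, y, width, height).
imr = cv2.rectangle(img_n, (358, 101), (537, 427), (255, 0, 0), thickness=4)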
cv2.imshow('dsk', imr)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
# !pip install tensorflow-object-detection-api

In [None]:
# !pip uninstall object_detection

In [None]:
# Load model checkpoint and pipeline configuration
# model_checkpoint_path = PATH_TO_MODEL_DIR 
# pipeline_config_path = 'pipeline.config'

# # Load label map
# label_map_path = PATH_TO_LABELS
# category_index = read_label_map(PATH_TO_LABELS)
