# <a id='toc1_'></a>[Computer Vision Applications](#toc0_)

**Table of contents**<a id='toc0_'></a>    
- [Computer Vision Applications](#toc1_)    
  - [Object Detection](#toc1_1_)    
    - [Object Detection with Faster R-CNN](#toc1_1_1_)    
  - [Image Segmentation](#toc1_2_)    
    - [Image Segmentation with Detectron2](#toc1_2_1_)    


In [None]:
# !pip install torch torchvision albumentations opencv-python
!pip install "git+https://github.com/facebookresearch/detectron2.git"


In [None]:
import torch
import torchvision
import numpy as np
import cv2
import matplotlib.pyplot as plt
from torchvision import transforms
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2 import model_zoo

In [None]:
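def load_image(image_path):
    """Loads an image as a float RGB tensor of shape (1, 3, H, W) in [0, 1]."""
    image = cv2.imread(image_path)
    if image is None:  # cv2.imread returns None instead of raising on failure
        raise FileNotFoundError(f"Could not read image: {image_path}")
    # OpenCV loads images as BGR; torchvision models expect RGB
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # ToTensor converts HWC uint8 to CHW float32 in [0, 1]; unsqueeze adds the batch dimension
    return transforms.ToTensor()(image).unsqueeze(0)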
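
For reference, `transforms.ToTensor()` converts an H×W×C `uint8` array with values in [0, 255] into a C×H×W `float32` tensor scaled to [0, 1], and `unsqueeze(0)` adds the batch dimension. The following is a NumPy-only sketch of the same shape and scaling logic. The helper name `to_model_input` and the dummy image are hypothetical and used only for illustration.

```python
import numpy as np

def to_model_input(image_hwc_uint8):
    """NumPy sketch of ToTensor() followed by unsqueeze(0)."""
    # Scale 8-bit pixel values to floats in [0, 1]
    scaled = image_hwc_uint8.astype(np.float32) / 255.0
    # Reorder axes from (H, W, C) to (C, H, W)
    chw = np.transpose(scaled, (2, 0, 1))
    # Add a leading batch dimension: (1, C, H, W)
    return chw[np.newaxis, ...]

# Dummy all-white 4x6 RGB image standing in for a real photo
dummy = np.full((4, 6, 3), 255, dtype=np.uint8)
batch = to_model_input(dummy)
print(batch.shape, batch.dtype, batch.max())
```

The batch dimension matters because the detection model below takes a batch of images, even when that batch holds a single image.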

## <a id='toc1_1_'></a>[Object Detection](#toc0_)
Object detection is a computer vision technique that identifies and localizes objects within an image. Unlike image classification, which assigns a single label to the whole image, object detection returns a bounding box, a class label, and a confidence score for each detected object. Common object detection models include YOLO (You Only Look Once), SSD (Single Shot MultiBox Detector), and Faster R-CNN.
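
To make the contrast with classification concrete, a detector's output can be thought of as a variable-length list of (box, label, score) records, one per object. The sketch below uses plain Python and made-up values. The function name `filter_detections`, the 0.5 threshold, and the sample numbers are assumptions for illustration, not output from a real model.

```python
def filter_detections(boxes, labels, scores, score_threshold=0.5):
    """Keep detections whose confidence is at least score_threshold."""
    kept = []
    for box, label, score in zip(boxes, labels, scores):
        if float(score) >= score_threshold:
            # Boxes are (x1, y1, x2, y2) corner coordinates in pixels
            x1, y1, x2, y2 = (int(v) for v in box)
            kept.append({"box": (x1, y1, x2, y2),
                         "label": int(label),
                         "score": float(score)})
    return kept

# Hypothetical detections for a single image
boxes = [[10.2, 20.7, 110.9, 220.1], [5.0, 5.0, 50.0, 50.0]]
labels = [1, 18]
scores = [0.92, 0.31]

detections = filter_detections(boxes, labels, scores)
print(detections)
```

A classifier would return a single label for the same image. Here the second detection is dropped because its score falls below the threshold.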

### <a id='toc1_1_1_'></a>[Object Detection with Faster R-CNN](#toc0_)

In [None]:
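def object_detection(image_path, score_threshold=0.5):
    """Performs object detection on an image using Faster R-CNN."""
    # torchvision >= 0.13 uses `weights`; older releases use pretrained=True instead
    model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()
    
    image_tensor = load_image(image_path)
    with torch.no_grad():
        # The model returns one dict per input image; take the first (and only) one
        prediction = model(image_tensor)[0]
    
    # Read the image again for drawing (OpenCV loads BGR, matplotlib expects RGB)
    img = cv2.imread(image_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    
    # Draw only boxes whose confidence reaches the threshold
    for box, score in zip(prediction['boxes'], prediction['scores']):
        if score < score_threshold:
            continue
        x1, y1, x2, y2 = map(int, box.tolist())
        cv2.rectangle(img, (x1, y1), (x2, y2), (255, 0, 0), 2)
    
    plt.imshow(img)
    plt.axis("off")
    plt.show()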

## <a id='toc1_2_'></a>[Image Segmentation](#toc0_)
Image segmentation partitions an image into regions so that objects are identified at the pixel level rather than by bounding boxes. There are two main types:
- **Semantic Segmentation**: Classifies each pixel in an image into a category (e.g., sky, car, road), without separating objects of the same class.
- **Instance Segmentation**: Distinguishes between individual objects of the same class (e.g., two different cars in an image).

U-Net is commonly used for semantic segmentation, and Mask R-CNN for instance segmentation.
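
The difference between the two types shows up in the shape of the output. A minimal NumPy sketch on a toy 4×8 scene with two cars follows. The class IDs (0 for background, 1 for car) and the mask layout are made up for illustration.

```python
import numpy as np

# Toy 4x8 scene: two cars (class id 1) on background (class id 0)
car_a = np.zeros((4, 8), dtype=bool)
car_a[1:3, 1:3] = True
car_b = np.zeros((4, 8), dtype=bool)
car_b[1:3, 5:7] = True

# Instance segmentation: one binary mask per object, shape (N, H, W)
instance_masks = np.stack([car_a, car_b])

# Semantic segmentation: one class id per pixel, shape (H, W);
# both cars collapse into the same "car" class
semantic_map = np.where(instance_masks.any(axis=0), 1, 0)

print("instances:", instance_masks.shape[0])
print("car pixels in semantic map:", int((semantic_map == 1).sum()))
```

The semantic map records that eight pixels belong to the car class but not how many cars there are. The instance masks keep the two cars separate.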

### <a id='toc1_2_1_'></a>[Image Segmentation with Detectron2](#toc0_)

> Detectron2 is Facebook AI Research's next generation library that provides state-of-the-art detection and segmentation algorithms. It is the successor of Detectron and maskrcnn-benchmark. It supports a number of computer vision research projects and production applications in Facebook. *(Source: [Facebook GitHub](https://github.com/facebookresearch/detectron2))*

![](../../../../img/detectron2.png)  
(Source: [Facebook GitHub](https://github.com/facebookresearch/detectron2))

In [None]:
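def image_segmentation(image_path):
    """Performs instance segmentation on an image using Detectron2's Mask R-CNN."""
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # discard instances scoring below 0.5
    # Detectron2 defaults to CUDA; fall back to CPU when no GPU is available
    cfg.MODEL.DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
    predictor = DefaultPredictor(cfg)
    
    # DefaultPredictor expects BGR input by default, which is what cv2.imread returns
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    outputs = predictor(image)
    
    # pred_masks has shape (N, H, W): one boolean mask per detected instance
    masks = outputs['instances'].pred_masks.cpu().numpy()
    if len(masks) == 0:
        print("No instances detected above the score threshold.")
        return
    
    # Display the mask of the first detected instance
    plt.imshow(masks[0], cmap='gray')
    plt.axis("off")
    plt.show()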
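
Beyond plotting a single mask, the `(N, H, W)` boolean array from `pred_masks` can be summarized per instance. The sketch below computes each mask's pixel area and tight bounding box with NumPy. The helper name `summarize_masks` and the demo array are hypothetical stand-ins for real model output.

```python
import numpy as np

def summarize_masks(masks):
    """Return the pixel area and bounding box of each (H, W) boolean mask."""
    summaries = []
    for mask in masks:
        # Row (y) and column (x) indices of all foreground pixels
        ys, xs = np.nonzero(mask)
        if ys.size == 0:
            summaries.append({"area": 0, "bbox": None})
            continue
        # Inclusive pixel bounds as (x_min, y_min, x_max, y_max)
        bbox = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
        summaries.append({"area": int(ys.size), "bbox": bbox})
    return summaries

# Demo array standing in for outputs['instances'].pred_masks.cpu().numpy()
demo = np.zeros((2, 5, 5), dtype=bool)
demo[0, 1:4, 2:5] = True   # a 3x3 blob
demo[1, 0, 0] = True       # a single pixel
result = summarize_masks(demo)
print(result)
```

Areas computed this way can be used, for example, to ignore very small instances before visualization.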