## Introduction

<img src="https://dl.fbaipublicfiles.com/detectron2/Detectron2-Logo-Horz.png" width="500">


## Before you start

Let's make sure that we have access to a GPU by running the `nvidia-smi` command. If the command fails, navigate to `Edit` -> `Notebook settings` -> `Hardware accelerator` and set it to `GPU`.

In [None]:
!nvidia-smi

## Install Detectron2 and dependencies

In [None]:
!python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'

Now is a good time to confirm that we have the right versions of the libraries at our disposal.

In [None]:
import torch, detectron2
!nvcc --version
TORCH_VERSION = ".".join(torch.__version__.split(".")[:2])
CUDA_VERSION = torch.__version__.split("+")[-1]
print("torch: ", TORCH_VERSION, "; cuda: ", CUDA_VERSION)
print("detectron2:", detectron2.__version__)
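
To make the string handling above concrete, here is a minimal sketch of how a PyTorch version string splits into the two printed values. The helper name `split_torch_version` and the sample version strings are illustrative assumptions, not part of PyTorch.

```python
def split_torch_version(version: str) -> tuple[str, str]:
    """Split a version string such as '2.1.0+cu121' into ('2.1', 'cu121')."""
    base, _, local = version.partition("+")
    major_minor = ".".join(base.split(".")[:2])
    # No '+' suffix means the string carries no CUDA tag at all
    return major_minor, local or "unknown"

print(split_torch_version("2.1.0+cu121"))  # ('2.1', 'cu121')
print(split_torch_version("2.1.0"))        # ('2.1', 'unknown')
```

With the plain `split("+")[-1]` expression, a version string without a `+` suffix returns the whole version, so the printed "cuda" value should be read with that in mind.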

## Download Dataset

In [None]:
%pip install gdown
import gdown

download_link = 'https://drive.google.com/u/0/uc?id=1owRpmEzXdShPq1erPD_QTTXRatEaR_MZ&export=download'
# download to dataset folder

gdown.download(download_link, quiet=False)

download_link = 'https://drive.google.com/u/0/uc?id=1aX2HY1JF9iOS_bn4d8vilh-FwtQSTBxZ&export=download'
# download to dataset folder
gdown.download(download_link, quiet=False)

download_link = 'https://drive.google.com/u/0/uc?id=1KPqhqt9NVvk8dAjdXDkr4B4LbcJgyO81&export=download'
# download to dataset folder
gdown.download(download_link, quiet=False)

# unzip dataset
!unzip -q dataset.zip

# remove zip file
!rm dataset.zip

!mkdir -p dataset

# extract the tar.gz archive
!tar -xvzf FinnForest3.0.tar.gz

!mkdir -p dataset/train/images
!mkdir -p dataset/valid/images
!mkdir -p dataset/train/labels
!mkdir -p dataset/valid/labels

# !mv FinnForest3.0/rgb/train/* dataset/train/images/
# !mv FinnForest3.0/rgb/val/* dataset/valid/images/


!mv FinnForest3.0/train.json dataset/train/
!mv FinnForest3.0/val.json dataset/valid/


### Clone the Upscaling Code (Real-ESRGAN) and Install Requirements


In [None]:
!git clone https://github.com/xinntao/Real-ESRGAN.git
%cd Real-ESRGAN
# Set up the environment
!pip install basicsr
!pip install facexlib
!pip install gfpgan
!pip install -r requirements.txt
!python setup.py develop

### Upscale Training Data

In [None]:
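import os
import shutil

# Absolute paths: the working directory is now Real-ESRGAN, while the dataset
# was extracted one level up (assumes the notebook started in /content, the Colab default)
TRAIN_SRC_FOLDER = '/content/FinnForest3.0/rgb/train/'

RESULT_FOLDER = '/content/Real-ESRGAN/results/'

# start from an empty results folder
if os.path.isdir(RESULT_FOLDER):
    shutil.rmtree(RESULT_FOLDER)
os.mkdir(RESULT_FOLDER)

!python inference_realesrgan.py -n RealESRNet_x4plus.pth -i {TRAIN_SRC_FOLDER}

# move the upscaled images into the training set
!mv {RESULT_FOLDER}* /content/dataset/train/images/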

### Upscale Validation Data

In [None]:
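# Absolute path: the dataset sits outside the Real-ESRGAN working directory
# (assumes the notebook started in /content, the Colab default)
VALID_SRC_FOLDER = '/content/FinnForest3.0/rgb/val/'

# start from an empty results folder
if os.path.isdir(RESULT_FOLDER):
    shutil.rmtree(RESULT_FOLDER)
os.mkdir(RESULT_FOLDER)

!python inference_realesrgan.py -n RealESRNet_x4plus.pth -i {VALID_SRC_FOLDER}
!mv {RESULT_FOLDER}* /content/dataset/valid/images/

# return to the notebook root so the relative dataset paths used later resolve
%cd /content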


In [None]:
# COMMON LIBRARIES
import os
import cv2

from datetime import datetime
# from google.colab.patches import cv2_imshow

# DATA SET PREPARATION AND LOADING
from detectron2.data.datasets import register_coco_instances
from detectron2.data import DatasetCatalog, MetadataCatalog

# VISUALIZATION
from detectron2.utils.visualizer import Visualizer
from detectron2.utils.visualizer import ColorMode

# CONFIGURATION
from detectron2 import model_zoo
from detectron2.config import get_cfg

# EVALUATION
from detectron2.engine import DefaultPredictor

# TRAINING
from detectron2.engine import DefaultTrainer

## Run a Pre-trained Detectron2 Model

Before you start training, it's a good idea to check that everything is working properly. The best way to do this is to perform inference using a pre-trained model.

In [None]:
from google.colab.patches import cv2_imshow

!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O input.jpg
image = cv2.imread("input.jpg")
cv2_imshow(image)
# cv2.imshow('frame', image)
# cv2.waitKey(0)
# cv2.destroyAllWindows()

In [None]:
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)

In [None]:
print(outputs["instances"].pred_classes)
print(outputs["instances"].pred_boxes)

In [None]:
visualizer = Visualizer(image[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2_imshow(out.get_image()[:, :, ::-1])
# cv2.imshow('frame', out.get_image()[:, :, ::-1])
# cv2.waitKey(0)
# cv2.destroyAllWindows()

## COCO Format Dataset

### Download

We use the `finnforest-segmentation` dataset as an example.
The dataset directory should be structured like this:

```
dataset/
├─ train
│  ├─ images
│  │  ├─ train-image-1.jpg
│  │  ├─ train-image-2.jpg
│  │  └─ ...
│  └─ train.json
│
└─ valid
   ├─ images
   │  ├─ valid-image-1.jpg
   │  ├─ valid-image-2.jpg
   │  └─ ...
   └─ val.json
```
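
Before registering the dataset, it can help to confirm that the annotation file and the image folder agree. The following is a minimal sketch under the assumption that the JSON files follow the standard COCO layout (`images`, `annotations`, `categories`, and a `file_name` per image); the function name `check_coco_file` is hypothetical.

```python
import json
import os

def check_coco_file(json_path: str, image_root: str) -> list[str]:
    """Return the image file names listed in a COCO file but missing on disk."""
    with open(json_path) as f:
        coco = json.load(f)
    # A COCO detection file is expected to have these three top-level keys
    for key in ("images", "annotations", "categories"):
        if key not in coco:
            raise KeyError(f"{json_path} has no '{key}' section")
    return [
        img["file_name"]
        for img in coco["images"]
        if not os.path.isfile(os.path.join(image_root, img["file_name"]))
    ]
```

For this notebook the call would look like `check_coco_file("dataset/train/train.json", "dataset/train/images")`. An empty list means every annotated image was found. This is worth running after any step that rewrites images, since such tools may change file names.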

### Register

Before you train a model with Detectron2, you need to [register the dataset](https://detectron2.readthedocs.io/en/latest/tutorials/datasets.html#register-a-coco-format-dataset).

In [None]:
# TRAIN SET
TRAIN_DATA_SET_NAME = 'finnforest_train'
register_coco_instances(
    name=TRAIN_DATA_SET_NAME,
    metadata={},
    json_file='dataset/train/train.json',
    image_root='dataset/train/images'
)

In [None]:
# VALID SET
VALID_DATA_SET_NAME = 'finnforest_valid'
register_coco_instances(
    name=VALID_DATA_SET_NAME,
    metadata={},
    json_file='dataset/valid/val.json',
    image_root='dataset/valid/images'
)

We can now confirm that our custom dataset was correctly registered using [MetadataCatalog](https://detectron2.readthedocs.io/en/latest/modules/data.html#detectron2.data.MetadataCatalog).

In [None]:
[
    data_set
    for data_set
    in MetadataCatalog.list()
    if data_set.startswith('finnforest')
]

### Visualize

Let's take a look at a single entry from our training dataset.

In [None]:
metadata = MetadataCatalog.get(TRAIN_DATA_SET_NAME)
dataset_train = DatasetCatalog.get(TRAIN_DATA_SET_NAME)

dataset_entry = dataset_train[0]
image = cv2.imread(dataset_entry["file_name"])

visualizer = Visualizer(
    image[:, :, ::-1],
    metadata=metadata,
    scale=0.8,
    instance_mode=ColorMode.IMAGE_BW
)

out = visualizer.draw_dataset_dict(dataset_entry)
cv2_imshow(out.get_image()[:, :, ::-1])
# cv2.imshow('frame', out.get_image()[:, :, ::-1])
# cv2.waitKey(0)
# cv2.destroyAllWindows()

## Train Model Using Custom COCO Format Dataset

### Configuration

### Config Options

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | mask AP | model id | Config |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R50-C4 | 1x | 0.584 | 0.11 | 5.2 | 36.8 | 32.2 | 137259246 | mask_rcnn_R_50_C4_1x |
| R50-DC5 | 1x | 0.471 | 0.076 | 6.5 | 38.3 | 34.2 | 137260150 | mask_rcnn_R_50_DC5_1x |
| R50-FPN | 1x | 0.261 | 0.043 | 3.4 | 38.6 | 35.2 | 137260431 | mask_rcnn_R_50_FPN_1x |
| R50-C4 | 3x | 0.575 | 0.111 | 5.2 | 39.8 | 34.4 | 137849525 | mask_rcnn_R_50_C4_3x |
| R50-DC5 | 3x | 0.47 | 0.076 | 6.5 | 40 | 35.9 | 137849551 | mask_rcnn_R_50_DC5_3x |
| R50-FPN | 3x | 0.261 | 0.043 | 3.4 | 41 | 37.2 | 137849600 | mask_rcnn_R_50_FPN_3x |
| R101-C4 | 3x | 0.652 | 0.145 | 6.3 | 42.6 | 36.7 | 138363239 | mask_rcnn_R_101_C4_3x |
| R101-DC5 | 3x | 0.545 | 0.092 | 7.6 | 41.9 | 37.3 | 138363294 | mask_rcnn_R_101_DC5_3x |
| R101-FPN | 3x | 0.34 | 0.056 | 4.6 | 42.9 | 38.6 | 138205316 | mask_rcnn_R_101_FPN_3x |
| X101-FPN | 3x | 0.69 | 0.103 | 7.2 | 44.3 | 39.5 | 139653917 | mask_rcnn_X_101_32x8d_FPN_3x |


**Copy the value from the `Config` column of your chosen row into the `ARCHITECTURE` variable in the hyperparameters cell.**

### Which model to choose?
It depends on the task and on the resources available. With a large dataset and plenty of GPU resources, we can choose the model with the highest accuracy (for example `X101-FPN`). With a small dataset or limited resources, a lighter model such as `R50-FPN` is a better fit: it trains faster, needs less memory, and gives up only a few AP points.

The important metrics in the table above are:
- **train time**: how long it takes to train the model for one iteration (in seconds)
- **inference time**: how long it takes to perform inference on one image (in seconds)
- **train mem**: how much GPU memory it takes to train the model for one iteration (in GB)
- **box AP**: average precision for bounding box detection
- **mask AP**: average precision for instance segmentation
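
The trade-off described above can be sketched as a small selection helper. This is only an illustration: the rows are a subset copied from the table, the function name `pick_config` and the budget values are hypothetical, and the table figures come from benchmark runs, so they should be treated as relative rather than as what your own GPU will show.

```python
# (name, config, train mem GB, inference s/im, mask AP), values copied from the table above
MODELS = [
    ("R50-FPN 1x", "mask_rcnn_R_50_FPN_1x", 3.4, 0.043, 35.2),
    ("R50-FPN 3x", "mask_rcnn_R_50_FPN_3x", 3.4, 0.043, 37.2),
    ("R101-FPN 3x", "mask_rcnn_R_101_FPN_3x", 4.6, 0.056, 38.6),
    ("X101-FPN 3x", "mask_rcnn_X_101_32x8d_FPN_3x", 7.2, 0.103, 39.5),
]

def pick_config(max_mem_gb: float, max_inference_s: float) -> str:
    """Return the config with the best mask AP that fits both budgets."""
    fitting = [m for m in MODELS if m[2] <= max_mem_gb and m[3] <= max_inference_s]
    if not fitting:
        raise ValueError("no model fits the given budgets")
    return max(fitting, key=lambda m: m[4])[1]

print(pick_config(max_mem_gb=5.0, max_inference_s=0.06))  # mask_rcnn_R_101_FPN_3x
```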

In [None]:
# HYPERPARAMETERS
ARCHITECTURE = "mask_rcnn_R_101_FPN_3x" # Pick from the above table config section
CONFIG_FILE_PATH = f"COCO-InstanceSegmentation/{ARCHITECTURE}.yaml"
MAX_ITER = 100000
EVAL_PERIOD = 200
BASE_LR = 0.001
NUM_CLASSES = 8

# OUTPUT DIR
OUTPUT_DIR_PATH = os.path.join(
    'runs_detectron',
    ARCHITECTURE,
    datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
)

os.makedirs(OUTPUT_DIR_PATH, exist_ok=True)

In [None]:
VALID_DATA_SET_NAME='finnforest_valid'

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(CONFIG_FILE_PATH))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(CONFIG_FILE_PATH)
cfg.DATASETS.TRAIN = (TRAIN_DATA_SET_NAME,)
cfg.DATASETS.TEST = (VALID_DATA_SET_NAME,)
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 64
cfg.TEST.EVAL_PERIOD = EVAL_PERIOD
cfg.DATALOADER.NUM_WORKERS = 2
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.INPUT.MASK_FORMAT='bitmask'
cfg.SOLVER.BASE_LR = BASE_LR
cfg.SOLVER.MAX_ITER = MAX_ITER
cfg.MODEL.ROI_HEADS.NUM_CLASSES = NUM_CLASSES
cfg.OUTPUT_DIR = OUTPUT_DIR_PATH
print(cfg)
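
Detectron2 counts training length in iterations rather than epochs, and each iteration processes `IMS_PER_BATCH` images. The sketch below shows the arithmetic for converting between the two. The dataset size of 5000 images is a made-up number; in this notebook the real value would be `len(dataset_train)`.

```python
import math

def iterations_to_epochs(max_iter: int, ims_per_batch: int, num_images: int) -> float:
    """Number of passes over the training set made by max_iter iterations."""
    return max_iter * ims_per_batch / num_images

def epochs_to_iterations(epochs: float, ims_per_batch: int, num_images: int) -> int:
    """Iterations needed for the requested number of passes (rounded up)."""
    return math.ceil(epochs * num_images / ims_per_batch)

# 5000 is a made-up dataset size
print(iterations_to_epochs(100000, 2, 5000))  # 40.0
print(epochs_to_iterations(10, 2, 5000))      # 25000
```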

### Training

In [None]:
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()

### Result and metrics

After a successful training run, the weights and metrics are saved to `runs_detectron/{ARCHITECTURE}/{TIMESTAMP}`.
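
Besides TensorBoard, the logged metrics can be read directly. Detectron2's default trainer writes a `metrics.json` file to the output directory with one JSON object per line. The sketch below parses that layout; the helper names are hypothetical, and the `iteration` and `total_loss` keys are assumptions about what your run logged.

```python
import json

def read_metrics(path: str) -> list[dict]:
    """Parse a metrics file with one JSON object per line, skipping blank lines."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

def loss_curve(records: list[dict], key: str = "total_loss") -> list[tuple[int, float]]:
    """Extract (iteration, value) pairs for records that contain the key."""
    return [(r["iteration"], r[key]) for r in records if key in r and "iteration" in r]
```

In this notebook the call would be `loss_curve(read_metrics(os.path.join(OUTPUT_DIR_PATH, "metrics.json")))`. Records without the requested key, such as evaluation-only entries, are skipped.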


In [None]:
# Look at training curves in tensorboard:
%load_ext tensorboard
%tensorboard --logdir $OUTPUT_DIR_PATH

### Evaluation

In [None]:
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7
predictor = DefaultPredictor(cfg)

**Visualization on Validation Data**

In [None]:
dataset_valid = DatasetCatalog.get(VALID_DATA_SET_NAME)

for d in dataset_valid:
    img = cv2.imread(d["file_name"])
    outputs = predictor(img)

    visualizer = Visualizer(
        img[:, :, ::-1],
        metadata=metadata,
        scale=0.8,
        instance_mode=ColorMode.IMAGE_BW
    )
    out = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])
#     cv2.imshow('frame', out.get_image()[:, :, ::-1])
#     cv2.waitKey(0)
# cv2.destroyAllWindows()

**Evaluation based on mAP Metrics**

In [None]:
from detectron2.evaluation import COCOEvaluator, inference_on_dataset, DatasetEvaluators, DatasetEvaluator
from detectron2.data import build_detection_test_loader


In [None]:
import numpy as np
from detectron2.data import DatasetCatalog, MetadataCatalog

def bb_intersection_over_union(boxA, boxB):
    """IoU of two boxes given as [x1, y1, x2, y2] (inclusive pixel coordinates)."""
    # corners of the intersection rectangle
    xA = max(boxA[0], boxB[0])
    yA = max(boxA[1], boxB[1])
    xB = min(boxA[2], boxB[2])
    yB = min(boxA[3], boxB[3])
    # area of the intersection rectangle (zero when the boxes do not overlap)
    interArea = max(0, xB - xA + 1) * max(0, yB - yA + 1)
    # areas of the two input boxes
    boxAArea = (boxA[2] - boxA[0] + 1) * (boxA[3] - boxA[1] + 1)
    boxBArea = (boxB[2] - boxB[0] + 1) * (boxB[3] - boxB[1] + 1)
    # intersection divided by union (sum of both areas minus the intersection)
    return interArea / float(boxAArea + boxBArea - interArea)

class Counter(DatasetEvaluator):
    """Counts detections and derives a simple box precision/recall at IoU >= 0.5."""

    def __init__(self, dataset_name):
        self._metadata = MetadataCatalog.get(dataset_name)
        # ground-truth boxes per image; COCO annotations store them as [x, y, w, h]
        self.gt_boxes = {
            d["image_id"]: [a["bbox"] for a in d["annotations"]]
            for d in DatasetCatalog.get(dataset_name)
        }
        self.reset()

    def reset(self):
        self.count = 0
        self.tp = 0
        self.detections = 0
        self.all_gts = 0

    def process(self, inputs, outputs):
        for inp, output in zip(inputs, outputs):
            # predicted boxes are already in [x1, y1, x2, y2] format
            preds = output["instances"].pred_boxes.tensor.cpu().numpy()
            targets = np.array(self.gt_boxes.get(inp["image_id"], []))
            self.count += len(preds)
            self.detections += len(preds)
            self.all_gts += len(targets)

            for x1, y1, w, h in targets:
                gt = [x1, y1, x1 + w, y1 + h]  # XYWH -> XYXY
                # a ground-truth box counts as a true positive at most once
                if any(bb_intersection_over_union(gt, p) >= 0.5 for p in preds):
                    self.tp += 1

    def evaluate(self):
        # guard against division by zero when nothing was detected or annotated
        precision = self.tp / self.detections if self.detections else 0.0
        recall = self.tp / self.all_gts if self.all_gts else 0.0
        return {"Instances": self.count, "tp": self.tp, "P": precision, "R": recall}
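
To make the IoU matching concrete, here is a small worked example in plain Python. It uses the same inclusive-pixel (+1) convention as the function above. COCO annotations store boxes as `[x, y, width, height]`, while Detectron2's `pred_boxes` hold `[x1, y1, x2, y2]`, so only the annotations need converting. The helper names and the box values are made up for illustration.

```python
def xywh_to_xyxy(box):
    """Convert [x, y, width, height] to [x1, y1, x2, y2]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]

def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes, using inclusive pixel coordinates (+1)."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)

gt = xywh_to_xyxy([10, 10, 20, 20])          # -> [10, 10, 30, 30]
print(round(iou(gt, [12, 12, 32, 32]), 3))   # 0.693, counts as a match at 0.5
print(round(iou(gt, [15, 15, 35, 35]), 3))   # 0.409, does not
```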

In [None]:
COCOEvaluator(VALID_DATA_SET_NAME, cfg, False, output_dir="./output/")

In [None]:
import numpy as np

In [None]:
# Combine the COCO evaluator with the custom counter and evaluate on the validation set
evaluator = DatasetEvaluators([
    COCOEvaluator(VALID_DATA_SET_NAME, cfg, False, output_dir="./output/"),
    Counter(VALID_DATA_SET_NAME),
])
val_loader = build_detection_test_loader(cfg, VALID_DATA_SET_NAME)

# Run inference with the model wrapped by the predictor created earlier
values = inference_on_dataset(predictor.model, val_loader, evaluator)

**mAP for object detection**

In [None]:
values['bbox']

AP50

In [None]:
values['bbox']['AP50']

AP50:95

In [None]:
values['bbox']['AP']
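
The difference between the two numbers is the IoU threshold: AP50 is the average precision at a single IoU threshold of 0.50, while AP50:95 (reported under the `AP` key) averages the AP over ten thresholds from 0.50 to 0.95. A minimal sketch of that averaging follows; the per-threshold AP values are made up for illustration.

```python
# IoU thresholds 0.50, 0.55, ..., 0.95 used by the COCO "AP" metric
thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]

# made-up per-threshold AP values, for illustration only
ap_per_threshold = [60, 57, 54, 50, 45, 39, 32, 24, 14, 5]

ap50 = ap_per_threshold[thresholds.index(0.5)]
ap50_95 = sum(ap_per_threshold) / len(ap_per_threshold)
print(thresholds)
print(ap50, ap50_95)  # 60 38.0
```

Because the stricter thresholds pull the average down, AP50:95 is lower than AP50 in practice.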

**mAP for segmentation**

In [None]:
values['segm']

AP50

In [None]:
values['segm']['AP50']

AP50:95

In [None]:
values['segm']['AP']

**Precision & recall**

In [None]:
values['P']

In [None]:
values['R']
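
For reference, the two values follow directly from the counts collected by the custom evaluator: precision is the share of predictions that matched an annotated object, and recall is the share of annotated objects that were found. The sketch below shows the arithmetic with made-up counts; the function name is hypothetical.

```python
def precision_recall(tp: int, detections: int, ground_truths: int) -> tuple[float, float]:
    """Precision = tp / detections, recall = tp / ground truths (0.0 when undefined)."""
    precision = tp / detections if detections else 0.0
    recall = tp / ground_truths if ground_truths else 0.0
    return precision, recall

# made-up counts: 80 matched boxes, 100 predictions, 160 annotated objects
print(precision_recall(80, 100, 160))  # (0.8, 0.5)
print(precision_recall(0, 0, 10))      # (0.0, 0.0)
```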