# Robot@Home 2 - Segmentation with Detectron2 `v0.1`

`R@H2 notebook series`   

<a href="https://colab.research.google.com/github/goyoambrosio/RobotAtHome2/blob/master/notebooks/150-Segmentation-with-Detectron2.ipynb"><img align="left" src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab" title="Open and Execute in Google Colaboratory"></a>



## Introduction

This notebook presents a new example of data processing performed on our mobile robotic dataset. Specifically, we demonstrate how to perform segmentation using Detectron2, a ground-up rewrite of Facebook AI Research's Detectron object detection platform.

## Downloading R@H2 locally

As we have already shown in previous examples, we can install the dataset in the local storage space of the virtual machine, which allows us to apply functions locally and execute them at the highest processing speed.
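Because the archives are large, it may be worth checking the free space on the virtual machine before starting the download. The following is a minimal sketch using only the standard library. The helper name `has_free_space` and the 30 GB budget are assumptions made for illustration and are not part of the Robot@Home toolbox.

```python
import shutil

def has_free_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3

# "." is used so the sketch runs anywhere; in Colab you would pass "/content".
# The 30 GB figure is a hypothetical budget, adjust it to your own setup.
print(has_free_space(".", 30))
```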


The next cell downloads the R@H2 dataset to local storage. It takes about 8 minutes.

In [1]:
!mkdir -p /content/R@H2/files
!mkdir -p /content/WORKSPACE

%cd /content/WORKSPACE
!gdown 1qmvhcPrMTNKtWpGfWQHUPVqE67356N6I # .rh
%cd /content/R@H2
!gdown 17Mt2KzwJMjvagrP1X-Q0SVbAGUc1M7TW # Robot@Home2_db.tgz
!gdown 1zdG4zA55MrQ6r12L_Fg2KIJgnpD479rf # Robot@Home2_files.tgz

!echo Decompressing database
!tar -xzf Robot@Home2_db.tgz
!echo Decompressing files
!echo '|================================================> 100%'
!tar -xzf Robot@Home2_files.tgz -C ./files --checkpoint=.50000

!rm /content/R@H2/Robot@Home2_db.tgz
!rm /content/R@H2/Robot@Home2_files.tgz

%cd /content/WORKSPACE

/content/WORKSPACE
Downloading...
From: https://drive.google.com/uc?id=1qmvhcPrMTNKtWpGfWQHUPVqE67356N6I
To: /content/WORKSPACE/.rh
100% 211/211 [00:00<00:00, 1.36MB/s]
/content/R@H2
Downloading...
From: https://drive.google.com/uc?id=17Mt2KzwJMjvagrP1X-Q0SVbAGUc1M7TW
To: /content/R@H2/Robot@Home2_db.tgz
100% 517M/517M [00:07<00:00, 69.0MB/s]
Downloading...
From: https://drive.google.com/uc?id=1zdG4zA55MrQ6r12L_Fg2KIJgnpD479rf
To: /content/R@H2/Robot@Home2_files.tgz
100% 12.5G/12.5G [01:34<00:00, 133MB/s]
Decompressing database
Decompressing files
................................................../content/WORKSPACE


We have finished the process with the following folder structure:

    /content
     ├─── /R@H2
     │    ├── /files
     │    │    ├── /rgbd
     │    │    └── /scene
     │    └── rh.db
     └─── /WORKSPACE
          └── .rh
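
A quick way to confirm that the download and extraction produced this layout is to check each expected path. This is a small sketch with the standard library. The helper `missing_paths` is a hypothetical name, and the list of paths is taken from the tree above.

```python
from pathlib import Path

# Paths expected after the download cell, relative to the root (/content in Colab).
EXPECTED = [
    "R@H2/rh.db",
    "R@H2/files/rgbd",
    "R@H2/files/scene",
    "WORKSPACE/.rh",
]

def missing_paths(root, expected=EXPECTED):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]

# An empty list means everything is in place.
print(missing_paths("/content"))
```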

## Mounting Google Drive

Remember that Colab's storage is not persistent, so we optionally mount our Google Drive to provide a permanent storage space.

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive
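
Once Drive is mounted, results such as the generated videos can be copied there so they survive the end of the session. The sketch below assumes the usual `/content/drive/MyDrive` location. The helper `copy_to_drive` and the folder name `R@H2_videos` are hypothetical.

```python
import shutil
from pathlib import Path

def copy_to_drive(src, drive_dir="/content/drive/MyDrive/R@H2_videos"):
    """Copy `src` into `drive_dir`, creating the folder if needed."""
    drive_dir = Path(drive_dir)
    drive_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy2 keeps file metadata and returns the destination path
    return Path(shutil.copy2(src, drive_dir))

# Usage (after a video has been written to the workspace):
# copy_to_drive(video_filename)
```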


## Installing needed packages

Now we install the required packages, starting with the Robot@Home package.


In [2]:
!pip install robotathome --quiet

In [4]:
import robotathome
print(robotathome.__version__)

1.1.8


Installing Detectron2 takes about another 4 minutes.

In [5]:
!python -m pip install 'git+https://github.com/facebookresearch/detectron2.git' --quiet

  Preparing metadata (setup.py) ... done
  Preparing metadata (setup.py) ... done
  Preparing metadata (setup.py) ... done
  Building wheel for detectron2 (setup.py) ... done
  Building wheel for fvcore (setup.py) ... done
  Building wheel for antlr4-python3-runtime (setup.py) ... done


## Main code

With the environment set up, we can move on to the main code. This example assumes some experience with Robot@Home2 and Python, so we won't go into all the details.

First, we import the necessary libraries. [Tqdm](https://tqdm.github.io/) is a library for creating progress bars in Python.

In [6]:
# import some common libraries
import os
import datetime as dt
from tqdm import tqdm
import matplotlib.pyplot as plt
import cv2 as cv
# cv2.imshow does not work in Colab, so Colab provides a patched replacement
# Outside Colab you would call cv2.imshow directly
from google.colab.patches import cv2_imshow

### Robot@Home2

It's time for Robot@Home2. As usual, we import some necessary functions from the package.

In [7]:
from robotathome import RobotAtHome
from robotathome import filter_sensor_observations
from robotathome import composed_RGBD_images, concat_images, process_image
from robotathome import logger, log

Running experiments produces a large number of videos, and naming them by hand quickly becomes tedious. The next function takes the dataset selection parameters and returns a filename that joins them with the current date and time, so that names generated at different times do not collide. For example: `pare-s1_0_pare_livingroom1_RGBD_1_20230429202717.mp4`

In [8]:
def get_video_filename(rh_dataset: RobotAtHome,
                       home_session_name: str,
                       home_subsession: str,
                       room_name: str,
                       sensor_name: str):
    """Return a filename composed by arguments and current time.

    The filename extension is .mp4
    """
    video_filename = ''.join(
        [
            home_session_name,
            '_', str(home_subsession),
            '_', room_name,
            '_', sensor_name,
            dt.datetime.now().strftime("_%Y%m%d%H%M%S"),
            '.mp4'
        ])

    [_, _, _, wspc_path, _] = rh_dataset.get_path_vars()
    video_path_filename = os.path.abspath(os.path.join(wspc_path,
                                                       video_filename))
    return video_path_filename
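
To make the naming scheme concrete, here is a standalone sketch that reproduces the example name without needing the dataset. The helper `timestamped_name` is hypothetical, and a fixed date is passed in so that the result is predictable.

```python
import datetime as dt

def timestamped_name(parts, when, extension=".mp4"):
    """Join `parts` with underscores and append a YYYYmmddHHMMSS timestamp."""
    stem = "_".join(str(p) for p in parts)
    return f"{stem}{when.strftime('_%Y%m%d%H%M%S')}{extension}"

name = timestamped_name(
    ["pare-s1", 0, "pare_livingroom1", "RGBD_1"],
    dt.datetime(2023, 4, 29, 20, 27, 17),
)
print(name)  # pare-s1_0_pare_livingroom1_RGBD_1_20230429202717.mp4
```

Note that the timestamp has a resolution of one second, so two names generated within the same second for the same parameters would be identical.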

As in previous examples, we introduce a function to get a video handler.

In [9]:
def get_rh_video_handler(rh_dataset, filename, sensor_names):
    """Return a video handler."""
    sensor_size = rh_dataset.get_RGBD_sensor_size()
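    # 'mp4v' matches the .mp4 extension produced by get_video_filename;
    # use 'MJPG' instead if you switch to an .avi container
    fourcc = cv.VideoWriter_fourcc(*'mp4v')
    # Frame size is (width, height): sensor images are placed side by side
    frame_size = (len(sensor_names) * sensor_size['w'], sensor_size['h'])
    out = cv.VideoWriter(filename,
                         fourcc,
                         rh_dataset.get_RGBD_fps(),
                         frame_size)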
    return out
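
The frame size passed to the video writer is given as (width, height), and the width is multiplied by the number of sensors because the images are placed side by side. As far as we know, OpenCV generally does not raise an error when a frame of a different size is written, so the resulting file can end up empty or unplayable. The sketch below shows a simple check. Both helper names and the sensor dimensions are illustrative assumptions.

```python
def composed_frame_size(sensor_size, n_sensors):
    """Return (width, height) of n_sensors images placed side by side."""
    return (n_sensors * sensor_size["w"], sensor_size["h"])

def frame_fits(frame_shape, frame_size):
    """Check a NumPy-style (rows, cols, channels) shape against (width, height)."""
    rows, cols = frame_shape[:2]
    return (cols, rows) == tuple(frame_size)

# Illustrative dimensions, not read from the dataset
size = composed_frame_size({"w": 240, "h": 320}, 4)
print(size)                             # (960, 320)
print(frame_fits((320, 960, 3), size))  # True
```

In the notebook, a check such as `frame_fits(resulting_img.shape, ...)` before writing a frame could help diagnose an empty output video.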

As we already know, to work with the dataset we instantiate the RobotAtHome class and extract a subset of RGBD observations.

In [10]:
log.set_log_level('INFO')  # SUCCESS is the default

# Set the variables that make up the selection filter
home_session_name = 'pare-s1'
home_subsession = 0
room_name = 'pare_livingroom1'
sensor_list = ['RGBD_3', 'RGBD_4', 'RGBD_1', 'RGBD_2'] # Left to right order

# Create an instance of the RobotAtHome class
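try:
    rh = RobotAtHome()
except Exception:
    # Without a dataset instance the rest of the notebook cannot run
    logger.error("Could not open the Robot@Home2 dataset")
    raise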

# Get the labeled RGB-D observations dataframe
df = rh.get_sensor_observations('lblrgbd')

# Filter the dataframe and get a dictionary with a dataframe per sensor
df_dict = filter_sensor_observations(rh, df,
                                     home_session_name,
                                     home_subsession,
                                     room_name,
                                     sensor_list)

logger.info(f"Labeled RGBD set: {len(df)} observations")
for sensor_name in sensor_list:
    logger.info(f"No. of RGBD observations in the filtered subset for the sensor {sensor_name}: {len(df_dict[sensor_name])} observations")

2023-05-01 20:41:27.224 | SUCCESS  | robotathome.core.reader:__open_dataset:141 - Connection is established: rh.db
2023-05-01 20:41:27.628 | INFO     | __main__:<cell line: 25>:25 - Labeled RGBD set: 32937 observations
2023-05-01 20:41:27.629 | INFO     | __main__:<cell line: 26>:27 - No. of RGBD observations in the filtered subset for the sensor RGBD_3: 292 observations
2023-05-01 20:41:27.632 | INFO     | __main__:<cell line: 26>:27 - No. of RGBD observations in the filtered subset for the sensor RGBD_4: 292 observations
2023-05-01 20:41:27.635 | INFO     | __main__:<cell line: 26>:27 - No. of RGBD observations in the filtered subset for the sensor RGBD_1: 292 observations
2023-05-01 20:41:27.639 | INFO

### Detectron2

We are now ready to work with Detectron2. We start by importing the necessary functions.

In [11]:
# Setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common detectron2 utilities
from detectron2.config import get_cfg
from detectron2 import model_zoo

from detectron2.engine import DefaultPredictor
from detectron2.utils.video_visualizer import VideoVisualizer
from detectron2.utils.visualizer import ColorMode, Visualizer
from detectron2.data import MetadataCatalog

The model can easily be configured for different tasks, and the next two cells show how. The first creates a configuration for object detection, so the resulting video shows the classic bounding boxes around detected objects. The second creates a configuration for instance segmentation, so the resulting video shows colored regions over the segmented objects.

Run only the cell for the task you choose.

In [None]:
# Create config for object detection
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set threshold for this model
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml")

In [12]:
# Create config for object segmentation
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set threshold for this model
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
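
As an alternative to keeping two near-identical cells, the choice of task could be made with a single parameter. The following is a sketch under that assumption. The names `ZOO_CONFIGS` and `build_cfg` are hypothetical, while the model zoo config names and the Detectron2 calls are the same ones used in the two cells above.

```python
# Model zoo config names used in this notebook, keyed by task.
ZOO_CONFIGS = {
    "detection": "COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml",
    "segmentation": "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml",
}

def build_cfg(task, score_threshold=0.5):
    """Build a Detectron2 config for `task` ('detection' or 'segmentation')."""
    if task not in ZOO_CONFIGS:
        raise ValueError(f"unknown task {task!r}; choose from {sorted(ZOO_CONFIGS)}")
    # Imported here so the mapping above can be inspected without Detectron2.
    from detectron2 import model_zoo
    from detectron2.config import get_cfg

    config_name = ZOO_CONFIGS[task]
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(config_name))
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = score_threshold
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_name)
    return cfg

# Usage: cfg = build_cfg("segmentation")
```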

The next cell creates the predictor for the chosen configuration. A `VideoVisualizer` is also created to draw the bounding boxes or colored regions over each image.

In [13]:
# Create predictor
predictor = DefaultPredictor(cfg)

# Initialize visualizer
v = VideoVisualizer(MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), ColorMode.IMAGE)

[05/01 20:41:50 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl ...


model_final_f10217.pkl: 178MB [00:01, 120MB/s]                           


In the next cell, we define a function that applies the predictor (that is, runs object detection or segmentation) to the composed image of the selected camera sensors.

In [14]:
def apply_model_detectron_composed(img_dict, predictor, v):
    """Run the predictor on the side-by-side composition of the sensor images."""
    # Concatenate the images of all sensors horizontally (dict order is kept)
    composed_img = cv.hconcat(list(img_dict.values()))

    # Make prediction (DefaultPredictor expects BGR input by default)
    outputs = predictor(composed_img)
    # The visualizer works in RGB, so swap the channel order
    # (assuming the images are in OpenCV's BGR order)
    composed_img = cv.cvtColor(composed_img, cv.COLOR_BGR2RGB)
    # Draw a visualization of the predictions using the video visualizer
    visualization = v.draw_instance_predictions(composed_img, outputs["instances"].to("cpu"))
    # Convert the visualizer's RGB output back to OpenCV BGR format
    processed_img = cv.cvtColor(visualization.get_image(), cv.COLOR_RGB2BGR)
    return processed_img
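
Besides drawing them, the predictions can also be inspected numerically. In Detectron2, `outputs["instances"]` carries fields such as `pred_classes` and `scores`, and the class names are available through `MetadataCatalog.get(cfg.DATASETS.TRAIN[0]).thing_classes`. The sketch below counts detections per class from plain lists. The helper `count_detections` and the sample values are illustrative assumptions, not output from the notebook.

```python
from collections import Counter

def count_detections(class_ids, scores, class_names, min_score=0.5):
    """Count detections per class name, ignoring scores below `min_score`."""
    counts = Counter()
    for class_id, score in zip(class_ids, scores):
        if score >= min_score:
            counts[class_names[class_id]] += 1
    return dict(counts)

# Stand-in values; real ones would come from outputs["instances"], e.g.
# instances.pred_classes.tolist() and instances.scores.tolist()
names = ["person", "chair", "couch"]
print(count_detections([1, 1, 2, 0], [0.9, 0.7, 0.4, 0.95], names))
# {'chair': 2, 'person': 1}
```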

### Main loop

We now apply the predictor to all selected images in the dataset. The main loop below passes the previous function as an argument so that it is applied to each frame.

The `tqdm` wrapper displays the progress of the loop, and the expected number of iterations is shown in its description.

In [16]:
# Building video filename
video_filename = get_video_filename(rh,
                                    home_session_name,
                                    home_subsession,
                                    room_name,
                                    '-'.join(sensor_list))
# Getting video handler
video_handler = get_rh_video_handler(rh,
                                     video_filename,
                                     sensor_list)

# #############################################
#                Main loop
# #############################################

# Function applied to the RGB image dictionary of each frame
def f(img_dict):
    # return my_function(img_dict, par1,... parn)
    return apply_model_detectron_composed(img_dict, predictor, v)

# Iterate over the dictionary of dataframes, i.e. frame by frame
for (RGB_image_dict, D_image_dict) in tqdm(composed_RGBD_images(rh, df_dict),
                                           desc=f"Wait for {len(df_dict[sensor_list[0]])} iterations"):
    # Apply f to img_dict
    resulting_img = process_image(f, RGB_image_dict)
    # cv2_imshow(resulting_img)
    video_handler.write(resulting_img)

# Closing video file
video_handler.release()

Wait for 292 iterations: 292it [01:39,  2.94it/s]
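
If the loop is interrupted or raises an error, the video file is only finalized if `release()` still gets called. One way to guarantee this is a small context manager, sketched below. The names `releasing` and `DummyWriter` are hypothetical, and the dummy class stands in for `cv.VideoWriter` so that the sketch runs without OpenCV.

```python
from contextlib import contextmanager

@contextmanager
def releasing(handler):
    """Yield `handler` and call its release() method on exit, even on error."""
    try:
        yield handler
    finally:
        handler.release()

class DummyWriter:
    """Stand-in for cv.VideoWriter used only to illustrate the pattern."""
    def __init__(self):
        self.frames, self.released = [], False
    def write(self, frame):
        self.frames.append(frame)
    def release(self):
        self.released = True

with releasing(DummyWriter()) as writer:
    writer.write("frame-0")
print(writer.released)  # True
```

In the notebook, the main loop would go inside `with releasing(video_handler):`, replacing the explicit `video_handler.release()` call.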
