<a href="https://colab.research.google.com/github/AntonBaumannDE/detectron2_DTS19/blob/master/DigitalTechSummit.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Overview


**Instance and Keypoint Segmentation on the same Video.**

**Notebook was made during the Digital Tech Summit Hackathon in 2019.**

When the hackathon took place, Detectron2 had been released only a few hours earlier. I thought it would be a cool idea to run both instance and keypoint segmentation on the same video. This notebook lets you create some awesome visualizations for your own videos.

How to use it: 

1.   Mount your Google Drive so you can import your videos (Just run the first code snippet)
2.   Define your directories and run everything
3.   Put the frames back into a video

As soon as I have some spare time, I still need to fix the color variations between frames.
You could solve this by enabling H.264 encoding for ffmpeg and compiling OpenCV in the Colab runtime. Detectron2 also has a separate visualizer for videos, which would make running inference on videos much easier and faster than what I did during the hackathon.


Many thanks to Yuxin Wu for taking the time to answer my questions and for setting up this decent tutorial on colab: https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5


//Anton





# Define some input paths

In [0]:
from google.colab import drive
drive.mount('/content/drive')

**Let's define the directories so we can load our custom *video*:**

In [0]:
import os
inputpath = '/content/drive/My Drive/video.mp4'                        # Path to the video you want to process
storagepath = '/content/drive/My Drive/Colab Notebooks/Frames'         # Frames after instance segmentation
outputpath = '/content/drive/My Drive/Colab Notebooks/finalGripper'    # Frames after keypoint detection
# exist_ok=True lets you rerun this cell without a FileExistsError
os.makedirs(storagepath, exist_ok=True)
os.makedirs(outputpath, exist_ok=True)

# Install Detectron2

In [0]:
!pip install -U torch torchvision
import torch, torchvision
torch.__version__
!pip install git+https://github.com/facebookresearch/fvcore.git

In [0]:
!git clone https://github.com/facebookresearch/detectron2 detectron2_repo
!pip install -e detectron2_repo

In [0]:
# You may need to restart your runtime prior to this, to let your installation take effect
# Some basic setup
# Setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries
import matplotlib.pyplot as plt
import numpy as np
import cv2
from google.colab.patches import cv2_imshow

# Instance Segmentation

In [0]:
cap = cv2.VideoCapture(inputpath)
if not cap.isOpened():
  print("Error opening video stream or file")
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))    # property id 3
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))  # property id 4

from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer, ColorMode
from detectron2.data import MetadataCatalog

cfg = get_cfg()
cfg.merge_from_file("./detectron2_repo/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5                                      # set threshold for this model
cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl"
predictor = DefaultPredictor(cfg)
i = 0
while i < 300:                                                                   # Process at most the first 300 frames of the video
  ret, frame = cap.read()
  if ret == True:
    #frame = np.rot90(frame)
    #frame = np.rot90(frame)
    #frame = np.rot90(frame)
    outputs = predictor(frame)
    
    # OpenCV frames are BGR, but the Visualizer expects RGB, so flip the channels
    v = Visualizer(frame[:, :, ::-1], MetadataCatalog.get("coco_2017_val"), scale=1.5, instance_mode=ColorMode.IMAGE_BW)
    v = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    v = v.get_image()[:, :, ::-1]
    cv2.imwrite(storagepath + '/result_%04i.jpg' %i, v )
    i+=1
    if i == 3:
      print("Processing the video! Here is a frame:")
      cv2_imshow(v)
    
  # Break the loop
  else:
    break 
 
# When everything is done, release the video capture object
cap.release()
# Close any OpenCV windows (only relevant when running locally)
cv2.destroyAllWindows()
print('Done')

In [0]:
# Look at the outputs of the last processed frame
print(outputs["instances"].pred_classes)
print(outputs["instances"].pred_boxes)
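
The `[:, :, ::-1]` slices in the loop above convert between channel orders: OpenCV reads frames as BGR, while detectron2's `Visualizer` expects RGB. Here is a minimal NumPy sketch of that flip. The two-pixel image is made up and `swap_channels` is a hypothetical helper name, not part of the notebook.

```python
import numpy as np


def swap_channels(image):
    """Reverse the last axis of an H x W x 3 image: BGR -> RGB or RGB -> BGR."""
    return image[:, :, ::-1]


# A 1x2 test image in OpenCV's BGR order: one pure-blue and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
rgb = swap_channels(bgr)

print(rgb[0, 0].tolist())  # the blue pixel, now in RGB order
print(rgb[0, 1].tolist())  # the red pixel, now in RGB order
```

Applying the flip twice returns the original array, which is why the loop flips once before drawing and once more before `cv2.imwrite`.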

# Additional Keypoint Segmentation



In [0]:
#check how many frames were processed and stored
import os
path, dirs, files = next(os.walk(storagepath))
frame_count = len(files)
print(frame_count)
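
Counting files does not tell you whether frames are missing or in which order they will be read later. The sketch below assumes the `result_%04i.jpg` naming used above; it sorts frame names by their number and lists any gaps. `frame_index`, `sort_frames` and `missing_frames` are illustrative helper names, not part of the notebook.

```python
import re

# Matches names such as result_0007.jpg and captures the frame number.
FRAME_PATTERN = re.compile(r"result_(\d+)\.jpg$")


def frame_index(filename):
    """Return the integer frame number encoded in the file name."""
    match = FRAME_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"not a frame file: {filename}")
    return int(match.group(1))


def sort_frames(filenames):
    """Sort frame names by frame number instead of relying on listing order."""
    return sorted(filenames, key=frame_index)


def missing_frames(filenames):
    """List frame numbers between 0 and the highest one that have no file."""
    indices = {frame_index(name) for name in filenames}
    if not indices:
        return []
    return [i for i in range(max(indices) + 1) if i not in indices]


names = ["result_0010.jpg", "result_0002.jpg", "result_0000.jpg", "result_0001.jpg"]
print(sort_frames(names))
print(missing_frames(names))
```

In the notebook you could pass `os.listdir(storagepath)` instead of the hard-coded `names` list.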

In [0]:
# Here you can test a crop on a single frame and then apply it in the loop below
im = cv2.imread(storagepath + "/result_0000.jpg")  # a frame written by the instance segmentation step
assert im is not None, 'image not found'
im = im[0:1080, 0:1620]  # rows (top to bottom), then columns (left to right)
cv2_imshow(im)

In [0]:
import numpy as np
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog
from detectron2.utils.visualizer import ColorMode

cfg = get_cfg()
cfg.merge_from_file("./detectron2_repo/configs/COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # set threshold for this model
cfg.MODEL.WEIGHTS = "detectron2://COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x/137849621/model_final_a6e10b.pkl"
predictor = DefaultPredictor(cfg)

import glob
i = 0
# Sort the paths so the frames keep their original order
for frame_path in sorted(glob.glob(storagepath + '/*.jpg')):
    frame = cv2.imread(frame_path)
    outputs = predictor(frame)
    v = Visualizer(frame[:,:,::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.5)
    v = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    v = v.get_image()[:, :, ::-1]
    #v = v[0:1080, 300:1500] #in case you want to crop a video
    cv2.imwrite(outputpath + '/result_%04i.jpg' %i, v)
    i += 1
    if i == 3:
        print('Processing your images!')

# Closes all the frames
cv2.destroyAllWindows() 
  

In [0]:
# Look at the outputs of the last processed frame
print(outputs["instances"].pred_classes)
print(outputs["instances"].pred_boxes)

In [0]:
#Run this if you want to delete the instance segmented images
from shutil import rmtree
rmtree(storagepath)

# Export results as video

In [0]:
pathIn = outputpath
pathOut = '/content/drive/My Drive/Colab Notebooks/processedVideo.mp4'  # Store the final video
fps = 20  # change as you wish

import cv2
import glob

# Sort the frame paths so the video plays in the original order
frame_paths = sorted(glob.glob(pathIn + '/*.jpg'))
assert frame_paths, 'no frames found in ' + pathIn

# Take the video size from the first frame; all frames must share it
height, width, layers = cv2.imread(frame_paths[0]).shape
size = (width, height)

# 'mp4v' matches the .mp4 container used in pathOut
out = cv2.VideoWriter(pathOut, cv2.VideoWriter_fourcc(*'mp4v'), fps, size)
for frame_path in frame_paths:
    # read each frame and append it to the video
    out.write(cv2.imread(frame_path))
out.release()
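
As an alternative to `cv2.VideoWriter`, ffmpeg can assemble the numbered frames directly and encode them as H.264. The sketch below only builds the command and does not run it. It assumes ffmpeg is installed with `libx264` support and that the frames follow the `result_%04i.jpg` naming. `build_ffmpeg_command` is a hypothetical helper, not part of the notebook.

```python
import shlex


def build_ffmpeg_command(frame_dir, output_file, fps=20):
    """Build an ffmpeg argument list that turns numbered JPEG frames into an H.264 video."""
    return [
        "ffmpeg",
        "-y",                                 # overwrite the output file if it exists
        "-framerate", str(fps),               # input frame rate of the image sequence
        "-i", f"{frame_dir}/result_%04d.jpg", # numbered frames: result_0000.jpg, ...
        "-c:v", "libx264",                    # H.264 encoder (must be in the ffmpeg build)
        "-pix_fmt", "yuv420p",                # pixel format most players can decode
        output_file,
    ]


command = build_ffmpeg_command(
    "/content/drive/My Drive/Colab Notebooks/finalGripper",
    "/content/drive/My Drive/Colab Notebooks/processedVideo.mp4",
)
# shlex.join quotes the paths that contain spaces, so the line can be pasted into a shell
print(shlex.join(command))
```

Because the command is a list, you could run it with `subprocess.run(command, check=True)` without worrying about the spaces in the Drive paths.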

