<center><font size="6">Welcome to my detectron2 notebook.</font></center>

I am at the University of Michigan working on a research project in Ye Labs. As part of my responsibilities, I was asked to find a method for removing a 'dynamic' background from a set of videos of animals, leaving behind only the animals. After searching for solutions and spending a day *trying* to make the [matterport Mask R-CNN](https://github.com/matterport/Mask_RCNN) code run natively on my M1 Mac, I stumbled upon a [YouTube video](https://www.youtube.com/watch?v=9a_Z14M-msc) explaining how to work with detectron2. Detectron2 was immediately appealing and made me realize that my problem statement needed to be adjusted. I mentally changed my assignment to "given a video with animals in it, return only the animals without any other background content." This may sound the same as the original assignment, but to me the subtle shift from removing the background to extracting the animals was significant. So I decided to train detectron2 and use it as the base for a pipeline that identifies the animals in each video frame, creates instance masks, extracts the animals using those masks, pastes them onto a blank canvas, and finally saves that canvas as a frame in a new video.

This notebook includes only the inference portion of my work. Please see my "Detecting_Multiple_Things_With_Detectron2" notebook for detailed explanations of every step: dataset creation, image annotation, training, and the inference steps found here. I have stripped this notebook down to the bare minimum, even excluding comments in the code, to minimize clutter.
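
As a rough illustration of the extract-and-paste step described above, here is a minimal NumPy-only sketch. It assumes the instance masks arrive as a boolean array of shape `(N, H, W)`, one mask per detected animal. The function name `paste_onto_blank` and the toy arrays are hypothetical and are not part of my `thing_masker` code.

```python
import numpy as np

def paste_onto_blank(frame, masks):
    """Copy only the masked pixels of `frame` onto a black canvas.

    frame: uint8 array of shape (H, W, 3).
    masks: boolean array of shape (N, H, W), one mask per detected instance.
    """
    canvas = np.zeros_like(frame)
    if len(masks) == 0:
        # Nothing detected: return the empty canvas.
        return canvas
    # A pixel is kept if it belongs to at least one instance mask.
    combined = np.any(masks, axis=0)
    canvas[combined] = frame[combined]
    return canvas

# Toy 4x4 "frame" with two one-pixel "animals" (made-up data).
frame = np.full((4, 4, 3), 200, dtype=np.uint8)
masks = np.zeros((2, 4, 4), dtype=bool)
masks[0, 1, 1] = True
masks[1, 2, 3] = True
result = paste_onto_blank(frame, masks)
```

Every pixel outside the masks stays black, so writing `result` as a video frame gives a frame that contains only the animals.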

# Set up the notebook

In [None]:
local = False
nvidia_alt ='cpu'

In [None]:
if not local: 
    from google.colab import drive
    drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
# Check which versions of opencv-python, torch, and torchvision are installed.
# If the version I want is not installed, install it now.
# If it is installed, don't waste time downloading the wheel.
!pip install pyyaml
torch_version = !pip show torch | grep Version
torchvision_version = !pip show torchvision | grep Version
opencv_version = !pip show opencv-python | grep Version
if opencv_version[0] != 'Version: 4.5.5.64':
  !pip uninstall opencv-python --yes
  !pip install opencv-python==4.5.5.64
else:
  print('opencv ' + opencv_version[0] + ' already installed')
if torch_version[0] != 'Version: 1.10.1+cu113':
  !pip install torch==1.10.1+cu113 -f \
    https://download.pytorch.org/whl/torch_stable.html
else:
  print('torch ' + torch_version[0] + ' already installed')
if torchvision_version[0] != 'Version: 0.11.2+cu113':
  !pip install torchvision==0.11.2+cu113 -f \
    https://download.pytorch.org/whl/torch_stable.html
else:
  print('torchvision ' + torchvision_version[0] + ' already installed')

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in links: https://download.pytorch.org/whl/torch_stable.html
Collecting torch==1.10.1+cu113
  Downloading https://download.pytorch.org/whl/cu113/torch-1.10.1%2Bcu113-cp38-cp38-linux_x86_64.whl (1821.4 MB)

In [None]:
# Check which version of detectron2 is installed.
# If the version I want is not installed, install it now.
# If it is installed, do not waste time downloading the wheel.
detectron2_version = !pip show detectron2 | grep Version
if detectron2_version[0] != 'Version: 0.6+cu113':
  !python -m pip install detectron2==0.6 -f \
    https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
else:
  print('detectron2 ' + detectron2_version[0] + ' already installed')

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in links: https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
Collecting detectron2==0.6
  Downloading https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/detectron2-0.6%2Bcu113-cp38-cp38-linux_x86_64.whl (7.0 MB)
Collecting yacs>=0.1.8
  Downloading yacs-0.1.8-py3-none-any.whl (14 kB)
Collecting omegaconf>=2.1
  Downloading omegaconf-2.3.0-py3-none-any.whl (79 kB)
Collecting hydra-core>=1.1
  Downloading hydra_core-1.3.1-py3-none-any.whl (154 kB)
Collecting iopath<0.1.10,>=0.1.7
  Downloadi
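
The version checks above shell out to `pip show` and compare strings. As an alternative sketch (not what this notebook runs), the same decision can be made from Python with the standard-library `importlib.metadata`, which also copes with a package that is not installed at all. The helper name `needs_install` and its `lookup` parameter are hypothetical, and the stand-in lookup exists only so the example runs without the real packages.

```python
from importlib.metadata import PackageNotFoundError, version

def needs_install(package, wanted, lookup=version):
    """Return True when `package` is missing or its version differs from `wanted`."""
    try:
        return lookup(package) != wanted
    except PackageNotFoundError:
        return True

# Stand-in for the real environment (made-up data for illustration).
fake_installed = {'torch': '1.10.1+cu113'}

def fake_lookup(name):
    if name not in fake_installed:
        raise PackageNotFoundError(name)
    return fake_installed[name]

torch_missing = needs_install('torch', '1.10.1+cu113', lookup=fake_lookup)
opencv_missing = needs_install('opencv-python', '4.5.5.64', lookup=fake_lookup)
```

In a real session, leaving `lookup` at its default queries the running environment, and the result decides whether to run the matching `pip install` line.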

## Get Info About Your Environment
---

In [None]:
# If you are running locally, I assume you already know your resources,
# so this check only runs on Colab.
if not local:
    !nvidia-smi -L
    !lscpu |grep 'Model name'

GPU 0: Tesla T4 (UUID: GPU-35e4c9e4-addd-8f25-e8d6-652f6ddbd261)
Model name:          Intel(R) Xeon(R) CPU @ 2.20GHz


# Import The Required Libraries
---

In [None]:
import os
import torch 
import cv2
import random
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime
from detectron2.utils.visualizer import ColorMode
from detectron2.utils.video_visualizer import VideoVisualizer

%matplotlib inline


if local:
  from tqdm import tqdm as tqdm_notebook
else: 
  from tqdm.notebook import tqdm_notebook
  from google.colab.patches import cv2_imshow
  from google.colab import runtime



if torch.cuda.is_available():
  device = 'cuda'
else:
  device = nvidia_alt

# Set up Your Global Variables Here. <font color='red'>Run Every Time!</font>
---

In [None]:
my_dataset_path = '/content/drive/MyDrive/datasets/'
my_things = ['rat', 'larva',]
my_display_things = ['rat','larva',]
my_prediction_threshold = 0.75
my_dataset_name = 'combo'
image_extensions = ['.jpg','.png',]
video_extensions = ['.mp4','.mov', '.avi',]
disconnect_on_complete = True

In [None]:
if not local:
  import sys
  sys.path.append(my_dataset_path + my_dataset_name + '/sys/')
from thing_masker import thing_masker 

# Test Your Model on Images!

In [None]:
thing = thing_masker(my_things, my_dataset_name,
                     my_dataset_path)
my_dataset_metadata = thing.metadata

In [None]:
temp_weights = my_dataset_path + my_dataset_name + \
        '/models/' + my_dataset_name + '_model_final.pth'
temp_config=None
thing.init_predictor(my_prediction_threshold, device, 
                     temp_weights, temp_config)
thing.metadata.thing_classes

['rat', 'larva']
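
detectron2 reports each detection's class as an integer index into `thing_classes`, so showing only some classes comes down to translating names into indices and keeping the matching detections. A minimal NumPy sketch of that idea follows. The function name `keep_display_things` and the sample predictions are hypothetical, and real `Instances` objects hold tensors rather than NumPy arrays.

```python
import numpy as np

def keep_display_things(pred_classes, thing_classes, display_things):
    """Return a boolean array marking detections whose class name should be shown."""
    # Translate the wanted class names into their integer indices.
    wanted = [i for i, name in enumerate(thing_classes) if name in display_things]
    return np.isin(pred_classes, wanted)

thing_classes = ['rat', 'larva']
pred_classes = np.array([0, 1, 1, 0])   # made-up class indices for four detections
keep = keep_display_things(pred_classes, thing_classes, ['larva'])
```

Here `keep` marks only the two 'larva' detections. The video loop later in this notebook applies the same idea by indexing the predicted instances with a boolean condition on `pred_classes`.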

### Process some images and display results

In [None]:
test_images = thing.get_test_files(image_extensions, thing.test_directory)
# Show up to 6 random test images (fewer if the directory holds fewer).
for d in random.sample(test_images, min(6, len(test_images))):
  image_from_file = cv2.imread(d)
  thing.update_outputs(image_from_file)
  masked_image = thing.get_masked_image(image_from_file, my_display_things)
  original_with_masks = thing.get_original_with_masks(image_from_file,
                                                      my_display_things)
  print(d)
  fig = plt.figure(figsize=(30, 30))
  fig.add_subplot(1, 2, 1)
  plt.imshow(masked_image)
  plt.axis('off')
  plt.title("Masked")
  fig.add_subplot(1,2,2)
  plt.imshow(cv2.cvtColor(original_with_masks.get_image()[:, :, ::-1],\
                          cv2.COLOR_BGR2RGB))
  plt.axis('off')
  plt.title("Original With Mask")
  plt.show()



# Process a Video

## Start Working with Video

In [None]:
test_videos = thing.get_test_files(video_extensions, thing.video_test_directory)
test_videos

['/content/drive/MyDrive/datasets/combo/videos/rat_23_cocaine.mp4',
 '/content/drive/MyDrive/datasets/combo/videos/rat_7_cocaine.mp4',
 '/content/drive/MyDrive/datasets/combo/videos/rat_14_cocaine.mp4',
 '/content/drive/MyDrive/datasets/combo/videos/rat_20_cocaine.mp4',
 '/content/drive/MyDrive/datasets/combo/videos/rat_28_cocaine.mp4']

In [None]:
test_video_name = random.sample(test_videos, min(5, len(test_videos)))
for video in test_video_name:
  print(video)
  cap = cv2.VideoCapture(video)
  width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
  height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
  frames_per_second = cap.get(cv2.CAP_PROP_FPS)
  num_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
  date_time = str(datetime.now()).replace(" ", "_")
  masked_images_video_result = thing.video_output_directory + "masked_images_" \
                                + os.path.split(video)[1][:-4]+ '_'+\
                                date_time + "_result.avi"
  original_showing_masks_temp = thing.video_output_directory + \
    "original_showing_masks_" + os.path.split(video)[1][:-4] + '_'+\
    date_time + "_temp.avi" 
  original_showing_masks_result = thing.video_output_directory + \
    "original_showing_masks_" + os.path.split(video)[1][:-4] + '_'+\
    date_time + "_result.mp4" 
  fourcc = cv2.VideoWriter_fourcc(*'MJPG')
  output_writer1 = cv2.VideoWriter(masked_images_video_result, fourcc, 
                                  frames_per_second, (width, height))
  output_writer2 = cv2.VideoWriter(original_showing_masks_temp, fourcc, 
                                  frames_per_second, (width, height))
  if num_frames == 0:
      cap.release()
      output_writer1.release()
      output_writer2.release()
      raise RuntimeError('video not found or empty!')
  video_vis = VideoVisualizer(thing.metadata, ColorMode.IMAGE)
  video_length=int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
  for frame_count in tqdm_notebook(range(video_length), ncols=100):
      _, frame = cap.read()
      if frame is None:
          print('end of input file reached')
          cap.release()
          cv2.destroyAllWindows()
          break
      thing.update_outputs(frame)
      foreground = thing.get_masked_image(frame, my_display_things)
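      # OpenCV reads frames as BGR; the visualizer expects RGB.
      frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
      video_filter = thing.outputs['instances']
      # Drop every predicted instance whose class is not in my_display_things.
      for i in range(len(thing.classes)):
        if thing.classes[i] not in my_display_things:
          video_filter = video_filter[video_filter.pred_classes != i]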
      visualization = video_vis.draw_instance_predictions(frame, 
                            video_filter.to('cpu'))
      visualization = cv2.cvtColor(visualization.get_image(), cv2.COLOR_RGB2BGR)
      output_writer1.write(np.array(foreground))
      output_writer2.write(visualization)
  output_writer1.release()
  output_writer2.release()
  print('done saving videos')
  cap.release()
  print('converting original with mask to save space...')
  input_file = original_showing_masks_temp
  output_file = original_showing_masks_result
  !ffmpeg -i "{input_file}" -c:v libx264 \
      -preset slow -crf 35 -c:a copy "{output_file}"
  #if os.path.exists(original_showing_masks_temp):
  #  os.remove(original_showing_masks_temp)
  #else:
  #  print("The file does not exist")

/content/drive/MyDrive/datasets/combo/videos/rat_20_cocaine.mp4


  0%|                                                                      | 0/9088 [00:00<?, ?it/s]

done saving videos
converting original with mask to save space...
ffmpeg version 3.4.11-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
  configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libw

  0%|                                                                      | 0/8975 [00:00<?, ?it/s]

done saving videos
converting original with mask to save space...

  0%|                                                                      | 0/9088 [00:00<?, ?it/s]

done saving videos
converting original with mask to save space...

  0%|                                                                      | 0/9038 [00:00<?, ?it/s]

done saving videos
converting original with mask to save space...

  0%|                                                                      | 0/9336 [00:00<?, ?it/s]

done saving videos
converting original with mask to save space...

In [None]:
if (not local) and (disconnect_on_complete): runtime.unassign() 