# Image Stabilization for Hand-tracking Telescope Video Prototype

This notebook explores a prototype for image stabilization of hand-tracked telescope video. On SageMaker, launch the notebook on an ml.g4dn.xlarge instance (4 vCPU, 16 GiB, 1 GPU) with the Python 3 (PyTorch 1.6 Python 3.6 GPU Optimized) kernel.

The prototype will be built on the Deep Unsupervised Trajectory-based stabilization framework (DUT).

The DUT stabilization model was observed to produce qualitatively better stabilization results.

The DUT pipeline is an ensemble of 3 models: a point cloud generator model, a motion analysis model, and a smoothing model. The practical approach is probably to pick one of these models and focus on training it.

While building the prototype, I will identify the model that appears weakest on astronomical data and focus on it. That model will be trained on the non-astronomical data. This is an attempt to solve the cold start problem through transfer learning, a machine learning method in which a model developed for one task is reused as the starting point for a model on a second task. The goal is then to accumulate target (astronomical) data over time and use it to keep improving the model.
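A minimal sketch of the transfer learning idea described above, assuming PyTorch: keep most pre-trained stages frozen and fine-tune only one of them. `TinyPipeline` and its stage names (`detector`, `motion`, `smoother`) are hypothetical stand-ins made up for illustration, not the DUT classes, and the random tensors stand in for real training data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class TinyPipeline(nn.Module):
    """Hypothetical three-stage model; NOT the real DUT architecture."""

    def __init__(self):
        super().__init__()
        self.detector = nn.Linear(8, 8)
        self.motion = nn.Linear(8, 8)
        self.smoother = nn.Linear(8, 2)

    def forward(self, x):
        x = torch.relu(self.detector(x))
        x = torch.relu(self.motion(x))
        return self.smoother(x)


def freeze_all_but(model, trainable_name):
    """Freeze every child module except the named one; return its parameters."""
    for name, module in model.named_children():
        for param in module.parameters():
            param.requires_grad = (name == trainable_name)
    return [p for p in model.parameters() if p.requires_grad]


model = TinyPipeline()  # in practice, pre-trained weights would be loaded here
trainable_params = freeze_all_but(model, "smoother")
optimizer = torch.optim.Adam(trainable_params, lr=1e-4)

# One illustrative training step on random stand-in data.
inputs = torch.randn(4, 8)
targets = torch.zeros(4, 2)
loss = nn.functional.mse_loss(model(inputs), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Only the chosen stage receives gradients, so the frozen stages keep whatever they learned from the original (non-astronomical) training.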

Requirements:

* Upload a video (mp4) created by hand tracking a celestial body
* Store the uploaded unstable video for future use
* Run it through the video stabilization pipeline
* Present the stabilized video side-by-side with the uploaded video
* Allow the user to download the stabilized video

## Setup

In [1]:
%%time

import os

FFMPEG_TAR = 'ffmpeg-release-amd64-static.tar.xz'
if os.path.exists(FFMPEG_TAR):
    os.remove(FFMPEG_TAR)
    
!wget https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz
!tar -xf ffmpeg-release-amd64-static.tar.xz
!ffmpeg-4.4-amd64-static/ffmpeg -version

if os.path.exists(FFMPEG_TAR):
    os.remove(FFMPEG_TAR)

!/opt/conda/bin/python -m pip install --upgrade pip
!pip install scikit-image
!pip install easydict

!conda update -n base -y -c defaults conda
!conda install -y -c conda-forge cupy

wget: /opt/conda/lib/libuuid.so.1: no version information available (required by wget)
--2021-09-14 19:20:14--  https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz
Resolving johnvansickle.com (johnvansickle.com)... 107.180.57.212
Connecting to johnvansickle.com (johnvansickle.com)|107.180.57.212|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 39577132 (38M) [application/x-xz]
Saving to: ‘ffmpeg-release-amd64-static.tar.xz’


2021-09-14 19:20:17 (15.8 MB/s) - ‘ffmpeg-release-amd64-static.tar.xz’ saved [39577132/39577132]

tar: ffmpeg-4.4-amd64-static/GPLv3.txt: Cannot change ownership to uid 1000, gid 1000: Operation not permitted
tar: ffmpeg-4.4-amd64-static/manpages/ffmpeg-all.txt: Cannot change ownership to uid 1000, gid 1000: Operation not permitted
tar: ffmpeg-4.4-amd64-static/manpages/ffmpeg-scaler.txt: Cannot change ownership to uid 1000, gid 1000: Operation not permitted
tar: ffmpeg-4.4-amd64-static/manpages/ffmpeg-resampler.txt

## Using the DUT Model for the Prototype

In [2]:
import torch
import torch.nn as nn
import argparse
from PIL import Image
import cv2
import os
import traceback
import math
import time
import sys

project_location = os.getcwd()
sys.path.append(os.path.join(project_location,'DUTCode'))
project_location

'/root/hand-tracking-stabilization'

### Load the Pre-trained Models from S3

I copied the pre-trained models to S3 so they remain available if the original source disappears. At the time of writing, the pre-trained models were hosted at [https://drive.google.com/drive/folders/15T8Wwf1OL99AKDGTgECzwubwTqbkmGn6](https://drive.google.com/drive/folders/15T8Wwf1OL99AKDGTgECzwubwTqbkmGn6)

In [3]:
%%time

import os
import boto3

project_dir = os.getcwd()

data_dir = 'DUTPretrained'
if not os.path.isdir(data_dir):
    os.mkdir(data_dir)
os.chdir(data_dir)

s3 = boto3.client('s3')
s3.download_file('madat-machine-learning-data', 'Mark/capstone-project/pre-trained-models/ckpt-20210817T154228Z-001.zip', 'ckpt-20210817T154228Z-001.zip')

from zipfile import ZipFile

with ZipFile('ckpt-20210817T154228Z-001.zip', 'r') as zipObj:
    zipObj.extractall()
    
os.chdir('..')
os.getcwd()

CPU times: user 2.77 s, sys: 1.11 s, total: 3.88 s
Wall time: 7.74 s


'/root/hand-tracking-stabilization'
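The download cell above changes the working directory and downloads the archive again on every run. A possible variant, sketched here under assumptions, takes the S3 client as a parameter, skips the download when the archive is already present, and extracts without calling `os.chdir`. The helper name `fetch_and_extract` is hypothetical; `download_file(bucket, key, filename)` is the boto3 S3 client call already used above.

```python
import os
import zipfile


def fetch_and_extract(s3_client, bucket, key, dest_dir):
    """Download a zip archive from S3 (if not already cached) and extract it.

    s3_client is expected to behave like boto3.client('s3').
    """
    os.makedirs(dest_dir, exist_ok=True)
    archive_path = os.path.join(dest_dir, os.path.basename(key))

    # Skip the network round trip when a previous run left the archive in place.
    if not os.path.exists(archive_path):
        s3_client.download_file(bucket, key, archive_path)

    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest_dir)
    return dest_dir
```

In this notebook it could be called as `fetch_and_extract(boto3.client('s3'), 'madat-machine-learning-data', 'Mark/capstone-project/pre-trained-models/ckpt-20210817T154228Z-001.zip', 'DUTPretrained')`.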

### Create a Video of the Unstabilized Images

This step will not be needed in the prototype, because the intent is to upload a video. What will be needed is the reverse: breaking the uploaded video into individual frame images of the correct size for the model.
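A minimal sketch of that frame-splitting step, assuming OpenCV's `cv2.VideoCapture` is used to read the upload. The function names `center_crop_box` and `extract_frames` are hypothetical; the zero-padded `000.jpg`, `001.jpg`, ... naming is chosen to match what the stabilization code later in this notebook reads.

```python
import os


def center_crop_box(width, height, crop_w, crop_h):
    """Return (x0, y0, x1, y1) of a crop_w x crop_h box centered in the frame."""
    if crop_w > width or crop_h > height:
        raise ValueError("crop is larger than the frame")
    x0 = (width - crop_w) // 2
    y0 = (height - crop_h) // 2
    return x0, y0, x0 + crop_w, y0 + crop_h


def extract_frames(video_path, out_dir, crop_w, crop_h):
    """Write each frame of the video as a center-cropped, zero-padded JPEG."""
    import cv2  # imported lazily so the crop helper works without OpenCV

    os.makedirs(out_dir, exist_ok=True)
    capture = cv2.VideoCapture(video_path)
    count = 0
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # end of stream or unreadable frame
                break
            height, width = frame.shape[:2]
            x0, y0, x1, y1 = center_crop_box(width, height, crop_w, crop_h)
            frame_path = os.path.join(out_dir, f"{count:03d}.jpg")
            cv2.imwrite(frame_path, frame[y0:y1, x0:x1])
            count += 1
    finally:
        capture.release()
    return count
```

For a 640x360 frame and a 342x206 crop, `center_crop_box` gives offsets of 149 and 77 pixels, the same values the cell below computes inline.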

In [4]:
import cv2
import os

image_folder = 'DUTCode/images'
tmp_video_name = 'tmp_unstable.mp4'
video_name = 'unstable.mp4'
tmp_clipped_video_name = 'tmp_clipped_unstable.mp4'
clipped_video_name = 'clipped_unstable.mp4'

height = 0
width = 0
total_number = 480
    
for img in os.listdir(image_folder):
    file, ext = os.path.splitext(img)
    if img.endswith(".jpg"):
        file_name_corrected = os.path.join(image_folder,file.zfill(3)+'.jpg')
        uncorrected_file_name =  os.path.join(image_folder,img)
        #print(uncorrected_file_name +' => ' + file_name_corrected)
        os.rename(uncorrected_file_name,file_name_corrected)
        
        frame = cv2.imread(file_name_corrected)
        height, width, channels = frame.shape


fourcc = cv2.VideoWriter_fourcc(*'mp4v')
print(width, height)

video = cv2.VideoWriter(tmp_video_name, fourcc, 25, (width, height))

# The frames now have zero-padded names, so they can be read back in numeric order
for img_number in range(total_number):
    image = os.path.join(image_folder,str(img_number).zfill(3)+'.jpg')
    #print(image)
    img = cv2.imread(image)
    
    video.write(img)

video.release()


clipped_video = cv2.VideoWriter(tmp_clipped_video_name, fourcc, 25, (342, 206))

# Second pass over the same frames: write a center-cropped (342x206) version
for img_number in range(total_number):
    image = os.path.join(image_folder,str(img_number).zfill(3)+'.jpg')
    #print(image)
    img = cv2.imread(image)

    y_offset = (height-206)//2
    x_offset = (width-342)//2
    crop_img = img[y_offset:y_offset+206, x_offset:x_offset+342]
    #print(crop_img.shape)
    clipped_video.write(crop_img)

clipped_video.release()

print('Complete')

# OpenCV doesn't ship with the H.264 codec needed to play the video in a notebook, due to licensing incompatibilities.
# For that reason I encode as MP4V and then post-process with FFmpeg.
if os.path.exists(video_name):
    os.remove(video_name)
if os.path.exists(clipped_video_name):
    os.remove(clipped_video_name)

!ffmpeg-4.4-amd64-static/ffmpeg -i {tmp_video_name} -c:v h264 {video_name}
!ffmpeg-4.4-amd64-static/ffmpeg -i {tmp_clipped_video_name} -c:v h264 {clipped_video_name}
    
if os.path.exists(tmp_video_name):
    os.remove(tmp_video_name)
if os.path.exists(tmp_clipped_video_name):
    os.remove(tmp_clipped_video_name)

640 360
Complete
ffmpeg version 4.4-static https://johnvansickle.com/ffmpeg/  Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi --enable-libzimg
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
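The re-encode step appears more than once in this notebook, so it could be wrapped in a small helper. The sketch below is one way to do that, assuming the static FFmpeg build unpacked in the Setup cell; `build_h264_command` and `reencode_to_h264` are hypothetical names. It names the `libx264` encoder explicitly (the build configuration above lists `--enable-libx264`), passes `-y` so an existing output file is overwritten instead of removed beforehand, and requests `yuv420p` pixels, which browsers generally expect for H.264 playback.

```python
import subprocess

# Path assumed from the Setup cell; adjust if FFmpeg lives elsewhere.
DEFAULT_FFMPEG = "ffmpeg-4.4-amd64-static/ffmpeg"


def build_h264_command(src, dst, ffmpeg=DEFAULT_FFMPEG):
    """Return the argument list for re-encoding src to H.264 at dst."""
    return [
        ffmpeg,
        "-y",                    # overwrite dst without prompting
        "-i", src,               # input file
        "-c:v", "libx264",       # H.264 video encoder
        "-pix_fmt", "yuv420p",   # widely supported pixel format
        dst,
    ]


def reencode_to_h264(src, dst, ffmpeg=DEFAULT_FFMPEG):
    """Run FFmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_h264_command(src, dst, ffmpeg), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names, and `check=True` surfaces a failed encode instead of silently leaving a missing output file.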
 

## Prototype Built on the DUT Stabilizer

In [5]:
# From deploy_samples.sh

#OutputBasePath='results/'
#SmootherPath='ckpt/smoother.pth'
#RFDetPath='ckpt/RFDet_640.pth.tar'
#PWCNetPath='ckpt/network-default.pytorch'
#MotionProPath='ckpt/MotionPro.pth'
#InputPath='images/'

#--SmootherPath=$SmootherPath \
#--RFDetPath=$RFDetPath \
#--PWCNetPath=$PWCNetPath \
#--MotionPro=$MotionProPath \
#--InputBasePath=$InputPath \
#--OutputBasePath=$OutputBasePath 
class DUTArguments:
    
    SmootherPath='DUTPretrained/ckpt/smoother.pth'
    RFDetPath='DUTPretrained/ckpt/RFDet_640.pth.tar'
    PWCNetPath='DUTPretrained/ckpt/network-default.pytorch'
    MotionProPath='DUTPretrained/ckpt/MotionPro.pth'
    SingleHomo=True
    OutputBasePath = 'results/'
    InputBasePath = 'DUTCode/images/'
    MaxLength = 1200
    OutNamePrefix = ''
    Repeat = 50
    
dut_args = DUTArguments()

In [6]:
def generateStable(model, base_path, outPath, outPrefix, max_length, args):

    image_base_path = base_path
    image_len = min(len([ele for ele in os.listdir(image_base_path) if ele[-4:] == '.jpg']), max_length)
    # Read each input frame twice: grayscale for the model, color for warping
    images = []
    rgbimages = []
    for i in range(image_len):
        frame_path = os.path.join(image_base_path, '{:03d}.jpg'.format(i))

        # Grayscale frame scaled to [0, 1], shaped (1, 1, H, W)
        image = cv2.imread(frame_path, cv2.IMREAD_GRAYSCALE)
        image = image * (1. / 255.)
        image = cv2.resize(image, (cfg.MODEL.WIDTH, cfg.MODEL.HEIGHT))
        images.append(image.reshape(1, 1, cfg.MODEL.HEIGHT, cfg.MODEL.WIDTH))

        # Color frame in channel-first layout, shaped (1, 3, H, W)
        image = cv2.imread(frame_path)
        image = cv2.resize(image, (cfg.MODEL.WIDTH, cfg.MODEL.HEIGHT))
        rgbimages.append(np.expand_dims(np.transpose(image, (2, 0, 1)), 0))

    x = np.concatenate(images, 1).astype(np.float32)
    x = torch.from_numpy(x).unsqueeze(0)

    x_RGB = np.concatenate(rgbimages, 0).astype(np.float32)
    x_RGB = torch.from_numpy(x_RGB).unsqueeze(0)

    with torch.no_grad():
        origin_motion, smoothPath = model.inference(x.cuda(), x_RGB.cuda(), repeat=args.Repeat)

    origin_motion = origin_motion.cpu().numpy()
    smoothPath = smoothPath.cpu().numpy()
    origin_motion = np.transpose(origin_motion[0], (2, 3, 1, 0))
    smoothPath = np.transpose(smoothPath[0], (2, 3, 1, 0))

    x_paths = origin_motion[:, :, :, 0]
    y_paths = origin_motion[:, :, :, 1]
    sx_paths = smoothPath[:, :, :, 0]
    sy_paths = smoothPath[:, :, :, 1]

    frame_rate = 25
    frame_width = cfg.MODEL.WIDTH
    frame_height = cfg.MODEL.HEIGHT

    print("generate stabilized video...")
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # lowercase tag, matching the earlier cell
    out = cv2.VideoWriter(os.path.join(outPath, outPrefix + 'tmp_DUT_stable.mp4'), fourcc, frame_rate, (frame_width, frame_height))
    print(frame_width, frame_height)

    new_x_motion_meshes = sx_paths - x_paths
    new_y_motion_meshes = sy_paths - y_paths

    outImages = warpListImage(rgbimages, new_x_motion_meshes, new_y_motion_meshes)
    outImages = outImages.numpy().astype(np.uint8)
    outImages = [np.transpose(outImages[idx], (1, 2, 0)) for idx in range(outImages.shape[0])]
    # Crop a fixed border from each warped frame, then scale the crop
    # back up to the original frame size before writing it out.
    VERTICAL_BORDER = 60
    HORIZONTAL_BORDER = 80
    for frame in tqdm(outImages):
        new_frame = frame[VERTICAL_BORDER:-VERTICAL_BORDER, HORIZONTAL_BORDER:-HORIZONTAL_BORDER]
        new_frame = cv2.resize(new_frame, (frame.shape[1], frame.shape[0]), interpolation=cv2.INTER_CUBIC)
        out.write(new_frame)
    out.release()
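The warp applied in the cell above is the difference between the smoothed path and the original path (`sx_paths - x_paths`). The sketch below illustrates that idea on a single 1-D trajectory in plain Python. It uses a simple moving average as the smoother, which is a stand-in chosen for illustration and not the learned DUT smoother; all function names are hypothetical.

```python
def accumulate(motions):
    """Turn per-frame motion into a camera path (running sum)."""
    path, total = [], 0.0
    for motion in motions:
        total += motion
        path.append(total)
    return path


def moving_average(path, radius):
    """Smooth a path with a centered window, truncated at the ends."""
    smoothed = []
    for i in range(len(path)):
        window = path[max(0, i - radius):min(len(path), i + radius + 1)]
        smoothed.append(sum(window) / len(window))
    return smoothed


def corrections(path, smoothed):
    """Per-frame offset that moves each frame from the shaky path to the smooth one."""
    return [s - p for s, p in zip(smoothed, path)]


shaky_motion = [1.0, -2.0, 3.0, -1.0, 2.0, -3.0]  # made-up jitter, in pixels
path = accumulate(shaky_motion)
smooth = moving_average(path, radius=1)
correction = corrections(path, smooth)
```

Adding `correction[i]` to `path[i]` lands each frame on the smooth path, which is the role `new_x_motion_meshes` and `new_y_motion_meshes` play for every mesh vertex in `generateStable`.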

In [7]:
from models.DUT.DUT import DUT
from utils.WarpUtils import warpListImage
from configs.config import cfg
import numpy as np
from tqdm import tqdm

model = DUT(SmootherPath=dut_args.SmootherPath, RFDetPath=dut_args.RFDetPath, PWCNetPath=dut_args.PWCNetPath, MotionProPath=dut_args.MotionProPath, homo=dut_args.SingleHomo)
model.cuda()
model.eval()

generateStable(model, dut_args.InputBasePath, dut_args.OutputBasePath, dut_args.OutNamePrefix, dut_args.MaxLength, dut_args)

# OpenCV doesn't ship with the H.264 codec needed to play the video in a notebook, due to licensing incompatibilities.
# For that reason I encode as MP4V and then post-process with FFmpeg.
video_name = os.path.join(dut_args.OutputBasePath,'DUT_stable.mp4')
tmp_video_name = os.path.join(dut_args.OutputBasePath,'tmp_DUT_stable.mp4')

if os.path.exists(video_name):
    os.remove(video_name)

!ffmpeg-4.4-amd64-static/ffmpeg -i {tmp_video_name} -c:v h264 {video_name}
    
if os.path.exists(tmp_video_name):
    os.remove(tmp_video_name)


-------------model configuration------------------------
using RFNet ...
using PWCNet for motion estimation...
using Motion Propagation model with multi homo...
using Deep Smoother Model...
------------------reload parameters-------------------------
reload Smoother params
successfully load 12 params for smoother
reload RFDet Model
successfully load 100 params for RFDet
reload PWCNet Model
reload MotionPropagation Model
successfully load 21 params for MotionPropagation
detect keypoints ....
[2021-09-14 19:21:29.877 pytorch-1-6-gpu-py3-ml-g4dn-xlarge-594def216eaae0b31fbf025840e5:12520 INFO utils.py:27] RULE_JOB_STOP_SIGNAL_FILENAME: None
[2021-09-14 19:21:30.007 pytorch-1-6-gpu-py3-ml-g4dn-xlarge-594def216eaae0b31fbf025840e5:12520 INFO profiler_config_parser.py:102] Unable to find config at /opt/ml/input/config/profilerconfig.json. Profiler is disabled.


  None, None, :, :
	nonzero()
Consider using one of the following signatures instead:
	nonzero(*, bool as_tuple) (Triggered internally at  /codebuild/output/src811146734/src/torch/csrc/utils/python_arg_parser.cpp:766.)
  kpts = im_topk.nonzero()  # (B*topk, 4)


estimate motion ....
motion propagation ....
generate stabilized video...
640 480


100%|██████████| 480/480 [00:02<00:00, 239.68it/s]


ffmpeg version 4.4-static https://johnvansickle.com/ffmpeg/  Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi --enable-libzimg
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    5

In [9]:
%%html

<table>
  <tr>
    <td>
        <center>Full Unstable (640x360)</center>
        <video width="640" height="360" controls autoplay>
            <source src="unstable.mp4" type="video/mp4">
        </video>
    </td>
  </tr>
  <tr>
    <td>
        <center>Stabilized (640x480)</center>
        <video width="640" height="480" controls autoplay>
          <source src="results/DUT_stable.mp4" type="video/mp4">
        </video>
    </td>
  </tr>
</table>

