<a href="https://colab.research.google.com/github/CamK2/ComputerVision/blob/main/Motion_Estimation_Lab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Problem:** How can we detect and understand object motion from a series of images (i.e., a video)?
<img src="https://nanonets.com/blog/content/images/2019/04/intro-1-2.gif">

* How can we more specifically pose the question?  
* What should be acceptable inputs?   
* How do we report a solution (what are outputs)?  
  * A grid of motion estimates (sparse optical flow).
  * A motion estimate for each pixel (dense optical flow).

<img src="https://nanonets.com/blog/content/images/2019/04/sparse-vs-dense.gif">
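To make the two output formats concrete, here is a minimal NumPy sketch. It does not estimate any motion: the frame size, the grid spacing `step`, and the uniform motion `true_motion` are made-up values chosen only to show what each kind of result looks like as an array.

```python
import numpy as np

# Hypothetical frame size and a made-up uniform motion, for illustration only
height, width = 120, 160
true_motion = np.array([2.0, -1.0])  # (dx, dy) in pixels per frame

# Dense optical flow: one (dx, dy) vector for every pixel
dense_flow = np.broadcast_to(true_motion, (height, width, 2)).copy()

# Sparse optical flow: a short list of (x, y) points, each with one (dx, dy) vector.
# Here the points lie on a coarse grid; a real tracker would pick corners instead.
step = 40
ys, xs = np.mgrid[step // 2:height:step, step // 2:width:step]
points = np.stack([xs.ravel(), ys.ravel()], axis=1)    # shape (N, 2)
sparse_flow = dense_flow[points[:, 1], points[:, 0]]   # shape (N, 2)

print("dense flow shape:", dense_flow.shape)
print("sparse points / vectors:", points.shape, sparse_flow.shape)
```

The dense result grows with the image size, while the sparse result grows only with the number of tracked points, which is why sparse methods are usually cheaper to compute.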




This notebook is based on these articles: [one](https://nanonets.com/blog/optical-flow/), [two](https://androidkt.com/how-to-capture-and-play-video-in-google-colab/)

The two cells below let you capture a video in Google Colab (similar to how we used Code Snippets to take pictures in the past). Record a short video with some kind of motion, e.g., move your hand or head.

In [2]:
from IPython.display import display, Javascript,HTML
from google.colab.output import eval_js
from base64 import b64decode
 
def record_video(filename):
  js=Javascript("""
    async function recordVideo() {
      const options = { mimeType: "video/webm; codecs=vp9" };
      const div = document.createElement('div');
      const capture = document.createElement('button');
      const stopCapture = document.createElement("button");
       
      capture.textContent = "Start Recording";
      capture.style.background = "orange";
      capture.style.color = "white";
 
      stopCapture.textContent = "Stop Recording";
      stopCapture.style.background = "red";
      stopCapture.style.color = "white";
      div.appendChild(capture);
 
      const video = document.createElement('video');
      const recordingVid = document.createElement("video");
      video.style.display = 'block';
 
      const stream = await navigator.mediaDevices.getUserMedia({audio:true, video: true});
     
      let recorder = new MediaRecorder(stream, options);
      document.body.appendChild(div);
      div.appendChild(video);
 
      video.srcObject = stream;
      video.muted = true;
 
      await video.play();
 
      google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);
 
      await new Promise((resolve) => {
        capture.onclick = resolve;
      });
      recorder.start();
      capture.replaceWith(stopCapture);
 
      await new Promise((resolve) => stopCapture.onclick = resolve);
      recorder.stop();
      let recData = await new Promise((resolve) => recorder.ondataavailable = resolve);
      let arrBuff = await recData.data.arrayBuffer();
       
      // stop every track (audio and video) so the camera and microphone are released,
      // then remove the preview element
      stream.getTracks().forEach((track) => track.stop());
      div.remove();
 
      let binaryString = "";
      let bytes = new Uint8Array(arrBuff);
      bytes.forEach((byte) => {
        binaryString += String.fromCharCode(byte);
      })
    return btoa(binaryString);
    }
  """)
  try:
    display(js)
    data=eval_js('recordVideo({})')
    binary=b64decode(data)
    with open(filename,"wb") as video_file:
      video_file.write(binary)
    print(f"Finished recording video at:{filename}")
  except Exception as err:
    print(str(err))

In [3]:
video_path = "test.mp4"
record_video(video_path)

Finished recording video at:test.mp4


In [16]:
import numpy as np
import cv2 as cv


# corner features
feature_params = dict(maxCorners=300,
                      qualityLevel = 0.2,
                      minDistance = 2,
                      blockSize = 7)
lk_params = dict(winSize=(15,15),
                 maxLevel=2,
                 criteria=(cv.TERM_CRITERIA_EPS | cv.TERM_CRITERIA_COUNT,
                           10,0.03))
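# input our video file
cap = cv.VideoCapture("test.mp4")
# the writer's frame size must match the input frames, given as (width, height);
# frames of any other size are not written correctly
frame_size = (int(cap.get(cv.CAP_PROP_FRAME_WIDTH)),
              int(cap.get(cv.CAP_PROP_FRAME_HEIGHT)))
writer = cv.VideoWriter("output.avi",
                        cv.VideoWriter_fourcc(*'MPEG'),
                        30, frame_size)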

# specify color for optical flow line, pink
color = (255,int(0.7*255),int(0.7*255))

# grab first frame
ret, first_frame = cap.read()
# grayscale
prev_gray = cv.cvtColor(first_frame, cv.COLOR_BGR2GRAY)
# find corners
prev = cv.goodFeaturesToTrack(prev_gray, mask = None, **feature_params)
# create a mask
mask = np.zeros_like(first_frame)
# loop over video frames
while cap.isOpened(): # keep running while the video file is open
  # read next frame; ret is False once there are no frames left
  ret, frame = cap.read()
  if not ret:
    break
  gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
  # re-detect corners only if there are none left to track
  if prev is None or len(prev) == 0:
    prev = cv.goodFeaturesToTrack(prev_gray, mask = None, **feature_params)
  if prev is not None:
    # calculate the sparse optical flow (nxt avoids shadowing the builtin next)
    nxt, status, error = cv.calcOpticalFlowPyrLK(prev_gray,
                                                 gray, prev, None,
                                                 **lk_params)
    # pick only good corner feature matches (kept as float32 for tracking)
    good_old = prev[status==1]
    good_new = nxt[status==1]

    # draw the optical flow lines (drawing functions need plain int pixels)
    for new, old in zip(good_new, good_old):
      a,b = (int(v) for v in new.ravel())
      c,d = (int(v) for v in old.ravel())

      mask = cv.line(mask, (a,b), (c,d), color, 2)
      frame = cv.circle(frame, (a,b), 3, color, -1)
    # tracked points become the starting points for the next frame
    prev = good_new.reshape(-1,1,2)
  # overlay all lines onto current frame
  output = cv.add(frame, mask) # create a new "frame" with both
  writer.write(output)

  # update before next loop iteration
  prev_gray = gray.copy() # deep copy

cap.release()
writer.release()

<hr>
Note that the code above converts each frame to grayscale before tracking.

### Optical Flow steps

* Put the video in a "good" format (here, grayscale frames).
* Identify/specify a set of (x,y) features to track (here, corners).
* Find the displacement of each feature between successive frames:
  * Search a small window around each feature in the next frame (a kernel-style "scan" of the feature's pixels).
  * Calculate the x and y displacement and record it for each feature.
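
The displacement step can be sketched for a single feature window in plain NumPy. This is a simplified, single-level Lucas-Kanade estimate under assumptions: it is not what `cv.calcOpticalFlowPyrLK` does internally (that function also uses image pyramids and iteration), and the helper names `gaussian_blob` and `lucas_kanade_window`, the synthetic frames, and the shift of (0.4, -0.3) pixels are all made up for illustration.

```python
import numpy as np

def gaussian_blob(size, cx, cy, sigma=6.0):
    """Return a size x size float image with a smooth blob centred at (cx, cy)."""
    ys, xs = np.mgrid[0:size, 0:size].astype(np.float64)
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

def lucas_kanade_window(prev_img, next_img, x, y, half=7):
    """Estimate the (dx, dy) displacement of the window centred at pixel (x, y)."""
    # Spatial gradients; np.gradient returns (d/drow, d/dcol) = (Iy, Ix)
    Iy, Ix = np.gradient(prev_img)
    # Temporal gradient between the two frames
    It = next_img - prev_img
    win = (slice(y - half, y + half + 1), slice(x - half, x + half + 1))
    # Brightness constancy gives Ix*dx + Iy*dy = -It for every pixel in the window;
    # solve the stacked equations in the least-squares sense
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
    b = -It[win].ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow

# Two synthetic "frames": the blob moves by (0.4, -0.3) pixels between them
prev_img = gaussian_blob(64, 32.0, 32.0)
next_img = gaussian_blob(64, 32.4, 31.7)

dx, dy = lucas_kanade_window(prev_img, next_img, 32, 32)
print(f"estimated displacement: dx={dx:.2f}, dy={dy:.2f}")
```

The estimate should land close to the true shift because the motion is small and the image is smooth. For larger motions the linear approximation breaks down, which is one reason practical trackers work on image pyramids (the `maxLevel` parameter in `lk_params`).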