# What is optical flow?

> Optical flow or optic flow is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer and a scene. - https://en.wikipedia.org/wiki/Optical_flow


**In this competition, each player's motion can be essential for detecting helmet impacts. This notebook introduces a deep learning model for optical flow estimation.**
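To make the definition concrete: a dense optical flow field is commonly stored as an array of shape `(H, W, 2)` that holds a per-pixel displacement `(dx, dy)` from the first frame to the second. The toy sketch below is only an illustration under simplifying assumptions (integer displacements, NumPy only, hypothetical helper names `make_frame` and `warp_forward`). It builds two frames in which a small square moves and checks that the known flow carries the square onto its new position.

```python
import numpy as np

def make_frame(height, width, top, left, size=2):
    """Return a black frame with one white square (a stand-in for a player)."""
    frame = np.zeros((height, width), dtype=np.uint8)
    frame[top:top + size, left:left + size] = 255
    return frame

def warp_forward(frame, flow):
    """Move every non-zero pixel of `frame` by its integer flow vector (dx, dy)."""
    height, width = frame.shape
    warped = np.zeros_like(frame)
    ys, xs = np.nonzero(frame)
    new_xs = xs + flow[ys, xs, 0].astype(int)
    new_ys = ys + flow[ys, xs, 1].astype(int)
    # Drop pixels that would leave the image.
    inside = (new_xs >= 0) & (new_xs < width) & (new_ys >= 0) & (new_ys < height)
    warped[new_ys[inside], new_xs[inside]] = frame[ys[inside], xs[inside]]
    return warped

frame1 = make_frame(8, 8, top=2, left=1)
frame2 = make_frame(8, 8, top=3, left=4)  # the square moved by dx=3, dy=1

# Flow is zero everywhere except on the square, where it is (dx, dy) = (3, 1).
flow = np.zeros((8, 8, 2), dtype=np.float32)
flow[2:4, 1:3] = (3.0, 1.0)

flow_is_consistent = np.array_equal(warp_forward(frame1, flow), frame2)
print(flow_is_consistent)
```

A flow estimator such as the one introduced below solves the inverse problem: given only the two frames, it predicts the `(H, W, 2)` field.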

In [None]:
import os
import sys
sys.path.append('/kaggle/input/raft-pytorch')
import numpy as np
import cv2
import matplotlib.pyplot as plt
import torch

from glob import glob
from PIL import Image
from tqdm import tqdm

# RAFT introduction

This notebook uses **RAFT: Recurrent All-Pairs Field Transforms for Optical Flow**, which was introduced at ECCV 2020 by Teed et al. at Princeton University and won the Best Paper Award.
* https://arxiv.org/abs/2003.12039
* https://github.com/princeton-vl/RAFT (licensed under the BSD 3-Clause License)

Briefly, RAFT has the following features:
* Recurrent optical flow estimation: the flow is refined iteratively
* Pixel-wise (all-pairs) correlation between the two input images is computed once and reused in every subsequent recurrent step
* Lightweight, with fast inference and high accuracy
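The all-pairs correlation mentioned in the list can be illustrated with a few lines of NumPy. This is a minimal sketch, not the RAFT implementation: the function name `all_pairs_correlation` is hypothetical, the random arrays stand in for learned feature maps, and the `1/sqrt(C)` scaling is included as an assumption about a reasonable normalization.

```python
import numpy as np

def all_pairs_correlation(fmap1, fmap2):
    """Correlate every pixel of fmap1 with every pixel of fmap2.

    fmap1, fmap2: feature maps of shape (C, H, W).
    Returns an array of shape (H, W, H, W); entry [i, j, k, l] is the dot
    product of fmap1[:, i, j] and fmap2[:, k, l], scaled by 1/sqrt(C).
    """
    channels = fmap1.shape[0]
    corr = np.einsum('cij,ckl->ijkl', fmap1, fmap2)
    return corr / np.sqrt(channels)

# Random stand-ins for the per-frame feature maps (assumption for illustration).
rng = np.random.default_rng(0)
fmap1 = rng.standard_normal((16, 6, 8)).astype(np.float32)
fmap2 = rng.standard_normal((16, 6, 8)).astype(np.float32)

corr_volume = all_pairs_correlation(fmap1, fmap2)
print(corr_volume.shape)  # (6, 8, 6, 8)
```

Because the volume depends only on the two feature maps and not on the current flow estimate, it can be built once per image pair and then looked up at each recurrent step, which is the reuse described above.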

![RAFT architecture image from https://github.com/princeton-vl/RAFT](https://github.com/princeton-vl/RAFT/raw/master/RAFT.png)

Here are [my explanatory slides](https://speakerdeck.com/daigo0927/raft-recurrent-all-pairs-field-transforms-for-optical-flow) (in Japanese).

# Run RAFT on sample images

In [None]:
from raft.core.raft import RAFT
from raft.core.utils import flow_viz
from raft.core.utils.utils import InputPadder
from raft.config import RAFTConfig

In [None]:
config = RAFTConfig(
    dropout=0,
    alternate_corr=False,
    small=False,
    mixed_precision=False
)

model = RAFT(config)
model

In [None]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'device: {device}')

weights_path = '/kaggle/input/raft-pytorch/raft-sintel.pth'
# weights_path = '/kaggle/input/raft-pytorch/raft-things.pth'

ckpt = torch.load(weights_path, map_location=device)
model.to(device)
model.load_state_dict(ckpt)

In [None]:
image_files = glob('/kaggle/input/raft-pytorch/raft/demo-frames/*.png')
image_files = sorted(image_files)

print(f'Found {len(image_files)} images')
print(image_files)  # already sorted above

In [None]:
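def load_image(imfile, device):
    # Read an image as a float tensor of shape (1, 3, H, W) with values in [0, 255].
    # convert('RGB') guards against grayscale or RGBA files.
    img = np.array(Image.open(imfile).convert('RGB')).astype(np.uint8)
    img = torch.from_numpy(img).permute(2, 0, 1).float()
    return img[None].to(device)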


def viz(img1, img2, flo):
    img1 = img1[0].permute(1,2,0).cpu().numpy()
    img2 = img2[0].permute(1,2,0).cpu().numpy()
    flo = flo[0].permute(1,2,0).cpu().numpy()
    
    # map flow to rgb image
    flo = flow_viz.flow_to_image(flo)
    
    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(20, 4))
    ax1.set_title('input image1')
    ax1.imshow(img1.astype(int))
    ax2.set_title('input image2')
    ax2.imshow(img2.astype(int))
    ax3.set_title('estimated optical flow')
    ax3.imshow(flo)
    plt.show()

In [None]:
model.eval()
n_vis = 3

for file1, file2 in tqdm(zip(image_files[:n_vis], image_files[1:1+n_vis])):
    image1 = load_image(file1, device)
    image2 = load_image(file2, device)

    padder = InputPadder(image1.shape)
    image1, image2 = padder.pad(image1, image2)
    
    with torch.no_grad():
        flow_low, flow_up = model(image1, image2, iters=20, test_mode=True)
        
    viz(image1, image2, flow_up)

The first and second columns show the input image pair, and the right column shows the predicted optical flow.
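The `InputPadder` step in the loop above exists because the frame height and width need to be divisible by 8 before they are passed to the model. The sketch below is not the library's code. It is a NumPy illustration under the assumptions that the padding is split evenly between both sides and filled by edge replication; `pad_to_multiple` and `unpad` are hypothetical names.

```python
import numpy as np

def pad_to_multiple(image, multiple=8):
    """Edge-pad an (H, W, C) image so that H and W are divisible by `multiple`.

    Returns the padded image and the (top, bottom, left, right) padding.
    """
    height, width = image.shape[:2]
    pad_h = (-height) % multiple
    pad_w = (-width) % multiple
    top, left = pad_h // 2, pad_w // 2
    bottom, right = pad_h - top, pad_w - left
    padded = np.pad(image, ((top, bottom), (left, right), (0, 0)), mode='edge')
    return padded, (top, bottom, left, right)

def unpad(array, pads):
    """Crop an (H, W, C) array back to the size it had before padding."""
    top, bottom, left, right = pads
    height, width = array.shape[:2]
    return array[top:height - bottom, left:width - right]

image = np.zeros((436, 1024, 3), dtype=np.uint8)  # e.g. a 1024x436 frame
padded, pads = pad_to_multiple(image)
print(padded.shape)  # (440, 1024, 3)

restored = unpad(padded, pads)
print(restored.shape)  # (436, 1024, 3)
```

Note that the images and flow displayed by `viz` are the padded versions; cropping the flow back to the original frame size (as `unpad` does here) matters when the flow must line up with pixel coordinates such as bounding boxes.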

# Run on NFL video

In [None]:
video_file = '/kaggle/input/nfl-impact-detection/train/57583_000082_Endzone.mp4'

cap = cv2.VideoCapture(video_file)

frames = []
while True:
    has_frame, image = cap.read()
    
    if has_frame:
        image = image[:, :, ::-1] # convert BGR -> RGB
        frames.append(image)
    else:
        break
cap.release()  # free the video handle once all frames have been read
frames = np.stack(frames, axis=0)

print(f'frame shape: {frames.shape}')    
plt.imshow(frames[0])

In [None]:
n_vis = 3

for i in range(n_vis):
    # (H, W, 3) uint8 frame -> (1, 3, H, W) float tensor on the target device
    image1 = torch.from_numpy(frames[i]).permute(2, 0, 1).float()[None].to(device)
    image2 = torch.from_numpy(frames[i+1]).permute(2, 0, 1).float()[None].to(device)

    padder = InputPadder(image1.shape)
    image1, image2 = padder.pad(image1, image2)
    
    with torch.no_grad():
        flow_low, flow_up = model(image1, image2, iters=20, test_mode=True)
        
    viz(image1, image2, flow_up)

RAFT seems to capture the motion of each player.
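One possible way to turn such a flow field into a per-player motion feature is to average the flow magnitude inside a bounding box. This is only a sketch under stated assumptions: the box coordinates and the helper names `flow_magnitude` and `mean_speed_in_box` are hypothetical, and a synthetic flow field stands in for the model output.

```python
import numpy as np

def flow_magnitude(flow):
    """Per-pixel speed in pixels/frame from an (H, W, 2) flow field."""
    return np.hypot(flow[..., 0], flow[..., 1])

def mean_speed_in_box(flow, left, top, width, height):
    """Average flow magnitude inside a (hypothetical) helmet bounding box."""
    region = flow_magnitude(flow)[top:top + height, left:left + width]
    return float(region.mean())

# Toy flow field: static background, one 10x10 patch moving by (3, 4) px/frame.
flow = np.zeros((72, 128, 2), dtype=np.float32)
flow[20:30, 40:50] = (3.0, 4.0)

speed = mean_speed_in_box(flow, left=40, top=20, width=10, height=10)
print(speed)  # 5.0
```

To apply this to the model output, the tensor can be converted the same way `viz` does it, `flow_up[0].permute(1, 2, 0).cpu().numpy()`, which yields an `(H, W, 2)` array.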

# Have a nice football flow!