# Tutorial, chapter 7

In this tutorial you will learn how to

- Convert and import the ``sfu-hw-objects-v1`` custom video dataset
- Visualize frames from the video dataset

In [1]:
# https://nbconvert.readthedocs.io/en/latest/removing_cells.html
# use these magics to reload your classes' methods on the fly as you edit them:
%reload_ext autoreload
%autoreload 2
from pprint import pprint
from IPython.display import display, HTML, Markdown
import ipywidgets as widgets
# %run includeme.ipynb # include a notebook from this same directory
display(HTML("<style>.container { width:100% !important; }</style>"))


In [2]:
path_to_sfu_hw_objects_v1="/home/sampsa/silo/interdigital/mock/SFU-HW-Objects-v1"

In [3]:
!tree {path_to_sfu_hw_objects_v1} --filelimit=10 | cat

/home/sampsa/silo/interdigital/mock/SFU-HW-Objects-v1
├── ClassC
│   ├── Annotations
│   │   └── BasketballDrill [505 entries exceeds filelimit, not opening dir]
│   └── BasketballDrill_832x480_50Hz_8bit_P420.yuv
└── ClassX
    ├── Annotations
    │   └── BasketballDrill
    │       ├── BasketballDrill_832x480_50_seq_001.txt
    │       ├── BasketballDrill_832x480_50_seq_002.txt
    │       ├── BasketballDrill_832x480_50_seq_003.txt
    │       ├── BasketballDrill_832x480_50_seq_004.txt
    │       ├── BasketballDrill_832x480_object.list
    │       ├── video.mkv
    │       ├── video.mp4
    │       └── video.webm
    └── BasketballDrill_832x480_50Hz_8bit_P420.yuv -> /home/sampsa/silo/interdigital/mock/SFU-HW-Objects-v1/ClassC/BasketballDrill_832x480_50Hz_8bit_P420.yuv

6 directories, 10 files
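
Raw `.yuv` files have no header, so frame size, frame rate and pixel format have to be supplied from outside the file. In the listing above the file name appears to carry them (`832x480`, `50Hz`, `8bit`, `P420`). The following is a minimal sketch of parsing that naming pattern, assuming the pattern also holds for other sequences. `parse_yuv_name` and `YuvInfo` are hypothetical helpers written for illustration, not part of compressai-vision:

```python
import re
from typing import NamedTuple


class YuvInfo(NamedTuple):
    """Parameters recovered from a raw video file name (hypothetical helper type)."""
    name: str
    width: int
    height: int
    fps: int
    bit_depth: int
    chroma: str


# Assumed pattern: <name>_<width>x<height>_<fps>Hz_<bits>bit_P<chroma>.yuv
_PATTERN = re.compile(
    r"^(?P<name>.+)_(?P<width>\d+)x(?P<height>\d+)_(?P<fps>\d+)Hz"
    r"_(?P<bits>\d+)bit_P(?P<chroma>\d+)\.yuv$"
)


def parse_yuv_name(filename: str) -> YuvInfo:
    """Split a file name such as BasketballDrill_832x480_50Hz_8bit_P420.yuv into fields."""
    m = _PATTERN.match(filename)
    if m is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return YuvInfo(
        name=m["name"],
        width=int(m["width"]),
        height=int(m["height"]),
        fps=int(m["fps"]),
        bit_depth=int(m["bits"]),
        chroma=m["chroma"],
    )


info = parse_yuv_name("BasketballDrill_832x480_50Hz_8bit_P420.yuv")
print(info)
```

The parser raises `ValueError` on any name that does not fit the assumed pattern, so a misnamed file is reported instead of being silently misread.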


In [4]:
!compressai-vision convert-video --dataset-type=sfu-hw-objects-v1 --dir={path_to_sfu_hw_objects_v1} --y


Converting raw video proper container format

Dataset type           :  sfu-hw-objects-v1
Dataset root directory :  /home/sampsa/silo/interdigital/mock/SFU-HW-Objects-v1

finding .yuv files from /home/sampsa/silo/interdigital/mock/SFU-HW-Objects-v1
ffmpeg -y -f rawvideo -pixel_format yuv420p -video_size 832x480 -i /home/sampsa/silo/interdigital/mock/SFU-HW-Objects-v1/ClassC/BasketballDrill_832x480_50Hz_8bit_P420.yuv -an -c:v h264 -q 0 -r 50 /home/sampsa/silo/interdigital/mock/SFU-HW-Objects-v1/ClassC/Annotations/BasketballDrill/video.mp4
ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass

frame= 1002 fps=188 q=-1.0 Lsize=    4188kB time=00:00:19.98 bitrate=1717.2kbits/s dup=501 drop=0 speed=3.76x    
video:4176kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.301671%
[libx264 @ 0x55e29ad56000] frame I:5     Avg QP:23.20  size: 57827
[libx264 @ 0x55e29ad56000] frame P:253   Avg QP:25.76  size: 11964
[libx264 @ 0x55e29ad56000] frame B:744   Avg QP:30.21  size:  1289
[libx264 @ 0x55e29ad56000] consecutive B-frames:  1.0%  0.0%  0.0% 99.0%
[libx264 @ 0x55e29ad56000] mb I  I16..4:  7.5% 42.0% 50.5%
[libx264 @ 0x55e29ad56000] mb P  I16..4:  0.1%  5.8%  2.6%  P16..4: 42.8% 16.3% 11.0%  0.0%  0.0%    skip:21.4%
[libx264 @ 0x55e29ad56000] mb B  I16..4:  0.0%  0.1%  0.0%  B16..8: 25.2%  2.8%  1.0%  direct: 0.8%  skip:70.1%  L0:62.4% L1:31.1% BI: 6.6%
[libx264 @ 0x55e29ad56000] 8x8 transform intra:63.8% inter:66.2%
[libx264 @ 0x55e29ad56000] coded y,uv

In [None]:
!compressai-vision import-video --dataset-type=sfu-hw-objects-v1 --dir={path_to_sfu_hw_objects_v1} --y

importing fiftyone
fiftyone imported

Importing a custom video format into fiftyone

Dataset type           :  sfu-hw-objects-v1
Dataset root directory :  /home/sampsa/silo/interdigital/mock/SFU-HW-Objects-v1



Let's continue in a Python notebook:

In [None]:
import cv2
import matplotlib.pyplot as plt
import fiftyone as fo
from fiftyone import ViewField as F

In [None]:
dataset=fo.load_dataset("sfu-hw-objects-v1")

In [None]:
dataset

In contrast to image datasets, where each sample is an image, here each sample corresponds to a video:

In [None]:
dataset.first()

Each sample holds a reference to the video file and a ``Frames`` object that encapsulates per-frame data such as ground truths.  For ``sfu-hw-objects-v1`` in particular, ``class_tag`` corresponds to the class directories (ClassA, ClassB, etc.), while ``name_tag`` corresponds to the descriptive video names (BasketballDrill, Traffic, PeopleOnStreet, etc.).  Let's pick a particular video sample:

In [None]:
sample = dataset[ (F("name_tag") == "BasketballDrill") & (F("class_tag") == "ClassC") ].first()

Take a look at the ground truth detections of the first frame (note that frame indices start from 1):

In [None]:
sample.frames[1]
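
Because frame numbers in ``sample.frames`` start from 1 while OpenCV frame positions start from 0, seeking code has to shift the index by one. A minimal sketch of that mapping follows; `to_opencv_index` is a hypothetical helper name used only for illustration:

```python
def to_opencv_index(frame_number: int) -> int:
    """Map a 1-based frame number to the 0-based position OpenCV expects."""
    if frame_number < 1:
        raise ValueError("frame numbers start from 1")
    return frame_number - 1


# The first three frame numbers and the positions one would seek to
for frame_number in (1, 2, 3):
    print(frame_number, "->", to_opencv_index(frame_number))
```

Rejecting values below 1 makes an accidental 0-based call fail loudly instead of silently seeking to the wrong frame.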

In [None]:
vid=cv2.VideoCapture(sample.filepath)

Let's define a small helper function:

In [None]:
def draw_detections(sample: fo.Sample, vid: cv2.VideoCapture, nframe: int):
    """Return frame ``nframe`` (1-based) as a BGR array with the ground truth boxes drawn on it."""
    from math import floor
    # fiftyone frame numbers start from 1, OpenCV frame positions from 0
    ok = vid.set(cv2.CAP_PROP_POS_FRAMES, nframe-1)
    if not ok:
        raise AssertionError("seek failed")
    ok, arr = vid.read() # BGR image in arr
    if not ok:
        raise AssertionError("no image")
    height, width = arr.shape[:2]
    for detection in sample.frames[nframe].detections.detections:
        x0, y0, w, h = detection.bounding_box # rel coords: top-left x, y, width, height
        x1, y1, x2, y2 = floor(x0*width), floor(y0*height), floor((x0+w)*width), floor((y0+h)*height)
        arr=cv2.rectangle(arr, (x1, y1), (x2, y2), (255, 0, 0), 5) # blue box (BGR)
    return arr
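
The coordinate conversion inside the helper can be tried out on its own, without a video or a dataset. The sketch below assumes the same box layout the helper uses (relative top-left x, top-left y, width, height); `rel_box_to_pixels` is a hypothetical name introduced for illustration:

```python
from math import floor
from typing import Sequence, Tuple


def rel_box_to_pixels(
    box: Sequence[float], width: int, height: int
) -> Tuple[int, int, int, int]:
    """Convert a relative [x, y, w, h] box to pixel corners (x1, y1, x2, y2)."""
    x0, y0, w, h = box
    return (
        floor(x0 * width),         # left edge
        floor(y0 * height),        # top edge
        floor((x0 + w) * width),   # right edge
        floor((y0 + h) * height),  # bottom edge
    )


# A box covering the middle half of the width on an 832x480 frame
corners = rel_box_to_pixels((0.25, 0.5, 0.5, 0.25), width=832, height=480)
print(corners)
```

The two corner points are in the form that ``cv2.rectangle`` takes as its second and third arguments.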

In [None]:
img=draw_detections(sample, vid, 2)
img = img[:,:,::-1] # BGR -> RGB

In [None]:
plt.imshow(img)

In [None]:
vid.release()

Visualize video and annotations in the fiftyone app:

In [None]:
# fo.launch_app(dataset)

In [None]:
# TODO: 
# - polish this tutorial
# - link reference the fiftyone video quickstart tutorial
# - mention that the normal "register" command can be used to import standard video collection formats!
# - next: evaluation tutorial (just the same detectron2-eval etc. command as always!) .. maybe just mentioning this is enough.. or just give an example command