# Tracking Transformers in [Mojito](https://www.bilibili.com/video/BV1PK4y1b7dt) <a href="https://colab.research.google.com/github/eto-ai/rikai/blob/main/notebooks/MojitoVideo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

> Please bring my love a glass of Mojito 🎵🎵🎵

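This notebook demonstrates how to track transformers in the Mojito music video by running YOLOv5 object detection on its frames with Rikai. [spark-video](https://github.com/eto-ai/spark-video) is required to load the video into Spark.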

<p align="center">
<img alt="Light" src="https://i.scdn.co/image/ab67616d0000b273466def3ce70d94dcacb13c8d" width="56.5%">
<img alt="Light" src="https://canvas-bridge02.tubitv.com/exp01/Q7N97TP4AKS3COmXOGU4Qo4OVwY=/400x574/smart/img.adrise.tv/ec2e6d2f-538b-4b74-997a-4e56c3783676.png">
</p>

## Preparation 1/2: Install python packages and download the video

We recommend using a virtual Python environment created by conda.

In [None]:
!pip install ipython==7.31.1
!pip install rikai
!pip install rikai-torchhub
!pip install rikai-yolov5
!pip install you-get
!you-get --format=dash-flv360 https://www.bilibili.com/video/BV1PK4y1b7dt -O Mojito.mp4

## Preparation 2/2: Install spark-video
For local development, please install spark-video manually via:

https://github.com/eto-ai/spark-video#local-development

In [None]:
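# This cell is for Google Colab only: it downloads the spark-video jar and
# its FFmpeg/JavaCPP dependencies into PySpark's jars directory.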

!wget -O /usr/local/lib/python3.7/dist-packages/pyspark/jars/ffmpeg-4.4-1.5.6-linux-x86_64.jar https://repo1.maven.org/maven2/org/bytedeco/ffmpeg/4.4-1.5.6/ffmpeg-4.4-1.5.6-linux-x86_64.jar
!wget -O /usr/local/lib/python3.7/dist-packages/pyspark/jars/javacpp-1.5.6-linux-x86_64.jar https://repo1.maven.org/maven2/org/bytedeco/javacpp/1.5.6/javacpp-1.5.6-linux-x86_64.jar
!wget -O /usr/local/lib/python3.7/dist-packages/pyspark/jars/spark-video-assembly-0.0.4.jar https://github.com/eto-ai/spark-video/releases/download/v0.0.4/spark-video-assembly_2.12-0.0.4.jar

## Video Visualization

In [1]:
from rikai.types.video import VideoStream

# For now, only browser-playable URIs can be played in Jupyter/Databricks notebooks
uri = 'https://user-images.githubusercontent.com/1267865/154184628-d635c62e-d441-4853-8a67-65a0e838157d.mp4'
video = VideoStream(uri)
video

## Initialization
+ Initialize the Spark Session with Rikai support
+ Register all built-in UDFs
+ Create the yolov5s model

In [2]:
from rikai.spark.utils import init_spark_session
from rikai.spark.functions import init

spark = init_spark_session(
    {
        # Let Rikai resolve "torchhub://" model URIs
        "spark.rikai.sql.ml.registry.torchhub.impl": "ai.eto.rikai.sql.model.torchhub.TorchHubRegistry",
        "spark.driver.memory": "4g",
        "spark.executor.memory": "4g",
    }
)

init(spark)

:: loading settings :: url = jar:file:/Users/da/.pyenv/versions/3.8.10/envs/rikai/lib/python3.8/site-packages/pyspark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml


Ivy Default Cache set to: /Users/da/.ivy2/cache
The jars for the packages stored in: /Users/da/.ivy2/jars
ai.eto#rikai_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-7fe2df43-ccc7-4f08-ac8c-851106cefa83;1.0
	confs: [default]
	found ai.eto#rikai_2.12;0.1.4 in central
	found org.antlr#antlr4-runtime;4.8-1 in local-m2-cache
	found com.thoughtworks.enableIf#enableif_2.12;1.1.7 in central
	found org.xerial.snappy#snappy-java;1.1.8.4 in central
	found com.typesafe.scala-logging#scala-logging_2.12;3.9.4 in central
	found org.slf4j#slf4j-api;1.7.30 in spark-list
	found io.circe#circe-core_2.12;0.12.3 in central
	found io.circe#circe-numbers_2.12;0.12.3 in central
	found org.typelevel#cats-core_2.12;2.0.0 in central
	found org.typelevel#cats-macros_2.12;2.0.0 in central
	found org.typelevel#cats-kernel_2.12;2.0.0 in central
	found io.circe#circe-generic_2.12;0.12.3 in central
	found com.chuusai#shapeless_2.12;2.3.3 in spark-list
	found org.typelevel

In [3]:
spark.sql("""
CREATE OR REPLACE MODEL yolov5s
OPTIONS (device="cpu", batch_size=32)
USING "torchhub:///ultralytics/yolov5:v6.0/yolov5s";
""")

DataFrame[]
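
The model URI in the `USING` clause packs several pieces of information into one string. As a rough illustration only (this is not Rikai's own parser, and the mapping to `torch.hub.load` arguments is an assumption based on the usual `owner/repo:ref` Torch Hub convention), the URI can be taken apart with the standard library:

```python
from urllib.parse import urlparse

uri = "torchhub:///ultralytics/yolov5:v6.0/yolov5s"
parsed = urlparse(uri)

# The path looks like "/<owner>/<repo>:<tag>/<model>"
owner, repo_and_tag, model_name = parsed.path.strip("/").split("/")
repo, _, tag = repo_and_tag.partition(":")

# Assumed equivalent Torch Hub call (not executed here):
#   torch.hub.load(f"{owner}/{repo}:{tag}", model_name)
print(parsed.scheme, owner, repo, tag, model_name)
```

Under that reading, switching to another YOLOv5 variant would only require changing the last path segment of the URI.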

## How many objects are in this video?

In [4]:
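# Expose the video as a Spark view, sampling frames with a step of 400
# and resizing each sampled frame to 640x360.
spark.sql("""
CREATE OR REPLACE TEMPORARY VIEW mojito
USING video
OPTIONS (
  path 'Mojito.mp4',
  frameStepSize 400,
  imageWidth 640,
  imageHeight 360
)
""")
# Mark the view for caching so that later queries can reuse the decoded frames.
spark.table("mojito").cache()
# Count the sampled frames (one row per frame in the view).
spark.sql("""
select count(*) from mojito
""").toPandas()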

Input #0, flv, from 'file:///Users/da/github/eto-ai/rikai/notebooks/Mojito.flv':
  Metadata:
    description     : Bilibili VXCode Swarm Transcoder v0.2.30(gap_fixed:False)
    metadatacreator : Version 1.9
    hasKeyframes    : true
    hasVideo        : true
    hasAudio        : true
    hasMetadata     : true
    canSeekToEnd    : true
    datasize        : 72948997
    videosize       : 65303172
    audiosize       : 7587917
    lasttimestamp   : 188
    lastkeyframetimestamp: 188
    lastkeyframelocation: 72950361
  Duration: 00:03:08.40, start: 0.067000, bitrate: 3097 kb/s
  Stream #0:0: Video: h264 (High), yuv420p(tv, bt709/unknown/bt709, progressive), 1920x1080, 2771 kb/s, 30 fps, 29.97 tbr, 1k tbn, 59.94 tbc
  Stream #0:1: Audio: aac (LC), 48000 Hz, stereo, fltp, 318 kb/s
                                                                                

| | count(1) |
|---|---|
| 0 | 15 |
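
The count of 15 can be sanity-checked from the FFmpeg log above. The sketch below is a back-of-the-envelope estimate, assuming the reader keeps one frame per `frameStepSize` frames starting from the first frame; spark-video's actual sampling logic may differ in detail.

```python
import math

duration_s = 3 * 60 + 8.40  # "Duration: 00:03:08.40" in the FFmpeg log
fps = 29.97                 # "29.97 tbr" in the FFmpeg log
frame_step = 400            # frameStepSize in the view definition

# Approximate number of frames in the whole video
total_frames = int(duration_s * fps)

# Assumption: one frame is kept per step, starting with the first frame
sampled_frames = math.ceil(total_frames / frame_step)

print(total_frames, sampled_frames)
```

A smaller `frameStepSize` would sample more frames and give the detector more chances to see each object, at the cost of a proportionally longer inference run.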


In [5]:
# Run YOLOv5 on every sampled frame and count the detections per label.
df = spark.sql("""
select pred.label, count(*)
from (
  select *, explode(ML_PREDICT(yolov5s, image)) as pred
  from (
    select to_image(image_data) as image
    from mojito
  )
)
group by pred.label
""")
df.toPandas()

Using cache found in /Users/da/.cache/torch/hub/ultralytics_yolov5_v6.0
YOLOv5 🚀 2021-12-29 torch 1.9.0 CPU

Fusing layers... 
Model Summary: 213 layers, 7225885 parameters, 0 gradients
Adding AutoShape... 
Using cache found in /Users/da/.cache/torch/hub/ultralytics_yolov5_v6.0
YOLOv5 🚀 2021-12-29 torch 1.9.0 CPU

Fusing layers... 
Model Summary: 213 layers, 7225885 parameters, 0 gradients
Adding AutoShape... 
                                                                                

| | label | count(1) |
|---|---|---|
| 0 | motorcycle | 1 |
| 1 | handbag | 1 |
| 2 | person | 78 |
| 3 | clock | 1 |
| 4 | backpack | 1 |
| 5 | truck | 3 |
| 6 | car | 18 |
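
To put these counts in perspective, here is a small plain-Python sketch that ranks the labels and computes each label's share of all detections. The numbers are copied from the output above; the variable names are only illustrative.

```python
# Detection counts copied from the output above
counts = {
    "motorcycle": 1,
    "handbag": 1,
    "person": 78,
    "clock": 1,
    "backpack": 1,
    "truck": 3,
    "car": 18,
}

total = sum(counts.values())

# Rank labels by how often they were detected
for label, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:<12}{n:>4}{n / total:>8.1%}")

# Detections that could plausibly be a transformer in disguise
candidates = counts["motorcycle"] + counts["truck"]
print("motorcycle + truck detections:", candidates)
```

People dominate the detections, so only a handful of motorcycle and truck boxes need to be inspected by eye.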


## Hey, let me see if the motorcycles and trucks are transformers

In [6]:
# Keep only the motorcycle and truck detections, with their frames and boxes.
cases = spark.sql("""
select frame_id, pred.label, pred.box, image
from (
  select *, explode(ML_PREDICT(yolov5s, image)) as pred
  from (
    select frame_id, to_image(image_data) as image
    from mojito
  )
)
where
  pred.label in ('motorcycle', 'truck')
""").toPandas()

cases

Using cache found in /Users/da/.cache/torch/hub/ultralytics_yolov5_v6.0
YOLOv5 🚀 2021-12-29 torch 1.9.0 CPU

Fusing layers... 
Model Summary: 213 layers, 7225885 parameters, 0 gradients
Adding AutoShape... 
Using cache found in /Users/da/.cache/torch/hub/ultralytics_yolov5_v6.0
YOLOv5 🚀 2021-12-29 torch 1.9.0 CPU

Fusing layers... 
Model Summary: 213 layers, 7225885 parameters, 0 gradients
Adding AutoShape... 
                                                                                

| | frame_id | label | box | image |
|---|---|---|---|---|
| 0 | 402 | truck | (29.30722427368164, 159.26272583007812, 108.02... | Image(&lt;embedded&gt;) |
| 1 | 5207 | motorcycle | (424.91314697265625, 226.67149353027344, 443.7... | Image(&lt;embedded&gt;) |
| 2 | 5608 | truck | (272.980712890625, 287.79486083984375, 338.023... | Image(&lt;embedded&gt;) |
| 3 | 5608 | truck | (0.3799285888671875, 255.99822998046875, 73.45... | Image(&lt;embedded&gt;) |
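
To find these detections in the video player, a `frame_id` can be converted into an approximate timestamp. This is a minimal sketch, assuming `frame_id` counts frames from the start of the video at the frame rate reported in the FFmpeg log; `to_timestamp` is a hypothetical helper, not part of Rikai or spark-video.

```python
fps = 29.97                          # "29.97 tbr" in the FFmpeg log
frame_ids = [402, 5207, 5608, 5608]  # frame_id column of the output above


def to_timestamp(frame_id: int, fps: float) -> str:
    """Convert a frame index into an approximate MM:SS.ss timestamp."""
    seconds = frame_id / fps
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes):02d}:{secs:05.2f}"


for frame_id in frame_ids:
    print(frame_id, to_timestamp(frame_id, fps))
```

Under this assumption the first truck appears early in the video, while the motorcycle and the other two trucks show up in its last fifteen seconds or so.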


## Let me check if they are transformers!

In [7]:
# Draw the first detection (a truck) on top of its frame
cases.image[0] | cases.box[0]

In [8]:
# Draw the second detection (the motorcycle) on top of its frame
cases.image[1] | cases.box[1]
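
The boxes are expressed in the coordinates of the resized 640x360 frames, while the source video is 1920x1080. The sketch below maps a box back to source resolution. It assumes the box tuple is ordered `(xmin, ymin, xmax, ymax)`; `scale_box` is a hypothetical helper and the sample box uses made-up values, not values from the output.

```python
from typing import Tuple

# Assumed ordering: (xmin, ymin, xmax, ymax)
Box = Tuple[float, float, float, float]


def scale_box(
    box: Box,
    src_size: Tuple[int, int] = (640, 360),    # imageWidth, imageHeight of the view
    dst_size: Tuple[int, int] = (1920, 1080),  # resolution reported by FFmpeg
) -> Box:
    """Rescale a box from the resized frame to the original frame."""
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    xmin, ymin, xmax, ymax = box
    return (xmin * sx, ymin * sy, xmax * sx, ymax * sy)


# Illustrative box only; not taken from the detection results
hypothetical_box = (30.0, 160.0, 110.0, 200.0)
print(scale_box(hypothetical_box))
```

Both axes scale by the same factor of 3 here, so the aspect ratio of each box is preserved.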