<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc" style="margin-top: 1em;"><ul class="toc-item"><li><span><a href="#Rekall-Queries" data-toc-modified-id="Rekall-Queries-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Rekall Queries</a></span><ul class="toc-item"><li><span><a href="#1-on-1-Panel-Interviews" data-toc-modified-id="1-on-1-Panel-Interviews-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>1-on-1 Panel Interviews</a></span><ul class="toc-item"><li><span><a href="#Jake-Tapper-and-Bernie-Sanders-in-the-same-shot" data-toc-modified-id="Jake-Tapper-and-Bernie-Sanders-in-the-same-shot-1.1.1"><span class="toc-item-num">1.1.1&nbsp;&nbsp;</span>Jake Tapper and Bernie Sanders in the same shot</a></span></li><li><span><a href="#Full-interviews" data-toc-modified-id="Full-interviews-1.1.2"><span class="toc-item-num">1.1.2&nbsp;&nbsp;</span>Full interviews</a></span></li></ul></li><li><span><a href="#Short-consecutive-shots" data-toc-modified-id="Short-consecutive-shots-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Short consecutive shots</a></span></li></ul></li></ul></div>

# Rekall Queries

This notebook serves as a tutorial on how to use rekall to make queries.

## Importing rekall
`rekall` is included as a git submodule in `app/deps`. `app/esper/rekall.py` sets up the dependencies and makes the `rekall` module available. This is how you import everything:

```
from esper.rekall import *
from rekall.temporal_predicates import *
from rekall.interval_list import Interval, IntervalList
```

In [None]:
# Run this first!

from esper.widget import *
from esper.prelude import *
from esper.spark_util import *
from pyspark.sql.functions import *
from esper.rekall import *
from rekall.temporal_predicates import *
from rekall.interval_list import Interval, IntervalList

import numpy as np
import sys

## 1-on-1 Panel Interviews

First, let's look for panel interviews. These are interviews where the host and interview subject are in different places, so their faces are side by side in a panel arrangement. Here's a good example of an interview between Chris Hayes and Beto O'Rourke: https://youtu.be/j0hwHmofc6w?t=66.

### Jake Tapper and Bernie Sanders in the same shot

Let's start by looking for an interview between Jake Tapper and Bernie Sanders. First, we'll load in shots where Jake Tapper or Bernie Sanders appear, and then we'll start joining them in different ways. To keep things reasonable, we're going to limit results to a 100-video sandbox.

In [None]:
# Load sandbox
sandbox_videos = [529, 763, 2648, 3459, 3730, 3769, 3952, 4143, 4611, 5281, 6185, 7262, 8220,
    8697, 8859, 9215, 9480, 9499, 9901, 10323, 10335, 11003, 11555, 11579, 11792,
    12837, 13058, 13141, 13247, 13556, 13827, 13927, 13993, 14482, 15916, 16215,
    16542, 16693, 16879, 17458, 17983, 19882, 19959, 20380, 20450, 23181, 23184,
    24193, 24847, 24992, 25463, 26386, 27188, 27410, 29001, 31378, 32472, 32996,
    33004, 33387, 33541, 33800, 34359, 34642, 36755, 37107, 37113, 37170, 38275,
    38420, 40203, 40856, 41480, 41725, 42756, 45472, 45645, 45655, 45698, 48140,
    49225, 49931, 50164, 50561, 51175, 52075, 52749, 52945, 53355, 53684, 54377,
    55711, 57384, 57592, 57708, 57804, 57990, 59122, 59398, 60186]
sandbox_videos_df = spark.spark.createDataFrame(sandbox_videos, "int")

In [None]:
# This will take a while to load everything in to Spark initially.
face_identities = get_face_identities(include_bbox=True, include_name=True)
print('Schema: ', face_identities)

face_identities = face_identities.join(sandbox_videos_df,
                                       face_identities.video_id ==
                                       sandbox_videos_df.value)

Next, we're going to filter the face identities by identity (to get dataframes containing only Jake Tapper and only Bernie Sanders), and then we're going to import them into IntervalLists using `df_to_intrvllists`. IntervalLists are per-video, so we need a dict that maps from `video_id` to the IntervalList for that video. Luckily, `df_to_intrvllists` builds this dict for us. Note that by default, the start and end times are frame numbers.
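
As a rough mental model of the "dict of per-video lists" shape, here is a plain-Python sketch. It does not use `rekall` or Spark; the row layout `(video_id, min_frame, max_frame)` and the helper name `rows_to_intervals_by_video` are assumptions made up for illustration, and the real `df_to_intrvllists` may behave differently.

```python
from collections import defaultdict

# Hypothetical rows: (video_id, min_frame, max_frame).
# The real dataframe has more columns; this layout is only for illustration.
rows = [
    (529, 100, 250),
    (763, 40, 90),
    (529, 10, 60),
]

def rows_to_intervals_by_video(rows):
    """Group rows into {video_id: [(start, end), ...]}, sorted by start frame."""
    by_video = defaultdict(list)
    for video_id, start, end in rows:
        by_video[video_id].append((start, end))
    return {video: sorted(intervals) for video, intervals in by_video.items()}

intervals_by_video = rows_to_intervals_by_video(rows)
print(intervals_by_video)
# {529: [(10, 60), (100, 250)], 763: [(40, 90)]}
```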

In [None]:
# The filters and materialization will also take a while, but after that everything should be snappy.
jake_tapper = face_identities.filter(
    face_identities.name == 'jake tapper').filter(
    'probability > 0.7')
bernie_sanders = face_identities.filter(
    face_identities.name == 'bernie sanders').filter(
    'probability > 0.7').alias('bernie_sanders')

jake_by_video = df_to_intrvllists(jake_tapper)
bernie_by_video = df_to_intrvllists(bernie_sanders)

Time to get all the shots where Jake and Bernie are both in the shot. We're going to loop by video ID and use the `overlaps` function of an IntervalList to get overlapping segments between Jake and Bernie. Finally, we'll display our results in the Esper widget, using the `intrvllists_to_result` function to get a result that the Esper widget is happy with.
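
To build some intuition for what an overlap query computes, here is a minimal sketch on plain `(start, end)` tuples. It is not the `rekall` implementation; `overlap_intervals` is a made-up helper, and it assumes that "overlap" means the non-empty intersection of each pair of intervals.

```python
def overlap_intervals(a_list, b_list):
    """Return the non-empty intersections of every pair drawn from a_list and b_list."""
    result = []
    for a_start, a_end in a_list:
        for b_start, b_end in b_list:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:  # keep only pairs that really overlap
                result.append((start, end))
    return sorted(result)

# Toy frame ranges, not real data.
jake = [(0, 100), (300, 400)]
bernie = [(50, 150), (390, 500)]

both = overlap_intervals(jake, bernie)
print(both)
# [(50, 100), (390, 400)]
```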

In [None]:
jake_and_bernie_by_video = {}
for video in jake_by_video:
    if video not in bernie_by_video:
        continue
    jake = jake_by_video[video]
    bernie = bernie_by_video[video]
    jake_and_bernie = jake.overlaps(bernie)
    jake_and_bernie_by_video[video] = jake_and_bernie
    
esper_widget(intrvllists_to_result(jake_and_bernie_by_video))

That's not bad, but we see a few problems: we have some false positives (where Jake Tapper is talking next to a small Bernie head), and our true positives don't get the full interview range. Let's see if we can do better.

### Full interviews

Now that we have all the frames where they appear together, it would be great if we could get the full interviews. The problem is that interviews will often cut from both people in frame together to just one of them talking on their own. It would be great if we could include cuts to Bernie Sanders by himself.

We're going to use the `merge` function of an IntervalList for this. The `merge` operation works on two lists (`a.merge(b, predicate=pred)`) and does three things:
* Computes a cross product between the elements of `a` and the elements of `b`
* Filters the pairs by some predicate `pred`
* Merges the two intervals in the pair by computing the minimum interval that covers both of them (so that non-overlapping intervals also get merged in a reasonable way)

You'll see this pattern appear a few times throughout this document - whenever we need to do some sort of cross product operation, we need to define what to do with the pairs that come out of the cross product to get back to a list of intervals.

So we'll use `merge` to combine shots of Jake+Bernie with shots of Bernie by himself. What should our predicate be? Ideally, we want to include combinations where Bernie by himself comes before the two of them and where he comes after the two of them, so we'll define `or_pred(before(max_dist=10), after(max_dist=10))`. This limits our merge to shots that are within `10` frames of each other. Let's give it a try!
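
The three steps (cross product, predicate, covering interval) can be sketched in plain Python. This is only an illustration: `is_before`, `is_after`, `either`, and `merge_pairs` are hypothetical stand-ins, and the sketch assumes that "before with `max_dist`" means the first interval ends at or before the start of the second, with a gap of at most `max_dist` frames.

```python
def is_before(max_dist):
    """a ends at or before b starts, with a gap of at most max_dist."""
    return lambda a, b: 0 <= b[0] - a[1] <= max_dist

def is_after(max_dist):
    """a starts at or after b ends, with a gap of at most max_dist."""
    return lambda a, b: 0 <= a[0] - b[1] <= max_dist

def either(p, q):
    """Combine two predicates with a logical OR."""
    return lambda a, b: p(a, b) or q(a, b)

def merge_pairs(a_list, b_list, predicate):
    """Cross product, filter by predicate, then span each surviving pair."""
    merged = []
    for a in a_list:
        for b in b_list:
            if predicate(a, b):
                merged.append((min(a[0], b[0]), max(a[1], b[1])))
    return sorted(merged)

# Toy frame ranges, not real data.
bernie_alone = [(0, 95), (205, 300), (1000, 1100)]
together = [(100, 200)]

merged = merge_pairs(bernie_alone, together, either(is_before(10), is_after(10)))
print(merged)
# [(0, 200), (100, 300)]
```

Note that the two output intervals overlap each other, because each pair that passes the predicate produces its own covering interval.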

In [None]:
jb_next_to_bernie_by_video = {}
for video in jake_and_bernie_by_video:
    jb = jake_and_bernie_by_video[video]
    bernie = bernie_by_video[video]
    
    jb_next_to_bernie_by_video[video] = bernie.merge(jb,
                                                 predicate=or_pred(
                                                     before(max_dist=10),
                                                     after(max_dist=10)
                                                 ))

esper_widget(intrvllists_to_result(jb_next_to_bernie_by_video))

Better! But we still have some of the false positives in our set. Let's filter out all the intervals that don't last for at least 5400 frames (180 seconds for a 30 fps video). We can use `filter_length` for this.
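
For reference, the frames-to-seconds arithmetic and the length filter look like this in plain Python. The helpers `seconds_to_frames` and `filter_min_length` are made up for this sketch, the 30 fps frame rate is the assumption stated above, and the sketch assumes the filter keeps intervals whose length is at least the minimum.

```python
FPS = 30  # assumed frame rate; real videos may differ

def seconds_to_frames(seconds, fps=FPS):
    """Convert a duration in seconds to a number of frames."""
    return int(seconds * fps)

def filter_min_length(intervals, min_length):
    """Keep intervals that span at least min_length frames."""
    return [(start, end) for start, end in intervals if end - start >= min_length]

min_frames = seconds_to_frames(180)
print(min_frames)
# 5400

# Toy frame ranges: one short false positive and one long interview.
candidates = [(0, 200), (1000, 7000)]
long_enough = filter_min_length(candidates, min_frames)
print(long_enough)
# [(1000, 7000)]
```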

In [None]:
interviews_long_enough_by_video = {}
for video in jb_next_to_bernie_by_video:
    filtered = jb_next_to_bernie_by_video[video].filter_length(min_length = 5400)
    
    if filtered.size() > 0:
        interviews_long_enough_by_video[video] = filtered

esper_widget(intrvllists_to_result(interviews_long_enough_by_video))

We seem to have lost a bunch of interviews... What happened? Let's print out the original `jb_next_to_bernie_by_video` dict.

In [None]:
for video in jb_next_to_bernie_by_video:
    print(jb_next_to_bernie_by_video[video].get_temporal_ranges())

There are a bunch of overlapping ranges for each video! This is a side effect of our `merge` from earlier, and you may have noticed shot boundaries in the Esper interface as well. We can use the `coalesce` function to fix this problem. This function operates on a single IntervalList and merges overlapping intervals in a single video.
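
Here is a minimal sketch of the idea on plain `(start, end)` tuples. The helper `coalesce_intervals` is hypothetical and is not the `rekall` implementation; in particular, treating intervals that merely touch as overlapping is an assumption of this sketch.

```python
def coalesce_intervals(intervals):
    """Merge overlapping (or touching) intervals into maximal ranges."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Toy frame ranges, not real data.
overlapping = [(100, 300), (0, 200), (250, 400), (1000, 1100)]
print(coalesce_intervals(overlapping))
# [(0, 400), (1000, 1100)]
```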

In [None]:
for video in jb_next_to_bernie_by_video:
    jb_next_to_bernie_by_video[video] = jb_next_to_bernie_by_video[video].coalesce()
    print(jb_next_to_bernie_by_video[video].get_temporal_ranges())

This is much closer to what we want. Each interview segment gets its own interval. You may notice some gaps between neighboring ranges; we'll deal with that soon.

But now that our IntervalLists have been coalesced, let's do the same filter from before.

In [None]:
interviews_long_enough_by_video = {}
for video in jb_next_to_bernie_by_video:
    filtered = jb_next_to_bernie_by_video[video].filter_length(min_length = 5400)
    
    if filtered.size() > 0:
        interviews_long_enough_by_video[video] = filtered

result = intrvllists_to_result(interviews_long_enough_by_video)

esper_widget(result, show_middle_frame=False, timeline_range=20)

That's great! If you've followed all the instructions so far, the previous query should have returned ten videos. Because I curated this dataset, I happen to know that we're missing one interview. If you go back and look at our first query, it's the one with the text "Clinton: Free College is a False Promise" in the lower third of the thumbnail. Alternatively, you can just look at it here:

In [None]:
esper_widget(intrvllists_to_result({12837: jb_next_to_bernie_by_video[12837]}))

What's going on here? The interview is cutting from clips of Bernie Sanders to *Hillary Clinton* in the middle of the interview, so our `coalesce` is not successfully merging neighboring portions.

We can fix this by using `dilate` to extend all our clips by 300 frames (10 seconds) before using `coalesce`. `dilate` operates on a single IntervalList and extends the start and end window in either direction by a certain amount.

After coalescing, we'll `dilate` by `-300` to get rid of any extra video that we've added on either side.
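
The dilate, coalesce, shrink trick can be sketched on plain `(start, end)` tuples. Both helpers below are hypothetical stand-ins rather than `rekall` code, and the sketch assumes that dilating by `window` moves the start back and the end forward by `window` frames.

```python
def dilate_intervals(intervals, window):
    """Grow (or, for negative window, shrink) each interval on both sides."""
    return [(start - window, end + window) for start, end in intervals]

def coalesce_intervals(intervals):
    """Merge overlapping (or touching) intervals into maximal ranges."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Two toy interview segments separated by a 400-frame cutaway.
segments = [(0, 3000), (3400, 6000)]

plain = coalesce_intervals(segments)
bridged = dilate_intervals(coalesce_intervals(dilate_intervals(segments, 300)), -300)

print(plain)
# [(0, 3000), (3400, 6000)]
print(bridged)
# [(0, 6000)]
```

Under these assumptions, a window of 300 frames bridges gaps of up to 600 frames, since both neighbors grow toward each other.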

In [None]:
full_interviews_by_video = {}
for video in jb_next_to_bernie_by_video:
    jb_next_to_bernie = jb_next_to_bernie_by_video[video]
    full_interview = jb_next_to_bernie.dilate(300).coalesce().dilate(-300).filter_length(min_length=5400)
    
    if full_interview.size() > 0:
        full_interviews_by_video[video] = full_interview

esper_widget(intrvllists_to_result(full_interviews_by_video))

Looks like that worked! That's all 11 of our interviews between Jake Tapper and Bernie Sanders.

One last operation: what if we want to only get the interview clips where Bernie Sanders appears by himself, and none of the clips where Jake Tapper and Bernie Sanders are together? We can use the `minus` operator to get the interview intervals, minus the intervals where Bernie Sanders appears with Jake Tapper.

The exact semantics of `minus` can be a bit tricky, and they don't fit into the "cross product, predicate, process pairs" model that `merge` and `overlaps` use. This is because computing one range minus another can sometimes produce two intervals, and sometimes it can produce one interval:

```
   |-----------------------|
 -            |------|
   _________________________
 = |----------|      |-----|


   |---------------|
 -            |-----------|
   _________________________
 = |----------|      
```
What we want for `a.minus(b)` is something like "the minimum set of intervals that maximally covers `a` without covering anything in `b`." This is what `a.minus(b)` will compute.
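
The two diagrams can be reproduced with a small plain-Python sketch. `minus_intervals` is a made-up helper on `(start, end)` tuples that follows the description above; it is not the `rekall` implementation, which may differ in edge cases.

```python
def minus_intervals(a_list, b_list):
    """For each interval in a_list, keep the parts not covered by any interval in b_list."""
    result = []
    for a_start, a_end in a_list:
        cursor = a_start  # start of the part of a not yet accounted for
        for b_start, b_end in sorted(b_list):
            if b_end <= cursor or b_start >= a_end:
                continue  # b does not touch the remaining part of a
            if b_start > cursor:
                result.append((cursor, b_start))  # uncovered piece before b
            cursor = max(cursor, b_end)
            if cursor >= a_end:
                break  # the rest of a is fully covered
        if cursor < a_end:
            result.append((cursor, a_end))  # uncovered tail of a
    return result

# First diagram: b sits inside a, so a splits into two intervals.
print(minus_intervals([(0, 100)], [(40, 60)]))
# [(0, 40), (60, 100)]

# Second diagram: b covers the end of a, so one interval remains.
print(minus_intervals([(0, 60)], [(40, 100)]))
# [(0, 40)]
```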

In [None]:
bernie_alone_by_video = {}
for video in full_interviews_by_video:
    full_interview = full_interviews_by_video[video]
    jb = jake_and_bernie_by_video[video]
    
    bernie_alone = full_interview.minus(jb)
    
    bernie_alone_by_video[video] = bernie_alone
    
esper_widget(intrvllists_to_result(bernie_alone_by_video))

If you scroll through these videos, you'll see that the only highlighted segments are where Bernie Sanders appears by himself.

## Short consecutive shots

For our second task, we'll look through our sandbox for short consecutive shots - that is, shots that last less than half a second and occur back to back. This can potentially reveal problems with our shot detector.

First, we'll load all the shots in our sandbox set that last less than half a second.

In [None]:
# This can take a while
shots = get_shots()
short_shots = shots.filter('duration < 0.5').join(
        sandbox_videos_df,
    shots.video_id == sandbox_videos_df.value)
print(short_shots.count())

Next we'll aggregate the shots by video ID.

In [None]:
short_shots_by_vid = short_shots.select('video_id', 'min_frame', 'max_frame').groupBy('video_id').agg(
    collect_list('min_frame').alias('min_frames'), collect_list('max_frame').alias('max_frames'))
short_shots_by_vid.show()
print(short_shots_by_vid.count())

Now we'll manually construct IntervalLists for each video.

In [None]:
intrvllists = {}

for row in short_shots_by_vid.collect():
    video = row.video_id
    shots_in_video = zip(row.min_frames, row.max_frames)
    intrvllist = IntervalList([(shot[0], shot[1], 1) for shot in shots_in_video])
    intrvllists[video] = intrvllist

Finally, we'll repeatedly merge neighboring shots to find runs of up to 20 consecutive short shots. We could do this all in one go, but we'll add one shot at a time just to see how the results shrink.
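
The "add one shot at a time" idea can be sketched on plain `(start, end)` tuples. `extend_runs` is a hypothetical helper, and the sketch assumes that two shots "meet" when the end of the first is within `epsilon` frames of the start of the second.

```python
def extend_runs(runs, shots, epsilon=1):
    """Extend each run by one shot that starts within epsilon frames of the run's end."""
    extended = []
    for run_start, run_end in runs:
        for shot_start, shot_end in shots:
            if abs(run_end - shot_start) <= epsilon:
                extended.append((run_start, shot_end))
    return sorted(extended)

# Toy short shots: three back to back, plus one isolated shot.
shots = [(0, 10), (11, 20), (21, 30), (100, 110)]

runs = shots
counts = {}
for n in range(2, 5):
    runs = extend_runs(runs, shots)
    counts[n] = len(runs)

print(counts)
# {2: 2, 3: 1, 4: 0}
```

Each pass can only keep runs that found one more neighbor, so the number of surviving runs shrinks as `n` grows.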

In [None]:
n_shots = intrvllists

for n in range(2, 21):
    print('Constructing {} consecutive short shots'.format(n))
    nplusone_shots = {}
    
    for video in n_shots:
        one_shot = intrvllists[video]
        n_shot = n_shots[video].merge(one_shot, predicate=meets_before(epsilon=1))
        
        if len(n_shot.get_temporal_ranges()) > 0:
            nplusone_shots[video] = n_shot
    
    n_shots = nplusone_shots
    print('There are {} videos with {} consecutive short shots'.format(len(n_shots.keys()), n))

esper_widget(intrvllists_to_result(n_shots))