## A notebook for exploring video posedata

**Intended use:** the user selects a video that is accompanied by already extracted posedata in a .json file. The notebook provides visualizations that summarize the quality and content of the poses extracted across all frames of the video, as well as armature plots of the detected poses in a selected frame. These can be viewed separately from the source video, compared numerically, grouped and searched by similarity, and even animated.

Note that at present, this only works with .json output files generated via the OpenPifPaf command-line tools.


In [None]:
import os
from pathlib import Path

from bokeh.io import output_notebook, show

import cv2
import faiss
from ipyfilechooser import FileChooser
from IPython.display import display
import numpy as np

from bokeh_functions import *
from pose_functions import *
from posedata_preprocessing import *

### Build and display the video/posedata selector widget

Clicking the "Select" button that appears after running this cell will display a filesystem navigator/selector widget that can be used to select a video for analysis. Note that for now, this video **must** be in the same folder as its posedata output, and the posedata file must have the same name as the video file (extension included) with `.openpifpaf.json` appended.
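The naming convention above can be checked before running the rest of the notebook. The following is a minimal sketch only: `posedata_path_for` and `has_posedata` are hypothetical helpers that are not part of the notebook's modules, and `play.mp4` is a made-up file name.

```python
from pathlib import Path


def posedata_path_for(video_path):
    """Return the expected posedata path: the video's full name plus '.openpifpaf.json'."""
    video_path = Path(video_path)
    # Append to the complete file name (extension included), staying in the same folder
    return video_path.with_name(video_path.name + ".openpifpaf.json")


def has_posedata(video_path):
    """Report whether the matching posedata file exists next to the video."""
    return posedata_path_for(video_path).is_file()


# Hypothetical example: clips/play.mp4 pairs with clips/play.mp4.openpifpaf.json
expected = posedata_path_for("clips/play.mp4")
```

Calling `has_posedata(fc.selected)` after choosing a video would be one way to catch a missing or misnamed posedata file early.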

The default folder the selector widget shows first is either the value of the `$DATA_FOLDER` environment variable (see README.md for information about how to set this via a `.env` file) or else the folder from which the notebook is being run.


In [None]:
source_data_folder = Path(os.getenv("DATA_FOLDER", Path.cwd()))

fc = FileChooser(source_data_folder)
fc.title = '<b>Use "Select" to choose a video file.</b><br>It must have an accompanying .openpifpaf.json file in the same folder.'
fc.filter_pattern = ["*.mp4", "*.mkv", "*.avi", "*.webm", "*.mov"]

display(fc)

### Preprocess pose data and tracking information

Run this cell after selecting a video above.


In [None]:
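# fc.selected is None until a file has been chosen in the selector above
if fc.selected is None:
    raise RuntimeError("No video selected; choose a video in the selector above first.")

video_file = fc.selected
pose_file = f"{video_file}.openpifpaf.json"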

data_dir = Path(video_file).with_suffix("")

# Create a folder to store derivative data files, if one doesn't already exist
if not data_dir.is_dir():
    # Run the seekability test on the video if it's new. The folder is only
    # created after the test passes, so a failed check is repeated on the next run.
    seek_score = check_video_seekability(video_file)
    if seek_score < SEEK_SCORE_THRESHOLD:
        raise RuntimeError(
            f"The video's sequential play frames are only {seek_score:.3%} similar to seeked frames; consider re-encoding and re-running pose estimation before analyzing the video."
        )

    data_dir.mkdir()

print("Video file:", video_file)
print("Posedata file:", pose_file)

cap = cv2.VideoCapture(video_file)
video_fps = cap.get(cv2.CAP_PROP_FPS)
video_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
video_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap.release()

print("Video dimensions:", video_width, "(w) x", video_height, "(h)")

print("Video FPS:", video_fps)

print("Loading video and JSON files, please wait...")

pose_data, pose_series = preprocess_pose_json(pose_file, video_file)

print("Duration:", pose_series["timestamp"][len(pose_series["timestamp"]) - 1].time())

tracked_pose_data, pose_tracks, tracked_poses = get_pose_tracking(
    video_file, pose_data, video_fps, video_width, video_height
)

pose_data = tracked_pose_data
pose_series["tracked_poses"] = tracked_poses

## Pose normalization, angle calculation and indexing

The following cell needs to be run to enable the pose search features of the posedata explorer app.

The normalization process can take quite a while if it has never been run on a particular set of video/posedata files (~10 minutes for a full-length play). But it then caches the results in pickle (\*.p) files in the same folder as the video and posedata files, meaning the cell will take a very short amount of time on every subsequent invocation for that video.
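The compute-once, reuse-afterwards behavior described above follows a common caching pattern. The sketch below illustrates that pattern in general terms; `load_or_compute` and the file name in the usage note are hypothetical, and the notebook's own functions may organize their cache files differently.

```python
import pickle
from pathlib import Path


def load_or_compute(cache_path, compute_fn):
    """Return the cached result if present; otherwise compute it and cache it."""
    cache_path = Path(cache_path)
    if cache_path.is_file():
        # Fast path: a previous run already stored the result
        with cache_path.open("rb") as cache_file:
            return pickle.load(cache_file)

    # Slow path: do the expensive work once, then store it for later runs
    result = compute_fn()
    with cache_path.open("wb") as cache_file:
        pickle.dump(result, cache_file)
    return result
```

With a pattern like this, `load_or_compute("demo.p", expensive_function)` runs `expensive_function` only on the first call, and deleting the cache file forces the result to be recomputed.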

Generating the pose angle data can also take a minute or two.

When the explorer's infrastructure switches over to using a local database to store the normalized pose coordinates and other data, these normalization and indexing steps should be entirely replaced by a database ingest process that can be run offline/in advance for a new video/posedata corpus.
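The indexing code in the cell below L2-normalizes its input vectors before adding them to an inner-product index (`faiss.IndexFlatIP`). The inner product of two unit-length vectors equals their cosine similarity, so after normalization the search ranks poses by the direction of their vectors rather than by magnitude. The following pure-Python sketch demonstrates that identity; it does not use FAISS, and the 3-dimensional vectors are made up for illustration (the real pose vectors have 34 dimensions).

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return list(vector) if norm == 0 else [x / norm for x in vector]


def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


# Illustrative vectors only: b points the same way as a but is twice as long,
# and c is perpendicular to a
a = [1.0, 2.0, 2.0]
b = [2.0, 4.0, 4.0]
c = [2.0, -1.0, 0.0]

ip_ab = inner_product(l2_normalize(a), l2_normalize(b))
ip_ac = inner_product(l2_normalize(a), l2_normalize(c))
```

Here `ip_ab` matches `cosine_similarity(a, b)` up to floating-point error even though `a` and `b` differ in length, which is why skipping the normalization step would change the search rankings.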


In [None]:
(
    normalized_poses,
    normalized_pose_metadata,
    framepose_to_seqno,
    normalized_pose_data,
) = normalize_poses(pose_file, pose_data)

pose_angle_data, pose_angles = get_all_pose_angles(
    pose_file, pose_data, framepose_to_seqno
)

print("Indexing video posedata set for similarity search")

# Using normalized armature coordinates as primary features
faiss_pose_data = [
    tuple(np.nan_to_num(raw_pose, nan=-1).tolist()) for raw_pose in normalized_poses
]
faiss_IP_index = faiss.IndexFlatIP(34)  # If using normalized coords, rather than angles

# Using pose angles as primary features
# faiss_pose_data = [
#     tuple(np.nan_to_num(angles, nan=-999).tolist()) for angles in pose_angles
# ]
# faiss_IP_index = faiss.IndexFlatIP(28)

faiss_IP_input = np.array(faiss_pose_data).astype("float32")
faiss.normalize_L2(faiss_IP_input)  # Must normalize the inputs!
faiss_IP_index.add(faiss_IP_input)

#### Optional: cluster analysis of normalized poses

The cell below computes a K-means clustering of the poses based on the similarities of their vectors, then calculates and visualizes the relative sizes of the clusters and the averaged armature positions of their poses. Note that if the `average_backgrounds` parameter is set to `True` rather than `False`, drawing the cluster representative poses can take quite a bit longer due to the overhead of averaging the source images for the background.
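To make the "relative sizes of the clusters" step concrete, the sketch below groups pose indices by cluster label and ranks the clusters by size. It is an assumption-laden illustration: `summarize_clusters` is a hypothetical helper, the labels are made up, and the values returned by the notebook's `compute_cluster_distribution` may be structured differently.

```python
from collections import Counter, defaultdict


def summarize_clusters(cluster_labels):
    """Group pose indices by cluster label and rank clusters by size."""
    cluster_to_members = defaultdict(list)
    for pose_index, label in enumerate(cluster_labels):
        cluster_to_members[label].append(pose_index)

    # (label, count) pairs, largest cluster first
    sorted_counts = Counter(cluster_labels).most_common()
    return dict(cluster_to_members), sorted_counts


# Made-up labels for six poses: cluster 2 is the largest, cluster 1 the smallest
members, sizes = summarize_clusters([2, 0, 2, 1, 2, 0])
```

Ranking clusters this way makes it straightforward to draw only the largest ones, which is the role the `clusters_to_draw` parameter appears to play in the cell below.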

In [None]:
cluster_labels = cluster_all_poses(faiss_IP_input)

cluster_to_pose, sorted_bin_counts = compute_cluster_distribution(
    cluster_labels, viz=True
)

draw_cluster_representatives(
    cluster_to_pose,
    sorted_bin_counts,
    normalized_poses,
    normalized_pose_metadata,
    pose_data,
    video_file,
    clusters_to_draw=10,
    average_backgrounds=False,
)

### Build and launch the explorer app

This displays an interactive chart visualization of the attributes of the posedata in the .json output file across the runtime of the video.

Clicking anywhere in the chart, moving the slider, or clicking the prev/next buttons will select a frame and draw the poses detected in that frame, with the option of displaying the actual image from the source video as the "background." When a frame is selected, it is also possible to click a specific pose in the frame window to select that pose for comparison with a second pose (which is also selected by clicking on it). And the first selected pose can be used as the "query" to search for the most similar poses across the entire video, which can then be viewed and paged through.

Please also see the instructions below if you are running this notebook in VS Code or JupyterLab Desktop. Note also that the Jupyter server must be running on port 8888 (or 8889) for the explorer app to work in Jupyter/JupyterLab.


## Running the notebook in VS Code or JupyterLab Desktop

As of early 2023, if you are running this notebook in VS Code or JupyterLab Desktop instead of Jupyter or JupyterLab, the cell below will not work (BokehJS will load, but no figures will appear) without using one of these workarounds:

### VS Code

Take note of the error message that appears when you try to run the cell below, particularly the long alphanumeric string suggested as a value for `BOKEH_ALLOW_WS_ORIGIN`. Copy this string, then uncomment the lines indicated in the cell below, paste the alphanumeric string in place of the `INSERT_BOKEH_ALLOW_WS_ORIGIN_VALUE_HERE` text, then try running the cell below again to launch the explorer app.

### JupyterLab Desktop

Take note of the error message that appears when you try to run the cell below, particularly the number that follows `localhost:` after each of its appearances in the message. Copy that number, replace the value following `bokeh_port =` in the cell below with it, and try running the cell again.

In [None]:
bkapp = build_bokeh_app(
    pose_series,
    pose_data,
    normalized_pose_data,
    normalized_pose_metadata,
    pose_angle_data,
    video_file,
    video_width,
    video_height,
    video_fps,
    faiss_IP_index,
)

# --- Special instructions for VS Code ---

# If you are following the steps above to run the explorer app in VS Code,
# uncomment the following 3 lines (delete the '# 's) before running this cell:

# os.environ[
#     "BOKEH_ALLOW_WS_ORIGIN"
# ] = "INSERT_BOKEH_ALLOW_WS_ORIGIN_VALUE_HERE"


# --- Special instructions for JupyterLab Desktop ---

# If you are following the steps above to run the explorer app in JupyterLab
# Desktop, change the "bokeh_port = ..." line below to the number displayed in
# the error message.

bokeh_port = 8888  # <- May need to be replaced to run in JupyterLab Desktop

output_notebook()

show(bkapp, notebook_url=f"localhost:{bokeh_port}")