<h1>Video Sliding Windows</h1>

<p>
So far we have restricted ourselves to 1D time series, but the idea of recovering periodic dynamics with geometry applies just as easily to multivariate signals.  In this module, we will examine sliding windows of videos as an example.  Many natural videos are also periodic, such as this video of a woman doing jumping jacks:
</p>


In [1]:
import io
import base64
from IPython.display import HTML

video = io.open('jumpingjacks.ogg', 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''<video alt="test" controls>
                <source src="data:video/ogg;base64,{0}" type="video/ogg" />
             </video>'''.format(encoded.decode('ascii')))

<p>
Video can be decomposed into a 3D array with dimensions width x height x time.  To tease out periodicity in geometric form, we will do exactly the same thing as with sliding window embeddings of 1D signals, but instead of taking just one sample per time shift, we take every pixel of every frame in the time window.  The figure below depicts this.
</p>

<img src = "VideoStackTime.svg"><BR><BR>
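As a minimal sketch of the stacking depicted above, the cell below builds a tiny synthetic "video" and forms its sliding windows.  The array sizes, the time x height x width memory layout, and the choice of one frame per time shift are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy video: 8 frames of 3 x 2 pixels, stored as time x height x width
N, H, W = 8, 3, 2
video = np.arange(N * H * W, dtype=float).reshape(N, H, W)

# Flatten each frame into one row, giving an N x (W*H) matrix
I = video.reshape(N, H * W)

# Each sliding window stacks every pixel of dim consecutive frames
# (a time shift of one frame between samples and between windows)
dim = 3
windows = np.array([I[i:i + dim].ravel() for i in range(N - dim + 1)])

print(I.shape)
print(windows.shape)
```

Each row of <code>windows</code> is one point of the embedding, so its length is the number of pixels times <code>dim</code>.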

To see the sliding window move through the video next to PCA of the embedding, look at the following video:

In [2]:
video = io.open('jumpingjackssliding.ogg', 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''<video alt="test" controls>
                <source src="data:video/ogg;base64,{0}" type="video/ogg" />
             </video>'''.format(encoded.decode('ascii')))

<h2>PCA Preprocessing for Efficiency</h2><BR>
One issue we have swept under the rug so far is memory consumption and computational efficiency.  A raw sliding window over every pixel of every frame in the video would blow up in memory.  However, even though there are <code>WH</code> pixels in each frame, there are only <code>N</code> frames in the video.  This means that the frames all lie in an affine subspace of the pixel space of dimension at most <code>(N-1)</code>, and the coordinates in this subspace can be used in lieu of the pixels in the sliding window embedding.  This can be done efficiently with a PCA step before the sliding window embedding.  Run the cell below to load code that does PCA efficiently.

In [5]:
#Do all of the imports and setup inline plotting
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from mpl_toolkits.mplot3d import Axes3D
import scipy.interpolate

from TDA import *
from VideoTools import *

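##Here is the actual PCA code
def getPCAVideo(I):
    #I is an N x P matrix with one flattened frame per row.
    #Eigendecompose the small N x N matrix I*I^T instead of the
    #P x P covariance matrix, since N is much smaller than P
    ICov = I.dot(I.T)
    [lam, V] = np.linalg.eigh(ICov)
    lam = np.maximum(lam, 0) #Clip tiny negative eigenvalues from roundoff
    V = V*np.sqrt(lam[None, :])
    return V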
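To see why these coordinates can stand in for the raw pixels, here is a small check on random data (the sizes are hypothetical).  It repeats the steps of <code>getPCAVideo</code> inline and confirms that the distances between frames are unchanged, which is what matters for the geometry of the sliding window embedding.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 5, 200                      # hypothetical: 5 frames, 200 pixels
I = rng.standard_normal((N, P))

# Same steps as getPCAVideo: eigendecompose the N x N matrix I*I^T
lam, V = np.linalg.eigh(I.dot(I.T))
Y = V * np.sqrt(np.maximum(lam, 0))[None, :]   # N x N coordinates

def pairwise_dists(A):
    # Euclidean distances between all pairs of rows of A
    D2 = np.sum((A[:, None, :] - A[None, :, :])**2, axis=2)
    return np.sqrt(D2)

print(Y.shape)
print(np.allclose(pairwise_dists(I), pairwise_dists(Y)))
```

The check works because <code>Y.dot(Y.T)</code> equals <code>I.dot(I.T)</code>, so all inner products between frames are preserved.  Note that the function as written does not mean-center the frames, so it returns <code>N</code> coordinates per frame.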

<h2>Jumping Jacks Example Live Demo</h2><BR>
Let's now load in code that does sliding window embeddings of videos.  The code is very similar to the 1D case, and it has exactly the same parameters.  The only difference is that each sliding window lives in a Euclidean space whose dimension is the number of pixels times <code>dim</code>.  We're also using linear interpolation instead of spline interpolation to keep things fast.

In [9]:
def getSlidingWindowVideo(I, dim, Tau, dT):
    N = I.shape[0] #Number of frames
    P = I.shape[1] #Number of pixels (possibly after PCA)
    NWindows = int(np.floor((N-dim*Tau)/dT))
    X = np.zeros((NWindows, dim*P))
    idx = np.arange(N)
    for i in range(NWindows):
        #Possibly fractional frame indices sampled by this window
        idxx = dT*i + Tau*np.arange(dim)
        start = int(np.floor(idxx[0]))
        end = int(np.ceil(idxx[-1]))+2
        if end >= I.shape[0]:
            #Ran out of frames; keep only the windows filled so far
            X = X[0:i, :]
            break
        #Interpolate linearly in time only.  interp1d along axis 0 replaces
        #interp2d, which was removed in SciPy 1.14
        f = scipy.interpolate.interp1d(idx[start:end+1], I[start:end+1, :], axis=0, kind='linear')
        X[i, :] = f(idxx).flatten()
    return X
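To get a feel for the output before running on a real video, here is a NumPy-only sketch of the same idea applied to a synthetic two-pixel "video".  The function name <code>sliding_window_linear</code> and the test signal are hypothetical, and the stopping rule is simpler than the one in <code>getSlidingWindowVideo</code>, so the number of windows can differ slightly.

```python
import numpy as np

def sliding_window_linear(I, dim, Tau, dT):
    """NumPy-only sketch: linearly interpolate frames in time and stack them."""
    N = I.shape[0]
    X = []
    i = 0
    while True:
        t = dT * i + Tau * np.arange(dim)   # possibly fractional frame indices
        if t[-1] > N - 1:
            break                            # window runs past the last frame
        lo = np.floor(t).astype(int)
        hi = np.minimum(lo + 1, N - 1)
        w = (t - lo)[:, None]                # linear interpolation weights
        X.append(((1 - w) * I[lo] + w * I[hi]).ravel())
        i += 1
    return np.array(X)

# Hypothetical "video": 2 pixels oscillating with a period of 20 frames
t = np.arange(100)
I = np.stack([np.cos(2 * np.pi * t / 20), np.sin(2 * np.pi * t / 20)], axis=1)

# dim * Tau = 20, so each window covers one full period
X = sliding_window_linear(I, dim=10, Tau=2, dT=1)
print(X.shape)
print(np.allclose(X[0], X[20]))   # windows one period apart coincide
```

Because the signal repeats every 20 frames, windows that start 20 frames apart land on the same point, which is why the embedding traces out a closed loop.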

Finally, let's load in the jumping jacks video, perform PCA, do a sliding window, and examine the sliding window embedding using TDA.  As before, you should tweak the parameters of the sliding window embedding and study the effect on the geometry.

In [3]:
#Load in video and do PCA to compress dimension
(X, FrameDims) = loadVideo("jumpingjacks.ogg")
X = getPCAVideo(X)

#Given that the period is 30 frames per cycle, choose a dimension and a Tau that capture 
#this motion in the roundest possible way
#Plot persistence diagram and PCA
dim = 30
Tau = 1
dT = 1

#Get sliding window video
XS = getSlidingWindowVideo(X, dim, Tau, dT)

#Mean-center and normalize each sliding window (each row of XS)
XS = XS - np.mean(XS, 1)[:, None]
XS = XS/np.sqrt(np.sum(XS**2, 1))[:, None]

#Get persistence diagrams
PDs = doRipsFiltration(XS, 1)

#Do PCA for visualization
pca = PCA(n_components = 3)
Y = pca.fit_transform(XS)


fig = plt.figure(figsize=(12, 6))
ax1 = fig.add_subplot(121)
plotDGMAx(ax1, PDs[1])
ax1.set_title("1D Persistence Diagram")

c = plt.get_cmap('jet')
C = c(np.array(np.round(np.linspace(0, 255, Y.shape[0])), dtype=np.int32))
C = C[:, 0:3]
ax2 = fig.add_subplot(122, projection = '3d')
ax2.set_title("PCA of Sliding Window Embedding")
ax2.scatter(Y[:, 0], Y[:, 1], Y[:, 2], c=C)
ax2.set_aspect('equal', 'datalim')
plt.show()

<h1>Periodicities in The KTH Dataset</h1><BR>

We will now examine videos from the <a href = "http://www.nada.kth.se/cvap/actions/">KTH dataset</a>, which is a repository of black and white videos of human activities.  It consists of 25 subjects performing 6 different actions in each of 4 scenarios.  We will use the algorithms developed in this section to measure and rank the periodicity of the different video clips.

<h2>Varying Window Length</h2><BR>
For our first experiment, we show some precomputed results of varying the sliding window length, while choosing <code>Tau</code> and <code>dT</code> to keep the dimension and the number of points, respectively, the same in the sliding window embedding.  As an example, we apply this to one of the videos of a subject waving his hands back and forth, shown below:

In [4]:
video = io.open('KTH/handwaving/person01_handwaving_d1_uncomp.ogg', 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''<video alt="test" controls>
                <source src="data:video/ogg;base64,{0}" type="video/ogg" />
             </video>'''.format(encoded.decode('ascii')))
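The parameter choice described above, with the dimension and the number of points held fixed as the window length varies, can be sketched as follows.  The helper <code>window_params</code> and the values of <code>N</code>, <code>dim</code>, and <code>n_points</code> are assumptions for illustration, not the settings used for the precomputed results.

```python
import numpy as np

def window_params(N, win, dim, n_points):
    """Hypothetical helper: choose Tau and dT for a window of win frames."""
    Tau = win / dim               # dim samples span the window: dim*Tau = win
    dT = (N - win) / n_points     # spread n_points windows over the video
    return Tau, dT

N, dim, n_points = 520, 20, 40    # assumed values
results = []
for win in [20, 40, 80]:
    Tau, dT = window_params(N, win, dim, n_points)
    # Nominal window count, using the same formula as getSlidingWindowVideo
    n_windows = int(np.floor((N - dim * Tau) / dT))
    results.append((win, Tau, dT, n_windows))
    print(win, Tau, dT, n_windows)
```

Longer windows use a larger <code>Tau</code> and a smaller <code>dT</code>, so every embedding has the same dimension and nominally the same number of points, which keeps the persistence values comparable across window lengths.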

We have done some additional preprocessing, including applying a bandpass filter to each PCA pixel to cut down on drift in the video.  Below we show a video varying the window size of the embedding and plotting the persistence diagram, "self-similarity matrix" (distance matrix), and PCA of the embedding, as well as an evolving plot of the maximum persistence versus window size:

In [5]:
video = io.open('Handwaving_Deriv10_Block160_PCA10.ogg', 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''<video alt="test" controls>
                <source src="data:video/ogg;base64,{0}" type="video/ogg" />
             </video>'''.format(encoded.decode('ascii')))

As you can see, the maximum persistence peaks at around 40 frames, which is the period of each hand wave.  This is what the theory we developed for 1D time series would have predicted as the roundest window.<BR>
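The exact bandpass filter used in the preprocessing above is not given here.  As one simple stand-in, the sketch below masks temporal frequencies with the FFT along the time axis of a frames x coordinates matrix.  The function name <code>fft_bandpass</code>, the cutoff periods, and the test signal are all hypothetical.

```python
import numpy as np

def fft_bandpass(X, min_period, max_period):
    """Keep only temporal frequencies whose period (in frames) lies in
    [min_period, max_period].  X is frames x coordinates."""
    N = X.shape[0]
    F = np.fft.rfft(X, axis=0)
    freqs = np.fft.rfftfreq(N)               # cycles per frame
    keep = (freqs >= 1.0 / max_period) & (freqs <= 1.0 / min_period)
    F[~keep, :] = 0                          # zero everything outside the band
    return np.fft.irfft(F, n=N, axis=0)

t = np.arange(200)
wave = np.cos(2 * np.pi * t / 40)            # motion with a period of 40 frames
drift = 3.0 + np.cos(2 * np.pi * t / 200)    # constant offset plus slow drift
X = (wave + drift)[:, None]

Y = fft_bandpass(X, min_period=10, max_period=100)
print(np.allclose(Y[:, 0], wave))
```

In this toy case the offset and slow drift fall outside the band and are removed, while the period-40 motion passes through unchanged.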

<h2>KTH Dataset Rankings</h2><BR>
For the final experiment, students will split up into groups and run code that ranks the video clips in different subsets of the KTH dataset in decreasing order of periodicity.  As an example, <a href = "VideoResults/index.html">click here</a> to see the rankings of all activities for the first 4 subjects.  Groups will run the code in <a href = "KTHTests.py">KTHTests.py</a> after modifying it to go through the appropriate subset of the database by changing lines 53 through 55.  A new web page will be generated <a href = "VideoResults/index.html">here</a> to show the resulting rankings.  Note that a fixed window length of 20 frames is maintained throughout all of the experiments.  Feel free to tweak the window size ("win" on line 72), the dimension of the embedding ("dim" on line 73), or any other parameters you think would make the results more meaningful.

<h1>Summary</h1>
<ul>
<li>Periodicity can be studied on general time series data, including multivariate time series such as video</li>
<li>Computational tricks, such as PCA, can be employed to make sliding window videos computationally tractable</li>
</ul>