# Image Analysis Module: Motion Estimation

**Notes from Hany Farid's Computer Vision Course (UC Berkeley)**
https://farid.berkeley.edu/downloads/tutorials/learnComputerVision/

There are two common approaches to motion estimation: **differential estimation** and **feature tracking**. We take a look at each below.

## Differential Estimation

This approach is built on two fundamental assumptions:

**1) Brightness Constancy**:
This says that a point keeps (almost) the same brightness as it moves over a short span of time, so the image intensity $f$ stays constant along the point's path $(x(t), y(t))$ (unless the subject is teleporting!)

$ \frac {df( x(t), y(t), t)} {dt} = 0 $
---

This can be broken down using the chain rule to:

$ \frac{\partial f}{\partial x}
\frac{dx}{dt} + 
\frac{\partial f}{\partial y}
\frac{dy}{dt} +
\frac{\partial f}{\partial t} = 0 $
---

This can be rewritten more simply as below, where $f_x$ and $f_y$ are the spatial derivatives of our image in the x and y directions and $f_t$ is its temporal derivative. Similarly, $v_x$ and $v_y$ are the x and y components of the velocity (how the pixels move between frames):

$ f_x v_x +  f_y v_y + f_t = 0 $ 
---
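
As a quick sanity check of the constraint above, here is a minimal NumPy sketch. The synthetic image, the `make_frame` and `derivatives` helpers, and the choice of finite differences (`np.gradient` in space, a plain frame difference in time) are illustrative assumptions, not something taken from the course notes.

```python
import numpy as np

def make_frame(shift_x, shift_y, size=32):
    """Smooth synthetic image translated by (shift_x, shift_y) pixels (hypothetical test pattern)."""
    y, x = np.mgrid[0:size, 0:size].astype(float)
    return np.sin(0.3 * (x - shift_x)) + np.cos(0.2 * (y - shift_y))

def derivatives(frame0, frame1):
    """Finite-difference estimates of f_x, f_y and f_t from two frames."""
    mean_frame = 0.5 * (frame0 + frame1)
    f_y, f_x = np.gradient(mean_frame)   # axis 0 is y (rows), axis 1 is x (columns)
    f_t = frame1 - frame0                # one time step between the two frames
    return f_x, f_y, f_t

v_x, v_y = 0.4, -0.25                    # assumed sub-pixel motion between frames
frame0 = make_frame(0.0, 0.0)
frame1 = make_frame(v_x, v_y)
f_x, f_y, f_t = derivatives(frame0, frame1)

# Brightness constancy: f_x v_x + f_y v_y + f_t should be close to zero.
residual = f_x * v_x + f_y * v_y + f_t
interior = residual[1:-1, 1:-1]          # border pixels use less accurate one-sided differences
print(np.abs(interior).max(), np.abs(f_t).max())
```

The residual is small compared with $f_t$ but not exactly zero, because the derivatives are only finite-difference approximations.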

Rewritten again in linear-algebra form, we can see clearly that we're looking for the velocity vector ($\begin{pmatrix} v_x \\ v_y \end{pmatrix}$), which holds both the direction and the speed of the motion - so we have two unknowns but only one constraint...

$ \begin{pmatrix} f_x & f_y \end{pmatrix}
 \begin{pmatrix} v_x \\ v_y \end{pmatrix} + f_t = 0 $
 ---
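
To see why one constraint is not enough, the sketch below uses made-up derivative values at a single pixel (the numbers are hypothetical) and shows that many different velocities satisfy the same equation: any amount of motion perpendicular to the gradient $(f_x, f_y)$ leaves the constraint unchanged.

```python
import numpy as np

grad = np.array([2.0, 1.0])   # hypothetical (f_x, f_y) at one pixel
f_t = -3.0                    # hypothetical temporal derivative at that pixel

# One particular solution: the velocity pointing along the gradient.
v_along = -f_t * grad / grad.dot(grad)
# Direction perpendicular to the gradient; motion along it is invisible to the constraint.
perp = np.array([-grad[1], grad[0]])

residuals = []
for s in (0.0, 1.0, -2.5):
    v = v_along + s * perp                 # a different candidate velocity for each s
    residuals.append(grad.dot(v) + f_t)    # f_x v_x + f_y v_y + f_t
print(residuals)
```

All three candidate velocities give a residual of zero (up to rounding), so the single equation cannot tell them apart.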

That's a problem because a single pixel gives us only one equation... but if we assume that motion is more or less constant in a small patch of pixels then we can increase the number of constraints!


**2) Assumption that motion is constant:**
This says that if a pixel moves between two frames, it is reasonable to assume that neighboring pixels move the same way. Because of this assumption, we are able to expand the equation above to include more constraints, one per pixel. For example, if we take a small 3x3 window we can rewrite it as:

$ \begin{pmatrix} 
f_x(x_1, y_1) & f_y(x_1,y_1) \\
f_x(x_2, y_2) & f_y(x_2,y_2) \\
\vdots & \vdots  \\
f_x(x_9, y_9) & f_y(x_9,y_9) \\
\end{pmatrix}
\begin{pmatrix} v_x \\ v_y \end{pmatrix} + 
\begin{pmatrix} 
f_t(x_1, y_1) \\
f_t(x_2, y_2)  \\
\vdots   \\
f_t(x_9, y_9) \\
\end{pmatrix} = \vec 0 $

So this can be more simply written as below, where $A$ is our matrix of spatial derivatives, $\vec v$ is our motion estimate (our unknown) and $\vec t$ is the vector of our temporal derivatives:

$A \vec v + \vec t = \vec 0 $
---
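
A minimal sketch of how $A$ and $\vec t$ could be assembled for one 3x3 window. The `window_system` helper and the random derivative fields are hypothetical; here $f_t$ is constructed so that a known velocity satisfies the constraint exactly, which lets us check the stacking.

```python
import numpy as np

def window_system(f_x, f_y, f_t, row, col, half=1):
    """Stack the derivatives in a (2*half+1) x (2*half+1) window into A and t."""
    win = (slice(row - half, row + half + 1), slice(col - half, col + half + 1))
    A = np.column_stack([f_x[win].ravel(), f_y[win].ravel()])  # one row per pixel
    t = f_t[win].ravel()
    return A, t

# Made-up derivative fields, for illustration only.
rng = np.random.default_rng(0)
f_x = rng.normal(size=(8, 8))
f_y = rng.normal(size=(8, 8))
v_true = np.array([0.5, -1.0])
f_t = -(f_x * v_true[0] + f_y * v_true[1])   # built so that f_x v_x + f_y v_y + f_t = 0

A, t = window_system(f_x, f_y, f_t, row=4, col=4)
print(A.shape, t.shape)
print(np.abs(A @ v_true + t).max())
```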

So how do we solve for our unknown vector $\vec v$? With nine equations and only two unknowns there is usually no $\vec v$ that makes $A \vec v + \vec t$ exactly 0, so instead we write a quadratic error function and look for the $\vec v$ which makes it as small as possible. Super cool to come out with this from the two assumptions above!

$ E(\vec v) = \vert\vert A \vec v + \vec t \vert\vert ^2 $

Setting the derivative of the above error function with respect to $\vec v$ to zero gives the least squares solution:

$ \vec v = -(A^T A)^{-1} A^T \vec t$
---
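
The closed-form solution above can be checked numerically. This is a sketch on made-up data (a random $A$ and a slightly noisy $\vec t$, both hypothetical), comparing the formula with NumPy's general-purpose `np.linalg.lstsq` solver.

```python
import numpy as np

# Made-up system: 9 constraints, 2 unknowns, with a little noise added to t.
rng = np.random.default_rng(1)
A = rng.normal(size=(9, 2))
v_true = np.array([0.5, -1.0])
t = -A @ v_true + 0.01 * rng.normal(size=9)

# Closed form: v = -(A^T A)^{-1} A^T t
v_normal_eq = -np.linalg.inv(A.T @ A) @ A.T @ t

# Same problem handed to a library solver: minimize ||A v - (-t)||^2
v_lstsq, *_ = np.linalg.lstsq(A, -t, rcond=None)

print(v_normal_eq, v_lstsq)
```

Both routes should agree closely; in practice a solver such as `np.linalg.lstsq` or `np.linalg.solve` is usually preferred over forming an explicit inverse.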

This solution only exists if the $2 \times 2$ matrix $A^T A$ is invertible ($A$ itself is $9 \times 2$, so it has no inverse of its own). Multiplying $A^T$ and $A$ gives the matrix below, where each sum runs over $\omega$, the pixels in our window (9 of them above):

$ A^T A = \begin{pmatrix} 
\sum_{\omega} f^2_x & \sum_{\omega} f_x f_y \\
\sum_{\omega} f_x f_y & \sum_{\omega} f^2_y
\end{pmatrix} $


Similarly, multiplying $A^T$ with the vector $\vec t$ gives the second factor in our solution above:

$ A^T \vec t = \begin{pmatrix} 
\sum_{\omega} f_x f_t \\
\sum_{\omega} f_y f_t
\end{pmatrix} $
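
Putting the pieces together, here is a minimal end-to-end sketch that builds $A^T A$ and $A^T \vec t$ directly from the sums above and solves for $\vec v$. The synthetic `pattern`, the 5x5 window, and the finite-difference derivatives are assumptions made for illustration, not the course's implementation.

```python
import numpy as np

size = 32
y, x = np.mgrid[0:size, 0:size].astype(float)
v_true = np.array([0.4, -0.25])          # assumed motion (v_x, v_y) in pixels per frame

def pattern(x, y):
    """Smooth synthetic image (hypothetical test pattern)."""
    return np.sin(0.3 * x) + np.cos(0.2 * y)

frame0 = pattern(x, y)
frame1 = pattern(x - v_true[0], y - v_true[1])   # same pattern, shifted by v_true

# Finite-difference derivatives: axis 0 is y (rows), axis 1 is x (columns).
f_y, f_x = np.gradient(0.5 * (frame0 + frame1))
f_t = frame1 - frame0

# Derivatives inside one 5x5 window (the set of pixels omega).
win = (slice(14, 19), slice(14, 19))
fx, fy, ft = f_x[win], f_y[win], f_t[win]

# A^T A and A^T t written as sums over the window.
AtA = np.array([[np.sum(fx * fx), np.sum(fx * fy)],
                [np.sum(fx * fy), np.sum(fy * fy)]])
Att = np.array([np.sum(fx * ft), np.sum(fy * ft)])

# v = -(A^T A)^{-1} A^T t, solved without forming the inverse explicitly.
v_est = -np.linalg.solve(AtA, Att)
print(v_est, np.linalg.cond(AtA))
```

The estimate should land close to `v_true`, with a small error that comes from the finite-difference derivatives. A very large condition number for $A^T A$ would signal a window where the matrix is nearly non-invertible and the estimate cannot be trusted.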