- To estimate the motion between two or more images, a suitable error metric must first be chosen to compare the images (Section 9.1). Once this has been established, a suitable search technique must be devised. The simplest technique is to exhaustively try all possible alignments, i.e., to do a full search. In practice, this may be too slow, so hierarchical coarse-to-fine techniques (Section 9.1.1) based on image pyramids are normally used. Alternatively, Fourier transforms (Section 9.1.2) can be used to speed up the computation.

- To get sub-pixel precision in the alignment, incremental methods (Section 9.1.3) based on a Taylor series expansion of the image function are often used. Motion estimation can be made more reliable by learning the typical dynamics or motion statistics of the scenes or objects being tracked.

- For more complex motions, piecewise parametric spline motion models (Section 9.2.2) can be used.

- In the presence of multiple independent (and perhaps non-rigid) motions, general-purpose optical flow techniques (Section 9.3) need to be used. For even more complex motions that include a lot of occlusions, layered motion models (Section 9.4), which decompose the scene into coherently moving layers, can work well.

### 9.1 Translational alignment
- The simplest way to establish an alignment between two images or image patches is to shift one image relative to the other.

- Robust error metrics: We can make the sum of squared differences (SSD) error metric more robust to outliers by replacing the squared error terms with a robust function ρ(e_i), where e_i is the residual error at pixel i.

- Spatially varying weights
    - The error metrics above ignore the fact that, for a given alignment, some of the pixels being compared may lie outside the original image boundaries. Furthermore, we may want to partially or completely downweight the contributions of certain pixels.

    - All of these tasks can be accomplished by associating a spatially varying per-pixel weight with each of the two images being matched.

- Correlation: An alternative to taking intensity differences is to perform correlation, i.e., to maximize the product (or cross-correlation) of the two aligned images.
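
The error metrics listed above can be compared with a brute-force search over integer shifts. The following is a minimal NumPy sketch under stated assumptions, not a reference implementation: the function names (`ssd`, `sad`, `full_search`), the `(dy, dx)` ordering of the shift, and the normalization by overlap area are all choices made here for illustration. It compares I1(x + u) with I0(x) on the overlapping region only, which amounts to a binary per-pixel weight, and uses the absolute value as one simple choice of ρ.

```python
import numpy as np

def ssd(e):
    """Sum of squared differences over the residuals e."""
    return float(np.sum(e ** 2))

def sad(e):
    """Sum of absolute differences: one simple robust choice of rho."""
    return float(np.sum(np.abs(e)))

def full_search(i0, i1, max_shift, cost=ssd):
    """Exhaustively try every integer shift u = (dy, dx).

    Assumes i0 and i1 are 2D float arrays of the same shape and that
    max_shift is smaller than both image dimensions.
    """
    h, w = i0.shape
    best_u, best_cost = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Pixels x of I0 for which x + u still lies inside I1.
            y0, y1 = max(0, -dy), min(h, h - dy)
            x0, x1 = max(0, -dx), min(w, w - dx)
            a = i0[y0:y1, x0:x1]
            b = i1[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
            # Normalize by overlap area so small overlaps are not favored.
            c = cost(b - a) / a.size
            if c < best_cost:
                best_u, best_cost = (dy, dx), c
    return best_u, best_cost
```

Passing `cost=sad` swaps in the robust metric without changing the search itself. The cost of this search grows with the square of `max_shift`, which is the motivation for the faster techniques in the following subsections.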

#### 9.1.1 Hierarchical motion estimation

- Now that we have a well-defined alignment cost function to optimize, how can we find its minimum?

- To accelerate this search process, hierarchical motion estimation is often used: an image pyramid is constructed and a search over a smaller number of discrete pixels (corresponding to the same range of motion) is first performed at coarser levels. The motion estimate from one level of the pyramid is then used to initialize a smaller local search at the next finer level.

- At the coarsest level, we search for the best displacement u^(l) that minimizes the difference between images I0^(l) and I1^(l), where the superscript (l) denotes the pyramid level.
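
The coarse-to-fine procedure described above can be sketched as follows. This is an illustrative NumPy sketch rather than a definitive implementation: the helper names (`downsample`, `shift_cost`, `local_search`, `coarse_to_fine`), the 2x2 block-average pyramid, and the default search radii are assumptions made here. The displacement found at each level is doubled before it seeds a small local search at the next finer level.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (a crude pyramid level)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2]
                   + img[0::2, 1::2] + img[1::2, 1::2])

def shift_cost(i0, i1, dy, dx):
    """Mean squared difference between I1(x + u) and I0(x) over the overlap."""
    h, w = i0.shape
    y0, y1 = max(0, -dy), min(h, h - dy)
    x0, x1 = max(0, -dx), min(w, w - dx)
    if y1 <= y0 or x1 <= x0:
        return np.inf  # no overlap for this shift
    d = i1[y0 + dy:y1 + dy, x0 + dx:x1 + dx] - i0[y0:y1, x0:x1]
    return float(np.mean(d ** 2))

def local_search(i0, i1, center, radius):
    """Best integer shift within `radius` pixels of `center`."""
    cy, cx = center
    candidates = [(cy + dy, cx + dx)
                  for dy in range(-radius, radius + 1)
                  for dx in range(-radius, radius + 1)]
    return min(candidates, key=lambda u: shift_cost(i0, i1, *u))

def coarse_to_fine(i0, i1, levels=3, coarse_radius=4, fine_radius=1):
    """Estimate an integer shift (dy, dx) using an image pyramid."""
    pyr0, pyr1 = [i0], [i1]
    for _ in range(levels - 1):
        pyr0.append(downsample(pyr0[-1]))
        pyr1.append(downsample(pyr1[-1]))
    # Wider search at the coarsest level, starting from zero motion.
    u = local_search(pyr0[-1], pyr1[-1], (0, 0), coarse_radius)
    for level in range(levels - 2, -1, -1):
        # A shift of u at one level corresponds to 2u at the next finer level.
        u = (2 * u[0], 2 * u[1])
        u = local_search(pyr0[level], pyr1[level], u, fine_radius)
    return u
```

With three levels, a coarse radius of 4 covers roughly ±16 pixels at full resolution while evaluating far fewer candidate shifts than a full search over the same range.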

#### 9.1.2 Fourier-based alignment

- Windowed correlation

- Phase correlation
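
As a rough illustration of phase correlation, the sketch below whitens the cross-power spectrum of the two images and looks for the peak of its inverse transform. It is a minimal NumPy sketch under stated assumptions: the function name `phase_correlation` and the `eps` regularizer are introduced here for illustration, the shift is assumed to be cyclic and integer-valued, and the sign convention is chosen so that the result u satisfies I1(x + u) ≈ I0(x).

```python
import numpy as np

def phase_correlation(i0, i1, eps=1e-12):
    """Estimate the integer cyclic shift (dy, dx) between two images."""
    f0 = np.fft.fft2(i0)
    f1 = np.fft.fft2(i1)
    # Cross-power spectrum; dividing by the magnitude keeps only the phase.
    cross = f1 * np.conj(f0)
    cross /= np.abs(cross) + eps
    corr = np.fft.ifft2(cross).real
    # For a pure cyclic shift, corr is (close to) a single impulse at u.
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peak indices above n/2 to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))
```

Real image pairs are not cyclic shifts of one another, so the peak is less sharp in practice than in this idealized setting.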

#### 9.1.3 Incremental refinement
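
A single incremental update for a purely translational motion can be sketched as below. This is a hedged illustration, not the chapter's own algorithm: the function name `incremental_step` is hypothetical, the step starts from u = 0, and image gradients are approximated with `np.gradient`. It linearizes I1(x + Δu) with a first-order Taylor expansion and solves the resulting 2x2 least-squares system for Δu.

```python
import numpy as np

def incremental_step(i0, i1):
    """One Gauss-Newton step for a sub-pixel translation, starting at u = 0.

    Returns delta_u = (dy, dx) that approximately minimizes
    sum_i [I1(x_i) + J_i . delta_u - I0(x_i)]^2.
    """
    gy, gx = np.gradient(i1)             # derivatives along rows, then columns
    e = (i1 - i0).ravel()                # residuals at the current estimate
    J = np.stack([gy.ravel(), gx.ravel()], axis=1)  # N x 2 Jacobian
    A = J.T @ J                          # 2 x 2 approximate Hessian
    b = -J.T @ e
    return np.linalg.solve(A, b)
```

In a full method this step would be iterated, with I1 re-warped by the current estimate each time. The sketch stops after one step, which is only accurate for small displacements on smooth images.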

### 9.2 Parametric motion
- Many image alignment tasks, for example, image stitching with handheld cameras, require the use of more sophisticated motion models.

- For parametric motion, instead of using a single constant translation vector u, we use a spatially varying motion field or correspondence map, x'(x; p), parameterized by a low-dimensional vector p, where x' can be any of the standard 2D parametric motion models, such as translation, similarity, affine, or projective (homography).
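
To make the notation x'(x; p) concrete, here is a small sketch of an affine motion field. The function name `affine_motion` and the parameter ordering p = (a00, a01, a10, a11, tx, ty) are assumptions chosen for illustration; the parameterization is written so that p = 0 corresponds to the identity mapping.

```python
import numpy as np

def affine_motion(points, p):
    """Map an (N, 2) array of (x, y) points through an affine x'(x; p).

    p = (a00, a01, a10, a11, tx, ty), an assumed ordering, with
    x' = (1 + a00) x + a01 y + tx and y' = a10 x + (1 + a11) y + ty.
    """
    a00, a01, a10, a11, tx, ty = p
    A = np.array([[1.0 + a00, a01],
                  [a10, 1.0 + a11]])
    t = np.array([tx, ty])
    return np.asarray(points, dtype=float) @ A.T + t
```

Setting only `tx` and `ty` recovers the constant translation of Section 9.1, so the translational model is a special case of this six-parameter one.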

#### 9.2.1 Application: Video stabilization

#### 9.2.2 Spline-based motion

#### 9.2.3 Application: Medical image registration

### 9.3 Optical flow