# Running Mean with Smooth Differentials

We want to find a smoothing of the apparent polar wander path with the following properties:

1. It is differentiable with respect to time, so that its derivatives can be computed analytically
1. It accounts for spatial uncertainty
1. It accounts for temporal uncertainty
1. It is restricted to lie on the sphere
1. It allows for uncertainty quantification of the curve itself

The classical running mean with a fixed temporal window has the drawback of not being differentiable in time, so the curves it produces are not smooth. One alternative is to replace the fixed window with a smooth one.

## Mathematical setup

Consider an ordered sequence of paleomagnetic poles $p_1, p_2, \ldots , p_N \in S^2$, where $S^2$ is the unit sphere, with associated measured times $t_1, t_2, \ldots, t_N$. Since we only consider points on the sphere, we use the geodesic length as the distance between them, also known as the great-circle distance, or the haversine distance when points are specified by their latitude and longitude:
$$
d(p, q) = \cos^{-1}(p^T q)
$$
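
As a minimal sketch, the distance above can be evaluated with NumPy once the poles are stored as unit vectors. The function names `latlon_to_unit` and `geodesic_distance` are illustrative assumptions, not part of any existing package, and the clipping step is a numerical safeguard added here.

```python
import numpy as np

def latlon_to_unit(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a unit vector in R^3."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    return np.array([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)])

def geodesic_distance(p, q):
    """Geodesic distance d(p, q) = arccos(p^T q), in radians, for unit vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Rounding can push the dot product slightly outside [-1, 1]; clip before arccos.
    return np.arccos(np.clip(np.dot(p, q), -1.0, 1.0))
```

For example, `geodesic_distance(latlon_to_unit(90, 0), latlon_to_unit(0, 0))` gives the quarter-circle distance $\pi/2$ between the north pole and a point on the equator.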

## Running mean

At any given time $t$, the running mean is defined by solving the following optimization problem:
$$
p(t) = \text{argmin}_{p \in S^2} \sum_{i=1}^N w(|t - t_i|) \, d(p, p_i)^2,
$$
where $w(\cdot)$ is the averaging window, which is compactly supported, differentiable, and symmetric.
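
The optimization problem above can be sketched numerically as follows. This is an illustration under stated assumptions rather than a reference implementation: the window is taken to be the biweight function $w(s) = (1 - (s/h)^2)^2$ for $|s| < h$ (one possible compactly supported, symmetric, once-differentiable choice, with a hypothetical half-width `h`), and the minimizer is approximated with the standard fixed-point iteration for the weighted Fréchet mean, which is expected to converge when the poles with non-zero weight are concentrated in a small enough region of the sphere.

```python
import numpy as np

def biweight_window(s, h):
    """Assumed window: (1 - (s/h)^2)^2 for |s| < h, zero otherwise."""
    u = np.asarray(s, dtype=float) / h
    return np.where(np.abs(u) < 1.0, (1.0 - u**2) ** 2, 0.0)

def log_map(p, poles):
    """Tangent vectors at p pointing to each pole, with length d(p, p_i)."""
    c = np.clip(poles @ p, -1.0, 1.0)
    theta = np.arccos(c)
    v = poles - c[:, None] * p              # component of each pole orthogonal to p
    norm = np.linalg.norm(v, axis=1)
    scale = np.where(norm > 1e-12, theta / np.maximum(norm, 1e-12), 0.0)
    return scale[:, None] * v

def exp_map(p, v):
    """Move from p along the geodesic with initial tangent vector v."""
    n = np.linalg.norm(v)
    if n < 1e-15:
        return p
    return np.cos(n) * p + np.sin(n) * v / n

def running_mean(t, times, poles, h, n_iter=100, tol=1e-12):
    """Approximate argmin_p sum_i w(|t - t_i|) d(p, p_i)^2 on the unit sphere."""
    w = biweight_window(np.abs(t - np.asarray(times, dtype=float)), h)
    if w.sum() <= 0.0:
        raise ValueError("no poles fall inside the window at this time")
    w = w / w.sum()                          # normalizing does not change the minimizer
    p = w @ poles                            # initial guess: projected Euclidean mean
    p /= np.linalg.norm(p)                   # (fails if the weighted poles cancel out)
    for _ in range(n_iter):
        step = w @ log_map(p, poles)         # weighted mean of tangent vectors
        p = exp_map(p, step)
        p /= np.linalg.norm(p)               # guard against drift off the sphere
        if np.linalg.norm(step) < tol:
            break
    return p
```

Here `poles` is an `(N, 3)` array of unit vectors and `times` an array of length `N`; evaluating `running_mean` on a grid of `t` values traces out the smoothed path.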

### Including temporal uncertainty

In the previous section we only considered the case where the times $t_i$ are known exactly. In practice, however, we may want to include a confidence interval or some other measure of uncertainty for the time variable. For this, we need to assume a probabilistic model of how the data were generated. One way of doing this is to assume that the times are independent random variables $T_i$, each with its own probability distribution, and then define the running mean by minimizing the expected objective
$$
p(t) = \text{argmin}_{p \in S^2} \mathbb{E} \left [ \sum_{i=1}^N w(|t - T_i|) \, d(p, p_i)^2 \right ],
$$
and see what correction we should apply to the weights. One example is to assume normal distributions, $T_i \sim N(\tau_i, \sigma_i^2)$, with mean $\tau_i$ and standard deviation $\sigma_i$.
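
Because the poles $p_i$ are treated as fixed, linearity of expectation moves the expectation onto the weights, so the problem keeps the same form with each weight replaced by an effective weight:
$$
\mathbb{E} \left [ \sum_{i=1}^N w(|t - T_i|) \, d(p, p_i)^2 \right ] = \sum_{i=1}^N \tilde{w}_i(t) \, d(p, p_i)^2, \qquad \tilde{w}_i(t) = \mathbb{E} \left [ w(|t - T_i|) \right ].
$$
The sketch below evaluates $\tilde{w}_i(t)$ for normally distributed times by numerically integrating the window against the normal density over the support of the window. The biweight window, the half-width `h`, the grid size `n_grid`, and the function names are all assumptions made for illustration; the code requires $\sigma_i > 0$ and a grid fine enough to resolve the smallest $\sigma_i$.

```python
import numpy as np

def biweight_window(s, h):
    """Assumed window: (1 - (s/h)^2)^2 for |s| < h, zero otherwise."""
    u = np.asarray(s, dtype=float) / h
    return np.where(np.abs(u) < 1.0, (1.0 - u**2) ** 2, 0.0)

def expected_weights(t, tau, sigma, h, n_grid=2001):
    """Effective weights E[w(|t - T_i|)] for T_i ~ N(tau_i, sigma_i^2), sigma_i > 0."""
    tau = np.atleast_1d(np.asarray(tau, dtype=float))
    sigma = np.atleast_1d(np.asarray(sigma, dtype=float))
    # Substituting s = t - x, the expectation is the integral over [-h, h]
    # of w(|s|) times the density of T_i evaluated at t - s.
    s = np.linspace(-h, h, n_grid)
    ds = s[1] - s[0]
    w = biweight_window(np.abs(s), h)                        # shape (n_grid,)
    z = (t - s[None, :] - tau[:, None]) / sigma[:, None]     # shape (N, n_grid)
    pdf = np.exp(-0.5 * z**2) / (sigma[:, None] * np.sqrt(2.0 * np.pi))
    f = w[None, :] * pdf
    # Trapezoidal rule written out explicitly.
    return ds * (f.sum(axis=1) - 0.5 * (f[:, 0] + f[:, -1]))
```

The effective weights can then be used in place of $w(|t - t_i|)$ in the running mean. As $\sigma_i$ shrinks relative to the window width, $\tilde{w}_i(t)$ approaches $w(|t - \tau_i|)$, while a larger $\sigma_i$ spreads the influence of pole $i$ over a wider range of times and lowers its peak weight.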