# Analysis of Physical Oceanographic Data - SIO 221A
### Python version of [Sarah Gille's](http://pordlabs.ucsd.edu/sgille/sioc221a/index.html) notes by:
#### Bia Villas Bôas (avillasboas@ucsd.edu) & Gui Castelão (castelao@ucsd.edu)

## Lecture 15

*Reading:  Bendat and Piersol, Ch. 5.2.5, 5.2.6, 9.2*

Last time we took a general look at correlation (and correlation coefficients)
and their analog in spectral space:  coherence.  Coherence tells us
how effectively two time series resemble each other at any given
frequency.

We defined the cross-spectrum:

\begin{equation}
\hat{S}_{XY}(f_m)= \frac{\langle X_m^* Y_m\rangle}{T}. \hspace{3cm} (1)
\end{equation}

This is complex:  the real part is the co-spectrum ($C(f)$) and the imaginary
part is the quadrature spectrum ($Q(f)$), consistent with the terminology
we use to describe cosine and sine being "in quadrature" with each other.

From that, squared coherence is:

\begin{equation}
\gamma_{xy}^2(f_k) = \frac{C^2(f_k) + Q^2(f_k)}{S_{xx}(f_k) S_{yy}(f_k)}, \hspace{3cm} (2)
\end{equation}

where we needed $S_{xx}$, $S_{yy}$ and $S_{xy}$ to represent averages of
multiple segments.
Coherence is 1 if two data sets consistently oscillate in the same way
in all segments we consider.

The coherence phase is:

\begin{equation}
\phi(f_k) = \tan^{-1}(-Q(f_k)/C(f_k)), \hspace{3cm} (3)
\end{equation}

where $Q$ is the imaginary part of the cross-spectrum $S_{xy}$,
and $C$ is the real part of the cross-spectrum.
The phase
tells us the timing difference between the two time series.  If $\phi = 0$,
changes in $x$ and $y$ happen at the same time.  If $\phi = \pi$, then
$x$ is at a peak when $y$ is at a trough.  And a value of $\phi=\pi/2$ or
$\phi=-\pi/2$ tells us that the records are a quarter cycle different.
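As a rough illustration of Eqs. (2) and (3), the sketch below computes squared coherence and phase from non-overlapping segments. The function name `coherence_and_phase`, the synthetic cosine and sine records, and the noise level are all assumptions made for this example, not part of the original notes.

```python
import numpy as np

def coherence_and_phase(x, y, nseg):
    """Squared coherence (Eq. 2) and phase (Eq. 3) from nseg non-overlapping segments."""
    n = (len(x) // nseg) * nseg                      # trim to a whole number of segments
    X = np.fft.rfft(np.reshape(x[:n], (nseg, -1)), axis=1)
    Y = np.fft.rfft(np.reshape(y[:n], (nseg, -1)), axis=1)
    sxy = np.mean(np.conj(X) * Y, axis=0)            # segment-averaged cross-spectrum
    sxx = np.mean(np.abs(X)**2, axis=0)              # segment-averaged autospectra
    syy = np.mean(np.abs(Y)**2, axis=0)
    coh2 = np.abs(sxy)**2 / (sxx * syy)              # (C^2 + Q^2) / (Sxx Syy)
    phase = np.arctan2(-sxy.imag, sxy.real)          # atan2(-Q, C)
    return coh2, phase

# Synthetic test: cosine and sine with a 10-point period, plus white noise
rng = np.random.default_rng(0)
t = np.arange(2000)
x = np.cos(2*np.pi*t/10) + rng.standard_normal(t.size)
y = np.sin(2*np.pi*t/10) + rng.standard_normal(t.size)

coh2, phase = coherence_and_phase(x, y, nseg=20)
k = 10   # 10 cycles per 100-point segment
print(coh2[k], phase[k])
```

The normalization constants cancel in the coherence ratio, so the spectra are left unscaled here. With this sign convention the sine record, which lags the cosine by a quarter cycle, should give a phase near $+\pi/2$ at the signal frequency.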

Now our task is to figure out what is significant.

#### Coherence uncertainty

No estimate is complete without an uncertainty.
We can compute a significance level for coherence in several ways.  The standard
approach that we discussed previously is to set a threshold for evaluating
whether a calculated coherence exceeds what we might expect from random
white noise.
We started with the uncertainty for the squared coherence, $\gamma^2$:

\begin{equation}
\beta = 1 - \alpha^{1/(n_d-1)}, \hspace{3cm} (4)
\end{equation}

where $n_d$ is the number of segments and $\alpha$ is the significance level,
typically 0.05 for a 95\% significance level
(see Thomson and Emery).
In Python, the threshold for $\gamma$ is:

```python
import numpy as np
gamma_threshold = np.sqrt(1 - alpha**(1/(nd - 1)))
```

An alternate formulation is presented by Bendat and Piersol (Table 9.6), who
report the standard deviation of the squared coherence ($\gamma^2$) to be:

\begin{equation}
\delta_{\gamma_{xy}^2} = \frac{\sqrt{2}(1-\gamma_{xy}^2)}{|\gamma_{xy}| \sqrt{n_d}}. \hspace{3cm} (5)
\end{equation}

These are different metrics.  One tells us whether the derived
coherence is statistically different from zero; the second
evaluates the range of values that would be consistent with an
observed coherence.
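One way to check the threshold in Eq. (4) is a small Monte Carlo experiment: compute the squared coherence of pairs of independent white-noise records many times and count how often it exceeds $\beta$. The sketch below assumes Gaussian white noise and non-overlapping segments; the segment length, number of trials, and variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, nd, npts, ntrials = 0.05, 10, 64, 400
beta = 1 - alpha**(1 / (nd - 1))       # Eq. (4): threshold for gamma^2

exceed = []
for _ in range(ntrials):
    # nd independent segments for each record; drop the zero and Nyquist bins
    X = np.fft.rfft(rng.standard_normal((nd, npts)), axis=1)[:, 1:-1]
    Y = np.fft.rfft(rng.standard_normal((nd, npts)), axis=1)[:, 1:-1]
    coh2 = (np.abs(np.mean(np.conj(X) * Y, axis=0))**2
            / (np.mean(np.abs(X)**2, axis=0) * np.mean(np.abs(Y)**2, axis=0)))
    exceed.append(np.mean(coh2 > beta))

false_alarm_rate = np.mean(exceed)
print(beta, false_alarm_rate)
```

If the threshold is appropriate, the fraction of noise-only coherences above $\beta$ should be close to $\alpha$.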

#### What is $n_d$?

We have a formulation for coherence uncertainty that depends on the number
of segments.  What if we want to use overlapping segments, just as we
did for the Welch method?  You can test this through a Monte Carlo process.
If you set $n_d$ equal to the total number of segments, ignoring the fact that
some overlap, your error bars will be visibly too small.  You can run
Monte Carlo tests with overlapping segments to figure out how many effective
segments you really have.  And perhaps not surprisingly, the results
are equivalent to what we found in the Welch method (albeit scaled by
a factor of 2, since we're now counting segments and not degrees of freedom):

Window type | Effective fraction of independent segments
:-----------:|:-----------------:
Boxcar | 2/3
Triangle | 8/9
Hanning | 18/19 $\approx$ 0.95
Hamming | $\sim$ 0.90

Effective number of independent segments relative to the total number
of segments, using 50\% overlap.  (With no overlap, assume $n_d$ segments.)
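A Monte Carlo test of this kind might be sketched as follows. The sketch relies on an assumption that is not stated in the notes: for two independent white-noise records, the expected squared coherence is roughly $1/n_d$, so the effective number of segments can be estimated as the inverse of the mean noise coherence. The function name `effective_segment_ratio` and all parameter values are hypothetical.

```python
import numpy as np

def effective_segment_ratio(window, ntrials=300, nblocks=10, seed=2):
    """Monte Carlo estimate of (effective n_d) / (total segments) at 50% overlap."""
    rng = np.random.default_rng(seed)
    npts = len(window)
    starts = np.arange(0, npts * (nblocks - 1) + 1, npts // 2)   # 50% overlap
    idx = starts[:, None] + np.arange(npts)[None, :]             # segment indices
    coh2_mean = []
    for _ in range(ntrials):
        x = rng.standard_normal(npts * nblocks)
        y = rng.standard_normal(npts * nblocks)                  # independent of x
        # drop the bins nearest zero frequency and Nyquist
        X = np.fft.rfft(x[idx] * window, axis=1)[:, 2:-2]
        Y = np.fft.rfft(y[idx] * window, axis=1)[:, 2:-2]
        coh2 = (np.abs(np.mean(np.conj(X) * Y, axis=0))**2
                / (np.mean(np.abs(X)**2, axis=0) * np.mean(np.abs(Y)**2, axis=0)))
        coh2_mean.append(coh2.mean())
    nd_eff = 1 / np.mean(coh2_mean)      # assumes E[gamma^2] ~ 1/n_d for noise
    return nd_eff / len(starts)

npts = 64
ratio_boxcar = effective_segment_ratio(np.ones(npts))
ratio_hanning = effective_segment_ratio(np.hanning(npts))
print(ratio_boxcar, ratio_hanning)
```

The estimates are only approximate, but the boxcar ratio should come out well below 1 and the Hanning ratio much closer to 1, in line with the table.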



#### Uncertainties of phase:  What do we believe?

The phase difference that emerges from this is only relevant at the frequency
where there is coherent energy (15 cycles/1000 points in the example above),
and in that case the phase is a quarter cycle different.  If we reverse the
order of $x$ and $y$, we'll find negative phase, so a lead will turn into a
lag.

First a little terminology.
Bendat and Piersol provide a good discussion of bias and uncertainties
in spectral estimators.  As a starting point, the variance of the quantity
that we want to estimate is

\begin{equation}
\text{var}[\tilde{A}] = E[\tilde{A}^2]-A^2,  \hspace{3cm} (6)
\end{equation}

where $A$ is the true value, and $\tilde{A}$ is the unbiased estimate (so
$E[\tilde{A}] = A$).
For spectral estimators we tend to talk about the normalized error:

\begin{equation}
\epsilon^2 = \frac{\text{var}[\tilde{A}]}{A^2}.  \hspace{3cm} (7)
\end{equation}

Bendat and Piersol first derive relationships for the variance of
the spectrum and cross-spectrum in the case of one segment and two degrees
of freedom (see appendix). They then note that
variance scales with $1/n$, where $n$ is the number of degrees of freedom,
so that variance can be inferred for spectra and cross-spectra with any
number of degrees of freedom (by dividing by $n_d$, the number of segments).

The phase error can seem a little murky. One common formulation for phase uncertainty is:

\begin{equation}
\delta_\phi = \sin^{-1}\left[t_{\alpha,2n_d}
\sqrt{\frac{1-\gamma_{xy}^2}{2n_d \gamma_{xy}^2}}\right] \hspace{3cm} (8)
\end{equation}

where $t_{\alpha,2n_d}$ is often identified as the "*Student t distribution*" but is actually the inverse cumulative distribution function (the percent point function) of the Student t distribution (``scipy.stats.t.ppf`` in Python). Given an upper cut-off point of $1-\alpha/2 =0.975$ for the cdf of the t-distribution, we're looking for the corresponding value of $t$. In case you have doubts, check Table A9.3 of Koopmans, which shows, for example, that  $t(0.975,20)=2.086$.
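As a quick check of the table value quoted above, the percent point function can be evaluated directly. The values of `alpha` and `nd` below are chosen only so that the degrees of freedom equal 20.

```python
from scipy import stats

alpha = 0.05
nd = 10                                   # 2*nd = 20 degrees of freedom
t_crit = stats.t.ppf(1 - alpha/2, 2*nd)   # inverse cdf evaluated at 0.975
print(round(t_crit, 3))

# Round trip: the cdf evaluated at t_crit recovers the cut-off point
print(stats.t.cdf(t_crit, 2*nd))
```

The first number should match the Koopmans table entry, and the second should return the cut-off of 0.975.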

But when we plot this up, for our white noise case, it seems to be a complex number, since we've ended up with some out of range values for the arcsine---perhaps this isn't surprising since the phase is ill-defined for white noise.  Bendat
and Piersol provide a different formulation, which has the virtue of producing a real number:

\begin{equation}
\text{std}\left[\phi_{xy}(f)\right] \approx \frac{\left[1-\gamma_{xy}^2(f)\right]^{1/2}}{\left|\gamma_{xy}(f)\right|\sqrt{2 n_d}} \hspace{3cm} (9)
\end{equation}

Zwiers and Von Storch quote Hannan (1970; *Multiple Time Series*,
John Wiley & Sons, 536 pp., see p. 257, equation 2.11)
and provide:

\begin{equation}
\delta_\phi = \sin^{-1}\left[t_{(1+p)/2,2n_d-2}
  \frac{\gamma_{xy}^{-2} -1}{2n_d -2}\right],\hspace{3cm} (10)
 \end{equation}
 
where $p$ is the confidence level (e.g. 0.95), so $(1+p)/2$ and $(1-p)/2$ provide the limits for the $100p$\% confidence interval.
However, Zwiers and Von Storch have misquoted Hannan (1970), who actually
has a form equivalent to this:

\begin{equation}
\delta_\phi = \sin^{-1}\left[t_{(1+p)/2,2n_d-2}
  \left\{\frac{\gamma_{xy}^{-2} -1}{2n_d -2}\right\}^{1/2}\right], \hspace{3cm} (11)
\end{equation}

which is exactly equivalent to Koopmans (1974).
In Python, these become:

```python
import numpy as np
import scipy.stats

# cab is the coherence (not squared) between a and b, computed from the
# segment-averaged cross-spectrum fab and autospectra faa, fbb; for example:
cab = abs(np.mean(fab, axis=0))/np.sqrt(abs(np.mean(faa, axis=0))*abs(np.mean(fbb, axis=0)))

alpha = .05
nd = 10 # number of segments
p = 1-alpha

# Eq. (8)
delta_phase = np.arcsin(scipy.stats.t.ppf(.95, 2*nd)*np.sqrt((1-abs(cab)**2)/(abs(cab)**2*(2*nd))))
# Eq. (9), Bendat and Piersol
delta_phase2 = np.sqrt((1-cab**2)/(abs(cab)**2*2*nd))
# Eq. (10), as quoted by Zwiers and Von Storch
delta_phase3 = np.arcsin(scipy.stats.t.ppf(.975, 2*nd-2)*(1/cab**2-1)/(2*nd-2))
```

The expressions are similar, though not identical.
Which is most appropriate?
We can test this out by creating a fake data set with a known phase
relationship:

In [1]:
import numpy as np
from numpy import sin, cos, pi
import matplotlib.pyplot as plt
import scipy.stats as stats
from cmath import asin

In [2]:
N = 100
M = 1000
a = np.random.randn(M, N) + cos(2*pi/10*np.arange(1, N+1))*np.ones([M, N])
b = np.random.randn(M, N) + sin(2*pi/10*np.arange(1, N+1))*np.ones([M, N])

In [3]:
fa = np.fft.fft(a, axis=-1)
fb = np.fft.fft(b, axis=-1)
fab = np.conjugate(fa)*fb
faa = np.conjugate(fa)*fa
fbb = np.conjugate(fb)*fb
cab = abs(np.mean(fab, axis=0))/np.sqrt(abs(np.mean(faa, axis=0))*abs(np.mean(fbb, axis=0)))

In [12]:
m = 10
phase_c = []
for i in range(1, M//m+1):
    # rows (i-1)*m ... i*m-1: a full block of m segments (0-based slicing)
    num = -np.mean(fab[(i-1)*m:i*m], axis=0).imag
    den = np.mean(fab[(i-1)*m:i*m], axis=0).real
    phase_c.append(np.arctan2(num, den))
phase_c = np.array(phase_c)

nd = m
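# Note: Eq. (8) has 2*nd (not its square root) in the denominator; the values
# printed below were produced with the line as written.
delta_args1 = stats.t.ppf(.95,2*nd)*np.sqrt((1-cab**2)/(cab**2 * np.sqrt(2*nd)))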
delta_phase = np.array([asin(arg) for arg in delta_args1])

delta_phase2 = np.sqrt((1-cab**2)/(abs(cab)**2*2*nd))

delta_args3 = stats.t.ppf(.975, 2*nd-2)*(1/cab**2-1)/(2*nd-2)
delta_phase3 = np.array([asin(arg) for arg in delta_args3])

# compare results
print(delta_phase[10], delta_phase2[10], delta_phase3[10])
print(np.std(phase_c[:, 10]))

(0.2349300289361676+0j) 0.06382055284470536 (0.009508129657135116+0j)
0.06214975490152481


It's clear from these tests that (a) the distribution of the
phases should be roughly Gaussian, (b) Bendat and Piersol's
representation for the standard
deviation of the phase (delta\_phase2) is relatively reliable, (c) the
inverse sine formulations should produce phase errors representing the 95th
percentile.
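For routine use, the two formulations that behaved sensibly can be wrapped in a small helper. This is only a sketch: the function name `phase_uncertainty` is hypothetical, and returning NaN where the arcsine argument exceeds 1 (instead of a complex number) is a choice made here, not something prescribed by the references.

```python
import numpy as np
from scipy import stats

def phase_uncertainty(coh, nd, p=0.95):
    """Phase uncertainty from coherence coh (not squared) and nd segments.

    Returns the Bendat and Piersol standard deviation (Eq. 9) and the
    arcsine-based interval half-width (Eq. 11); the latter is NaN wherever
    the arcsine argument is larger than 1.
    """
    coh = np.asarray(coh, dtype=float)
    std_bp = np.sqrt(1 - coh**2) / (np.abs(coh) * np.sqrt(2 * nd))
    arg = stats.t.ppf((1 + p) / 2, 2 * nd - 2) * np.sqrt((coh**-2 - 1) / (2 * nd - 2))
    delta = np.where(arg <= 1, np.arcsin(np.clip(arg, 0, 1)), np.nan)
    return std_bp, delta

# Low, moderate, and high coherence with 10 segments
std_bp, delta = phase_uncertainty([0.1, 0.5, 0.96], nd=10)
print(std_bp)
print(delta)
```

At low coherence the interval is undefined (NaN), which reflects the fact that the phase is ill-defined when the two records are essentially incoherent.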

#### Interpreting Phase

Let's consider a little thought experiment.  What happens if you compute
coherence between two data sets that are essentially the same, aside from
a little noise, except that one is offset in time relative to the other?
For example:

\begin{eqnarray}
a & = & \tau_i^x + n_i \\
b & = & \tau_{i+7}^x + m_i,
\end{eqnarray}

where $\tau_i^x$ is zonal wind at time step $i$, $n_i$ is one type of noise,
and $m_i$ is another noise that is uncorrelated with $n_i$.  Assuming the
noise to be fairly small, what should the coherence and phase
be between $a$ and $b$?

To figure this out, we can estimate the cross-spectrum:

\begin{equation}
G_{ab} = G_{\tau_i,\tau_{i+7}} + G_{\tau_i,m_i} + G_{n_i,\tau_{i+7}} + G_{n_i,m_i}.
\end{equation}

Since the noise is uncorrelated with the data $\tau$ and uncorrelated with
other noise, with a large enough sample this becomes:

\begin{equation}
G_{ab} \approx G_{\tau_i,\tau_{i+7}}.
\end{equation}

The wind $\tau$ is coherent with itself, albeit with a little phase lag,
so we expect to find:

\begin{equation}
\gamma^2 = \frac{|G_{ab}|^2}{G_{aa}G_{bb}} \approx  \frac{|G_{aa}|^2}{G_{aa}G_{aa}} = 1
\end{equation}

And the phase is

\begin{equation}
\phi(f)=\mbox{atan2}(-\Im[G_{ab}(f)],\Re[G_{ab}(f)]).
\end{equation}

In this case, the phase will simply reflect the 7 timestep shift between $a$
and $b$.  For a frequency of $n$ cycles per $N$ points, one cycle is $N/n$
time units, and the offset will represent a fraction of a cycle:  $7n/N$, or a
phase of $2\pi \cdot 7n/N$ radians.  The phase shift therefore increases
linearly with frequency.
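The thought experiment can be tried numerically. In the sketch below, white noise stands in for the wind record $\tau$ (an assumption, since no real wind data are used here), the noise amplitude is set to 0.1, and the phase uses the sign convention of Eq. (3), so the slope comes out negative.

```python
import numpy as np

rng = np.random.default_rng(3)
N, nseg, shift = 200, 50, 7

tau = rng.standard_normal(N * nseg + shift)             # stand-in for the wind record
a = tau[:N * nseg] + 0.1 * rng.standard_normal(N * nseg)
b = tau[shift:shift + N * nseg] + 0.1 * rng.standard_normal(N * nseg)

A = np.fft.rfft(a.reshape(nseg, N), axis=1)
B = np.fft.rfft(b.reshape(nseg, N), axis=1)
G_ab = np.mean(np.conj(A) * B, axis=0)                  # segment-averaged cross-spectrum
coh2 = np.abs(G_ab)**2 / (np.mean(np.abs(A)**2, axis=0) * np.mean(np.abs(B)**2, axis=0))
phase = np.unwrap(np.arctan2(-G_ab.imag, G_ab.real))    # unwrap to expose the linear trend

n = np.arange(1, N // 2)                                # cycles per N points
slope = np.polyfit(n, phase[1:N // 2], 1)[0]            # radians per frequency index
print(slope, 2 * np.pi * shift / N)
```

The magnitude of the fitted slope should be close to $2\pi \cdot 7/N$. The coherence stays high but a little below 1, both because of the added noise and because each pair of segments shares only $N-7$ of its $N$ points.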


#### Coherence:  The autocovariance perspective

The power of coherence comes because it gives us a means to compare two
different variables.  With spectra we can ask, is there energy at a given
frequency?  With coherence we can ask whether wind energy at a given
frequency drives an ocean response at a given frequency.  Does the ocean
respond to buoyancy forcing?  Does momentum vary with wind?  Does
one geographic location vary with another location?  Coherence is our
window into the underlying physics of the system.

Last time we wrote the cross-spectrum for $x$ and $y$:

\begin{equation}
\hat{C}_{XY}(\sigma_m)= \frac{\langle X_m^* Y_m\rangle}{\Delta \sigma}
\end{equation}

Just as we considered spectra as the Fourier transform of the autocovariance,
we can now think about the Fourier transform of the lagged covariance:

\begin{equation}
R_{xy}(\tau) = \frac{1}{2T} \int_{-T}^{T} x^*(t) y(t+\tau)\, dt.
\end{equation}

We can rewrite this:

\begin{eqnarray}
R_{xy}(\tau) & = & \frac{1}{2T} \int_{-T}^{T} \sum_{n=-\infty}^{\infty}
X_n^* e^{-i\sigma_n t} \sum_{m=-\infty}^{\infty} Y_m e^{i\sigma_m (t+\tau)}\, dt \\
 & = & \frac{1}{2T} \sum_{n=-\infty}^{\infty} X_n^* \sum_{m=-\infty}^{\infty} Y_m e^{i\sigma_m \tau} \int_{-T}^{T} e^{i(\sigma_m-\sigma_n) t} \, dt \\
 & = & \sum_{n=-\infty}^{\infty} X_n^* \sum_{m=-\infty}^{\infty} Y_m e^{i\sigma_m \tau} \delta_{nm} \\
 & = & \sum_{n=-\infty}^{\infty} X_n^* Y_n e^{i\sigma_n \tau} \\
 & = & \Delta\sigma \sum_{n=-\infty}^{\infty} C_{XY}(\sigma_n) e^{i\sigma_n \tau},
\end{eqnarray}

where we used the Kronecker delta $\delta_{nm}$ to extract only frequencies
for which $n=m$, since all other modes are orthogonal.  The result tells us
that the lagged covariance is the inverse Fourier transform of the
cross spectrum.
In other words,

\begin{equation}
C_{XY}(\sigma_n) = \int_{-T}^{T} R_{xy}(\tau) e^{-i\sigma_n\tau}\, d\tau =
\frac{X_n^* Y_n}{\Delta \sigma}
\end{equation}

Thus we could determine the cross-spectrum from the lagged covariance.
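A discrete analog of this result is easy to verify. The sketch below assumes periodic (circular) records, so that the lagged covariance wraps around the ends, and uses NumPy's unnormalized FFT convention, which introduces a factor of $1/N$ in place of $1/\Delta\sigma$. The test signals are synthetic and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 128
x = rng.standard_normal(N)
y = np.roll(x, 5) + 0.5 * rng.standard_normal(N)    # y lags x by 5 steps, plus noise

# Circular lagged covariance: R_xy[lag] = mean over t of x[t] * y[t + lag]
R_xy = np.array([np.mean(x * np.roll(y, -lag)) for lag in range(N)])

C_from_R = np.fft.fft(R_xy)                              # transform of the covariance
C_direct = np.conj(np.fft.fft(x)) * np.fft.fft(y) / N    # X* Y from the Fourier coefficients

print(np.allclose(C_from_R, C_direct))
print(np.argmax(R_xy))   # lag with the largest covariance
```

The two estimates of the cross-spectrum should agree to floating-point precision, and the lagged covariance should peak at the imposed 5-step lag.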