# Time domain correlations

Consider a pair of time dependent signals or stochastic (random) processes like the two shown in the figure below. They could be fluctuations in the current in a circuit, or the velocity or position of a small (Brownian) particle subject to thermal fluctuations, or the deflection of your ear drum as you listen to this lecture, or... As you can see, this is a problem that arises in every corner of science and engineering.

![stochastic signals](./two_signals.png)

In fact, these two traces are the angular deflection of a *tiny* mirror suspended from a very thin wire, measured by the reflection of a laser off of the mirror. The difference is that the upper trace was obtained at atmospheric pressure, while the lower trace was obtained at about $10^{-7}$ atm.
___
**Question:** If you wanted to develop a physical model of this fluctuating mirror, what elements would you put in?
___

Are these signals random? Partially random and partially predictable? How do we know? What sort of mathematical analysis can answer this question?

*Time series analysis* is the branch of statistical analysis that provides the tools to answer such questions. In the figure above, we might wonder whether there is any regularity in the position of the mirror --- is there some period of oscillation hidden in the signal? Does the timeseries have some *memory,* such that the position is correlated with earlier positions? How long is the memory?

We can search for an answer in the *correlation function* of the signal. Let the signal of interest be denoted $x$. The time-correlation (or "auto-correlation") is defined as:

$$C_x(t,t^\prime) = \langle x(t)x(t^\prime)\rangle \tag{Eq. 1}$$

The angle brackets mean "average," which is required when we are dealing with observables that have both randomness and some inherent structure. (**Why?**) The average could be either of the following:
* Ensemble average: Repeat the measurement (or simulation) many times, and then average over multiple realizations
* Time average: Collect data for a long time, and treat different $(t,t^{\prime})$ pairs as independent. 
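
Here is a minimal sketch of the two kinds of average, assuming a toy signal made of independent uniform random numbers. The names `ensemble_avg` and `time_avg`, the seed, and the sample sizes are illustrative choices, not part of the lecture code. Both numbers estimate $\langle x(t)x(t^\prime)\rangle$ for neighboring points, and for this toy signal they should agree to within the sampling noise.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

lag = 1

# Ensemble average: many independent realizations, one (t, t') pair from each
n_realizations = 20000
ensemble = rng.random((n_realizations, lag + 1))   # each row is one short realization
ensemble_avg = np.mean(ensemble[:, 0] * ensemble[:, lag])

# Time average: one long realization, slide the pair (t, t + lag) along it
long_run = rng.random(20000)
time_avg = np.mean(long_run[:-lag] * long_run[lag:])

print("ensemble average: %8.4f" % ensemble_avg)
print("time average:     %8.4f" % time_avg)
```

Treating the two as interchangeable is exactly the kind of assumption that the ergodicity question below is about.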

**Time translation invariance.** Here we will (usually) consider processes which are *ergodic.* **Question:** What does ergodic mean?

* When a signal is invariant under translation in time, $C_x(t,t^\prime)$ can only depend on the difference $(t-t^\prime)$, which is often referred to as the "lag time". We can also pick $t^\prime = 0$, and write $C_x(t) = \langle x(t)x(0)\rangle$.
* It is common to subtract off the average value of $x$ if it is not zero, and compute instead the auto-correlation of the fluctuations about the average value:

$$\begin{align}
C_x(t) & = \langle \left( x(t) - \langle x\rangle\right) \left( x(0) - \langle x\rangle\right)\rangle\\
 & = \langle x(t)x(0) \rangle - \langle x \rangle^2 \tag{Eq. 2}
\end{align}$$
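
The step between the two lines of Eq. 2 is worth writing out once. It assumes only time translation invariance, so that $\langle x(t)\rangle = \langle x(0)\rangle = \langle x\rangle$:

$$\begin{align}
\langle \left( x(t) - \langle x\rangle\right) \left( x(0) - \langle x\rangle\right)\rangle & = \langle x(t)x(0) \rangle - \langle x\rangle\langle x(t) \rangle - \langle x\rangle\langle x(0) \rangle + \langle x \rangle^2\\
 & = \langle x(t)x(0) \rangle - 2\langle x \rangle^2 + \langle x \rangle^2\\
 & = \langle x(t)x(0) \rangle - \langle x \rangle^2
\end{align}$$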

To build some intuition for what this expression means, let's consider the case $t=0$, where $t$ is the *lag time* between the two points being multiplied. We call this the "equal time" correlation function, or "static" correlation function, because it does not depend on time. Note that if there are $N$ observations in your time series, there are $N$ points at which you must compute the product inside the $\langle...\rangle$:

$$C_x(0) = \langle x^2 \rangle - \langle x \rangle^2 \tag{Eq. 3}$$

This is the *variance* of the data series. Its square root is the standard deviation. It is also common to normalize $C_x(t)$ by the variance, putting the $C_x(0)$ in the denominator like this:

$$C_x(t) = \frac{\langle x(t)x(0) \rangle - \langle x \rangle^2}{\langle x^2 \rangle - \langle x \rangle^2} \tag{Eq. 4}$$

If you use a built in routine to compute an autocorrelation, it might be computing Eq. 1, or Eq. 2, or Eq. 4...so you had better know the difference!
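
As one concrete example, NumPy's `np.correlate` returns raw lagged sums: it neither subtracts the mean nor divides by the number of pairs. The sketch below shows one way to get from those sums to Eq. 1, Eq. 2, and Eq. 4. The helper `raw_sums` and the test signal are my own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.random(500)          # toy signal: uniform random numbers in [0, 1)
N = len(x)

def raw_sums(y):
    """Lagged sums sum_i y[i]*y[i+lag] for lag = 0..len(y)-1, via np.correlate."""
    full = np.correlate(y, y, mode="full")  # lags -(n-1)..(n-1); zero lag sits at index n-1
    return full[len(y) - 1:]

n_pairs = N - np.arange(N)   # number of pairs available at each lag

C_eq1 = raw_sums(x) / n_pairs               # Eq. 1: <x(t)x(0)>
C_eq2 = raw_sums(x - x.mean()) / n_pairs    # Eq. 2: fluctuations about the mean
C_eq4 = C_eq2 / C_eq2[0]                    # Eq. 4: normalized by the variance

print("Eq. 1 at zero lag: %8.4f" % C_eq1[0])
print("Eq. 2 at zero lag: %8.4f" % C_eq2[0])
print("Eq. 4 at zero lag: %8.4f" % C_eq4[0])
```

The three zero-lag values are the mean square, the variance, and exactly 1, which is a quick way to tell which version a routine is handing you.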

At equal times, the signal is perfectly correlated with itself. As the time between observations goes to infinity ($t\rightarrow\infty$) the timeseries will (in most cases) forget its earlier values, and $C_x(t) \rightarrow 0$. What happens in between contains an enormous amount of information about the signal. Often, given a signal of interest (or some data from a simulation), we might try to compute $C_x(t)$ directly in the time domain, and then see if it fits some simple model, like an exponential decay. In that case, the time constant of the exponential decay tells you how long (*on average*) it takes for the signal to "forget" where it was at an earlier time.

*Aside:* You can think of the correlation function (when we are in discretized time, as we always are in the computer) as a *covariance matrix*, constrained by symmetry across the diagonal, positivity of the eigenvalues, and time-translation invariance. 
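
To make the aside concrete, here is a small sketch that builds such a matrix and checks the three properties. It assumes a model correlation function $C(t) = e^{-|t|/\tau}$ (the exponential decay mentioned above); the matrix size and the value of `tau` are arbitrary.

```python
import numpy as np

tau = 2.0
n = 6                                   # small matrix so it is easy to print
lags = np.arange(n)
C = np.exp(-lags / tau)                 # assumed model: C(t) = exp(-|t|/tau)

# Time-translation invariance: entry (i, j) depends only on the lag |i - j|
i, j = np.indices((n, n))
cov = C[np.abs(i - j)]

print(np.round(cov, 3))
print("symmetric:", np.allclose(cov, cov.T))
print("eigenvalues:", np.round(np.linalg.eigvalsh(cov), 3))
```

Every diagonal of the matrix is constant, which is what time-translation invariance looks like in matrix form.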

Let's use a completely random signal to explore some of these ideas and analysis. To build our timeseries I will draw a series of random numbers, uniformly distributed between $0$ and $1$. There are two loops. In the first loop, I generate the random timeseries and compute its average on the fly. Then in the second (double) loop I compute the correlation function. I'll compute all three versions: Eq. 1, 2, and 4.

**Question:** How does Eq. 2 simplify for a case like this? (Random entries, no correlation between them.) 



In [None]:
import matplotlib.pyplot as plt
import numpy as np
import random as rand

## I'm going to write this code in a silly way, in order to make the math more obvious
## There are *MUCH* smarter/faster ways to compute correlation functions...those will come later

## This cell generates a series of uniformly distributed random numbers
## and computes the autocorrelation of that signal


N = 100
amp = 1.0 ## amplitude of the signal. vary to see how the various 
          ## statistics change
rand_signal = []
time = []
total = 0 ## this will store the sum of the signal as it is generated 
          ## and at the end of the first loop I'll compute the average

for i in range(0,N):
    
    ##NOTE: here is our first use of a pseudo random number, which calls the "random" 
    ## function from the python lib "random." This routine is based on the Mersenne twister
    tmp1 =  amp*rand.random()
    rand_signal.append(tmp1) 
    time.append(i)
    total = total + tmp1

sig_avg = total/N

## this list will hold the autocorrelation functions
C_rand = []  #Eq. 1
C_rand_dev = [] #Eq. 2
C_normed = [] #Eq. 4

# delta is the "lag time"
for delta in range (0,N-10):
    ## total keeps the sum for each time lag, 
    ## incr counts how many terms contribute to the sum
    total = 0 # This will store the sums as we compute Eq. 1
    total2 = 0 # This will store the sums as we compute Eq. 2
    incr = 0
    for i in range (0,N-delta):
        incr = incr + 1
        # the product at each pair of data lagged by delta t
        total = total + rand_signal[i]*rand_signal[i+delta]
        total2 = total2 + (rand_signal[i] - sig_avg)*(rand_signal[i+delta] - sig_avg)
    
    ## here we compute the average (this is dumb when the sum has lots of terms)
    avg = total/incr
    avg2 = total2/incr
    C_rand.append(avg)
    C_rand_dev.append(avg2)
    
# x axis for the C(t) plots: one integer lag time per entry of C_rand
M_C = len(C_rand)
x_axis = np.arange(M_C)

var = C_rand[0]  # the mean square <x**2>; the variance is computed on the next line
denom = var - sig_avg*sig_avg # the variance: <x**2> - <x>**2

#now fill the elements of C_normed with Eq. 4
for i in range (0,M_C):
    tmp = C_rand_dev[i]/denom
    C_normed.append(tmp)

print("<x>: %8.4f" % sig_avg)
print("<x**2>: %8.4f" % var)
print ("<x**2>-<x>**2: %8.4f" % denom)
print("C(t=0) Eq. 1: %8.4f" % C_rand[0])
print("C(t=0) Eq. 2: %8.4f" % C_rand_dev[0])
print("C(t=0) Eq. 4: %8.4f" % C_normed[0])

plt.figure(1, figsize=(10, 3))
plt.subplots_adjust(wspace=0.5)
ax = plt.subplot(221)
ax.set_xlabel('time')
ax.set_ylabel('signal')
plt.plot(time,rand_signal)
#plt.plot(time,noisy_periodic)

ax2 = plt.subplot(222)
ax2.set_xlabel('lag time')
ax2.set_ylabel('C(t) (Eq. 1)')
plt.plot(x_axis,C_rand)

ax3 = plt.subplot(223)
ax3.set_xlabel('lag time')
ax3.set_ylabel('C(t) (Eq. 2)')
plt.plot(x_axis,C_rand_dev)
#plt.plot(x_axis,C_normed)

ax4 = plt.subplot(224)
ax4.set_xlabel('lag time')
ax4.set_ylabel('C(t) (Eq. 4)')
plt.plot(x_axis,C_normed)

plt.show()


Let's take a closer look at the loop that computes the correlation function. The data series is just a numbered sequence of values:
![time series](./timeseries.jpg)

To compute the $t=0$ entry in Eq. 1, we form the product of every value with itself, sum them, and then average:
$$C(t=0) = \frac{\left(x_0x_0+x_1x_1+x_2x_2+...+x_{N-1}x_{N-1} \right)}{N}$$

Then we compute the $t=1$ lag time:
$$C(t=1) = \frac{\left(x_0x_1+x_1x_2+x_2x_3+...+x_{N-2}x_{N-1} \right)}{N-1}$$

Then $t=2$:
$$C(t=2) = \frac{\left(x_0x_2+x_1x_3+x_2x_4+...+x_{N-3}x_{N-1} \right)}{N-2}$$

As we go to longer and longer lag times, there are fewer and fewer pairs of values to average together, because the time series is finite in length. This problem gets extreme at the longest lag times:

Lag time $t=N-2$:
$$C(t=N-2) = \frac{\left(x_0x_{N-2}+x_1x_{N-1}\right)}{2}$$

In the code we therefore implement a double loop --- one loop for the lag time, and the second for the starting position:

    for lag = 0, lag < N-1:
        for j = 0, j < N-lag:
            compute the product x[j]*x[j+lag]
            add to total
            ## exit j loop ##
        compute avg = total/(N-lag)

**Discussion (important!):** Notice the number of operations required to compute $C_x(t)$, and think about what happens if you have a very long timeseries, and just naively compute the whole damn thing by brute force!
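
To put a number on it: the double loop needs $\sum_{t=0}^{N-1}(N-t) = N(N+1)/2$ products, so the cost grows like $N^2$. One common way around this is to compute the lagged sums with a fast Fourier transform, which costs roughly $N\log N$. Here is a minimal sketch that checks the two routes against each other. The function names are mine, and padding with zeros to length $2N$ is one standard way to keep the FFT's circular correlation from wrapping around.

```python
import numpy as np

def autocorr_brute(x):
    """Eq. 1 by the double loop: roughly N**2 / 2 multiplications."""
    N = len(x)
    C = np.empty(N)
    for lag in range(N):
        total = 0.0
        for i in range(N - lag):
            total += x[i] * x[i + lag]
        C[lag] = total / (N - lag)
    return C

def autocorr_fft(x):
    """Eq. 1 via the FFT: roughly N log N operations."""
    N = len(x)
    # zero-pad to length 2N so the circular correlation does not wrap around
    spectrum = np.fft.rfft(x, n=2 * N)
    sums = np.fft.irfft(spectrum * np.conj(spectrum), n=2 * N)[:N]
    return sums / (N - np.arange(N))    # divide by the number of pairs at each lag

rng = np.random.default_rng(2)
x = rng.random(200)
print("max difference:", np.max(np.abs(autocorr_brute(x) - autocorr_fft(x))))
```

The two agree to rounding error, so the FFT route is a drop-in replacement once the timeseries gets long.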

### A signal with some memory
Now let's take a look at another random signal, but one which has some structure. First we will generate a sequence of *independent* Gaussian distributed values, and compute the autocorrelation of this "signal." Since they are independent, there should be no correlation, just like before. The only difference between this random sequence and the one above is that the random numbers in the last code cell were *uniformly* distributed, and these are *Gaussian* distributed.

Then we will generate a sequence of Gaussian distributed numbers, but we will make them correlated, by making each new member of the sequence dependent on the previous one. This algorithm (which I shamelessly stole from Markus Deserno's (CMU Physics) website) looks like

$$x_{n+1} = fx_n + \sqrt{(1-f^2)}g_{n+1}$$

where $g_n$ is a random number drawn from a Gaussian distribution of zero mean and unit variance, and $f$ introduces the memory via a timescale $\tau$: $f=e^{-1/\tau}$. So if we simulate such a sequence and compute its correlation function, we expect to get an exponential decay on a timescale of $\tau$.
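
It takes only a few lines to see where the exponential comes from. Writing the sequence index as $n$, and assuming that the new random number $g_{n+1}$ is independent of everything that came before it and that the sequence starts with unit variance, $\langle x_n^2\rangle = 1$:

$$\begin{align}
\langle x_{n+1}^2 \rangle & = f^2\langle x_n^2 \rangle + 2f\sqrt{1-f^2}\,\langle x_n g_{n+1} \rangle + (1-f^2)\langle g_{n+1}^2 \rangle = f^2 + (1-f^2) = 1\\
\langle x_{n+1}x_n \rangle & = f\langle x_n^2 \rangle + \sqrt{1-f^2}\,\langle g_{n+1}x_n \rangle = f\\
\langle x_{n+k}x_n \rangle & = f^k = e^{-k/\tau}
\end{align}$$

The first line shows that the factor $\sqrt{1-f^2}$ is what keeps the variance fixed at 1, and the last line follows by applying the second line $k$ times.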

*Aside:* This type of noise arises often in the context of the motion of a Brownian particle, and is called "Ornstein-Uhlenbeck" noise.

*Aside2:* Even though most commonly used numerical libraries have built in (pseudo) random number generators, generating sequences of random numbers numerically is not trivial. In fact, most such generators are not really random at all, but deterministic (same seed, same sequence) and repeating (but good generators have very long periods). The random numbers produced by Python's `random` library are based on the Mersenne Twister algorithm (Matsumoto and Nishimura, 1998) which has a period of $2^{19937} - 1$. The documentation for Python's `random` module contains a warning about using these sequences for cryptographic applications, for obvious reasons.

*Aside3:* The Mersenne Twister and its cousins all generate *uniformly* distributed real numbers in the interval \[0,1). If you want Gaussian distributed (or Poisson, or Weibull, or...) then you have to apply a transformation to the uniform i.i.d. ("independent and identically distributed") numbers. These are already implemented in Python as well. But when I was your age, we coded our own transformations while walking uphill through the snow. On punchcards.
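
For the curious, here is a sketch of one such transformation, the Box-Muller transform, which turns two uniform numbers into a Gaussian one. I am not claiming this is what `rand.gauss` does internally; the function name `box_muller`, the seed, and the sample size are just for illustration.

```python
import math
import random as rand

def box_muller():
    """Turn two uniform random numbers into one standard Gaussian number."""
    u1 = 1.0 - rand.random()   # shift [0, 1) to (0, 1] so that log(u1) is finite
    u2 = rand.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

rand.seed(3)   # fixed seed so the sketch is reproducible
samples = [box_muller() for _ in range(50000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print("mean: %8.4f   variance: %8.4f" % (mean, var))
```

The sample mean and variance should come out close to 0 and 1, up to sampling noise.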


In [None]:


## Here I use an algorithm that Markus Deserno wrote up in 2002. 
## It can be found at the CMU biophysics website

tau = 2.0 # vary the timescale to see that the correlation time changes 
        # as expected
f =  np.exp(-1.0/tau)
g_scale = np.sqrt(1 - f**2)
#print(f, g_scale)

# for debugging, use the same sequence of rands
#rand.seed(2)

uni_sig = []
corr_sig = []
t = []
s0 = rand.gauss(0.0, 1.0)
uni_sig.append(s0)
corr_sig.append(s0)
t.append(0)

# the loop generates the sequence ("corr_sig") of correlated random numbers according to Deserno's algorithm
# we will also generate a sequence of Gaussian distributed numbers with no 
# correlation for comparison ("uni_sig")
for i in range(0,10000):
    # each new value depends on the most recent one, corr_sig[i]
    # (using corr_sig[i-1] would build two interleaved, mutually independent sequences)
    tmp2 = f*corr_sig[i] + g_scale*rand.gauss(0.0, 1.0)
    tmp = rand.gauss(0.0, 1.0)
    uni_sig.append(tmp)
    corr_sig.append(tmp2)
    t.append(i+1)

# calculate the autocorrelation function (Eq. 1) at every lag from 0 to 49. 
# both signals have zero mean, so there is no average to subtract

uni_Ct = []
corr_Ct= []
M = len(corr_sig)
for delta in range (0,50):
    total = 0
    total2 = 0
    incr = 0
    for i in range (0,M-delta):
        incr = incr + 1
        total = total + uni_sig[i]*uni_sig[i+delta]
        total2 = total2 + corr_sig[i]*corr_sig[i+delta]
        
    avg = total/incr
    avg2 = total2/incr
    uni_Ct.append(avg)
    corr_Ct.append(avg2)

# normalize the auto correlations by the variance so they start from 1
# ie, we plot Eq. 4
M_C = len(uni_Ct)
# save the zero-lag values first, otherwise they are overwritten with 1 
# on the first pass and nothing after that gets normalized
uni_norm = uni_Ct[0]
corr_norm = corr_Ct[0]
for i in range(0,M_C):
    uni_Ct[i] = uni_Ct[i]/uni_norm
    corr_Ct[i] = corr_Ct[i]/corr_norm

x2_axis = np.arange(M_C)  # integer lag times 0, 1, ..., M_C-1
calc_C = np.exp(-1.0*x2_axis/tau)

## start with just the uncorrelated signal and its autocorr
## then add the correlated signal
## corr_sig is the correlated signal
plt.figure(1, figsize=(8.5, 4))
plt.subplots_adjust(wspace=0.5)
ax = plt.subplot(121)
ax.set_xlabel('time')
ax.set_ylabel('signal')
ax.set_xlim(right=100)
plt.plot(t,uni_sig)
plt.plot(t,corr_sig)


ax2 = plt.subplot(122)
ax2.set_xlabel('lag time')
ax2.set_ylabel('C(t)')
plt.plot(x2_axis,uni_Ct)
plt.plot(x2_axis,corr_Ct)
plt.plot(x2_axis,calc_C)

plt.show()
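
Comparing curves by eye is fine, but we can also pull the correlation time out of the data. Below is a minimal, self-contained sketch (it regenerates its own correlated sequence with NumPy rather than reusing the lists above) that fits a straight line to $\log C(t)$ at short lag times. Fitting only the first few lags is a choice I am making here, because the noise in $C(t)$ swamps the signal once $C(t)$ gets small.

```python
import numpy as np

rng = np.random.default_rng(4)   # fixed seed so the sketch is reproducible
tau = 2.0
f = np.exp(-1.0 / tau)
g_scale = np.sqrt(1.0 - f**2)

# generate the correlated sequence: each value depends on the one before it
n_steps = 100000
x = np.empty(n_steps)
x[0] = rng.standard_normal()
for n in range(1, n_steps):
    x[n] = f * x[n - 1] + g_scale * rng.standard_normal()

# normalized autocorrelation (Eq. 4) for the first few lag times
lags = np.arange(6)
x = x - x.mean()
C = np.array([np.mean(x[:n_steps - lag] * x[lag:]) for lag in lags])
C = C / C[0]

# for C(t) = exp(-t/tau), log C is a straight line with slope -1/tau
slope, intercept = np.polyfit(lags, np.log(C), 1)
tau_fit = -1.0 / slope
print("input tau: %6.3f   fitted tau: %6.3f" % (tau, tau_fit))
```

Try changing `tau` and `n_steps` to see how much data it takes to recover the timescale reliably.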