# Time series analysis

In this pre-class assignment, we'll work with a time series dataset of the Galactic [low-mass X-ray binary](https://en.wikipedia.org/wiki/X-ray_binary#Low-mass_X-ray_binary) GX 5-1 ([Jonker et al. 2002](http://adsabs.harvard.edu/abs/2002MNRAS.333..665J)).  GX 5-1 shows a number of periodic and quasi-periodic behaviors that we can examine with the techniques from Chapter 11 of Feigelson.

The dataset was taken by [Norris et al. (1990)](http://adsabs.harvard.edu/abs/1990ApJ...361..514N) using the [Ginga](https://en.wikipedia.org/wiki/Ginga_(satellite)) satellite.  It contains 65,536 contiguous measurements of the number of X-ray counts in intervals of $\Delta t = 1/128$ seconds, for a total of 512 seconds.

The cell immediately below this reads in the data, reshapes the array of counts so it has the right shape, and then creates an array of times.  Instructions are below that!

In [None]:
import numpy as np
import numpy.random as npr
%matplotlib inline
import matplotlib.pyplot as plt

# get the data!
counts = np.loadtxt("GX.dat")
print("original array shape:", counts.shape)

# the array is the wrong shape because the data is structured oddly.  The counts
# are in order in time, and meant to be read left to right, top to bottom.  (So
# each row is contiguous in time, and the following row comes after it in time.)
# we can sort this out by reshaping the array to be 1D
counts = np.reshape(counts,counts.size,order='C')
print("NEW array shape:     ", counts.shape)

# make an array of times (in seconds), one per count bin, spaced by 1/128 s
times = np.arange(0.0,counts.size/128.0,1.0/128.0)

First, let's plot the data.  Make sure to plot it as points, not as a line!  You may want to zoom in on the first second or two of the data to see what's going on.  Also, make a 1D histogram of the counts per time bin to get a sense of the behavior of the data.  As you vary the number of bins in the histogram, what happens?
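Because the counts are integers, the choice of histogram bins matters.  Here is a minimal sketch of two binning choices using `np.histogram`.  It uses synthetic Poisson counts as a stand-in for `GX.dat` (the rate of 40 counts per bin and the name `fake_counts` are assumptions made for illustration, not properties of the real data).

```python
import numpy as np

# Synthetic stand-in for the real data: Poisson counts with an assumed mean of 40.
rng = np.random.default_rng(1)
fake_counts = rng.poisson(40.0, size=65536)

# Choice 1: one bin per integer value, with edges placed at half-integers.
edges = np.arange(fake_counts.min() - 0.5, fake_counts.max() + 1.5, 1.0)
hist_int, _ = np.histogram(fake_counts, bins=edges)

# Choice 2: an arbitrary number of equal-width bins spanning the data range.
# The bin width is generally not an integer, so some bins can cover more
# integer values than others.
hist_37, edges_37 = np.histogram(fake_counts, bins=37)

print("integer-aligned bins:", hist_int.size)
print("width of arbitrary bins:", edges_37[1] - edges_37[0])
```

If the bin width is not a whole number of counts, neighboring bins can contain different numbers of integer values, which may show up as spikes or dips that are not in the data.  Passing `hist_int` or `hist_37` to `plt.bar` (or calling `plt.hist` with the same `bins` argument) lets you compare the two by eye.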

The dataset is clearly very noisy.  Try smoothing it with the Central Moving Average (CMA) smoother (Equation 11.4 in Feigelson) to see if there is underlying structure.  Try varying the number of bins in the moving-average smoother: j = 4, 32, 256, 1024.  How does the choice of j change what you see when you plot counts vs. time?  Note: make a separate array for each smoothed dataset, since you will use them again in the steps below!
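One possible way to write the smoother is sketched below.  It assumes the CMA takes the form $\hat{X}_i = \frac{1}{2j+1}\sum_{k=-j}^{j} X_{i+k}$, i.e. that j is the half-width of a window of 2j+1 points; check this against Equation 11.4 before relying on it.  The function name, the choice to return NaN near the edges, and the synthetic test series are all assumptions made for illustration.

```python
import numpy as np

def central_moving_average(x, j):
    """Central moving average with half-width j (window of 2j+1 points).

    Assumed form: xhat_i = (1/(2j+1)) * sum_{k=-j}^{j} x_{i+k}.
    Points within j samples of either end have no full window and are NaN.
    """
    x = np.asarray(x, dtype=float)
    window = 2 * j + 1
    smoothed = np.full(x.size, np.nan)
    if x.size < window:
        return smoothed
    # A cumulative sum gives every window sum in O(n) rather than O(n * j).
    csum = np.concatenate(([0.0], np.cumsum(x)))
    smoothed[j:x.size - j] = (csum[window:] - csum[:-window]) / window
    return smoothed

# Synthetic stand-in for the real counts: Poisson noise around a slow 0.5 Hz wave.
rng = np.random.default_rng(0)
t = np.arange(1024) / 128.0
fake_counts = rng.poisson(50.0 + 10.0 * np.sin(2 * np.pi * 0.5 * t))

# One smoothed array per value of j, stored in a dictionary keyed by j.
smoothed = {j: central_moving_average(fake_counts, j) for j in (4, 32, 256)}
```

Keeping the output the same length as the input means each smoothed array can be plotted directly against `times`.  The NaN edges are skipped by matplotlib, but they need to be trimmed (for example with `arr[~np.isnan(arr)]`) before any calculation that cannot handle NaN.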

Use the [SciPy FFT routines](https://docs.scipy.org/doc/scipy-0.18.1/reference/tutorial/fftpack.html) to calculate the Fourier transform of the dataset and get its resulting behavior in Fourier space.  What happens when you do this for successively more smoothed versions of the dataset?  (Hint: look back at your work for the pre-class and in-class assignments for Day 8, and Homework 3!)
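As a rough sketch of the mechanics, the block below computes a one-sided power spectrum of an evenly sampled series.  Note that it uses `numpy.fft` rather than the SciPy routines linked above (newer SciPy versions provide `rfft` and `rfftfreq` with the same call pattern in `scipy.fft`).  The function name, the normalization by the number of samples, and the synthetic 8 Hz test signal are assumptions chosen for illustration.

```python
import numpy as np

def power_spectrum(x, dt):
    """Return (frequencies in Hz, power) for a real, evenly sampled series.

    The mean is subtracted first so the zero-frequency term does not dominate.
    The normalization (|FFT|^2 / n) is one common convention, assumed here.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    amplitudes = np.fft.rfft(x)               # FFT for real input, f >= 0 only
    freqs = np.fft.rfftfreq(x.size, d=dt)     # runs from 0 up to the Nyquist frequency
    power = np.abs(amplitudes) ** 2 / x.size
    return freqs, power

# Synthetic check: a pure 8 Hz sine wave sampled at 128 Hz for 32 seconds.
dt = 1.0 / 128.0
t = np.arange(4096) * dt
signal = np.sin(2 * np.pi * 8.0 * t)

freqs, power = power_spectrum(signal, dt)
peak_freq = freqs[np.argmax(power)]
print("peak frequency [Hz]:", peak_freq)
print("Nyquist frequency [Hz]:", freqs[-1])
```

With $\Delta t = 1/128$ seconds the highest frequency the data can represent is 64 Hz.  The input must not contain NaN values, so trim any undefined edge points from a smoothed array before transforming it.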

Write a function to calculate the autocorrelation function (Feigelson Eq. 11.3, described in more detail in Section 11.3.2), and do so for each of the smoothed datasets that you created above.  How does this compare to the FFT?  (Hint: think about what the x-axis of the autocorrelation function means as compared to the FFT when trying to make this comparison!)
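A minimal sketch of such a function follows.  It assumes the sample autocorrelation is defined as $\mathrm{ACF}(k) = \sum_{t=1}^{n-k}(X_t-\bar{X})(X_{t+k}-\bar{X}) \,/\, \sum_{t=1}^{n}(X_t-\bar{X})^2$; check that this matches the equation in Feigelson.  The function name, the `max_lag` argument, and the synthetic sine wave are assumptions made for illustration.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation at lags 0, 1, ..., max_lag (lags in samples).

    Assumed definition: the lagged sum of products of deviations from the
    mean, divided by the total sum of squared deviations, so that ACF(0) = 1.
    """
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    denom = np.sum(dev ** 2)
    acf = np.empty(max_lag + 1)
    for k in range(max_lag + 1):
        # Pair each point with the point k samples later.
        acf[k] = np.sum(dev[:x.size - k] * dev[k:]) / denom
    return acf

# Synthetic check: a sine wave with a period of exactly 16 samples.
n = 1024
wave = np.sin(2 * np.pi * np.arange(n) / 16.0)
acf = autocorrelation(wave, max_lag=32)

# Convert lags from samples to seconds, assuming 1/128 s sampling.
lag_seconds = np.arange(acf.size) / 128.0
```

For a periodic signal the ACF peaks again at a lag of one full period and is most negative at half a period.  The x-axis here is a time lag, so a feature at lag $\tau$ corresponds to a frequency of about $1/\tau$ in the Fourier transform.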

Finally, write a function that calculates the 2nd-order structure function (Section 11.4.2) $D^2(\tau)$ for the various smoothed datasets above, and plot $D^2(\tau)$ vs. $\tau$.  Compare it to your prior results.
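The sketch below shows one way to do this, assuming the second-order structure function is defined as the mean squared difference between points separated by a lag, $D^2(\tau) = \langle [x(t+\tau) - x(t)]^2 \rangle$; check this against the definition in Section 11.4.2.  The function name, the `max_lag` argument, and the test series are assumptions made for illustration.

```python
import numpy as np

def structure_function(x, max_lag):
    """Second-order structure function at lags 0, 1, ..., max_lag (in samples).

    Assumed definition: D^2(k) = mean over t of (x[t+k] - x[t])^2.
    D^2(0) is zero by construction.
    """
    x = np.asarray(x, dtype=float)
    d2 = np.zeros(max_lag + 1)
    for k in range(1, max_lag + 1):
        diff = x[k:] - x[:-k]        # all pairs separated by k samples
        d2[k] = np.mean(diff ** 2)
    return d2

# Check 1: for a linear ramp x[t] = t, every pair at lag k differs by k.
ramp = np.arange(100, dtype=float)
d2_ramp = structure_function(ramp, max_lag=5)

# Check 2: for uncorrelated noise, D^2 should be close to twice the variance.
rng = np.random.default_rng(2)
noise = rng.normal(0.0, 1.0, size=20000)
d2_noise = structure_function(noise, max_lag=5)

# Convert lags from samples to seconds, assuming 1/128 s sampling.
tau = np.arange(d2_ramp.size) / 128.0
```

For a stationary series with variance $\sigma^2$, the structure function and the autocorrelation function are related by $D^2(\tau) = 2\sigma^2\,[1 - \mathrm{ACF}(\tau)]$, which gives a useful consistency check between the two.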