# Time series representation and analysis

The [astropy.timeseries](http://docs.astropy.org/en/stable/timeseries/index.html) sub-package is a relatively new component of astropy that provides a framework for representing time series, along with an initial set of analysis tools that is expected to grow over time.


<section class="objectives panel panel-warning">
<div class="panel-heading">
<h2><span class="fa fa-certificate"></span> Objectives</h2>
</div>


<div class="panel-body">

<ul>
<li>Create time series</li>
<li>Access data and times</li>
<li>Relative time series</li>
<li>Combining time series</li>
<li>Sorting time series</li>
<li>Folding time series</li>
<li>Periodogram analysis</li>
<li>Reading in data</li>
</ul>

</div>

</section>


## Documentation

This notebook only shows a subset of the functionality in astropy.timeseries. For more information about the features presented below as well as other available features, you can read the
[astropy.timeseries documentation](https://docs.astropy.org/en/stable/timeseries/).

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.rc('image', origin='lower')
plt.rc('figure', figsize=(10, 6))

## Creating time series

The astropy.timeseries sub-package provides two main classes for representing time series, both of which are sub-classes of [QTable](http://docs.astropy.org/en/stable/api/astropy.table.QTable.html#astropy.table.QTable):

* [TimeSeries](http://docs.astropy.org/en/stable/api/astropy.timeseries.TimeSeries.html#astropy.timeseries.TimeSeries), to represent a series of measurements at different (and exact) times.
* [BinnedTimeSeries](http://docs.astropy.org/en/stable/api/astropy.timeseries.BinnedTimeSeries.html#astropy.timeseries.BinnedTimeSeries), to represent a series of measurements in different time bins.

In most cases, [TimeSeries](http://docs.astropy.org/en/stable/api/astropy.timeseries.TimeSeries.html#astropy.timeseries.TimeSeries) is sufficient unless you need to treat the finite bin sizes carefully (for example in X-ray analysis). In this tutorial we will focus on TimeSeries, but if you are interested in representing binned time series, you can take a look at the [astropy.timeseries documentation](http://docs.astropy.org/en/stable/timeseries/index.html).

Let's start off by creating a synthetic time series, and we will look later into how to read existing files.

There are several ways to initialize a time series from scratch. One of the easiest ways if you are dealing with a regularly spaced time series is to specify the start time, the time between samples, and the number of samples:

Another way is to specify the times at which the measurements were made:

In the above cases, we just passed strings for the times, but this is actually a shortcut: the most general way to create the time series, and the one that gives the most control, is to pass a Time object (from astropy.time):

Once the time series has been initialized, you can add columns to it as if it was a table:

## Accessing data and times

Data is accessed in the same way as for regular tables:

For the time column, it is also possible to use the following shortcut:

Since this is a regular time object, you can convert it to different time scales and formats:

Note that this doesn't change the time in the time series, it just returns a new Time object. On the other hand, you can change the default format by setting ``.format``:

## Relative time series

If you want to work with relative time series, you can do the following:

and this can also be done by taking a time series in absolute times and subtracting a reference time:

## Combining time series

Suppose that you have two time series objects with matching columns, as well as another table that holds an additional column:

In [None]:
from astropy import units as u
from astropy.table import Table
from astropy.timeseries import TimeSeries

ts_a = TimeSeries(time_start='2016-03-22T12:30:31',
                  time_delta=3 * u.s,
                  data={'flux': [1, 4, 5, 3, 2] * u.mJy})

ts_b = TimeSeries(time_start='2016-03-22T12:50:31',
                  time_delta=3 * u.s,
                  data={'flux': [4, 3, 1, 2, 3] * u.mJy})

data = Table(data={'temperature': [40., 41., 40., 39., 30.] * u.K})

There are different ways of combining these - for example you can stack one of the time series with the additional data 'horizontally', which means you preserve the number of rows but construct a new time series with more columns:

or you can stack the time series 'vertically' meaning that the columns are preserved but we make a new time series with all the rows from the original two time series:

## Sorting

Since TimeSeries objects are just tables under the hood, you can also sort them as you would a table:

## Folding

A common operation with time series is 'folding' by assuming a given period. Let's start off with a synthetic dataset:

In [None]:
import numpy as np
from astropy import units as u
from astropy.timeseries import TimeSeries

ts_synth = TimeSeries(time_start='2019-01-01T00:00:00',
                      time_delta=np.random.uniform(0, 10, 1000) * u.min)
ts_synth['flux'] = np.cos(2 * np.pi * (ts_synth.time.mjd / 0.49822))
ts_synth['flux'] += np.random.normal(0, 0.2, 1000)

In [None]:
plt.plot(ts_synth.time.mjd, ts_synth['flux'], 'o')

Folding is done with the [TimeSeries.fold](http://docs.astropy.org/en/stable/api/astropy.timeseries.TimeSeries.html#astropy.timeseries.TimeSeries.fold) method:

## Periodogram analysis

In the above example it was clear what period to use since we generated the data, but how would we find the period in the first place if it wasn't known? The astropy.timeseries sub-package includes several algorithms for periodogram analysis, the most general one being the Lomb-Scargle algorithm ([Lomb 1976](http://adsabs.harvard.edu/abs/1976Ap%26SS..39..447L), [Scargle 1982](http://adsabs.harvard.edu/abs/1982ApJ...263..835S)). To use this, first create an instance of the [LombScargle](http://docs.astropy.org/en/stable/api/astropy.timeseries.LombScargle.html#astropy.timeseries.LombScargle) class:

To run the periodogram analysis, the easiest way is to call the [autopower](http://docs.astropy.org/en/stable/api/astropy.timeseries.LombScargle.html#astropy.timeseries.LombScargle.autopower) method:

We can zoom in on the low frequency peak by doing:

The peak is located at:

## Reading in data

Reading in time series can be done via the [TimeSeries.read](http://docs.astropy.org/en/stable/api/astropy.timeseries.TimeSeries.html#astropy.timeseries.TimeSeries.read) method. This accepts all the formats supported by the Table class, and requires that you specify the name of the time column, and optionally the format and scale for the time. We will take a look at this by reading a CSV file that looks like:

    Date,A,B,C,D,E,F,G
    2008-03-18,24.68,164.93,114.73,26.27,19.21,28.87,63.44
    2008-03-19,24.18,164.89,114.75,26.22,19.07,27.76,59.98
    ...
    
To read this file, you can do:

For some well-defined formats (e.g. Kepler/TESS time series files), it is enough to specify the filename and the format (for example ``format='kepler.fits'``), with no need to name the time column.


<section class="challenge panel panel-success">
<div class="panel-heading">
<h2><span class="fa fa-pencil"></span> The Nobel Challenge (Part 1)</h2>
</div>


<div class="panel-body">

<p>This challenge is a little longer than previous ones, and it brings together several tutorials. The aim here is to start from a realistic dataset which consists of radial velocities of <a href="https://en.wikipedia.org/wiki/51_Pegasi">51 Peg</a> and go all the way through to finding a probability distribution for the companion mass. The starting point is a FITS binary table from <a href="http://vizier.u-strasbg.fr/viz-bin/VizieR?-source=J/AJ/153/138">Birkby+ 2017 on Vizier</a>. The file can be found at <a href="data/birkby2017.fits">data/birkby2017.fits</a>.</p>
<ol>
<li>Start off by reading in the data into a TimeSeries object using TimeSeries.read as shown above. Since this table is not in a standard time series format, you will need to specify the name of the time column, and you will also need to specify the time format with <code>time_format</code>. See the <a href="http://vizier.u-strasbg.fr/viz-bin/VizieR?-source=J/AJ/153/138">Vizier page</a> for more details about the available columns, and the documentation for <a href="http://docs.astropy.org/en/stable/api/astropy.timeseries.TimeSeries.html#astropy.timeseries.TimeSeries.read">TimeSeries.read</a> for details about the options to provide. If you want to be extra careful you can also try setting the time scale to the appropriate one based on what you can find on the Vizier page.</li>
<li>Make a plot of the time series. You'll probably notice that strangely the velocities seem to cluster around two main values, and this doesn't look very much like what you would expect. Read Note 2 on the <a href="http://vizier.u-strasbg.fr/viz-bin/VizieR?-source=J/AJ/153/138">Vizier page</a> to find out what might be going on!</li>
<li>Modify the table so that all the velocities are on the same zero point, using standard table operations. Note that you will need to be careful here because the <code>Set</code> column which identifies each dataset/instrument is <em>always</em> 6 characters long and includes trailing spaces (i.e. <code>'HIRES '</code> not <code>'HIRES'</code>). Make a plot of the updated velocities to make sure things look more sensible.</li>
<li>Run a Lomb-Scargle periodogram analysis of the radial velocities to find whether the data has any periodicity. Note that you will probably have to zoom in a bit on the most promising peak to get a more accurate period.</li>
<li>Fold the time series using the period you found in the previous step, and plot the resulting radial velocity curve. If you got the right period, the resulting curve should look much simpler!</li>
</ol>

</div>

</section>



<section class="challenge panel panel-success">
<div class="panel-heading">
<h2><span class="fa fa-pencil"></span> The Nobel Challenge (Part 2)</h2>
</div>


<div class="panel-body">

<ol>
<li>Now use what we learned in the modeling tutorial to fit a sinusoid to the folded radial velocity curve, plot the fit, and determine the amplitude of the velocity. Note that you may need to adjust the initial parameters of the model (in particular the frequency) to get a good fit.</li>
<li>(optional) Try and extract the error in the amplitude from <code>fitter.fit_info</code>. The <code>param_cov</code> key gives the covariance matrix, and the parameters are in the order given by the <code>param_names</code> attribute on the model. Note also that this will not have units attached, but the units will be the same as the parameters. You can also skip this step and assume an error in the amplitude of 0.5 m/s if you prefer.</li>
</ol>
<p>The challenge will continue in the next tutorial!</p>

</div>

</section>


<center><i>This notebook was written by <a href="https://aperiosoftware.com/">Aperio Software Ltd.</a> &copy; 2019, and is licensed under a <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License (CC BY 4.0)</a></i></center>

![cc](https://mirrors.creativecommons.org/presskit/buttons/88x31/svg/by.svg)