# Work With Time Series File For ESSP4
Each data set has one time series file, named `DS#_timeSeries.csv`, where `#` is the data set number.

For example, the file for data set 3 is `DS3_timeSeries.csv`.

In [1]:
import os
from glob import glob
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

In [2]:
# Build the path to the time series file

# Specify data set number
dset_num = 0

# Specify where all the data set folders are, here saved into "essp_dir" variable
essp_dir = '/mnt/home/lzhao/ceph/SolarData/DataSets/Training/'
dset_dir = os.path.join(essp_dir,f'DS{dset_num}')

example_file = os.path.join(dset_dir,f'DS{dset_num}_timeSeries.csv')
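
Since every data set follows the same naming pattern, the path construction above can be wrapped in a small helper. This is only a minimal sketch: `timeseries_path` is a hypothetical name introduced here for illustration, the directory layout (`DS<N>/DS<N>_timeSeries.csv` under `essp_dir`) is assumed from the cell above, and `/path/to/Training` is a placeholder.

```python
import os

def timeseries_path(essp_dir, dset_num):
    """Return the expected path of the time series CSV for one data set.

    Hypothetical helper: assumes each data set lives in <essp_dir>/DS<N>/.
    """
    dset_name = f'DS{dset_num}'
    return os.path.join(essp_dir, dset_name, f'{dset_name}_timeSeries.csv')

# Build the paths for data sets 0 through 2 (numbers chosen arbitrarily)
example_paths = [timeseries_path('/path/to/Training', n) for n in range(3)]
```

Each entry of `example_paths` can then be passed to `pd.read_csv`, provided the file exists on your system.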

## Read In With `pandas`

In [3]:
df = pd.read_csv(example_file)
# Show a snippet of the beginning of the table
df.head()

| Unnamed: 0 | Standard File Name | Time [eMJD] | RV [m/s] | RV Err. [m/s] | Exp. Time [s] | Airmass | BERV [km/s] | Instrument | CCF FWHM [km/s] | CCF FWHM Err. [km/s] | CCF Contrast | CCF Contrast Err. | BIS [km/s] | H-alpha Emission | S Index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | DS0.001_spec_expres.fits | 59332.405937 | 0.102165 | 0.180021 | 169.487 | 1.07889 | -1059077.0 | expres | | | | | | | |
| 1 | DS0.002_spec_expres.fits | 59332.409727 | -0.202009 | 0.180288 | 168.069 | 1.086344 | -1067009.0 | expres | | | | | | | |
| 2 | DS0.003_spec_expres.fits | 59332.42252 | -1.076967 | 0.192353 | 167.863007 | 1.115956 | -1093220.0 | expres | | | | | | | |
| 3 | DS0.004_spec_harpsn.fits | 59332.967814 | 101.044992 | 0.299046 | 300.0 | 1.320948 | -0.05109644 | harpsn | | | | | | | |
| 4 | DS0.005_spec_harpsn.fits | 59332.971814 | 100.919537 | 0.301153 | 300.0 | 1.298025 | -0.0575611 | harpsn | | | | | | | |


In [None]:
# Here are all the column names
df.columns

## Plot RVs and Errors

In [None]:
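plt.figure(figsize=(12,4))
plt.errorbar(df['Time [eMJD]'],df['RV [m/s]'],yerr=df['RV Err. [m/s]'],
             linestyle='None',marker='o',color='k')
# Label the axes with the column names, which include the units
plt.xlabel('Time [eMJD]')
plt.ylabel('RV [m/s]')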
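
The table contains observations from more than one instrument (the `Instrument` column), so it can be handy to split the rows by instrument before plotting or comparing them. The sketch below is one possible way to do that with `groupby`; it runs on a small synthetic table whose values are made up, and the names `demo_df` and `by_inst` are introduced here only for illustration.

```python
import pandas as pd

# Small synthetic table that mimics the relevant columns (values are made up)
demo_df = pd.DataFrame({
    'Time [eMJD]': [1.0, 2.0, 3.0, 4.0],
    'RV [m/s]': [0.1, -0.2, 101.0, 100.9],
    'RV Err. [m/s]': [0.18, 0.18, 0.30, 0.30],
    'Instrument': ['expres', 'expres', 'harpsn', 'harpsn'],
})

# groupby yields one (name, sub-table) pair per unique instrument
by_inst = {inst: sub_df for inst, sub_df in demo_df.groupby('Instrument')}

# Print the number of rows and the median RV for each instrument
for inst, sub_df in by_inst.items():
    print(inst, len(sub_df), sub_df['RV [m/s]'].median())
```

Each `sub_df` can be passed to `plt.errorbar` with `label=inst` to draw the instruments as separate series.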

## Rename Columns
I know the column names are unwieldy. You can rename them in `pandas` by passing a dictionary that maps each original name to its new name.

In [None]:
# Here are just some example new column names
# The key should be the original column name; the value should be the new name
col_dict = {
    'Standard File Name' : 'file',
    'Time [eMJD]' : 'time',
    'RV [m/s]' : 'rv',
    'RV Err. [m/s]' : 'e_rv',
    'Exp. Time [s]' : 'exptime',
    'Airmass' : 'airmass',
    'BERV [km/s]' : 'berv',
    'Instrument' : 'inst',
    'CCF FWHM [km/s]' : 'fwhm',
    'CCF FWHM Err. [km/s]' : 'e_fwhm',
    'CCF Contrast' : 'contrast',
    'CCF Contrast Err.' : 'e_contrast',
    'BIS [km/s]' : 'bis',
    'H-alpha Emission' : 'ha',
    'S Index' : 'sval'
}

In [None]:
df = pd.read_csv(example_file)
renamed_df = df.rename(columns=col_dict)
renamed_df.head()
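
One thing to watch for: by default, `rename` silently ignores dictionary keys that match no column, so a typo in a long column name goes unnoticed. The sketch below shows this on a small synthetic table (made-up values) and how passing `errors='raise'` surfaces the problem instead. The names `demo_df` and `typo_dict` are illustrative only.

```python
import pandas as pd

demo_df = pd.DataFrame({'Time [eMJD]': [1.0, 2.0], 'RV [m/s]': [0.1, -0.2]})

# The second key is misspelled on purpose ('RV [ms]' instead of 'RV [m/s]')
typo_dict = {'Time [eMJD]': 'time', 'RV [ms]': 'rv'}

# Default behavior: keys that match no column are silently ignored,
# so the RV column keeps its original name
quiet_df = demo_df.rename(columns=typo_dict)
print(list(quiet_df.columns))

# errors='raise' turns the same typo into a KeyError
try:
    demo_df.rename(columns=typo_dict, errors='raise')
    typo_caught = False
except KeyError:
    typo_caught = True
```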

## Read Into Dictionary
If you would rather not work with `pandas` objects, I recommend still using `pandas` to read the file and then converting the table into a dictionary of lists. The one drawback is that each time series becomes a Python list rather than a `numpy` array, so array operations such as element-wise arithmetic and boolean masking are not available until you convert it.

Because the CSV file mixes data types (strings and floats), reading it in with `numpy` alone is non-trivial.

In [None]:
df = pd.read_csv(example_file)
data_dict = df.to_dict('list')
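
If you want `numpy` arrays rather than lists, one option is to build the dictionary yourself from the columns. This is a minimal sketch on a small synthetic table (made-up values); `demo_df` and `array_dict` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

demo_df = pd.DataFrame({
    'Time [eMJD]': [1.0, 2.0, 3.0],
    'RV [m/s]': [0.1, -0.2, -1.1],
    'Instrument': ['expres', 'expres', 'harpsn'],
})

# One numpy array per column, keyed by column name
array_dict = {col: demo_df[col].to_numpy() for col in demo_df.columns}

# Array operations such as boolean masking now work directly
is_expres = array_dict['Instrument'] == 'expres'
expres_rvs = array_dict['RV [m/s]'][is_expres]
```

The same comprehension applied to `df` gives an array-valued counterpart to `data_dict`.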

In [None]:
# Plot directly from the dictionary of lists
plt.figure(figsize=(12,4))
plt.errorbar(data_dict['Time [eMJD]'],data_dict['RV [m/s]'],yerr=data_dict['RV Err. [m/s]'],
             linestyle='None',marker='o',color='k')