### **Welcome to the TSI-Toolkit!**
The **goal of this tutorial** is to familiarize you with using tsi-toolkit to import your time series data, to create and compare models that aim to capture the variability in the data, and to apply those models to predict the time series at new, previously unobserved times, either in the future (forecasting) or between observed values (interpolation).

This package was designed to be convenient and intuitive, regardless of your background in Python and machine learning. I deeply appreciate those who voice their frustrations, whether about bugs or about how this package could better serve you. Please feel free to open a new issue on the [GitHub repo's issue page](https://github.com/collinlewin/tsi-toolkit/issues) or to email me, Collin Lewin (clewin@mit.edu).

##### *In this tutorial, we will learn how to...*
1. Import, clean, and plot time series data
2. Model data with Gaussian processes (GPs), starting from the basics
3. Train a GP on our data and select between GP models
4. Predict values of the time series at new times (interpolation or forecasting)
5. Generate a range of products for gaining insight into the data
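To make the idea of GP interpolation and forecasting concrete before using the package, here is a minimal NumPy sketch of GP regression with a Matern-1/2 (exponential) kernel. It is an illustration only and is not how tsi-toolkit is implemented: the function names `matern12_kernel` and `gp_predict`, the fixed hyperparameters, and the synthetic sine data are all assumptions made for this example.

```python
import numpy as np

def matern12_kernel(t1, t2, amplitude=1.0, length_scale=1.0):
    """Matern-1/2 (exponential) covariance between two sets of times."""
    lags = np.abs(t1[:, None] - t2[None, :])
    return amplitude**2 * np.exp(-lags / length_scale)

def gp_predict(train_times, train_values, noise_std, new_times, **kernel_params):
    """Posterior mean and standard deviation of a zero-mean GP at new_times."""
    K = matern12_kernel(train_times, train_times, **kernel_params)
    K += np.diag(np.full(train_times.size, noise_std**2))  # measurement noise
    K_star = matern12_kernel(new_times, train_times, **kernel_params)
    K_new = matern12_kernel(new_times, new_times, **kernel_params)

    mean = K_star @ np.linalg.solve(K, train_values)
    cov = K_new - K_star @ np.linalg.solve(K, K_star.T)
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std

# Synthetic, irregularly sampled data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
train_times = np.sort(rng.uniform(0.0, 10.0, size=30))
train_values = np.sin(train_times) + 0.1 * rng.standard_normal(30)

# One time inside the observed range (interpolation), one beyond it (forecasting)
new_times = np.array([5.0, 12.0])
mean, std = gp_predict(train_times, train_values, 0.1, new_times,
                       amplitude=1.0, length_scale=2.0)
```

In this sketch the predictive uncertainty `std` grows as the new time moves away from the observations, which is why forecasts are generally less certain than interpolated values. In practice the kernel hyperparameters are fit to the data rather than fixed by hand.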

In [None]:
%load_ext autoreload
%autoreload 2

from tsi_toolkit.data_loader import TimeSeries
from tsi_toolkit.gaussian_process import GaussianProcess
from tsi_toolkit.preprocessing import Preprocessing
from tsi_toolkit.power_spectrum import PowerSpectrum

import numpy as np
import matplotlib.pyplot as plt

In [None]:
# Import data

# Import directly from a file
# Data directory on the desktop machine
file_path = '/home/clewin/projects/tsi-toolkit/data/'
# Data directory on the laptop (uncomment to use)
#file_path = '/Users/collinlewin/Research/tsi-toolkit/data/'
lightcurve = TimeSeries(file_path=f'{file_path}NGC5548_U_swift.dat')

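# Import from arrays (columns: times, values, errors)
# The same U-band file is used here so that the plot labels below match the data
data = np.genfromtxt(f'{file_path}NGC5548_U_swift.dat')
lightcurve = TimeSeries(times=data[:,0], values=data[:,1], errors=data[:,2])
lightcurve.plot()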

# Plot with custom arguments
# The U band is centered near 3500 angstroms, in the ultraviolet, so let's make the plot purple
lightcurve.plot(figsize=(8,4),
                xlabel='Modified Heliocentric Julian Date (Days)', 
                ylabel='Flux',
                xlim = (lightcurve.times[0], lightcurve.times[-1]),
                title='NGC 5548 U-band Lightcurve',
                fig_kwargs={'linewidth':28},
                plot_kwargs={'color':'purple', 'fmt':'o', 'lw':1, 'ms':3},
                major_tick_kwargs={'direction':'in', 'top':True, 'right':True, 'length':6, 'width':1},
                minor_tick_kwargs={'direction':'in', 'top':True, 'right':True, 'length':3, 'width':0.5},
                )

In [None]:
# Preprocess the data
# Drop NaN entries first
Preprocessing.remove_nans(lightcurve)

# Try several rolling-window sizes for outlier removal; only the last call uses save=True
Preprocessing.remove_outliers(lightcurve, threshold=1.5, rolling_window=10, plot=True, verbose=True, save=False)
Preprocessing.remove_outliers(lightcurve, threshold=1.5, rolling_window=50, plot=True, verbose=True, save=False)
Preprocessing.remove_outliers(lightcurve, threshold=1.5, rolling_window=None, plot=True, verbose=True, save=True)

# Trim a time segment, detrend with a degree-3 polynomial, and standardize
Preprocessing.trim_time_segment(lightcurve, end_time=56815, plot=True, save=False)
Preprocessing.polynomial_detrend(lightcurve, degree=3, plot=True, save=False)
Preprocessing.standardize(lightcurve)
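As a rough picture of what polynomial detrending and standardization do to the values, here is a standalone NumPy sketch. It is not the toolkit's implementation: the functions below are hypothetical stand-ins written for this example, and the synthetic light curve is made up.

```python
import numpy as np

def polynomial_detrend(times, values, degree=3):
    """Subtract a least-squares polynomial fit of the given degree."""
    # Center the times so the fit stays well conditioned for large date values
    centered = times - times.mean()
    coeffs = np.polyfit(centered, values, deg=degree)
    return values - np.polyval(coeffs, centered)

def standardize(values):
    """Rescale to zero mean and unit standard deviation."""
    mean, std = values.mean(), values.std()
    return (values - mean) / std, mean, std

# Synthetic series: a linear trend plus a sinusoid (illustrative only)
times = np.linspace(56700.0, 56815.0, 200)
values = 5.0 + 0.02 * (times - times[0]) + np.sin(times / 5.0)

detrended = polynomial_detrend(times, values, degree=3)
standardized, original_mean, original_std = standardize(detrended)
```

Keeping `original_mean` and `original_std` makes it possible to convert model predictions back to the original units with `prediction * original_std + original_mean`.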

In [None]:
# Train a GP with a Matern-1/2 kernel; verbose=True shows the progress of the optimization
gp_model = GaussianProcess(timeseries=lightcurve,
                 kernel_form='Matern12', white_noise=True, train_iter=1000, verbose=True)

# Train the same model without the progress output
gp_model = GaussianProcess(timeseries=lightcurve,
                 kernel_form='Matern12', white_noise=True, train_iter=1000, verbose=False)

# Print the results explicitly: a notebook cell only displays its last expression
print(gp_model.get_hyperparameters())
print(gp_model.akaike_inf_crit())

gp_model.plot()

In [None]:
gp_model = GaussianProcess(timeseries=lightcurve,
                 kernel_form='auto', white_noise=True, train_iter=1000, verbose=True)

In [None]:
print(gp_model.get_hyperparameters())
print(gp_model.akaike_inf_crit())
print(lightcurve.unstandard_mean)

#gp_model.save_model('gp_model.pkl')
#gp_model.load_model('gp_model.pkl')
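For context on comparing models by the Akaike information criterion (AIC), the standard definition is AIC = 2k - 2 ln(L), where k is the number of fitted parameters and L is the maximized likelihood; lower values are preferred. The sketch below applies that formula to two made-up candidates. The function name, the model names, and the log-likelihood values are hypothetical, and this is not a claim about how `akaike_inf_crit()` is computed internally.

```python
def akaike_information_criterion(log_likelihood, num_params):
    """Standard AIC: penalize the fit quality by the number of parameters."""
    return 2 * num_params - 2 * log_likelihood

# Hypothetical candidates: (maximized log-likelihood, number of hyperparameters)
candidates = {
    "model_a": (-120.0, 3),
    "model_b": (-118.5, 5),
}

scores = {name: akaike_information_criterion(log_like, k)
          for name, (log_like, k) in candidates.items()}

# The preferred model is the one with the lowest AIC
best = min(scores, key=scores.get)
```

Here `model_b` fits slightly better, but its two extra parameters outweigh the gain, so `model_a` has the lower AIC.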

In [None]:
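# Evenly spaced grid of 1000 times spanning the observed light curve
prediction_times = np.linspace(lightcurve.times[0], lightcurve.times[-1], 1000)
# Draw 1000 samples from the trained GP on that grid
samples = gp_model.sample(prediction_times, num_samples=1000)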

In [None]:
# Compute and plot the power spectrum of the GP samples, then bin it
power_spectrum = PowerSpectrum(times=prediction_times, values=samples, norm=True)
power_spectrum.plot()
power_spectrum.bin(20, save=False, plot=True)
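To illustrate the kind of calculation behind a power spectrum of evenly sampled data, here is a self-contained NumPy sketch of a periodogram followed by logarithmic frequency binning. It is an assumption-laden illustration, not the `PowerSpectrum` implementation: the normalization (chosen so the integrated power equals the variance), the use of 20 logarithmic bins, and both function names are choices made for this example.

```python
import numpy as np

def periodogram(times, values):
    """Periodogram of an evenly sampled series; integrates to the variance."""
    n = values.size
    dt = times[1] - times[0]
    freqs = np.fft.rfftfreq(n, d=dt)[1:]               # drop the zero frequency
    fourier = np.fft.rfft(values - values.mean())[1:]
    power = 2.0 * dt / n * np.abs(fourier) ** 2
    return freqs, power

def bin_spectrum(freqs, power, num_bins=20):
    """Average the power in logarithmically spaced frequency bins."""
    edges = np.logspace(np.log10(freqs[0]), np.log10(freqs[-1]), num_bins + 1)
    index = np.clip(np.digitize(freqs, edges) - 1, 0, num_bins - 1)
    binned_freqs, binned_power = [], []
    for b in range(num_bins):
        in_bin = index == b
        if in_bin.any():                               # skip empty bins
            binned_freqs.append(freqs[in_bin].mean())
            binned_power.append(power[in_bin].mean())
    return np.array(binned_freqs), np.array(binned_power)

# Synthetic test signal: a sinusoid at 0.125 cycles per unit time
times = np.arange(1024) * 0.5
values = np.sin(2.0 * np.pi * 0.125 * times)

freqs, power = periodogram(times, values)
peak_frequency = freqs[np.argmax(power)]
binned_freqs, binned_power = bin_spectrum(freqs, power, num_bins=20)
```

With many GP samples, one common approach is to compute a periodogram for each sample and average them, which reduces the scatter of the estimate at each frequency.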