# II. Sampling and fitting
## 1. Existing sampling

In this notebook we look at the time sampling of the telemetry files. We want to compare it to the lightcurve time sampling so that we know how to proceed.

In [1]:
import numpy as np
import pandas as pd
#import pandas_profiling
import matplotlib.pyplot as plt
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import seaborn as sns
sns.set_context('talk')

In [2]:
import sys, os
sys.path.append(os.path.abspath("../code/"))

Let's read in the temperature data.

In [3]:
from thermals import get_temperature_data_from_suffix, pre_process_temperature_data

In [4]:
%%time
raw_df = get_temperature_data_from_suffix('BoardTemperatures')
df = pre_process_temperature_data(raw_df, add_jitter=False)

CPU times: user 16.7 s, sys: 3.4 s, total: 20.1 s
Wall time: 8.82 s


What is the typical time sampling of each telemetry source for each campaign?

In [5]:
df['time_delta'] = df.index.to_series().diff().dt.total_seconds()
df.groupby('campaign').time_delta.describe()[['count', '25%', '50%', '75%']]

| campaign | count | 25% (s) | 50% (s) | 75% (s) |
|---|---|---|---|---|
| 0 | 72535.0 | 59.616 | 59.616 | 59.616001 |
| 1 | 114585.0 | 59.616 | 59.616 | 59.616001 |
| 2 | 56773.0 | 119.232 | 120.096 | 120.096 |
| 3 | 49835.0 | 119.232 | 120.096 | 120.096 |
| 4 | 51071.0 | 119.232 | 120.096 | 120.096 |
| 5 | 53956.0 | 119.232 | 120.096 | 120.096 |
| 6 | 56910.0 | 119.232 | 120.096 | 120.096 |
| 7 | 59533.0 | 119.232 | 120.096 | 120.096 |
| 8 | 56722.0 | 119.232 | 120.096 | 120.096 |
| 12 | 47851.0 | 119.232 | 120.096 | 120.096 |


It looks like campaigns 2 onward have roughly 2-minute telemetry sampling (median 120.096 s), while campaigns 0 and 1 have roughly 1-minute sampling (median 59.616 s). Each campaign shows slight perturbations in the time sampling: the quartiles differ by less than a second.
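To see where such perturbations come from, one option is to count how often each distinct time step occurs rather than only looking at quartiles. The sketch below is a minimal illustration on synthetic timestamps: `demo_df` and its alternating 119.232 s / 120.096 s spacing are assumptions made up for the example, not the real telemetry.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one campaign of telemetry (hypothetical spacing pattern,
# chosen only to mimic the two step sizes seen in the quartiles).
steps = np.tile([119.232, 120.096, 120.096], 100)
index = pd.Timestamp("2015-01-01") + pd.to_timedelta(np.cumsum(steps), unit="s")
demo_df = pd.DataFrame({"campaign": 2}, index=index)

# Same time-step computation as in the notebook cell above.
demo_df["time_delta"] = demo_df.index.to_series().diff().dt.total_seconds()

# Round to the millisecond so floating-point noise does not split one
# nominal step size into several nearly identical values.
rounded = demo_df["time_delta"].round(3)

# Count occurrences of each distinct step size within each campaign.
delta_counts = rounded.groupby(demo_df["campaign"]).value_counts()
print(delta_counts)
```

Applied to the real `df`, the same two lines after the `time_delta` computation would show whether each campaign mixes a small number of discrete step sizes or has a continuous spread.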

How about the other files?

In [6]:
file_names = ['BoardTemperatures', 'TelescopeTemperatureTH_2', 'TelescopeTemperatureTH_1', 'TelescopeTemperaturePED']

In [7]:
median_time_samples = pd.DataFrame()

In [8]:
%%time
for file_name in file_names:
    raw_df = get_temperature_data_from_suffix(file_name)
    df = pre_process_temperature_data(raw_df, add_jitter=False)
    df['time_delta'] = df.index.to_series().diff().dt.total_seconds()
    median_time_samples[file_name] = df.groupby('campaign').time_delta.median()

CPU times: user 59.4 s, sys: 11.1 s, total: 1min 10s
Wall time: 33.7 s


In [9]:
median_time_samples

| campaign | BoardTemperatures | TelescopeTemperatureTH_2 | TelescopeTemperatureTH_1 | TelescopeTemperaturePED |
|---|---|---|---|---|
| 0 | 59.616 | 57.888 | 57.888 | 59.616 |
| 1 | 59.616 | 57.888 | 57.888 | 59.616 |
| 2 | 120.096 | 118.367999 | 118.367999 | 120.096 |
| 3 | 120.096 | 118.367999 | 118.367999 | 120.096 |
| 4 | 120.096 | 118.367999 | 118.367999 | 120.096 |
| 5 | 120.096 | 118.367999 | 118.367999 | 120.096 |
| 6 | 120.096 | 118.367999 | 118.367999 | 120.096 |
| 7 | 120.096 | 118.367999 | 118.367999 | 120.096 |
| 8 | 120.096 | 118.367999 | 118.367999 | 120.096 |
| 12 | 120.096 | 118.367999 | 118.367999 | 120.096 |


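Roughly the same outcome here: in every file, campaigns 2 onward have about 2-minute cadence telemetry and campaigns 0 and 1 about 1-minute cadence. The `TelescopeTemperatureTH` files are sampled at a slightly shorter interval than the board temperatures, by about 1.7 s, while `TelescopeTemperaturePED` matches the board temperatures.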
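The size of that offset can be read off directly by subtracting the `BoardTemperatures` column from the others. The sketch below retypes the medians for campaigns 0 and 2 from the table above into a small frame (named `medians` here only for illustration) so that it runs on its own; on the real `median_time_samples` the `sub` call would be the same.

```python
import pandas as pd

# Median time steps in seconds, copied from the table above for two campaigns.
medians = pd.DataFrame(
    {
        "BoardTemperatures": [59.616, 120.096],
        "TelescopeTemperatureTH_2": [57.888, 118.367999],
        "TelescopeTemperatureTH_1": [57.888, 118.367999],
        "TelescopeTemperaturePED": [59.616, 120.096],
    },
    index=pd.Index([0, 2], name="campaign"),
)

# Offset of each file's median step relative to the board temperatures.
# axis=0 aligns the subtracted Series on the campaign index.
offsets = medians.sub(medians["BoardTemperatures"], axis=0)
print(offsets.round(3))
```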

A 2-minute cadence is much faster than the 30-minute long cadence time series, but half as fast as the 1-minute short cadence photometry.

This makes the resampling a bit tricky: for short cadence we should interpolate the coarser thermal telemetry onto the finer photometry time points, whereas for long cadence we should interpolate the coarser long cadence series onto the finer telemetry reference times.
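The two directions of interpolation can be sketched with `np.interp`, which performs simple linear interpolation. Everything below is an assumption for illustration: the idealized time axes (exactly 60 s, 120 s and 1800 s steps), the toy linear signals, and the choice of linear interpolation itself, which may not be what the final analysis uses.

```python
import numpy as np

# Hypothetical, idealized time axes in seconds over one hour.
t_telemetry = np.arange(0, 3601, 120, dtype=float)   # ~2-minute telemetry
t_short = np.arange(0, 3601, 60, dtype=float)        # ~1-minute short cadence
t_long = np.arange(0, 3601, 1800, dtype=float)       # ~30-minute long cadence

# Toy signals (made up): a slow temperature drift and a slow flux trend.
temperature = 20.0 + 0.001 * t_telemetry
flux_long = 1.0 + 1e-5 * t_long

# Short cadence case: telemetry is the coarser series, so interpolate
# the temperature onto the finer photometry time points.
temp_on_short = np.interp(t_short, t_telemetry, temperature)

# Long cadence case: telemetry is the finer series, so interpolate
# the long cadence series onto the telemetry reference times.
flux_on_telemetry = np.interp(t_telemetry, t_long, flux_long)

print(temp_on_short.shape, flux_on_telemetry.shape)
```

With real data the time axes would first need to be converted to a common numeric scale (for example seconds since a shared reference epoch), since `np.interp` expects monotonically increasing numeric `xp` values.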