# Example of how to use the code to plot simple statistics
These methods calculate and visualise basic statistics of the data: confidence intervals, violin plots and box plots.

You can compare how the distributions change depending on the **sampling**:
* Irregular Sampling: no sample has been dropped other than those that were all NaN or duplicated.
* Hourly Sampling: the data is downsampled to one reading per hour. The downsampled values are the mean, max, min or std of IOB, COB or BG. Hours that did not have at least one reading are not included.
* Daily Sampling: the data is downsampled to one reading per day. The downsampled values are the mean, max, min or std of IOB, COB or BG. Days that did not have at least one reading every 3h are not included.
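
The two downsampling rules above can be sketched with plain pandas. This is only an illustration under stated assumptions, not the preprocessing code of this repository: the function names `downsample_hourly` and `days_with_reading_every_3h`, the toy series and the reading of "one reading every 3h" as "each of the eight 3-hour blocks of the day contains a reading" are all hypothetical.

```python
import pandas as pd


def downsample_hourly(series):
    """Aggregate readings per hour and drop hours that had no reading."""
    agg = series.resample("h").agg(["mean", "max", "min", "std", "count"])
    # std is NaN for hours that contain a single reading
    return agg[agg["count"] > 0].drop(columns="count")


def days_with_reading_every_3h(series):
    """Return the dates on which all eight 3-hour blocks contain a reading."""
    blocks = pd.DataFrame({
        "date": series.index.normalize(),  # midnight of each reading's day
        "block": series.index.hour // 3,   # 0..7, one id per 3-hour block
    })
    blocks_per_day = blocks.groupby("date")["block"].nunique()
    return blocks_per_day[blocks_per_day == 8].index


# Toy data (hypothetical): day 1 has a reading every 3 hours,
# day 2 only has two readings within the same hour.
times = list(pd.date_range("2023-01-01 00:30", periods=8, freq="3h"))
times += [pd.Timestamp("2023-01-02 08:00"), pd.Timestamp("2023-01-02 08:20")]
bg = pd.Series([float(v) for v in range(10)], index=pd.DatetimeIndex(times), name="bg")

hourly = downsample_hourly(bg)
full_days = days_with_reading_every_3h(bg)
print(hourly)
print(full_days)
```

With this toy data only the first day would qualify for daily sampling, while hourly sampling keeps every hour that contains at least one reading.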

Many time series (TS) methods require regular time series of equal length. The sampled data can therefore be translated into the following time series:
* Irregular time series: no resampling into a time series is done
* Daily time series: each day that has at least one reading for each hour forms a daily time series
* Weekly time series: each calendar week that has at least one reading for each day of the week forms a weekly time series
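
To make the completeness rules above concrete, here is a minimal pandas sketch. It is not the implementation of `TranslateIntoTimeseries`: the helpers `keep_full_days` and `keep_full_weeks`, the column name and the toy data are hypothetical, and "calendar week" is assumed to mean an ISO week (Monday to Sunday).

```python
import pandas as pd


def keep_full_days(hourly_df):
    """Keep only rows of days that have a reading for each of the 24 hours."""
    day = hourly_df.index.normalize()
    hours_per_day = pd.Series(hourly_df.index.hour).groupby(day).nunique()
    full_days = hours_per_day[hours_per_day == 24].index
    return hourly_df[day.isin(full_days)]


def keep_full_weeks(daily_df):
    """Keep only rows of ISO weeks that have a reading for all 7 week days."""
    iso = daily_df.index.isocalendar().astype(int).reset_index(drop=True)
    days_per_week = iso.groupby(["year", "week"])["day"].transform("nunique")
    return daily_df[(days_per_week == 7).to_numpy()]


# Toy hourly data (hypothetical): one complete day plus 5 hours of the next day
hourly_index = pd.date_range("2023-01-02 00:00", periods=29, freq="h")
hourly_df = pd.DataFrame({"bg mean": list(range(29))}, index=hourly_index)

# Toy daily data (hypothetical): Monday 2023-01-02 to Tuesday 2023-01-10,
# i.e. one complete ISO week followed by two days of the next week
daily_index = pd.date_range("2023-01-02", periods=9, freq="D")
daily_df = pd.DataFrame({"bg mean": list(range(9))}, index=daily_index)

daily_series_rows = keep_full_days(hourly_df)
weekly_series_rows = keep_full_weeks(daily_df)
print(len(daily_series_rows), len(weekly_series_rows))
```

Each surviving day then yields a series of length 24 and each surviving week a series of length 7, which is what gives the equal-length time series mentioned above.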

Prerequisite:
You need to generate the required .csv files from the raw dataset for this to work. See the documentation on how to do that.

In [None]:
from src.configurations import Irregular, Hourly, Daily, GeneralisedCols, Configuration
from src.read_preprocessed_df import ReadPreprocessedDataFrame
from src.stats import Stats
from src.translate_into_timeseries import TimeColumns, TranslateIntoTimeseries, IrregularTimeseries, DailyTimeseries, WeeklyTimeseries

In [None]:
# read data
zip_id = ''  # set a zip ID; if this is set to None, the df will contain the zip IDs of all the people in the dataset
irregular_sampling = Irregular()
irregular_raw_df = ReadPreprocessedDataFrame(sampling=irregular_sampling, zip_id=zip_id).df

hourly_sampling = Hourly()
hourly_raw_df = ReadPreprocessedDataFrame(sampling=hourly_sampling, zip_id=zip_id).df

daily_sampling = Daily()
daily_raw_df = ReadPreprocessedDataFrame(sampling=daily_sampling, zip_id=zip_id).df

In [None]:
# other configs
value_columns_not_resampled = [GeneralisedCols.iob, GeneralisedCols.cob, GeneralisedCols.bg]
all_time_columns = [TimeColumns.hour, TimeColumns.week_day, TimeColumns.month, TimeColumns.year]
mean_iob_cob_bg_cols = Configuration.resampled_mean_columns()

daily_ts = DailyTimeseries()  # only full days of data are kept as a ts for each day
weekly_ts = WeeklyTimeseries()  # only full calendar weeks of data are kept as a ts per week
irregular_ts = IrregularTimeseries()  # whatever sampling has been done is kept, but no further reshaping into a ts is done

## Confidence intervals, violin plots and box plots for processed raw data

Irregularly sampled original data, preprocessed to remove all-NaN and duplicated samples and not shaped into a time series

In [None]:
translate = TranslateIntoTimeseries(irregular_raw_df, irregular_ts, value_columns_not_resampled)
stats = Stats(translate.processed_df, irregular_sampling, irregular_ts, all_time_columns, value_columns_not_resampled)

stats.plot_confidence_interval()
stats.plot_violin_plot()
stats.plot_box_plot()
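
How `Stats.plot_confidence_interval()` computes its interval is not shown in this notebook. As a rough illustration of the idea, the sketch below computes a normal-approximation 95% confidence interval of the mean for each hour of the day. The function `mean_confidence_interval`, the column names and the toy data are hypothetical and not part of `src.stats`.

```python
import numpy as np
import pandas as pd


def mean_confidence_interval(df, value_col, group_col, z=1.96):
    """Mean and normal-approximation confidence interval per group.

    z=1.96 corresponds to a 95% interval under the normal approximation.
    """
    summary = df.groupby(group_col)[value_col].agg(["mean", "std", "count"])
    # Standard error of the mean, scaled by z, gives the half width
    half_width = z * summary["std"] / np.sqrt(summary["count"])
    summary["ci_low"] = summary["mean"] - half_width
    summary["ci_high"] = summary["mean"] + half_width
    return summary


# Toy data (hypothetical): four BG readings for hour 0 and four for hour 1
toy = pd.DataFrame({
    "hour": [0, 0, 0, 0, 1, 1, 1, 1],
    "bg": [4.0, 6.0, 4.0, 6.0, 8.0, 8.0, 8.0, 8.0],
})

summary = mean_confidence_interval(toy, value_col="bg", group_col="hour")
print(summary)
```

The normal approximation is only one option; with few readings per group a t-distribution or a bootstrap interval may be more appropriate.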

## Confidence intervals, violin plots and box plots for hourly sampled data

1) Data shaped into daily time series from hourly sampled data

In [None]:
translate = TranslateIntoTimeseries(hourly_raw_df, daily_ts, mean_iob_cob_bg_cols)
stats = Stats(translate.processed_df, hourly_sampling, daily_ts, all_time_columns, mean_iob_cob_bg_cols)

stats.plot_confidence_interval()
stats.plot_violin_plot()
stats.plot_box_plot()

2) Data sampled into hourly samples but not shaped into a ts

In [None]:
translate = TranslateIntoTimeseries(hourly_raw_df, irregular_ts, mean_iob_cob_bg_cols)
stats = Stats(translate.processed_df, hourly_sampling, irregular_ts, all_time_columns, mean_iob_cob_bg_cols)

stats.plot_confidence_interval()
stats.plot_violin_plot()
stats.plot_box_plot()

## Confidence intervals, violin plots and box plots for daily sampled data

1) Data shaped into weekly time series from daily sampled data

In [None]:
translate = TranslateIntoTimeseries(daily_raw_df, weekly_ts, mean_iob_cob_bg_cols)
stats = Stats(translate.processed_df, daily_sampling, weekly_ts, all_time_columns, mean_iob_cob_bg_cols)

stats.plot_confidence_interval()
stats.plot_violin_plot()
stats.plot_box_plot()

2) Data sampled daily but not shaped into any ts

In [None]:
translate = TranslateIntoTimeseries(daily_raw_df, irregular_ts, mean_iob_cob_bg_cols)
stats = Stats(translate.processed_df, daily_sampling, irregular_ts, all_time_columns, mean_iob_cob_bg_cols)

stats.plot_confidence_interval()
stats.plot_violin_plot()
stats.plot_box_plot()