## Analytics Starter

This starter notebook demonstrates some quick, exploratory analyses that can be carried out on a numeric dataset containing a time series. A sample dataset of MMC stock prices is used.

In [None]:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
from src.data_loading.mmc_stock import LocalMMC
from src.analysis.example_analysis import get_rolling_mean
from src.data_processing.example_data_processing import create_timeseries

%matplotlib inline

In [None]:
from labskit.utilities.ow_colors import set_ow_colors

In [None]:
set_ow_colors()

In [None]:
# load our labskit specific settings
import labskit
project = labskit.Settings()

In [None]:
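csv_source = LocalMMC(project).data
# add a rolling mean of the closing price (window=10) as a new column
df = csv_source.assign(
    rolling_mean=lambda x: get_rolling_mean(x, window=10, column='close')
)
# convert the frame into a time series
df_timeseries = create_timeseries(df)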
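
The helpers `get_rolling_mean` and `create_timeseries` live in this project's `src` package and are not shown here. As a rough sketch only, the same steps could look like this in plain pandas. The functions `rolling_mean_sketch` and `timeseries_sketch`, the `date` column name, and the sample prices are all hypothetical stand-ins, not the project's actual code or MMC data:

```python
import pandas as pd


def rolling_mean_sketch(frame, window, column):
    # Hypothetical stand-in for get_rolling_mean: trailing mean over `window` rows
    return frame[column].rolling(window).mean()


def timeseries_sketch(frame, date_column="date"):
    # Hypothetical stand-in for create_timeseries:
    # parse the date column, use it as the index, and sort chronologically
    out = frame.copy()
    out[date_column] = pd.to_datetime(out[date_column])
    return out.set_index(date_column).sort_index()


# Synthetic prices for illustration only
raw = pd.DataFrame({
    "date": ["2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"],
    "close": [10.0, 12.0, 14.0, 16.0],
})

demo = raw.assign(
    rolling_mean=lambda x: rolling_mean_sketch(x, window=2, column="close")
)
demo_ts = timeseries_sketch(demo)
```

The first `window - 1` rows of the rolling mean are `NaN`, because the window is not yet full at that point.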

Indexing by date range:

In [None]:
# sort the index so the slice can run from the earlier date to the later one
df_timeseries.sort_index().loc[pd.Timestamp('2015-12-06'):pd.Timestamp('2017-12-12')].head()
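
To make the slicing behaviour concrete, here is a small sketch on synthetic data (the `prices` frame is made up for illustration). On a chronologically ascending `DatetimeIndex`, a `.loc` slice includes both endpoints, and putting the later date first yields an empty result:

```python
import pandas as pd

# Ten consecutive days of synthetic prices
idx = pd.date_range("2017-01-01", periods=10, freq="D")
prices = pd.DataFrame({"close": range(10)}, index=idx)

# Date strings work as slice bounds; both ends are inclusive
window = prices.loc["2017-01-03":"2017-01-05"]

# Later date first on an ascending index selects nothing
reversed_window = prices.loc["2017-01-05":"2017-01-03"]
```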

Resample by date frequency:

In [None]:
# '5D' is the canonical alias for five calendar days
df_timeseries.resample('5D').mean().head()

In [None]:
df_timeseries.resample('W').agg(['mean', 'sum']).head()
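
How the bins are formed is easier to see on a tiny synthetic series (the `daily` frame below is made up for illustration). With pandas defaults, `'5D'` bins start at the first day in the data and are labelled by their start, while `'W'` bins end on Sunday and are labelled by that Sunday:

```python
import pandas as pd

# Two full weeks of synthetic daily data; 2017-01-02 is a Monday
idx = pd.date_range("2017-01-02", periods=14, freq="D")
daily = pd.DataFrame({"close": [float(i) for i in range(14)]}, index=idx)

# Five-day bins, labelled by the first day of each bin
five_day = daily.resample("5D").mean()

# Weekly bins ending on Sunday, with several aggregations at once;
# the result has MultiIndex columns such as ("close", "mean")
weekly = daily.resample("W").agg(["mean", "sum"])
```

Passing a list to `agg` produces one sub-column per aggregation, so individual results are selected with a tuple, for example `weekly[("close", "sum")]`.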

Summary statistics of the data:

In [None]:
df_timeseries.describe()

It is also possible to do simple plots with pandas:

In [None]:
column = 'close'
df_timeseries[column].plot()
# rolling(28) averages the last 28 rows, which equals 28 days only if there is one row per day
df_timeseries[column].rolling(28).mean().plot(label='28 Day Rolling Mean')

plt.legend(bbox_to_anchor=(1., 1.02))
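
A rolling window can be defined by row count or by a time offset, and the two differ when the data has gaps such as weekends. A small sketch on synthetic business-day data (the `close` series below is made up, not MMC prices):

```python
import pandas as pd

# Ten business days starting Monday 2017-01-02 (weekends are skipped)
idx = pd.bdate_range("2017-01-02", periods=10)
close = pd.Series(range(10), index=idx, dtype="float64", name="close")

# Window of 3 observations: NaN until three rows are available
by_rows = close.rolling(3).mean()

# Window of 3 calendar days: uses whatever rows fall inside the span,
# so the Monday after a weekend averages only itself
by_days = close.rolling("3D").mean()
```

Which form fits depends on whether "28 days" in a label is meant as 28 observations or 28 calendar days.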