# Check timeseries

This notebook runs a few quick diagnostics on the time series data
to check that nothing obvious is missing.

It uses the `tsam_df_dict.pkl` file generated by `representatove-period-processing.ipynb`,
so please refer to that notebook for instructions on producing it.
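
Judging by how the cells below use it, `tsam_df_dict` appears to be a dictionary that maps each year to a pandas DataFrame with one column per timeseries. That layout is inferred from this notebook, not confirmed by the generating notebook. Under that assumption, a structural check before computing any statistics could look roughly like the sketch below. The helper `check_df_dict` and the toy data are hypothetical.

```python
import pandas as pd

def check_df_dict(df_dict):
    """Return a list of human-readable problems found in a {year: DataFrame} dict."""
    issues = []
    reference_columns = None  # Columns of the first valid year, used for comparison.
    for year, df in df_dict.items():
        if not isinstance(df, pd.DataFrame):
            issues.append(f"{year}: not a DataFrame")
            continue
        if df.empty:
            issues.append(f"{year}: empty DataFrame")
        if df.isna().any().any():
            issues.append(f"{year}: contains missing values")
        if reference_columns is None:
            reference_columns = set(df.columns)
        elif set(df.columns) != reference_columns:
            issues.append(f"{year}: columns differ from the first year")
    return issues

# Hypothetical toy data standing in for the real pickle contents.
toy = {
    2020: pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]}),
    2021: pd.DataFrame({"a": [1.0, float("nan")]}),
}
issues = check_df_dict(toy)
print(issues)
```

An empty list means every year has the same columns and no missing values, which is a reasonable precondition for the diagnostics that follow.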

In [None]:
## Package config

import pickle # Load TSAM formatted data for reuse.
import pandas as pd # Pandas for dataframe stuff
from itertools import product

In [None]:
## Load TSAM data

with open("tsam_df_dict.pkl", "rb") as file: # Load previously saved data.
    tsam_df_dict = pickle.load(file)

In [None]:
## Calculate timeseries diagnostics.

stats = ['min', 'median', 'mean', 'max'] # Statistical indicators used.
df_list = [] # Initialize list for collecting statistics.
for (year, data) in tsam_df_dict.items(): # Loop over all years and data
    df = data.copy() # Copy data so the stored DataFrame is not modified.
    timeseries = df.columns # Record timeseries names
    df['year'] = year # Set year in data.
    df = df.groupby('year').agg(stats) # Calculate stats per year.
    df_list.append(df) # Record stats.
diagnostics = pd.concat(df_list).sort_index().transpose() # Concatenate, sort, and transpose diagnostics.
years = diagnostics.columns # Record years.
diagnostics

In [None]:
## Check which timeseries are not really timeseries

problems = [] # Initialize list of problematic timeseries.
for (ts, y) in product(timeseries, years):
    series = diagnostics.loc[ts][y] # Access diagnostics for this timeseries and year
    if (series == series.iloc[0]).all(): # If all diagnostics are equal
        problems.append((ts, y)) # Append to problems

# Check the set of problem timeseries
{ts for (ts, _) in problems}

In [None]:
## Check Finnish timeseries

data = diagnostics.reset_index().rename(columns={'level_1': 'stat'})
data.loc[data['ts_name'].str.contains('FI')]

# Seems ok? At least all of them seem like timeseries

In [None]:
## Check mean values of the Finnish timeseries

df = data
#df = df.loc[data.ts_name.str.contains('ts_influx')]
df = df.loc[df.ts_name.str.contains('FI')]
df = df.loc[df.stat == 'mean']
df

Seems to finally contain all time series?
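
As a cross-check on the "not really timeseries" test above, flat series can also be detected directly from the raw data instead of from the four summary statistics. The following is a minimal sketch, assuming `tsam_df_dict` maps years to DataFrames with one column per timeseries. The function `find_constant_series` and the toy data are hypothetical.

```python
import pandas as pd

def find_constant_series(df_dict):
    """Return a sorted list of (timeseries, year) pairs whose values never change."""
    constant = []
    for year, df in df_dict.items():
        # nunique() counts distinct non-NaN values per column;
        # a column with at most one distinct value is flat.
        n_unique = df.nunique()
        constant.extend((ts, year) for ts in n_unique.index[n_unique <= 1])
    return sorted(constant)

# Hypothetical toy data standing in for tsam_df_dict.
toy = {
    2020: pd.DataFrame({"FI_demand": [1.0, 2.0, 3.0], "FI_flat": [5.0, 5.0, 5.0]}),
    2021: pd.DataFrame({"FI_demand": [2.0, 2.0, 2.0], "FI_flat": [5.0, 5.0, 5.0]}),
}
constant_pairs = find_constant_series(toy)
print(constant_pairs)
```

Unlike comparing `min`, `median`, `mean`, and `max`, this approach is not affected by floating-point rounding in the mean, so the two methods may disagree on borderline cases.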