# tsam - 2. Example
Example usage of the time series aggregation module (tsam)
Date: 29.06.2019

Author: Maximilian Hoffmann

Import pandas and the relevant time series aggregation class

In [None]:
%load_ext autoreload
%autoreload 2
import copy
import os
import pandas as pd
import matplotlib.pyplot as plt
import tsam.timeseriesaggregation as tsam
%matplotlib inline

### Input data 

Read in time series from testdata.csv with pandas

In [None]:
raw = pd.read_csv('testdata.csv', index_col = 0)

Show a slice of the dataset

In [None]:
raw.head()

Show the shape of the raw input data: 4 types of time series (GHI, Temperature, Wind and Load) for every hour of a year

In [None]:
raw.shape

Create a plot function for a visual comparison of the time series. Note that the colorbar label is hard-coded for wind speed.

In [None]:
def plotTS(data, periodlength, vmin, vmax):
    fig, axes = plt.subplots(figsize = [6, 2], dpi = 100, nrows = 1, ncols = 1)
    stacked, timeindex = tsam.unstackToPeriods(copy.deepcopy(data), periodlength)
    cax = axes.imshow(stacked.values.T, interpolation = 'nearest', vmin = vmin, vmax = vmax)
    axes.set_aspect('auto')  
    axes.set_ylabel('Hour')
    plt.xlabel('Day')

    fig.subplots_adjust(right = 1.2)
    cbar=plt.colorbar(cax)    
    cbar.set_label('Wind [m/s]')
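
The plot function relies on reshaping the flat hourly series into one row per period. The following is a minimal sketch of that idea in plain Python, not tsam's actual `unstackToPeriods` implementation; the helper name `unstack_to_periods` and the dummy values are hypothetical, and the sketch assumes the series length is an exact multiple of the period length.

```python
def unstack_to_periods(values, period_length):
    """Split a flat sequence into consecutive periods of equal length."""
    if len(values) % period_length != 0:
        # Simplifying assumption: no padding of incomplete periods.
        raise ValueError("series length must be a multiple of period_length")
    return [
        values[start:start + period_length]
        for start in range(0, len(values), period_length)
    ]


hourly = list(range(48))  # two days of dummy hourly values
periods = unstack_to_periods(hourly, 24)
print(len(periods), len(periods[0]))  # 2 24
```

Each inner list corresponds to one day, which is what the color plot shows as one column.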

Plot an example series - in this case the wind speed

In [None]:
plotTS(raw['Wind'], 24, vmin = raw['Wind'].min(), vmax = raw['Wind'].max())

### Hierarchical aggregation

Initialize an aggregation class object with hierarchical clustering as the method for eight typical days, without any integration of extreme periods. Alternative clusterMethod options include 'averaging', 'k_means' and 'k_medoids'.

In [None]:
aggregation = tsam.TimeSeriesAggregation(raw, noTypicalPeriods = 8, hoursPerPeriod = 24, 
                                        clusterMethod = 'hierarchical', representationMethod='meanRepresentation')

Create the typical periods

In [None]:
typPeriods = aggregation.createTypicalPeriods()

In [None]:
typPeriods

Show shape of typical periods: 4 types of timeseries for 8*24 hours

In [None]:
typPeriods.shape
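
The 8*24 rows follow from storing one representative profile per cluster. A minimal sketch of a mean-based representation is given below, assuming each typical period is simply the hour-by-hour average of the periods assigned to its cluster. The helper `mean_typical_periods` and the toy values are hypothetical; tsam's own implementation handles further steps such as normalization, which are left out here.

```python
def mean_typical_periods(periods, cluster_order):
    """Average all periods that share a cluster label, hour by hour."""
    grouped = {}
    for period, label in zip(periods, cluster_order):
        grouped.setdefault(label, []).append(period)
    return {
        label: [sum(hour_values) / len(hour_values) for hour_values in zip(*members)]
        for label, members in grouped.items()
    }


# Three toy "days" with two time steps each; days 0 and 1 share cluster 0.
periods = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
cluster_order = [0, 0, 1]
typical = mean_typical_periods(periods, cluster_order)
print(typical)  # {0: [2.0, 3.0], 1: [10.0, 20.0]}
```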

Repredict the original time series based on the typical periods

In [None]:
predictedPeriods = aggregation.predictOriginalData()
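
Conceptually, the repredicted series can be pictured as the typical periods strung together in the sequence given by the cluster order. The sketch below illustrates only that idea under simplifying assumptions; `predict_original` and the toy values are hypothetical, and any rescaling that tsam performs is ignored.

```python
def predict_original(typical, cluster_order):
    """Rebuild a full-length series by repeating each day's typical period."""
    predicted = []
    for label in cluster_order:
        predicted.extend(typical[label])
    return predicted


typical = {0: [2.0, 3.0], 1: [10.0, 20.0]}  # cluster label -> typical profile
cluster_order = [0, 0, 1]                   # cluster label of each original day
predicted = predict_original(typical, cluster_order)
print(predicted)  # [2.0, 3.0, 2.0, 3.0, 10.0, 20.0]
```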

Plot the repredicted data

In [None]:
plotTS(predictedPeriods['Wind'], 24, vmin = raw['Wind'].min(), vmax = raw['Wind'].max())

### Now cluster the wind time series only

Clustering only the wind time series with 8 typical days and hierarchical clustering leads to different typical days and a different sequence.

Isolate wind time series and show first lines of data

In [None]:
raw_wind=raw.loc[:,'Wind'].to_frame()
raw_wind.head()

Now apply the same clustering procedure as above to the isolated wind time series

In [None]:
aggregation_wind = tsam.TimeSeriesAggregation(raw_wind, noTypicalPeriods = 8, hoursPerPeriod = 24, 
                                        clusterMethod = 'hierarchical', representationMethod='meanRepresentation')

In [None]:
typPeriods_wind = aggregation_wind.createTypicalPeriods()

Export the preprocessed time series for testing

In [None]:
os.makedirs('results', exist_ok=True)  # make sure the output folder exists
aggregation_wind.normalizedPeriodlyProfiles.to_csv(os.path.join('results','preprocessed_wind.csv'))

In [None]:
typPeriods_wind.shape

In [None]:
predictedPeriods_wind = aggregation_wind.predictOriginalData()

In [None]:
plotTS(predictedPeriods_wind['Wind'], 24, vmin = raw['Wind'].min(), vmax = raw['Wind'].max())

When we compare both plots, we see that 8 typical periods for wind only can better account for extreme periods, but the cluster order in general changes.

In [None]:
aggregation.clusterOrder

In [None]:
aggregation_wind.clusterOrder

### Predefining cluster sequence

tsam offers the option to aggregate input time series for a predefined cluster order. This means that we can take the cluster order from the wind-only clustering and use it as input for the aggregation of all attributes.

In [None]:
aggregation_predefClusterOrder = tsam.TimeSeriesAggregation(raw, noTypicalPeriods = 8, hoursPerPeriod = 24, 
                                 clusterMethod = 'hierarchical', representationMethod='meanRepresentation', 
                                 predefClusterOrder=aggregation_wind.clusterOrder)

In [None]:
typPeriods_predefClusterOrder = aggregation_predefClusterOrder.createTypicalPeriods()

In [None]:
typPeriods_predefClusterOrder.shape

Save typical periods to .csv file

In [None]:
typPeriods_predefClusterOrder.to_csv(os.path.join('results','testperiods_predefClusterOrder.csv'))

In [None]:
predictedPeriods_predefClusterOrder = aggregation_predefClusterOrder.predictOriginalData()

Now we compare the cluster orders

In [None]:
aggregation_wind.clusterOrder

In [None]:
aggregation_predefClusterOrder.clusterOrder
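
Comparing two long cluster orders by eye is error-prone, so a small check can help. This is a minimal sketch using plain lists; `first_mismatch` and the example orders are hypothetical. If the orders are NumPy arrays, `numpy.array_equal` would be an alternative for a simple yes/no answer.

```python
def first_mismatch(order_a, order_b):
    """Return the index of the first day whose label differs, or None if identical."""
    if len(order_a) != len(order_b):
        raise ValueError("cluster orders must have the same length")
    for day, (label_a, label_b) in enumerate(zip(order_a, order_b)):
        if label_a != label_b:
            return day
    return None


wind_order = [2, 0, 1, 1]
predef_order = [2, 0, 1, 1]
other_order = [2, 1, 1, 0]
print(first_mismatch(wind_order, predef_order))  # None
print(first_mismatch(wind_order, other_order))   # 1
```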

As can be seen, the cluster order for the four attributes (i.e. the sequence of typical days) is now identical to the cluster order of the wind time series clustering. Now the color plots can be compared:

In [None]:
plotTS(predictedPeriods_predefClusterOrder['Wind'], 24, vmin = raw['Wind'].min(), vmax = raw['Wind'].max())

As can be seen, the plot for the wind-only aggregation and the plot for the aggregation of all four attributes with the predefined cluster order from the wind time series still differ from each other. This is because only the cluster order is predefined, not the cluster centers. Since the centers are determined from the wind time series only in one case and from all four attributes together in the other, the chosen cluster centers (the chosen typical days) differ from each other.
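
Why the same cluster can yield different centers is easiest to see on a toy case. The sketch below assumes that a cluster center is the medoid, i.e. the member day with the smallest total Euclidean distance to the other members; this is a hypothetical simplification for illustration, not necessarily how tsam selects centers with the settings used in this notebook. The helper `medoid_index` and all values are made up.

```python
import math


def medoid_index(members, features):
    """Return the member whose feature vector is closest in total to all members."""
    def total_distance(i):
        return sum(math.dist(features[i], features[j]) for j in members)
    return min(members, key=total_distance)


# One cluster with three days; each day has (wind, load) as toy features.
days_all = [(1.0, 9.0), (2.0, 0.0), (4.0, 1.0)]
days_wind = [(wind,) for wind, _ in days_all]  # wind-only view of the same days
members = [0, 1, 2]

print(medoid_index(members, days_wind))  # 1
print(medoid_index(members, days_all))   # 2
```

The cluster membership is identical in both calls, yet the representative day changes once the second attribute enters the distance.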

### Predefining cluster order and cluster centers

If both the cluster order and the cluster centers should be taken from the wind time series clustering, we also pass the information about which days were chosen as typical days for the wind time series to the aggregation of all four attributes.

In [None]:
aggregation_predefClusterOrderAndClusterCenters = tsam.TimeSeriesAggregation(raw, 
                                                  noTypicalPeriods = 8, hoursPerPeriod = 24, 
                                                  clusterMethod = 'hierarchical', representationMethod='meanRepresentation', 
                                                  predefClusterOrder=aggregation_wind.clusterOrder,
                                                  predefClusterCenterIndices=aggregation_wind.clusterCenterIndices)

In [None]:
typPeriods_predefClusterOrderAndClusterCenters = aggregation_predefClusterOrderAndClusterCenters.createTypicalPeriods()

In [None]:
typPeriods_predefClusterOrderAndClusterCenters.shape

Save typical periods to .csv file

In [None]:
typPeriods_predefClusterOrderAndClusterCenters.to_csv(os.path.join('results','testperiods_predefClusterOrderAndClusterCenters.csv'))

In [None]:
predictedPeriods_predefClusterOrderAndClusterCenters = aggregation_predefClusterOrderAndClusterCenters.predictOriginalData()

In [None]:
plotTS(predictedPeriods_predefClusterOrderAndClusterCenters['Wind'], 24, vmin = raw['Wind'].min(), vmax = raw['Wind'].max())

Now the chosen typical days for the four attributes are also the same as those of the wind-only aggregation.