# Availability time series generation

**Purpose and background**: This notebook compiles availability time series 
for demand response potentials. These are used in the clustering process 
to calculate weighted average values and an overall availability time series 
per demand response cluster. 
The latter is used in power market model runs for simulating demand response.

## Method applied
The method applied is described briefly below for the sake of replicability.
The main literature sources used are [Gils (2015)](http://dx.doi.org/10.18419/opus-6888), 
[Steurer (2017)](http://dx.doi.org/10.18419/opus-9181) and 
[Ladwig (2018)](https://nbn-resolving.org/urn:nbn:de:bsz:14-qucosa-236074), 
since these publications offer the most detailed information on load profiles and availability
among those evaluated within the demand response potential meta-analysis.
The actual generation of demand response availability time series was the topic of three bachelor
theses. The bachelor theses combine different sources from the literature and introduce some further
assumptions to account for shortcomings in the literature. The detailed approaches 
differ per sector and demand response category. This notebook builds upon the results of the 
bachelor theses. Detailed method descriptions can be found within these theses and the respective 
supplementary material.

Bibliographical information for the bachelor theses:
* _Stange, Rico (2019): Ermittlung von Kosten und Zeitverfügbarkeiten einer Flexibilisierung der 
Stromnachfrage, Freie wissenschaftliche Arbeit zur Erlangung des Grades eines Bachelor of Science 
am Fachgebiet Energie- und Ressourcenmanagement der TU Berlin._ 
__*Focus: household sector*__
* _Odeh, Jonas (2019): Ermittlung von Kosten und Zeitverfügbarkeiten für die Flexibilisierung 
von Stromnachfragen im GHD-Sektor, Freie wissenschaftliche Arbeit zur Erlangung des Grades eines 
Bachelor of Science am Fachgebiet Energie- und Ressourcenmanagement der TU Berlin._
__*Focus: trade, commerce and services*__
* _Benz, Fabian (2019): Ermittlung von Kosten und Zeitverfügbarkeit für flexible Stromnachfragen 
in der Industrie in Deutschland, Freie wissenschaftliche Arbeit zur Erlangung des Grades eines 
Bachelor of Science am Fachgebiet Energie- und Ressourcenmanagement der TU Berlin._
__*Focus: industry sector*__

### Creation of load profiles per demand response category
Load profiles on the level of demand response categories serve as the basis for creating
demand response availability time series. The load profiles are derived from the literature and
are needed because the demand response limits are derived from them.

### Creation of demand response availability time series
Demand response is limited by the following rules (see e.g. Steurer 2017, p. 48):
* Downwards shifts (positive demand response potential) must be less than or equal to the difference 
between actual load and minimum load:
$$ P_{down}(t) \leq P(t) - P_{min}(t) $$ 
* Upwards shifts (negative demand response potential) must be less than or equal to the difference 
between maximum load and actual load:
$$ P_{up}(t) \leq P_{max}(t) - P(t) $$

Hence, if there is no minimum load, the load profile marks the overall limit for possible demand 
reductions. In many cases, demand can either be increased up to the overall maximum power 
output, or the load profile is in a sense binding here as well. This is the case e.g. for 
appliances with an implicit thermal storage, such as electrical heating.
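
As a minimal sketch of the two limits above, the following snippet computes the upper bounds for shifts in both directions from a short, made-up load profile. The function name `shift_limits` and all numbers are illustrative assumptions and are not taken from the bachelor theses.

```python
import numpy as np


def shift_limits(load, p_min, p_max):
    """Return the upper bounds for downward and upward shifts per time step."""
    load = np.asarray(load, dtype=float)
    p_down_max = load - p_min  # P_down(t) <= P(t) - P_min(t)
    p_up_max = p_max - load    # P_up(t) <= P_max(t) - P(t)
    return p_down_max, p_up_max


# Hypothetical hourly load in MW, no minimum load, maximum load of 5 MW
load = [2.0, 3.5, 5.0, 1.0]
p_down_max, p_up_max = shift_limits(load, p_min=0.0, p_max=5.0)
```

With no minimum load, the bound for downward shifts equals the load profile itself, which matches the statement above.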

**Rules for determining minimum and maximum loads**:

| demand response category | minimum load | maximum load |
| ---- | ---- | ---- |
| household and tcs appliances | none | overall max. demand or max. demand for day / season |
| thermal appliances | none | max. demand per day |
| industrial processes | process-specific | process-specific |

_NOTE: These rules only apply to short-term load shifting and load shedding._

**Demand response availability factors**

To account for availability, an availability factor is introduced for the upwards and for the 
downwards shifting direction. This value is normalized, i.e. $ \in [0;1] $. If this isn't already
the case in the baseline, the current maximum value ($ < 1 $) is used as a scaling factor (see
bachelor theses for details).

Interpretation:
* 0 denotes the time(s) of the year when none of the maximum demand response potential is available.
* 1 denotes the time(s) of the year when the maximum demand response potential is available.
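
The scaling by the maximum value can be sketched as follows. This is only an illustration under the assumption that the potential is non-negative; the function name `to_availability_factor` and the sample values are hypothetical, and the exact procedure used per category is documented in the bachelor theses.

```python
import numpy as np


def to_availability_factor(potential):
    """Scale a non-negative potential series so that its maximum equals 1."""
    potential = np.asarray(potential, dtype=float)
    peak = potential.max()
    if peak <= 0:
        # No potential at all: every availability factor is 0
        return np.zeros_like(potential)
    return potential / peak


# Hypothetical series whose maximum (0.8) is below 1
factors = to_availability_factor([0.0, 0.2, 0.4, 0.8])
```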

# Package imports

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display

# Parameter settings
* Set path folder and filenames for inputs and outputs.
* Define lists with columns to keep (i.e. decide which ones to drop).
* Define two dicts mapping the int output of pd.DatetimeIndex to weekday and month names (str):
    * *days_dict*: 0 = Monday .. 6 = Sunday
    * *months_dict*: 1 = January .. 12 = December
* Set some boolean control parameters for the workflow:
    * _drop_leap_day_: If True, the leap day is dropped, making leap years comparable to
    regular ones. _NOTE: Simply dropping the leap day breaks the weekly pattern. 
    An alternative would be to create a synthetic year by dropping the first or last
    day of the year instead._
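
The effect of dropping the leap day on the weekly pattern can be illustrated with a short, self-contained sketch. The three-day index and the `value` column are made up for this purpose, and the mask is a year-independent alternative to a hard-coded date range; it is not necessarily how the notebook itself filters the data.

```python
import pandas as pd

# Hypothetical hourly index covering 28 February to 1 March 2012
idx = pd.date_range("2012-02-28", periods=72, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"value": range(72)}, index=idx)

days_dict = {0: "Montag", 1: "Dienstag", 2: "Mittwoch", 3: "Donnerstag",
             4: "Freitag", 5: "Samstag", 6: "Sonntag"}

# Keep everything except 29 February, regardless of the year
is_leap_day = (df.index.month == 2) & (df.index.day == 29)
df_no_leap = df[~is_leap_day]

# Map the integer weekday (0 = Monday) to its German name
weekdays = df_no_leap.index.weekday.map(days_dict)
```

In the filtered data, Tuesday (28 February) is followed directly by Thursday (1 March), which is the break in the weekly pattern mentioned in the note.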

In [None]:
path_folder_hoho = "./inputs/availability_hoho/"
path_folder_tcs = "./inputs/availability_tcs/"
path_folder_ind = "./inputs/availability_ind/"
path_folder_out = "./out/availability/"
filename_out = "availability_timeseries.xlsx"

filename_hoho_appliances = "Verfuegbarkeit_Weisse_Ware_bearbeitet.xlsm"
filename_hoho_heating = "Verfuegbarkeit_Waermeanwendungen_bearbeitet.xlsm"

filename_tcs_Gil15 = "Gils_2015_berarbeitet.xlsx"
filename_tcs_Gre09 = "Grein_2009_bearbeitet.xlsx"
filename_tcs_Lad18 = "Ladwig_2018_bearbeitet.xlsx"
filename_tcs_Ste17 = "Steurer_2017_bearbeitet.xlsx"

filename_ind = "Zeitverfuegbarkeit2_bearbeitet.xlsx"
filename_ind_add = "Zeitverfuegbarkeit1_bearbeitet.xlsx"

cols_hoho_appliances = ['Monat', 'Gerundete Temperatur', 'Stunden des Jahres',
                        'Stunden des Tages', 'Wochentag', 'Jahreszeit',
                        'Lastzuschaltung KGR normiert', 'Lastabschaltung KGR',
                        'Lastzuschaltung WM normiert', 'Lastabschaltung WM',
                        'Lastzuschaltung WT normiert', 'Lastabschaltung WT',
                        'Lastzuschaltung GS normiert', 'Lastabschaltung GS']

cols_hoho_heating = ['Monat', 'äquivalente Tagesmitteltemperatur',
                     'Kühlbedarf', 'Heizbedarf', 'Gerundete Temperatur',
                     'Stunden des Jahres ', 'Stunden des Tages', 'Wochentag', 'Jahreszeit',
                     'Lastabschaltung NH normiert', 'Lastzuschaltung NH normiert',
                     'Lastabschaltung WP normiert', 'Lastzuschaltung WP normiert',
                     'Lastabschaltung UP', 'Lastabschaltung RK', 'Lastzuschaltung RK',
                     'Lastabschaltung WW Tag', 'Lastzuschaltung WW normiert Tag']

cols_tcs_Gil15 = ['Gerundete Temperatur', 'Heizbedarf', 'äquivalente Tagesmitteltemperatur', 
                  'Stunden des Jahres ', 'Uhrzeit des Tages', 'Wochentag', 'Feiertag', 
                  'Jahreszeit', 'Lastzuschaltung LÜ normiert', 'Lastabschaltung LÜ normiert',
                  'Lastzuschaltung WVP normiert', 'Lastabschaltung WVP normiert']

cols_tcs_Gre09 = ['Monat', 'Gerundete Temperatur', 'Heizbedarf',
                  'äquivalente Tagesmitteltemperatur', 'Stunden des Jahres ',
                  'Uhrzeit des Tages', 'Wochentag', 'Feiertag', 'Jahreszeit',
                  'Lastzuschaltung KGR normiert', 'Lastabschaltung KGR normiert', 
                  'Lastzuschaltung KÜ normiert', 'Lastabschaltung KÜ normiert']

cols_tcs_Lad18 = ['Gerundete Temperatur', 'Heizbedarf', 'äquivalente Tagesmitteltemperatur', 
                  'Stunden des Jahres ', 'Uhrzeit des Tages', 'Wochentag', 'Feiertag', 
                  'Jahreszeit', 'Lastzuschaltung KA normiert', 'Lastabschaltung KA normiert']

cols_tcs_Ste17 = ['Gerundete Temperatur', 'Heizbedarf', 'äquivalente Tagesmitteltemperatur', 
                  'Stunden des Jahres ', 'Uhrzeit des Tages', 'Wochentag', 'Feiertag', 
                  'Jahreszeit', 'Lastzuschaltung EH normiert', 'Lastabschaltung EH normiert',
                  'Lastzuschaltung WP normiert', 'Lastabschaltung WP normiert',
                  'Lastzuschaltung WW normiert', 'Lastabschaltung WW normiert']

days_dict = {0: "Montag",
             1: "Dienstag",
             2: "Mittwoch",
             3: "Donnerstag",
             4: "Freitag",
             5: "Samstag",
             6: "Sonntag"}

months_dict = {1: "Januar",
               2: "Februar",
               3: "März",
               4: "April",
               5: "Mai",
               6: "Juni",
               7: "Juli",
               8: "August",
               9: "September",
               10: "Oktober",
               11: "November",
               12: "Dezember"}

drop_leap_day = False

# Read in data
Read in data for the different sectors:
* hoho (households): divided by appliances and heating-related consumers
* tcs (trade, commerce and services): divided by studies
* ind (industry): divided by processes with varying potentials and those with constant potentials

In [None]:
hoho_appliances = pd.read_excel(path_folder_hoho+filename_hoho_appliances,
                                sheet_name="Datengrundlage", 
                                skiprows=5, header=0, index_col=1)

hoho_heating = pd.read_excel(path_folder_hoho+filename_hoho_heating,
                             sheet_name="Datengrundlage", 
                             skiprows=5, header=0, index_col=2)

tcs_Gil15 = pd.read_excel(path_folder_tcs+filename_tcs_Gil15,
                          sheet_name="Zeitreihen_Lastverschiebung",
                          skiprows=5, header=0, index_col=1)

tcs_Gre09 = pd.read_excel(path_folder_tcs+filename_tcs_Gre09,
                          sheet_name="Zeitreihen_Lastverschiebung",
                          skiprows=5, header=0, index_col=1)

tcs_Lad18 = pd.read_excel(path_folder_tcs+filename_tcs_Lad18,
                          sheet_name="Zeitreihen_Lastverschiebung",
                          skiprows=5, header=0, index_col=1)

tcs_Ste17 = pd.read_excel(path_folder_tcs+filename_tcs_Ste17,
                          sheet_name="Zeitreihen_Lastverschiebung",
                          skiprows=5, header=0, index_col=1)

ind_varying = pd.read_excel(path_folder_ind+filename_ind,
                            sheet_name="Verläufe mit Abhängigkeiten",
                            usecols="A:G", header=0) 

ind_constant = pd.read_excel(path_folder_ind+filename_ind,
                            sheet_name="konstante Verläufe",
                            header=0)

# Prepare data and combine to overall data set
* Keep only the columns of interest and drop all others.
* Convert the index to a DatetimeIndex, which is shared by all sectors.
* Combine all availability time series to overall DataFrames.
    * Remove duplicate columns.
    * Separate positive and negative load shift availability.
    * Store the DataFrames in a dict and use keys for creating Excel sheet names (see below).
* For the industry sector only: add calendar and temperature information
(timestamp, weekday, month etc.)
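
The following toy example sketches these steps on two small, made-up DataFrames. The column names ending in `A` and `B`, the values, and the assumption that the raw index holds integers of the form `YYYYMMDDHH` with hours from 00 to 23 are illustrative only.

```python
import pandas as pd

# Hypothetical integer timestamps in YYYYMMDDHH form
raw_index = [2012010100, 2012010101, 2012010102]
index = pd.to_datetime(pd.Index(raw_index).astype(str), format="%Y%m%d%H")

a = pd.DataFrame({"Wochentag": ["Sonntag"] * 3,
                  "Lastabschaltung A": [0.1, 0.2, 0.3],
                  "Lastzuschaltung A": [0.9, 0.8, 0.7]}, index=index)
b = pd.DataFrame({"Wochentag": ["Sonntag"] * 3,
                  "Lastabschaltung B": [0.4, 0.5, 0.6]}, index=index)

# Combine column-wise and keep only the first occurrence of duplicated columns
combined = pd.concat([a, b], axis=1, sort=False)
combined = combined.loc[:, ~combined.columns.duplicated()]

# Split into positive (load reduction) and negative (load increase) availability
pos = combined.filter(regex="Lastabschaltung")
neg = combined.filter(regex="Lastzuschaltung")
```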

In [None]:
# Prepare data (delete columns not needed and adjust index dtype)
hoho_appliances = hoho_appliances[cols_hoho_appliances]
hoho_heating = hoho_heating[cols_hoho_heating]

tcs_Gil15 = tcs_Gil15[cols_tcs_Gil15]
tcs_Gre09 = tcs_Gre09[cols_tcs_Gre09]
tcs_Lad18 = tcs_Lad18[cols_tcs_Lad18]
tcs_Ste17 = tcs_Ste17[cols_tcs_Ste17]

# Convert index to DatetimeIndex
# Index can be used for all sectors / DataFrames due to equal indexing
# Parse the index as str (format YYYYMMDDHH); raises an error if a value cannot be parsed
new_index = pd.to_datetime(hoho_appliances.index.astype(str), format='%Y%m%d%H')

hoho_appliances = hoho_appliances.set_index(new_index)
hoho_heating = hoho_heating.set_index(new_index)

tcs_Gil15 = tcs_Gil15.set_index(new_index)
tcs_Gre09 = tcs_Gre09.set_index(new_index)
tcs_Lad18 = tcs_Lad18.set_index(new_index)
tcs_Ste17 = tcs_Ste17.set_index(new_index)

ind_varying = ind_varying.set_index(new_index)
ind_constant = ind_constant.set_index(new_index)

# display(hoho_appliances.head())
# display(hoho_heating.head())

In [None]:
# Combine the data
hoho_combined = pd.concat([hoho_appliances, hoho_heating], axis=1, sort=False)
hoho_combined = hoho_combined.loc[:, ~hoho_combined.columns.duplicated()]

tcs_combined = pd.concat([tcs_Gil15, tcs_Gre09, tcs_Lad18, tcs_Ste17], axis=1, sort=False)
tcs_combined = tcs_combined.loc[:, ~tcs_combined.columns.duplicated()]

ind_combined = pd.concat([ind_varying, ind_constant], axis=1, sort=False)
ind_combined = ind_combined.loc[:, ~ind_combined.columns.duplicated()]

# Add some calendar and temperature information for the industry sector
ind_combined["Wochentag"] = ind_combined.index.weekday.map(days_dict)
ind_combined["Monat"] = ind_combined.index.month.map(months_dict)
ind_combined["Gerundete Temperatur"] = hoho_combined["Gerundete Temperatur"]
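# Assumption: the "RZM" sheet holds one value per time step in the same order as the index;
# .to_numpy() assigns by position, since the sheet's default integer index would not align
ind_combined["Tagesmitteltemperatur"] = pd.read_excel(path_folder_ind+filename_ind_add,
                                                      sheet_name="RZM", usecols="D",
                                                      header=0).iloc[:, 0].to_numpy()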

# Filter out the leap day (if this is necessary)
# Problem here: Weekly pattern is destroyed for the week with the leap day
if drop_leap_day:
    mask = (hoho_combined.index < "2012-02-29 00:00") | (hoho_combined.index > "2012-02-29 23:00")
    hoho_combined = hoho_combined[mask]
    tcs_combined = tcs_combined[mask]
    ind_combined = ind_combined[mask]
    # Same but for str as index dtype
    # hoho_combined = hoho_combined[~hoho_combined.index.str.contains("0229")]
    
hoho_combined_pos = hoho_combined.filter(regex="Lastabschaltung")
hoho_combined_neg = hoho_combined.filter(regex="Lastzuschaltung")
tcs_combined_pos = tcs_combined.filter(regex="Lastabschaltung")
tcs_combined_neg = tcs_combined.filter(regex="Lastzuschaltung")
ind_combined_pos = ind_combined.filter(regex="LRP")
ind_combined_neg = ind_combined.filter(regex="LZP")

# Dictionary to store the availability DataFrames
dict_availability = {"hoho_pos": hoho_combined_pos,
                     "hoho_neg": hoho_combined_neg,
                     "tcs_pos": tcs_combined_pos,
                     "tcs_neg": tcs_combined_neg,
                     "ind_pos": ind_combined_pos,
                     "ind_neg": ind_combined_neg}

# display(hoho_combined)

# Store the output to Excel

Store every DataFrame on a separate Excel sheet and use the dict keys from above as sheet names.

In [None]:
# The context manager saves and closes the file on exit
with pd.ExcelWriter(path_folder_out+filename_out, engine="xlsxwriter") as writer:
    for k, v in dict_availability.items():
        v.to_excel(writer, sheet_name=k)