## Data creation: Days

This notebook builds functions that add the day of the week and holiday information for each calendar day.

The type of day is included as a predictor in the energy price forecast.
The day-type exogenous variables generated by these functions are:

- day of the week
- weekend or weekday
- holiday or special event
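
The weekend-or-weekday indicator in the list above can be derived directly from the weekday number. The following is a minimal sketch rather than part of the original notebook: the function name `get_weekend_flag` and the column name `is_weekend` are hypothetical, and it assumes Saturday and Sunday count as the weekend.

```python
import pandas as pd


def get_weekend_flag(start='2019-01-01', stop='2019-12-31', frequency='D'):
    """Return a dataframe with a boolean is_weekend column (hypothetical helper)."""
    dates = pd.date_range(start=start, end=stop, freq=frequency)
    # DatetimeIndex.weekday is 0 for monday up to 6 for sunday,
    # so values 5 and 6 are saturday and sunday
    return pd.DataFrame({'is_weekend': dates.weekday >= 5}, index=dates)


# friday 4 january 2019 to monday 7 january 2019
weekend_df = get_weekend_flag(start='2019-01-04', stop='2019-01-07')
print(weekend_df)
```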

In [None]:
from datetime import date
import holidays
import pandas as pd

In [None]:
#create a datetime range
#ISO formatted dates avoid day/month ambiguity when pandas parses the strings
dates = pd.date_range(start='2019-01-01', end='2019-12-31')
dates

In [None]:
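#create an object with all the holidays in Denmark
#years is passed so the object is populated up front, otherwise it stays empty until a date is looked up
denmark_holidays = holidays.country_holidays('DK', years=2019)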

In [None]:
denmark_holidays.values()

In [None]:
denmark_holidays.get('2018-12-25')

In [None]:
def get_holidays(start='2019-01-01', stop='2019-12-31', country='DK', frequency='D'):
    """
    Takes in a start and stop date, a country code and a frequency.
    
    Produces a dataframe with a datetime index at the input frequency (daily by default) and columns:
    holiday_bool - boolean, True if the date is a holiday in the country
    holiday_name - name of the holiday if holiday_bool is True, otherwise None
    
    Returns a dataframe
    """
    
    #generate the range of daily dates
    dates = pd.date_range(start=start, end=stop, freq=frequency)
    
    #create the holiday object (country_holidays replaces the deprecated CountryHoliday)
    country_holidays = holidays.country_holidays(country)

    #create a list for the holiday bool and name
    holiday_list = []
    
    #loop through the dates
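    #the loop variable is named day so it does not shadow the imported datetime.date
    for day in dates:
        #true if holiday in object, false otherwise
        holiday_bool = day in country_holidays
        #name of the holiday, None if the day is not a holiday
        holiday_name = country_holidays.get(day)
        
        holiday_list.append([holiday_bool, holiday_name])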
        
    #create return dataframe
    holidays_data = pd.DataFrame(holiday_list, index=dates, columns=['holiday_bool', 'holiday_name'])
                  
    return holidays_data

In [None]:
holiday_df = get_holidays(start='2015-01-01', stop='2020-12-31')

In [None]:
holiday_df.holiday_name.unique()

In [None]:
holiday_df.head()
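
The holiday object behaves like a dictionary keyed by date, so the lookup pattern inside `get_holidays` can be tried out without the `holidays` package. The sketch below is only an illustration under that assumption: `stand_in_holidays` is a hand-written dictionary with two illustrative entries, not output from the library, and the holiday names in it are placeholders.

```python
from datetime import date

import pandas as pd

# hand-written stand-in for the holiday object (illustrative entries only)
stand_in_holidays = {
    date(2019, 12, 25): 'Christmas Day',
    date(2019, 12, 26): 'Second Day of Christmas',
}

dates = pd.date_range(start='2019-12-24', end='2019-12-26', freq='D')

rows = []
for day in dates:
    # a plain dict needs the exact key type, so the Timestamp is converted to a date
    key = day.date()
    rows.append([key in stand_in_holidays, stand_in_holidays.get(key)])

holiday_sample = pd.DataFrame(rows, index=dates, columns=['holiday_bool', 'holiday_name'])
print(holiday_sample)
```

Rows that are not holidays carry a missing value in `holiday_name`, which is why the missing value shows up among the unique names.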

In [None]:
def get_days_dummies(start='2019-01-01', stop='2019-12-31', frequency='D'):
    """
    Takes in a start and stop date and frequency.
    
    Produces a dataframe with a datetime index at the input frequency and one
    dummy column per day of the week: mon, tue, wed, thur, fri, sat, sun
    
    Returns a dataframe
    """
    
    #generate the range of daily dates
    dates = pd.date_range(start=start, end=stop, freq=frequency)
    
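    #create a series of weekday ids, 0 for monday to 6 for sunday
    #fixing the categories keeps all seven columns even for date ranges shorter than a week
    weekday_id = pd.Categorical(dates.weekday, categories=list(range(7)))
    days = pd.Series(weekday_id, index=dates, name='weekday_id')
    
    days = pd.get_dummies(days)
    
    columns = ['mon', 'tue', 'wed', 'thur', 'fri', 'sat', 'sun']
    
    days.columns = columns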
    
    return days
    

In [None]:
get_days_dummies()
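
To serve as predictors, the weekday dummies and the holiday flag need to sit in one dataframe on the same date index. The following is a minimal sketch of that step, not taken from the original notebook. It builds the dummies inline and uses a hand-written stand-in for the holiday flag (25 and 26 December 2019) so that it runs without the `holidays` package; the name `features` is hypothetical.

```python
import pandas as pd

# one full week, monday 23 december to sunday 29 december 2019
dates = pd.date_range(start='2019-12-23', end='2019-12-29', freq='D')

# weekday dummies, one column per day of the week
days = pd.get_dummies(pd.Series(list(dates.weekday), index=dates))
days.columns = ['mon', 'tue', 'wed', 'thur', 'fri', 'sat', 'sun']

# stand-in for the holiday_bool column of the get_holidays output
stand_in_dates = pd.to_datetime(['2019-12-25', '2019-12-26'])
holiday_data = pd.DataFrame({'holiday_bool': dates.isin(stand_in_dates)}, index=dates)

# join on the shared date index, keeping only dates present in both frames
features = days.join(holiday_data, how='inner')
print(features)
```

The result has one row per date with the seven weekday columns followed by `holiday_bool`.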