#### Notebook with all the general purpose functions

This notebook contains the general-purpose functions used throughout the solution. In this context,
"general-purpose" means that these functions don't belong to a particular process such as preprocessing, modeling or
prediction; instead, they serve as auxiliary functions that accomplish specific tasks within the code while keeping
it modular and organized.

The functions included are:

| Function | Description |
| -------- | ----------- |
| `split_series`  | splits a time series into train and test sets according to the given dates |
| `add_holidays`  | adds the holidays to the time series dataset as an additional binary column |

###### Definition of functions

In [0]:
import pandas as pd


def split_series(data, start_test, end_test):
    """
    Splits a time series into train and test sets according to the given date boundaries.

    All records before the start of the test set are considered part of the train set, so the end of the train set
    and the start of the test set are contiguous. Records dated after `end_test` are excluded from both sets.

    Parameters
    ----------
        data (pd.DataFrame): Dataset with the time series to split. Must contain a datetime column named "ds".
        start_test (str): Start of test set (included).
        end_test (str): End of test set (included).

    Returns
    -------
        train_df (pd.DataFrame): Train set DataFrame.
        test_df (pd.DataFrame): Test set DataFrame.
    """
    # Parse the boundaries once so both comparisons use the same timestamps
    start_test = pd.to_datetime(start_test)
    end_test = pd.to_datetime(end_test)

    # Train: everything strictly before the test window; test: the closed interval [start_test, end_test]
    train_df = data[data["ds"] < start_test]
    test_df = data[(data["ds"] >= start_test) & (data["ds"] <= end_test)]

    return train_df, test_df
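
The split can be illustrated on a toy series. The snippet below is a minimal sketch: it restates the splitting logic in compact form so that it runs on its own, and the data and dates are hypothetical, chosen only for illustration.

```python
import pandas as pd


def split_series(data, start_test, end_test):
    # Compact restatement of the splitting logic so this snippet is self-contained.
    start, end = pd.to_datetime(start_test), pd.to_datetime(end_test)
    train_df = data[data["ds"] < start]
    test_df = data[(data["ds"] >= start) & (data["ds"] <= end)]
    return train_df, test_df


# Hypothetical daily series covering 2021-01-01 to 2021-01-10 (illustrative data only).
toy = pd.DataFrame({
    "ds": pd.date_range("2021-01-01", periods=10, freq="D"),
    "y": range(10),
})

train_df, test_df = split_series(toy, "2021-01-08", "2021-01-09")

print(len(train_df))  # 7 rows: 2021-01-01 through 2021-01-07
print(len(test_df))   # 2 rows: 2021-01-08 and 2021-01-09
```

Note that the record for 2021-01-10 falls after `end_test`, so it appears in neither set.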

In [0]:
def add_holidays(df_data, df_holidays):
    """
    Adds the holidays to the time series dataset as an additional binary column named "holiday", whose value is 1 for
    the dates on which there is a holiday and 0 otherwise.

    Parameters
    ----------
        df_data (pd.DataFrame): Dataset with the time series. Must contain a date column named "ds".
        df_holidays (pd.DataFrame): Dataset with holidays. Must contain a "ds" column of the same dtype.

    Returns
    -------
        df_data (pd.DataFrame): The same input "df_data" dataset, modified in place by adding the binary column with
            the holidays.
    """
    # Identifying dates with holidays. isin keeps the mask aligned with the index of df_data, whereas a merge resets
    # the index and can duplicate rows when df_holidays contains repeated dates.
    is_holiday = df_data["ds"].isin(df_holidays["ds"])

    # Storing the flag as 1 (holiday) or 0 (no holiday)
    df_data["holiday"] = is_holiday.astype(int)

    return df_data
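
The expected shape of the result can be shown on a small example. This is a minimal sketch under stated assumptions: `flag_holidays` is a hypothetical stand-in based on `Series.isin` so that the snippet runs on its own, and the dates and values are invented for illustration.

```python
import pandas as pd


def flag_holidays(df_data, df_holidays):
    # Hypothetical stand-in for add_holidays: 1 where "ds" is a holiday date, 0 otherwise.
    out = df_data.copy()
    out["holiday"] = out["ds"].isin(df_holidays["ds"]).astype(int)
    return out


# Illustrative data only: four consecutive days and a single holiday date.
series = pd.DataFrame({
    "ds": pd.date_range("2021-12-24", periods=4, freq="D"),
    "y": [10, 12, 9, 11],
})
holidays = pd.DataFrame({"ds": pd.to_datetime(["2021-12-25"])})

flagged = flag_holidays(series, holidays)
print(flagged["holiday"].tolist())  # [0, 1, 0, 0]
```

Unlike the function above, this stand-in works on a copy, so the input DataFrame is left unchanged.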