##**Installations**

- Installing **darts**
- Installing **Optuna**
- Installing **Plotly**

In [1]:
%%capture
!pip install darts
!pip install optuna
!pip install plotly
!pip install lightgbm

In [2]:
try:
    import darts
    import optuna
    import plotly
    import lightgbm
    print("Success")
except ImportError:
    print("Failed to import packages")


Success


##**Imports**

In this cell, we import all the necessary modules for our analysis.

- **Numpy and Pandas:** These are fundamental packages for scientific computing and data manipulation in Python.
- **Matplotlib:** This is a plotting library that we'll use for data visualization.
- **Sklearn:** We'll use this library for data preprocessing and performance metrics.
- **Optuna:** This is a hyperparameter optimization framework, which we'll use to tune our N-BEATS model.
- **Darts:** This library provides us with utilities for time series processing and models, including the N-BEATS model.
- **Google Colab:** We use Google Colab's drive module to mount our Google Drive where our data is stored.


In [3]:
# @title
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
from typing import Union, List
from sklearn.preprocessing import MaxAbsScaler
from IPython.display import display
import time
import gc
#import seaborn as sns

#plotly
import plotly.graph_objects as go
from plotly.subplots import make_subplots


# optuna hyperparameter
import optuna
from optuna.integration import PyTorchLightningPruningCallback
import optuna.visualization as ovis

#darts imports
from darts import TimeSeries
from darts.models import NaiveSeasonal, NaiveDrift, ExponentialSmoothing, VARIMA, AutoARIMA
from darts.models import RNNModel, NBEATSModel, NLinearModel, DLinearModel, LightGBMModel, NaiveMean, NaiveSeasonal, NaiveDrift, LinearRegressionModel, RandomForest
from darts.dataprocessing.transformers import Scaler, MissingValuesFiller, InvertibleMapper
from darts.metrics import mape, smape, mae, r2_score, mse, smape, mase,rmse, rmsle
from darts.utils.timeseries_generation import (gaussian_timeseries,linear_timeseries,sine_timeseries)
from darts.dataprocessing import Pipeline

#Validation Imports
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_squared_error
import statsmodels.api as sm
from sklearn.model_selection._split import _BaseKFold, indexable, _num_samples
from sklearn.utils.validation import _deprecate_positional_args
from sklearn.preprocessing import LabelEncoder

# Drive mount
from google.colab import drive
drive.mount('/content/drive')     #Drive abcr5914


Mounted at /content/drive


##**Required Functions**

The ***below code block*** contains a collection of helper functions that are used in various parts of the notebook for tasks such as preprocessing, model training, and evaluation. Below is a brief description of each function:
- **fit_predict_and_evaluate:** This function fits the model to the training data, makes predictions, and evaluates the model's performance using Mean Absolute Error (MAE) and Symmetric Mean Absolute Percentage Error (sMAPE). This function encapsulates the entire process of training, predicting, and evaluating a model.
```python
def fit_predict_and_evaluate(model, training_set, validation_set, n_forecasts, model_name='Model'):
```

In [4]:
def fit_predict_and_evaluate(model, training_set, validation_set, n_forecasts, past_covariates_train = None, past_covariates_validation = None, series_list=False, model_name='Model'):
    """
    Fit the model to the training data, make predictions, and evaluate the metrics.

    Args:
        model: The model to be trained and used for predictions.
        training_set: The data to train the model.
        validation_set: The data to validate the model.
        n_forecasts (int): The number of forecast steps.
        past_covariates_train: The past covariates for training.
        past_covariates_validation: The past covariates for validation.
        series_list (bool, optional): If True, the training_set will be passed as a series to the predict method. Defaults to False.
        model_name (str, optional): The name of the model. Defaults to 'Model'.

    Returns:
        dict: A dictionary containing the forecast and the metrics.

    Raises:
        ValueError: If the lengths of actual_values and predicted_values are not the same.
    """
    # Validate inputs
    if not model or not training_set or n_forecasts < 1:
        raise ValueError("Invalid inputs provided.")

    # Fit the model and make predictions
    try:
        train_start = time.time()
        model.fit(training_set, past_covariates=past_covariates_train)
        train_stop = time.time()
        forecast_start = time.time()
        forecast = model.predict(n_forecasts, series=training_set if series_list else None, past_covariates=past_covariates_validation, show_warnings=False)
        forecast_stop = time.time()
    except Exception as e:
        print(f"An error occurred while fitting the model or making predictions: {e}")
        return None

    # Validate the forecast and actual values length
    #if len(validation_set) != len(forecast):
     #   raise ValueError("Lengths of validation_set and forecast must be the same.")


    mae_value = mae(validation_set, forecast, inter_reduction = np.mean)
    smape_value = smape(validation_set, forecast, inter_reduction = np.mean)
    rmse_value = rmse(validation_set, forecast, inter_reduction = np.mean)

    train_time = train_stop-train_start
    forecast_time = forecast_stop - forecast_start

    print('-----------------------------------------')
    print(f"MAE ({model_name}) = {mae_value:.2f}")
    print(f"sMAPE ({model_name}) = {smape_value:.2f}")
    print(f"RMSE ({model_name}) = {rmse_value:.2f}")
    print(f"Training Time ({model_name}) = {train_time:.2f}sec")
    print(f"Forecast Time ({model_name}) = {forecast_time:.2f}sec")
    print(f"Total Time ({model_name}) = {train_time+forecast_time:.2f}sec")
    print('-----------------------------------------')

    return {'forecast': forecast, 'MAE': mae_value, 'sMAPE': smape_value, 'RMSE': rmse_value, 'Training Time': train_time, 'Forecast Time': forecast_time, 'Total Time': train_time+forecast_time}

'----------------------------------------------------------------------------------------------------'


def extend_index_with_past(input_chunk_length, train_idx, test_idx, dataframe, gap = False):
    """
    Extends the first index in the list with its previous indexes based on input_chunk_length.

    Parameters:
    input_chunk_length (int): Number of previous indexes to include.
    indexes (list): List of indexes from the DataFrame.
    dataframe (DataFrame): The DataFrame from which indexes are derived.

    Returns:
    list: Extended list of indexes including the range from the first index and its previous indexes.
    """
    if not train_idx or dataframe.empty:
        return []

    index = max(train_idx)
    # Ensuring the start index does not fall below the DataFrame's minimum index
    start_index = max(index - input_chunk_length, dataframe.index.min())

    extended_indexes = list(range(start_index+1, index + 1))+ test_idx if gap == True else list(range(start_index + 1, max(test_idx) + 1))
    return extended_indexes


'----------------------------------------------------------------------------------------------------'

def add_metrics_to_df(df, store_id, dept_id, model_name, mae, smape, rmse, training_time, forecast_time, total_time, dataset_size):
    new_row = pd.DataFrame({
        'Store ID': [store_id],
        'Dept ID': [dept_id],
        'Model Name': [model_name],
        'MAE': [mae],
        'sMAPE': [smape],
        'RMSE': [rmse],
        'Training Time(seconds)': [training_time],
        'Forecast Time(seconds)': [forecast_time],
        'Total Time(seconds)': [total_time],
        'Dataset Size(MB)': [dataset_size]
    })
    df = pd.concat([df, new_row])
    return df

'-------------------------------------------------------------------------------------------------------'
# Function to load datasets
def load_dataset(path):
    return pd.read_csv(path)

'-------------------------------------------------------------------------------------------------------'

# Function to convert and one-hot encode events
def one_hot_encode_and_join(original_dataframe, feature_to_encode, new_name=None):
    dummies = pd.get_dummies(original_dataframe[feature_to_encode].fillna('No_Event'), prefix=new_name)
    return original_dataframe.drop(feature_to_encode, axis=1).join(dummies)

'-------------------------------------------------------------------------------------------------------'

# Creating the DataFrame
columns = ['Store ID', 'Dept ID', 'Model Name', 'MAE', 'sMAPE', 'RMSE', 'Training Time(seconds)', 'Forecast Time(seconds)', 'Total Time(seconds)',
           'Dataset Size(MB)']
metrics_df = pd.DataFrame(columns=columns)


## **Model Creation Functions**

This notebook contains functions for creating instances of various types of models. Each function takes in a set of parameters and returns a model instance with those parameters.

1. **Nbeats Model** : <br>
This function creates and returns an **`NBEATS Model`** with specified parameters.
```python
create_nbeats_model(input_chunk_length=24, output_chunk_length=24, generic_architecture=True,
               num_stacks=4, num_blocks=2, num_layers=2, layer_widths=16,
               n_epochs=1, nr_epochs_val_period=1, batch_size=32, model_name="NBEATS")
```
2. **DLinear Model** : <br>
This function creates an instance of **`DLinearModel`** with the specified parameters.
```python
create_dlinear_model(input_chunk_length=24, output_chunk_length=24, n_epochs=2,
                        batch_size=16, shared_weights=False, kernel_size=25,
                        const_init=True, use_static_covariates=True):
```

3. **NLinear Model** : <br>
This Function creates an instance of **`NLinearModel`** with the specified parameters.
```python
create_nlinear_model(input_chunk_length=24, output_chunk_length=24, n_epochs=2,
                         shared_weights=False, const_init=True, normalize=False,
                         use_static_covariates=True):
```

4. **LightGBM Model**: <br>
This function creates an instance of LightGBMModel with the specified parameters.
```python
create_lightgbm_model(lags=12, output_chunk_length=1, max_depth=3,
                          n_estimators=50, learning_rate=0.1, verbose=-1,num_leaves=31,
                          lags_past_covariates = None, min_data_in_leaf=20,extra_trees=True)
```

In [5]:
def create_nbeats_model(input_chunk_length=24, output_chunk_length=24, generic_architecture=True,
               num_stacks=4, num_blocks=2, num_layers=2, layer_widths=16,
               n_epochs=2, nr_epochs_val_period=1, batch_size=32, model_name="NBEATS"):
    """
    Create and return an NBEATSModel with the specified parameters.

    Args:
        input_chunk_length (int): The length of the input sequence.
        output_chunk_length (int): The length of the output sequence.
        generic_architecture (bool): Whether to use the generic architecture.
        num_stacks (int): The number of stacks.
        num_blocks (int): The number of blocks per stack.
        num_layers (int): The number of layers per block.
        layer_widths (int): The number of units per layer.
        n_epochs (int): The number of epochs.
        nr_epochs_val_period (int): The number of epochs between each validation set evaluation.
        batch_size (int): The size of the batch.
        model_name (str): The name of the model.

    Returns:
        NBEATSModel: A trained NBEATSModel.
    """
    model_nbeats = NBEATSModel(
        input_chunk_length=input_chunk_length,
        output_chunk_length=output_chunk_length,
        generic_architecture=generic_architecture,
        num_stacks=num_stacks,
        num_blocks=num_blocks,
        num_layers=num_layers,
        layer_widths=layer_widths,
        n_epochs=n_epochs,
        nr_epochs_val_period=nr_epochs_val_period,
        batch_size=batch_size,
        model_name=model_name
    )

    return model_nbeats

def create_nlinear_model(input_chunk_length=24, output_chunk_length=24, n_epochs=2,
                         shared_weights=False, const_init=True, normalize=False,
                         use_static_covariates=True):
    """
    Creates an instance of NLinearModel with the specified parameters.

    Parameters:
    input_chunk_length (int): Length of the input chunks.
    output_chunk_length (int): Length of the output chunks.
    n_epochs (int): Number of epochs for training.
    shared_weights (bool): If True, the model will share weights.
    const_init (bool): If True, the model will have constant initialization.
    normalize (bool): If True, the model will use normalization.
    use_static_covariates (bool): If True, the model will use static covariates.

    Returns:
    model (NLinearModel): An instance of NLinearModel.
    """

    model = NLinearModel(
        input_chunk_length=input_chunk_length,
        output_chunk_length=output_chunk_length,
        n_epochs=n_epochs,
        shared_weights=shared_weights,
        const_init=const_init,
        normalize=normalize,
        use_static_covariates=use_static_covariates,
    )

    return model

def create_dlinear_model(input_chunk_length=24,
                         output_chunk_length=24,
                         n_epochs=2,
                         batch_size=16,
                         shared_weights=False,
                         kernel_size=25,
                         const_init=True,
                         use_static_covariates=True):
    """
    Creates an instance of DLinearModel with the specified parameters.

    Parameters:
    input_chunk_length (int): Length of the input chunks.
    output_chunk_length (int): Length of the output chunks.
    n_epochs (int): Number of epochs for training.
    batch_size (int): Batch size for training.
    shared_weights (bool): If True, the model will share weights.
    kernel_size (int): Size of the kernels in the model.
    const_init (bool): If True, the model will have constant initialization.
    use_static_covariates (bool): If True, the model will use static covariates.

    Returns:
    model (DLinearModel): An instance of DLinearModel.
    """

    model = DLinearModel(
        input_chunk_length=input_chunk_length,
        output_chunk_length=output_chunk_length,
        n_epochs=n_epochs,
        batch_size=batch_size,
        shared_weights=shared_weights,
        kernel_size=kernel_size,
        const_init=const_init,
        use_static_covariates=use_static_covariates
    )

    return model


def create_lightgbm_model(lags=12,
                          output_chunk_length=1,
                          max_depth=3,
                          n_estimators=50,
                          learning_rate=0.1,verbose =-1,use_static_covariates = False,num_leaves=31,lags_past_covariates = None, min_data_in_leaf=20,extra_trees=True):
    """
    Creates an instance of LightGBMModel with the specified parameters.

    Parameters:
    lags (int): Number of lags to use in the model.
    output_chunk_length (int): Length of the output chunks.
    max_depth (int): Maximum depth of the trees.
    n_estimators (int): Number of trees in the model.
    learning_rate (float): Learning rate for the model.

    Returns:
    model (LightGBMModel): An instance of LightGBMModel.
    """

    model = LightGBMModel(
        lags=lags,
        output_chunk_length=output_chunk_length,
        max_depth=max_depth,
        n_estimators=n_estimators,
        learning_rate=learning_rate,
        verbose=verbose,
        num_leaves=num_leaves,
        min_data_in_leaf=min_data_in_leaf,
        extra_trees=extra_trees,
        lags_past_covariates = lags_past_covariates
    )

    return model


# **free_ram_memory function**

In [6]:
import numpy as np
import pandas as pd

def reduce_memory(df, verbose=True):
    """Reduce memory of a DataFrame by downcasting numeric types."""

    # Initial memory
    start_mem = df.memory_usage().sum() / 1024**2

    # Define numeric types and their downcast
    type_mapping = {
        'int': [(np.int8, np.iinfo(np.int8)),
                (np.int16, np.iinfo(np.int16)),
                (np.int32, np.iinfo(np.int32)),
                (np.int64, np.iinfo(np.int64))],

        'float': [(np.float16, np.finfo(np.float16)),
                  (np.float32, np.finfo(np.float32)),
                  (np.float64, np.finfo(np.float64))]
    }

    for col, col_type in df.dtypes.items():
        if col_type in ['int16', 'int32', 'int64']:
            for dtype, type_info in type_mapping['int']:
                if type_info.min <= df[col].min() and df[col].max() <= type_info.max:
                    df[col] = df[col].astype(dtype)
                    break
        elif col_type in ['float16', 'float32', 'float64']:
            for dtype, type_info in type_mapping['float']:
                if type_info.min <= df[col].min() and df[col].max() <= type_info.max:
                    df[col] = df[col].astype(dtype)
                    break

    # Calculate and print memory reduction
    end_mem = df.memory_usage().sum() / 1024**2
    if verbose:
        print(f'Memory usage decreased from {start_mem:.2f}MB to {end_mem:.2f}MB ({100 * (start_mem - end_mem) / start_mem:.1f}% reduction)')

    return df


# M5

In [66]:
# Paths to the datasets
base_path = '/content/drive/MyDrive/Master_Thesis/Datasets/M5-Forecasting-accuracy/'
files = ['sell_prices.csv', 'sales_train_evaluation.csv', 'calendar.csv', 'sample_submission.csv']
sell_price_df, sales_train_evaluation_df, calendar_df, submission_file = [load_dataset(base_path + file) for file in files]


# One-hot encode 'event_name_1' and 'event_name_2'
calendar_df = one_hot_encode_and_join(calendar_df, 'event_name_1', 'event')
calendar_df = one_hot_encode_and_join(calendar_df, 'event_name_2', 'event2')
calendar_df.drop(columns =['event_type_1','event_type_2'],inplace = True)

stores = ['CA_1', 'CA_2', 'CA_3', 'CA_4', 'TX_1', 'TX_2', 'TX_3', 'WI_1', 'WI_2', 'WI_3']
departments = ['HOBBIES_1', 'HOBBIES_2', 'HOUSEHOLD_1', 'HOUSEHOLD_2', 'FOODS_1', 'FOODS_2', 'FOODS_3']

def get_store_dept_combinations(stores, departments):
    for store in stores:
        for department in departments:
            yield store, department

# Create a generator
combinations_generator = get_store_dept_combinations(stores, departments)

"\n# Paths to the datasets\nbase_path = '/content/drive/MyDrive/Master_Thesis/Datasets/M5-Forecasting-accuracy/'\nfiles = ['sell_prices.csv', 'sales_train_evaluation.csv', 'calendar.csv', 'sample_submission.csv']\nsell_price_df, sales_train_evaluation_df, calendar_df, submission_file = [load_dataset(base_path + file) for file in files]\n\n\n# One-hot encode 'event_name_1' and 'event_name_2'\ncalendar_df = one_hot_encode_and_join(calendar_df, 'event_name_1', 'event')\ncalendar_df = one_hot_encode_and_join(calendar_df, 'event_name_2', 'event2')\ncalendar_df.drop(columns =['event_type_1','event_type_2'],inplace = True)\n\nstores = ['CA_1', 'CA_2', 'CA_3', 'CA_4', 'TX_1', 'TX_2', 'TX_3', 'WI_1', 'WI_2', 'WI_3']\ndepartments = ['HOBBIES_1', 'HOBBIES_2', 'HOUSEHOLD_1', 'HOUSEHOLD_2', 'FOODS_1', 'FOODS_2', 'FOODS_3']\n\ndef get_store_dept_combinations(stores, departments):\n    for store in stores:\n        for department in departments:\n            yield store, department\n\n# Create a gene

In [67]:
# Store and Department
store_id, dept_id = next(combinations_generator)
print(f'store_id = \'{store_id}\'')
print(f'dept_id = \'{dept_id}\'')

store_id = 'CA_1'
dept_id = 'FOODS_3'


##**Data Preprocessing**

In [68]:
# filtering sales train and sell prices based on store_id and dept_id
sales_train_filtered = sales_train_evaluation_df.query("store_id == @store_id & dept_id == @dept_id")
sell_prices_filtered = sell_price_df.query("store_id == @store_id & item_id.str.contains(@dept_id)")

# merging calendar with sell prices
merged_price_calendar = sell_prices_filtered.merge(calendar_df, on='wm_yr_wk', how='left')

# melting sales train df
melt_vars = [f'd_{i}' for i in range(1, 1942)]
train_melt_df = sales_train_filtered.melt(id_vars=['id'], value_vars=melt_vars, var_name='d', value_name='value')

# merging melted_train_df with merged price calendar
merged_price_calendar['id'] = merged_price_calendar['item_id'] + '_' + merged_price_calendar['store_id'] + '_evaluation'
final_df = train_melt_df.merge(merged_price_calendar, on=['id', 'd'], how='left')

# Cleaning up the DataFrame
final_df.drop(columns=['store_id', 'item_id', 'weekday', 'wday', 'month', 'year'], inplace=True)
final_df_no_nan = final_df.dropna(subset=['date']).copy()
final_df_no_nan['date'] = pd.to_datetime(final_df_no_nan['date'])
columns_order = ['id', 'value', 'sell_price', 'date'] + [col for col in final_df_no_nan.columns if 'event' in col or 'snap' in col]
formatted_df = final_df_no_nan[columns_order].copy()

# Clearing up unwanted dataframes
del train_melt_df, final_df, final_df_no_nan,merged_price_calendar,sell_prices_filtered,sales_train_filtered

# Reducing memory usage
display(reduce_memory(formatted_df, verbose=True))
dataset_size = formatted_df.memory_usage().sum() / 1024**2
garbage_collected = gc.collect()

Memory usage decreased from 429.75MB to 129.41MB (69.9% reduction)


Unnamed: 0,id,value,sell_price,date,snap_CA,snap_TX,snap_WI,event_Chanukah End,event_Christmas,event_Cinco De Mayo,...,event_StPatricksDay,event_SuperBowl,event_Thanksgiving,event_ValentinesDay,event_VeteransDay,event2_Cinco De Mayo,event2_Easter,event2_Father's day,event2_No_Event,event2_OrthodoxEaster
0,FOODS_3_001_CA_1_evaluation,1,2.279297,2011-01-29,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
4,FOODS_3_005_CA_1_evaluation,1,1.980469,2011-01-29,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
10,FOODS_3_011_CA_1_evaluation,3,1.980469,2011-01-29,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
13,FOODS_3_014_CA_1_evaluation,11,1.980469,2011-01-29,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
14,FOODS_3_015_CA_1_evaluation,5,1.580078,2011-01-29,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1597438,FOODS_3_823_CA_1_evaluation,2,2.980469,2016-05-22,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
1597439,FOODS_3_824_CA_1_evaluation,0,2.480469,2016-05-22,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
1597440,FOODS_3_825_CA_1_evaluation,1,3.980469,2016-05-22,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
1597441,FOODS_3_826_CA_1_evaluation,1,1.280273,2016-05-22,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0


In [69]:
# formatted_df.to_csv(base_path + 'formatted_df.csv') # to download the csv file to the drive

##**TimeSeries Conversion**

In [70]:
# Prepare the data for time series analysis by sorting
sorted_data = formatted_df.copy().sort_values(by='id')

# Initializing the label encoder
label_encoder = LabelEncoder()

# Fit the encoder to the 'id' column and transform it to numeric labels
sorted_data['id_encoded'] = label_encoder.fit_transform(sorted_data['id'])

# Group by 'id' and convert the 'value' column into TimeSeries objects
group_col = ['id_encoded']
date_col = 'date'
value_col = ['value']

# Generate TimeSeries objects for the target variable
grouped_time_series = TimeSeries.from_group_dataframe(
    sorted_data,
    group_cols=group_col,
    time_col=date_col,
    value_cols=value_col
)

# Set the cutoff date for training and validation sets
split_date = pd.Timestamp('2016-04-24')

# Split the data into training and validation sets
train_set = [ts.split_before(split_date)[0] for ts in grouped_time_series]
val_set = [ts.split_after(split_date)[1] for ts in grouped_time_series]

Scale_series = Scaler()
train_series_set = [Scale_series.fit_transform(series) for series in train_set]
val_series_set =[Scale_series.transform(series) for series in val_set]

# Gather additional features for the model
exogenous_features = ['sell_price','snap_CA', 'snap_TX', 'snap_WI'] + \
    [e for e in formatted_df.columns if e.startswith('event')] + \
    [e for e in formatted_df.columns if e.startswith('event2')]

exog_features_data = sorted_data[['id_encoded', 'date'] + exogenous_features].sort_values(by='id_encoded')

# Convert the exogenous features into TimeSeries objects
exog_features_series = TimeSeries.from_group_dataframe(
    exog_features_data,
    group_cols=group_col,
    time_col=date_col,
    value_cols=exogenous_features
)

# Split date for exogenous features
train_exog_cutoff = pd.Timestamp('2016-04-24')
val_exog_cutoff = pd.Timestamp('2016-03-30')

# Split the exogenous features TimeSeries
train_exog_set = [ts.split_before(train_exog_cutoff)[0] for ts in exog_features_series]
val_exog_set = [ts.split_after(val_exog_cutoff)[1] for ts in exog_features_series]

del sorted_data

garbage_collected = gc.collect()

##**Nbeats**

In [71]:
# `run_nbeats` is a function that initializes and returns a model object.
model = create_nbeats_model()

# Then we call the `fit_predict_and_evaluate` function with the appropriate parameters.
results = fit_predict_and_evaluate(
    model=model,
    training_set=train_series_set,
    validation_set=val_series_set,
    n_forecasts=28,
    past_covariates_train=train_exog_set,
    past_covariates_validation=val_exog_set,
    series_list=True,
    model_name='Nbeats'
)

"\n# `run_nbeats` is a function that initializes and returns a model object.\nmodel = create_nbeats_model()\n\n# Then we call the `fit_predict_and_evaluate` function with the appropriate parameters.\nresults = fit_predict_and_evaluate(\n    model=model,\n    training_set=train_series_set,\n    validation_set=val_series_set,\n    n_forecasts=28,\n    past_covariates_train=train_exog_set,\n    past_covariates_validation=val_exog_set,\n    series_list=True,\n    model_name='Nbeats'\n)\n"

##**Lightgbm**

In [72]:
'''create_lightgbm_model(lags=12, output_chunk_length=1, max_depth=3,
                       n_estimators=50, learning_rate=0.1, verbose=-1,num_leaves=31,
                       lags_past_covariates = None, min_data_in_leaf=20,extra_trees=True
'''
model_lgbm = create_lightgbm_model(lags_past_covariates = 7,output_chunk_length=2)


results = fit_predict_and_evaluate(
    model=model_lgbm,
    training_set=train_series_set,
    validation_set=val_series_set,
    n_forecasts=28,
    past_covariates_train=train_exog_set,
    past_covariates_validation=val_exog_set,
    series_list=True,
    model_name='LightGBM'
)



-----------------------------------------
MAE (LightGBM) = 0.12
sMAPE (LightGBM) = 115.71
RMSE (LightGBM) = 0.14
Training Time (LightGBM) = 72.68sec
Forecast Time (LightGBM) = 4.55sec
Total Time (LightGBM) = 77.23sec
-----------------------------------------


In [75]:
# if smape is to be calculated manually
'''
predicted_series_set = results['forecast']
rescaled_val_series_set = [Scale_series.inverse_transform(series) for series in val_series_set]
rescaled_predicted_series_set = [Scale_series.inverse_transform(series) for series in predicted_series_set]

smape(rescaled_val_series_set, rescaled_predicted_series_set, inter_reduction=np.mean)
'''




## **Results Comparision for different subsets of dataset**

In [74]:
metrics_df = add_metrics_to_df(
    metrics_df,
    store_id = store_id,
    dept_id = dept_id,
    model_name = 'LightGBM',
    mae = results['MAE'],
    smape = results['sMAPE'],
    rmse = results['RMSE'],
    training_time = results['Training Time'],
    forecast_time = results['Forecast Time'],
    total_time = results['Total Time'],
    dataset_size = dataset_size
)

display(metrics_df)

Unnamed: 0,Store ID,Dept ID,Model Name,MAE,sMAPE,RMSE,Training Time(seconds),Forecast Time(seconds),Total Time(seconds),Dataset Size(MB)
0,CA_1,HOBBIES_1,LightGBM,0.107672,148.803904,0.12528,33.775537,5.871138,39.646674,67.065893
0,CA_1,HOBBIES_2,LightGBM,0.044555,183.207458,0.061002,10.212956,1.16326,11.376216,22.5633
0,CA_1,HOUSEHOLD_1,LightGBM,0.11863,118.868008,0.14095,38.375891,3.119954,41.495845,81.064966
0,CA_1,HOUSEHOLD_2,LightGBM,0.172562,179.098179,0.242488,38.158104,2.980633,41.138738,83.849216
0,CA_1,FOODS_1,LightGBM,0.088543,125.784571,0.113485,17.765503,1.661181,19.426684,35.845013
0,CA_1,FOODS_2,LightGBM,0.101978,128.406439,0.117915,38.375881,2.375662,40.751543,62.293983
0,CA_1,FOODS_3,LightGBM,0.122019,115.705377,0.141701,72.678828,4.547172,77.225999,129.41293


#**Cross Validation**

In [33]:
from sklearn.model_selection._split import _BaseKFold, indexable, _num_samples
from sklearn.utils.validation import _deprecate_positional_args
#(n_splits=(default 5), max_train_size = (default = None))
# https://github.com/getgaurav2/scikit-learn/blob/d4a3af5cc9da3a76f0266932644b884c99724c57/sklearn/model_selection/_split.py#L2243
class GroupTimeSeriesSplit(_BaseKFold):
    """Time Series cross-validator variant with non-overlapping groups.
    Provides train/test indices to split time series data samples
    that are observed at fixed time intervals according to a
    third-party provided group.
    In each split, test indices must be higher than before, and thus shuffling
    in cross validator is inappropriate.
    This cross-validation object is a variation of :class:`KFold`.
    In the kth split, it returns first k folds as train set and the
    (k+1)th fold as test set.
    The same group will not appear in two different folds (the number of
    distinct groups has to be at least equal to the number of folds).
    Note that unlike standard cross-validation methods, successive
    training sets are supersets of those that come before them.
    Read more in the :ref:`User Guide <cross_validation>`.
    Parameters
    ----------
    n_splits : int, default=5
        Number of splits. Must be at least 2.
    max_train_size : int, default=None
        Maximum size for a single training set.
    Examples
    --------
    >>> import numpy as np
    >>> from sklearn.model_selection import GroupTimeSeriesSplit
    >>> groups = np.array(['a', 'a', 'a', 'a', 'a', 'a',\
                           'b', 'b', 'b', 'b', 'b',\
                           'c', 'c', 'c', 'c',\
                           'd', 'd', 'd'])
    >>> gtss = GroupTimeSeriesSplit(n_splits=3)
    >>> for train_idx, test_idx in gtss.split(groups, groups=groups):
    ...     print("TRAIN:", train_idx, "TEST:", test_idx)
    ...     print("TRAIN GROUP:", groups[train_idx],\
                  "TEST GROUP:", groups[test_idx])
    TRAIN: [0, 1, 2, 3, 4, 5] TEST: [6, 7, 8, 9, 10]
    TRAIN GROUP: ['a' 'a' 'a' 'a' 'a' 'a']\
    TEST GROUP: ['b' 'b' 'b' 'b' 'b']
    TRAIN: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] TEST: [11, 12, 13, 14]
    TRAIN GROUP: ['a' 'a' 'a' 'a' 'a' 'a' 'b' 'b' 'b' 'b' 'b']\
    TEST GROUP: ['c' 'c' 'c' 'c']
    TRAIN: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]\
    TEST: [15, 16, 17]
    TRAIN GROUP: ['a' 'a' 'a' 'a' 'a' 'a' 'b' 'b' 'b' 'b' 'b' 'c' 'c' 'c' 'c']\
    TEST GROUP: ['d' 'd' 'd']
    """
    @_deprecate_positional_args
    def __init__(self,
                 n_splits=5,
                 *,
                 max_train_size=None
                 ):
        super().__init__(n_splits, shuffle=False, random_state=None)
        self.max_train_size = max_train_size

    def split(self, X, y=None, groups=None):
        """Generate indices to split data into training and test set.
        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Training data, where n_samples is the number of samples
            and n_features is the number of features.
        y : array-like of shape (n_samples,)
            Always ignored, exists for compatibility.
        groups : array-like of shape (n_samples,)
            Group labels for the samples used while splitting the dataset into
            train/test set.
        Yields
        ------
        train : ndarray
            The training set indices for that split.
        test : ndarray
            The testing set indices for that split.
        """
        if groups is None:
            raise ValueError(
                "The 'groups' parameter should not be None")
        X, y, groups = indexable(X, y, groups)
        n_samples = _num_samples(X)
        n_splits = self.n_splits
        n_folds = n_splits + 1
        group_dict = {}
        u, ind = np.unique(groups, return_index=True)
        unique_groups = u[np.argsort(ind)]
        n_samples = _num_samples(X)
        n_groups = _num_samples(unique_groups)
        for idx in np.arange(n_samples):
            if (groups[idx] in group_dict):
                group_dict[groups[idx]].append(idx)
            else:
                group_dict[groups[idx]] = [idx]
        if n_folds > n_groups:
            raise ValueError(
                ("Cannot have number of folds={0} greater than"
                 " the number of groups={1}").format(n_folds,
                                                     n_groups))
        group_test_size = n_groups // n_folds
        group_test_starts = range(n_groups - n_splits * group_test_size,
                                  n_groups, group_test_size)
        for group_test_start in group_test_starts:
            train_array = []
            test_array = []
            for train_group_idx in unique_groups[:group_test_start]:
                train_array_tmp = group_dict[train_group_idx]
                train_array = np.sort(np.unique(
                                      np.concatenate((train_array,
                                                      train_array_tmp)),
                                      axis=None), axis=None)
            train_end = train_array.size
            if self.max_train_size and self.max_train_size < train_end:
                train_array = train_array[train_end -
                                          self.max_train_size:train_end]
            for test_group_idx in unique_groups[group_test_start:
                                                group_test_start +
                                                group_test_size]:
                test_array_tmp = group_dict[test_group_idx]
                test_array = np.sort(np.unique(
                                              np.concatenate((test_array,
                                                              test_array_tmp)),
                                     axis=None), axis=None)
            yield [int(i) for i in train_array], [int(i) for i in test_array]



class PurgedGroupTimeSeriesSplit(_BaseKFold):
    """Time Series cross-validator variant with non-overlapping groups.
    Allows for a gap in groups to avoid potentially leaking info from
    train into test if the model has windowed or lag features.
    Provides train/test indices to split time series data samples
    that are observed at fixed time intervals according to a
    third-party provided group.
    In each split, test indices must be higher than before, and thus shuffling
    in cross validator is inappropriate.
    This cross-validation object is a variation of :class:`KFold`.
    In the kth split, it returns first k folds as train set and the
    (k+1)th fold as test set.
    The same group will not appear in two different folds (the number of
    distinct groups has to be at least equal to the number of folds).
    Note that unlike standard cross-validation methods, successive
    training sets are supersets of those that come before them.
    Read more in the :ref:`User Guide <cross_validation>`.
    Parameters
    ----------
    n_splits : int, default=5
        Number of splits. Must be at least 2.
    max_train_group_size : int, default=Inf
        Maximum group size for a single training set.
    group_gap : int, default=None
        Gap between train and test
    max_test_group_size : int, default=Inf
        We discard this number of groups from the end of each train split
    """

    @_deprecate_positional_args
    def __init__(self,
                 n_splits=5,
                 *,
                 max_train_group_size=np.inf,
                 max_test_group_size=np.inf,
                 group_gap=None,
                 verbose=False
                 ):
        super().__init__(n_splits, shuffle=False, random_state=None)
        self.max_train_group_size = max_train_group_size
        self.group_gap = group_gap
        self.max_test_group_size = max_test_group_size
        self.verbose = verbose

    def split(self, X, y=None, groups=None):
        """Generate indices to split data into training and test set.
        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Training data, where n_samples is the number of samples
            and n_features is the number of features.
        y : array-like of shape (n_samples,)
            Always ignored, exists for compatibility.
        groups : array-like of shape (n_samples,)
            Group labels for the samples used while splitting the dataset into
            train/test set.
        Yields
        ------
        train : ndarray
            The training set indices for that split.
        test : ndarray
            The testing set indices for that split.
        """
        if groups is None:
            raise ValueError(
                "The 'groups' parameter should not be None")
        X, y, groups = indexable(X, y, groups)
        n_samples = _num_samples(X)
        n_splits = self.n_splits
        group_gap = self.group_gap
        max_test_group_size = self.max_test_group_size
        max_train_group_size = self.max_train_group_size
        n_folds = n_splits + 1
        group_dict = {}
        u, ind = np.unique(groups, return_index=True)
        unique_groups = u[np.argsort(ind)]
        n_samples = _num_samples(X)
        n_groups = _num_samples(unique_groups)
        for idx in np.arange(n_samples):
            if (groups[idx] in group_dict):
                group_dict[groups[idx]].append(idx)
            else:
                group_dict[groups[idx]] = [idx]
        if n_folds > n_groups:
            raise ValueError(
                ("Cannot have number of folds={0} greater than"
                 " the number of groups={1}").format(n_folds,
                                                     n_groups))

        group_test_size = min(n_groups // n_folds, max_test_group_size)
        group_test_starts = range(n_groups - n_splits * group_test_size,
                                  n_groups, group_test_size)
        for group_test_start in group_test_starts:
            train_array = []
            test_array = []

            group_st = max(0, group_test_start - group_gap - max_train_group_size)
            for train_group_idx in unique_groups[group_st:(group_test_start - group_gap)]:
                train_array_tmp = group_dict[train_group_idx]

                train_array = np.sort(np.unique(
                                      np.concatenate((train_array,
                                                      train_array_tmp)),
                                      axis=None), axis=None)

            train_end = train_array.size

            for test_group_idx in unique_groups[group_test_start:
                                                group_test_start +
                                                group_test_size]:
                test_array_tmp = group_dict[test_group_idx]
                test_array = np.sort(np.unique(
                                              np.concatenate((test_array,
                                                              test_array_tmp)),
                                     axis=None), axis=None)

            test_array  = test_array[group_gap:]


            if self.verbose > 0:
                    pass

            yield [int(i) for i in train_array], [int(i) for i in test_array]



In [56]:
df_cv = formatted_df.copy()

# Convert 'date' column to datetime
df_cv['date'] = pd.to_datetime(df_cv['date'])

# Create sub-group identifiers combining 'id' and the month
df_cv['sub_group'] = df_cv['id'] + '_' + df_cv['date'].dt.to_period('M').astype(str)

X = df_cv[['id', 'date', 'value']]  # Features including 'id' for grouping
groups = df_cv['sub_group']         # Sub-group as the group labels


## **Group Cross Validation**

In [57]:
train_data_splits = []
test_data_splits = []
train_exog_splits = []
test_exog_splits = []

n_splits = 3

exogenous_features = ['sell_price', 'snap_CA', 'snap_TX', 'snap_WI'] + \
    [e for e in df_cv.columns if e.startswith('event')] + \
    [e for e in df_cv.columns if e.startswith('event2')]


# Performing cross-validation for each group with sub-groups
gtss = GroupTimeSeriesSplit(n_splits=n_splits)

unique_groups = df_cv['id'].unique()
for group in unique_groups:

    group_data = df_cv[df_cv['id'] == group].reset_index(drop=True)

    X_group = group_data  # Group-specific feature
    sub_groups = group_data['sub_group']     # Sub-groups

    for train_idx, test_idx in gtss.split(X_group, groups=sub_groups):

        test_exog_idx = extend_index_with_past(24,train_idx, test_idx, X_group)

        X_train, X_test, X_test_exog = X_group.iloc[train_idx], X_group.iloc[test_idx],X_group.iloc[test_exog_idx]

        X_train_ts, X_test_ts = TimeSeries.from_dataframe(X_train, time_col='date', value_cols='value'),TimeSeries.from_dataframe(X_test, time_col='date', value_cols='value')
        exog_train_ts, exog_test_ts = TimeSeries.from_dataframe(X_train, time_col='date', value_cols=exogenous_features),TimeSeries.from_dataframe(X_test_exog, time_col='date', value_cols=exogenous_features)

        train_data_splits.append(X_train_ts)
        test_data_splits.append(X_test_ts)
        train_exog_splits.append(exog_train_ts)
        test_exog_splits.append(exog_test_ts)

# Splitting the train and test data splits into lists, one for each split
train_splits = [train_data_splits[i::n_splits] for i in range(n_splits)]
test_splits = [test_data_splits[i::n_splits] for i in range(n_splits)]
exog_train_splits = [train_exog_splits[i::n_splits] for i in range(n_splits)]
exog_test_splits = [test_exog_splits[i::n_splits] for i in range(n_splits)]

del train_data_splits, test_data_splits, train_exog_splits, test_exog_splits

# LightGBM

model_lgbm = create_lightgbm_model(lags_past_covariates = 2)

for train_split, test_split, exog_train, exog_test in zip(train_splits, test_splits, exog_train_splits, exog_test_splits ):

  results = fit_predict_and_evaluate(
      model=model_lgbm,
      training_set=train_split,
      validation_set=test_split,
      n_forecasts=28,
      past_covariates_train=exog_train,
      past_covariates_validation=exog_test,
      series_list=True,
      model_name='LightGBM'
  )

# Nbeats

model = create_nbeats_model()

for train_split, test_split, exog_train, exog_test in zip(train_splits, test_splits, exog_train_splits, exog_test_splits ):


  results = fit_predict_and_evaluate(
  model=model,
  training_set=train_split,
  validation_set=test_split,
  n_forecasts=48,
  past_covariates_train=exog_train,
  past_covariates_validation=exog_test,
  series_list=True,
  model_name='Nbeats'
)


-----------------------------------------
MAE (LightGBM) = 1.54
sMAPE (LightGBM) = 128.75
RMSE (LightGBM) = 1.98
Training Time (LightGBM) = 1.20sec
Forecast Time (LightGBM) = 1.37sec
Total Time (LightGBM) = 2.57sec
-----------------------------------------
-----------------------------------------
MAE (LightGBM) = 1.41
sMAPE (LightGBM) = 133.14
RMSE (LightGBM) = 1.81
Training Time (LightGBM) = 1.67sec
Forecast Time (LightGBM) = 0.96sec
Total Time (LightGBM) = 2.63sec
-----------------------------------------
-----------------------------------------
MAE (LightGBM) = 1.41
sMAPE (LightGBM) = 126.35
RMSE (LightGBM) = 1.86
Training Time (LightGBM) = 2.33sec
Forecast Time (LightGBM) = 0.93sec
Total Time (LightGBM) = 3.26sec
-----------------------------------------


INFO:pytorch_lightning.utilities.rank_zero:GPU available: False, used: False
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.callbacks.model_summary:
  | Name          | Type             | Params
---------------------------------------------------
0 | criterion     | MSELoss          | 0     
1 | train_metrics | MetricCollection | 0     
2 | val_metrics   | MetricCollection | 0     
3 | stacks        | ModuleList       | 250 K 
---------------------------------------------------
244 K     Trainable params
6.7 K     Non-trainable params
250 K     Total params
1.004     Total estimated model params size (MB)


Training: |          | 0/? [00:00<?, ?it/s]

/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/call.py:54: Detected KeyboardInterrupt, attempting graceful shutdown...
INFO:pytorch_lightning.utilities.rank_zero:GPU available: False, used: False
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs


Predicting: |          | 0/? [00:00<?, ?it/s]

KeyboardInterrupt: ignored

## **Purged group cross validation**

In [58]:
train_data_splits = []
test_data_splits = []
train_exog_splits = []
test_exog_splits = []

n_splits = 3

exogenous_features = ['sell_price', 'snap_CA', 'snap_TX', 'snap_WI'] + \
    [e for e in df_cv.columns if e.startswith('event')] + \
    [e for e in df_cv.columns if e.startswith('event2')]


# Performing cross-validation for each group with sub-groups
pgtss = PurgedGroupTimeSeriesSplit(n_splits=n_splits, group_gap=1,max_train_group_size=15,max_test_group_size=5)

unique_groups = df_cv['id'].unique()
for group in unique_groups:

    group_data = df_cv[df_cv['id'] == group].reset_index(drop=True)

    X_group = group_data  # Group-specific feature
    sub_groups = group_data['sub_group']     # Sub-groups

    for train_idx, test_idx in pgtss.split(X_group, groups=sub_groups):

        test_exog_idx = extend_index_with_past(24, train_idx, test_idx, X_group)

        X_train, X_test, X_test_exog = X_group.iloc[train_idx], X_group.iloc[test_idx],X_group.iloc[test_exog_idx]

        X_train_ts, X_test_ts = TimeSeries.from_dataframe(X_train, time_col='date', value_cols='value'),TimeSeries.from_dataframe(X_test, time_col='date', value_cols='value')
        exog_train_ts, exog_test_ts = TimeSeries.from_dataframe(X_train, time_col='date', value_cols=exogenous_features),TimeSeries.from_dataframe(X_test_exog, time_col='date', value_cols=exogenous_features)

        train_data_splits.append(X_train_ts)
        test_data_splits.append(X_test_ts)
        train_exog_splits.append(exog_train_ts)
        test_exog_splits.append(exog_test_ts)

# Splitting the train and test data splits into lists, one for each split
train_splits = [train_data_splits[i::n_splits] for i in range(n_splits)]
test_splits = [test_data_splits[i::n_splits] for i in range(n_splits)]
exog_train_splits = [train_exog_splits[i::n_splits] for i in range(n_splits)]
exog_test_splits = [test_exog_splits[i::n_splits] for i in range(n_splits)]

del train_data_splits, test_data_splits,  train_exog_splits, test_exog_splits

# LightGBM

model_lgbm = create_lightgbm_model(lags_past_covariates = 2)

for train_split, test_split, exog_train, exog_test in zip(train_splits, test_splits, exog_train_splits, exog_test_splits ):

  results = fit_predict_and_evaluate(
      model=model_lgbm,
      training_set=train_split,
      validation_set=test_split,
      n_forecasts=35,
      past_covariates_train=exog_train,
      past_covariates_validation=exog_test,
      series_list=True,
      model_name='LightGBM'
  )

# Nbeats

model = create_nbeats_model()

for train_split, test_split, exog_train, exog_test in zip(train_splits, test_splits, exog_train_splits, exog_test_splits ):


  results = fit_predict_and_evaluate(
  model=model,
  training_set=train_split,
  validation_set=test_split,
  n_forecasts=48,
  past_covariates_train=exog_train,
  past_covariates_validation=exog_test,
  series_list=True,
  model_name='Nbeats'
)


-----------------------------------------
MAE (LightGBM) = 1.53
sMAPE (LightGBM) = 132.09
RMSE (LightGBM) = 1.81
Training Time (LightGBM) = 0.92sec
Forecast Time (LightGBM) = 0.96sec
Total Time (LightGBM) = 1.88sec
-----------------------------------------
-----------------------------------------
MAE (LightGBM) = 1.42
sMAPE (LightGBM) = 128.09
RMSE (LightGBM) = 1.62
Training Time (LightGBM) = 0.93sec
Forecast Time (LightGBM) = 0.93sec
Total Time (LightGBM) = 1.85sec
-----------------------------------------
-----------------------------------------
MAE (LightGBM) = 1.32
sMAPE (LightGBM) = 139.26
RMSE (LightGBM) = 1.48
Training Time (LightGBM) = 1.33sec
Forecast Time (LightGBM) = 1.28sec
Total Time (LightGBM) = 2.61sec
-----------------------------------------


"\nmodel = create_nbeats_model()\n\nfor train_split, test_split, exog_train, exog_test in zip(train_splits, test_splits, exog_train_splits, exog_test_splits ):\n\n\n  results = fit_predict_and_evaluate(\n  model=model,\n  training_set=train_split,\n  validation_set=test_split,\n  n_forecasts=48,\n  past_covariates_train=exog_train,\n  past_covariates_validation=exog_test,\n  series_list=True,\n  model_name='Nbeats'\n)\n"

#**Optuna Hyper parameter optimization**

In [87]:
def objective(trial, training_data, validation_data, model_type, past_cov_train, past_cov_validation,model_name):
    """
    Optuna objective function for hyperparameter optimization.
    This function is to find the best parameters for a model
    to minimize the Symmetric Mean Absolute Percentage Error (sMAPE)
    between predicted and actual values in a validation dataset.
    """
    # Define hyperparameters


    n_epochs = trial.suggest_int('n_epochs', 1, 10)
    batch_size = trial.suggest_categorical('batch_size', [8, 16])

    # Additional hyperparameters for specific models
    if model_type == 'NBEATS':
        output_chunk_length = trial.suggest_categorical('output_chunk_length', [24, 48])
        num_stacks = trial.suggest_int('num_stacks', 1, 3)
        num_blocks = trial.suggest_int('num_blocks', 1, 3)
        num_layers = trial.suggest_int('num_layers', 1, 3)
        layer_widths = trial.suggest_categorical('layer_widths', [16, 32, 64])
        input_chunk_length = trial.suggest_categorical('input_chunk_length', [24, 48])
        dropout = trial.suggest_float('dropout', 0.0, 0.2)
        activation = trial.suggest_categorical('activation', ['ReLU', 'RReLU', 'LeakyReLU'])
        model = create_nbeats_model(
            input_chunk_length=input_chunk_length,
            output_chunk_length=output_chunk_length,
            generic_architecture=True,
            num_stacks=num_stacks,
            num_blocks=num_blocks,
            num_layers=num_layers,
            layer_widths=layer_widths,
            n_epochs=n_epochs,
            nr_epochs_val_period=1,
            batch_size=batch_size,
            model_name="NBEATS"
        )
    elif model_type == 'LightGBM':
        #output_chunk_length = trial.suggest_int('output_chunk_length', 1,24)
        output_chunk_length = trial.suggest_int('output_chunk_length', 1,10)
        lags_past_covariates = trial.suggest_int('lags_past_covariates', 1, 24)
        lags = trial.suggest_int('lags', 1, 24)
        max_depth = trial.suggest_int('max_depth', 2, 10)
        n_estimators = trial.suggest_int('n_estimators', 10, 100)
        learning_rate = trial.suggest_float('learning_rate', 0.01, 0.3)
        num_leaves = trial.suggest_int('num_leaves', 20, 60)
        extra_trees = trial.suggest_categorical('extra_trees', [True, False])

        model = create_lightgbm_model(
            lags_past_covariates=lags_past_covariates,
            output_chunk_length=output_chunk_length,
            lags=lags,
            max_depth=max_depth,
            n_estimators=n_estimators,
            learning_rate=learning_rate,
            verbose=-1,
            num_leaves=num_leaves,
            extra_trees=extra_trees
        )
    else:
        raise ValueError("Invalid model type")

    # Predict and calculate sMAPE
    forecast = fit_predict_and_evaluate(
        model=model,
        training_set=training_data,
        validation_set=validation_data,
        n_forecasts=28,
        past_covariates_train=past_cov_train,
        past_covariates_validation=past_cov_validation,
        series_list=True,
        model_name = model_name
    )
    smape_result = forecast['sMAPE']
    mae_result = forecast['MAE']
    rmse_result = forecast['RMSE']
    combined_metric = 0.2 * mae_result + 0.6 * smape_result + 0.2 * rmse_result

    return combined_metric

In [2]:
study_LGBM = optuna.create_study(direction='minimize')
study_LGBM.optimize(lambda trial: objective(trial, training_data = train_series_set, validation_data = val_series_set, past_cov_train= train_exog_set, past_cov_validation = val_exog_set ,model_name = 'LightGBM',model_type = 'LightGBM'), n_trials=10)

print('Best Params for LightGBM:')
trial_LGBM = study_LGBM.best_trial
print(trial_LGBM.params)

NameError: ignored