<a href="https://colab.research.google.com/github/azhgh22/Walmart-Recruiting-Store-Sales-Forecasting/blob/main/notebooks/group_stat_with_deep_learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Notebook Setup

The following setup is provided as a basic example for initializing the notebook environment. It includes necessary imports, optional configuration, and a placeholder for data loading or downloading.

This section is **not part of the core model logic**, and the code here may vary depending on your environment or data access method.

## Setup Environment


In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
from google.colab import userdata
token = userdata.get('GITHUB_TOKEN')
user_name = userdata.get('GITHUB_USERNAME')
mail = userdata.get('GITHUB_MAIL')

!git config --global user.name "{user_name}"
!git config --global user.email "{mail}"
!git clone https://{token}@github.com/azhgh22/Walmart-Recruiting-Store-Sales-Forecasting.git

%cd Walmart-Recruiting-Store-Sales-Forecasting

Cloning into 'Walmart-Recruiting-Store-Sales-Forecasting'...
remote: Enumerating objects: 484, done.
remote: Counting objects: 100% (80/80), done.
remote: Compressing objects: 100% (67/67), done.
remote: Total 484 (delta 45), reused 21 (delta 13), pack-reused 404 (from 1)
Receiving objects: 100% (484/484), 9.23 MiB | 13.42 MiB/s, done.
Resolving deltas: 100% (263/263), done.
/content/Walmart-Recruiting-Store-Sales-Forecasting


In [3]:
%%capture
!pip install -r requirements.txt

In [4]:
from google.colab import userdata
kaggle_json_path = userdata.get('KAGGLE_JSON_PATH')
! ./src/data_loader.sh -f {kaggle_json_path}

Setting up Kaggle credentials...
Ensuring data directory exists at 'data/'...
Downloading data from Kaggle for competition: 'walmart-recruiting-store-sales-forecasting'...
Downloading walmart-recruiting-store-sales-forecasting.zip to data
  0% 0.00/2.70M [00:00<?, ?B/s]
100% 2.70M/2.70M [00:00<00:00, 920MB/s]
Unzipping files...
Archive:  walmart-recruiting-store-sales-forecasting.zip
  inflating: features.csv.zip        
  inflating: sampleSubmission.csv.zip  
  inflating: stores.csv              
  inflating: test.csv.zip            
  inflating: train.csv.zip           
Archive:  features.csv.zip
  inflating: features.csv            
Archive:  sampleSubmission.csv.zip
  inflating: sampleSubmission.csv    
Archive:  test.csv.zip
  inflating: test.csv                
Archive:  train.csv.zip
  inflating: train.csv               
Data downloaded and extracted successfully to 'data/'.


In [5]:
from google.colab import userdata
wandb_login = userdata.get('WANDB_API_LOGIN')
!wandb login {wandb_login}

[34m[1mwandb[0m: No netrc file found, creating one.
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc
[34m[1mwandb[0m: W&B API key is configured. Use [1m`wandb login --relogin`[0m to force relogin


## Load and Split Data

In [33]:
from src import data_loader, processing
import importlib
importlib.reload(processing)

dataframes = data_loader.load_raw_data()
df = processing.run_preprocessing(dataframes, process_test=False)['train']
X_train, y_train, X_valid, y_valid = processing.split_data_by_ratio(df, separate_target=True)

print(f"Shapes of X_train and y_train: {X_train.shape}, {y_train.shape}")
print(f"Shapes of X_valid and y_valid: {X_valid.shape}, {y_valid.shape}")

Data loading complete.
Shapes of X_train and y_train: (337256, 15), (337256,)
Shapes of X_valid and y_valid: (84314, 15), (84314,)


In [34]:
X_train

Unnamed: 0,Store,Dept,Date,IsHoliday,Temperature,Fuel_Price,MarkDown1,MarkDown2,MarkDown3,MarkDown4,MarkDown5,CPI,Unemployment,Type,Size
0,1,1,2010-02-05,False,42.31,2.572,,,,,,211.096358,8.106,A,151315
1,1,2,2010-02-05,False,42.31,2.572,,,,,,211.096358,8.106,A,151315
2,1,3,2010-02-05,False,42.31,2.572,,,,,,211.096358,8.106,A,151315
3,1,4,2010-02-05,False,42.31,2.572,,,,,,211.096358,8.106,A,151315
4,1,5,2010-02-05,False,42.31,2.572,,,,,,211.096358,8.106,A,151315
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
337251,22,27,2012-04-13,False,49.89,4.025,5981.5,10877.85,9.5,1633.96,1932.86,141.843393,7.671,B,119557
337252,22,28,2012-04-13,False,49.89,4.025,5981.5,10877.85,9.5,1633.96,1932.86,141.843393,7.671,B,119557
337253,22,29,2012-04-13,False,49.89,4.025,5981.5,10877.85,9.5,1633.96,1932.86,141.843393,7.671,B,119557
337254,22,30,2012-04-13,False,49.89,4.025,5981.5,10877.85,9.5,1633.96,1932.86,141.843393,7.671,B,119557


In [35]:
X_valid

Unnamed: 0,Store,Dept,Date,IsHoliday,Temperature,Fuel_Price,MarkDown1,MarkDown2,MarkDown3,MarkDown4,MarkDown5,CPI,Unemployment,Type,Size
337256,22,32,2012-04-13,False,49.89,4.025,5981.50,10877.85,9.5,1633.96,1932.86,141.843393,7.671,B,119557
337257,22,33,2012-04-13,False,49.89,4.025,5981.50,10877.85,9.5,1633.96,1932.86,141.843393,7.671,B,119557
337258,22,34,2012-04-13,False,49.89,4.025,5981.50,10877.85,9.5,1633.96,1932.86,141.843393,7.671,B,119557
337259,22,35,2012-04-13,False,49.89,4.025,5981.50,10877.85,9.5,1633.96,1932.86,141.843393,7.671,B,119557
337260,22,36,2012-04-13,False,49.89,4.025,5981.50,10877.85,9.5,1633.96,1932.86,141.843393,7.671,B,119557
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
421565,45,93,2012-10-26,False,58.85,3.882,4018.91,58.08,100.0,211.94,858.33,192.308899,8.667,B,118221
421566,45,94,2012-10-26,False,58.85,3.882,4018.91,58.08,100.0,211.94,858.33,192.308899,8.667,B,118221
421567,45,95,2012-10-26,False,58.85,3.882,4018.91,58.08,100.0,211.94,858.33,192.308899,8.667,B,118221
421568,45,97,2012-10-26,False,58.85,3.882,4018.91,58.08,100.0,211.94,858.33,192.308899,8.667,B,118221


# Improve GroupStat (without DeepLearning)

First, we will focus on improving our **GroupStat** model as much as possible. Specifically, this involves **modifying the feature engineering techniques** used for each group.

We begin by adding the feature `week_of_year_avg`, created using `time_features.WeeklyStoreDept()`. This is a valuable feature because it captures the **average sales trend for each week of the year** at the store–department level. Given that our data exhibits strong **annual seasonality**, this feature helps the model learn recurring yearly patterns more effectively.
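To make the idea concrete, here is a minimal sketch of how a week-of-year average feature could be computed. It is not the repository's `time_features.WeeklyStoreDept`; the class name `WeekOfYearAverage`, its parameters, and the global-mean fallback for unseen groups are assumptions made for illustration. The averages are learned in `fit` from training rows only, so validation targets never leak into the feature.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class WeekOfYearAverage(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for WeeklyStoreDept (illustration only)."""

    def __init__(self, group_cols=("Store", "Dept"), date_col="Date"):
        self.group_cols = group_cols
        self.date_col = date_col

    def _with_week(self, X):
        # Copy the frame and add the ISO week number (1..53) of each date.
        out = X.copy()
        dates = pd.to_datetime(out[self.date_col])
        out["week_of_year"] = dates.dt.isocalendar().week.astype(int)
        return out

    def fit(self, X, y):
        df = self._with_week(X)
        df["target"] = pd.Series(y).to_numpy()
        keys = list(self.group_cols) + ["week_of_year"]
        # Mean target per (group, week of year), learned from training data.
        self.table_ = (
            df.groupby(keys, as_index=False)["target"]
            .mean()
            .rename(columns={"target": "week_of_year_avg"})
        )
        # Assumed fallback for groups or weeks never seen during fit.
        self.global_mean_ = float(df["target"].mean())
        return self

    def transform(self, X):
        df = self._with_week(X)
        keys = list(self.group_cols) + ["week_of_year"]
        out = df.merge(self.table_, on=keys, how="left")
        out["week_of_year_avg"] = out["week_of_year_avg"].fillna(self.global_mean_)
        out.index = X.index  # a left merge keeps row order, so restore the index
        return out


train = pd.DataFrame({
    "Store": [1, 1, 1],
    "Dept": [1, 1, 1],
    "Date": ["2010-02-05", "2011-02-04", "2010-02-12"],
})
sales = [100.0, 200.0, 50.0]
valid = pd.DataFrame({
    "Store": [1, 2],
    "Dept": [1, 1],
    "Date": ["2012-02-03", "2012-02-03"],
})

features = WeekOfYearAverage().fit(train, sales).transform(valid)
print(features[["Store", "Dept", "week_of_year", "week_of_year_avg"]])
```

In this toy data the first validation row falls in the same ISO week as two training rows of the same store and department, so it receives their mean, while the second row belongs to an unseen store and falls back to the global mean.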

In [None]:
from sklearn.pipeline import Pipeline
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
from feature_engineering import feature_transformers, time_features
from models.group_stat import GroupStatModel

import importlib
importlib.reload(time_features)


columns_to_drop=['MarkDown1', 'MarkDown2', 'MarkDown3', 'MarkDown4', 'MarkDown5']

store_pipeline = Pipeline([
    ('feature_adder', time_features.FeatureAdder(
        add_week_num=True,
        add_holiday_flags=True,
        add_holiday_proximity=True,
        add_holiday_windows=True,
        add_fourier_features=True,
        add_month_and_year=True,
        replace_time_index=True
    )),
    ('object_to_cat', feature_transformers.ObjectToCategory()),
    ('group_stat', feature_transformers.GroupStatFeatureAdder(groupby_cols='Store')),
    ('make_cat', feature_transformers.MakeCategorical(['Store'])),
    ('drop_markdowns', feature_transformers.ChangeColumns(columns_to_drop=columns_to_drop)),
    ('model', XGBRegressor(
        objective='reg:squarederror',
        enable_categorical=True,
        random_state=42,
        n_estimators=200,
        learning_rate=0.1,
        max_depth=7,
        subsample=0.6,
        colsample_bytree=1.0,
        min_child_weight=5
    ))
])

dept_pipeline = Pipeline([
    ('feature_adder', time_features.FeatureAdder(
        add_week_num=True,
        add_holiday_flags=True,
        add_holiday_proximity=True,
        add_holiday_windows=True,
        add_fourier_features=True,
        add_month_and_year=True,
        replace_time_index=True
    )),
    ('group_stat', feature_transformers.GroupStatFeatureAdder(groupby_cols='Dept')),
    ('object_to_cat', feature_transformers.ObjectToCategory()),
    ('make_cat', feature_transformers.MakeCategorical(['Dept'])),
    ('model', XGBRegressor(
        objective='reg:squarederror',
        enable_categorical=True,
        random_state=42,
        n_estimators=300,
        learning_rate=0.1,
        max_depth=7,
        subsample=1.0,
        colsample_bytree=0.5,
        min_child_weight=1
    ))
])

global_pipeline = Pipeline([
    ('feature_adder', time_features.FeatureAdder(
        add_week_num=True,
        add_holiday_flags=True,
        add_holiday_proximity=True,
        add_holiday_windows=True,
        add_fourier_features=True,
        add_month_and_year=True,
        replace_time_index=True
    )),
    ('drop_markdowns', feature_transformers.ChangeColumns(columns_to_drop=columns_to_drop)),
    ('object_to_cat', feature_transformers.ObjectToCategory()),
    ('week_of_year_avg', time_features.WeeklyStoreDept()),
    ('make_cat', feature_transformers.MakeCategorical(['Dept', 'Store'])),
    ('model', LGBMRegressor(
        objective='regression',
        random_state=42,
        verbose=-1,
        n_estimators=1000,
        learning_rate=0.1,
        max_depth=10
    ))
])

In [None]:
from models import group_stat
from src.utils import wmae as compute_wmae

mdl = group_stat.GroupStatModel(store_pipeline=store_pipeline, dept_pipeline=dept_pipeline, global_pipeline=global_pipeline)

mdl.fit(X_train, y_train)

pred = mdl.predict(X_train)
train_wmae = compute_wmae(y_train, pred, is_holiday=X_train['IsHoliday'])
print(f"Train WMAE: {train_wmae:.2f}")

pred = mdl.predict(X_valid)
valid_wmae = compute_wmae(y_valid, pred, is_holiday=X_valid['IsHoliday'])
print(f"Valid WMAE: {valid_wmae:.2f}")

from configs.basic_config import minimal_config
from src.utils import log_to_wandb

log_to_wandb(
    model=mdl,
    train_score=train_wmae,
    val_score=valid_wmae,
    config=minimal_config,
    run_name='group_stat_02',
    artifact_name="group_stat_model",
)

Train WMAE: 559.72
Valid WMAE: 1827.40


W&B run summary: train_wmae = 559.72332, val_wmae = 1827.39547
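For reference, the scores in this notebook are weighted mean absolute errors (WMAE), the competition metric, in which holiday weeks carry a weight of 5 and all other weeks a weight of 1. The sketch below shows that computation under the assumption that `src.utils.wmae` follows the same definition; the function name `wmae_sketch` and the `holiday_weight` parameter are illustrative and not taken from the repository.

```python
import numpy as np


def wmae_sketch(y_true, y_pred, is_holiday, holiday_weight=5.0):
    """Weighted MAE: holiday rows weigh `holiday_weight`, other rows weigh 1."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    weights = np.where(np.asarray(is_holiday, dtype=bool), holiday_weight, 1.0)
    return float(np.sum(weights * np.abs(y_true - y_pred)) / np.sum(weights))


# Absolute errors are 10, 10 and 30; only the last row is a holiday week,
# so the score is (10 + 10 + 5 * 30) / (1 + 1 + 5).
example = wmae_sketch([100, 200, 300], [110, 190, 330], [False, False, True])
print(f"{example:.4f}")
```

Because holiday rows count five times, a model that misses holiday spikes is penalized much more heavily than its plain MAE would suggest.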


We will include the `week_of_year_avg` feature in **each pipeline**, as it effectively captures annual seasonality and provides valuable context for modeling weekly sales trends.


In [None]:
from sklearn.pipeline import Pipeline
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
from feature_engineering import feature_transformers, time_features
from models.group_stat import GroupStatModel

import importlib
importlib.reload(time_features)


columns_to_drop=['MarkDown1', 'MarkDown2', 'MarkDown3', 'MarkDown4', 'MarkDown5']

store_pipeline = Pipeline([
    ('feature_adder', time_features.FeatureAdder(
        add_week_num=True,
        add_holiday_flags=True,
        add_holiday_proximity=True,
        add_holiday_windows=True,
        add_fourier_features=True,
        add_month_and_year=True,
        replace_time_index=True
    )),
    ('object_to_cat', feature_transformers.ObjectToCategory()),
    ('group_stat', feature_transformers.GroupStatFeatureAdder(groupby_cols='Store')),
    ('make_cat', feature_transformers.MakeCategorical(['Store'])),
    ('week_of_year_avg', time_features.WeeklyStoreDept(dept_col=None)),
    ('drop_markdowns', feature_transformers.ChangeColumns(columns_to_drop=columns_to_drop)),
    ('model', XGBRegressor(
        objective='reg:squarederror',
        enable_categorical=True,
        random_state=42,
        n_estimators=200,
        learning_rate=0.1,
        max_depth=7,
        subsample=0.6,
        colsample_bytree=1.0,
        min_child_weight=5
    ))
])

dept_pipeline = Pipeline([
    ('feature_adder', time_features.FeatureAdder(
        add_week_num=True,
        add_holiday_flags=True,
        add_holiday_proximity=True,
        add_holiday_windows=True,
        add_fourier_features=True,
        add_month_and_year=True,
        replace_time_index=True
    )),
    ('group_stat', feature_transformers.GroupStatFeatureAdder(groupby_cols='Dept')),
    ('object_to_cat', feature_transformers.ObjectToCategory()),
    ('week_of_year_avg', time_features.WeeklyStoreDept(store_col=None)),
    ('make_cat', feature_transformers.MakeCategorical(['Dept'])),
    ('model', XGBRegressor(
        objective='reg:squarederror',
        enable_categorical=True,
        random_state=42,
        n_estimators=300,
        learning_rate=0.1,
        max_depth=7,
        subsample=1.0,
        colsample_bytree=0.5,
        min_child_weight=1
    ))
])

global_pipeline = Pipeline([
    ('feature_adder', time_features.FeatureAdder(
        add_week_num=True,
        add_holiday_flags=True,
        add_holiday_proximity=True,
        add_holiday_windows=True,
        add_fourier_features=True,
        add_month_and_year=True,
        replace_time_index=True
    )),
    ('drop_markdowns', feature_transformers.ChangeColumns(columns_to_drop=columns_to_drop)),
    ('object_to_cat', feature_transformers.ObjectToCategory()),
    ('week_of_year_avg', time_features.WeeklyStoreDept()),
    ('make_cat', feature_transformers.MakeCategorical(['Dept', 'Store'])),
    ('model', LGBMRegressor(
        objective='regression',
        random_state=42,
        verbose=-1,
        n_estimators=1000,
        learning_rate=0.1,
        max_depth=10
    ))
])

In [None]:
from models import group_stat
from src.utils import wmae as compute_wmae

mdl = group_stat.GroupStatModel(store_pipeline=store_pipeline, dept_pipeline=dept_pipeline, global_pipeline=global_pipeline)

mdl.fit(X_train, y_train)

pred = mdl.predict(X_train)
train_wmae = compute_wmae(y_train, pred, is_holiday=X_train['IsHoliday'])
print(f"Train WMAE: {train_wmae:.2f}")

pred = mdl.predict(X_valid)
valid_wmae = compute_wmae(y_valid, pred, is_holiday=X_valid['IsHoliday'])
print(f"Valid WMAE: {valid_wmae:.2f}")

from configs.basic_config import minimal_config
from src.utils import log_to_wandb

log_to_wandb(
    model=mdl,
    train_score=train_wmae,
    val_score=valid_wmae,
    config=minimal_config,
    run_name='group_stat_03',
    artifact_name="group_stat_model",
)


Train WMAE: 560.68
Valid WMAE: 1821.19


W&B run summary: train_wmae = 560.67546, val_wmae = 1821.18896


As a final improvement, we will include **shared transformers** (i.e., applied to all data, not group-specific) as part of a unified preprocessing pipeline.

This is beneficial because:

- It allows the model to capture **global patterns** that apply across all stores and departments.
- It simplifies the pipeline architecture and makes feature management more consistent and modular.

This unified approach helps create a more balanced and generalizable model.
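As a small illustration of the nesting pattern, the sketch below places a shared preprocessing `Pipeline` inside an outer `Pipeline`, so the shared steps run once before the final estimator. The `add_month` helper, the toy data, and the use of `LinearRegression` in place of the GroupStat model are all assumptions chosen to keep the example self-contained.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer


def add_month(df):
    # Hypothetical shared feature step: derive the month, then drop the raw date.
    out = df.copy()
    out["month"] = pd.to_datetime(out["Date"]).dt.month
    return out.drop(columns=["Date"])


# Shared transformers: applied to every row regardless of store or department.
shared_preprocess = Pipeline([
    ("add_month", FunctionTransformer(add_month)),
])

# The outer pipeline treats the shared block as a single step.
full_pipeline = Pipeline([
    ("preprocess", shared_preprocess),
    ("model", LinearRegression()),  # stand-in for the group-based model
])

toy = pd.DataFrame({
    "Date": ["2010-02-05", "2010-03-05", "2010-04-02", "2010-05-07"],
    "Size": [150000, 150000, 120000, 120000],
})
target = [24000.0, 21000.0, 18000.0, 17000.0]

full_pipeline.fit(toy, target)
preds = full_pipeline.predict(toy)
print(preds.shape)
```

Calling `fit` or `predict` on the outer pipeline always runs the shared steps first, so the group-specific pipelines no longer need to repeat them.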

In [None]:
from sklearn.pipeline import Pipeline
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
from feature_engineering import feature_transformers, time_features
from models.group_stat import GroupStatModel

import importlib
importlib.reload(time_features)


columns_to_drop=['MarkDown1', 'MarkDown2', 'MarkDown3', 'MarkDown4', 'MarkDown5']

preprocess = Pipeline([
    ('feature_adder', time_features.FeatureAdder(
        add_week_num=True,
        add_holiday_flags=True,
        add_holiday_proximity=True,
        add_holiday_windows=True,
        add_fourier_features=True,
        add_month_and_year=True,
        replace_time_index=True
    )),
    ('object_to_cat', feature_transformers.ObjectToCategory()),
    ('week_of_year_avg', time_features.WeeklyStoreDept()),
    ('make_cat', feature_transformers.MakeCategorical(['Dept', 'Store'])),
    ('drop_markdowns', feature_transformers.ChangeColumns(columns_to_drop=columns_to_drop)),
])

store_pipeline = Pipeline([
    ('group_stat', feature_transformers.GroupStatFeatureAdder(groupby_cols='Store')),
    ('model', XGBRegressor(
        objective='reg:squarederror',
        enable_categorical=True,
        random_state=42,
        n_estimators=200,
        learning_rate=0.1,
        max_depth=7,
        subsample=0.6,
        colsample_bytree=1.0,
        min_child_weight=5
    ))
])

dept_pipeline = Pipeline([
    ('group_stat', feature_transformers.GroupStatFeatureAdder(groupby_cols='Dept')),
    ('model', XGBRegressor(
        objective='reg:squarederror',
        enable_categorical=True,
        random_state=42,
        n_estimators=300,
        learning_rate=0.1,
        max_depth=7,
        subsample=1.0,
        colsample_bytree=0.5,
        min_child_weight=1
    ))
])

global_pipeline = Pipeline([
    ('model', LGBMRegressor(
        objective='regression',
        random_state=42,
        verbose=-1,
        n_estimators=1000,
        learning_rate=0.1,
        max_depth=10
    ))
])

In [None]:
from models import group_stat
from src.utils import wmae as compute_wmae

mdl = group_stat.GroupStatModel(store_pipeline=store_pipeline, dept_pipeline=dept_pipeline, global_pipeline=global_pipeline)

pipeline = Pipeline([
    ('preprocess', preprocess),
    ('model', mdl)
])

pipeline.fit(X_train, y_train)

pred = pipeline.predict(X_train)
train_wmae = compute_wmae(y_train, pred, is_holiday=X_train['IsHoliday'])
print(f"Train WMAE: {train_wmae:.2f}")

pred = pipeline.predict(X_valid)
valid_wmae = compute_wmae(y_valid, pred, is_holiday=X_valid['IsHoliday'])
print(f"Valid WMAE: {valid_wmae:.2f}")

from configs.basic_config import minimal_config
from src.utils import log_to_wandb

log_to_wandb(
    model=pipeline,
    train_score=train_wmae,
    val_score=valid_wmae,
    config=minimal_config,
    run_name='group_stat_04',
    artifact_name="group_stat_model",
)



Train WMAE: 668.93




Valid WMAE: 1661.68


[34m[1mwandb[0m: Currently logged in as: [33mzhorzholianimate[0m ([33mMLBeasts[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin


W&B run summary: train_wmae = 668.9271, val_wmae = 1661.67501


Based on our analysis above, we have implemented the final version of our group-based model: **`GeneralWalmartGroupSalesModel`**.

The full implementation can be found in:

```
models/walmart_group_salesv2
```

In [None]:
columns_to_drop=['MarkDown1', 'MarkDown2', 'MarkDown3', 'MarkDown4', 'MarkDown5']

preprocess = Pipeline([
    ('feature_adder', time_features.FeatureAdder(
        add_week_num=True,
        add_holiday_flags=True,
        add_holiday_proximity=True,
        add_holiday_windows=True,
        add_fourier_features=True,
        add_month_and_year=True,
        replace_time_index=True
    )),
    ('object_to_cat', feature_transformers.ObjectToCategory()),
    ('week_of_year_avg', time_features.WeeklyStoreDept()),
    ('make_cat', feature_transformers.MakeCategorical(['Dept', 'Store'])),
    ('drop_markdowns', feature_transformers.ChangeColumns(columns_to_drop=columns_to_drop)),
])

from models.walmart_group_salesv2 import GeneralWalmartGroupSalesModel

pipeline = Pipeline([
    ('preprocess', preprocess),
    ('model', GeneralWalmartGroupSalesModel())
])

pipeline.fit(X_train, y_train)

pred = pipeline.predict(X_train)
train_wmae = compute_wmae(y_train, pred, is_holiday=X_train['IsHoliday'])
print(f"Train WMAE: {train_wmae:.2f}")

pred = pipeline.predict(X_valid)
valid_wmae = compute_wmae(y_valid, pred, is_holiday=X_valid['IsHoliday'])
print(f"Valid WMAE: {valid_wmae:.2f}")

from configs.basic_config import minimal_config
from src.utils import log_to_wandb

log_to_wandb(
    model=pipeline,
    train_score=train_wmae,
    val_score=valid_wmae,
    config=minimal_config,
    run_name='group_stat_05',
    artifact_name="group_stat_model",
)



Train WMAE: 668.93




Valid WMAE: 1661.68


W&B run summary: train_wmae = 668.9271, val_wmae = 1661.67501


# Deep Learning with GroupStat

Now, we will combine our developed **deep learning models** with the **GroupStat** model to create an ensemble.

This approach makes intuitive sense because:

- The **deep learning models** primarily learn from **pure time series trends**, without using any engineered features.
- The **GroupStat model**, on the other hand, is based entirely on **feature engineering** and captures patterns from group-level statistics and external variables.

First, we will use the model implemented in:

```
models/deep_walmart_group_sales
```

Specifically, the `DeepWalmartGroupSales` class.

This model is a **hybrid ensemble** that combines a pre-assembled group of **deep neural network models** and the **GroupStat** model.

In [None]:
from models import deep_walmart_group_sales
import importlib
importlib.reload(deep_walmart_group_sales)

from models.deep_walmart_group_sales import DeepWalmartGroupSales
final_model = DeepWalmartGroupSales()
final_model.fit(X_train,y_train)
pred = final_model.predict(X_valid)
valid_wmae = compute_wmae(y_valid, pred, is_holiday=X_valid['IsHoliday'])
print(f"Valid WMAE: {valid_wmae:.2f}")



Valid WMAE: 1465.58




In [10]:
from models import deep_walmart_group_sales
import importlib
importlib.reload(deep_walmart_group_sales)

from models.deep_walmart_group_sales import DeepWalmartGroupSales
final_model = DeepWalmartGroupSales()
final_model.fit(df.drop(columns=['Weekly_Sales']), df['Weekly_Sales'])

from configs.basic_config import minimal_config
from src.utils import log_to_wandb

log_to_wandb(
    model=final_model,
    train_score=-1,
    val_score=1465.58,
    config=minimal_config,
    run_name='group_stat_deep_learning_02',
    artifact_name="group_stat_model",
    artifact_description="Implementation can be found on GitHub. This model is specific and has many parameters that are not logged here."
)

[34m[1mwandb[0m: Currently logged in as: [33mzhorzholianimate[0m ([33mMLBeasts[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin


W&B run summary: train_wmae = -1.0, val_wmae = 1465.58


# Full Implementation

In this section, we present the **full implementation** of our core idea:  
the **assembly of a hybrid model** that combines deep learning architectures with the **GroupStat** model.

The following class will be used to assemble multiple models into a combined ensemble.

In [14]:
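import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin

class NNAssemble(BaseEstimator, RegressorMixin):
  """Averaging ensemble: fits every member and returns the mean of their predictions."""
  def __init__(self, models):
    self.models = models

  def fit(self, X, y):
    # Fit each member model on the same training data.
    for model in self.models:
      model.fit(X, y)

    return self

  def predict(self, X):
    # Unweighted, element-wise mean over the members' predictions.
    predictions = [model.predict(X) for model in self.models]
    return np.mean(predictions, axis=0)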

Recall the top-performing deep learning models we developed in the assembly notebook:

In [15]:
from neuralforecast.models import NBEATS
from neuralforecast.models import DLinear
from neuralforecast.models import PatchTST
from neuralforecast.models import TFT
from models.neural_forecast_models import NeuralForecastModels
from src.utils import wmae as compute_wmae
import logging
import torch

logging.getLogger().setLevel(logging.WARNING)
logging.getLogger("neuralforecast").setLevel(logging.WARNING)
logging.getLogger("pytorch_lightning").setLevel(logging.WARNING)
logging.getLogger("lightning_fabric").setLevel(logging.WARNING)

model = NBEATS(
    max_steps= 25 * 104,
    h= 53,
    random_seed= 42,
    input_size=52,
    batch_size= 256,
    learning_rate= 1e-3,
    shared_weights=True,
    optimizer= torch.optim.AdamW,
    activation = 'ReLU',
    enable_progress_bar = False
)
nbeats_model = NeuralForecastModels(models=[model], model_names=['NBEATS'], freq='W-FRI', one_model=True)


model = DLinear(
    max_steps= 25 * 104,
    h= 53,
    random_seed= 42,
    input_size=60,
    batch_size= 512,
    learning_rate= 1e-2,
    optimizer= torch.optim.Adagrad,
    scaler_type= 'robust',
    enable_progress_bar=False,
    enable_model_summary=False
)
dlinear_model = NeuralForecastModels(models=[model], model_names=['DLinear'], freq='W-FRI', one_model=True)


model = PatchTST(
    input_size=52,
    dropout = 0.2,
    h=53,
    max_steps= 60 * 104,
    batch_size=64,
    random_seed=42,
    activation='relu',
    enable_progress_bar=False,
    enable_model_summary=False,
)
patchtst_model = NeuralForecastModels(models=[model], model_names=['PatchTST'], freq='W-FRI', one_model=True)

model = TFT(
    input_size=60,
    dropout = 0.1,
    h=53,
    max_steps= 20 * 104,
    random_seed=42,
    enable_progress_bar=False,
    enable_model_summary=False,
)
tft_model = NeuralForecastModels(models=[model], model_names=['TFT'], freq='W-FRI', one_model=True)

The most effective GroupStat configuration found earlier in this notebook, shared preprocessing followed by `GeneralWalmartGroupSalesModel`, is rebuilt below.

In [19]:
from sklearn.pipeline import Pipeline
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
from feature_engineering import feature_transformers, time_features
from models.group_stat import GroupStatModel

columns_to_drop=['MarkDown1', 'MarkDown2', 'MarkDown3', 'MarkDown4', 'MarkDown5']

preprocess = Pipeline([
    ('feature_adder', time_features.FeatureAdder(
        add_week_num=True,
        add_holiday_flags=True,
        add_holiday_proximity=True,
        add_holiday_windows=True,
        add_fourier_features=True,
        add_month_and_year=True,
        replace_time_index=True
    )),
    ('object_to_cat', feature_transformers.ObjectToCategory()),
    ('week_of_year_avg', time_features.WeeklyStoreDept()),
    ('make_cat', feature_transformers.MakeCategorical(['Dept', 'Store'])),
    ('drop_markdowns', feature_transformers.ChangeColumns(columns_to_drop=columns_to_drop)),
])

from models.walmart_group_salesv2 import GeneralWalmartGroupSalesModel

group_stat = Pipeline([
    ('preprocess', preprocess),
    ('model', GeneralWalmartGroupSalesModel())
])

We proceed by combining the selected models into a single ensemble. The composition of this combined model is shown below.

In [39]:
amodel = NNAssemble([nbeats_model,patchtst_model,tft_model,dlinear_model])
final_model = NNAssemble([amodel, group_stat])
final_model.fit(X_train,y_train)
pred = final_model.predict(X_valid)
valid_wmae = compute_wmae(y_valid, pred, is_holiday=X_valid['IsHoliday'])
print(f"Valid WMAE: {valid_wmae:.2f}")
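Because `NNAssemble.predict` takes an unweighted mean and the ensemble is nested, the two levels of averaging give each of the four deep models an effective weight of 1/8 and the GroupStat pipeline a weight of 1/2. The sketch below checks this arithmetic with constant toy predictors; `MeanEnsemble` and `Constant` are hypothetical helpers written for this illustration, not classes from the repository.

```python
import numpy as np


class MeanEnsemble:
    """Hypothetical minimal equivalent of NNAssemble's predict step."""

    def __init__(self, models):
        self.models = models

    def predict(self, X):
        return np.mean([m.predict(X) for m in self.models], axis=0)


class Constant:
    """Toy model that always predicts the same value."""

    def __init__(self, value):
        self.value = float(value)

    def predict(self, X):
        return np.full(len(X), self.value)


# Four "deep" members averaged first, then averaged with one "GroupStat" member.
deep = MeanEnsemble([Constant(100), Constant(200), Constant(300), Constant(400)])
combined = MeanEnsemble([deep, Constant(1000)])

X = [0, 1, 2]
nested = combined.predict(X)

# The same result written as a single weighted sum over all five members.
weights = np.array([0.125, 0.125, 0.125, 0.125, 0.5])
flat = float(weights @ np.array([100.0, 200.0, 300.0, 400.0, 1000.0]))

print(nested, flat)
```

If a different balance between the deep models and GroupStat is wanted, a weighted average over the five members would make those weights explicit instead of leaving them implied by the nesting.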

In [40]:
from models import deep_walmart_group_sales
import importlib
importlib.reload(deep_walmart_group_sales)

final_model.fit(df.drop(columns=['Weekly_Sales']), df['Weekly_Sales'])

from configs.basic_config import minimal_config
from src.utils import log_to_wandb

log_to_wandb(
    model=final_model,
    train_score=-1,
    val_score=1470.43,
    config=minimal_config,
    run_name='group_stat_deep_learning_05',
    artifact_name="group_stat_model",
    artifact_description=""
)

W&B run summary: train_wmae = -1.0, val_wmae = 1470.43
