### This notebook describes how to structure a project.

During the competition there were many tasks to handle: planning, research, testing, tuning, accepting the disappointment of failed tests, and enjoying every small improvement.

The workspace and the project files should therefore be structured flexibly, with as little repeated code as possible. In other words, code should be separated from data and configuration, which saves time and reduces bugs.

Let us first define the main entities in the project.
There are four main entities, and I think they will be the same in all projects: **experiment**, **model**, **level**, and **stack**. These are modeled by the classes described in this notebook.


I would like to give credit to the many kernels and websites that helped, among them:

 - Good introduction to stacking: https://mlwave.com/kaggle-ensembling-guide/
 - Implementation of stacking and a nice discussion: https://www.kaggle.com/getting-started/18153#post103381
 - Stacking solution for a regression problem: https://www.kaggle.com/serigne/stacked-regressions-top-4-on-leaderboard
 - Fine-tuned XGBoost params: https://www.kaggle.com/stevenrferrer/30-days-of-ml-optimized-xgboost-5folds
 - Stacking pipeline: https://www.kaggle.com/abhishek/competition-part-6-stacking
 - Multi-seeds stacking: https://www.kaggle.com/hungkhoi/1st-place-stacking-code
 
 
*This is a draft and will be improved regularly.*


In [1]:
# Familiar imports
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from IPython.display import display

import os
import glob
from datetime import datetime
from pathlib import Path



# helpers
from sklearn.preprocessing import OrdinalEncoder, LabelEncoder, OneHotEncoder, PowerTransformer, StandardScaler, \
                                  MinMaxScaler, RobustScaler, PolynomialFeatures
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split, KFold, cross_val_score, StratifiedKFold
from sklearn.pipeline import make_pipeline

# Models
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression
from sklearn.neighbors import KNeighborsRegressor 
from sklearn.kernel_ridge import KernelRidge
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor, GradientBoostingRegressor
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
from catboost import CatBoostRegressor

# base
from sklearn.base import BaseEstimator, RegressorMixin, TransformerMixin, clone

# scoring
from sklearn.metrics import mean_squared_error


In [2]:
# notebook options
pd.set_option("display.max_columns", 100)
path = "../input/30-days-of-ml/"
train_file = "train.csv"
test_file = "test.csv"

In [3]:
# Load the training data
train = pd.read_csv(f'{path}{os.sep}{train_file}', index_col=0)
test = pd.read_csv(f'{path}{os.sep}{test_file}', index_col=0)

# Preview the data
# train.describe().T

In [4]:
# Separate target from features
y = train['target']
features = train.drop(['target'], axis=1)

# Preview features
# features.head().T

In [5]:
# identify columns
numerical_cols = features.select_dtypes(include=['int64', 'float64']).columns
categorical_cols = features.select_dtypes(include=['object', 'bool']).columns

# useful for column transformers 
numerical_ix = features.columns.get_indexer(numerical_cols)
categorical_ix = features.columns.get_indexer(categorical_cols)
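
Positional indices matter because the training code later passes plain NumPy arrays, which carry no column names. The following is a minimal sketch on a made-up two-column frame; the names `demo`, `cat0`, and `cont0` and all values are illustrative assumptions, not the competition data.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OrdinalEncoder

# toy frame; column names and values are illustrative only
demo = pd.DataFrame({'cat0': pd.Series(['A', 'B', 'A', 'C'], dtype=object),
                     'cont0': [0.0, 5.0, 10.0, 2.5]})

demo_num_cols = demo.select_dtypes(include=['int64', 'float64']).columns
demo_cat_cols = demo.select_dtypes(include=['object', 'bool']).columns

# positions of the columns, usable with DataFrames and NumPy arrays alike
demo_num_ix = demo.columns.get_indexer(demo_num_cols)
demo_cat_ix = demo.columns.get_indexer(demo_cat_cols)

demo_transformer = ColumnTransformer(transformers=[
    ('cat', OrdinalEncoder(), demo_cat_ix),
    ('num', MinMaxScaler(), demo_num_ix),
])

# .values drops the column names, so only positional indices still work
demo_encoded = demo_transformer.fit_transform(demo.values)
print(demo_encoded.shape)
```

The output columns follow the order of the transformers (encoded categories first, scaled numbers second), not the order of the original columns.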

In [6]:
# work on a copy
X_train = features.copy()
X_test = test.copy()

### Helper functions

These helpers write predictions to a file, create the fold indexes, and score predictions. They can be wrapped in a class for cleaner modeling, or kept as they are since they do not depend on the project settings.


In [7]:

## helper functions
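def to_file(data, output_folder, idxs=None, suffix='.csv', name='predictions'):
    # 'name' is the file name without its extension
    # 'idxs' is an optional row index for the saved table
    df = pd.DataFrame(data, index=idxs)
    df.to_csv(f'{output_folder}{os.sep}{name}{suffix}', index=True)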
        
    
def calc_folds_indexes(X, y, n_folds=5, shuffle=True, sampler=KFold, seeds=[42]):
    """
    Create folds from a dataset X and a target y
    sampler: can be KFold,  StratifiedKFold, or any sampling class  
    
    return a list of dictionaries of {'seed':, 'idxs':[train_idxs, test_idxs]}
    """
    folds_idxs_list = []
    for seed in seeds:
        folds = sampler(n_splits=n_folds, 
                        random_state=seed,
                        shuffle=shuffle)

        folds_idxs_list.append({'seed': seed, 'idxs':list(folds.split(X, y))})
        
    return folds_idxs_list
        


# score function
score_func = mean_squared_error
score_func_param = {'squared': False}  # squared=False returns the RMSE

def score(y, target, average=False):
    # if y is a list (or a 2-D array) of prediction vectors, return a list of scores
    # if average is True, return the mean of those scores instead
    
    if isinstance(y, list) or (isinstance(y, np.ndarray) and y.ndim > 1):
        scores = []
        for y_i in y:
            scores.append(score_func(y_i, target, **score_func_param))
        if average:
            return np.mean(scores)
        else:
            return scores
        
    return score_func(y, target, **score_func_param)
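
To make the return value of `calc_folds_indexes` concrete, here is a small sketch that builds the same list-of-dictionaries structure directly with `KFold` on toy data. The arrays `X_demo` and `y_demo` and the two seeds are assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold

# toy data: 10 rows, 2 features
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.arange(10, dtype=float)

folds_per_seed = []
for seed in (0, 1):
    splitter = KFold(n_splits=5, shuffle=True, random_state=seed)
    folds_per_seed.append({'seed': seed,
                           'idxs': list(splitter.split(X_demo, y_demo))})

for entry in folds_per_seed:
    valid_ix_all = np.concatenate([valid_ix for _, valid_ix in entry['idxs']])
    # every row index lands in exactly one validation fold
    print(entry['seed'], sorted(valid_ix_all.tolist()) == list(range(10)))
```

Each seed yields its own complete partition of the rows, which is what allows the same model to be trained once per seed later on.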
    

### ModelWrapper 
The role of this class is to avoid writing a separate class for each model (or model type). Models fall into different categories, and some accept more parameters than others. For instance, XGBoost can use an evaluation set to determine the stopping round, while Lasso accepts no such extra parameters.

Thanks to the flexibility of Python and the common interface of the base models, we can wrap a model and let the wrapper decide how to call it. In fact, this class can easily be stretched to support sklearn pipelines.


In [8]:
class ModelWrapper():
    def __init__(self, 
                 model,
                 name,
                 uses_eval_set=False,
                 fit_params={}):
        
        self.model = model
        self.name = name
        
        self.uses_eval_set = uses_eval_set
        self.fit_params = fit_params # any extra params for the 'fit' function
                
    def fit(self, X, y, eval_set=None):
        if self.uses_eval_set:
            self.model.fit(X, y, eval_set=eval_set, **(self.fit_params))
        else:
            self.model.fit(X, y, **(self.fit_params)) 
        return self
    

    def predict(self, X):
        return self.model.predict(X)
        
    def clone_me(self, random_state=None):
        wrapper = ModelWrapper(model=clone(self.model), # clone from sklearn.base
                               name=self.name, 
                               uses_eval_set=self.uses_eval_set,
                               fit_params=self.fit_params)
        wrapper.name = self.name
        if random_state is not None:
            wrapper.set_random_state(random_state)
        
        return wrapper
    
    def set_random_state(self, random_state):
        if hasattr(self.model, 'random_state'):
            self.model.random_state = random_state
        elif hasattr(self.model, 'random_seed'):
            self.model.random_seed = random_state
            
    def get_random_state(self):
        if hasattr(self.model, 'random_state'):
            return self.model.random_state 
        elif hasattr(self.model, 'random_seed'):
            return self.model.random_seed
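
The idea behind the wrapper can be shown without any real library. In the sketch below, `PlainModel`, `EvalSetModel`, and `MiniWrapper` are hypothetical stand-ins written only for this illustration; they mimic a model whose `fit` takes just `X, y` and one whose `fit` also accepts `eval_set`.

```python
class PlainModel:
    """Hypothetical stand-in for a model whose fit() takes only X and y."""
    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class EvalSetModel(PlainModel):
    """Hypothetical stand-in for a model whose fit() also accepts eval_set."""
    def fit(self, X, y, eval_set=None):
        self.n_eval_sets_ = 0 if eval_set is None else len(eval_set)
        return super().fit(X, y)


class MiniWrapper:
    """Reduced version of the wrapper idea: one call signature for all models."""
    def __init__(self, model, uses_eval_set=False, fit_params=None):
        self.model = model
        self.uses_eval_set = uses_eval_set
        self.fit_params = dict(fit_params or {})

    def fit(self, X, y, eval_set=None):
        # forward eval_set only to models that understand it
        if self.uses_eval_set:
            self.model.fit(X, y, eval_set=eval_set, **self.fit_params)
        else:
            self.model.fit(X, y, **self.fit_params)
        return self

    def predict(self, X):
        return self.model.predict(X)


X_demo, y_demo = [[0], [1], [2], [3]], [1.0, 2.0, 3.0, 4.0]
eval_demo = [(X_demo, y_demo)]

wrappers = [MiniWrapper(PlainModel()),
            MiniWrapper(EvalSetModel(), uses_eval_set=True)]

# the calling code is identical for both kinds of model
preds = [w.fit(X_demo, y_demo, eval_set=eval_demo).predict([[9]]) for w in wrappers]
print(preds)
```

The training loop never needs to know which kind of model it holds; the flag stored with the model decides how `fit` is called.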
    
        

### ModelTrainer
The role of this class is to train a model and calculate the out-of-fold (OOF) predictions and the test predictions (meta-features).


In [9]:
class ModelTrainer():
    def __init__(self,
                  model: ModelWrapper):
        
        self.model = model
        
    def calc_oofs(self,
                  X, y,
                  X_test,
                  folds_idxs,
                  transformer=None,
                  verbose=False,
                  use_different_random_states=True):
        """
        Return the oofs predictions and the meta features (test predictions)
        """
        
        test_predictions = 0
        oof_predictions = np.zeros_like(np.array(y))
        valid_mean_score = [] 
        for fold, (train_ix, valid_ix) in enumerate(folds_idxs):
            X_train, X_valid = X[train_ix], X[valid_ix]
            y_train, y_valid = y[train_ix], y[valid_ix]
                        
            
            # transform input
            if transformer is not None:
                X_train = transformer.fit_transform(X_train)
                X_valid = transformer.transform(X_valid)

            
            # check if we train each fold on differently initialized clone
            if use_different_random_states:
                model = self.model.clone_me(random_state=fold)
            else:
                model = self.model.clone_me()
            
            # fit the model
            if model.uses_eval_set:
                model.fit(X_train, y_train, eval_set=[(X_train, y_train), (X_valid, y_valid)])
            else:
                model.fit(X_train, y_train)
                
            ## predictions
            # on the validation set
            valid_predictions = model.predict(X_valid)
            score = mean_squared_error(y_valid, valid_predictions, squared=False)
            valid_mean_score.append(score)
            oof_predictions[valid_ix] = valid_predictions
            
            # on the test set
            # transform it first on a copy
            if transformer is not None:
                X_test_ = transformer.transform(X_test)
            else:
                X_test_ = X_test
                
            test_predictions += model.predict(X_test_) / len(folds_idxs)
            
            if verbose:
                print('Fold:{} score:{:.4f}'.format(fold + 1, score))
        
        if verbose:
            print('Average score:{:.4f} ({:.4f})'.format(np.mean(valid_mean_score), np.std(valid_mean_score) ))
    
        return oof_predictions, test_predictions
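
Stripped of the wrapper and the transformer, the out-of-fold procedure reduces to the loop below. This is a minimal sketch on synthetic data; the data, the choice of `LinearRegression`, and all `*_demo` names are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
X_test_demo = rng.normal(size=(10, 3))

n_folds = 5
oof_demo = np.zeros(len(y_demo))         # one prediction per training row
test_demo = np.zeros(len(X_test_demo))   # fold-averaged test predictions

splitter = KFold(n_splits=n_folds, shuffle=True, random_state=0)
for train_ix, valid_ix in splitter.split(X_demo):
    fold_model = LinearRegression().fit(X_demo[train_ix], y_demo[train_ix])
    # each row is predicted by a model that never saw it during training
    oof_demo[valid_ix] = fold_model.predict(X_demo[valid_ix])
    test_demo += fold_model.predict(X_test_demo) / n_folds

oof_rmse = float(np.sqrt(np.mean((oof_demo - y_demo) ** 2)))
print(f'OOF RMSE: {oof_rmse:.4f}')
```

Because no row is predicted by a model trained on it, the OOF vector can serve as a training feature for the next level without leaking the target.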

### Level
The Level class glues together all the components of a given layer.

In [10]:
class Level():
    def __init__(self,
                level_id,
                models,
                folder,
                transformer,
                n_folds=5,
                seeds=[42],
                frozen=False,
                use_different_random_states=True):
        
        self.level_id = level_id
        self.models = models
        self.folder = folder
        self.transformer = transformer
        self.n_folds = n_folds
        self.seeds = seeds
        self.frozen = frozen
        self.use_different_random_states = use_different_random_states
    
    def create(self, model_zoo):
        """
         Create a level.
         model_zoo: a dictionay of all avialable models.
        """
        self.model_wrappers = []
        
        # get models 
        # if models is set to 'all' use all models
        if self.models[0].lower() == 'all':
            level_models_names = model_zoo.keys()
        else: 
            level_models_names = self.models

        for model_name in level_models_names:
            # get parameters
            model = model_zoo[model_name]
            fit_kwargs = model_zoo[model_name]['fit_kwargs']
            app_params = model_zoo[model_name]['app_params']

            model_wrapper = ModelWrapper(model=model['model'], name=model_name)
            if fit_kwargs is not None:
                model_wrapper.fit_params = fit_kwargs
            if app_params is not None:
                model_wrapper.uses_eval_set = app_params['uses_eval_set']
            self.model_wrappers.append(model_wrapper)

#### Level Trainer
Trains all models in a level 

In [11]:
class LevelTrainer():
    def __init__(self,
                level,
                seeds_folds_idxs_list):
        self.level = level
        self.seeds_folds_idxs_list = seeds_folds_idxs_list
        
    def train(self, X_train, y, X_test, verbose=True, agg_func=None):
        """
        train the level and return the oofs and meta-features for each model in the level
        if the level has many seeds it will either use the agg_func to combine predictions
        or will just return eveything
        
        agg_func: can be None, np.mean, or any other numpy reduction function
        """

        level_oof_preds, level_test_preds = {}, {}
        for model_wrapper in self.level.model_wrappers:
            if verbose:
                print('-'*30)
                print(f'Model:{model_wrapper.name}')
                print('-'*30)

            # train each model once per entry (seed) in seeds_folds_idxs_list
            model_oof_preds, model_test_preds = [], []
            
            for seeds_folds_idxs in self.seeds_folds_idxs_list:
                seed, folds_idxs = seeds_folds_idxs['seed'], seeds_folds_idxs['idxs']
                if verbose:
                    print('-'*30)
                    print(f'Seed:{seed}')
                    print('-'*30)
                
                trainer = ModelTrainer(model_wrapper)
                oof_preds, test_preds = trainer.calc_oofs(X_train, 
                                                          y,
                                                          X_test,
                                                          transformer=self.level.transformer,
                                                          folds_idxs=folds_idxs,
                                                          verbose=verbose,
                                                          use_different_random_states=self.level.use_different_random_states)
                if agg_func is None:
                    level_oof_preds[f'{model_wrapper.name}_seed_{seed}'] =  oof_preds
                    level_test_preds[f'{model_wrapper.name}_seed_{seed}'] =  test_preds
                else: # collect them in order to aggregate them with the agg_func function
                    model_oof_preds.append(oof_preds)
                    model_test_preds.append(test_preds)

            # aggregate the results of this model over all seeds (one value per row)
            if agg_func is not None:
                level_oof_preds[f'{model_wrapper.name}'] = agg_func(np.column_stack(model_oof_preds), axis=1)
                level_test_preds[f'{model_wrapper.name}'] = agg_func(np.column_stack(model_test_preds), axis=1)

        if verbose:
            print('-'*30)

        return pd.DataFrame(level_oof_preds), pd.DataFrame(level_test_preds)
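
When predictions from several seeds are combined, the reduction has to run across seeds so that one value per sample remains. The sketch below uses three made-up prediction vectors (the numbers are illustrative only) to show the difference between reducing along `axis=1` and reducing over everything.

```python
import numpy as np

# hypothetical predictions of one model trained with three different seeds
preds_per_seed = [np.array([1.0, 2.0, 3.0]),
                  np.array([1.2, 1.8, 3.3]),
                  np.array([0.8, 2.2, 2.7])]

stacked = np.column_stack(preds_per_seed)   # shape (n_samples, n_seeds)

per_sample = np.mean(stacked, axis=1)       # one averaged prediction per sample
collapsed = np.mean(stacked)                # a single scalar, useless as a feature

print(stacked.shape, per_sample.shape, np.ndim(collapsed))
```

Any NumPy reduction that accepts an `axis` argument, such as `np.median`, can be used the same way.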

### Experiment 
Since everything boils down to stacking (even with a single level), the Experiment class handles the organization of the files produced by a run: the test and OOF predictions. Assuming the project has the following structure, with a folder called **experiments**, each run is saved in its own subfolder. This class is the entry point for any run in the project: it reads the input and the settings and produces the output.

```
    ML30_project
    │   README.md
    │
    └───notebooks
    │   ...
    │
    └───experiments
    │   │   
    │   │
    │   └───experiment_1   
    │   │   level_1_oofs.csv
    │   │   level_1_test.csv
    │   │   level_2_oofs.csv
    │   │   level_2_test.csv
    │   │   ...
    │   │   meta_level_oofs.csv
    │   │   meta_level_test.csv
    │   └───experiment_...
```


>The code that generated the results is important to save too, but that can be done easily by creating a new version of the notebook or by copying the notebook along with its CV/LB results.

>This class matters most when running notebooks on our own computers, since Kaggle already has a nice notebook management system that saves the outputs as well.



In [12]:
class Experiment():
    def __init__(self,
                 title,
                 description,
                 stack,
                 model_zoo,
                 main_folder=os.getcwd()):
        
        self.title = title
        self.main_folder = main_folder
        self.stack = stack
        self.model_zoo = model_zoo
        self.description = description
        # create the main folder if it does not exist
        if not os.path.exists(f'{self.main_folder}'):
            os.makedirs(f'{self.main_folder}', exist_ok=True)
        
    def join_folder(self, folder=None):
        """
        Join the folder where the results will be saved.
        If 'folder' is None, a folder named after the title
        and the current time stamp is created.
        """

        if folder is not None: # if folder is specified
            self.output_folder = folder
            # create the folder if it does not exist
            folder_path = f'{self.main_folder}{os.sep}{self.output_folder}'
            if not os.path.exists(folder_path):
                os.makedirs(folder_path)
        else: # create a folder with the time stamp
            time_stamp = datetime.now().isoformat(' ', 'seconds')
            self.output_folder = self.title + ' ' + time_stamp.replace(':', '-')
            # create the folder; an existing one is reused
            Path(f'{self.main_folder}{os.sep}{self.output_folder}').mkdir(parents=True, exist_ok=True)
    
    
    def run(self, X_train, y, X_test, 
            train_idxs,
            test_idxs,
            verbose=True, store=True):
        
        # run the stack
        for  level_params in self.stack:
            # create all models in the level
            level = Level(**level_params)
            level.create(self.model_zoo)

            print('-'*50)
            print(f'Current Level: {level.level_id}')
            print('-'*50)

            # join the level's output folder
            #self.join_folder(folder=level.folder)

            # create folds indexes for the level
            seeds_folds_idxs_list = calc_folds_indexes(X=X_train,
                                                       y=y,
                                                       n_folds=level.n_folds,
                                                       sampler=KFold,
                                                       seeds=level.seeds)

            # train the level
            if not level.frozen:  # skip any level that is already trained
                level_trainer = LevelTrainer(level=level, 
                                             seeds_folds_idxs_list=seeds_folds_idxs_list)

                level_oof_preds, level_test_preds =  level_trainer.train(X_train=X_train,
                                                                         y=y,
                                                                         X_test=X_test)
                # store predictions?
                if store:
                    # oofs 
                    level_oof_preds.to_csv(f'{self.main_folder}{os.sep}{self.output_folder}{os.sep}{level.level_id}_oofs.csv')
                    # test predictions
                    level_test_preds.to_csv(f'{self.main_folder}{os.sep}{self.output_folder}{os.sep}{level.level_id}_test.csv')
                    
                
                # update train and test 
                X_train, X_test = level_oof_preds.values, level_test_preds.values
            else:
                print('This level is already trained')
                # load the saved predictions of this level
                folder = f'{self.main_folder}{os.sep}{self.output_folder}'
                
                # new features (files are named after the level id, see the 'store' branch)
                level_oof_preds = pd.read_csv(f"{folder}{os.sep}{level.level_id}_oofs.csv", index_col=0)
                level_test_preds = pd.read_csv(f"{folder}{os.sep}{level.level_id}_test.csv", index_col=0)
                
                X_train = level_oof_preds.values
                X_test = level_test_preds.values
                
            if verbose:
                display(level_oof_preds.head(10))
                display(level_test_preds.head(10))
                
        # return the last output from the last level
        return level_test_preds
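
A frozen level relies on its predictions being written to and read back from the experiment folder. The sketch below shows that round trip in a temporary directory; the folder name, the column name, and the three values are made up for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd

level_id = 'level-1'   # same naming scheme as the "level_id" entries of a stack
oofs = pd.DataFrame({'XGBRegressor-1_seed_42': np.array([8.1, 8.3, 7.9])})

with tempfile.TemporaryDirectory() as experiments_dir:
    out_dir = os.path.join(experiments_dir, 'experiment_1')
    os.makedirs(out_dir, exist_ok=True)

    oofs_path = os.path.join(out_dir, f'{level_id}_oofs.csv')
    oofs.to_csv(oofs_path)                           # the row index becomes the first column

    reloaded = pd.read_csv(oofs_path, index_col=0)   # index_col=0 restores it as the index

round_trip_ok = bool(np.allclose(oofs.values, reloaded.values))
print(round_trip_ok)
```

Without `index_col=0` the saved row index would come back as an extra feature column and silently change the input of the next level.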

### Hyperparameters

Here go the parameters of each model. These can also be stored in an external JSON file.
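
As a rough sketch of the JSON option, the snippet below writes two of this notebook's parameter sets to a file and reads them back. The file name `hyperparameters.json` and the variable names are assumptions; only plain parameter values can be stored this way, not the estimator objects themselves.

```python
import json
import os
import tempfile

# two parameter sets taken from this notebook
params_demo = {
    'Lasso': {'alpha': 0.00005},
    'ElasticNet': {'alpha': 0.00005, 'l1_ratio': 0.9},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    params_path = os.path.join(tmp_dir, 'hyperparameters.json')  # hypothetical file name

    with open(params_path, 'w') as f:
        json.dump(params_demo, f, indent=2)

    with open(params_path) as f:
        loaded_params = json.load(f)

print(loaded_params['ElasticNet'])
```

The loaded dictionaries can be unpacked into the constructors exactly like the in-notebook ones, for example `Lasso(**loaded_params['Lasso'])`.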


In [13]:
# Lasso
lasso_params = {
                'alpha': 0.00005
}


# Elastic Net
enet_params = {
               'alpha': 0.00005, 
               'l1_ratio': .9
}

# extra-tree
et_params = {
    'n_jobs': -1,
    'n_estimators': 100,
    'max_features': 0.5,
    'max_depth': 12,
    'min_samples_leaf': 2,
}

# random forest

rf_params_2 = {
    'n_jobs': -1,
    'n_estimators': 500,
    'max_depth': 5
}


rf_params_1 = {
    'n_jobs': -1,
    'n_estimators': 100,
    'max_features': 0.2,
    'max_depth': 8,
    'min_samples_leaf': 2
}

# gradient boosting 
gb_params = {
    'n_estimators': 500,
     'max_depth': 3
}

# xgboost

# Using optuna but still rough
xgb_params_2 = {
                 # gpu (if gpu, uncomment the following three lines)
#                  'tree_method': 'gpu_hist', 
#                  'gpu_id': 0, 
#                  'predictor': 'gpu_predictor',
                # cpu
                 'max_depth': 5,
                 'learning_rate': 0.021252439960137114,
                 'n_estimators': 13500,
                 'subsample': 0.62,
                 'booster': 'gbtree',
                 'colsample_bytree': 0.1,
                 'reg_lambda': 0.1584605320779582,
                 'reg_alpha': 15.715145781076245,
                 'n_jobs': -1
}

xgb_params_1 = {
            'random_state': 1, 
            # gpu (if gpu, uncomment the following three lines)
#             'tree_method': 'gpu_hist', 
#             'gpu_id': 0, 
#             'predictor': 'gpu_predictor',
            # cpu
            'n_jobs': -1,
            'booster': 'gbtree',
            'n_estimators': 10000,
            # optimized params
            'learning_rate': 0.03628302216953097,
            'reg_lambda': 0.0008746338866473539,
            'reg_alpha': 23.13181079976304,
            'subsample': 0.7875490025178415,
            'colsample_bytree': 0.11807135201147481,
            'max_depth': 3,
            #'min_child_weight': 6
}


# catboost
catb_params = {
              # gpu (if gpu, uncomment the following two lines)
#               'task_type': "GPU",
#               'devices': '0:1',
              # cpu only
              'iterations': 6800,
              'learning_rate': 0.93,
              'loss_function': "RMSE",
              'random_state': 42,
              'verbose': 0,
              'thread_count': -1,
              'depth': 1,
              'l2_leaf_reg': 3.28}


# using optuna
params_lgb = {
            # gpu (if gpu, uncomment the following three lines)
#             'device' : 'gpu',
#             'gpu_platform_id':  0,
#             'gpu_device_id': 0,
             # cpu only
             "n_estimators": 10000,
             'metric':'rmse',
             "objective": "regression",
             'max_depth': 12, 
             'subsample': 0.587082286344555, 
             'colsample_bytree': 0.2157299997089329, 
             'learning_rate': 0.01270518267668901,
             'reg_lambda': 36.78473508062132,
             'reg_alpha': 14.155146595119032, 
             'min_child_samples': 6, 
             'num_leaves': 34, 
             'max_bin': 914,
             'cat_smooth': 26,
             'n_jobs': -1,
             'cat_l2': 0.020257336654989123
        }

### These are model/task dependent parameters

In [14]:
# external hyperparameters

### fit function hyperparameters
# some models require special parameters like early stopping in xgboost and lgbm
fit_params = {'early_stopping_rounds': 300,
                  'verbose': False}

### application/implementation parameters
# These parameters are implementation dependent 
app_params = {'uses_eval_set':True}



### Models

In [15]:
lrg = LinearRegression()
# lasso
lasso = Lasso(**lasso_params)

# Elastic net
e_net = ElasticNet(**enet_params)

# KNeighborsRegressor
knn =  KNeighborsRegressor()

# extra-tree
extree = ExtraTreesRegressor(**et_params)

# random forest
rfr = RandomForestRegressor(**rf_params_2)

# gradient boosting
gb = GradientBoostingRegressor(**gb_params)

#lgbm
lgb = LGBMRegressor(**params_lgb)

# xgboost 
# variants
xgb_1 =  XGBRegressor(**xgb_params_1)
xgb_2 =  XGBRegressor(**xgb_params_2)

#catboost
catb = CatBoostRegressor(**catb_params)

In [16]:
# compile all settings in one dictionary, 
# we can store/load it then to a JSON file
model_zoo = {
          'LinearRegression': {"model":lrg, "fit_kwargs":None, "app_params": None},
          'Lasso': {"model":lasso, "fit_kwargs":None, "app_params": None},
          'ElasticNet': {"model": e_net, "fit_kwargs":None, "app_params": None},
          'ExtraTreesRegressor': {"model": extree, "fit_kwargs":None, "app_params": None},
          'RandomForestRegressor': {"model": rfr, "fit_kwargs":None, "app_params": None},
          'GradientBoostingRegressor': {"model": gb, "fit_kwargs":None, "app_params": None},
          'XGBRegressor-1': {"model": xgb_1, "fit_kwargs":fit_params, "app_params": app_params},
          'XGBRegressor-2': {"model": xgb_2, "fit_kwargs":fit_params, "app_params": app_params},
          'CatBoostRegressor': {"model": catb, "fit_kwargs": fit_params, "app_params": app_params},
          'LGBMRegressor': {"model": lgb, "fit_kwargs":fit_params, "app_params": app_params}
          # we can add any number of models here 
        }

In [17]:
model_zoo.keys()

dict_keys(['LinearRegression', 'Lasso', 'ElasticNet', 'ExtraTreesRegressor', 'RandomForestRegressor', 'GradientBoostingRegressor', 'XGBRegressor-1', 'XGBRegressor-2', 'CatBoostRegressor', 'LGBMRegressor'])

### Stacking

Here goes the actual stacking procedure. 
   - We first define the architecture and set up a session.
   - Define the stack, that is, the models and transformers in each level.

In [18]:
# settings: experiment and stacking architecture

# initialize the stack to the input
X_train_, X_test_ = X_train.copy(), X_test.copy()

# any special transformers for any level
level_1_transformers = [('cat', OrdinalEncoder(), categorical_ix) , ('num', MinMaxScaler(), numerical_ix)]
#
level_1_transform = ColumnTransformer(transformers=level_1_transformers)


# define the actual stack
stack = [ {"level_id": "level-1", 
           "models": [
                     #'CatBoostRegressor',
                     #'XGBRegressor-2',
                     'XGBRegressor-1',
                     #'LGBMRegressor'
                     # we can add any model here
                    ],
            "n_folds": 5,
            "seeds" : [42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53],
            "folder": "level_1", 
            "transformer": level_1_transform, 
            "frozen": True # to freeze the level if already trained
            },
               
#            {"level_id": "level-2",
#             "models": [
#                        'RandomForestRegressor',
#                        #'CatBoostRegressor',
#                      #  other models can be added here
#                       ],
#             "n_folds": 10,
#             "seeds" : [43, 45, 47, 49],
#             "folder": "level_2",
#             "transformer": None,
#             "frozen": False
#           },
         
         # we can add any number of levels here
         # ...
         
          {"level_id": "meta_level",
            "models": [#'LinearRegression',
                       'RandomForestRegressor'
                      ],
            "n_folds": 5,
            "seeds" : [42],
            "folder": "meta_level",
            "transformer": None,
            "frozen": False
          }
         
         
        ]

- Loop through each level in the stack

In [None]:
# create experiment
experiments_folder = "Experiments"
experiment_folder = 'experiment_1' # if None a folder with time stamp will be created
experiment_description = "Simple model, multiple seeds"


ml30_experiment = Experiment(title='ML 30 days',
                             description=experiment_description,
                             stack=stack,
                             model_zoo=model_zoo,
                             main_folder=f'{os.getcwd()}{os.sep}{experiments_folder}')

ml30_experiment.join_folder(experiment_folder)

results = ml30_experiment.run(X_train=X_train_.values,
                     y=y.values, 
                     X_test=X_test_.values,
                     train_idxs = X_train_.index,
                     test_idxs = X_test_.index)
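
The data flow of a run, where the outputs of one level become the inputs of the next, can be condensed into a few lines with `cross_val_predict`. This is only a sketch on synthetic data with assumed models and `*_demo` names; unlike the classes above, it refits each level-1 model on all rows for the test features instead of averaging the fold models.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)
X_demo = rng.normal(size=(60, 4))
y_demo = X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.1, size=60)
X_test_demo = rng.normal(size=(15, 4))

level_1 = {'Lasso': Lasso(alpha=0.01), 'KNN': KNeighborsRegressor()}
cv = KFold(n_splits=5, shuffle=True, random_state=42)

# level 1: the OOF predictions become the training features of the next level
level_1_oofs = np.column_stack(
    [cross_val_predict(model, X_demo, y_demo, cv=cv) for model in level_1.values()])

# simplification: refit on all rows to produce the test features
level_1_test = np.column_stack(
    [model.fit(X_demo, y_demo).predict(X_test_demo) for model in level_1.values()])

# meta level: a simple model trained on the level-1 outputs
meta_model = LinearRegression().fit(level_1_oofs, y_demo)
final_predictions = meta_model.predict(level_1_test)
print(level_1_oofs.shape, final_predictions.shape)
```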



In [None]:
# final results
results.head(10)

### Submit the results

In [None]:
predictions = results.iloc[:, -1].values

In [None]:
# Save the predictions to a CSV file
output = pd.DataFrame({'id': X_test.index,
                       'target': predictions})
output.to_csv('submission.csv', index=False)

In [None]:
# results 
output.head(20)