### Stacking

This notebook uses another ensemble technique, stacking. The first-layer estimators pass their predictions as additional input features to a second-layer estimator (the combiner, or meta-model), which is itself a trainable model. A stacking model can be built with mlxtend or implemented from scratch; this notebook does both:

1. Stacking from scratch
2. Stacking using mlxtend
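
To make the idea concrete before the two implementations, here is a minimal sketch that uses scikit-learn's own `StackingRegressor` rather than either approach in this notebook. The synthetic data from `make_regression` and the names `stack_demo` and `meta_features` are assumptions made for illustration only.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic data stands in for the housing set (an assumption for illustration).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

base_estimators = [
    ("dt", DecisionTreeRegressor(min_samples_leaf=3, random_state=0)),
    ("knn", KNeighborsRegressor(n_neighbors=5)),
]

# passthrough=True gives the final estimator the base predictions
# together with the original features, as described above.
stack_demo = StackingRegressor(
    estimators=base_estimators,
    final_estimator=Ridge(),
    cv=5,
    passthrough=True,
)
stack_demo.fit(X_tr, y_tr)

# transform() returns the second-layer inputs: one prediction column per
# base estimator, followed by the original feature columns.
meta_features = stack_demo.transform(X_te)
print(meta_features.shape)
print("R^2 on held-out data:", stack_demo.score(X_te, y_te))
```

The second-layer input has two extra columns here, one per base estimator, which is the same layout the from-scratch version assembles by hand.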

#### Stacking from scratch

In [144]:
import datetime
import warnings

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from mlxtend.regressor import StackingCVRegressor, StackingRegressor
from sklearn.ensemble import (AdaBoostRegressor, ExtraTreesRegressor,
                              GradientBoostingRegressor, RandomForestRegressor)
from sklearn.linear_model import Lasso, LinearRegression, LogisticRegression, Ridge
from sklearn.model_selection import GridSearchCV, KFold, train_test_split
from sklearn.svm import SVR

import utils

RANDOM_SEED = 42
warnings.filterwarnings('ignore')

In [136]:
df_train = pd.read_csv('dataset/df_train.csv')
df_test = pd.read_csv('dataset/df_test.csv')
target = df_train['SalePrice']
df_train = df_train.drop(['SalePrice'], axis = 1)
X_train, X_test, y_train, y_test = train_test_split(df_train, target, test_size = 0.25, random_state = 42)

<i>Building the first level estimators (weak estimators)</i>

In [137]:
# Building the first-level (weak) estimators
from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor

def Stacking(model, train, y, test, n_fold):
    """Return (test predictions, out-of-fold train predictions) for one estimator."""
    # No shuffling: recent scikit-learn rejects random_state when shuffle=False.
    folds = KFold(n_splits=n_fold)
    train_pred = np.empty(len(train), dtype=float)

    for train_indices, val_indices in folds.split(train):
        x_train, x_val = train.iloc[train_indices], train.iloc[val_indices]
        y_train = y.iloc[train_indices]

        model.fit(x_train, y_train)
        # Store each out-of-fold prediction at its original row position.
        train_pred[val_indices] = model.predict(x_val)

    # Test predictions come from the model as fitted on the last fold.
    test_pred = model.predict(test)
    return test_pred.reshape(-1, 1), train_pred
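
As a cross-check on the out-of-fold logic, the sketch below compares a manual fold loop with scikit-learn's `cross_val_predict`, which yields the same kind of predictions in a single call. The synthetic data and the names `oof_manual` and `oof_auto` are assumptions for illustration; the notebook itself does not use `cross_val_predict`.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (an assumption for illustration).
X, y = make_regression(n_samples=120, n_features=5, noise=5.0, random_state=0)
model = DecisionTreeRegressor(min_samples_leaf=3, random_state=0)
folds = KFold(n_splits=5)  # no shuffling: validation folds are contiguous blocks

# Manual out-of-fold loop, writing each prediction back to its own row.
oof_manual = np.empty(len(y))
for train_idx, val_idx in folds.split(X):
    model.fit(X[train_idx], y[train_idx])
    oof_manual[val_idx] = model.predict(X[val_idx])

# The same out-of-fold predictions obtained in one call.
oof_auto = cross_val_predict(model, X, y, cv=folds)
print(np.allclose(oof_manual, oof_auto))
```

Every training row is predicted by a model that never saw it, which is what keeps the second-layer estimator from learning on leaked targets.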

In [138]:
# weak estimator 1
model_dt = DecisionTreeRegressor(min_samples_leaf=3, min_samples_split=9, random_state=500)

# weak estimator 2
model_knn = KNeighborsRegressor(n_neighbors=5, algorithm='ball_tree')

# Making predictions with the first-level estimators
test_dt, train_dt = Stacking(model=model_dt, n_fold=10, train=X_train, test=X_test, y=y_train)
test_knn, train_knn = Stacking(model=model_knn, n_fold=5, train=X_train, test=X_test, y=y_train)

# Converting them to DataFrames; named columns keep the labels unique and string-typed
train_dt = pd.DataFrame(train_dt, columns=['pred_dt'])
test_dt = pd.DataFrame(test_dt, columns=['pred_dt'])

train_knn = pd.DataFrame(train_knn, columns=['pred_knn'])
test_knn = pd.DataFrame(test_knn, columns=['pred_knn'])

Appending the predictions to the dataset

In [139]:
pred_df_train = pd.concat([train_dt, train_knn], axis = 1)
pred_df_test = pd.concat([test_dt, test_knn], axis = 1)

# Concatenating the first-level predictions with the X_train and X_test data
x_train_second = pd.concat([X_train.reset_index(drop=True), pred_df_train.reset_index(drop=True)], axis=1)
x_test_second = pd.concat([X_test.reset_index(drop=True), pred_df_test.reset_index(drop=True)], axis=1)

Building the second layer estimator

In [140]:
from sklearn.linear_model import Ridge

reg_meta = Ridge(random_state=500)

# Training the meta-model on the second-level training set
reg_meta.fit(x_train_second.values, y_train)

Ridge(alpha=1.0, copy_X=True, fit_intercept=True, max_iter=None,
      normalize=False, random_state=500, solver='auto', tol=0.001)

Using the stacked ensemble for predictions

In [141]:
pred_stack = reg_meta.predict(x_test_second)
r2, rmse, rmsle = utils.evaluate_model(pred_stack, y_test)
df_metric = pd.DataFrame({'model':'Stacking','rmse':[rmse], 'r2':[r2], 'rmsle':[rmsle]})
df_metric.set_index('model')

| model | rmse | r2 | rmsle |
|---|---|---|---|
| Stacking | 0.118419 | 0.91376 | 0.009237 |
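
The `utils.evaluate_model` helper is not shown in this notebook. The sketch below is a hypothetical stand-in, named `evaluate_model_sketch`, that assumes the conventional definitions of R², RMSE and RMSLE and the same argument order (predictions first). The real helper may differ.

```python
import numpy as np

def evaluate_model_sketch(y_pred, y_true):
    """Hypothetical stand-in for utils.evaluate_model; returns (r2, rmse, rmsle)."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    residual = y_true - y_pred

    # Root mean squared error.
    rmse = np.sqrt(np.mean(residual ** 2))
    # R^2 = 1 - SS_res / SS_tot.
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    # RMSLE compares log1p-transformed values, so inputs must be greater than -1.
    rmsle = np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2))
    return r2, rmse, rmsle

# Toy values chosen for illustration only.
r2, rmse, rmsle = evaluate_model_sketch([2.0, 4.0, 6.0], [2.0, 4.0, 7.0])
print(r2, rmse, rmsle)
```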


In [142]:
print('Train score:',reg_meta.score(x_train_second, y_train))
print('Test score:',reg_meta.score(x_test_second, y_test))

Train score: 0.9390033322720837
Test score: 0.9137604157867704


#### Stacking using mlxtend

In [145]:
ridge = Ridge(random_state=RANDOM_SEED)
lasso = Lasso(random_state=RANDOM_SEED)
rf = RandomForestRegressor(random_state=RANDOM_SEED)

params = {'lasso__alpha': [0.1, 1.0, 10.0],
          'ridge__alpha': [0.1, 1.0, 10.0]}

stack = StackingCVRegressor(regressors=(lasso, ridge),
                            meta_regressor=rf, 
                            random_state=RANDOM_SEED,
                            use_features_in_secondary=True)

grid = GridSearchCV(
    estimator=stack, 
    param_grid=params, 
    cv=5,
    refit=True
)

grid.fit(X_train, y_train)

print("Best: %f using %s" % (grid.best_score_, grid.best_params_))

Best: 0.896329 using {'lasso__alpha': 10.0, 'ridge__alpha': 10.0}
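
The grid keys `lasso__alpha` and `ridge__alpha` follow the `<estimator name>__<parameter>` convention for nested parameters. The sketch below shows that convention with scikit-learn's `StackingRegressor` instead of mlxtend, on synthetic data. The names `stack_sk` and `grid_sk`, the smaller grid and the reduced forest size are assumptions chosen to keep the example quick.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic data (an assumption for illustration).
X, y = make_regression(n_samples=150, n_features=6, noise=10.0, random_state=42)

stack_sk = StackingRegressor(
    estimators=[("lasso", Lasso(random_state=42)), ("ridge", Ridge(random_state=42))],
    final_estimator=RandomForestRegressor(n_estimators=20, random_state=42),
    cv=3,
    passthrough=True,  # also pass the original features to the meta-model
)

# Nested parameters are addressed as <estimator name>__<parameter>.
alpha_keys = sorted(k for k in stack_sk.get_params() if k.endswith("__alpha"))
print(alpha_keys)

grid_sk = GridSearchCV(
    estimator=stack_sk,
    param_grid={"lasso__alpha": [0.1, 1.0], "ridge__alpha": [0.1, 1.0]},
    cv=3,
)
grid_sk.fit(X, y)
print(grid_sk.best_params_)
```

Calling `get_params()` on the stacking object is a quick way to check which keys a grid may contain before launching a long search.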


In [146]:
grid_pred = grid.predict(X_test)
r2, rmse, rmsle = utils.evaluate_model(grid_pred, y_test)
df_metric = pd.DataFrame({'model':'Stacking','rmse':[rmse], 'r2':[r2], 'rmsle':[rmsle]})

In [147]:
df_metric

| | model | rmse | r2 | rmsle |
|---|---|---|---|---|
| 0 | Stacking | 0.132791 | 0.891556 | 0.010461 |
