# Stacking Ensemble

Stacking ensemble methods take the predictions of several base models and use a meta-estimator to learn how to combine those first-layer predictions into a final prediction. For this experiment, a combination of the naive Bayes, logistic regression, support vector machine, random forest, and XGBoost models will be used in the sklearn.ensemble.StackingClassifier model. The meta-estimator will be a logistic regression model, because more complex models are often avoided as meta-estimators: they tend to overfit more easily.
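
As a minimal sketch of the idea described above, the following cell stacks three base models under a logistic regression meta-estimator. It runs on synthetic data from `make_classification`, not on the sign language dataset, and the names `demo_estimators` and `demo_stack`, the choice of base models, and all hyperparameters are illustrative assumptions rather than the tuned values from the earlier notebooks.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# synthetic stand-in data (assumption: NOT the sign language dataset)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=8, n_classes=3, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0, stratify=y_demo
)

# first layer: StackingClassifier expects a list of (name, estimator) tuples
demo_estimators = [
    ('nb', GaussianNB()),
    ('svc', SVC(kernel='rbf', gamma='auto')),
    ('rf', RandomForestClassifier(n_estimators=50, random_state=0)),
]

# second layer: logistic regression trained on cross-validated
# predictions of the first-layer models
demo_stack = StackingClassifier(
    estimators=demo_estimators,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
demo_stack.fit(X_tr, y_tr)

# transform() returns the first-layer outputs that the meta-estimator sees
meta_features = demo_stack.transform(X_te)
print(meta_features.shape)
print(demo_stack.score(X_te, y_te))
```

The meta-estimator is fitted on out-of-fold predictions (here 5-fold), so it does not see base-model outputs for samples those models were trained on.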

In [2]:
#libraries
from sklearn.ensemble import StackingClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC 
from sklearn.metrics import classification_report
import pickle

In [None]:
#opening pickle file of the combined augmented data
with open('/content/drive/Shareddrives/SignLanguageData/combined_augmented_data.pkl','rb') as f:
    X_train,y_train,X_test,y_test = pickle.load(f)

In [None]:
#standardizing data
#flattening each sample to a 1D feature vector, then fitting the sklearn StandardScaler on the training data only
X_train = X_train.reshape(X_train.shape[0],-1)
X_test = X_test.reshape(X_test.shape[0],-1)
sc = StandardScaler().fit(X_train)
X_train = sc.transform(X_train)
X_test = sc.transform(X_test)
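
The flatten-then-standardize step can be checked on a small synthetic array. This is a sketch under assumptions: the 8x8 sample shape and the `*_demo` names are made up for illustration, since the actual shape of the pickled arrays is not shown in this notebook.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# synthetic image-like arrays (assumption: the 8x8 shape is arbitrary)
rng = np.random.default_rng(0)
X_train_demo = rng.normal(loc=5.0, scale=2.0, size=(100, 8, 8))
X_test_demo = rng.normal(loc=5.0, scale=2.0, size=(40, 8, 8))

# flatten every sample to one row: (n_samples, 8, 8) -> (n_samples, 64)
X_train_flat = X_train_demo.reshape(X_train_demo.shape[0], -1)
X_test_flat = X_test_demo.reshape(X_test_demo.shape[0], -1)

# fit on the training rows only, then apply the same statistics to both splits
scaler_demo = StandardScaler().fit(X_train_flat)
X_train_std = scaler_demo.transform(X_train_flat)
X_test_std = scaler_demo.transform(X_test_flat)

print(X_train_flat.shape)
print(np.allclose(X_train_std.mean(axis=0), 0.0))
print(np.allclose(X_train_std.std(axis=0), 1.0))
```

Only the training split ends up with exactly zero mean and unit variance per feature; the test split is scaled with the training statistics, which keeps test information out of the fit.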

All models are defined as estimators, using the best parameters found in the previous experiment notebooks.

In [None]:
#defining estimators
#StackingClassifier expects a list of (name, estimator) tuples
all_estimators = [
    ('svc', SVC(kernel = 'rbf', gamma = 'auto', C = 2.6389473684210527))
]

In [None]:
#training stacking classifier 
all_stack = StackingClassifier(estimators=all_estimators, final_estimator=LogisticRegression())
all_stack.fit(X_train,y_train)

#predictions 
y_pred_train = all_stack.predict(X_train)
y_pred_test = all_stack.predict(X_test)

In [None]:
#classification report for train
print(classification_report(y_train,y_pred_train))

In [None]:
#classification report for test 
print(classification_report(y_test,y_pred_test))
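
To compare the train and test reports numerically instead of by eye, `classification_report` can return a dictionary. The sketch below assumes synthetic data and a plain logistic regression as a stand-in classifier; `macro_f1` is a hypothetical helper written for this example, not part of sklearn.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# synthetic stand-in data and model (assumption: not the stacked model above)
X_demo, y_demo = make_classification(
    n_samples=400, n_features=20, n_informative=8, n_classes=3, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0, stratify=y_demo
)
clf_demo = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

def macro_f1(y_true, y_pred):
    # hypothetical helper: pulls the macro-averaged F1 out of the report dict
    report = classification_report(y_true, y_pred, output_dict=True)
    return report['macro avg']['f1-score']

train_f1 = macro_f1(y_tr, clf_demo.predict(X_tr))
test_f1 = macro_f1(y_te, clf_demo.predict(X_te))

# a large positive gap between train and test suggests overfitting
print(f"macro F1 train={train_f1:.3f} test={test_f1:.3f} gap={train_f1 - test_f1:.3f}")
```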

Only the best-performing models are defined as estimators, using the best parameters found in the previous experiment notebooks.

In [None]:
#defining estimators
#StackingClassifier expects a list of (name, estimator) tuples
best_estimators = [
    ('svc', SVC(kernel = 'rbf', gamma = 'auto', C = 2.6389473684210527))
]

In [None]:
#training stacking classifier 
best_stack = StackingClassifier(estimators=best_estimators, final_estimator=LogisticRegression())
best_stack.fit(X_train,y_train)

#predictions 
y_pred_train = best_stack.predict(X_train)
y_pred_test = best_stack.predict(X_test)

In [None]:
#classification report for train
print(classification_report(y_train,y_pred_train))

In [None]:
#classification report for test 
print(classification_report(y_test,y_pred_test))