# Ensemble methods. Exercises


In this section we have only one exercise:

1. Find the best three classifier in the stacking method using the classifiers from scikit-learn package, such as:


* Linear regression,
* Nearest Neighbors,
* Linear SVM,
* Decision Tree,
* Naive Bayes,
* QDA.

In [176]:
%store -r data_set
%store -r labels
%store -r test_data_set
%store -r test_labels
%store -r unique_labels

## Exercise 1: Find the best three classifier in the stacking method

In [177]:
import numpy as np
from sklearn.metrics import accuracy_score

from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

In [178]:
used_cl_list = [KNeighborsClassifier, DecisionTreeClassifier, GaussianNB, QuadraticDiscriminantAnalysis, SVC]
def build_classifiers():
    classifiers = []
    classifiers.append(KNeighborsClassifier())
    classifiers.append(DecisionTreeClassifier())
    classifiers.append(GaussianNB())
    classifiers.append(QuadraticDiscriminantAnalysis())
    classifiers.append(SVC(gamma='auto'))
    classifiers.append(LinearRegression())
    return classifiers
    #return [classifier() for classifier in used_cl_list]

In [183]:
def build_stacked_classifier(classifiers, st_cl):
    output = []
    fitted_classifiers = [cl.fit(data_set, labels) for cl in classifiers]
    for classifier in fitted_classifiers:
        output.append(classifier.predict(data_set))
    output = np.array(output).reshape((130,len(classifiers)))
    
    stacked_classifier = st_cl
    stacked_classifier.fit(output.reshape((130,len(classifiers))), labels.reshape((130,)))
    test_set = []
    for classifier in fitted_classifiers:
        test_set.append(classifier.predict(test_data_set))
    test_set = np.array(test_set).reshape((len(test_set[0]),len(classifiers)))
    predicted = stacked_classifier.predict(test_set)
    print('stacked classifier name: ', st_cl.__class__.__name__)
    print('predicted:\n', predicted, '\n')
    return predicted

In [184]:
predicted = []
for cl_idx in range(len(used_cl_list) -1):
    classifiers = build_classifiers()
    plain_classifiers = [classifiers[plain_cl_idx] for plain_cl_idx in range(len(classifiers)) if cl_idx != plain_cl_idx]
    predicted.append(build_stacked_classifier(plain_classifiers, classifiers[cl_idx]))

accuracy = accuracy_score(test_labels, predicted[0])
accuracies_list = [accuracy_score(test_labels, prediction) for prediction in predicted]
print( 'accuracies_list -> ', accuracies_list )

stacked classifier name:  KNeighborsClassifier
predicted:
 [0 2 0 0 2 0 0 0 2 0 0 0 2 2 0 0 2 2 0 2] 

stacked classifier name:  DecisionTreeClassifier
predicted:
 [1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2] 

stacked classifier name:  GaussianNB
predicted:
 [2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2] 

stacked classifier name:  QuadraticDiscriminantAnalysis
predicted:
 [0 0 0 0 2 2 0 0 2 0 0 0 2 2 0 0 2 1 2 1] 

accuracies_list ->  [0.4, 0.2, 1.0, 0.35]
