# Example using Random Forest

This example shows how to apply several dynamic classifier selection (DCS) and dynamic ensemble selection (DES) techniques to a classification dataset.

A very important aspect in dynamic selection is the generation of a pool of classifiers. A common practice in the dynamic selection literature is to generate a pool of classifiers using the Bagging (Bootstrap Aggregating) method.

In this example we generate the pool of classifiers with the Random Forest implementation from the scikit-learn library, which builds on Bagging. We then compare the results obtained with the standard Random Forest combination against those obtained with dynamic selection techniques, which select only the most competent classifiers for each query sample.

In [1]:
from sklearn.datasets import load_breast_cancer
from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Example of DCS techniques
# (module paths below assume the DESlib package, installed with `pip install deslib`)
from deslib.dcs.ola import OLA
from deslib.dcs.a_priori import APriori
from deslib.dcs.mcb import MCB
# Example of DES techniques
from deslib.des.knora_e import KNORAE
from deslib.des.des_p import DESP
from deslib.des.knora_u import KNORAU

## Loading a classification dataset and preparing the data

In [None]:
data = load_breast_cancer()
X = data.data
y = data.target

# split the data into training and test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

# Split the data into training and DSEL for DS techniques
X_train, X_dsel, y_train, y_dsel = train_test_split(X_train, y_train, test_size=0.5)

## Training a Random Forest

Here we generate a pool of classifiers using the Random Forest technique, in which each base classifier is a decision tree. The library works with any type of base classifier; the only requirement is that the base classifiers can estimate probabilities through the predict_proba method.

In [None]:
# Train a Random Forest of 10 decision trees; its trees form the pool of classifiers
pool_classifiers = RandomForestClassifier(n_estimators=10)
pool_classifiers.fit(X_train, y_train)
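
As a quick check of the predict_proba requirement described above, the following sketch trains a separate forest on the full dataset and confirms that every tree stored in estimators_ exposes predict_proba. The names X_demo, y_demo, and forest are introduced here for illustration only and are not part of the example's pipeline. The sketch also shows that scikit-learn's forest probabilities are the average of its trees' probabilities.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Illustrative data and model (separate from the variables used in this example)
X_demo, y_demo = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X_demo, y_demo)

# The fitted decision trees are the base classifiers of the pool
trees = forest.estimators_
all_have_proba = all(hasattr(tree, "predict_proba") for tree in trees)

# Average the per-tree class probabilities and compare with the forest's output
mean_tree_proba = np.mean([tree.predict_proba(X_demo) for tree in trees], axis=0)
matches = np.allclose(mean_tree_proba, forest.predict_proba(X_demo))

print(len(trees), all_have_proba, matches)
```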

## Initializing DS techniques

Here we initialize the DS techniques. Three DCS and three DES techniques are considered in this example, each with the size of the region of competence set to k = 5.

In [None]:
# DES techniques
knorau = KNORAU(pool_classifiers, k=5)
kne = KNORAE(pool_classifiers, k=5)
desp = DESP(pool_classifiers, k=5)
# DCS techniques
ola = OLA(pool_classifiers, k=5)
mcb = MCB(pool_classifiers, k=5)
apriori = APriori(pool_classifiers, k=5)
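
To make the role of k concrete, here is a minimal sketch of how a region of competence can be found: the k samples of the dynamic selection set (DSEL) that lie closest to the query. It uses plain Euclidean distance on made-up 2-D points and a hypothetical helper named region_of_competence; it is not the library's implementation.

```python
import math

def region_of_competence(query, dsel_X, k=5):
    """Return the indices of the k DSEL samples closest to `query`."""
    # Pair each distance with its sample index, then keep the k smallest
    distances = sorted((math.dist(query, x), i) for i, x in enumerate(dsel_X))
    return [i for _, i in distances[:k]]

# Toy DSEL with six 2-D points (made-up values, for illustration only)
toy_dsel_X = [(0.0, 0.0), (0.1, 0.2), (0.9, 1.0), (1.0, 1.1), (0.2, 0.1), (5.0, 5.0)]
toy_query = (0.0, 0.1)

neighbors = region_of_competence(toy_query, toy_dsel_X, k=3)
print(neighbors)  # [0, 1, 4]
```

A larger k gives a more stable estimate of each classifier's local competence, while a smaller k keeps the estimate closer to the query sample.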

## Fitting the DS techniques

The function fit(data, target) is used to fit each dynamic selection method. It prepares the algorithm that estimates the region of competence (e.g., the K-NN algorithm) and pre-processes the information required to apply the DS techniques.

In [None]:
knorau.fit(X_dsel, y_dsel)
kne.fit(X_dsel, y_dsel)
desp.fit(X_dsel, y_dsel)
ola.fit(X_dsel, y_dsel)
mcb.fit(X_dsel, y_dsel)
apriori.fit(X_dsel, y_dsel)

## Calculate classification accuracy of each technique

The first result is the classification accuracy of the Random Forest classifier, which combines the outputs of all its decision trees. Note that scikit-learn averages the trees' predicted class probabilities rather than taking a hard majority vote.

Using DS techniques, instead of combining all decision trees, only the ones that are more competent locally are used for classification. In the case of DCS techniques, the single decision tree that is most competent locally is used for prediction. In the case of DES techniques, an ensemble containing the most competent decision trees is selected to predict the label of a given query sample.

In [None]:
print('Classification accuracy of Random Forest: ', pool_classifiers.score(X_test, y_test))
print('Classification accuracy of KNORA-Union: ', knorau.score(X_test, y_test))
print('Classification accuracy of KNORA-Eliminate: ', kne.score(X_test, y_test))
print('Classification accuracy of DESP: ', desp.score(X_test, y_test))
print('Classification accuracy of OLA: ', ola.score(X_test, y_test))
print('Classification accuracy of MCB: ', mcb.score(X_test, y_test))
print('Classification accuracy of A priori: ', apriori.score(X_test, y_test))
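
The difference between static combination, DCS, and DES described above can be sketched on a toy problem. The three base classifiers, the neighbor values, and the helper functions below are all hypothetical. The DCS rule follows the idea behind OLA (pick the classifier with the highest local accuracy), and the DES rule follows the idea behind KNORA-Eliminate (keep the classifiers that are correct on every neighbor). Both are simplified and are not the library's implementation.

```python
from collections import Counter

# Three hypothetical base classifiers on 1-D inputs
def clf_a(x): return 1 if x > 0.5 else 0
def clf_b(x): return 1 if x > 2.0 else 0
def clf_c(x): return 0  # always predicts class 0

pool = [clf_a, clf_b, clf_c]

# Region of competence of the query: labelled DSEL neighbors (made-up values)
neighbors_X = [0.6, 0.8, 1.0]
neighbors_y = [1, 1, 1]
query = 0.9

def local_accuracy(clf):
    """Fraction of the neighbors that `clf` labels correctly."""
    hits = sum(clf(x) == y for x, y in zip(neighbors_X, neighbors_y))
    return hits / len(neighbors_X)

def majority_vote(classifiers, x):
    """Most frequent label predicted by `classifiers` for sample `x`."""
    return Counter(clf(x) for clf in classifiers).most_common(1)[0][0]

# Static combination: every classifier votes, regardless of local competence
static_pred = majority_vote(pool, query)

# DCS (OLA-style): use only the single most locally accurate classifier
dcs_pred = max(pool, key=local_accuracy)(query)

# DES (KNORA-Eliminate-style): keep classifiers that are correct on all neighbors
selected = [clf for clf in pool if local_accuracy(clf) == 1.0]
if not selected:
    # Simplification: fall back to the whole pool. KNORA-Eliminate instead
    # shrinks the region of competence until at least one classifier qualifies.
    selected = pool
des_pred = majority_vote(selected, query)

print(static_pred, dcs_pred, des_pred)  # 0 1 1
```

Here the static vote is outvoted by two locally incompetent classifiers, while both dynamic rules rely only on the classifier that is accurate near the query.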