# Designing A Stacked Ensemble Estimator That Is Scikit Learn Compatible

### Scott Allen Cambo [(Website)](www.scottallencambo.com)

As many of you know, [Scikit Learn](http://scikit-learn.org/stable/) is easily the most popular library for machine learning in Python. Like me, you were probably first exposed to it as part of a classroom project or tutorial, and the excitement of seeing some reasonable accuracy on fresh new data was enough to get you on the machine learning bandwagon. Even though Scikit Learn is well designed and flexible, you will eventually find yourself saying "If only it could do *this other thing*, then I'd really be getting somewhere". When this happens, the first thing you should do is check to make sure that someone else hasn't already thought of this and developed exactly what you need (or at least something that is close enough). If you still can't find what you are looking for, then you may need to delve into the exciting world of extending Scikit Learn with your own custom functionality.

In this blog, I'll give an example of a scenario where I wanted to recreate an approach to classifying *context* by Vaizman, Ellis, and Lanckriet at UCSD using their **[publicly available Extrasensory dataset](http://extrasensory.ucsd.edu/)**. In this work, classifiers trained on features derived from various sensors (accelerometer, gyroscope, GPS, etc.) are combined to get better classification accuracy than they might have independently. The paper talks about this in the context of "sensor fusion", a topic within ubiquitous computing that aims to resolve input from multiple sensors for better insight into the surrounding environment. However, the ways of doing sensor fusion described in the paper are really two different forms of ensemble learning called ***bagging*** and ***stacking***.

### Note : Scikit learn already has built-in functionality for creating bagging estimators from a set of base estimators. In practice, you should use these whenever possible.

However, for this research, there are a couple additional features that will probably be helpful:
* The ability to tell the estimator exactly which sensors to use training data from
* The ability to "stack" classifiers. 

This means that a meta classifier learns to predict the correct label using the predictions from a group of varying classifiers that are each prone to different errors.
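
If you are on scikit-learn 0.22 or later, it is worth checking the built-in `StackingClassifier` in `sklearn.ensemble` before rolling your own. Here is a minimal sketch, assuming synthetic data and two made-up column groups standing in for sensors (`group_model`, `X_toy`, and the group names are hypothetical and not part of this project):

```python
from sklearn.compose import ColumnTransformer
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Synthetic stand-in data; the column groups below are hypothetical "sensors".
X_toy, y_toy = make_classification(n_samples=300, n_features=6,
                                   n_informative=4, random_state=0)

def group_model(columns):
    """Build a base classifier that only sees the given column positions."""
    keep = ColumnTransformer([("keep", "passthrough", columns)])
    return make_pipeline(keep, LogisticRegression(class_weight="balanced"))

stack = StackingClassifier(
    estimators=[("group_a", group_model([0, 1, 2])),
                ("group_b", group_model([3, 4, 5]))],
    final_estimator=LogisticRegression(class_weight="balanced"),
    cv=5,  # the meta classifier is trained on out-of-fold base predictions
)
stack.fit(X_toy, y_toy)
toy_pred = stack.predict(X_toy[:5])
```

The built-in class has no notion of sensors, so the per-sensor feature selection still has to be expressed some other way (here, one `ColumnTransformer` per base model).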

# Load Modules

You can ignore this part here. If you are curious about the code that I wrote for importing the data, you can find it in the "extrasense" directory of my **[GitHub repo for this project](https://github.com/scottofthescience/extrasensory_ar_analysis)**.

In [1]:
import pandas as pd
import numpy as np
import sys
sys.path.append('/home/sac086/extrasensory/')
import extrasense as es
from sklearn.metrics import accuracy_score, roc_auc_score
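# NOTE: Imputer was removed from sklearn.preprocessing in scikit-learn 0.22;
# newer versions provide sklearn.impute.SimpleImputer instead.
from sklearn.preprocessing import StandardScaler, Imputer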
from sklearn.model_selection import cross_val_score, KFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.base import BaseEstimator, ClassifierMixin, clone
import xgboost as xgb
import inspect

# Load data

In [2]:
features_df = es.get_impersonal_data(leave_users_out=[], drop_nan_rows=False, sensors=None, label_type="activity", labeled_only=True)

timestamps = features_df.pop('timestamp')
label_source = features_df.pop("label_source")
labels = features_df.pop("label")
user_ids = features_df.pop("user_id")

In [3]:
from collections import Counter
c = Counter(labels)
for key, val in c.items():
    print("%s : %s" % (key, val))

BICYCLING : 5020
FIX_running : 1090
LYING_DOWN : 141461
FIX_walking : 22136
SITTING : 136728
STAIRS : 822


In [4]:
# The Extrasensory group was thoughtful enough to provide 
# the user id numbers of the participants whose data was 
# used in each fold of the cross validation process.
# We'll use those same folds so that we can more directly 
# compare the results of our classifier with theirs.

folds = es.get_uids_from_es_folds()

# Helpers

I've started using the [Pipeline classes](http://scikit-learn.org/stable/modules/pipeline.html) in Scikit Learn to make sure that I'm properly normalizing my data and handling NaNs. This can be an easy thing to forget and a tedious thing to add to every training process so instead I have the *make_pipeline()* function below that adds all this to my estimator of choice. For this blog, we'll be using this function with our custom ensemble estimator.

In [5]:
def make_pipeline(clf, **params):
    """Helper function that takes a classifier and its parameters as input
    and returns a pipeline with an imputer that replaces NaN values with the mean value for
    that feature and a StandardScaler() for normalizing the data to have a zero mean and unit variance.
    
    Args:
        clf (sklearn estimator): the scikit learn compatible estimator that should be added to the pipeline
        **params (dict): a dictionary representing the parameter names and values for clf
    
    Returns:
        A Scikit Learn Pipeline object with clf(**params), an Imputer, and a StandardScaler
    """
    steps = []
    steps.append(('imputer', Imputer(missing_values='NaN', strategy='mean', axis=0)))
    steps.append(('standardize', StandardScaler()))
    steps.append(('clf', clf(**params)))
    model = Pipeline(steps)
    return model

In [6]:
# an example of how this function gets used
clf = make_pipeline(LogisticRegression, **dict(class_weight='balanced'))
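
One caveat if you are following along on a recent scikit-learn: `Imputer` was removed from `sklearn.preprocessing` in version 0.22, and `SimpleImputer` in `sklearn.impute` took its place. Below is a minimal sketch of the same helper under that assumption. The name `make_pipeline_modern` and the toy data are made up for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

def make_pipeline_modern(clf, **params):
    """Same idea as make_pipeline() above, written against SimpleImputer."""
    return Pipeline([
        # replace NaN values with the column mean learned during fit
        ('imputer', SimpleImputer(missing_values=np.nan, strategy='mean')),
        # zero mean and unit variance per feature
        ('standardize', StandardScaler()),
        ('clf', clf(**params)),
    ])

# Tiny made-up dataset with one missing value
X_demo = np.array([[1.0, np.nan], [2.0, 0.0], [3.0, 1.0], [4.0, 3.0]])
y_demo = np.array([0, 0, 1, 1])

demo_model = make_pipeline_modern(LogisticRegression, class_weight='balanced')
demo_model.fit(X_demo, y_demo)
demo_pred = demo_model.predict(X_demo)
```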

These next three functions are used to operationalize the same cross validation metrics that
were used in the ExtraSensory papers.

In [7]:
def get_train_test_ind(test_fold_uids, all_user_ids):
    """Takes in the test fold ids and the list of all user ids 
    for our features and returns the indices of the features that should be
    in the training data and test data
    
    Args:
        test_fold_uids (list): list containing the user ids of those who should be in the test fold
        all_user_ids (pandas Series): Series where each element represents a sample and the id of the user it came from
    
    Returns:
        a tuple where the first element holds the training indices and the second element holds
        the test indices"""
    bool_arr = all_user_ids.isin(test_fold_uids)
    test_ind = all_user_ids.index[bool_arr]
    bool_arr = np.logical_not(bool_arr)
    train_ind = all_user_ids.index[bool_arr]
    return train_ind, test_ind
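
To see what this split does, here is a small sketch on a made-up Series of user ids (the ids and variable names are invented for illustration):

```python
import pandas as pd

# Six samples from three hypothetical users
toy_user_ids = pd.Series(["u1", "u1", "u2", "u3", "u3", "u2"])

# Samples from users in the test fold go to the test set ...
test_mask = toy_user_ids.isin(["u2"])
toy_test_ind = toy_user_ids.index[test_mask]
# ... and every other sample goes to the training set
toy_train_ind = toy_user_ids.index[~test_mask]

print(list(toy_test_ind), list(toy_train_ind))
```

Note that these are index *labels*. Passing them to `.iloc`, as `test_model()` does further down, only works because the index is the default 0..n-1 range, where labels and positions coincide.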
    

In [8]:
def get_metrics(y, y_pred, verbose=True):
    """This function takes the true labels and the predicted labels and returns the 
    accuracy, sensitivity, specificity, balanced accuracy, and precision that were used in
    the ExtraSensory paper.
    
    Args:
        y (numpy array): the array of true labels (1 or 0)
        y_pred (numpy array or pandas Series): the predictions (1 or 0)
        verbose (bool): If true, the metrics will also be printed to the output
    
    Returns:
        a dictionary with the metrics and their values"""
    # Naive accuracy (correct classification rate):
    accuracy = np.mean(y_pred == y)
    
    # Count occurrences of true-positive, true-negative, false-positive, and false-negative:
    tp = np.sum(np.logical_and(y_pred, y))
    tn = np.sum(np.logical_and(np.logical_not(y_pred), np.logical_not(y)))
    fp = np.sum(np.logical_and(y_pred, np.logical_not(y)))
    fn = np.sum(np.logical_and(np.logical_not(y_pred), y))
    
    # Sensitivity (=recall=true positive rate) and Specificity (=true negative rate).
    # These divisions assume the fold contains at least one positive and one negative example.
    sensitivity = float(tp) / (tp + fn)
    specificity = float(tn) / (tn + fp)
    
    # Balanced accuracy is a fairer replacement for the naive accuracy:
    balanced_accuracy = (sensitivity + specificity) / 2.
    
    # Precision:
    # Beware of this metric, since it may be too sensitive to rare labels.
    # In the ExtraSensory Dataset, there is large skew among the positive and negative classes,
    # and for each label the pos/neg ratio is different.
    # This can cause undesirable and misleading results when averaging precision across different labels.
    precision = float(tp) / (tp + fp)
    
    if verbose:
        print("-"*10)
        print('Accuracy*:         %.2f' % accuracy)
        print('Sensitivity (TPR): %.2f' % sensitivity)
        print('Specificity (TNR): %.2f' % specificity)
        print('Balanced accuracy: %.2f' % balanced_accuracy)
        print('Precision**:       %.2f' % precision)
        print("-"*10)
        
    return {'sensitivity' : sensitivity,
            'specificity' : specificity,
            'accuracy' : accuracy,
            'balanced accuracy' : balanced_accuracy,
            'precision' : precision}
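
As a quick sanity check of these definitions, here is a worked example on ten made-up labels (all values invented for illustration):

```python
import numpy as np

# Toy labels: 3 positives and 7 negatives
y_true_toy = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
y_pred_toy = np.array([1, 1, 0, 1, 0, 0, 0, 0, 0, 0])

tp = int(np.sum((y_pred_toy == 1) & (y_true_toy == 1)))  # predicted 1, truly 1
tn = int(np.sum((y_pred_toy == 0) & (y_true_toy == 0)))  # predicted 0, truly 0
fp = int(np.sum((y_pred_toy == 1) & (y_true_toy == 0)))  # predicted 1, truly 0
fn = int(np.sum((y_pred_toy == 0) & (y_true_toy == 1)))  # predicted 0, truly 1

accuracy = (tp + tn) / len(y_true_toy)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
balanced_accuracy = (sensitivity + specificity) / 2
precision = tp / (tp + fp)
```

Here accuracy is 8 out of 10, while balanced accuracy averages a sensitivity of 2/3 with a specificity of 6/7, so the rarer positive class counts as much as the common negative one.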

In [9]:
def test_model(model_getter, context, **params):
    """this function is used to operationalize the evaluation of an algorithm
    so that it can be directly compared to the results of the ExtraSensory work.
    Specifically, it uses the exact cross validation folds and metrics of the original 
    work.
    
    Args:
        model_getter (function or class): this should be a function that returns an instance of the model to evaluate
                                          or the model's class itself.
        context (str): This is a string that indicates which context label to evaluate
        
        **params (dict): a dictionary representing the model parameter names and the values to use
        
    Returns:
        a list of the metrics returned for all folds
    """
    folds = es.get_uids_from_es_folds()
    fold_metrics = []
    
    for i, kf in enumerate(folds):
        print('Fold #%s' % i)
        model = model_getter(**params)
        
        train_ind, test_ind = get_train_test_ind(kf, user_ids)
        
        print("Training model...")
        X_train = features_df.iloc[train_ind]
        y_train = labels.iloc[train_ind]
        y_train = np.array([1 if y == context else 0 for y in y_train])
        
        X_test = features_df.iloc[test_ind]
        y_test = labels.iloc[test_ind]
        y_test = np.array([1 if y == context else 0 for y in y_test])

        model.fit(X_train, y_train)
        
        print("Testing model...")
        y_pred = model.predict(X_test)
        
        metrics = get_metrics(y_test, y_pred, verbose=True)
        fold_metrics.append(metrics)
    
    return fold_metrics

In [10]:
def get_mean_metrics(metrics):
    """This function takes the list of fold metrics and aggregates them into a single metric
    Args:
        metrics (list): list of metrics from the get_metrics() method 
                        representing the results of each cross validation fold"""
    mean_metrics = {}
    
    for fold_metrics in metrics:
        for key, val in fold_metrics.items():
            if key in mean_metrics:
                mean_metrics[key].append(val)
            else:
                mean_metrics[key] = [val]
    
    print("-"*10)
    print('Accuracy*:         %.2f' % np.mean(mean_metrics['accuracy']))
    print('Sensitivity (TPR): %.2f' % np.mean(mean_metrics['sensitivity']))
    print('Specificity (TNR): %.2f' % np.mean(mean_metrics['specificity']))
    print('Balanced accuracy: %.2f' % np.mean(mean_metrics['balanced accuracy']))
    print('Precision**:       %.2f' % np.mean(mean_metrics['precision']))
    print("-"*10)
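
The function above only prints. If you would rather keep the averaged numbers around, a variant that returns them could look like the sketch below. The name `summarize_fold_metrics` and the example values are hypothetical.

```python
from statistics import mean

def summarize_fold_metrics(fold_metrics):
    """Return {metric name: mean across folds} for a list of per-fold dicts."""
    names = fold_metrics[0].keys()
    return {name: mean(fold[name] for fold in fold_metrics) for name in names}

# Two made-up folds
example_folds = [
    {"accuracy": 0.8, "precision": 0.5},
    {"accuracy": 0.9, "precision": 0.25},
]
summary = summarize_fold_metrics(example_folds)
```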

# Creating Our Ensemble Estimator

Scikit Learn provides A LOT of functionality for extending the code base, but it's important to keep in mind what it is that you actually want to achieve. Much of this functionality is there to help developers who plan either to contribute to the scikit learn code base or to maintain a serious development project that needs to be compatible (e.g. Scikit Flow, a wrapper for tensorflow that makes it compatible with Scikit Learn). If this is the direction you are thinking of, then you should probably read the ["contributing" page](http://scikit-learn.org/stable/developers/contributing.html) on the scikit learn website and not this blog.

For this work, we are simply making a somewhat maintainable, scikit learn compatible, but definitely not production-worthy estimator class. Fortunately for us, scikit learn is developed with a ["Duck Typing"](https://en.wikipedia.org/wiki/Duck_typing) principle, meaning that if it talks like a scikit learn estimator and it walks like a scikit learn estimator, then it is, for all practical purposes of compatibility, a scikit learn estimator. With that in mind, we need to know what scikit learn will be looking for when we hand it one of our own estimators...

### Parameters
Scikit learn methods and classes for optimizing and evaluating, like GridSearchCV, need to access and manipulate
(nearly) all the parameters used in the estimator. They will specifically call the ```get_params()``` and ```set_params()``` methods
of the estimator to do this. Fortunately, we don't have to write these ourselves since they are inherited from 
the BaseEstimator class that we'll extend. We do need to make sure that our initializer method, ```__init__```, lists
every parameter as an explicit keyword argument (rather than collecting them through ```*args``` or ```**kwargs```), that each parameter has a default value (even if that value is ```None```), and that each one is stored on the instance under the same name, because ```get_params()``` reads the parameter names from the ```__init__``` signature.

Here's an example where we use the **[```inspect```](https://docs.python.org/3/library/inspect.html)** module to set the class attributes in a clean way:
```
class NewClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self, k_neighbors=5, distance_function=None):
        args, _, _, values = inspect.getargvalues(inspect.currentframe())
        for arg in args:
            if arg != 'self':  # don't store the instance on itself
                setattr(self, arg, values[arg])
```
Generally speaking, you won't need to worry about what a frame is. All you really 
need to know is that this code gets the argument names and values from the current method
so that they can be stored as attributes on the instance.
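
If the frame inspection feels too magical, plain attribute assignment works just as well. The sketch below, built around a hypothetical `ToyClassifier`, shows what the inherited `get_params()` and `set_params()` then give you, plus `clone()`, which rebuilds an unfitted copy from those parameters. I list the mixin before `BaseEstimator` here, which is the order the scikit learn developer guide recommends.

```python
from sklearn.base import BaseEstimator, ClassifierMixin, clone

class ToyClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical estimator that only demonstrates parameter handling."""

    def __init__(self, k_neighbors=5, distance_function=None):
        # store every argument under the same name it has in the signature
        self.k_neighbors = k_neighbors
        self.distance_function = distance_function

toy = ToyClassifier(k_neighbors=3)
toy_params = toy.get_params()   # dict of parameter names and current values
toy.set_params(k_neighbors=7)   # what GridSearchCV does between candidates
toy_copy = clone(toy)           # new, unfitted instance with the same parameters
```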

### fit() and predict()
It should be pretty clear why ```fit()``` and ```predict()``` are needed. ```fit()``` is scikit learn's name for all
methods that start estimator training and ```predict()``` is the method that is generally used for making a prediction. In some cases, you might also add ```predict_proba()```, which returns the estimator's confidence in each
potential prediction. You will need it if you plan on scoring the estimator's probabilities with sklearn's [```roc_auc_score```](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html).
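
Here is a minimal sketch of how ```predict_proba()``` feeds ```roc_auc_score```, assuming synthetic data and a plain ```LogisticRegression``` in place of a custom estimator:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data, for illustration only
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.5, random_state=0, stratify=y_demo)

model = LogisticRegression().fit(X_tr, y_tr)

proba = model.predict_proba(X_te)   # one column per class, each row sums to 1
positive_scores = proba[:, 1]       # confidence in the positive class
auc = roc_auc_score(y_te, positive_scores)
```

The `[:, 1]` slice is the same trick used for the base classifiers later in this post: for a binary problem, the second column holds the probability of the positive class.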

### sklearn BaseEstimator and Mixins
The ```BaseEstimator``` class provides a lot of the general functionality needed in an estimator 
like the ```get_params()``` and ```set_params()``` methods. There are a few Mixins to choose from
depending on the type of estimator you are creating ([full list here](http://scikit-learn.org/stable/modules/classes.html#module-sklearn.base)). Since we are making a classifier, we'll just
use the ```ClassifierMixin``` class.

### ```check_estimator()```
You should know that there is a check_estimator() function in scikit learn which can be used to 
make sure that your new estimator is compatible, but it just so happens that it doesn't play too well
with ensembles as this [**github issue currently documents**](https://github.com/scikit-learn/scikit-learn/issues/6079#issuecomment-166990309). So in this case, we'll just test it in true Duck Type fashion by seeing if it quacks and walks like a duck.
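
In practice, the duck test can be as simple as handing the estimator to a scikit learn utility and seeing whether it runs. Below is a minimal sketch, assuming a throwaway `MajorityClassifier` invented for this example (it just predicts the most common training label) and made-up data:

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.model_selection import cross_val_score

class MajorityClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical estimator: always predicts the most frequent training label."""

    def fit(self, X, y):
        values, counts = np.unique(y, return_counts=True)
        self.classes_ = values
        self.majority_ = values[np.argmax(counts)]
        return self  # scikit learn expects fit() to return the estimator

    def predict(self, X):
        return np.full(len(X), self.majority_)

# Made-up data: 10 samples, 2 features, 7 negatives and 3 positives
X_duck = np.arange(20, dtype=float).reshape(10, 2)
y_duck = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])

# If this runs, the estimator quacks well enough for cross validation
duck_scores = cross_val_score(MajorityClassifier(), X_duck, y_duck, cv=3)
```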

# Designing Our Ensemble Classifier
Remember we want to add two things:
* The ability to choose features based on the sensors they come from
* The ability to take the ensemble group and feed it into a meta estimator (stacked ensemble)

Creating the first feature is simply a matter of including ```sensors``` as an initialization parameter and writing the ```fit()``` and ```predict()``` methods to use the ```sensors``` list to figure out which features will be used to train which classifiers.

In creating the second feature, we need to think about keeping the data we train our base estimators on separate from the data that we train our meta estimator on. Here's why.

First, let's think about why stacked ensembles have potential. Patterns can arise in the outputs of an ensemble of classifiers: one subset of the ensemble may err in a particular way on one kind of input, a different subset may err in a particular way on another kind of input, and so on. When we use a **"bagged"** ensemble, we ignore that there may be patterns and we aggregate across the outputs to get the most prominent answer. So if we have an ensemble of 5 classifiers, and 3 of them output ```True```, while 2 of them output ```False```, then the bagged ensemble estimator will predict ```True```. However, there is likely signal in this pattern from the output of the base estimators that we aren't using. Maybe one classifier is always wrong and its vote shouldn't be counted as much. The right meta estimator can learn this and effectively discount it. Maybe 3 of the 5 classifiers tend to err in the same way when confronted with a particular type of input and when this happens the error outweighs the correct classification of the other 2. Most meta estimators here should be able to catch this problem.

For our meta estimator to have input to learn from, the base estimators must first make predictions on something. If we use all of the training data to train our base estimators and then train our meta estimator on the predictions that the base estimators make on that same training data, then we are not learning how these estimators may behave in the real world. Instead, we are learning from their training errors, which are nearly always different from test errors. This is why we need to reserve a good portion of our training data for the meta estimator to learn from.

This means that our stacked ensemble's ```fit()``` method will need to follow this procedure:
1. split the training data into training data for the base classifiers and training data for the meta classifier
2. train the base classifiers
3. use the base classifiers to make predictions on the training data for the meta classifier
    * this creates a set of predictions from data the base classifiers have not yet seen allowing our meta estimator to learn the kind of prediction behavior the base classifiers will exhibit on data that is outside the training set
4. use the base classifier predictions and the true labels on the meta classifier training set to train the meta classifier
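
Before wrapping this in a class, here is the same four-step procedure as a bare-bones sketch. It assumes synthetic data and two made-up feature groups standing in for sensors. The names `feature_groups`, `base_models`, and `meta_model` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X_syn, y_syn = make_classification(n_samples=400, n_features=6,
                                   n_informative=4, random_state=0)
# Hypothetical stand-ins for sensors: each group is a set of column positions
feature_groups = {"group_a": [0, 1, 2], "group_b": [3, 4, 5]}

# 1. split the training data between the base and meta classifiers
X_base, X_meta, y_base, y_meta = train_test_split(
    X_syn, y_syn, test_size=0.5, random_state=0)

# 2. train one base classifier per feature group
base_models = {
    name: LogisticRegression().fit(X_base[:, cols], y_base)
    for name, cols in feature_groups.items()
}

# 3. predict on data the base classifiers have not seen
meta_features = np.column_stack([
    base_models[name].predict_proba(X_meta[:, cols])[:, 1]
    for name, cols in feature_groups.items()
])

# 4. train the meta classifier on those predictions and the true labels
meta_model = LogisticRegression().fit(meta_features, y_meta)
```

The meta classifier ends up with one input feature per base classifier, so its learned coefficients can be read as how much weight each base classifier's opinion gets.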

In [11]:
class BoostingStackingClassifier(BaseEstimator, ClassifierMixin):
    """
    BoostingStackingClassifier can be used for a bagging or stacking ensemble with an estimator of the
    programmer's choice.
    
    Attributes: 
        base_clf (sklearn compatible estimator): The base classifier to be used as the prototype for all base classifiers
        base_clf_params (dict): A dictionary of parameters and their values to be used in each base classifier
        meta_clf (sklearn compatible estimator): (stacked ensemble only) the classifier that learns from the predictions of all the base classifiers
        meta_clf_params (dict): A dictionary of parameters and their values to be used in the meta estimator
        confidence_threshold (float between 0 and 1): (bagging ensemble only) the prediction will be positive if the estimator's confidence is greater than this number
        sensors ([str]): the sensors representing the features to build classifiers for
        verbose (bool): If True, more output will print to the screen
        ensemble_type ("bagging" or "stacking"): determines the type of ensemble that will be used.
        scale (bool): If True, scale stats are calculated during training. Training data and test data are scaled accordingly
        impute (bool): If True, mean values are calculated during training. NaN values in training data and test 
            data are replaced with mean for that feature
    """

    
    def __init__(self, base_clf=LogisticRegression(class_weight="balanced"), base_clf_params=None, \
                       meta_clf=LogisticRegression(class_weight="balanced"), meta_clf_params=None, \
                       confidence_threshold=0.5, \
                       sensors=None, verbose=True, ensemble_type='bagging', \
                       scale=True, impute=True):
        args, _, _, values = inspect.getargvalues(inspect.currentframe())

        for arg in args:
            if arg != 'self':  # don't store the instance on itself
                setattr(self, arg, values[arg])
                
    def setup_imputer(self, X_train):
        self.imputer = Imputer(missing_values='NaN', strategy='mean', axis=0).fit(X_train)
        return self.imputer.transform(X_train)
    
    def setup_standard_scaler(self, X_train):
        self.scaler = StandardScaler().fit(X_train)
        return self.scaler.transform(X_train)
    
    def fit(self, X_train, y_train, sensors=None):
        """The scikit learn compatible method for training the model
        Args:
            X_train (pandas DataFrame object): A matrix where each row 
                represents a training instance and each column represents a feature.
            y_train (a numpy array or pandas Series object): an array where each element is a label that corresponds
                with each row in X_train
            sensors (list or None): if given, replaces the sensors list set when initializing the classifier object
        Returns:
            self, following the scikit learn convention for fit()
        """
        if sensors:
            self.sensors = sensors
            
        if self.impute:
            columns = X_train.columns
            X_train = pd.DataFrame(self.setup_imputer(X_train), columns=columns)
        
        if self.scale:
            columns = X_train.columns
            X_train = pd.DataFrame(self.setup_standard_scaler(X_train), columns=columns)

        # compare strings with ==, not `is` (identity checks on strings are unreliable)
        if self.ensemble_type == 'bagging':
            self.bag_fit(X_train, y_train)
            
        if self.ensemble_type == 'stacking':
            self.stack_fit(X_train, y_train)

        return self
    
    def stack_fit(self, X_train, y_train):
        """A helper method for training a stacked ensemble
        
        Args:
            X_train (pandas DataFrame object): A matrix where each row 
                represents a training instance and column represents a feature set.
            y_train (a numpy array or pandas Series object): an array where each element is a label that corresponds
                with each row in X_train
        Returns:
            None
        """
        # split training data
        X_train_base, X_train_meta, y_train_base, y_train_meta = train_test_split(X_train, y_train, test_size=0.5)

        # convert back to pandas DataFrame
        X_train_base = pd.DataFrame(X_train_base, columns=X_train.columns)
        X_train_meta = pd.DataFrame(X_train_meta, columns=X_train.columns)
        
        # train base classifiers
        self.bag_fit(X_train_base, y_train_base)
        y_pred_base = self.get_base_predictions(X_train_meta)

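        # train meta classifier, applying meta_clf_params if any were given
        if self.meta_clf_params is not None:
            self.meta_clf.set_params(**self.meta_clf_params)
        self.meta_clf.fit(y_pred_base, y_train_meta)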

    def bag_fit(self, X_train, y_train):
        """A helper method for training the bagged ensemble
        
        Args:
            X_train (a numpy array or pandas DataFrame object): A matrix where each row 
                represents a training instance and column represents a feature set.
            y_train (a numpy array or pandas Series object): an array where each element is a label that corresponds
                with each row in X_train
        Returns:
            None
        """
        self.classifiers = {}
        for sensor in self.sensors:
            if self.verbose:
                print("Training classifier with data from %s sensor" % sensor)
            X_train_sensor = self.get_features_for_sensor(X_train, sensor)
            clf_sensor = clone(self.base_clf)
            if self.base_clf_params is not None:
                clf_sensor.set_params(**self.base_clf_params)
                
            clf_sensor.fit(X_train_sensor, y_train)
            self.classifiers[sensor] = clf_sensor
    
    def get_features_for_sensor(self, X, sensor):
        """A helper method that maps the features to the sensors they were derived from.
        
        Args:
            X (pandas DataFrame): the complete training feature set with named columns
            sensor (str): the name of the sensor whose features we want.
        
        Returns:
            A pandas DataFrame with only the appropriate features relating to the sensor."""
        feature_cols = []
        
        for col in X.columns:
            sensor_names = es.sensor_key_dict[sensor]
            for sensor_name in sensor_names:
                if sensor_name in col:
                    feature_cols.append(col)
                    break
        
        return X[X.columns.intersection(feature_cols)]
    
    def predict(self, X_test, sensors=None):
        """The scikit learn compatible method for predicting the label from features.
        
        Args:
            X_test (pandas DataFrame): The features from which we will estimate a label
            sensors (list or None): if None, use the sensors list from initializing the classifier object. If list, 
                                    only use the classifiers that were trained on the features from these sensors.
        Returns:
            The predicted label (1 or 0) for each test instance."""
        if sensors is None:
            sensors = self.sensors
            
        if self.impute:
            columns = X_test.columns
            X_test = pd.DataFrame(self.imputer.transform(X_test), columns=columns)
        
        if self.scale:
            columns = X_test.columns
            X_test = pd.DataFrame(self.scaler.transform(X_test), columns=columns)
            
        # compare strings with ==, not `is` (identity checks on strings are unreliable)
        if self.ensemble_type == 'bagging':
            return self.bag_predict(X_test)
        
        if self.ensemble_type == 'stacking':
            return self.stack_predict(X_test)
            
    def bag_predict(self, X_test):
        """The prediction method for the bagging ensemble mode.
        
        Args:
            X_test (pandas DataFrame): the features
        
        Returns:
            A list of estimated labels for each test instance"""
        
        y_pred = self.get_base_predictions(X_test)
        # average the base classifiers' probabilities for each sample (i.e. across columns)
        y_mean_pred = y_pred.mean(axis=1)
        # threshold the mean probability; the boolean result is converted to 0/1 below
        y_pred = y_mean_pred > self.confidence_threshold
        return y_pred.astype(int)
    
    def stack_predict(self, X_test):
        """The prediction method for the stacking ensemble mode.
        
        Args:
            X_test (pandas DataFrame): the features
        Returns:
            A list of estimated labels for each test instance"""
        y_base_pred_probas = self.get_base_predictions(X_test)
        y_pred = self.meta_clf.predict(y_base_pred_probas)
        #y_meta_pred = self.meta_clf.predict_proba(y_base_pred_probas)[:,1]
        #y_pred = y_meta_pred > self.confidence_threshold
        return y_pred
    
    def get_base_predictions(self, X_test):
        """The helper method for making a set of ensemble predictions.
        Args:
            X_test (pandas DataFrame): the features
        Returns:
            A pandas DataFrame with columns representing each base estimator's probability estimates 
            and rows representing the test instances."""
        predictions_by_classifier = {}
        
        for sensor in self.sensors:
            X_test_sensor = self.get_features_for_sensor(X_test, sensor)
            predictions = self.classifiers[sensor].predict_proba(X_test_sensor)[:,1]
            predictions_by_classifier[sensor] = pd.Series(predictions, name=sensor)

        predictions_df = pd.concat([p for p in predictions_by_classifier.values()], axis=1)
        
        return predictions_df
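
To make the bagging mode concrete: the class averages the per-sensor probabilities for each sample and thresholds the mean. Here is a small sketch with invented numbers (the column names reuse sensor names from this post, but the probabilities are made up):

```python
import pandas as pd

# Made-up positive-class probabilities, one column per sensor classifier
base_probas = pd.DataFrame({
    "Acc":  [0.9, 0.2, 0.6],
    "Gyro": [0.8, 0.4, 0.3],
    "Aud":  [0.7, 0.1, 0.2],
})

confidence_threshold = 0.5
mean_proba = base_probas.mean(axis=1)  # one mean per sample (row)
late_fusion_pred = (mean_proba > confidence_threshold).astype(int)
```

In the third row the accelerometer classifier leans positive, but the other two pull the mean below the threshold, so the ensemble predicts 0. A stacked ensemble replaces this fixed average with whatever weighting the meta classifier learns.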

In [12]:
sensors = ["Acc", "Gyro", "Magnet", "WAcc", "Compass", "Loc", "Aud", "AP", "PS", "LF"]
clf = BoostingStackingClassifier(sensors=sensors, ensemble_type='stacking')

In [13]:
context = "SITTING"
train_ind, test_ind = get_train_test_ind(folds[0], user_ids)

print("Training model...")
X_train = features_df.iloc[train_ind]
y_train = labels.iloc[train_ind]
y_train = np.array([1 if y == context else 0 for y in y_train])

X_test = features_df.iloc[test_ind]
y_test = labels.iloc[test_ind]
y_test = np.array([1 if y == context else 0 for y in y_test])

Training model...


In [14]:
clf.fit(X_train, y_train)

Training classifier with data from Acc sensor
Training classifier with data from Gyro sensor
Training classifier with data from Magnet sensor
Training classifier with data from WAcc sensor
Training classifier with data from Compass sensor
Training classifier with data from Loc sensor
Training classifier with data from Aud sensor
Training classifier with data from AP sensor
Training classifier with data from PS sensor
Training classifier with data from LF sensor


# Use Late Fusion Averaging (Bagging)

In [15]:
test_metrics = test_model(BoostingStackingClassifier, \
                          'BICYCLING', \
                          sensors=sensors)

Fold #0
Training model...
Training classifier with data from Acc sensor
Training classifier with data from Gyro sensor
Training classifier with data from Magnet sensor
Training classifier with data from WAcc sensor
Training classifier with data from Compass sensor
Training classifier with data from Loc sensor
Training classifier with data from Aud sensor
Training classifier with data from AP sensor
Training classifier with data from PS sensor
Training classifier with data from LF sensor
Testing model...
----------
Accuracy*:         0.87
Sensitivity (TPR): 0.78
Specificity (TNR): 0.87
Balanced accuracy: 0.83
Precision**:       0.04
----------
Fold #1
Training model...
Training classifier with data from Acc sensor
Training classifier with data from Gyro sensor
Training classifier with data from Magnet sensor
Training classifier with data from WAcc sensor
Training classifier with data from Compass sensor
Training classifier with data from Loc sensor
Training classifier with data from Aud

In [16]:
get_mean_metrics(test_metrics)

----------
Accuracy*:         0.89
Sensitivity (TPR): 0.79
Specificity (TNR): 0.89
Balanced accuracy: 0.84
Precision**:       0.10
----------
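
The low precision next to a decent balanced accuracy is what the warning in `get_metrics()` is about: BICYCLING is rare, so even a modest false positive rate produces far more false positives than there are true positives. Here is a rough back-of-the-envelope sketch, assuming the overall label counts printed earlier and the mean sensitivity and specificity above. Per-fold prevalence differs, and a mean of per-fold precisions is not the same as a precision computed from mean rates, so treat the result as an approximation only.

```python
# Label counts copied from the Counter output earlier in this post
label_counts = {
    "BICYCLING": 5020, "FIX_running": 1090, "LYING_DOWN": 141461,
    "FIX_walking": 22136, "SITTING": 136728, "STAIRS": 822,
}
prevalence = label_counts["BICYCLING"] / sum(label_counts.values())

tpr, tnr = 0.79, 0.89  # mean sensitivity and specificity reported above

expected_tp = tpr * prevalence               # share of samples that are true positives
expected_fp = (1 - tnr) * (1 - prevalence)   # share of samples that are false positives
implied_precision = expected_tp / (expected_tp + expected_fp)

print("prevalence: %.3f, implied precision: %.2f" % (prevalence, implied_precision))
```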


# Use Learned Weights (Stacking)

In [17]:
test_metrics = test_model(BoostingStackingClassifier, \
                          'BICYCLING', \
                          sensors=sensors, ensemble_type="stacking")

Fold #0
Training model...
Training classifier with data from Acc sensor
Training classifier with data from Gyro sensor
Training classifier with data from Magnet sensor
Training classifier with data from WAcc sensor
Training classifier with data from Compass sensor
Training classifier with data from Loc sensor
Training classifier with data from Aud sensor
Training classifier with data from AP sensor
Training classifier with data from PS sensor
Training classifier with data from LF sensor
Testing model...
----------
Accuracy*:         0.87
Sensitivity (TPR): 0.80
Specificity (TNR): 0.87
Balanced accuracy: 0.84
Precision**:       0.04
----------
Fold #1
Training model...
Training classifier with data from Acc sensor
Training classifier with data from Gyro sensor
Training classifier with data from Magnet sensor
Training classifier with data from WAcc sensor
Training classifier with data from Compass sensor
Training classifier with data from Loc sensor
Training classifier with data from Aud

In [18]:
get_mean_metrics(test_metrics)

----------
Accuracy*:         0.89
Sensitivity (TPR): 0.77
Specificity (TNR): 0.89
Balanced accuracy: 0.83
Precision**:       0.10
----------


# Conclusion

There are many different ways of configuring ensembles that become important for a number of applications, each of which can merit a fair amount of customization of the scikit learn toolkit.

Here are some examples:

**Active Learning**: The extent of disagreement between classifiers in the ensemble can be used as a measure of *uncertainty*. In active learning, it's common to make the assumption that samples we are uncertain about will provide the best improvement when labeled by a person.
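
As a rough sketch of that idea, one simple disagreement measure is the spread of the base classifiers' probabilities for each sample. The numbers below are invented, and the standard deviation is just one of several possible measures:

```python
import numpy as np

# Rows are samples, columns are hypothetical base classifiers
probas = np.array([
    [0.95, 0.90, 0.92],   # classifiers agree: confident positive
    [0.10, 0.85, 0.40],   # classifiers disagree
    [0.05, 0.10, 0.06],   # classifiers agree: confident negative
])

disagreement = probas.std(axis=1)        # spread across classifiers, per sample
query_order = np.argsort(-disagreement)  # most contested sample first
```

The sample at the front of `query_order` would be the first one to send to a human labeler.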

**Semi-supervised Learning**: In semi-supervised learning the goal is to take what is learned from supervised learning on a small pool of labeled samples and extend that to a large pool of unlabeled samples. The most confident predictions on the unlabeled samples are then added to the pool of labeled samples for another iteration of supervised learning. The challenge here is that when the structure that is learned is not perfect, we can inadvertently add incorrectly labeled samples to our labeled pool. However, with an ensemble of classifiers that each have a different perspective, we can choose only the samples that received the same prediction from each of our classifiers. Effectively, the ensemble becomes a higher quality vetting process for good labeled samples.
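
A minimal sketch of that vetting step, assuming made-up hard predictions from three hypothetical classifiers:

```python
import numpy as np

# Rows are unlabeled samples, columns are the classifiers' predicted labels
votes = np.array([
    [1, 1, 1],
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
    [1, 1, 1],
])

# A sample is kept only if every classifier matches the first classifier's vote
unanimous = (votes == votes[:, [0]]).all(axis=1)
pseudo_labeled_idx = np.flatnonzero(unanimous)  # positions of the vetted samples
pseudo_labels = votes[unanimous, 0]             # the label they all agreed on
```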

Each of these approaches may merit specialized functionality for reasoning about the varying responses from the classifiers. Using what I've written above as a general template, you should be able to quickly and easily make sure that your new estimator passes the *duck test* for scikit learn compatibility.

### Note : 
Scikit Learn is still working on being more pandas compatible. At the moment, you can feed a pandas Series or DataFrame to most scikit learn objects and functions without any problem, but you will generally receive the numpy equivalent in return. In parts of this project, I create a variable to hold the parts of the DataFrame that are important to me (column names) and then transform the numpy array back afterward. There are other ways of doing this using either [sklearn-pandas](https://github.com/scikit-learn-contrib/sklearn-pandas) or method decorators to wrap functions where this will happen. Either way, just be aware of how well (or not well) pandas and sklearn play together as you customize and extend sklearn objects.
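
The round trip looks roughly like this sketch, which uses a tiny made-up DataFrame (the column names and values are invented):

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

frame = pd.DataFrame(
    {"acc_mean": [1.0, 2.0, 3.0], "gyro_mean": [10.0, 20.0, 60.0]},
    index=["s1", "s2", "s3"],
)

# With default settings the transformer hands back a plain numpy array ...
scaled_array = StandardScaler().fit_transform(frame)

# ... so the column names and index have to be put back by hand
scaled_frame = pd.DataFrame(scaled_array, columns=frame.columns, index=frame.index)
```

Passing `index=` as well as `columns=` keeps the rows aligned with any labels that still carry the original index.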

Cheers!

**Cambo**