All implemented models ship with examples; see "combo examples" for more information.
"examples/classifier_stacking_example.py" demonstrates the basic API of stacking (meta ensembling). "examples/classifier_dcs_la_example.py" demonstrates the basic API of Dynamic Classifier Selection by Local Accuracy (DCS-LA). "examples/classifier_des_la_example.py" demonstrates the basic API of Dynamic Ensemble Selection by Local Accuracy (DES-LA).
Note that the basic API is consistent across all these models.
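The classifier walkthroughs below assume a labeled dataset already split into X_train, X_test, y_train, and y_test. A minimal setup sketch is given here; the breast cancer dataset and the 60/40 split are illustrative assumptions, not necessarily what the shipped examples use:

```python
# hypothetical setup for the classifier examples below; any binary
# classification dataset split into train/test sets works the same way
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

random_state = 42
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=random_state)
```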
Initialize a group of classifiers as base estimators
```python
# initialize a group of classifiers as base estimators
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

classifiers = [DecisionTreeClassifier(random_state=random_state),
               LogisticRegression(random_state=random_state),
               KNeighborsClassifier(),
               RandomForestClassifier(random_state=random_state),
               GradientBoostingClassifier(random_state=random_state)]
```
Initialize, fit, predict, and evaluate with Stacking
```python
from combo.models.classifier_stacking import Stacking
from combo.utils.data import evaluate_print  # prints accuracy, ROC, and F1

clf = Stacking(base_estimators=classifiers, n_folds=4, shuffle_data=False,
               keep_original=True, use_proba=False, random_state=random_state)
clf.fit(X_train, y_train)
y_test_predict = clf.predict(X_test)
evaluate_print('Stacking | ', y_test, y_test_predict)
```
See a sample output of classifier_stacking_example.py
```
Decision Tree        | Accuracy:0.9386, ROC:0.9383, F1:0.9521
Logistic Regression  | Accuracy:0.9649, ROC:0.9615, F1:0.973
K Neighbors          | Accuracy:0.9561, ROC:0.9519, F1:0.9662
Gradient Boosting    | Accuracy:0.9605, ROC:0.9524, F1:0.9699
Random Forest        | Accuracy:0.9605, ROC:0.961, F1:0.9693
Stacking             | Accuracy:0.9868, ROC:0.9841, F1:0.9899
```
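Because the API is consistent, the dynamic selection models from the other two examples follow the same fit/predict pattern. A short sketch, assuming the DCS_LA and DES_LA classes live in combo.models.classifier_dcs and combo.models.classifier_des as listed in the combo documentation:

```python
# same workflow with dynamic selection; the module paths are assumptions
# based on the combo documentation, not verbatim from the shipped examples
from combo.models.classifier_dcs import DCS_LA
from combo.models.classifier_des import DES_LA

for name, model in [('DCS_LA   | ', DCS_LA(base_estimators=classifiers)),
                    ('DES_LA   | ', DES_LA(base_estimators=classifiers))]:
    model.fit(X_train, y_train)
    evaluate_print(name, y_test, model.predict(X_test))
```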
"examples/classifier_comb_example.py" demonstrates the basic API of predicting with multiple classifiers. It is noted that the API across all other algorithms are consistent/similar.
Initialize a group of classifiers as base estimators
```python
# initialize a group of classifiers (imports as shown above)
classifiers = [DecisionTreeClassifier(random_state=random_state),
               LogisticRegression(random_state=random_state),
               KNeighborsClassifier(),
               RandomForestClassifier(random_state=random_state),
               GradientBoostingClassifier(random_state=random_state)]
```
Initialize, fit, predict, and evaluate with a simple aggregator (average)
```python
from combo.models.classifier_comb import SimpleClassifierAggregator

# combine by averaging the base classifiers' predictions
clf = SimpleClassifierAggregator(classifiers, method='average')
clf.fit(X_train, y_train)
y_test_predicted = clf.predict(X_test)
evaluate_print('Combination by avg |', y_test, y_test_predicted)
```
See a sample output of classifier_comb_example.py
```
Decision Tree        | Accuracy:0.9386, ROC:0.9383, F1:0.9521
Logistic Regression  | Accuracy:0.9649, ROC:0.9615, F1:0.973
K Neighbors          | Accuracy:0.9561, ROC:0.9519, F1:0.9662
Gradient Boosting    | Accuracy:0.9605, ROC:0.9524, F1:0.9699
Random Forest        | Accuracy:0.9605, ROC:0.961, F1:0.9693
Combination by avg   | Accuracy:0.9693, ROC:0.9677, F1:0.9763
Combination by w_avg | Accuracy:0.9781, ROC:0.9716, F1:0.9833
Combination by max   | Accuracy:0.9518, ROC:0.9312, F1:0.9642
Combination by w_vote| Accuracy:0.9649, ROC:0.9644, F1:0.9728
Combination by median| Accuracy:0.9693, ROC:0.9677, F1:0.9763
```
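The remaining rows of the sample output come from swapping the combination method and, for the weighted variants, supplying per-classifier weights. A sketch under the assumption that 'maximization', 'median', and 'majority_vote' are accepted method values and that weights is the parameter name, per the combo documentation; the weight values themselves are made up:

```python
import numpy as np

# hypothetical per-classifier weights for the weighted variants
clf_weights = np.array([0.1, 0.4, 0.1, 0.2, 0.2])

for name, clf in [
        ('Combination by max   |',
         SimpleClassifierAggregator(classifiers, method='maximization')),
        ('Combination by median|',
         SimpleClassifierAggregator(classifiers, method='median')),
        ('Combination by w_vote|',
         SimpleClassifierAggregator(classifiers, method='majority_vote',
                                    weights=clf_weights))]:
    clf.fit(X_train, y_train)
    evaluate_print(name, y_test, clf.predict(X_test))
```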
"examples/cluster_comb_example.py" demonstrates the basic API of combining multiple base clustering estimators. "examples/cluster_eac_example.py" demonstrates the basic API of Combining multiple clusterings using evidence accumulation (EAC).
Initialize a group of clustering methods as base estimators
```python
# initialize a set of clustering estimators
from sklearn.cluster import KMeans, MiniBatchKMeans, AgglomerativeClustering

estimators = [KMeans(n_clusters=n_clusters),
              MiniBatchKMeans(n_clusters=n_clusters),
              AgglomerativeClustering(n_clusters=n_clusters)]
```
Initialize a Clusterer Ensemble class and fit the model
```python
from combo.models.cluster_comb import ClustererEnsemble

# combine by Clusterer Ensemble
clf = ClustererEnsemble(estimators, n_clusters=n_clusters)
clf.fit(X)
```
Get the aligned results
```python
# generate the labels on X
aligned_labels = clf.aligned_labels_  # base labels aligned to a common reference
predicted_labels = clf.labels_        # the ensemble's consensus labels
```
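The EAC example follows the same pattern: a co-association matrix built from the base clusterings is fed to hierarchical clustering. A sketch, assuming the class is EAC in combo.models.cluster_eac:

```python
# sketch of the EAC counterpart (cluster_eac_example.py); the module path
# and class name are assumptions based on the combo documentation
from combo.models.cluster_eac import EAC

clf = EAC(estimators, n_clusters=n_clusters)
clf.fit(X)
eac_labels = clf.labels_  # consensus labels from evidence accumulation
```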
"examples/detector_comb_example.py" demonstrates the basic API of combining multiple base outlier detectors.
Initialize a group of outlier detection methods as base estimators
```python
# initialize a set of outlier detectors from PyOD
from pyod.models.knn import KNN
from pyod.models.lof import LOF
from pyod.models.ocsvm import OCSVM

detectors = [KNN(), LOF(), OCSVM()]
```
Initialize a simple averaging aggregator, fit the model, and make predictions.
```python
from combo.models.detector_comb import SimpleDetectorAggregator

clf = SimpleDetectorAggregator(base_estimators=detectors)
clf_name = 'Aggregation by Averaging'
clf.fit(X_train)

y_train_pred = clf.labels_  # binary labels (0: inliers, 1: outliers)
y_train_scores = clf.decision_scores_  # raw outlier scores

# get the prediction on the test data
y_test_pred = clf.predict(X_test)  # outlier labels (0 or 1)
y_test_scores = clf.decision_function(X_test)  # outlier scores
```
Evaluate the prediction using ROC and Precision @ Rank n.
```python
# evaluate and print the results
print("\nOn Training Data:")
evaluate_print(clf_name, y_train, y_train_scores)
print("\nOn Test Data:")
evaluate_print(clf_name, y_test, y_test_scores)
```
See sample outputs on both training and test data.
```
On Training Data:
Aggregation by Averaging ROC:0.9994, precision @ rank n:0.95

On Test Data:
Aggregation by Averaging ROC:1.0, precision @ rank n:1.0
```
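Averaging is only one way to aggregate the detector scores. A sketch of swapping the aggregation rule, under the assumption that 'maximization' and 'median' are accepted method values for SimpleDetectorAggregator, per the combo documentation:

```python
# swap the aggregation rule; the accepted method values here are
# assumptions based on the combo documentation
for method in ['maximization', 'median']:
    clf = SimpleDetectorAggregator(base_estimators=detectors, method=method)
    clf.fit(X_train)
    evaluate_print('Aggregation by ' + method, y_test,
                   clf.decision_function(X_test))
```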