Commit

update documentation
yzhao062 committed Jul 28, 2019
1 parent cafc970 commit b30b497
Showing 3 changed files with 46 additions and 80 deletions.
36 changes: 26 additions & 10 deletions README.rst
@@ -89,6 +89,26 @@ combo is featured for:
* **Optimized performance with JIT and parallelization** when possible, using `numba <https://github.com/numba/numba>`_ and `joblib <https://github.com/joblib/joblib>`_.


**API Demo**\ :


.. code-block:: python

   from combo.models.stacking import Stacking

   # base classifiers
   classifiers = [DecisionTreeClassifier(), LogisticRegression(),
                  KNeighborsClassifier(), RandomForestClassifier(),
                  GradientBoostingClassifier()]

   clf = Stacking(base_clfs=classifiers)  # initialize a Stacking model
   clf.fit(X_train, y_train)  # stacking is supervised, so fit needs labels

   # predict on unseen data
   y_test_labels = clf.predict(X_test)  # label prediction
   y_test_proba = clf.predict_proba(X_test)  # probability prediction

**Table of Contents**\ :


@@ -211,11 +231,9 @@ demonstrates the basic API of predicting with multiple classifiers. **It is note
# initialize a group of classifiers
-   classifiers = [DecisionTreeClassifier(random_state=random_state),
-                  LogisticRegression(random_state=random_state),
-                  KNeighborsClassifier(),
-                  RandomForestClassifier(random_state=random_state),
-                  GradientBoostingClassifier(random_state=random_state)]
+   classifiers = [DecisionTreeClassifier(), LogisticRegression(),
+                  KNeighborsClassifier(), RandomForestClassifier(),
+                  GradientBoostingClassifier()]
#. Initialize, fit, predict, and evaluate with a simple aggregator (average)
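
For readers of this diff who have not used combo: the "simple aggregator (average)" named in the context line above can be thought of as averaging the class probabilities produced by the base classifiers and taking the most probable class. The following is a minimal, library-free sketch of that idea under stated assumptions, not combo's implementation. The function name `average_probabilities` and the hard-coded probability rows are hypothetical and exist only for illustration.

```python
from typing import List, Sequence, Tuple


def average_probabilities(
    per_model_proba: Sequence[Sequence[Sequence[float]]],
) -> Tuple[List[List[float]], List[int]]:
    """Average class probabilities across models and pick the arg-max label.

    per_model_proba[m][i][c] is model m's probability that sample i
    belongs to class c. (Hypothetical helper, not part of combo.)
    """
    n_models = len(per_model_proba)
    n_samples = len(per_model_proba[0])
    n_classes = len(per_model_proba[0][0])

    averaged: List[List[float]] = []
    labels: List[int] = []
    for i in range(n_samples):
        # Mean probability of each class over all models for sample i.
        row = [
            sum(per_model_proba[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        averaged.append(row)
        # The predicted label is the class with the highest mean probability.
        labels.append(max(range(n_classes), key=row.__getitem__))
    return averaged, labels


# Made-up outputs of three base classifiers on two samples and two classes.
model_outputs = [
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.7, 0.3], [0.2, 0.8]],
    [[0.8, 0.2], [0.6, 0.4]],
]
avg_proba, avg_labels = average_probabilities(model_outputs)
```

Averaging probabilities rather than hard labels lets a confident classifier outweigh two uncertain ones, which is the usual reason to prefer it over a plain majority vote.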
@@ -308,11 +326,9 @@ demonstrates the basic API of stacking (meta ensembling).
# initialize a group of classifiers
-   classifiers = [DecisionTreeClassifier(random_state=random_state),
-                  LogisticRegression(random_state=random_state),
-                  KNeighborsClassifier(),
-                  RandomForestClassifier(random_state=random_state),
-                  GradientBoostingClassifier(random_state=random_state)]
+   classifiers = [DecisionTreeClassifier(), LogisticRegression(),
+                  KNeighborsClassifier(), RandomForestClassifier(),
+                  GradientBoostingClassifier()]
#. Initialize, fit, predict, and evaluate with Stacking
20 changes: 20 additions & 0 deletions docs/index.rst
@@ -94,6 +94,26 @@ combo is featured for:
* **Optimized performance with JIT and parallelization** when possible, using `numba <https://github.com/numba/numba>`_ and `joblib <https://github.com/joblib/joblib>`_.


**API Demo**\ :


.. code-block:: python

   from combo.models.stacking import Stacking

   # base classifiers
   classifiers = [DecisionTreeClassifier(), LogisticRegression(),
                  KNeighborsClassifier(), RandomForestClassifier(),
                  GradientBoostingClassifier()]

   clf = Stacking(base_clfs=classifiers)  # initialize a Stacking model
   clf.fit(X_train, y_train)  # stacking is supervised, so fit needs labels

   # predict on unseen data
   y_test_labels = clf.predict(X_test)  # label prediction
   y_test_proba = clf.predict_proba(X_test)  # probability prediction

----


70 changes: 0 additions & 70 deletions examples/temp_do_not_use.py
@@ -1,70 +0,0 @@
# -*- coding: utf-8 -*-
"""Example of combining multiple base clusterers with combo.

Two usages are demonstrated:
1. ClustererEnsemble: fit the ensemble class and read its labels
2. clusterer_ensemble_scores: combine precomputed labels directly
"""
# Author: Yue Zhao <zhaoy@cmu.edu>
# License: BSD 2 clause


import os
import sys

# temporary solution for relative imports in case combo is not installed
# if combo is installed, no need to use the following line
sys.path.append(
os.path.abspath(os.path.join(os.path.dirname("__file__"), '..')))

import numpy as np

from sklearn.cluster import KMeans
from sklearn.cluster import MiniBatchKMeans
from sklearn.cluster import AgglomerativeClustering

from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler

from combo.models.cluster_comb import clusterer_ensemble_scores
from combo.models.cluster_comb import ClustererEnsemble
from combo.utils.utility import generate_bagging_indices

import warnings

warnings.filterwarnings("ignore")

if __name__ == "__main__":
    # Define data file and read X and y
    random_state = 42
    X, y = load_breast_cancer(return_X_y=True)

    n_clusters = 5
    n_estimators = 3

    # Initialize a set of estimators
    estimators = [KMeans(n_clusters=n_clusters),
                  MiniBatchKMeans(n_clusters=n_clusters),
                  AgglomerativeClustering(n_clusters=n_clusters)]

    clf = ClustererEnsemble(estimators, n_clusters=n_clusters)
    clf.fit(X)
    predicted_labels = clf.labels_
    aligned_labels = clf.aligned_labels_

    # Clusterer ensemble without initializing a new class
    original_labels = np.zeros([X.shape[0], n_estimators])

    for i, estimator in enumerate(estimators):
        estimator.fit(X)
        original_labels[:, i] = estimator.labels_

    # Call the function directly without instantiating the class
    labels_by_vote1 = clusterer_ensemble_scores(original_labels, n_estimators,
                                                n_clusters)
    labels_by_vote2, aligned_labels = clusterer_ensemble_scores(
        original_labels, n_estimators, n_clusters, return_results=True)

    labels_by_vote3 = clusterer_ensemble_scores(original_labels, n_estimators,
                                                n_clusters, reference_idx=1)
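
The deleted script above combines several clusterings by first aligning their label ids and then voting. As a rough, library-free illustration of that idea, the sketch below aligns each clustering to a reference clustering with a greedy overlap match and then takes a per-sample majority vote. The greedy matching is an assumption made here for brevity and may differ from what combo does internally. `align_to_reference` and `majority_vote` are hypothetical names, and `reference_idx` only mirrors the parameter name seen in the script.

```python
from collections import Counter
from typing import Dict, List, Sequence


def align_to_reference(labels: Sequence[int], reference: Sequence[int],
                       n_clusters: int) -> List[int]:
    """Relabel `labels` so its cluster ids match `reference` (greedy overlap)."""
    # overlap[(a, b)] counts samples labelled a here and b in the reference.
    overlap = Counter(zip(labels, reference))
    mapping: Dict[int, int] = {}
    used = set()
    # Take pairs from largest overlap to smallest, one-to-one.
    for (a, b), _ in overlap.most_common():
        if a not in mapping and b not in used:
            mapping[a] = b
            used.add(b)
    # Any cluster id left unmatched receives one of the unused reference ids.
    free = [b for b in range(n_clusters) if b not in used]
    for a in range(n_clusters):
        if a not in mapping:
            mapping[a] = free.pop()
    return [mapping[a] for a in labels]


def majority_vote(all_labels: Sequence[Sequence[int]], n_clusters: int,
                  reference_idx: int = 0) -> List[int]:
    """Align every clustering to the reference one, then vote per sample."""
    reference = all_labels[reference_idx]
    aligned = [align_to_reference(run, reference, n_clusters)
               for run in all_labels]
    return [Counter(run[i] for run in aligned).most_common(1)[0][0]
            for i in range(len(reference))]


# Three made-up clusterings of six samples; the second uses swapped ids.
runs = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
voted = majority_vote(runs, n_clusters=2)
```

The alignment step matters because cluster ids are arbitrary: without it, the second clustering would vote against the first even though both describe the same partition.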
