refactor code to reduce redundancy

yzhao062 · Jul 29, 2019 · 9bb6fc5
1 parent ec7be23
Showing 6 changed files with 22 additions and 23 deletions.
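
Note on the API change: the hunks in this commit rename the first `Stacking` parameter from `base_clfs` to `base_estimators` in the README and docs, and switch `examples/stacking_example.py` to passing the list positionally. The sketch below illustrates why callers using the old keyword would need updating while positional callers would not. It uses `StackingStandIn`, a hypothetical stand-in class written for this note, not combo's actual `Stacking` implementation; the string placeholders stand in for real estimator objects.

```python
class StackingStandIn:
    """Hypothetical stand-in for illustration; not combo's Stacking class."""

    def __init__(self, base_estimators, n_folds=2):
        # Only stores its arguments; no fitting logic is modeled here.
        self.base_estimators = base_estimators
        self.n_folds = n_folds


classifiers = ["clf_a", "clf_b"]  # placeholders for real estimator objects

# Positional and new-keyword calls both bind to the renamed parameter.
by_position = StackingStandIn(classifiers, n_folds=4)
by_keyword = StackingStandIn(base_estimators=classifiers, n_folds=4)

# The old keyword no longer matches any parameter, so Python raises TypeError.
try:
    StackingStandIn(base_clfs=classifiers, n_folds=4)
    old_keyword_error = None
except TypeError as exc:
    old_keyword_error = str(exc)

print(old_keyword_error)
```

Passing the estimator list positionally, as the updated example script does, keeps a call site independent of the parameter's name, at the cost of depending on its position.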
diff --git a/CHANGES.txt b/CHANGES.txt
@@ -9,4 +9,5 @@ v<0.0.4>, <07/17/2019> -- Update documentation.
 v<0.0.4>, <07/21/2019> -- Add code maintainability.
 v<0.0.5>, <07/27/2019> -- Add median combination and score_to_proba function.
 v<0.0.5>, <07/28/2019> -- Add Stacking (meta ensembling).
-v<0.0.6>, <07/29/2019> -- Enable Appveyor integration.
+v<0.0.6>, <07/29/2019> -- Enable Appveyor integration.
+v<0.0.6>, <07/29/2019> -- Update requirements file.
diff --git a/README.rst b/README.rst
@@ -68,16 +68,16 @@ combo: A Python Toolbox for Machine Learning Model Combination
 -----
 
 
-**combo** is a Python toolbox for combining or aggregating ML models and
-scores for various tasks, including **classification**, **clustering**,
-**anomaly detection**, and **raw score**. It has been widely used in data
-science competitions and real-world tasks, such as Kaggle.
+**combo** is a comprehensive Python toolbox for combining machine
+learning (ML) models and scores for various tasks, including **classification**,
+**clustering**, **anomaly detection**, and **raw score**.
 
-Model and score combination can be regarded as a subtask of
+Model combination has been widely used in data science competitions and
+real-world tasks, such as Kaggle. It can be considered as a subtask of
 `ensemble learning <https://en.wikipedia.org/wiki/Ensemble_learning>`_,
 but is often beyond the scope of ensemble learning. For instance,
 averaging the results of multiple runs of a ML model is deemed as
-a reliable way of eliminating the randomness for better stability. See
+a reliable way of eliminating the randomness. See
 figure below for some popular combination approaches.
 
 .. image:: https://raw.githubusercontent.com/yzhao062/combo/master/docs/figs/framework_demo.png
@@ -88,9 +88,8 @@ figure below for some popular combination approaches.
 combo is featured for:
 
 * **Unified APIs, detailed documentation, and interactive examples** across various algorithms.
-* **Advanced models**, including dynamic classifier/ensemble selection and LSCP.
-* **Broad applications** for classification, clustering, anomaly detection, and raw score.
-* **Comprehensive coverage** for supervised, unsupervised, and semi-supervised scenarios.
+* **Advanced models**, such as dynamic classifier/ensemble selection.
+* **Comprehensive coverage** for classification, clustering, anomaly detection, and raw score.
 * **Optimized performance with JIT and parallelization** when possible, using `numba <https://github.com/numba/numba>`_ and `joblib <https://github.com/joblib/joblib>`_.
 
 
@@ -106,7 +105,7 @@ combo is featured for:
                       KNeighborsClassifier(), RandomForestClassifier(),
                       GradientBoostingClassifier()]
 
-       clf = Stacking(base_clfs=classifiers) # initialize a Stacking model
+       clf = Stacking(base_estimators=classifiers) # initialize a Stacking model
        clf.fit(X_train)
 
        # predict on unseen data
@@ -340,7 +339,7 @@ demonstrates the basic API of stacking (meta ensembling).
 
        from combo.models.stacking import Stacking
 
-       clf = Stacking(base_clfs=classifiers, n_folds=4, shuffle_data=False,
+       clf = Stacking(base_estimators=classifiers, n_folds=4, shuffle_data=False,
                    keep_original=True, use_proba=False, random_state=random_state)
 
        clf.fit(X_train, y_train)

diff --git a/docs/example.rst b/docs/example.rst
@@ -128,7 +128,7 @@ demonstrates the basic API of stacking (meta ensembling).
 
        from combo.models.stacking import Stacking
 
-       clf = Stacking(base_clfs=classifiers, n_folds=4, shuffle_data=False,
+       clf = Stacking(base_estimators=classifiers, n_folds=4, shuffle_data=False,
                    keep_original=True, use_proba=False, random_state=random_state)
 
        clf.fit(X_train, y_train)

diff --git a/docs/index.rst b/docs/index.rst
@@ -73,16 +73,16 @@ Welcome to combo's documentation!
 -----
 
 
-**combo** is a Python toolbox for combining or aggregating ML models and
-scores for various tasks, including **classification**, **clustering**,
-**anomaly detection**, and **raw score**. It has been widely used in data
-science competitions and real-world tasks, such as Kaggle.
+**combo** is a comprehensive Python toolbox for combining machine
+learning (ML) models and scores for various tasks, including **classification**,
+**clustering**, **anomaly detection**, and **raw score**.
 
-Model and score combination can be regarded as a subtask of
+Model combination has been widely used in data science competitions and
+real-world tasks, such as Kaggle. It can be considered as a subtask of
 `ensemble learning <https://en.wikipedia.org/wiki/Ensemble_learning>`_,
 but is often beyond the scope of ensemble learning. For instance,
 averaging the results of multiple runs of a ML model is deemed as
-a reliable way of eliminating the randomness for better stability. See
+a reliable way of eliminating the randomness. See
 figure below for some popular combination approaches.
 
 .. image:: https://raw.githubusercontent.com/yzhao062/combo/master/docs/figs/framework_demo.png
@@ -93,9 +93,8 @@ figure below for some popular combination approaches.
 combo is featured for:
 
 * **Unified APIs, detailed documentation, and interactive examples** across various algorithms.
-* **Advanced models**, including dynamic classifier/ensemble selection and LSCP.
-* **Broad applications** for classification, clustering, anomaly detection, and raw score.
-* **Comprehensive coverage** for supervised, unsupervised, and semi-supervised scenarios.
+* **Advanced models**, such as dynamic classifier/ensemble selection.
+* **Comprehensive coverage** for classification, clustering, anomaly detection, and raw score.
 * **Optimized performance with JIT and parallelization** when possible, using `numba <https://github.com/numba/numba>`_ and `joblib <https://github.com/joblib/joblib>`_.
 
 

diff --git a/docs/rebuilt.sh b/docs/rebuilt.sh
diff --git a/examples/stacking_example.py b/examples/stacking_example.py
@@ -53,7 +53,7 @@
 
     print()
     # build a Stacking model and evaluate
-    clf = Stacking(base_clfs=classifiers, n_folds=4, shuffle_data=False,
+    clf = Stacking(classifiers, n_folds=4, shuffle_data=False,
                    keep_original=True, use_proba=False,
                    random_state=random_state)