Merge pull request #95 from rodrigo-arenas/0.9.0

Adaptive Learning
rodrigo-arenas · Jun 4, 2022 · d7cb37a
2 parents 667e396 + 7a817e1
commit d7cb37a
Showing 27 changed files with 1,012 additions and 51 deletions.
diff --git a/README.rst b/README.rst
@@ -43,6 +43,7 @@ Main Features:
 * **Algorithms**: Set of different evolutionary algorithms to use as an optimization procedure.
 * **Callbacks**: Custom evaluation strategies to generate early stopping rules,
   logging (into TensorBoard, .pkl files, etc) or your custom logic.
+* **Schedulers**: Adaptive methods to control learning parameters.
 * **Plots**: Generate pre-defined plots to understand the optimization process.
 * **MLflow**: Built-in integration with MLflow to log all the hyperparameters, cv-scores and the fitted models.
 

diff --git a/dev-requirements.txt b/dev-requirements.txt
@@ -16,3 +16,4 @@ numpydoc
 nbsphinx
 tensorflow>=2.0.0
 tqdm>=4.61.1
+tk
diff --git a/docs/api/schedules.rst b/docs/api/schedules.rst
@@ -0,0 +1,26 @@
+Schedules
+---------
+
+.. currentmodule:: sklearn_genetic.schedules
+
+.. autosummary::
+   base.BaseAdapter
+   ExponentialAdapter
+   InverseAdapter
+   PotentialAdapter
+
+.. autoclass:: sklearn_genetic.schedules.base.BaseAdapter
+   :members:
+   :undoc-members: False
+
+.. autoclass:: ExponentialAdapter
+   :members:
+   :undoc-members: False
+
+.. autoclass:: InverseAdapter
+   :members:
+   :undoc-members: False
+
+.. autoclass:: PotentialAdapter
+   :members:
+   :undoc-members: False
diff --git a/docs/images/schedules_comparison_0.png b/docs/images/schedules_comparison_0.png
diff --git a/docs/images/schedules_comparison_1.png b/docs/images/schedules_comparison_1.png
diff --git a/docs/images/schedules_exponential_0.png b/docs/images/schedules_exponential_0.png
diff --git a/docs/images/schedules_exponential_1.png b/docs/images/schedules_exponential_1.png
diff --git a/docs/images/schedules_inverse_0.png b/docs/images/schedules_inverse_0.png
diff --git a/docs/images/schedules_inverse_1.png b/docs/images/schedules_inverse_1.png
diff --git a/docs/images/schedules_potential_0.png b/docs/images/schedules_potential_0.png
diff --git a/docs/images/schedules_potential_1.png b/docs/images/schedules_potential_1.png
diff --git a/docs/index.rst b/docs/index.rst
@@ -66,6 +66,7 @@ as it is usually advised to look further which distribution works better for you
    tutorials/basic_usage
    tutorials/callbacks
    tutorials/custom_callback
+   tutorials/adapters
    tutorials/understand_cv
    tutorials/mlflow
 
@@ -94,6 +95,7 @@ as it is usually advised to look further which distribution works better for you
    api/gasearchcv
    api/featureselectioncv
    api/callbacks
+   api/schedules
    api/plots
    api/mlflow
    api/space

diff --git a/docs/notebooks/Adaptive_Learning.ipynb b/docs/notebooks/Adaptive_Learning.ipynb
@@ -0,0 +1,188 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "source": [
+    "# Digits Adaptive Learning\n",
+    "\n",
+    "In this example, we create a decay strategy for the mutation probability\n",
+    "and an ascending strategy for the crossover probability;\n",
+    "let's call them $p_{mt}(t; \\alpha)$ and $p_{cr}(t; \\alpha)$, respectively.\n",
+    "This lets the optimizer explore more diverse solutions in the first iterations.\n",
+    "In this scenario, we must choose $\\alpha$ (the adaptive rate), $p_0$ (the initial value) and $p_f$ (the end value) carefully,\n",
+    "because the evolutionary implementation requires that:\n",
+    "\n",
+    "\n",
+    "\n",
+    "$p_{mt}(t; \\alpha) + p_{cr}(t; \\alpha) \\leq 1 \\quad \\forall t$\n",
+    "\n",
+    "The same idea can be used for hyperparameter tuning or feature selection.\n"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "### Import the data and split it into train and test sets"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "from sklearn_genetic import GASearchCV\n",
+    "from sklearn_genetic.schedules import ExponentialAdapter\n",
+    "from sklearn_genetic.space import Continuous, Categorical, Integer\n",
+    "from sklearn.ensemble import RandomForestClassifier\n",
+    "from sklearn.model_selection import train_test_split, StratifiedKFold\n",
+    "from sklearn.datasets import load_digits\n",
+    "from sklearn.metrics import accuracy_score\n",
+    "\n",
+    "data = load_digits()\n",
+    "n_samples = len(data.images)\n",
+    "X = data.images.reshape((n_samples, -1))\n",
+    "y = data['target']\n",
+    "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "### Create the adaptive strategy"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "outputs": [],
+   "source": [
+    "mutation_adapter = ExponentialAdapter(initial_value=0.8, end_value=0.2, adaptive_rate=0.1)\n",
+    "crossover_adapter = ExponentialAdapter(initial_value=0.2, end_value=0.8, adaptive_rate=0.1)"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "### Define the classifier to tune"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "outputs": [],
+   "source": [
+    "clf = RandomForestClassifier()\n",
+    "param_grid = {'min_weight_fraction_leaf': Continuous(0.01, 0.5, distribution='log-uniform'),\n",
+    "              'bootstrap': Categorical([True, False]),\n",
+    "              'max_depth': Integer(2, 30),\n",
+    "              'max_leaf_nodes': Integer(2, 35),\n",
+    "              'n_estimators': Integer(100, 300)}\n",
+    "\n",
+    "cv = StratifiedKFold(n_splits=3, shuffle=True)\n",
+    "\n",
+    "evolved_estimator = GASearchCV(estimator=clf,\n",
+    "                               cv=cv,\n",
+    "                               scoring='accuracy',\n",
+    "                               population_size=20,\n",
+    "                               generations=25,\n",
+    "                               mutation_probability=mutation_adapter,\n",
+    "                               crossover_probability=crossover_adapter,\n",
+    "                               param_grid=param_grid,\n",
+    "                               n_jobs=-1)"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "### Fit the model and see some results"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "outputs": [],
+   "source": [
+    "# Train and optimize the estimator\n",
+    "evolved_estimator.fit(X_train, y_train)\n",
+    "# Best parameters found\n",
+    "print(evolved_estimator.best_params_)\n",
+    "# Use the model fitted with the best parameters\n",
+    "y_predict_ga = evolved_estimator.predict(X_test)\n",
+    "print(accuracy_score(y_test, y_predict_ga))\n",
+    "\n",
+    "# Saved metadata for further analysis\n",
+    "print(\"Stats achieved in each generation: \", evolved_estimator.history)\n",
+    "print(\"Best k solutions: \", evolved_estimator.hof)"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "outputs": [],
+   "source": [],
+   "metadata": {
+    "collapsed": false,
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   }
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 2
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython2",
+   "version": "2.7.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
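Review note on the notebook above: the constraint $p_{mt}(t; \alpha) + p_{cr}(t; \alpha) \leq 1$ can be checked numerically before running the search. The sketch below is not the library's code. It assumes the exponential schedule has the form `(initial_value - end_value) * exp(-adaptive_rate * t) + end_value`, and `exponential_schedule` is a hypothetical helper written only for this illustration. The parameter values are the ones used for `mutation_adapter` and `crossover_adapter` in the notebook.

```python
import math


def exponential_schedule(initial_value, end_value, adaptive_rate, t):
    """Hypothetical stand-in for an exponential adapter (assumed formula):
    moves from initial_value toward end_value as the generation index t grows."""
    return (initial_value - end_value) * math.exp(-adaptive_rate * t) + end_value


generations = 25

# Same settings as the notebook's mutation and crossover adapters.
mutation = [exponential_schedule(0.8, 0.2, 0.1, t) for t in range(generations)]
crossover = [exponential_schedule(0.2, 0.8, 0.1, t) for t in range(generations)]

# Under the assumed formula the two curves mirror each other,
# so their sum stays at 1 up to floating-point rounding.
max_sum = max(m + c for m, c in zip(mutation, crossover))
constraint_holds = max_sum <= 1 + 1e-9

print(f"p_mt: {mutation[0]:.2f} -> {mutation[-1]:.2f}")
print(f"p_cr: {crossover[0]:.2f} -> {crossover[-1]:.2f}")
print("constraint holds:", constraint_holds)
```

If the two adapters used different rates, or initial and end values that do not sum to 1, the same loop would show at which generation the sum exceeds 1.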
diff --git a/docs/release_notes.rst b/docs/release_notes.rst
@@ -3,6 +3,26 @@ Release Notes
 
 Some notes on new features in various releases
 
+What's new in 0.9.0dev0
+-----------------------
+
+^^^^^^^^^
+Features:
+^^^^^^^^^
+
+* Introduced adaptive schedulers to enable adaptive mutation and crossover probabilities;
+  the currently supported schedulers are:
+
+  - :class:`~sklearn_genetic.schedules.ExponentialAdapter`
+  - :class:`~sklearn_genetic.schedules.InverseAdapter`
+  - :class:`~sklearn_genetic.schedules.PotentialAdapter`
+
+* Changed the default values of `mutation_probability` and `crossover_probability`
+  to 0.8 and 0.2, respectively.
+
+* The `weighted_choice` function used in :class:`~sklearn_genetic.GAFeatureSelectionCV` was
+  rewritten to give more probability to a number of features closer to the `max_features` parameter.
+
 What's new in 0.8.1
 -------------------