Commit
Fix second issue from #42 (#60)
* fix issues with metric evaluation in transformer
* improve transformer selection algorithm to maintain fittest program
* update documentation for the new behaviour
* update changelog with changes in random sampling
trevorstephens committed Nov 16, 2017
1 parent 6ab0bbc commit dfbae86
Showing 9 changed files with 164 additions and 151 deletions.
2 changes: 1 addition & 1 deletion doc/advanced.rst
@@ -63,7 +63,7 @@ This can then be added to a ``gplearn`` estimator like so::
 After fitting, you will see some of your programs will have used your own
 customized functions, for example::

-    mul(logical(X0, mul(-0.629, X3), X7, sub(0.790, X7)), X9)
+    sub(logical(X6, add(X11, 0.898), X10, X2), X5)

 .. image:: images/ex3_fig1.png
     :align: center
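The ``logical`` calls above come from a user-defined function registered through :func:`functions.make_function`. A minimal sketch of that setup, assuming the four-argument if-greater-then-else ``logical`` this documentation section describes (the ``function_set`` shown is illustrative)::

    import numpy as np
    from gplearn.functions import make_function
    from gplearn.genetic import SymbolicTransformer

    def _logical(x1, x2, x3, x4):
        # If x1 is greater than x2, return x3; otherwise return x4.
        return np.where(x1 > x2, x3, x4)

    logical = make_function(function=_logical, name='logical', arity=4)

    # Custom functions join the built-ins via the function_set parameter.
    gp = SymbolicTransformer(function_set=['add', 'sub', 'mul', 'div', logical])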
17 changes: 15 additions & 2 deletions doc/changelog.rst
@@ -4,12 +4,25 @@
 Release History
 ===============

-Version 0.2.1
+Version 0.3.0
 -------------

+- Fixed two bugs in :class:`genetic.SymbolicTransformer` where the final
+  solution selection logic was incorrect and suboptimal. This fix will change
+  the solutions returned by all previous versions of `gplearn`. Thanks to
+  `iblasi <https://github.com/iblasi>`_ for diagnosing the problem and helping
+  craft the solution.
+- Fixed a bug in :class:`genetic.SymbolicRegressor` where a custom fitness
+  measure created with :func:`fitness.make_fitness()` and the parameter
+  `greater_is_better=True` was ignored during final solution selection; see
+  the sketch after this list. This change will alter the results from previous
+  releases where `greater_is_better=True` was set in a custom fitness measure.
+  By `sun ao <https://github.com/eggachecat>`_.
 - Increase minimum required version of ``scikit-learn`` to 0.18.1. This allows
   streamlining the test suite and removal of many utilities to reduce future
-  technical debt.
+  technical debt. **Please note that due to this change, previous versions
+  may have different results** due to a change in random sampling noted
+  `here <http://scikit-learn.org/stable/whats_new.html#version-0-18-1>`_.
 - Drop support for Python 2.6 and add support for Python 3.5 and 3.6 in order
   to support the latest release of ``scikit-learn`` 0.19 and avoid future test
   failures. By `hugovk <https://github.com/hugovk>`_.
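The second entry above concerns the pattern sketched here: a custom metric built with :func:`fitness.make_fitness` and ``greater_is_better=True``, which final solution selection previously ignored. A minimal example, assuming a negated mean absolute error as the custom measure (mirroring the new test added in this commit)::

    from gplearn.fitness import make_fitness
    from gplearn.genetic import SymbolicRegressor
    from sklearn.metrics import mean_absolute_error

    def _neg_mae(y, y_pred, sample_weight):
        # Negated MAE: larger values are better, hence greater_is_better=True.
        return -mean_absolute_error(y, y_pred, sample_weight=sample_weight)

    neg_mae = make_fitness(_neg_mae, greater_is_better=True)

    # Before this fix, the final program was chosen as if smaller raw
    # fitness were better, which could return a suboptimal solution.
    est = SymbolicRegressor(metric=neg_mae, stopping_criteria=-0.000001)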
27 changes: 12 additions & 15 deletions doc/examples.rst
@@ -65,16 +65,13 @@ solutions small, since we know the truth is a pretty simple equation::
     |    Population Average    |             Best Individual              |
    ---- ------------------------- ------------------------------------------ ----------
     Gen   Length        Fitness  Length         Fitness     OOB Fitness  Time Left
-      0    38.13    386.19117972      7  0.331580808730   0.470286152255     55.15s
-      1     9.91   1.66832489614      5  0.335361761359   0.488347149514      1.25m
-      2     7.76     1.888657267      7  0.260765934398   0.565517599814      1.45m
-      3     5.37   1.00018638338     17  0.223753461954   0.274920433701      1.42m
-      4     4.69  0.878161643513     17  0.145095322600   0.158359554221      1.35m
-      5      6.1   0.91987274474     11  0.043612562970   0.043612562970      1.31m
-      6     7.18   1.09868887802     11  0.043612562970   0.043612562970      1.23m
-      7     7.65   1.96650325011     11  0.043612562970   0.043612562970      1.18m
-      8     8.02   1.02643443398     11  0.043612562970   0.043612562970      1.08m
-      9     9.07   1.22732144371     11  0.000781474035  0.0007814740353     59.43s
+      0    38.13    458.57768152      5  0.320665972828   0.556763539274      1.28m
+      1     9.97   1.70232723129      5  0.320201761523   0.624787148042     57.78s
+      2     7.72   1.94456344674     11  0.239536660154   0.533148180489     46.35s
+      3     5.41  0.990156815469      7  0.235676349446   0.719906258051     37.93s
+      4     4.66  0.894443363616     11  0.103946413589   0.103946413589     32.20s
+      5     5.41  0.940242380405     11  0.060802040427   0.060802040427     28.15s
+      6     6.78    1.0953592564     11  0.000781474035   0.000781474035     24.85s

 The evolution process stopped early as the error of the best program in the 9th
 generation was better than 0.01. It also appears that the parsimony coefficient
@@ -154,12 +151,12 @@ We can also inspect the program that the :class:`SymbolicRegressor` found::
 And check out who its parents were::

     print est_gp._program.parents

     {'method': 'Crossover',
-     'parent_idx': 374,
+     'parent_idx': 1555,
      'parent_nodes': [1, 2, 3],
-     'donor_idx': 116,
-     'donor_nodes': [0, 1, 2, 6]}
+     'donor_idx': 78,
+     'donor_nodes': []}

 This dictionary tells us what evolution operation was performed to get our new
 individual, as well as the parents from the prior generation, and any nodes
@@ -235,7 +232,7 @@ dataset and see how it performs on the final 200 again::
     est.fit(new_boston[:300, :], boston.target[:300])
     print est.score(new_boston[300:, :], boston.target[300:])
-    0.853618353633
+    0.841750404385

 Great! We have improved the :math:`R^{2}` score by a significant margin. It
 looks like the linear model was able to take advantage of some new non-linear
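For reference, the ``new_boston`` matrix in the excerpt above is presumably built, earlier in this example, by stacking the transformer's evolved features onto the original features — roughly (``gp`` being the already-fitted :class:`SymbolicTransformer`)::

    import numpy as np
    from sklearn.datasets import load_boston

    boston = load_boston()
    # gp is assumed to be a SymbolicTransformer fitted on the first 300 rows;
    # transform() returns its evolved feature columns for all samples.
    gp_features = gp.transform(boston.data)
    new_boston = np.hstack((boston.data, gp_features))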
188 changes: 71 additions & 117 deletions doc/gp_examples.ipynb

Large diffs are not rendered by default.

Binary file modified doc/images/ex1_fig1.png
Binary file modified doc/images/ex1_fig2.png
Binary file modified doc/images/ex1_fig3.png
15 changes: 10 additions & 5 deletions gplearn/genetic.py
@@ -480,23 +480,28 @@ def fit(self, X, y, sample_weight=None):
         if isinstance(self, TransformerMixin):
             # Find the best individuals in the final generation
             fitness = np.array(fitness)
-            hall_of_fame = fitness.argsort()[:self.hall_of_fame]
+            if self._metric.greater_is_better:
+                hall_of_fame = fitness.argsort()[::-1][:self.hall_of_fame]
+            else:
+                hall_of_fame = fitness.argsort()[:self.hall_of_fame]
             evaluation = np.array([gp.execute(X) for gp in
                                    [self._programs[-1][i] for
                                     i in hall_of_fame]])
             if self.metric == 'spearman':
                 evaluation = np.apply_along_axis(rankdata, 1, evaluation)

-            # Iteratively remove the worst individual of the worst pair
             with np.errstate(divide='ignore', invalid='ignore'):
                 correlations = np.abs(np.corrcoef(evaluation))
             np.fill_diagonal(correlations, 0.)
             components = list(range(self.hall_of_fame))
             indices = list(range(self.hall_of_fame))
+            # Iteratively remove least fit individual of most correlated pair
             while len(components) > self.n_components:
-                worst = np.unravel_index(np.argmax(correlations),
-                                         correlations.shape)
-                worst = worst[np.argmax(np.sum(correlations[worst, :], 1))]
+                most_correlated = np.unravel_index(np.argmax(correlations),
+                                                   correlations.shape)
+                # The correlation matrix is sorted by fitness, so identifying
+                # the least fit of the pair is simply getting the higher index
+                worst = max(most_correlated)
                 components.pop(worst)
                 indices.remove(worst)
                 correlations = correlations[:, indices][indices, :]
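To make the behavioural change concrete, here is a self-contained NumPy sketch of the new pruning loop on synthetic data. The reindexing is simplified with ``np.ix_`` rather than the library's ``indices`` bookkeeping, and ``hall_of_fame=4`` / ``n_components=2`` are illustrative::

    import numpy as np

    rng = np.random.RandomState(0)
    hall_of_fame, n_components = 4, 2

    # Rows stand in for hall-of-fame program outputs, already sorted
    # best-first, so a higher row index always means a less fit program.
    evaluation = rng.uniform(size=(hall_of_fame, 50))

    correlations = np.abs(np.corrcoef(evaluation))
    np.fill_diagonal(correlations, 0.)
    components = list(range(hall_of_fame))
    while len(components) > n_components:
        most_correlated = np.unravel_index(np.argmax(correlations),
                                           correlations.shape)
        # New logic: of the most correlated pair, drop the less fit member
        # (the higher index) so the fittest program always survives. The
        # old logic dropped whichever member correlated more with the rest,
        # which could discard the overall best program.
        worst = max(most_correlated)
        components.pop(worst)
        keep = [i for i in range(correlations.shape[0]) if i != worst]
        correlations = correlations[np.ix_(keep, keep)]

    print(components)  # indices of the surviving, fitness-sorted programs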
66 changes: 55 additions & 11 deletions gplearn/tests/test_genetic.py
@@ -829,7 +829,7 @@ def test_transformer_iterable():
     est.fit(X, y)
     fitted_len = len(est)
     fitted_iter = [gp.length_ for gp in est]
-    expected_iter = [15, 19, 19, 12, 9, 10, 7, 14, 6, 21]
+    expected_iter = [8, 12, 2, 29, 9, 33, 9, 8, 4, 22]

     assert_true(fitted_len == 10)
     assert_true(fitted_iter == expected_iter)
@@ -1022,29 +1022,73 @@ def test_warm_start():
     assert_equal(cold_program, warm_program)


-def test_customizied_regressor_metrics():
-    """Check whether parameter greater_is_better works fine"""
+def test_customized_regressor_metrics():
+    """Check whether greater_is_better works for SymbolicRegressor."""

     x_data = rng.uniform(-1, 1, 100).reshape(50, 2)
     y_true = x_data[:, 0] ** 2 + x_data[:, 1] ** 2

-    est_gp = SymbolicRegressor(metric='mean absolute error', stopping_criteria=0.000001, random_state=415,
-                               parsimony_coefficient=0.001, verbose=0, init_method='full', init_depth=(2, 4))
+    est_gp = SymbolicRegressor(metric='mean absolute error',
+                               stopping_criteria=0.000001, random_state=415,
+                               parsimony_coefficient=0.001, init_method='full',
+                               init_depth=(2, 4))
     est_gp.fit(x_data, y_true)
     formula = est_gp.__str__()
-    assert_equal("add(mul(X1, X1), mul(X0, X0))", formula, True)
+    assert_equal('add(mul(X1, X1), mul(X0, X0))', formula, True)

     def neg_mean_absolute_error(y, y_pred, sample_weight):
         return -1 * mean_absolute_error(y, y_pred, sample_weight)

-    customizied_fitness = make_fitness(neg_mean_absolute_error, greater_is_better=True)
+    customizied_fitness = make_fitness(neg_mean_absolute_error,
+                                       greater_is_better=True)

-    c_est_gp = SymbolicRegressor(metric=customizied_fitness, stopping_criteria=-0.000001, random_state=415,
-                                 parsimony_coefficient=0.001, verbose=0, init_method='full', init_depth=(2, 4))
+    c_est_gp = SymbolicRegressor(metric=customizied_fitness,
+                                 stopping_criteria=-0.000001, random_state=415,
+                                 parsimony_coefficient=0.001, verbose=0,
+                                 init_method='full', init_depth=(2, 4))
     c_est_gp.fit(x_data, y_true)
     c_formula = c_est_gp.__str__()

-    assert_equal("add(mul(X1, X1), mul(X0, X0))", c_formula, True)
+    assert_equal('add(mul(X1, X1), mul(X0, X0))', c_formula, True)
+
+
+def test_customized_transformer_metrics():
+    """Check whether greater_is_better works for SymbolicTransformer."""
+
+    est_gp = SymbolicTransformer(generations=2, population_size=100,
+                                 hall_of_fame=10, n_components=1,
+                                 metric='pearson', random_state=415)
+    est_gp.fit(boston.data, boston.target)
+    for program in est_gp:
+        formula = program.__str__()
+    expected_formula = ('sub(div(mul(X4, X12), div(X9, X9)), '
+                        'sub(div(X11, X12), add(X12, X0)))')
+    assert_equal(expected_formula, formula, True)
+
+    def _neg_weighted_pearson(y, y_pred, w):
+        """Calculate the weighted Pearson correlation coefficient."""
+        with np.errstate(divide='ignore', invalid='ignore'):
+            y_pred_demean = y_pred - np.average(y_pred, weights=w)
+            y_demean = y - np.average(y, weights=w)
+            corr = ((np.sum(w * y_pred_demean * y_demean) / np.sum(w)) /
+                    np.sqrt((np.sum(w * y_pred_demean ** 2) *
+                             np.sum(w * y_demean ** 2)) /
+                            (np.sum(w) ** 2)))
+        if np.isfinite(corr):
+            return -1 * np.abs(corr)
+        return 0.
+
+    neg_weighted_pearson = make_fitness(function=_neg_weighted_pearson,
+                                        greater_is_better=False)
+
+    c_est_gp = SymbolicTransformer(generations=2, population_size=100,
+                                   hall_of_fame=10, n_components=1,
+                                   stopping_criteria=-1,
+                                   metric=neg_weighted_pearson,
+                                   random_state=415)
+    c_est_gp.fit(boston.data, boston.target)
+    for program in c_est_gp:
+        c_formula = program.__str__()
+    assert_equal(expected_formula, c_formula, True)


 if __name__ == "__main__":
