Fix issue#42-(1) #44

Merged (1 commit, Nov 10, 2017)
gplearn/genetic.py (5 changes: 4 additions & 1 deletion)

@@ -471,7 +471,10 @@ def fit(self, X, y, sample_weight=None):

         if isinstance(self, RegressorMixin):
             # Find the best individual in the final generation
-            self._program = self._programs[-1][np.argmin(fitness)]
+            if self._metric.greater_is_better:
+                self._program = self._programs[-1][np.argmax(fitness)]
+            else:
+                self._program = self._programs[-1][np.argmin(fitness)]

         if isinstance(self, TransformerMixin):
             # Find the best individuals in the final generation
Owner:
Can we expand the PR to cover the transformer as well? I think that the hall_of_fame = fitness.argsort()[:self.hall_of_fame] should be the place to do this. Should be adequate to cut the results in the opposite direction for the bottom end using hall_of_fame = fitness.argsort()[self.population_size - self.hall_of_fame:]
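The slicing suggested above can be sketched with plain NumPy: `argsort` orders fitness ascending, so for a greater-is-better metric the hall of fame should be cut from the top end of the sort instead of the bottom. The fitness values below are made up for illustration, not gplearn output.

```python
import numpy as np

# Made-up fitness values for a population of six programs.
fitness = np.array([0.9, 0.1, 0.5, 0.3, 0.7, 0.2])
hall_of_fame = 2
population_size = len(fitness)

# Smaller-is-better metric (e.g. mean absolute error): keep the
# front of the ascending sort.
best_low = fitness.argsort()[:hall_of_fame]

# Greater-is-better metric: cut from the opposite end instead.
best_high = fitness.argsort()[population_size - hall_of_fame:]

print(best_low.tolist())   # [1, 5] -> the two smallest fitness values
print(best_high.tolist())  # [4, 0] -> the two largest fitness values
```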

Owner:
Presumably all the existing tests should pass with this change, just add an additional regression test to keep this bug removed in the future as well.

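The patch above switches between `np.argmin` and `np.argmax` depending on the metric's direction. A minimal standalone sketch of that selection logic (the program names and fitness numbers here are illustrative, not gplearn internals):

```python
import numpy as np

# Illustrative final-generation programs and their fitness values.
programs = ["prog_a", "prog_b", "prog_c"]
fitness = [0.42, 0.10, 0.77]

def best_program(programs, fitness, greater_is_better):
    # Mirror of the patched selection: maximise when the metric says
    # greater is better, otherwise minimise (e.g. an error metric).
    if greater_is_better:
        return programs[np.argmax(fitness)]
    return programs[np.argmin(fitness)]

print(best_program(programs, fitness, greater_is_better=True))   # prog_c
print(best_program(programs, fitness, greater_is_better=False))  # prog_b
```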
gplearn/tests/test_genetic.py (25 changes: 25 additions & 0 deletions)

@@ -1010,6 +1010,31 @@ def test_warm_start():
assert_equal(cold_program, warm_program)


def test_customized_regressor_metrics():
    """Check whether the greater_is_better parameter works correctly"""

    x_data = rng.uniform(-1, 1, 100).reshape(50, 2)
    y_true = x_data[:, 0] ** 2 + x_data[:, 1] ** 2

    est_gp = SymbolicRegressor(metric='mean absolute error', stopping_criteria=0.000001, random_state=415,
                               parsimony_coefficient=0.001, verbose=0, init_method='full', init_depth=(2, 4))
    est_gp.fit(x_data, y_true)
    formula = est_gp.__str__()
    assert_equal("add(mul(X1, X1), mul(X0, X0))", formula, True)

Owner:

Can we reduce these line lengths to 79 chars or less for PEP8?

    def neg_mean_absolute_error(y, y_pred, sample_weight):
        return -1 * mean_absolute_error(y, y_pred, sample_weight)

    customized_fitness = make_fitness(neg_mean_absolute_error, greater_is_better=True)

    c_est_gp = SymbolicRegressor(metric=customized_fitness, stopping_criteria=-0.000001, random_state=415,
                                 parsimony_coefficient=0.001, verbose=0, init_method='full', init_depth=(2, 4))
    c_est_gp.fit(x_data, y_true)
    c_formula = c_est_gp.__str__()

    assert_equal("add(mul(X1, X1), mul(X0, X0))", c_formula, True)

Owner:
Does this test fail on master?

Owner:
Just checked and it does. Might be a better test to ensure the formulas are equal to each other though?

eggachecat (Contributor, author) commented on Aug 15, 2017:
@trevorstephens Sorry for my late feedback. Maybe I should not compare the formula's string representation, since other equivalent solutions are possible. Instead, I will test whether the formula's outputs for a given X are almost equal to the right answer.

eggachecat (Contributor, author) commented on Aug 15, 2017:
@trevorstephens Hi, it just occurred to me: why would the test fail at all? Since I pass a fixed random_state parameter, shouldn't it give the same result every time? Just curious...

Owner:
No worries @eggachecat ... It's open source lol, happens when people have the time to work on it! I think checking the string is fine for a negative version of the same metric.

It's a regression check so its intention is to ensure the bug does not return in the future. So yes, it should always pass unless something changes later that brings the bug back. This check would ensure that the bug is detected before merging code later on that brings back the problem.
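As an aside, the output-based comparison proposed earlier in the thread could look roughly like this. This is only a sketch under assumed names: `same_behaviour`, `formula_a`, and `formula_b` are hypothetical helpers, not part of gplearn or this PR.

```python
import numpy as np

def same_behaviour(f, g, X, tol=1e-8):
    # Two formulas are treated as equivalent if their outputs on X agree.
    return bool(np.allclose(f(X), g(X), atol=tol))

# Two syntactically different but algebraically equivalent formulas.
def formula_a(X):
    return X[:, 0] ** 2 + X[:, 1] ** 2

def formula_b(X):
    return X[:, 1] * X[:, 1] + X[:, 0] * X[:, 0]

rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, (50, 2))
print(same_behaviour(formula_a, formula_b, X))  # True
```

This sidesteps the problem that two different strings can encode the same function, at the cost of only checking agreement on the sampled points.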


if __name__ == "__main__":
import nose
nose.runmodule()