
New closure methods #131

Open · wants to merge 9 commits into base: main

Conversation

pricebenjamin (Contributor)

This PR is intended to solve issue #130. Performance tests should be conducted. There's probably a better solution for determining eps, but this method determines the smallest eps available for the inferred dtype. It may fail if x1.dtype != x2.dtype. (How often would that be the case?)
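(For context, a rough sketch of the kind of closure described above; the helper below is hypothetical, not the PR diff itself, and the use of np.result_type to infer the dtype is an assumption.)

import numpy as np

def _protected_division(x1, x2):
    """Closure of division (x1/x2): keep the denominator away from zero."""
    # Smallest eps available for the inferred dtype; np.finfo raises for
    # integer dtypes, which is the concern raised in the next comment.
    eps = np.finfo(np.result_type(x1, x2)).eps
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.sign(x2) * np.divide(x1, np.abs(x2) + eps)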

* Note that the last sentence in the `Initialization` section
  was completely removed. It was unclear what this sentence was
  meant to say, and it did not make sense in the context of the
  surrounding paragraph.
* Some additional explanation is necessary in this section.
  Specifically, does the winner of a tournament undergo exactly
  one of these operations, or does it undergo potentially all of
  these ops, with the probability of each op being performed given
  by its respective parameter?
@pricebenjamin (Contributor, Author) commented Feb 10, 2019

Integers are not inexact, and thus do not have an eps. Can we safely cast inputs to floats? e.g.

abs_x2 = np.abs(x2, dtype=np.float64)

If we always cast to float64, then we could define eps globally instead of looking it up every call.

@pricebenjamin (Contributor, Author)

test_transformer_iterable() (and probably others) from /gplearn/tests/test_genetic.py fails because the protected functions have been modified.

@coveralls (Collaborator) commented Feb 10, 2019

Coverage Status

Coverage remained the same at 98.703% when pulling 70f0c2b on pricebenjamin:new_closure into 07b41a1 on trevorstephens:master.

@pricebenjamin (Contributor, Author)

@trevorstephens, any input on this?

@trevorstephens (Owner) commented Feb 15, 2019 via email

@@ -14,6 +14,7 @@

 __all__ = ['make_function']

+_EPS = np.finfo(np.float64).eps # Used by protected functions
trevorstephens (Owner)

This would mean that results may differ from machine to machine I guess. Don't really like this!

@pricebenjamin (Contributor, Author) commented Feb 18, 2019

Suggested change:

-_EPS = np.finfo(np.float64).eps # Used by protected functions
+_EPS = 1e-6 # or any other value
+# Consider printing a warning if default value of _EPS is smaller than
+# machine epsilon of whatever dtype is being used (e.g. np.float16)

This solution still provides the benefit of returning more reasonable values as we approach singularities. I could also then remove the explicit conversion to np.float64 in the definitions of the protected functions, since inputs would be implicitly converted upon addition of _EPS. Testing of the protected functions would need to be updated with a lower tolerance.
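(Roughly what that could look like with a fixed constant; a sketch assuming _EPS = 1e-6 and a sign-preserving closure, not the actual branch contents.)

import numpy as np

_EPS = 1e-6  # fixed constant, identical on every machine

def _protected_division(x1, x2):
    """Closure of division (x1/x2) for near-zero denominators."""
    with np.errstate(divide='ignore', invalid='ignore'):
        # Adding the float constant _EPS promotes integer inputs to float,
        # so no explicit cast to np.float64 is needed.
        return np.sign(x2) * np.divide(x1, np.abs(x2) + _EPS)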

@trevorstephens (Owner)

Hey @pricebenjamin, appreciate the input. The solution here has the immediate red flag that the _EPS = np.finfo(np.float64).eps line could differ between machines.

But perhaps we should back up a little though and discuss why this solution is better? Is a more mathematically exact formulation around singularities desirable? Feels like close to infinity is still kind of as bad as actual infinity when it comes to fitness in most cases?

@trevorstephens (Owner)

Also, can we remove the documentation changes from this PR and put them in #129, @pricebenjamin? I am very hesitant about this change due to the potential machine dependencies as it stands, but each PR should be a single issue as well.

@pricebenjamin (Contributor, Author)

Also can we remove the documentation changes from this PR and put them in #129

All of the documentation changes are actually in #129. I made the mistake of not creating the documentation branch separately, and so those changes are automatically in this branch. Should I pull the original documentation into this branch? Or would it be possible to complete #129 first?

@trevorstephens (Owner)

Ah yeah no worries then

@pricebenjamin (Contributor, Author) commented Feb 17, 2019

The solution here has the immediate red flag that the _EPS = np.finfo(np.float64).eps line could differ between machines.

Understood. I'll try and look around for other solutions and am open to any other ideas. All of the benefits of this implementation can still be attained with any fixed value of _EPS.

But perhaps we should back up a little though and discuss why this solution is better?

These new protected functions have a much smaller range over which they're degenerate (with 64-bit precision, these regions are nearly non-existent), which consequently decreases the likelihood that programs will contain degenerate nodes. However, it may be the case that [-0.001, 0.001] was already sufficient.

Edit: Perhaps more importantly, see the answer to your last question.

Is a more mathematically exact formulation around singularities desirable?

Having formulas which match expectations is desirable, in my opinion. If it doesn't impact performance (which I could try and test more thoroughly), is there a downside to making them more exact?

Feels like close to infinity is still kind of as bad as actual infinity when it comes to fitness in most cases?

Part of the issue with the old implementation is that the functions were not producing large results when they should have been. For example, log(0.001) should produce a large negative number, but instead produces 0. This completely removes the penalty that should be associated with approaching singularities.
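(To illustrate; the existing closure below is paraphrased from gplearn's functions.py with its 0.001 threshold.)

import numpy as np

def current_protected_log(x1):
    # Existing behaviour: arguments with |x1| <= 0.001 short-circuit to 0.
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(np.abs(x1) > 0.001, np.log(np.abs(x1)), 0.)

print(np.log(0.001))                 # about -6.9, the expected large negative value
print(current_protected_log(0.001))  # 0.0, so the penalty near the singularity vanishes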

@pricebenjamin (Contributor, Author)

Also, I greatly appreciate the input @trevorstephens!

@trevorstephens (Owner)

Though I welcome the input and discussion, I am very reluctant to merge this PR.

I'm a bit unconvinced by this "reasonable value" point of view. What is reasonable? Why is the current implementation unreasonable? Does a more pure division mean that the results will be more reasonable? or better? Is there any research on this front? (if so then happy to read it)

While I understand that the change is at first glance minor, it does completely break compatibility for all prior releases, so that needs a bit of pause.

Since the advantages of the change are unknown, disadvantages equally unknown, and this change can be completely implemented by the user easily using the existing custom functions functionality, I think I'm leaning towards a no on this one.

Do appreciate the input, but not convinced that the benefits are proven here, and the negatives are non-trivial.

@pricebenjamin (Contributor, Author)

I totally understand the reluctance 👍 I'll see if I can find some resources that discuss the issue of closure.

it does completely break compatibility for all prior releases

Could you elaborate on this? I understand that code written for prior versions would, most likely, produce different results, but I would expect that this method produces fewer degenerate nodes and ultimately better programs (specifically because it correctly penalizes parameters near singularities).

If you're interested, I could work on thoroughly comparing this implementation to the previous. It may take a few weeks, though.

@trevorstephens (Owner)

@pricebenjamin -- exactly. Code written to run in older versions of the package will produce different results in the new version. This makes people less likely to upgrade versions and thus not adopt new features, or be surprised by different results. I like to avoid these things wherever possible.

You're welcome to make any comparisons that you wish, I can't promise that I will merge this though due to the compatibility issue.

@hwulfmeyer commented Mar 12, 2019

What might at least be worth considering is the way the closure is performed. I personally don't care so much about how large the EPS is, but about how it is applied.
I find it a bit questionable that a division returns "1" whenever the divisor is too small, when in fact the smaller the divisor, the larger the result should be.
In the Field Guide to GP they argue:

The decision to return the value 1 provides the GP system with a simple way to
generate the constant 1, via an expression of the form (% x x). This combined with a
similar mechanism for generating 0 via (- x x) ensures that GP can easily construct
these two important constants.

I fail to see the reason why the GP system needs to generate a constant 1 in the first place. It is also not explained in the field guide.

Example A:
Suppose GP is trying to learn the function (5/x^4)+(5/x^3), and x happens to be between 0.1 and 1 in steps of 0.1.
So our training data consists of 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0.
For the function we are trying to learn, the current protected division produces an RMSE of ~15811, while the method proposed in this PR produces an RMSE of ~158.
Not only does the current method produce a higher error, but because of that it would prefer programs that are less prone to producing denominators smaller than 0.001.
For example, the RMSE for the program (5/x^3)+(5/x^3) is ~14250 under both protected divisions.
So for the current protected division, the error for the program (5/x^3)+(5/x^3) is lower than for (5/x^4)+(5/x^3), the function we are trying to learn. The same happens for (5/x^2)+(5/x^3).

In this particular case the current GP system would never be able to learn the original function.

Example B:
If the function we are trying to learn doesn't produce denominators smaller than 0.001, the current protected division produces an error of exactly 0.
The proposed method would produce an error larger than 0, even when the candidate program is exactly the function we are trying to learn.

As Examples A and B highlight, both implementations have their drawbacks.
I therefore propose using a mixture of both methods.

def _protected_division(x1, x2):
    """Closure of division (x1/x2) for zero denominator."""
    with np.errstate(divide='ignore', invalid='ignore'):
        abs_x2 = np.abs(x2, dtype=np.float64)
        return np.where(np.abs(x2) > _EPS,
                        np.divide(x1, x2),
                        np.sign(x2) * np.divide(x1, abs_x2 + _EPS))

def _protected_log(x1):
    """Closure of log for zero arguments."""
    with np.errstate(divide='ignore', invalid='ignore'):
        abs_x1 = np.abs(x1, dtype=np.float64)
        return np.where(abs_x1 > _EPS, np.log(abs_x1), np.log(abs_x1 + _EPS))
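
(For anyone who wants to sanity-check figures like those in Example A, a rough sketch of the comparison; the existing division is paraphrased, and the exact RMSE values depend on the _EPS chosen.)

import numpy as np

_EPS = np.finfo(np.float64).eps  # placeholder; any small constant can be substituted

def current_div(x1, x2):
    # existing closure: return 1 whenever the denominator is within 0.001 of zero
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(np.abs(x2) > 0.001, np.divide(x1, x2), 1.)

def proposed_div(x1, x2):
    # mirrors the mixed _protected_division proposed above
    with np.errstate(divide='ignore', invalid='ignore'):
        abs_x2 = np.abs(x2, dtype=np.float64)
        return np.where(np.abs(x2) > _EPS,
                        np.divide(x1, x2),
                        np.sign(x2) * np.divide(x1, abs_x2 + _EPS))

x = np.linspace(0.1, 1.0, 10)
target = 5 / x**4 + 5 / x**3  # exact values of the function being learned

for div in (current_div, proposed_div):
    pred = div(5, x**4) + div(5, x**3)
    rmse = np.sqrt(np.mean((pred - target) ** 2))
    print(div.__name__, rmse)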

@trevorstephens (Owner)

That's a fairly persuasive case, @wulfihm ... Will think about it.
