Fix binary crossentropy double epsilon #41595
Conversation
Thank you for the PR!
Here are the internal errors, @fhennecker, can you please verify? Traceback (most recent call last):
@fhennecker, any update on this PR? Thanks!
@pavithrasv @gbaned sorry, I was away for a couple of weeks. I changed the expected values for all binary crossentropy tests that were affected and updated the explanatory comments. They were pretty explicitly "using" double epsilons; I hope the behaviour I'm proposing is still the expected one.
Thank you @fhennecker. The change looks correct to me; however, since this could be a big breaking change, let me run a few tests to see how much impact this has.
As suspected, this will be a breaking change affecting a lot of other Google users. I will need to make all of these changes together. Will make sure to keep this PR updated.
@pavithrasv Any update on this PR? Thanks!
@gbaned Sorry about the delay, I have been busy working on other items. I will try to make this change sometime within the next couple of weeks.
Will try to look at it today or next week.
Force-pushed from 0231053 to a7bcb71.
There are ongoing test failures -- please fix.
I fixed what I could fix, but if the tests fail again I might need a bit of help. I couldn't access the details of the failures for some of the GitHub checks, and I ran into quite a lot of trouble trying to run the tests locally, but at least the Ubuntu CPU check should be fixed.
Let's try this again.
OK, so it looks like there are 3 remaining failing steps:
The PR says "merged", but in practice it isn't merged. What happened?
When I compute a binary crossentropy with Keras between a target value == 1 and a prediction value == 0, the crossentropy formula contains a log(0), which does not exist and is avoided by adding a small epsilon value. The epsilon has always been 1e-7 so far, so the numerical value of the binary crossentropy in the case described above is supposed to be equal to -log(1e-7) == 16.11809539794922.

In previous versions of Keras this was true. However, on (at least) TensorFlow 2.2, it's equal to 15.424948470398375, which is the same as -log(2e-7). This is because the binary crossentropy code first clips the prediction into (epsilon, 1-epsilon) and then adds epsilon again inside the log. This results in an effective clipping between (2*epsilon, 1), which I believe is not the expected behaviour. This pull request removes the second step.
Here is a snippet to reproduce:
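The original snippet was not captured in this export. Below is a minimal plain-Python sketch of the double-epsilon effect that does not require TensorFlow; the helper names `bce_with_double_epsilon` and `bce_fixed` are hypothetical, and it assumes the Keras default epsilon of 1e-7:

```python
import math

EPSILON = 1e-7  # Keras backend default epsilon

def bce_with_double_epsilon(target, output):
    # Hypothetical mimic of the buggy behaviour described above.
    # Step 1: clip the prediction into (epsilon, 1 - epsilon)
    output = min(max(output, EPSILON), 1.0 - EPSILON)
    # Step 2 (the bug): epsilon is added again inside the log
    return -(target * math.log(output + EPSILON)
             + (1.0 - target) * math.log(1.0 - output + EPSILON))

def bce_fixed(target, output):
    # Clipping alone already guards against log(0)
    output = min(max(output, EPSILON), 1.0 - EPSILON)
    return -(target * math.log(output)
             + (1.0 - target) * math.log(1.0 - output))

print(bce_with_double_epsilon(1.0, 0.0))  # ~15.4249, i.e. -log(2e-7)
print(bce_fixed(1.0, 0.0))                # ~16.1181, i.e. -log(1e-7)
```

The first value matches what the report observes on TensorFlow 2.2; the second matches the historically expected result.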
If this is indeed a bug, I can also open an issue for easier traceability.