Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LeakyReLU #81
LeakyReLU #81
Changes from 27 commits
1e27430
180a527
0f5200a
7d36fde
533b86e
fd2e33b
8b69cab
ba299ff
9b4d0d6
dbeb414
7771155
6e9d9f7
18e30f9
c5b058b
c72fd11
68d3518
c14a5db
4a21b73
b4119fd
1b97a0f
1ebd9ba
07cc632
f2f167e
003dfb6
e17d5d7
a83cb7b
28158ae
6077578
410166b
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO that's not a good fix since it destroys the whole point of just having one broadcast expression that is fused together. Now we would allocate two additional vectors of ones and zeros just for Tracker.
Did you check if we can work around the Tracker bug by using a separate function
ispositive(x) = x > zero(x)
andisnegative(x) = x < zero(x)
that we can include in the broadcast expression instead of the explicitx < zero(x)
etc statements?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not two vectors; just two numbers, right? (I realize the above impl was wrong, it was supposed to contain
eltype
)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, even if we use
isnegative
, we're running into the same issue because we're broadcasting overone(x)
, so we would have to also have this in a separate function, e.g.mul1(x, y) = x * one(y)
or something 😕There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With
eltype
it's just a number - IMO it's still unfortunate since there would be no need for this type-based computation if Tracker would be fixed (I opened a PR: FluxML/Tracker.jl#85). In general it's better to not use types if not needed since the instances contain more information (similar argument for why generated functions should be used only if needed). It should just work with this single broadcast expression here.Apart from that, are the
let
blocks actually needed?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW it seems the tests still fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It still breaks, but now the only one failing is
LeakyReLU{..., 1}
. Again it hits a snag atmaterlialize
:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100% agree! And nice!
Nah, I just added them to make explicit that it's only needed for this particular statement, and not to clutter the names in the full function. Was nicer IMO, but ofc subjective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my only real comment here is on testing other AD backends like ReverseDiff, but I'm not sure how important that is here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some tests for it in
test/bijectors/interface.jl
. But yeah, testing in Bijectors is honestly a bit of mess atm. In a couple of the other PRs I've added some functionality which makes it easier to use a "standardized" testing suite for a newBijector
, so the plan is to use that in the future 👍