I wish I could label this with "good first issue", but this issue is a bit meatier than the label would suggest. That being said, this has the potential to be a really fun issue for an eager developer to take on. It:
- will speed some things up, which is about the most exciting thing you can do in mygrad 😜
- requires a simple, but potentially elegant, revision of Operation
- entails writing some slick automated tests
If there is anyone, or multiple people, who would like to participate in this, please let me know. I think it would be a great learning experience. I will happily provide guidance ranging from: "very hands on", to: "give general insights and review", depending on your needs/preferences.
Obviously if there is not any interest, I will end up taking this on myself in a few weeks or so. However, if you do want to take this on but at a later date, just let me know.
Math
f(x, y) is symmetric if f(x, y) = f(y, x). Thus the following is true for the derivatives for a symmetric function f:
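$$\frac{\partial f}{\partial x}\bigg|_{x=a,\,y=b} = \frac{\partial f}{\partial y}\bigg|_{x=b,\,y=a} \tag{1}$$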
Suppose we want to compute the total derivative of a symmetric function with identical inputs. I.e.:
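$$\frac{d f(x, x)}{dx} = \frac{\partial f}{\partial x}\bigg|_{y=x} + \frac{\partial f}{\partial y}\bigg|_{y=x} \tag{2}$$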
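Given the relationship deduced above, the two partial derivatives are equal when the inputs are identical, so this can be reduced to a single partial derivative:
$$\frac{d f(x, x)}{dx} = 2\,\frac{\partial f}{\partial x}\bigg|_{y=x} \tag{3}$$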
Obviously, this reduction extends trivially to symmetric functions of N inputs, where the factor of 2 becomes a factor of N.
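As a quick sanity check of the reduction, here is a minimal scalar sketch using logaddexp as the symmetric function. The helper names (`d_dx`, `d_dy`) are made up for illustration and are not part of MyGrad; the snippet only compares the long form, the reduced form, and a finite-difference estimate.

```python
import math

def logaddexp(x, y):
    # Symmetric function: log(exp(x) + exp(y)) == log(exp(y) + exp(x))
    return math.log(math.exp(x) + math.exp(y))

def d_dx(x, y):
    # Partial derivative with respect to the first argument
    return math.exp(x) / (math.exp(x) + math.exp(y))

def d_dy(x, y):
    # Partial derivative with respect to the second argument
    return math.exp(y) / (math.exp(x) + math.exp(y))

a = 0.7

# Long form: evaluate both partials and add them
long_form = d_dx(a, a) + d_dy(a, a)

# Reduced form: evaluate one partial and scale by the number of inputs
reduced = 2 * d_dx(a, a)

# Central finite difference of g(x) = logaddexp(x, x) as an independent check
eps = 1e-6
numeric = (logaddexp(a + eps, a + eps) - logaddexp(a - eps, a - eps)) / (2 * eps)
```

Both forms agree with the numerical estimate, but the reduced form evaluates only one partial derivative.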
Current State of MyGrad
Presently, MyGrad always computes its derivatives in long form (equation 2), even when it is dealing with a symmetric function that receives identical inputs.
An exception to this is EinSum, which implements its own backprop so that common optimized sum-reduction cases like einsum("..., ...", x, x) don't drag during backprop; it implements the logic of equation 3 when it has a symmetric reduction case and identical inputs.
Proposal
Operation should have a symmetries attribute that allows individual operations to identify symmetry relationships among their inputs. Operations that declare symmetries would then check for identical inputs (as determined by is, not ==), and would compute the total derivative using the reduced form (equation 3) where possible.
The outcome is a set of nice, simple optimizations, so that users can freely write things like logaddexp(x, x) without incurring redundant computations during backprop.
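To make the proposal concrete, here is a rough, self-contained sketch of how such a check could look. Everything in it is an assumption for illustration: the `Var` class, the `partial` method, the `calls` counter, and the shape of `symmetries` (tuples of interchangeable input positions) are invented here and do not reflect MyGrad's actual Operation interface.

```python
import math

class Var:
    """Toy stand-in for a tensor; object identity marks 'the same input'."""
    def __init__(self, value):
        self.value = value
        self.grad = 0.0

class Operation:
    # Hypothetical attribute: each tuple lists input positions that can be
    # permuted without changing the output.
    symmetries = ()

    def partial(self, index, *inputs):
        raise NotImplementedError

    def backward(self, grad, *inputs):
        done = set()
        for group in self.symmetries:
            first = inputs[group[0]]
            # Identity check (is), not value equality (==)
            if all(inputs[i] is first for i in group):
                # Reduced form (equation 3): one partial, scaled by group size
                first.grad += grad * len(group) * self.partial(group[0], *inputs)
                done.update(group)
        # Long form (equation 2) for every input not handled above
        for i, var in enumerate(inputs):
            if i not in done:
                var.grad += grad * self.partial(i, *inputs)

class LogAddExp(Operation):
    symmetries = ((0, 1),)
    calls = 0  # counts partial-derivative evaluations, for illustration only

    def partial(self, index, x, y):
        LogAddExp.calls += 1
        ex, ey = math.exp(x.value), math.exp(y.value)
        return (ex, ey)[index] / (ex + ey)

op = LogAddExp()

x = Var(0.7)
op.backward(1.0, x, x)         # identical inputs: a single partial is evaluated
same_calls, same_grad = LogAddExp.calls, x.grad

LogAddExp.calls = 0
a, b = Var(0.7), Var(0.7)      # equal values, but distinct objects
op.backward(1.0, a, b)         # falls back to the long form
diff_calls = LogAddExp.calls
```

In this sketch the identical-input call evaluates one partial derivative instead of two, while inputs that are merely equal in value still take the long-form path, which is the behavior an is-based check implies.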
It is unclear how useful this is in general. Preliminary timings for special casing of (x * x).backward() did not indicate any appreciable speedup.
It might be good to support special casing of (x / x).backward(), because that would save a substantial amount of compute and avoid potential numerical stability issues. However, it is also potentially problematic for backprop through x / x to differ from backprop through x / +x in any appreciable way.
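The trade-off can be illustrated with a small toy sketch. The `Var` class and `div_backward` function below are hypothetical, and the assumption that unary plus returns a new object mirrors how array libraries typically behave; none of this is MyGrad's actual code.

```python
class Var:
    """Toy stand-in for a tensor (hypothetical, for illustration only)."""
    def __init__(self, value):
        self.value = value

    def __pos__(self):
        # Assumption: unary plus yields a new object, so (+x) is not x
        return Var(+self.value)

def div_backward(num, den):
    """Return (input, gradient) pairs for num / den."""
    if num is den:
        # x / x is the constant 1, so its derivative is exactly zero
        return [(num, 0.0)]
    # General case: d(n/d)/dn = 1/d and d(n/d)/dd = -n/d^2
    return [(num, 1.0 / den.value), (den, -num.value / (den.value * den.value))]

x = Var(3.0)
special = div_backward(x, x)    # identity check succeeds: one zero gradient
general = div_backward(x, +x)   # identity check fails: two terms that cancel
general_total = sum(g for _, g in general)

# For a tiny input, den.value * den.value underflows to 0.0 in the general path
tiny = Var(1e-200)
try:
    div_backward(tiny, +tiny)
    general_failed = False
except ZeroDivisionError:
    general_failed = True
tiny_special = div_backward(tiny, tiny)
```

For ordinary values the two paths agree after the terms cancel, but for the tiny input the general path divides by an underflowed zero (pure Python raises here; an array library such as NumPy would produce an infinity instead), while the special-cased path returns zero. This is exactly the kind of divergence between x / x and x / +x that the comment above flags.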