Conversation
template<typename xpu>
void PlusBroadcastBackward_(const OutputGrad& out_grad,
Can we reuse the backward code for different operations?
Only plus and minus can reuse the same code.
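For context, a rough numpy sketch (mine, not the PR's code) of why plus and minus can share the backward logic: the gradient of each broadcast input is the output gradient summed over the broadcast axes, and minus only flips the sign of the right-hand gradient.

```python
import numpy as np

def reduce_to_shape(grad, shape):
    # Sum over the axes that were broadcast so grad matches `shape`
    # (assumes grad and shape have the same number of dimensions).
    for axis, dim in enumerate(shape):
        if dim == 1 and grad.shape[axis] != 1:
            grad = grad.sum(axis=axis, keepdims=True)
    return grad

def broadcast_plus_backward(out_grad, lhs_shape, rhs_shape):
    return reduce_to_shape(out_grad, lhs_shape), reduce_to_shape(out_grad, rhs_shape)

def broadcast_minus_backward(out_grad, lhs_shape, rhs_shape):
    # identical to plus except for the sign of the rhs gradient
    return reduce_to_shape(out_grad, lhs_shape), -reduce_to_shape(out_grad, rhs_shape)
```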
Very nice! I would suggest an add-on to our shape inference mechanism here: we should allow the user to attach shape information to a Variable.
@pluskid I didn't get your question. Currently shape inference is done by passing the input shapes to infer_shape.
@mli For example, this currently works:

import mxnet as mx
a = mx.sym.Variable('a')
b = mx.sym.Variable('b')
c = a + b
c.infer_shape(a=(10,20))

But if we replace a + b with mx.sym.BroadcastPlus(a, b), this no longer works, because the shape of b cannot be inferred from a alone.
The only way that works is

>>> d = mx.sym.BroadcastPlus(a,b)
>>> d.infer_shape(a=(10,20), b=(10,1))

So your use case is that the shape of b is known in advance?
@mli In my use case:

a = mx.sym.Variable('a')
b = mx.sym.Variable('b', shape=(1, 10))
c = mx.sym.BroadcastPlus(a, b)
c.simple_bind(a=(5, 10))
Understood now. I agree that adding a shape attribute is cleaner than the alternative of passing additional shape info to the data iterator.
I guess: don't broadcast unless you have to, and Chiyuan's example still works.
@piiswrong My problem is that I already know the expected behavior is for the parameter to be shared (i.e., broadcast). For example, if I want to implement the fully connected layer with low-level arithmetic, I could write something like the sketch below.
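A minimal sketch of what that could look like (hypothetical: the original example in this comment was cut off, and the variable names and the 128-unit size below are my own assumptions):

```python
import mxnet as mx

x = mx.sym.Variable('x')                   # input, shape (batch, num_input)
w = mx.sym.Variable('w')                   # weight, shape (num_input, 128)
b = mx.sym.Variable('b', shape=(1, 128))   # bias, shared (broadcast) across the batch
fc = mx.sym.BroadcastPlus(mx.sym.dot(x, w), b)
```

Here b has to be broadcast over the batch dimension, so a plain + with strict shape matching is not enough, and attaching the shape to the Variable lets infer_shape/simple_bind work from the input shape alone.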
template<typename xpu, typename LHS_OP, typename RHS_OP> |
If you feed both the left-hand and right-hand inputs to both LHS_OP and RHS_OP, the mul backward can be merged into this template too, right? This shouldn't increase time or memory.
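To illustrate the suggestion with a numpy sketch (my own naming, not the PR's templates): if the generic backward passes both inputs to each side's functor, multiplication fits the same pattern as plus and minus, since each input's gradient is the output gradient times the other input, reduced over the broadcast axes.

```python
import numpy as np

def reduce_to_shape(grad, shape):
    # same reduction helper as in the earlier sketch
    for axis, dim in enumerate(shape):
        if dim == 1 and grad.shape[axis] != 1:
            grad = grad.sum(axis=axis, keepdims=True)
    return grad

def generic_broadcast_backward(out_grad, lhs, rhs, lhs_op, rhs_op):
    # each functor sees out_grad and both inputs, so mul can reuse this path
    return (reduce_to_shape(lhs_op(out_grad, lhs, rhs), lhs.shape),
            reduce_to_shape(rhs_op(out_grad, lhs, rhs), rhs.shape))

# plus:  lhs_op = rhs_op = lambda g, l, r: g
# minus: lhs_op = lambda g, l, r: g,      rhs_op = lambda g, l, r: -g
# mul:   lhs_op = lambda g, l, r: g * r,  rhs_op = lambda g, l, r: g * l
```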
Please check why the Python test fails. Note that if broadcast does not support in-place computation, the in-place declaration should be disabled.
.describe("lhs minus rhs with broadcast"); | ||
|
||
MXNET_REGISTER_SIMPLE_OP(_broadcast_mul, XPU) | ||
.set_symbol_op_name("BroadcastMul") |
For new operators, let us keep the names consistent and lower-case, so that mx.sym and mx.nd expose exactly the same functions.
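For example, with a consistent lower-case name (broadcast_plus here follows the suggestion, not necessarily what this PR currently registers), the symbolic and imperative APIs would expose the same function:

```python
import mxnet as mx

# symbolic
a = mx.sym.Variable('a')
b = mx.sym.Variable('b')
c = mx.sym.broadcast_plus(a, b)

# imperative, same name under mx.nd
x = mx.nd.ones((5, 10))
y = mx.nd.ones((1, 10))
z = mx.nd.broadcast_plus(x, y)
```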
@mli Any updates on this?
Really looking forward to some updates.
@mli is probably busy lately. Can someone take over this?
Closing this since we are moving to a new one.
Add a new op _broadcast_plus; both forward and backward are tested.
Before extending this to the other elementary binary operators, I'd like to open this PR first, because the implementation is unexpectedly complicated and I may have gotten something wrong.
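As a rough illustration of the kind of forward/backward check involved (a numpy sketch of mine, not the PR's actual test code): the backward of _broadcast_plus should produce gradients equal to the output gradient summed over the broadcast axes.

```python
import numpy as np

lhs = np.random.randn(5, 10)
rhs = np.random.randn(1, 10)                     # broadcast along the first axis

out = lhs + rhs                                  # forward via numpy broadcasting
out_grad = np.random.randn(*out.shape)

lhs_grad = out_grad                              # lhs already has the output shape
rhs_grad = out_grad.sum(axis=0, keepdims=True)   # reduce over the broadcast axis

assert lhs_grad.shape == lhs.shape
assert rhs_grad.shape == rhs.shape
```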