This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-524] Broadcast like operator #11820

Merged: 7 commits, Jul 20, 2018

ifeherva
Contributor

Description

Adds an operator that outputs the input array broadcast to the shape of a given target array. This makes broadcasting easier and helps with hybridization.
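A minimal sketch of the shape rule the description implies, in plain Python rather than MXNet. The function name `broadcast_like_shape` is hypothetical, and the rule used here (equal rank, only size-1 input axes may expand) is an assumption borrowed from how `broadcast_to` is usually described, not something stated in this thread.

```python
def broadcast_like_shape(input_shape, like_shape):
    """Return the shape obtained by broadcasting `input_shape` to `like_shape`.

    Assumption: both shapes have the same rank, and an input axis may only
    change size if its current size is 1.
    """
    if len(input_shape) != len(like_shape):
        raise ValueError("rank mismatch: %r vs %r" % (input_shape, like_shape))
    out = []
    for axis, (src, dst) in enumerate(zip(input_shape, like_shape)):
        if src != dst and src != 1:
            raise ValueError(
                "axis %d: cannot broadcast size %d to %d" % (axis, src, dst))
        out.append(dst)
    return tuple(out)


# The target shape does not have to be hardcoded by the caller; it is read
# from the other array, which is what makes this convenient in symbolic graphs.
print(broadcast_like_shape((1, 3), (4, 3)))
```

The point of the sketch is only that the output shape is taken from the target rather than passed as a literal parameter.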

@lanking520
Member

Thanks for your contribution! @zhanghang1989, @eric-haibin-lin for review

@taliesinb
Contributor

There is a generalization that would be extremely useful for this operator to have. It is very similar to one that was discussed at https://discuss.mxnet.io/t/reshaping-broadcasting-without-hardcoding-target-dimensions/851/6 (you can skip to the last 4 comments; the thread contains an irrelevant proposal, although its motivation is relevant).

In short, the generalization would allow only specific dimensions to be copied from the 'other' tensor. For example:

input.shape = (1, 2, 1, 3)
other.shape = (5, 6, 7, 8)
output = broadcast_like(input, other, input_axes:(0,2), other_axes:(1,3))
output.shape = (6, 2, 8, 3)

In other words, you can pick exactly which axes of the other tensor are used to "fill in" axes of the input tensor. This is how broadcast_axes works, except that instead of providing the sizes via a size parameter, you take them from specific axes of the other tensor.

This is valuable because it is common to have another tensor that contains the dimension you want to broadcast to, mixed in among a set of irrelevant dimensions. There is simply no other way of "extracting" the relevant dimension from elsewhere in the net, so currently you have to hardcode that dimension into a parameter list. That forces expensive workarounds like bucketing, where otherwise a cheap reshape would be enough to make a net compatible with multiple sequence lengths, for example.

The current behavior of broadcast_like in the PR would be consistent with this generalization if the default value of input_axes is the empty tuple, which means "all axes".
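The proposal above can be sketched as a shape computation in plain Python. This is only an illustration under stated assumptions: the function name `broadcast_like_axes_shape` is hypothetical, the parameter names `input_axes` and `other_axes` follow the example in this comment rather than any merged API, and the rule that only size-1 input axes may be filled in is assumed.

```python
def broadcast_like_axes_shape(input_shape, other_shape,
                              input_axes=(), other_axes=()):
    """Output shape for the proposed generalization of broadcast_like.

    input_axes[k] of the input takes its size from other_axes[k] of `other`.
    Empty tuples mean "all axes" (assumed to require equal rank), which
    reproduces the plain broadcast_like behavior.
    """
    if not input_axes and not other_axes:
        if len(input_shape) != len(other_shape):
            raise ValueError("default 'all axes' needs equal rank")
        input_axes = other_axes = tuple(range(len(input_shape)))
    if len(input_axes) != len(other_axes):
        raise ValueError("input_axes and other_axes must have equal length")

    out = list(input_shape)
    for in_ax, other_ax in zip(input_axes, other_axes):
        target = other_shape[other_ax]
        # Assumption: only size-1 axes (or already matching axes) are allowed.
        if out[in_ax] != 1 and out[in_ax] != target:
            raise ValueError(
                "input axis %d has size %d, cannot broadcast to %d"
                % (in_ax, out[in_ax], target))
        out[in_ax] = target
    return tuple(out)


# The example from this comment: axes 0 and 2 of the input are filled from
# axes 1 and 3 of the other tensor.
print(broadcast_like_axes_shape((1, 2, 1, 3), (5, 6, 7, 8),
                                input_axes=(0, 2), other_axes=(1, 3)))
# (6, 2, 8, 3)
```

With both axis tuples left empty, the same function reduces to broadcasting the whole input to the other tensor's shape, which is the consistency argument made above.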

Contributor

@zhanghang1989 zhanghang1989 left a comment

LGTM

@@ -207,6 +207,7 @@ Composite multiple symbols into a new one by an operator.

Symbol.broadcast_to
Symbol.broadcast_axes
Symbol.broadcast_like
Member

Contributor Author

Done

@ifeherva
Contributor Author

@taliesinb Very interesting proposal indeed. The implementation is quite straightforward and I am happy to do it if this is something that is planned to happen. Is there a JIRA ticket open for this? I propose to have it in a separate PR.

@taliesinb
Contributor

@ifeherva If you're enthusiastic about this proposal, that's great! Yes, another PR might make sense. I'm not aware of a JIRA ticket, but the design of reshape_like is very similar to this proposal, and that design was proposed by @piiswrong with the goal of solving the same kind of problem. (My colleague @sbodenstein is going to submit a PR for that reshape_like extension in the next few days.)

@ifeherva
Contributor Author

@taliesinb Great! Once that one is merged I can adapt broadcast_like as well.

@szha szha merged commit b16f875 into apache:master Jul 20, 2018
KellenSunderland pushed a commit to KellenSunderland/incubator-mxnet that referenced this pull request Jul 21, 2018
* Registered the broadcast_like operator with GPU and CPU

Added appropriate shape inference

* Added python interface to ndarray and symbol

* Added python api documentation

* Fixed backward operation

* Added unit tests

* Fixed linting issues

* Added missing api doc
XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this pull request Aug 29, 2018
@ifeherva ifeherva deleted the broadcast_like_symbol branch February 10, 2019 04:45