
Need ability to call input shape from GraphBuilder #5019

Closed
crockpotveggies opened this issue Apr 30, 2018 · 6 comments
Labels
Enhancement New features and other enhancements

Comments

@crockpotveggies

Issue Description

When building a network such as NASNet, calling the shape of the input tensor is necessary. For example: https://github.com/keras-team/keras/blob/master/keras/applications/nasnet.py#L498

As per conversation with @AlexDBlack

hm... so I've been staring at this for quite a while (and the paper - figure 4 - https://arxiv.org/pdf/1707.07012.pdf)
seems like yes, they are dynamically looking at the current output size after adding each block
for doing this in DL4J: I see that we have 3 options
  1. One-off manual calculation in this model's builder code: i.e., input/output sizes. Shouldn't be too hard, just tedious (we expose the calcs via ConvolutionUtils)
  2. We augment GraphBuilder with "give me the output size of all layers as a map, for this input size" functionality. Doable, a little messy though (I'd want to keep GraphBuilder simple)... that sort of thing is better in ComputationGraphConfiguration.
  3. We augment ComputationGraphConfiguration with "add layer" type functionality, and expose the existing InputType functionality for each layer (to get input/output shapes)
Option 3 seems like the most reasonable to me, and would allow you to follow the same sort of "inspect and add to an existing config" type pattern there. I guess give me an issue for that and I can try to squeeze it in this week
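For context on option 1, the manual bookkeeping is just the standard convolution output-size arithmetic. A minimal pure-Java sketch of the underlying formulas (illustrative only; this is not DL4J's ConvolutionUtils API):

```java
// Standard convolution output-size arithmetic.
public class ConvOutputSize {

    // 'valid'-style convolution: floor((in + 2*padding - kernel) / stride) + 1
    static int valid(int in, int kernel, int stride, int padding) {
        return (in + 2 * padding - kernel) / stride + 1;
    }

    // 'same'-style padding: ceil(in / stride)
    static int same(int in, int stride) {
        return (in + stride - 1) / stride;
    }

    public static void main(String[] args) {
        // A 224-wide input through a 3x3, stride-2 'same' conv -> 112
        System.out.println(same(224, 2));        // 112
        // The same input with 'valid' padding (no padding) -> 111
        System.out.println(valid(224, 3, 2, 0)); // 111
    }
}
```

Repeating this per layer by hand is exactly the "tedious but not hard" work option 1 implies.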

Version Information

v1.0.0-snapshot

@AlexDBlack
Contributor

OK, so adding methods like addLayer etc to ComputationGraphConfiguration is straightforward - no issues there. (And is underway here: #5031)

One issue however is the idea of the global configuration overrides. Note that in the graph builder, we specify the global config defaults before the .graphBuilder() call. However, this default information isn't available in the instantiated ComputationGraphConfiguration (only in the GraphBuilder - or more accurately, the NeuralNetConfiguration.Builder in the GraphBuilder).

Now, I see basically 3 options here:

  1. Don't allow global config overrides when using ComputationGraphConfiguration.addLayer (downside: error prone - it's easy for users to not understand this unless they read the javadoc (and most won't for such a simple method))
  2. Store the global config in ComputationGraphConfiguration in case we need it (downside: not present in all pre-1.0.0-beta nets, hence also error prone for users)
  3. Allow users to (optionally) pass in a global configuration object

Not sure what the best option is, but for option 3, that would make the API basically:
.addLayer(String name, Layer layer, SomeGlobalConfigObj globalConfig, String... inputNames)

I'm somewhat hesitant to have a "no global config" method in addition for option 3, like .addLayer(String name, Layer layer, String... inputNames), due to the rather different behaviour with the GraphBuilder method of the same signature.

Alternatively: we look at option 2, but with something like:
.addLayer(String name, Layer layer, String... inputNames)
.addLayer(String name, Layer layer, boolean applyGlobalConfig, String... inputNames)
with the former calling the latter with a default value of true... then, if the global config isn't present (i.e., a pre-1.0.0-beta net, or maybe an imported Keras net) we simply throw an exception saying it's absent, and how they can set it.

The global config object would basically be like FineTuneConfiguration for transfer learning.
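The two-overload idea above can be sketched in plain Java. This is a toy illustration of the delegation pattern only (names are hypothetical, not the actual DL4J API): the short form delegates with applyGlobalConfig = true, and the long form fails fast when the stored global config is absent (e.g. a pre-1.0.0-beta or imported net).

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of the proposed addLayer overloads.
class GraphConfigSketch {
    private final Object globalConfig;      // null for older/imported nets
    private final List<String> layers = new ArrayList<>();

    GraphConfigSketch(Object globalConfig) {
        this.globalConfig = globalConfig;
    }

    // Short form: defaults to applying the global config.
    void addLayer(String name, String... inputs) {
        addLayer(name, true, inputs);
    }

    // Long form: throws if the global config is requested but absent.
    void addLayer(String name, boolean applyGlobalConfig, String... inputs) {
        if (applyGlobalConfig && globalConfig == null) {
            throw new IllegalStateException(
                "Global configuration not present in this net; set it first, "
              + "or pass applyGlobalConfig = false");
        }
        layers.add(name);
    }

    int numLayers() {
        return layers.size();
    }
}
```

The fail-fast exception is what turns the silent "global config not applied" footgun into an explicit, explainable error.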

@crockpotveggies
Author

I'm actually wondering whether, while it may seem like more work, the original first option would be the cleanest for users

One-off manual calculation in this model's builder code: i.e., input/output sizes. Shouldn't be too hard, just tedious (we expose the calcs via ConvolutionUtils)

I'm happy to assist on this if needed, since I think the payoff would be worthwhile. I think this would avoid any future confusion with global configs.

@AlexDBlack
Contributor

OK, so following up on our gitter conversation:

To avoid this (possibly somewhat error-prone for users) "global configuration for ComputationGraphConfiguration" issue, we've decided that:

  1. We won't have addLayer etc methods on ComputationGraphConfiguration
  2. We'll add methods on the builder to get output sizes; to avoid duplicating functionality, these will simply instantiate a temporary ComputationGraphConfiguration for the output size calculations
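That decision can be sketched in plain Java. This is a toy illustration of the design only (all names here are hypothetical, not the DL4J API): the builder owns no shape-inference logic of its own; when asked for output sizes it instantiates a temporary configuration and delegates the calculation to it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy sketch: builder delegates size calculation to a temporary config.
class BuilderSketch {
    private final Map<String, Integer> layerKernel = new LinkedHashMap<>();

    BuilderSketch addConv(String name, int kernel) {
        layerKernel.put(name, kernel);
        return this;
    }

    // "give me the output size of all layers, for this input size"
    Map<String, Integer> layerOutputSizes(int inputSize) {
        return new ConfigSketch(layerKernel).outputSizes(inputSize);
    }

    // Stand-in for the instantiated ComputationGraphConfiguration,
    // which performs the actual per-layer size calculation.
    static class ConfigSketch {
        private final Map<String, Integer> layerKernel;

        ConfigSketch(Map<String, Integer> layerKernel) {
            this.layerKernel = layerKernel;
        }

        Map<String, Integer> outputSizes(int inputSize) {
            Map<String, Integer> out = new LinkedHashMap<>();
            int cur = inputSize;
            for (Map.Entry<String, Integer> e : layerKernel.entrySet()) {
                cur = cur - e.getValue() + 1;   // 'valid' conv, stride 1
                out.put(e.getKey(), cur);
            }
            return out;
        }
    }
}
```

Because the builder only forwards to the (temporary) configuration, there is a single source of truth for shape inference, which is the stated goal of avoiding duplicate functionality.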

@AlexDBlack
Contributor

#5031

@crockpotveggies
Author

Excellent work :)


lock bot commented Sep 22, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
