This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Manipulating nn.Dense(...) layer parameters #11133

Closed

lu4 opened this issue Jun 4, 2018 · 4 comments

Comments


lu4 commented Jun 4, 2018

I'm trying to implement my own optimization algorithm for MXNet (Imperative / Gluon) that does not rely on gradients. My question is pretty simple: is there a simple way to create a new nn.Dense(...) layer initialized with parameters (i.e. biases and weights) represented by just two nd.array() instances?

Thank you in advance!

@anirudhacharya
Member

@lu4 check here for resources on creating custom gluon layers - https://gluon.mxnet.io/chapter03_deep-neural-networks/custom-layer.html#Craft-a-bespoke-fully-connected-gluon-layer
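For illustration, a minimal bespoke fully-connected layer in the spirit of that tutorial could look roughly like the sketch below; the class name MyDense and the toy shapes are placeholders, not anything taken from the tutorial itself:

import mxnet as mx
from mxnet import gluon, nd

class MyDense(gluon.nn.Block):
    """A bespoke fully-connected layer whose parameters can be overwritten."""
    def __init__(self, units, in_units, **kwargs):
        super(MyDense, self).__init__(**kwargs)
        with self.name_scope():
            self.weight = self.params.get('weight', shape=(in_units, units))
            self.bias = self.params.get('bias', shape=(units,))

    def forward(self, x):
        return nd.dot(x, self.weight.data()) + self.bias.data()

layer = MyDense(units=2, in_units=5)
layer.initialize()

# Overwrite the parameters with your own NDArrays
layer.weight.set_data(nd.ones((5, 2)))
layer.bias.set_data(nd.zeros((2,)))
print(layer(nd.ones((1, 5))))   # [[5. 5.]]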

@sandeep-krishnamurthy please label - "Gluon", "Question"


lu4 commented Jun 4, 2018

@anirudhacharya please let me rephrase my question, since there were two parts to it:

First part:
Problem: It is hard to train a large network. It's simpler to start with a smaller layer size and incrementally increase it during training. This also reduces the chance of overfitting.

Motivation: It's straightforward and safe to augment a hidden layer's weight matrix with a zero matrix, since it won't affect the state of training.

Please, consider the following example: https://i.imgur.com/RSPJgAo.png

Please note that padding with zeros (or weights very close to zero) won't affect the state of existing training, provided the activation function in question evaluates to 0 at 0.
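As a rough illustration of that claim (the shapes and the relu activation below are just an arbitrary toy example), zero-padding the hidden layer leaves the outputs unchanged:

import mxnet as mx
from mxnet import nd

x = nd.random.uniform(shape=(1, 4))

W1 = nd.random.uniform(shape=(3, 4))   # hidden layer: 3 units
b1 = nd.zeros((3,))
W2 = nd.random.uniform(shape=(2, 3))   # output layer: 2 units

h = nd.relu(nd.dot(x, W1.T) + b1)      # relu(0) == 0, as required
y = nd.dot(h, W2.T)

# Grow the hidden layer from 3 to 5 units by padding with zeros
W1_big = nd.concat(W1, nd.zeros((2, 4)), dim=0)
b1_big = nd.concat(b1, nd.zeros((2,)), dim=0)
W2_big = nd.concat(W2, nd.zeros((2, 2)), dim=1)

h_big = nd.relu(nd.dot(x, W1_big.T) + b1_big)
y_big = nd.dot(h_big, W2_big.T)

print(y, y_big)                        # identical outputs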

Second part:
Is there a way to avoid (re)implementing all the existing layer types and instead just pass the weights I want into a net?

@thomelane
Contributor

Hi @lu4,

You can create a clone of your network, and then make adjustments during the copy. If you're using a Sequential Block as a container for your network, you could create another Sequential Block and add all of the layers from one network to the other, which would save redefining the network. You would make changes to the necessary layers before adding them to the new Sequential Block.

As I understand the problem, you'll need to change the weights and biases of the layer you want to expand, and the weights of the next dense layer (since its weight shape depends on the number of units in the preceding layer, which has changed). After constructing the new weights and biases (i.e. padding with 0s), you can then use set_data on the parameters of interest before adding them to the new Sequential Block.

Unfortunately I don't think you can mutate the original network like this, because you're changing the shape of the parameters: you'll hit shape assertion errors. And you can't just swap out a single layer in the original Sequential Block, because Sequential Blocks don't support item assignment.
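To make that concrete, here is a rough sketch of the copy-and-expand idea on a toy 4 -> 3 -> 2 network (the layer sizes and relu activation are made up for illustration); the hidden layer is grown to 5 units and the old parameters are padded with zeros via set_data:

import mxnet as mx
from mxnet import gluon, nd

# Original (already trained) network: 4 -> 3 -> 2
old_fc1 = gluon.nn.Dense(3, in_units=4, activation='relu')
old_fc2 = gluon.nn.Dense(2, in_units=3)
old_net = gluon.nn.Sequential()
old_net.add(old_fc1, old_fc2)
old_net.initialize()

# New network with the hidden layer grown from 3 to 5 units
new_fc1 = gluon.nn.Dense(5, in_units=4, activation='relu')
new_fc2 = gluon.nn.Dense(2, in_units=5)
new_net = gluon.nn.Sequential()
new_net.add(new_fc1, new_fc2)
new_net.initialize()

# Pad the old parameters with zeros and copy them into the new layers
new_fc1.weight.set_data(nd.concat(old_fc1.weight.data(), nd.zeros((2, 4)), dim=0))
new_fc1.bias.set_data(nd.concat(old_fc1.bias.data(), nd.zeros((2,)), dim=0))
new_fc2.weight.set_data(nd.concat(old_fc2.weight.data(), nd.zeros((2, 2)), dim=1))
new_fc2.bias.set_data(old_fc2.bias.data())

x = nd.random.uniform(shape=(1, 4))
print(old_net(x), new_net(x))          # same outputs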


ThomasDelteil commented Jul 5, 2018

@lu4
To expand on @thomelane's answer, here is a practical example of how you can set the data:

import mxnet as mx
from mxnet import gluon

# Create a layer
net = gluon.nn.Dense(2, in_units=100, use_bias=True)
net.initialize()

# Update the weights and bias of the layer
net.weight.set_data(mx.nd.ones((2, 100)))
net.bias.set_data(mx.nd.ones((2,)))

net(mx.nd.ones((1, 100)))

which outputs:

[[101. 101.]]
<NDArray 1x2 @cpu(0)>

To expand on his warning: you'd need to initialize the network at the maximum size first, because you can't reshape the parameters afterwards, but you can indeed fill them with weights padded with zeros.

@indhub Could you please close the issue? Thanks!
@lu4 if that doesn't answer your question and you would like to follow up, please create a post on https://discuss.mxnet.io Thanks!

@indhub indhub closed this as completed Jul 6, 2018