Manipulating nn.Dense(...) layer parameters #11133
Comments
@lu4 check here for resources on creating custom Gluon layers: https://gluon.mxnet.io/chapter03_deep-neural-networks/custom-layer.html#Craft-a-bespoke-fully-connected-gluon-layer

@sandeep-krishnamurthy please label: "Gluon", "Question"
@anirudhacharya please let me rephrase my question, since there were two parts to it.

First part: Motivation: it is straightforward and safe to augment a hidden layer's weight matrix with a zero matrix, as it won't affect the state of training. Please consider the following example: https://i.imgur.com/RSPJgAo.png Note that augmenting with zeros (or weights very close to zero) won't affect the state of existing training, provided the activation function maps 0 to 0 (a sketch of this follows after the comment).

Second part:
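Returning to the first part: here is a minimal sketch of that claim, using plain NDArray operations (this is not code from the thread, and all sizes are illustrative). With an activation that maps 0 to 0, such as ReLU, hidden units with all-zero incoming weights output 0, so zero-padding the next layer's weights to match leaves the network's output unchanged:

```python
import mxnet as mx

# Original 2-layer net: 4 inputs -> 3 hidden units (ReLU) -> 2 outputs.
x = mx.nd.random.uniform(shape=(1, 4))
W1 = mx.nd.random.uniform(shape=(3, 4))
W2 = mx.nd.random.uniform(shape=(2, 3))

h = mx.nd.relu(mx.nd.dot(x, W1.T))
y = mx.nd.dot(h, W2.T)

# Add 2 hidden units with all-zero incoming weights, and pad the next
# layer's weight matrix with matching zero columns.
W1_big = mx.nd.concat(W1, mx.nd.zeros((2, 4)), dim=0)   # (5, 4)
W2_big = mx.nd.concat(W2, mx.nd.zeros((2, 2)), dim=1)   # (2, 5)

h_big = mx.nd.relu(mx.nd.dot(x, W1_big.T))  # extra units: relu(0) == 0
y_big = mx.nd.dot(h_big, W2_big.T)

print((y - y_big).abs().max())  # ~0: the output is unchanged
```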
Hi @lu4,

You can create a clone of your network and make the adjustments during the copy. If you're using a Sequential block as a container for your network, you could create another Sequential block and add all of the layers from one network to the other, which saves redefining the network. You would make changes to the necessary layers before adding them to the new Sequential block.

As I understand the problem, you'll need to change the weights and biases of the layer you want to expand, and also the weights of the next dense layer (since its weight shape depends on the number of units in the layer before it, which has changed). After constructing the new weights and biases (i.e. padding with zeros), you can then use set_data to load them into the new layers (see the sketch below).

Unfortunately, I don't think you can mutate the original network like this, because you're changing the shape of the parameters: you'll hit shape assertion errors. And you can't just swap out a single layer in the original Sequential block, because Sequential blocks don't support item assignment.
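A sketch of this copy-and-expand approach, under illustrative assumptions (a tiny two-layer Sequential network, expanding the first layer from 3 to 5 units; none of this is code from the thread):

```python
import mxnet as mx
from mxnet import gluon

# Original network: the first Dense layer will be expanded from 3 to 5 units.
old_net = gluon.nn.Sequential()
old_net.add(gluon.nn.Dense(3, in_units=4, activation='relu'))
old_net.add(gluon.nn.Dense(2, in_units=3))
old_net.initialize()

# New network with the expanded layer. The parameter shapes differ, so the
# original blocks can't be mutated in place.
new_net = gluon.nn.Sequential()
new_net.add(gluon.nn.Dense(5, in_units=4, activation='relu'))
new_net.add(gluon.nn.Dense(2, in_units=5))
new_net.initialize()

# Pad the expanded layer's weights and bias with zeros...
new_net[0].weight.set_data(
    mx.nd.concat(old_net[0].weight.data(), mx.nd.zeros((2, 4)), dim=0))
new_net[0].bias.set_data(
    mx.nd.concat(old_net[0].bias.data(), mx.nd.zeros((2,)), dim=0))

# ...and pad the next layer's weights with matching zero columns.
new_net[1].weight.set_data(
    mx.nd.concat(old_net[1].weight.data(), mx.nd.zeros((2, 2)), dim=1))
new_net[1].bias.set_data(old_net[1].bias.data())

# The two networks now compute the same function.
x = mx.nd.random.uniform(shape=(1, 4))
print((old_net(x) - new_net(x)).abs().max())  # ~0
```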
@lu4

```python
import mxnet as mx
from mxnet import gluon

# Create a layer
net = gluon.nn.Dense(2, in_units=100, use_bias=True)
net.initialize()

# Update the weights and bias of the layer
net.weight.set_data(mx.nd.ones((2, 100)))
net.bias.set_data(mx.nd.ones((2,)))

# Run a forward pass with the new parameters
net(mx.nd.ones((1, 100)))
```
To expand on that warning: you'd need to initialize the network at the maximum size first, because you can't reshape the parameters afterwards, but you can indeed fill them with weights padded with zeros. @indhub Could you please close the issue? Thanks!
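A sketch of that fixed-maximum-size workaround (MAX_UNITS and all sizes here are hypothetical, not from the thread):

```python
import mxnet as mx
from mxnet import gluon

MAX_UNITS = 8  # hypothetical upper bound, chosen in advance

# Allocate the layer at its maximum size up front...
net = gluon.nn.Dense(MAX_UNITS, in_units=100, use_bias=True)
net.initialize()

# ...and start with only 2 "active" units; the remaining rows stay zero
# and can be overwritten later without any reshaping.
w = mx.nd.zeros((MAX_UNITS, 100))
w[:2] = mx.nd.random.uniform(shape=(2, 100))
net.weight.set_data(w)
net.bias.set_data(mx.nd.zeros((MAX_UNITS,)))
```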
I'm trying to implement my own optimization algorithm for MXNet (Imperative / Gluon) that does not rely on gradients. My question is pretty simple: is there a simple way to create a new nn.Dense(...) layer initialized with parameters (i.e. biases and weights) represented by just two nd.array() instances? Thank you in advance!