This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
I created a dummy `Block`, which takes a 2D array, performs a 2D convolution, and feeds the convolved output to a fully connected layer. I tested `DummyBlock` with a simple training script; besides the fact that it doesn't do anything useful, it runs fine without any error.

When I transpose `x` and feed it into the fully connected layer, it fails after the first batch and gives the following error message:

```
Traceback (most recent call last):
  File "/Users/jdchoi/workspace/elit/elit/component/postag.py", line 519, in <module>
    trainer.step(data.shape[0])
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/mxnet/gluon/trainer.py", line 147, in step
    %(param.name, str(data.context)))
UserWarning: Gradient of Parameter `dummyblock0_conv0_weight` on context cpu(0) has not been updated by backward since last `step`. This could mean a bug in your model that made it only use a subset of the Parameters (Blocks) for this iteration. If you are intentionally only using a subset, call step with ignore_stale_grad=True to suppress this warning and skip updating of Parameters with stale gradient
```

In fact, it gives the same error message if I make a copy of `x` and pass it to the fully connected layer. When I reshape `x` and copy the transposed values into it, it runs fine, but that approach is very hacky and not efficient.

Could someone explain to me why the first two approaches fail? I often need to transpose the output of the convolution (or even concatenate another vector with it) and feed the result into the next layer, so it would be great to know whether this can be done with Gluon. Thank you.

The text was updated successfully, but these errors were encountered:

This is a question that is preferably asked on discuss.mxnet.io. To answer it: you cannot call `asnumpy()` in the middle of a computational graph, because autograd can only record operations performed on NDArrays.