-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Add grad_req parameter to Block that is passed to ParameterDict.get #7261
Conversation
You can pass in params when creating block or modify grad_req after block is created |
Sure, but thats in my opinion just a workaround. In that case one needs to manually create all parameter arrays. The alternative way requires an extra block.collect_params() and then modifying the grad_req attribute of all parameter arrays, which doesn't feel nice/polished to me either. I think setting grad_req should directly be supported by the mxnet API when creating a Block. |
There are too many attributes for each parameter and too many parameters. Adding one argument for each case is not sustainable. conv.weight.xxx = yyy is a better solution |
Arguably though the library should take care of this and one shouldn't rely on the user setting it for each parameter separately. In the end then the complexity will be on the userside and not in the library. But the role of the library is to hide the complexity. Another workaround may be to specify a default |
you can do it with |
we can also add a setattr method to parameterdict that sets the attributes for each parameter |
The idea behind this PR is that as Blocks can be arbitrarily composed, and as each block abstracts away the inner complexity (e.g. of an RNN Cell or a whole Network) the user may not know which parameters are initialized by the Block, which names they have, etc. Also changing the inner workings of a block (e.g. adding more operators, introducing extra parameters) should at best not make any changes outside of the block necessary. Your workaround does not fulfil these criteria as far as I understand. So there should be some way to specify the |
Regarding setattr for parameterdict, it should work if we always want to change the attributes of all parameters. I am wondering if there is a case in which the user may want to create a Block, that internally uses (lets say) 2 Blocks but each of them has a different grad_req. In that case, with the API introduced by this PR, the user may add 2 parameters to the init function of his block, and pass the respective one to the Blocks that he uses to compose his block. On the other hand for setattr one could introduce a prefix filter, that sets only the attributes of parameters whose name starts with the given prefix, which would result in the same behaviour. |
I reverted the grad_req parameter and added the setattr function. If you reopen the PR github will show the changes. |
python/mxnet/gluon/parameter.py
Outdated
def setattr(self, attr, value, prefix=""): | ||
"""Sets p.attr = value for all params p if their name starts with prefix. | ||
|
||
Can be useful to change the grad_req attribute of a set of parameters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add explanation for arguments
Parameters
-----------
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the documentation. Please let me know if you see any issues.
To match the behaviour of setattr.
This needs some refactoring to support batch norm. Currently parameters with |
This seems too complicated. It's easier to write a 3 line for loop than reading the comment and try to understand what condition does. I think we should just avoid this problem by making sure aux parameters always have grad_req=null. I'll probably add a marker like is_aux_state and ignore grad_req setting |
closed since an alternative fix has been merged |
This allows specifying the grad_req for Blocks that is then used when creating Parameters.