Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Add grad_req parameter to Block that is passed to ParameterDict.get #7261

Closed
wants to merge 5 commits into from

Conversation

leezu
Copy link
Contributor

@leezu leezu commented Jul 31, 2017

This allows specifying the grad_req for Blocks that is then used when creating Parameters.

@piiswrong
Copy link
Contributor

You can pass in params when creating block or modify grad_req after block is created

@piiswrong piiswrong closed this Jul 31, 2017
@leezu
Copy link
Contributor Author

leezu commented Jul 31, 2017

Sure, but thats in my opinion just a workaround. In that case one needs to manually create all parameter arrays. The alternative way requires an extra block.collect_params() and then modifying the grad_req attribute of all parameter arrays, which doesn't feel nice/polished to me either. I think setting grad_req should directly be supported by the mxnet API when creating a Block.

@piiswrong
Copy link
Contributor

There are too many attributes for each parameter and too many parameters. Adding one argument for each case is not sustainable.

conv.weight.xxx = yyy is a better solution

@leezu
Copy link
Contributor Author

leezu commented Jul 31, 2017

Arguably though the library should take care of this and one shouldn't rely on the user setting it for each parameter separately. In the end then the complexity will be on the userside and not in the library. But the role of the library is to hide the complexity.

Another workaround may be to specify a default grad_req in ParameterDict. Thereby one can leverage the existing code that handles the passing of the self.params object and neverthelss integrate grad_req nicely in the API.

@piiswrong
Copy link
Contributor

you can do it with
params = ParameterDict('conv0')
params.get('weight', grad_req = 'null')
conv0 = nn.Conv2D(prefix='conv0', params=params)

@piiswrong
Copy link
Contributor

we can also add a setattr method to parameterdict that sets the attributes for each parameter

@leezu
Copy link
Contributor Author

leezu commented Jul 31, 2017

The idea behind this PR is that as Blocks can be arbitrarily composed, and as each block abstracts away the inner complexity (e.g. of an RNN Cell or a whole Network) the user may not know which parameters are initialized by the Block, which names they have, etc. Also changing the inner workings of a block (e.g. adding more operators, introducing extra parameters) should at best not make any changes outside of the block necessary. Your workaround does not fulfil these criteria as far as I understand.

So there should be some way to specify the grad_req without destroying the abstraction that the introduction of Blocks allowed us to have.

@leezu
Copy link
Contributor Author

leezu commented Jul 31, 2017

Regarding setattr for parameterdict, it should work if we always want to change the attributes of all parameters. I am wondering if there is a case in which the user may want to create a Block, that internally uses (lets say) 2 Blocks but each of them has a different grad_req.

In that case, with the API introduced by this PR, the user may add 2 parameters to the init function of his block, and pass the respective one to the Blocks that he uses to compose his block.

On the other hand for setattr one could introduce a prefix filter, that sets only the attributes of parameters whose name starts with the given prefix, which would result in the same behaviour.

@leezu
Copy link
Contributor Author

leezu commented Jul 31, 2017

I reverted the grad_req parameter and added the setattr function. If you reopen the PR github will show the changes.

@piiswrong piiswrong reopened this Jul 31, 2017
def setattr(self, attr, value, prefix=""):
"""Sets p.attr = value for all params p if their name starts with prefix.

Can be useful to change the grad_req attribute of a set of parameters.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add explanation for arguments

Parameters
-----------
...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the documentation. Please let me know if you see any issues.

@leezu
Copy link
Contributor Author

leezu commented Aug 1, 2017

This needs some refactoring to support batch norm. Currently parameters with grad_req == 'null' are also overwritten, which leads the python interface to diverge from the true grad_req.

@piiswrong
Copy link
Contributor

piiswrong commented Aug 1, 2017

This seems too complicated. It's easier to write a 3 line for loop than reading the comment and try to understand what condition does.

I think we should just avoid this problem by making sure aux parameters always have grad_req=null. I'll probably add a marker like is_aux_state and ignore grad_req setting

@piiswrong piiswrong closed this Aug 30, 2017
@piiswrong
Copy link
Contributor

closed since an alternative fix has been merged

@leezu leezu deleted the gluon_grad_req branch September 28, 2020 18:34
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants