Commit
Change: refactor skorch for more consistency when adding custom modules etc. (#751)
* Don't reinitialize uninitialized net because of set_params

  Previously, when a parameter on, say, the module was changed via set_params (e.g. net.set_params(module__hidden_units=123)), set_params would always trigger (re-)initialization of the module. However, when the net was not initialized in the first place, this is unnecessary. It is sufficient to set the new attribute and wait for the net to be initialized later. Fortunately, this change doesn't seem to have any further impact, i.e. we didn't implicitly rely on this behavior anywhere. The only exceptions are two tests in test_cli.py, but those are easily adjusted, and the change shouldn't have any user impact.

* Simplify initialize_* methods

  These methods had become complicated because they did all of the following:

  1. check whether there is anything to initialize at all
  2. print a message about the reason for a potential re-initialization
  3. move to the device

  That made it quite difficult to override them without forgetting some aspect. With this change, there are now corresponding _initialize_* methods that are called by net.initialize() and net.set_params. These new methods take care of the points above and call the initialize_* methods inside. Now we can more easily make sure that the user can override initialize_* without anything important being forgotten.

* Further clean up of set_params re-initialization

  Removed code for states that could not be reached because of virtual params. This simplifies the logic considerably.

* Rework logic of creating custom modules/optimizers

  So far, the logic for creating custom modules or optimizers was separate from the logic that created the default module, criterion, and optimizer. E.g., the prefixes_ attribute was prefilled with 'module_', 'criterion_' and 'optimizer_'. This made dealing with custom modules/optimizers (e.g. creating a second module called 'mymodule_') more difficult, because the logic for treating those was completely disjoint from the logic for the default components.

  This change removes most of the "special status" of module/criterion/optimizer. The logic for treating them is now the same as for any custom module: for instance, they are no longer pre-registered but are only registered later, during their initialize_* methods. The registration was moved there because during __init__, we don't actually know yet whether we are dealing with a module or an optimizer (the argument passed as 'module' can, for instance, be a function, so we cannot type check). During initialize, when the actual instances are created, we can check whether we are dealing with an nn.Module or an optim.Optimizer, and if so, we register them.

  So overall, the logic and role of initialize have changed. Users are expected to set custom modules/optimizers inside the respective initialize_* methods from now on (stricter checks and doc updates will be added); see the sketch below. This means we no longer need to rely on the attribute name to infer a component's role (remember that previously, a custom module's attribute name had to contain the substring 'module', which was an ugly restriction). As more of a side effect of these changes, the _check_kwargs call was moved to initialize as well, since we cannot really check for faulty kwargs as long as we don't know which modules and optimizers will be registered.
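The convention from the last point might look roughly like this. This is only a sketch: MyModule, MyNet, and the 'mymodule' prefix are made-up names, and get_params_for is the skorch helper for collecting arguments with a given prefix.

```python
from torch import nn
from skorch import NeuralNet


class MyModule(nn.Module):
    """Hypothetical custom module, for illustration only."""
    def __init__(self, hidden_units=10):
        super().__init__()
        self.lin = nn.Linear(hidden_units, hidden_units)

    def forward(self, X):
        return self.lin(X)


class MyNet(NeuralNet):
    def initialize_module(self):
        super().initialize_module()  # sets up self.module_ as usual
        # Create the custom module during initialize_module and store it
        # under an attribute name ending in an underscore; since the
        # instance is an nn.Module, the net can register it by type
        # instead of matching on the attribute name.
        kwargs = self.get_params_for('mymodule')
        self.mymodule_ = MyModule(**kwargs)
        return self
```

Once registered like this, net.set_params(mymodule__hidden_units=20) addresses the custom module the same way module__* parameters address the default one.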
* Add battery of tests for custom modules/optimizers

  Right now, there is a big hole in the treatment of custom modules/optimizers that distinguishes them from the assumed ones ('module', 'criterion', 'optimizer'). This battery of unit tests covers behaviors that currently fail but really shouldn't:

  - custom module parameters should be passed to the optimizer
  - set_params on a custom module should trigger re-initialization of criterion and optimizer
  - set_params on a custom criterion should trigger re-initialization of the optimizer
  - custom modules and criteria are currently not automatically moved to CUDA

* Remove _PYTORCH_COMPONENTS global

  Since custom components are no longer matched by name, this global became obsolete.

* All optimizers perform updates automatically

  Before this, only the default optimizer_ performed updates and all others were ignored. With this change, zero_grad and step are called on all optimizers automatically.

* Fix corner case with pre-initialized modules

  This case still had to be covered: when the module/criterion is already initialized and none of its parameters changed, initialize_module/criterion was not called. But what if a custom module/criterion does need to be initialized? In that case, not calling initialize_module/criterion is bad. With this fix, that no longer happens; tests were added to cover it. To achieve this, we unfortunately had to push the check of whether module/criterion is already initialized down from _initialize_module/criterion into initialize_module/criterion. There was no other way, since at first we cannot know which attributes are modules/criteria. For the user, this means a little more work if they want to implement initialize_module/criterion absolutely correctly. That's not so bad, though, because it only matters if the user wants to combine pre-initialized modules/criteria with custom modules/criteria, which should be rare in practice. And even then, the user can just copy the default skorch code and end up with a correct implementation.

* Custom modules are set to train/eval mode

  Until now, only module_ and criterion_ were automatically set to training/evaluation mode; now custom modules are set automatically as well. This is implemented through a new method, net._set_training. It is private for now; maybe consider adding a public one later. The name could also be changed to "train" as in PyTorch, but that name could be confusing.

* Reviewer comment: Consider virtual params

  I did not correctly handle virtual params with custom optimizers; this has been fixed now. The ambiguous 'lr' parameter is only associated with the default 'optimizer', not with any custom optimizer, which must be addressed via 'myoptimizer__lr' (see the sketch after the motivation text below).

For completeness, the text from the PR is copied below.

Motivation

The initial reason why I wanted to work on this is that I'm currently implementing a gpytorch integration (see this branch). For this, a big part is adding a new custom module called "likelihood". Doing this correctly was actually not trivial and required a lot of more or less duplicated code. Putting such a burden on a user with less skorch experience would not be possible. The main reason for this difficulty is that module, criterion, and optimizer have been treated as "special" so far: we assume that they are already there and build everything else around them. If a custom module is added, the user needs to be aware of all the places where this is relevant, which is too error prone.
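To make the virtual params point above concrete, here is a hedged sketch of a second, custom optimizer, building on the hypothetical MyNet from the earlier sketch. get_params_for_optimizer is the skorch helper for collecting prefixed optimizer arguments; the 'myoptimizer' prefix is again made up.

```python
from torch import optim


class MyNetTwoOpt(MyNet):
    def initialize_optimizer(self, *args, **kwargs):
        # the default optimizer_ is created as before
        super().initialize_optimizer(*args, **kwargs)
        # a second optimizer for the custom module's parameters; this
        # collects all 'myoptimizer__*' arguments, including the virtual
        # 'myoptimizer__lr'
        named_params = self.mymodule_.named_parameters()
        pgroups, kw = self.get_params_for_optimizer('myoptimizer', named_params)
        kw.setdefault('lr', 0.01)  # SGD requires an lr on older torch versions
        self.myoptimizer_ = optim.SGD(*pgroups, **kw)
        return self


# 'lr' stays an alias for the default optimizer only:
# net.set_params(lr=0.05)                # affects optimizer_
# net.set_params(myoptimizer__lr=0.001)  # affects myoptimizer_
```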
Previous work

Some changes to facilitate adding custom modules were already implemented in #597. However, they don't go far enough.

Main changes

With this PR, we remove the special status of module, criterion, and optimizer. Instead, all the work that needs to be done when adding any of them to the net is implemented in a generic manner. This way, custom modules etc. can re-use the same functionality and can therefore expect to be treated the same as these "first class" components. Here is a list of changes made to that effect:

* Until now, custom module parameters were not trained by the optimizer; now they are.
* Until now, custom modules/criteria were not automatically moved to the indicated device; now they are.
* Until now, custom modules/criteria were not automatically set into train/eval mode; now they are.
* Simplified the implementation of initialize_module et al. They contained a lot of logic that was irrelevant to the user, like messaging about why something was re-initialized; this is now handled inside the newly added methods _initialize_module etc., which are called by initialize() and shouldn't bother the user.
* Adding a custom module no longer requires the attribute name to contain the substring "module" (which was really not a nice solution); the same goes for criterion and optimizer.
* The re-initialization logic was changed: when any module is changed (via set_params), this triggers re-initialization of all modules, criteria, and optimizers; when any criterion is changed, this triggers re-initialization of all optimizers (but not modules). This is a bit defensive, since it can trigger unnecessary inits, but it's better than missing any inits.

Additions

* There is now a get_learnable_params method on the net to retrieve all learnable parameters (instead of just those of module_). It is meant to be overridable by the user, e.g. when they have two optimizers for two modules; see the sketch after these lists.
* Added the attributes modules_, criteria_, and optimizers_ to the net to keep track of those components. They first started out as OrderedDicts to mirror nn.Module, but that was flaky, as the values in the dict would often get out of sync with the attributes on the net.
* If the criterion/criteria have learnable params, those are now passed to the optimizer as well (think GANs).

Minor changes

* net.set_params(...) no longer initializes the net if it's not yet initialized; this was simply unnecessary and could lead to unexpected behavior.
* Custom module instances now need to be set inside initialize_module (and the attribute name must end in an underscore), else the user gets an appropriate error message; the same logic applies to criterion and optimizer.
* Added a bunch of unit tests for custom modules etc. that cover the cases not covered so far.
* Checking of kwargs is now done during initialize, not during __init__ anymore, since at __init__ time we don't yet know which custom modules might exist.
* _check_kwargs is now also run during set_params; previously, this was a loophole that allowed users to set params with typos etc.
* Two unconditional print statements are now conditioned on the verbosity level.
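The two-optimizer use case from the Additions could look roughly as follows. The exact signature and return type of get_learnable_params are assumptions based on the description above; the point is only that the default optimizer, which otherwise receives the parameters of all registered modules and criteria, can be restricted when a second optimizer (the hypothetical myoptimizer_ from the earlier sketch) already covers mymodule_.

```python
class TwoOptimizerNet(MyNetTwoOpt):
    def get_learnable_params(self):
        # Assumed to yield (name, parameter) pairs destined for the
        # default optimizer. Restrict it to module_'s parameters, since
        # mymodule_ is trained by myoptimizer_ instead.
        return self.module_.named_parameters()
```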
Notes

I took extra effort to write the code as clearly as possible and to add lots of comments, since this touches some complicated parts of the code base. But if something is not obvious, please tell me so that I can improve the code while it's still fresh in my mind.

You will see that a few of the existing tests have been changed to call initialize on the net when previously they didn't. The reason is that some work, like checking kwargs, has moved to initialize. Also, you will see that some tests now use mocked modules to check for device calls. I found this preferable to actually moving to 'cuda', since it also works without a CUDA device (e.g. during CI).
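The mocking approach mentioned above might look roughly like this. This is a sketch, not one of the actual tests; whether .to is called with a string or a torch.device is an internal detail, so only the fact of the call is asserted.

```python
from unittest.mock import patch

from torch import nn
from skorch import NeuralNet


def test_module_is_moved_to_device():
    module = nn.Linear(20, 2)
    # return_value=module keeps initialization working on the real module
    # even though no actual device transfer takes place
    with patch.object(module, 'to', return_value=module) as mocked_to:
        net = NeuralNet(module, criterion=nn.MSELoss, device='cuda')
        net.initialize()
    assert mocked_to.called  # the net asked the module to move to 'cuda'
```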