New mixin-based template structure #1092
Conversation
@AntonioCarta There is also another point regarding plugins: we can either have one general base plugin class that contains all possible callbacks that trigger plugins, or we can extend it. What do you think about the extension of plugins? We should also consider that we might need to update it for every new "common" template as well.
This doesn't make much sense from an inheritance perspective. Children can add callbacks but they cannot remove them, i.e. they need to support all the callbacks of their parents. Notice that this is what we do now, where …
# ==================================================================> NEW

def maybe_adapt_model_and_make_optimizer(self):
We don't need this additional method. It's already covered by `_before_training_exp`. In a subclass you would do something like:

    def _before_training_exp(...):
        # do stuff like model adapt and optimizer init
        # can also define new callbacks here if needed
        super()._before_training_exp(...)
class OnlineObservation:
    def _train_exp(
doesn't look specific to OCL. Can we move it to BaseSGD or up the hierarchy?
Here we don't have a loop over `train_epochs`, unlike `batch_observations`.
self.loss = 0

# Fast updates
self._before_fast_update(**kwargs)
You also need the `fast_update` method between the before/after calls. Same problem for the slow update.
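The before/update/after ordering the comment asks for could look like the following minimal sketch. All class and method names here are hypothetical stand-ins, not the actual Avalanche API; the point is only that the real update step must run between the two trigger calls:

```python
class FastUpdateSketch:
    """Illustrates the before/update/after callback pattern (names assumed)."""

    def __init__(self):
        self.calls = []  # record call order so we can inspect it

    def _before_fast_update(self, **kwargs):
        self.calls.append("before_fast_update")

    def _fast_update(self, **kwargs):
        # the actual parameter update would happen here
        self.calls.append("fast_update")

    def _after_fast_update(self, **kwargs):
        self.calls.append("after_fast_update")

    def run(self, **kwargs):
        self._before_fast_update(**kwargs)
        self._fast_update(**kwargs)  # the step the review says is missing
        self._after_fast_update(**kwargs)
```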
from avalanche.training.utils import trigger_plugins


class BaseGeneralSGDTemplate(BaseTemplate):
Why not call it just `SGDTemplate`? It's taking an optimizer as an argument, so it must be some kind of SGD-based training.
Alternatively, you can call it `IterativeTemplate`. In this case, I would consider removing the `optimizer` from this class and pushing it down the hierarchy.
I've added the prefix …
avalanche/NEW_core.py (Outdated)
# ====================================================================> NEW

def before_inner_updates(
Why do we have inner/outer updates here? SGD doesn't have an inner/outer loop.
This is what I discussed previously. The new base SGD template is supposed to be the base class for all possible sorts of SGD-based strategies, and the meta-learning template is one of them (as defined in `common_templates.py`). So the idea is to add all callbacks that can be triggered by any of the SGD-based templates. Or are you suggesting splitting it into a separate plugin class for meta-learning-based strategies?
self.rough_sz = math.ceil(bsize_data / self.n_inner_updates)
self.meta_losses = [0 for _ in range(self.n_inner_updates)]

def single_inner_update(self, x, y, t, criterion):
Plugins usually implement only before/after methods.
I explained this in the new plugin comments
avalanche/NEW_core.py (Outdated)
"""Called before `_inner_updates` by the `BaseTemplate`."""
pass

def inner_updates(
This is not a before/after method; it shouldn't be here.
That's true, but the reason I added the `inner_updates` callback as a plugin trigger is that, unlike supervised strategies, we don't have any "Naive" type of updates that I can set as the default for meta-learning-based strategies.
More precisely, in supervised strategies we have the `training_epoch` function that is implemented in its most basic form (which is naive fine-tuning), and you can augment it by adding new plugins. We don't have a general structure similar to `training_epoch` for inner updates in meta-learning, and it can be completely different from method to method. That's why I added it as a plugin trigger that has to be implemented by the user. Do you have other suggestions?
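The distinction described here, a sensible default for supervised epochs versus a mandatory override for meta-learning inner updates, could be sketched as follows. Both classes and their method bodies are simplified hypothetical stand-ins, not the real template code:

```python
class SupervisedSketch:
    """training_epoch has a reasonable default: plain naive fine-tuning."""

    def training_epoch(self, batches):
        # stand-in for the usual forward/backward/optimizer-step loop
        return [sum(batch) for batch in batches]


class MetaLearningSketch:
    """No universal default exists for the inner loop, so users must supply one."""

    def inner_updates(self, batches):
        raise NotImplementedError(
            "meta-learning strategies must implement their own inner updates"
        )
```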
avalanche/NEW_core.py (Outdated)
"""Called before `_outer_updates` by the `BaseTemplate`."""
pass

def outer_update(
Same as `inner_update`; it shouldn't be here.
Same answer as the previous one.
# ==================================================================> NEW

def maybe_adapt_model_and_make_optimizer(self):
Why do we need this method? Strategies can override `_before_training_exp` for the same behavior.
If you need a call that can be implemented by `observation_type`, I would use a more general name, something like `prepare_training_exp` or similar.
Use [] if you do not want to evaluate during training.
:param kwargs: custom arguments.
"""
if eval_streams is None:
L19-L26 should be factored out of this method since it's common for all strategies
The loop over `training_epochs` and the plugin triggers can be different in different observations; that's why I explicitly defined them here.
Then it's not clear to me what is the difference between observation type and templates. I expected templates to modify the loop and observation types to only add new attributes/methods that are called by the loop.
Do we have a clear definition of template/observation/problem/update? For example a clear interface that they must specify
As mentioned in the comments, let me know your opinion about how to create a more "general" plugin class that would cover all possible callbacks. Thanks!
I think you are breaking the inheritance hierarchy with this approach. For example, in your general template you have inner and outer loops, but these do not make sense for some of the child classes. This is undesirable for many reasons. To give a more concrete example: imagine that you want to create a hierarchy of animals (`BaseAnimal` is the parent, `FlyingAnimal` and `SeaAnimal` the children). You are saying that all the animals should implement both …
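The animal analogy can be made concrete with a short sketch: once every callback lives in the base class, each child carries methods that make no sense for it, because inheritance lets children add behavior but never remove it. All names below are illustrative:

```python
class BaseAnimal:
    """If both fly() and swim() live here, every subclass inherits both."""

    def fly(self):
        raise NotImplementedError

    def swim(self):
        raise NotImplementedError


class FlyingAnimal(BaseAnimal):
    def fly(self):
        return "flying"
    # swim() remains part of the interface even though it is meaningless here


class SeaAnimal(BaseAnimal):
    def swim(self):
        return "swimming"
    # likewise fly() cannot be removed, only left unimplemented
```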
@AntonioCarta the updates are made according to our last discussion.

self._after_training_iteration(**kwargs)
# Should be implemented in Update Type
raise NotADirectoryError()
Should be `NotImplementedError`.
@@ -271,8 +216,142 @@ def eval_epoch(self, **kwargs):

self._after_eval_iteration(**kwargs)

# ==================================================================> NEW

def maybe_adapt_model_and_make_optimizer(self):
What is this method? Why do we need it? We already have `before_training_exp`, which covers model adaptation and optimizer init.
Maybe change the name to `prepare_for_observation`, or something similar. It doesn't have to be about model adaptation or optimizers.
class MetaUpdate:
    def training_epoch(self, **kwargs):
Can we use the same `training_epoch` method for all and define only the iteration here?
Because the epoch is the same; what changes is the single iteration.
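Sharing one `training_epoch` loop while varying only the per-iteration step is the classic template-method pattern. A minimal sketch, with hypothetical names to avoid implying this is the real template code:

```python
class EpochLoopSketch:
    """One shared epoch loop; update types override only training_iteration."""

    def training_epoch(self, batches):
        # the loop itself is identical for every update type
        return [self.training_iteration(batch) for batch in batches]

    def training_iteration(self, batch):
        raise NotImplementedError


class SGDUpdateSketch(EpochLoopSketch):
    def training_iteration(self, batch):
        return ("sgd", batch)  # stand-in for a plain SGD step


class MetaUpdateSketch(EpochLoopSketch):
    def training_iteration(self, batch):
        return ("meta", batch)  # stand-in for a meta-learning step
```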
@HamedHemati can you explain the difference between a template and an observation type? Update and problem type are clear to me. I think there isn't a clear separation between template and observation.
Update:
…o new_templates (Conflicts: examples/lamaml_cifar100.py)
…o new_templates (Conflicts: avalanche/evaluation/metrics/images_samples.py, avalanche/training/supervised/lamaml.py, avalanche/training/templates/base_online_sgd.py, avalanche/training/templates/base_sgd.py, avalanche/training/templates/online_supervised.py, avalanche/training/templates/supervised.py, examples/online_naive.py, tests/training/test_online_strategies.py)
I'm creating this draft PR for the new mixin-based template structure.
For now, the main goal is to implement a working version of the templates that we want and to get feedback to improve the structure's simplicity and fix potential bugs.
Current status and next steps:
The `Naive` and `OnlineNaive` strategies have been tested and both work fine.

Here is a summary of the changes that I've made:
- I've added `BaseGeneralSGDTemplate`, which combines the callbacks from the existing `BaseSGDTemplate` and `SupervisedTemplate` in order to create a more general template.
- Templates are composed as `NewTemplate(ObservationType, ProblemType, UpdateType, BaseGeneralSGDTemplate)`.
- In `common_templates.py` I've added the templates for supervised and online supervised strategies, and inside `strategy_wrappers_temp.py` you can see the `Naive` and `OnlineNaive` strategies.
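The mixin composition described above relies on Python's method resolution order and cooperative `super()` calls. A stripped-down sketch of how such a stack resolves (all class bodies are placeholders, not the real Avalanche code):

```python
class BaseGeneralSGDTemplate:
    def train(self):
        return ["base"]  # end of the cooperative chain


class ObservationType:
    def train(self):
        # each mixin contributes its piece, then delegates along the MRO
        return ["observation"] + super().train()


class ProblemType:
    def train(self):
        return ["problem"] + super().train()


class UpdateType:
    def train(self):
        return ["update"] + super().train()


class NewTemplate(ObservationType, ProblemType, UpdateType, BaseGeneralSGDTemplate):
    """MRO: NewTemplate -> ObservationType -> ProblemType -> UpdateType -> base."""
```

Calling `NewTemplate().train()` walks the MRO left to right, so each mixin runs exactly once before the base template; this is why the order of the bases in the composition matters.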