
New mixin-based template structure #1092

Merged
merged 16 commits into from Oct 20, 2022

Conversation

HamedHemati
Collaborator

I'm creating this draft PR for the new mixin-based template structure.
For now, the main goal is to implement a working version of the templates we want and to get feedback to improve the structure's simplicity and fix potential bugs.

Current status and next step:

  • Current changes are completely separated from the main templates and do not affect existing strategies.
  • I've tested the current changes for Naive and OnlineNaive strategies and both work fine.
  • Next step: add meta-learning template.

Here is a summary of the changes that I've made:

  • I added a new template called BaseGeneralSGDTemplate that combines the callbacks from the existing BaseSGDTemplate and SupervisedTemplate in order to create a more general template.
  • As discussed in ADD Meta Learning Template #1079 a new template can be implemented with the following structure:
    NewTemplate(ObservationType, ProblemType, UpdateType, BaseGeneralSGDTemplate)
  • In common_templates.py I've added the templates for supervised and online supervised strategies and inside strategy_wrappers_temp.py you can see Naive and OnlineNaive strategies.
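To make the composition concrete, here is a minimal, self-contained Python sketch of the mixin idea. All class bodies are hypothetical stand-ins, not the actual Avalanche code: each axis contributes one mixin, and Python's method resolution order picks up each method from the mixin that defines it.

```python
# Hypothetical stand-ins for the four axes; not the actual Avalanche classes.
class BaseGeneralSGDTemplate:
    def train(self):
        return "base-train"


class OnlineObservation:          # ObservationType
    def observe(self):
        return "online"


class SupervisedProblem:          # ProblemType
    def criterion(self):
        return "cross-entropy"


class SGDUpdate:                  # UpdateType
    def update(self):
        return "sgd-step"


# A concrete template is assembled by listing one mixin per axis;
# Python's MRO resolves each method to the mixin that defines it.
class OnlineSupervisedTemplate(
    OnlineObservation, SupervisedProblem, SGDUpdate, BaseGeneralSGDTemplate
):
    pass


template = OnlineSupervisedTemplate()
```

The design choice here is that the template class itself stays empty: all behavior comes from the mixins, so swapping an axis (e.g. OnlineObservation for a batch observation) yields a new strategy without touching the loop code.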

@coveralls

coveralls commented Jul 12, 2022

Pull Request Test Coverage Report for Build 3284238644

  • 221 of 278 (79.5%) changed or added relevant lines in 44 files are covered.
  • 2 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.3%) to 73.124%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
avalanche/evaluation/metric_definitions.py | 0 | 1 | 0.0%
avalanche/evaluation/metric_utils.py | 0 | 1 | 0.0%
avalanche/evaluation/metrics/checkpoint.py | 0 | 1 | 0.0%
avalanche/evaluation/metrics/confusion_matrix.py | 0 | 1 | 0.0%
avalanche/evaluation/metrics/forgetting_bwt.py | 0 | 1 | 0.0%
avalanche/evaluation/metrics/forward_transfer.py | 0 | 1 | 0.0%
avalanche/evaluation/metrics/gpu_usage.py | 0 | 1 | 0.0%
avalanche/evaluation/metrics/images_samples.py | 0 | 1 | 0.0%
avalanche/evaluation/metrics/labels_repartition.py | 0 | 1 | 0.0%
avalanche/evaluation/metrics/mean_scores.py | 0 | 1 | 0.0%

Files with Coverage Reduction | New Missed Lines | %
avalanche/benchmarks/utils/data_loader.py | 1 | 81.17%
avalanche/training/templates/base_sgd.py | 1 | 89.19%
Totals Coverage Status
Change from base Build 3282871086: -0.3%
Covered Lines: 12932
Relevant Lines: 17685

💛 - Coveralls

@HamedHemati
Collaborator Author

HamedHemati commented Jul 14, 2022

@AntonioCarta There is also another point regarding plugins:
I thought about implementing an existing meta-learning-based strategy as a plugin for SupervisedMetaLearningTemplate and realized that we also need to include inner/outer callbacks to the base plugin class.

We can either have one general base plugin class that consists of all possible callbacks that trigger plugins, or we can extend BaseSGDPlugin as below:
  • BaseSGDPlugin -> SupervisedPlugin (already exists)
  • BaseSGDPlugin -> MetaLearningPlugin (needs to be added)

What do you think about extending the plugins this way? We should also consider that we might need to update it for every new "common" template as well.
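For reference, the second option could be sketched roughly like this. The callback names are illustrative placeholders, not the final API:

```python
# Illustrative callback names only; the real plugin API may differ.
class BaseSGDPlugin:
    def before_training_exp(self, strategy, **kwargs):
        pass

    def after_training_exp(self, strategy, **kwargs):
        pass


class SupervisedPlugin(BaseSGDPlugin):
    """Already exists: inherits the SGD callbacks unchanged."""


class MetaLearningPlugin(BaseSGDPlugin):
    """Would be added: extends the SGD callbacks with inner/outer hooks."""

    def before_inner_updates(self, strategy, **kwargs):
        pass

    def after_outer_update(self, strategy, **kwargs):
        pass
```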

@AntonioCarta
Collaborator

We can either have one general base plugin class that consists of all possible callbacks

This doesn't make much sense from an inheritance perspective. Children can add callbacks but they cannot remove them, i.e. they need to support all the callbacks of their parents.

Notice that this is what we do now, where Base -> BaseSGD -> Supervised progressively add new callbacks.


# ==================================================================> NEW

def maybe_adapt_model_and_make_optimizer(self):
Collaborator

We don't need this additional method. It's already covered by _before_training_exp. In a subclass you would do something like:

def _before_training_exp(...):
    # do stuff like model adapt and optimizer init
    # can also define new callbacks here if needed
    super()._before_training_exp(...)



class OnlineObservation:
def _train_exp(
Collaborator

doesn't look specific to OCL. Can we move it to BaseSGD or up the hierarchy?

Collaborator Author

Here we don't have a loop over train_epochs, unlike in batch observations.

self.loss = 0

# Fast updates
self._before_fast_update(**kwargs)
Collaborator

You also need the fast_update method between the before/after calls. Same problem for the slow update.
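The convention the comment refers to could look like this minimal sketch (hypothetical class and method names; the real template dispatches the before/after triggers through the plugin list):

```python
# Minimal sketch of the before/step/after convention (hypothetical class).
class FastUpdateSketch:
    def __init__(self):
        self.log = []

    def _before_fast_update(self, **kwargs):
        self.log.append("before_fast_update")

    def fast_update(self, **kwargs):
        # the actual update step, which must sit between the two triggers
        self.log.append("fast_update")

    def _after_fast_update(self, **kwargs):
        self.log.append("after_fast_update")

    def run(self, **kwargs):
        self._before_fast_update(**kwargs)
        self.fast_update(**kwargs)
        self._after_fast_update(**kwargs)


sketch = FastUpdateSketch()
sketch.run()
```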

from avalanche.training.utils import trigger_plugins


class BaseGeneralSGDTemplate(BaseTemplate):
Collaborator

Why not call it just SGDTemplate? It takes an optimizer as an argument, so it must be some kind of SGD-based training.

Collaborator

alternatively, you can call it IterativeTemplate. In this case, I would consider removing the optimizer from this class and pushing it down the hierarchy

@HamedHemati
Collaborator Author

  • The meta-update class is updated.
  • @AntonioCarta I applied some of your comments, for the other comments I need to discuss them with you in the next meeting.
  • Working examples for Naive, OnlineNaive, and LaMAML (as a plugin) are added.
  • We also need to discuss potential conflicts that can happen with the distributed training PR from Lorenzo

Note: I've added the prefix NEW_ to the files related to this PR.


# ====================================================================> NEW

def before_inner_updates(
Collaborator

why do we have inner/outer updates here? SGD doesn't have an inner/outer loop

Collaborator Author

This is what I discussed previously. The new base SGD template is supposed to be the base class for all possible sorts of SGD-based strategies, and the meta-learning template is one of them (as defined in common_templates.py). So the idea is to add all callbacks that can be triggered by any of the SGD-based templates. Or are you suggesting splitting it into a separate plugin class for meta-learning-based strategies?

self.rough_sz = math.ceil(bsize_data / self.n_inner_updates)
self.meta_losses = [0 for _ in range(self.n_inner_updates)]

def single_inner_update(self, x, y, t, criterion):
Collaborator

Plugins usually implement only before/after methods.

Collaborator Author

I explained this in the new plugin comments

"""Called before `_inner_updates` by the `BaseTemplate`."""
pass

def inner_updates(
Collaborator

this is not a before/after method, shouldn't be here.

Collaborator Author

That's true, but the reason I added the inner_updates callback as a plugin trigger is that, unlike supervised strategies, we don't have any "Naive" type of updates that I can set as default for meta-learning-based strategies.
More precisely, in supervised strategies we have the training_epoch function that is implemented in its most basic form (which is Naive fine-tuning), and you can augment it by adding new plugins. We don't have such a general structure, similar to training_epoch, for inner updates in meta-learning, and it can be completely different from method to method. That's why I added it as a plugin trigger that has to be implemented by the user. Do you have other suggestions?
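A rough sketch of the proposed mechanism, with hypothetical names throughout: the template delegates the inner loop to a method-specific plugin instead of providing a Naive default.

```python
# Hypothetical names; a stand-in for the real template/plugin API.
class MetaTemplateSketch:
    def __init__(self, plugins):
        self.plugins = plugins
        self.inner_losses = []

    def _inner_updates(self, **kwargs):
        # no sensible Naive default exists, so the loop is delegated
        # entirely to the plugins
        for plugin in self.plugins:
            plugin.inner_updates(self, **kwargs)


class LaMAMLLikePlugin:
    def inner_updates(self, strategy, **kwargs):
        # a method-specific inner loop would go here; this is a stand-in
        strategy.inner_losses.append(0.0)


tmpl = MetaTemplateSketch([LaMAMLLikePlugin()])
tmpl._inner_updates()
```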

"""Called before `_outer_updates` by the `BaseTemplate`."""
pass

def outer_update(
Collaborator

same as inner_update. Shouldn't be here

Collaborator Author

Same answer as the previous one.


# ==================================================================> NEW

def maybe_adapt_model_and_make_optimizer(self):
Collaborator

why do we need this method? strategies can override _before_training_exp for the same behavior

Collaborator

If you need a call that can be implemented by the observation type, I would use a more general name, something like prepare_training_exp or similar.

Use [] if you do not want to evaluate during training.
:param kwargs: custom arguments.
"""
if eval_streams is None:
Collaborator

L19-L26 should be factored out of this method since it's common for all strategies

Collaborator Author

The loop over training_epochs and the plugin triggers can differ between observation types; that's why I defined them explicitly here.

Collaborator

Then it's not clear to me what is the difference between observation type and templates. I expected templates to modify the loop and observation types to only add new attributes/methods that are called by the loop.

Collaborator

Do we have a clear definition of template/observation/problem/update? For example a clear interface that they must specify

Collaborator Author

@HamedHemati left a comment

As mentioned in the comments, let me know your opinion about how to create a more "general" plugin class that would cover all possible callbacks. Thanks!

@AntonioCarta
Collaborator

AntonioCarta commented Aug 23, 2022

let me know your opinion about how to create a more "general" plugin class that would cover all possible callbacks

I think you are breaking the inheritance hierarchy with this approach. For example, in your general template you have inner and outer loops, but these do not make sense for some of the child classes. This is undesirable for many reasons.

To give a more concrete example: imagine that you want to create a hierarchy of animals (BaseAnimal is the parent, FlyingAnimal and SeaAnimal the children). You are saying that all animals should implement both swim and fly (methods in BaseAnimal), even though SeaAnimal will never fly.
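The analogy can be made executable. This small sketch shows why a SeaAnimal inheriting fly from BaseAnimal is problematic: the inherited method exists but can never be used safely.

```python
class BaseAnimal:
    def swim(self):
        raise NotImplementedError

    def fly(self):
        raise NotImplementedError


class SeaAnimal(BaseAnimal):
    def swim(self):
        return "swimming"
    # fly() is inherited but meaningless: code written against
    # BaseAnimal cannot safely call it on a SeaAnimal.


shark = SeaAnimal()
try:
    shark.fly()
    can_fly = True
except NotImplementedError:
    can_fly = False
```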

@HamedHemati
Collaborator Author

@AntonioCarta the updates are made according to our last discussion.


self._after_training_iteration(**kwargs)
# Should be implemented in Update Type
raise NotADirectoryError()
Collaborator

should be NotImplementedError

@@ -271,8 +216,142 @@ def eval_epoch(self, **kwargs):

self._after_eval_iteration(**kwargs)

# ==================================================================> NEW

def maybe_adapt_model_and_make_optimizer(self):
Collaborator

What is this method? Why do we need it? We already have before_training_exp, which covers model adaptation and optimizer init.

Collaborator

Maybe change the name to prepare_for_observation, or something similar. It doesn't have to be about model adaptation or optimizers.



class MetaUpdate:
def training_epoch(self, **kwargs):
Collaborator

Can we use the same training_epoch method for all and define only the iteration here?

Collaborator

Because the epoch is the same; what changes is the single iteration.
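The suggestion could be sketched as follows, with hypothetical minimal classes: the base class owns training_epoch, and MetaUpdate overrides only the single iteration.

```python
# Minimal hypothetical classes illustrating the suggestion.
class BaseUpdate:
    def __init__(self, batches):
        self.batches = batches
        self.seen = []

    def training_epoch(self, **kwargs):
        # the epoch loop is shared by every update type
        for batch in self.batches:
            self.training_iteration(batch, **kwargs)

    def training_iteration(self, batch, **kwargs):
        # default: a plain SGD step
        self.seen.append(("sgd", batch))


class MetaUpdate(BaseUpdate):
    # only the single iteration changes: an inner/outer meta-update
    def training_iteration(self, batch, **kwargs):
        self.seen.append(("meta", batch))


meta = MetaUpdate([1, 2])
meta.training_epoch()
```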

@AntonioCarta
Collaborator

@HamedHemati can you explain the difference between a template and an observation type? update and problem type are clear to me. I think there isn't a clear separation between template and observation.

@HamedHemati
Collaborator Author

Update:

  • Old template files are removed, and all imports are updated
  • _train_exp is removed from observation_type; now it's part of the BaseSGDTemplate again
  • observation_type now only handles model adaptation and optimizer resetting. The behavior is currently different for batch and online updates because of the experience boundary checks (currently specific to OnlineCLExperience)

…o new_templates

# Conflicts:
#	examples/lamaml_cifar100.py

…o new_templates

# Conflicts:
#	avalanche/evaluation/metrics/images_samples.py
#	avalanche/training/supervised/lamaml.py
#	avalanche/training/templates/base_online_sgd.py
#	avalanche/training/templates/base_sgd.py
#	avalanche/training/templates/online_supervised.py
#	avalanche/training/templates/supervised.py
#	examples/online_naive.py
#	tests/training/test_online_strategies.py
@AntonioCarta AntonioCarta marked this pull request as ready for review October 20, 2022 12:24
@AntonioCarta AntonioCarta merged commit 4a54a2a into ContinualAI:master Oct 20, 2022
@HamedHemati HamedHemati deleted the new_templates branch March 16, 2023 11:01