
Conversation

@ryanlindeborg
Contributor

This PR addresses this issue: #212.

It is still a work in progress. @AndreaCossu, would you be able to provide some feedback on whether I am headed in the right direction so far? I still have to figure out the best way to compute the predictions at random initialization, given that at other points in the strategy we just call `self._current_accuracy.update(strategy.mb_y, strategy.logits)` to compute the accuracy at that point.

Also, what is the best way to think about testing an additional metric like this? Thanks for the guidance!

@coveralls

Pull Request Test Coverage Report for Build 716166603

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 76.039%

Totals (Coverage Status)

  • Change from base Build 714569273: 0.0%
  • Covered Lines: 6493
  • Relevant Lines: 8539

💛 - Coveralls

@vlomonaco requested a review from AndreaCossu on April 6, 2021, 08:18
@AndreaCossu
Collaborator

Hi @ryanlindeborg, thank you for the work! You raised two interesting points.

  • How to compute accuracy at random initialization: at the moment I don't think we support it. The user can explicitly run an eval loop before training, but I think we shouldn't rely on this. We currently have a periodic_eval call which evaluates the model after a fixed number of epochs. If we allow this method to be called before training starts as well, we can implement forward transfer easily. The user only has to enable the periodic evaluation (but that's acceptable in my opinion); a rough sketch of the explicit eval-before-training workaround is at the end of this comment. I'll contact @AntonioCarta to see if he agrees with this modification to the training module.
  • How to test PluginMetrics: we were also thinking about it. I opened an issue which I think can help solve this problem: "Add TestPlugin to test metrics and other features within callbacks" #498. For the moment, you can add tests only for the standalone ForwardTransfer metric, as already done for the other standalone metrics.

I'll keep you posted on both points!
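
For reference, the explicit eval-before-training workaround mentioned above looks roughly like this (a minimal sketch using the public train/eval calls; the metric itself should not rely on the user doing this manually):

strategy.eval(scenario.test_stream)  # model is still at its random initialization

for exp in scenario.train_stream:
    strategy.train(exp)
    strategy.eval(scenario.test_stream)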

@AntonioCarta
Collaborator

In general, right now it is not straightforward to implement metrics that assume access to the entire stream. Regarding forward transfer, the main problem is access to future tasks. I think the plugin should:
1 - take as input the entire stream during construction (to have access to future experiences).
2 - call eval inside before_train_exp to compute the accuracy of future experiences.

Ideally, the plugin should keep track of the experiences already used for training and evaluation. For example, if I have:

for i, exp in enumerate(scenario.train_stream):
    strategy.train(exp)

Then, the plugin must call eval inside before_train_exp. However, if I have:

for i, exp in enumerate(scenario.train_stream):
    strategy.eval(scenario.test_stream)  # or scenario.test_stream[i+1:] for future exps
    strategy.train(exp)

Then, the plugin does not need to call eval again. It's not wrong to call it, but it's inefficient.
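
To make the bookkeeping concrete, here is a rough, hypothetical sketch of such a plugin (the class name, hook signatures, and experience-id bookkeeping are illustrative assumptions, not the final Avalanche API): it receives the test stream at construction, remembers which experiences the user's own loop has already evaluated, and only calls eval inside before_train_exp when something is still missing.

class ForwardTransferPluginSketch:
    # Illustrative sketch only: decide whether eval must be called before
    # training on the current experience.

    def __init__(self, test_stream):
        self.test_stream = test_stream   # gives access to future experiences
        self.evaluated_ids = set()       # experiences already evaluated by the user

    def after_eval_exp(self, strategy, exp_id):
        # Record evaluations triggered by the user's own loop, so redundant
        # eval calls can be skipped later.
        self.evaluated_ids.add(exp_id)

    def before_train_exp(self, strategy, current_exp_id):
        # Evaluate only the future experiences that were not already covered.
        pending = [exp for i, exp in enumerate(self.test_stream)
                   if i > current_exp_id and i not in self.evaluated_ids]
        if pending:
            strategy.eval(pending)
        # These accuracies are only valid for this training step.
        self.evaluated_ids.clear()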

@AntonioCarta
Collaborator

Keep in mind that if we go ahead with this solution, I will need to make some changes to allow calling eval inside train. It's not a problem, but you will need to give me some time to work on it.

@vlomonaco marked this pull request as a draft on April 20, 2021, 08:48
@vlomonaco added the Evaluation label (Related to the Evaluation module) on Jun 6, 2021
@AndreaCossu
Collaborator

Sorry for the long delay, @ryanlindeborg. I collected your changes (so that your contribution is not lost) in PR #695 and converted your code to comply with the changes made to the metrics over the last months. Feel free to check out that PR and suggest any changes. Thanks again for your effort 😄
The discussion about forward transfer can continue in #695.
