
Metrics and EvalProtocol API #60

Closed
AntonioCarta opened this issue May 20, 2020 · 6 comments

@AntonioCarta (Collaborator)

Metrics and EvalProtocol are a little bit unclear to me.

  • What is EvalProtocol's job? Most of the code implements Tensorboard logging operations, but the name hints at something more than that.
  • Right now metrics do not have a uniform API, and each one takes different arguments for its compute method. Each time we add a new metric, we also have to add a new if branch inside EvalProtocol's get_results.

I would prefer a generic EvalProtocol that controls printing and logging and only delegates the computations to the metrics (e.g. instead of printing inside compute, EvalProtocol calls each metric's __str__ method). I would also prefer to be able to choose where to print the metrics (output file, tensorboard, stdout).
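A minimal sketch of what this could look like, assuming a hypothetical Metric base class with a keyword-based compute signature and pluggable output sinks (none of these names come from the actual codebase):

```python
# Hypothetical sketch; none of these names are the actual Avalanche API.
from abc import ABC, abstractmethod
import sys


class Metric(ABC):
    """A metric only computes and formats its own value."""

    @abstractmethod
    def compute(self, **step_info):
        """Update the metric from whatever step information it needs."""

    @abstractmethod
    def __str__(self):
        """Human-readable result, used by the protocol for printing/logging."""


class EvalProtocol:
    """Decides *where* results go; *what* is computed is delegated to metrics."""

    def __init__(self, metrics, sinks=(sys.stdout,)):
        self.metrics = metrics
        self.sinks = sinks  # e.g. sys.stdout, an open file, a TensorBoard wrapper

    def get_results(self, **step_info):
        for metric in self.metrics:
            metric.compute(**step_info)      # same signature for every metric
            for sink in self.sinks:
                sink.write(str(metric) + "\n")
```

With something like this, adding a metric would not require touching get_results, and switching between stdout, a file, and TensorBoard would only mean passing different sinks.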

@vlomonaco (Member)

Yes, that would be ideal!

@vlomonaco added the Feature - Medium Priority label on May 22, 2020
@AntonioCarta (Collaborator, Author)

Actually, it was more of a question than a proposal. I still don't have a clear idea of how to solve this problem.

@vlomonaco (Member) commented May 26, 2020

I think EvalProtocol's job should be exactly what you said:

I would prefer a generic EvalProtocol that controls printing and logging and only delegates the computations to the metrics (e.g. instead of printing inside compute, EvalProtocol calls each metric's __str__ method). I would also prefer to be able to choose where to print the metrics (output file, tensorboard, stdout).

As for the metrics, as soon as we have all of them implemented we can define a unique signature for the compute method... let's wait a little on this!

@akshitac8 (Collaborator) commented Jun 1, 2020

@vlomonaco Can I work on this? It would also be helpful if you could suggest some initial metrics that are needed. Also, is issue #51 solved?

@AntonioCarta (Collaborator, Author)

#51 is closed.

Regarding the metrics, maybe we can use Flows to define how and when to compute each metric? The problem we have right now is that each metric requires different computations and arguments. This could be easily solved with a callback system, like the flows used for training and test.

Take memory usage (MU) as an example. It should be printed only once; instead, it is printed everywhere because the EvaluationProtocol does not know how to print it:

Training completed
Computing accuracy on the whole test set
Task 0 - CF: 0.0000
Train Task 0 - MU: 0.259 GB
Confusion matrix, without normalization
[Evaluation] Task 0: Avg Loss 0.002227343698385049; Avg Acc 0.9242105484008789
Task 0 - CF: 0.9242
Train Task 0 - MU: 0.265 GB
Confusion matrix, without normalization
[Evaluation] Task 0: Avg Loss 0.06294100686298648; Avg Acc 0.0
Task 0 - CF: 0.9242
Train Task 0 - MU: 0.265 GB
Confusion matrix, without normalization
[Evaluation] Task 0: Avg Loss 0.0536595421147698; Avg Acc 0.0
Task 0 - CF: 0.9242
Train Task 0 - MU: 0.268 GB
Confusion matrix, without normalization
[Evaluation] Task 0: Avg Loss 0.05200100937101282; Avg Acc 0.0
Task 0 - CF: 0.9242
Train Task 0 - MU: 0.270 GB
Confusion matrix, without normalization
[Evaluation] Task 0: Avg Loss 0.06933177224355726; Avg Acc 0.0
Start of step  1

Each metric will probably need to implement only a couple of these methods, while the others stay empty, so implementing new metrics should not be more complex than it is right now.
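A rough sketch of that callback idea (the hook names and the MemoryUsage/Accuracy classes below are made up for illustration): a base class with no-op hooks, so MU only hooks the end of training and is therefore printed once.

```python
# Hypothetical hook names, just to illustrate the callback idea.
import resource  # Unix-only; used here only to get a rough memory figure


class MetricCallbacks:
    """Every callback is a no-op by default; metrics override only what they need."""

    def before_training(self, **kwargs): ...
    def after_training(self, **kwargs): ...
    def after_test_step(self, **kwargs): ...


class MemoryUsage(MetricCallbacks):
    """Reported once per training phase instead of after every test step."""

    def after_training(self, **kwargs):
        # ru_maxrss is in KB on Linux (bytes on macOS).
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print(f"MU: {peak / 1024 ** 2:.3f} GB")


class Accuracy(MetricCallbacks):
    """Only cares about test-time predictions."""

    def __init__(self):
        self.correct, self.total = 0, 0

    def after_test_step(self, y_true, y_pred, **kwargs):
        self.correct += sum(int(t == p) for t, p in zip(y_true, y_pred))
        self.total += len(y_true)
        print(f"Avg Acc: {self.correct / self.total:.4f}")
```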

@vlomonaco (Member) commented Oct 20, 2020

Hi @AntonioCarta, you are totally right. Indeed, the metrics are now called through the EvaluationPlugin!

Since it is a plugin, it can implement all the callbacks independently of the main strategy (all the plugin methods are called before the main strategy methods). This also means that the calls it makes can be fine-tuned to specific metric needs. Does it make sense, or am I missing something?
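For illustration, a toy version of that call ordering (the names below are not the real strategy/plugin interface; they reuse the hypothetical hooks from the earlier sketch):

```python
# Toy illustration of the call ordering described above (hypothetical names).
class ToyStrategy:
    def __init__(self, plugins):
        self.plugins = plugins  # e.g. an EvaluationPlugin-like object

    def train(self, data):
        # Plugin callbacks fire before the strategy's own step...
        for p in self.plugins:
            p.before_training(strategy=self)
        self.training_step(data)
        # ...and again afterwards, so each metric can hook in where it needs to.
        for p in self.plugins:
            p.after_training(strategy=self)

    def training_step(self, data):
        pass  # the actual optimization loop would live here
```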

@vlomonaco added the Evaluation and Feature - High Priority labels and removed the Feature - Medium Priority label on Nov 25, 2020