
Metrics and EvalProtocol API #60

Closed
AntonioCarta opened this issue May 20, 2020 · 6 comments

@AntonioCarta (Collaborator)

Metrics and EvalProtocol are a little bit unclear to me.

  • What is EvalProtocol's job? Most of the code implements Tensorboard logging operations, but the name hints at something more than that.
  • Right now metrics do not have a uniform API, and each one takes different arguments for its compute method. Each time we add a new metric, we also have to add a new if branch inside EvalProtocol's get_results.

I would prefer a generic EvalProtocol that controls printing and logging and only delegates the computations to the metrics (e.g. instead of printing inside compute, EvalProtocol calls each metric's __str__ method). I would also prefer to be able to choose where to print the metrics (output file, tensorboard, stdout).
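A minimal sketch of what this could look like, assuming a hypothetical Metric base class with a keyword-based compute signature and pluggable output sinks (none of these names come from the actual codebase):

```python
# Hypothetical sketch; none of these names are the actual Avalanche API.
from abc import ABC, abstractmethod
import sys


class Metric(ABC):
    """A metric only computes and formats its own value."""

    @abstractmethod
    def compute(self, **step_info):
        """Update the metric from whatever step information it needs."""

    @abstractmethod
    def __str__(self):
        """Human-readable result, used by the protocol for printing/logging."""


class EvalProtocol:
    """Decides *where* results go; *what* is computed is delegated to metrics."""

    def __init__(self, metrics, sinks=(sys.stdout,)):
        self.metrics = metrics
        self.sinks = sinks  # e.g. sys.stdout, an open file, a TensorBoard wrapper

    def get_results(self, **step_info):
        for metric in self.metrics:
            metric.compute(**step_info)      # same signature for every metric
            for sink in self.sinks:
                sink.write(str(metric) + "\n")
```

With something like this, adding a metric would not require touching get_results, and switching between stdout, a file, and TensorBoard would only mean passing different sinks.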

@vlomonaco (Member)

Yes, that would be ideal!

@vlomonaco added the Feature - Medium Priority label on May 22, 2020
@AntonioCarta (Collaborator, Author)

Actually, it was more of a question than a proposal. I still don't have a clear idea of how to solve this problem.

@vlomonaco (Member) commented May 26, 2020

I think EvalProtocol's job should be exactly what you said:

I would prefer a generic EvalProtocol that controls printing and logging and only delegates the computations to the metrics (e.g. instead of printing inside compute, EvalProtocol calls each metric's __str__ method). I would also prefer to be able to choose where to print the metrics (output file, tensorboard, stdout).

As for the metrics, as soon as we have all of them implemented we can define a unique signature for the compute method... let's wait a little on this!

@akshitac8 (Collaborator) commented Jun 1, 2020

@vlomonaco Can I work on this? It would also be helpful if you could suggest some initial metrics that are needed. Also, is issue #51 solved?

@AntonioCarta (Collaborator, Author)

#51 is closed.

Regarding the metrics, maybe we can use Flows to define how and when to compute each metric? The problem we have right now is that each metric requires different computations and arguments. This could be easily solved with a callback system, like the flows used for training and test.

Take memory usage (MU) as an example. It should be printed only once; instead, it is printed everywhere because the EvaluationProtocol does not know how to print it:

Training completed
Computing accuracy on the whole test set
Task 0 - CF: 0.0000
Train Task 0 - MU: 0.259 GB
Confusion matrix, without normalization
[Evaluation] Task 0: Avg Loss 0.002227343698385049; Avg Acc 0.9242105484008789
Task 0 - CF: 0.9242
Train Task 0 - MU: 0.265 GB
Confusion matrix, without normalization
[Evaluation] Task 0: Avg Loss 0.06294100686298648; Avg Acc 0.0
Task 0 - CF: 0.9242
Train Task 0 - MU: 0.265 GB
Confusion matrix, without normalization
[Evaluation] Task 0: Avg Loss 0.0536595421147698; Avg Acc 0.0
Task 0 - CF: 0.9242
Train Task 0 - MU: 0.268 GB
Confusion matrix, without normalization
[Evaluation] Task 0: Avg Loss 0.05200100937101282; Avg Acc 0.0
Task 0 - CF: 0.9242
Train Task 0 - MU: 0.270 GB
Confusion matrix, without normalization
[Evaluation] Task 0: Avg Loss 0.06933177224355726; Avg Acc 0.0
Start of step  1

Each metric will probably need to implement only a couple of these methods, while the others stay empty, so implementing new metrics should not be more complex than it is right now.
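A rough sketch of that callback idea (the hook names and the MemoryUsage/Accuracy classes below are made up for illustration): a base class with no-op hooks, so MU only hooks the end of training and is therefore printed once.

```python
# Hypothetical hook names, just to illustrate the callback idea.
import resource  # Unix-only; used here only to get a rough memory figure


class MetricCallbacks:
    """Every callback is a no-op by default; metrics override only what they need."""

    def before_training(self, **kwargs): ...
    def after_training(self, **kwargs): ...
    def after_test_step(self, **kwargs): ...


class MemoryUsage(MetricCallbacks):
    """Reported once per training phase instead of after every test step."""

    def after_training(self, **kwargs):
        # ru_maxrss is in KB on Linux (bytes on macOS).
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print(f"MU: {peak / 1024 ** 2:.3f} GB")


class Accuracy(MetricCallbacks):
    """Only cares about test-time predictions."""

    def __init__(self):
        self.correct, self.total = 0, 0

    def after_test_step(self, y_true, y_pred, **kwargs):
        self.correct += sum(int(t == p) for t, p in zip(y_true, y_pred))
        self.total += len(y_true)
        print(f"Avg Acc: {self.correct / self.total:.4f}")
```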

@vlomonaco (Member) commented Oct 20, 2020

Hi @AntonioCarta, you are totally right. Indeed, the metrics are now called through the EvaluationPlugin!

Since it is a plugin, it can implement all the callbacks independently of the main strategy (all the plugin methods are called before the main strategy methods). This also means that the calls it makes can be fine-tuned to specific metric needs. Does it make sense, or am I missing something?
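For illustration, a toy version of that call ordering (the names below are not the real strategy/plugin interface; they reuse the hypothetical hooks from the earlier sketch):

```python
# Toy illustration of the call ordering described above (hypothetical names).
class ToyStrategy:
    def __init__(self, plugins):
        self.plugins = plugins  # e.g. an EvaluationPlugin-like object

    def train(self, data):
        # Plugin callbacks fire before the strategy's own step...
        for p in self.plugins:
            p.before_training(strategy=self)
        self.training_step(data)
        # ...and again afterwards, so each metric can hook in where it needs to.
        for p in self.plugins:
            p.after_training(strategy=self)

    def training_step(self, data):
        pass  # the actual optimization loop would live here
```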

@vlomonaco added the Evaluation and Feature - High Priority labels and removed the Feature - Medium Priority label on Nov 25, 2020