
Add early stopping and custom metrics #165

Merged: 10 commits merged into deepset-ai:master on Dec 9, 2019

Conversation

johann-petrak
Contributor

This should fix most of #160.
It allows setting up an EarlyStopping instance that knows which metric to use, where to save the best models, how much patience to have, etc., and passing it on to the trainer.
It also allows registering additional custom metrics functions.
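For context, a minimal sketch of how this could look in user code, assuming EarlyStopping is importable from farm.train and the Trainer accepts an early_stopping argument (both are assumptions to be checked against this PR's diff); the surrounding setup (model, optimizer, data_silo, device) is the usual one from examples/doc_classification.py and is not repeated here:

```python
# Hedged sketch, not taken verbatim from this PR. The parameter names (metric, mode,
# save_dir, patience, early_stopping) follow the PR description above and may differ
# slightly in the actual code.
from farm.train import Trainer, EarlyStopping

earlystopping = EarlyStopping(
    metric="f1_macro",                   # key from the first head's evaluation result
    mode="max",                          # stop when this metric stops increasing
    save_dir="saved_models/best_model",  # where the best checkpoint gets written
    patience=5,                          # evaluations without improvement before stopping
)

trainer = Trainer(
    optimizer=optimizer,                 # optimizer, data_silo, device etc. set up as in
    data_silo=data_silo,                 # examples/doc_classification.py (omitted here)
    epochs=10,
    n_gpu=1,
    evaluate_every=100,
    device=device,
    early_stopping=earlystopping,        # hand the early stopping config to the trainer
)
model = trainer.train(model)             # the best saved model is then used for the test eval
```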

There is a separation between using metrics and reporting, and probably some work is needed there which I have not looked into yet. It may be best to look at what we have now, merge and test if it looks good, or figure out how to improve it if there are still issues.

Allow specification of the metric; use the saved best model for any test evaluation.
Tested at least for text classification; did not test for NER.
Also, still need to check what exactly gets logged now and what else we should log if we do early stopping.
This can use a user-defined metrics function and any metric from the result for early stopping. The early stopping metric can either be a key from the first head's evaluation result or a function that calculates a value from the whole evaluation result.
User-defined metrics functions need to be registered under a string name so they can be accessed easily after storing and re-loading the processor.
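To make the registration part concrete, a hedged sketch follows; the helper name register_metrics, its module farm/metrics.py, and the (preds, labels) signature are assumptions based on the files touched in this PR:

```python
# Hedged sketch: register a custom metrics function under a string name so it can be
# looked up by name again after storing and re-loading the processor.
from sklearn.metrics import f1_score, matthews_corrcoef

from farm.metrics import register_metrics  # assumed helper added in this PR


def my_metrics(preds, labels):
    # Return a dict of named scores. Early stopping can then monitor one of these keys
    # (e.g. "f1_macro") or use a function that computes a value from the whole result.
    return {
        "f1_macro": f1_score(labels, preds, average="macro"),
        "mcc": matthews_corrcoef(labels, preds),
    }


# Register the function once under a string name ...
register_metrics("my_metrics", my_metrics)

# ... and refer to it by that name later, e.g. when creating the processor
# (hypothetical call, mirroring the existing `metric` argument):
# processor = TextClassificationProcessor(..., metric="my_metrics", ...)
```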
@tholor tholor changed the title Issue160 Add early stopping and custom metrics Dec 3, 2019
@tholor tholor self-requested a review December 3, 2019 13:52
@tholor tholor added the enhancement, part: evaluator and part: trainer labels on Dec 3, 2019
Member

@tholor tholor left a comment


Thanks a lot for your work @johann-petrak! Looks good!
I only added a few minor comments / questions.
After resolving the comments and some minor cleanups, we should be good to merge. 🙂

There is a separation between using metrics and reporting and probably some work is needed there [...]

I don't fully understand your comment. Could you please elaborate on what your concern is regarding the reporting? Do you mean that there's now support for custom metrics but not for custom reports?

Resolved review threads: farm/train.py (×2, outdated), farm/metrics.py, examples/doc_classification.py (outdated)
@johann-petrak
Contributor Author

OK, I pushed the changes that address your review.

Ignore my comment about the reporting. I was just confused by the fact that the "report" uses a different way to calculate F1 than what is predefined in the metrics (or could now get defined and registered), but I guess it is good to always have something included that is specific to the kind of head used.

@johann-petrak
Contributor Author

The Travis CI job is failing, but it does not look like it is related to my changes?

@tholor
Member

tholor commented Dec 9, 2019

Great! Thanks a lot for wrapping everything up!

The Travis CI is currently failing only due to low memory on the worker, so it is definitely unrelated to your changes. On a bigger worker all tests pass. I will merge it into master now.

@tholor tholor merged commit 3b0c521 into deepset-ai:master Dec 9, 2019