Add early stopping and custom metrics #165
Conversation
Allow specification of metric, use saved best model for any test evaluation.
Tested at least for text classification; did not test for NER. Also, we still need to check what exactly gets logged now and what else we should log when doing early stopping.
This can use a user-defined metrics function and any metric from the result for early stopping. The early stopping metric can either be a key from the first head's evaluation result or a function that calculates a value from the whole evaluation result. User-defined metrics functions need to get registered under a string name so they can be accessed easily after storing and re-loading the processor.
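The registration idea described above can be sketched as a simple string-keyed registry. This is an illustrative sketch only: the names `register_metrics` and `registered_metrics` are hypothetical stand-ins, not necessarily the exact API in this PR. The point is that a metrics function stored under a string name can be looked up again after the processor is saved and re-loaded.

```python
# Hypothetical sketch of a metric registry keyed by string names.
registered_metrics = {}

def register_metrics(name, metric_fn):
    """Register a user-defined metrics function under a string name."""
    registered_metrics[name] = metric_fn

def exact_match(preds, labels):
    # Toy stand-in for a real metric: fraction of exact matches.
    correct = sum(p == l for p, l in zip(preds, labels))
    return {"acc": correct / len(labels)}

register_metrics("my_exact_match", exact_match)

# After re-loading the processor, the function is recovered by its name:
fn = registered_metrics["my_exact_match"]
print(fn(["a", "b"], ["a", "c"]))  # {'acc': 0.5}
```

Because only the string name needs to be serialized with the processor, the function itself can be re-registered in a fresh session and looked up again.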
Thanks a lot for your work @johann-petrak! Looks good!
I only added a few minor comments / questions.
After resolving the comments and some minor clean ups, we should be good to merge. 🙂
There is a separation between using metrics and reporting and probably some work is needed there [...]
I don't fully understand your comment. Could you please elaborate on your concern regarding the reporting? Do you mean that there's now support for custom metrics but not for custom reports?
OK, I pushed the changes that address your review. Ignore my comment about the reporting; I was just confused by the fact that the "report" uses a different way to calculate F1 than what is predefined in the metrics (or could now get defined and registered). But I guess it is good to always have something included that is specific to the kind of head used.
The Travis CI job is failing, but it does not look like something related to my changes?
Great! Thanks a lot for rounding everything up! The Travis CI is currently only failing due to low memory on the worker, so it's definitely unrelated to your changes. On a bigger worker all tests pass. I will merge it into master now.
This should fix most of #160
It allows setting up an EarlyStopping instance that knows which metric to use, where to save the best model, how much patience to have, etc., and passing it on to the trainer.
It also allows registering additional new metrics functions.
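The mechanism described above (track a chosen metric, save the best model, stop after running out of patience) can be sketched roughly as follows. This is a minimal illustration, not the actual implementation merged here; the class and parameter names (`metric`, `mode`, `patience`, `save_dir`) are assumptions about the shape of such an API.

```python
# Hypothetical sketch of an early-stopping tracker a trainer could consult
# after each evaluation.
class EarlyStopping:
    def __init__(self, metric="loss", mode="min", patience=3, save_dir=None):
        self.metric = metric      # key in the eval result, or a callable over it
        self.mode = mode          # "min" for losses, "max" for scores like F1
        self.patience = patience  # evaluations to tolerate without improvement
        self.save_dir = save_dir  # where a real trainer would save the best model
        self.best = None
        self.n_since_best = 0

    def check_stopping(self, eval_result):
        # The metric can be a plain key or a function of the whole eval result.
        value = (self.metric(eval_result) if callable(self.metric)
                 else eval_result[self.metric])
        improved = (self.best is None
                    or (self.mode == "min" and value < self.best)
                    or (self.mode == "max" and value > self.best))
        if improved:
            self.best = value
            self.n_since_best = 0
            # A real trainer would save the current model to self.save_dir here.
        else:
            self.n_since_best += 1
        stop_now = self.n_since_best > self.patience
        return improved, stop_now
```

A trainer would call `check_stopping` after each evaluation, save the model whenever it reports an improvement, and break out of the training loop once it signals stopping; the saved best model is then what gets used for any test evaluation.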
There is a separation between using metrics and reporting, and probably some work is needed there which I have not looked into yet. I thought it may be best to look at what we have here now, merge and test if it looks good, or figure out how to improve it if there are still issues.