-
Notifications
You must be signed in to change notification settings - Fork 437
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvement in evaluation framework #280
Conversation
… averages in the Stats class in order to be able to test it directly.
I've implemented something... however when I'm testing the date model, I get quite suspiciously high results:
|
OK I've found some more bugs, the worst was that the evaluation of the n-fold was using the model in |
I think there are too many thing here, what about having this in a separate feature? I haven't really modified the end 2 end validation interface.
as well it should be separated. Right now the evaluation is stored in class / set of classes, so we could add any output writer afterwards. |
(cherry picked from commit 67243f4)
One more detail, I've moved the method |
Some remarks about the current version. I think this is almost good and great addition!
It should be:
|
Gasp I tested an old version! I update my above remarks. |
Thanks for the review, I've implemented everything, but I have a question on fold = 1. |
Ah yes sorry, it's not confusing. In this case, no evaluation is possible, so just one training with the fold. |
I'm still not understanding. To be sure I've checked other libraries, like scikitlearn and it looks like they allow only n >= 2: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.KFold.html |
Yes you're right, it is redundant with the train only option so it's useless. So let's keep n >1! |
Thanks! I've pushed the change. |
Improvement in evaluation framework Former-commit-id: 38489ed
In my way down to check if everything was ok, I've rewritten the evaluation in a way that is testable and tested but keeping the same efficiency.
Perhaps some changes could be further improved but that's a nice base to start from
Update: Work in progress to implement #453