Final K-Fold Score #60
Comments
If you want a cv-score for the final predictions of the ensemble, then you simply need to run a cross-validation loop over the full ensemble. I haven't implemented a similar function in mlens, but you should be able to use the Evaluator for this. You can also pass other estimators in the same run to benchmark them against the ensemble. I'm not sure how much of a speedup you're likely to see; it could be anything from just a little to quite a bit. If you give it a try, please report back. Would be interesting to know : )
@flennerhag - Yes, using sklearn's cross_val_score seemed to give this issue. Training time was around 14 minutes (compared to 1 min for LightGBM and up to 3 mins for XGBoost). IMHO an ensemble CV score (and maybe OOF predictions) would be one of the most useful features to add to the package. Ultimately, it would tell you whether your ensemble was working well or not :)

The Evaluator seems to work for the purpose of a CV score (though it felt like an unnatural way to do this). It gave a good speed-up, and the whole model ran in under 4 mins vs the previous 14 mins.
@JoshuaC3 thanks for the report, and glad it sped things up. Agree with you that the Evaluator is not ideal for standard benchmarking. It should actually be pretty easy to just pull out the code from the Evaluator and create a dedicated cross-validation scoring function.

You can get cv scores for the base learners by passing a scoring function to the ensemble when you create it, or in the add call. It's actually possible to get cv-scores for the final ensemble as well during fitting, if you don't declare the final layer as a meta layer (i.e. just use the standard add method). This would fit a copy of the meta learner on each fold and once on all data. The fold jobs are unnecessary from a training point of view, but they do allow you to get CV scores for the ensemble.
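The non-meta final layer described above amounts to something like the following toy sketch (hypothetical helper and estimator names, not mlens internals): the final estimator is fitted once per fold for scoring, plus once on all data for prediction.

```python
# Sketch of the "non-meta final layer" idea: the final estimator is fitted
# once per fold (used only for scoring) plus once on all data (used for
# prediction). Hypothetical code, not mlens internals.

class ConstMean:
    """Toy final-layer estimator: always predicts the training-set mean."""
    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self
    def predict(self, X):
        return [self.mean_] * len(X)

def fit_final_layer(est_factory, X, y, k=2):
    """Return (estimator fitted on all data, per-fold MAE scores)."""
    n = len(y)
    fold = n // k
    scores = []
    for i in range(k):
        lo, hi = i * fold, (i + 1) * fold if i < k - 1 else n
        train = [j for j in range(n) if j < lo or j >= hi]
        est = est_factory().fit([X[j] for j in train], [y[j] for j in train])
        preds = est.predict([X[j] for j in range(lo, hi)])
        scores.append(
            sum(abs(p - y[j]) for p, j in zip(preds, range(lo, hi))) / (hi - lo)
        )
    # The fold fits above are redundant for training, but yield CV scores.
    final = est_factory().fit(X, y)
    return final, scores
```

The fold fits are thrown away after scoring; only the all-data fit is used at prediction time, which is why they add wall-clock time without changing the fitted model.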
@flennerhag - This sounds like a much more intuitive way to get that as an output. I gave the non-meta final layer a go, and it returned a cv-score for the last layer as expected. It ran in 1.5 mins (about half the time!), so my absolute gut feeling is that it doesn't do the default cv.

In addition, I have started doing some benchmarking of mlens against some other stacking/ensembling packages. The functionality included is far superior, but on my relatively simple models, as shown below, it performs somewhat poorly.
With mlens:

cv results: 238.91 MAE

With MLXtend StackingRegressor:

cv results: 209.04 MAE

The results of MLXtend on the hold-out set are also much better. Am I using mlens correctly, i.e. fitting this on the whole set correctly? Or is mlens doing something rather different here? Many thanks again.
Great that it helped! I'm not sure I quite follow what you mean by default cv? When you use a stacked layer as the final layer instead of a meta layer, the default number of folds is 2.
@JoshuaC3 really appreciate the benchmarking! I've been meaning to do that for some time, but well, there are only 24 hrs in a day.

I'm surprised mlens fails to outperform mlxtend. mlxtend doesn't actually do stacking (despite the name): it merely fits the base learners on all data, and then the meta learner on those predictions. Hence the meta learner is trained on base-learner training errors, but at test time faces test errors. The two reasons for your results that I can think of are (a) if you have very little data, the folds will be too noisy, or (b) if the data is not i.i.d., the folds will be biased. I hit upon (b) when doing the MNIST benchmark, since creating folds without shuffling the data won't cover all classes.

As a code integrity check, I spun up a simple mlxtend-vs-mlens benchmark with your models (but the default objective function) on a standard dataset.
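The distinction drawn above, mlxtend fitting the meta learner on in-sample base-learner predictions versus true stacking using out-of-fold predictions, can be sketched as follows (toy code with illustrative names; not either library's actual implementation):

```python
# Toy contrast between "blending" (meta learner sees in-sample base
# predictions) and true stacking (meta learner sees out-of-fold
# predictions). Illustrative only; not mlxtend's or mlens's real code.

class MeanModel:
    """Toy base learner: always predicts the training-set mean."""
    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self
    def predict(self, X):
        return [self.mean_] * len(X)

def blend_meta_inputs(base, X, y):
    """mlxtend-style: fit on all data, predict on that same data.

    The meta learner then trains on optimistic in-sample errors.
    """
    base.fit(X, y)
    return base.predict(X)

def stack_meta_inputs(base, X, y, k=2):
    """Stacking-style: each point is predicted by a model that never saw it."""
    n = len(y)
    fold = n // k
    out = [None] * n
    for i in range(k):
        lo, hi = i * fold, (i + 1) * fold if i < k - 1 else n
        train = [j for j in range(n) if j < lo or j >= hi]
        base.fit([X[j] for j in train], [y[j] for j in train])
        preds = base.predict([X[j] for j in range(lo, hi)])
        for j, p in zip(range(lo, hi), preds):
            out[j] = p
    return out
```

With a blended setup the meta learner never sees realistic (out-of-sample) base-learner errors during training, which is the mismatch described above.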
As a sanity check, the same holds if you run the base learners on their own. So to me it looks like mlens works as it should. For your benchmark, would you mind trying the same setup?
In reply to your first post: yes, I mean exactly that. The 2 folds were taken into account, so it took roughly twice as long. I will reply to your second comment in due course :)
Sorry, I have been busy the last two weeks (finishing an ML competition, which I won with mlens's help!! :) ), but I have finally done some investigating into this. In the end the problem was very simple: it was due to mlens reacting sensitively to some of the input values. mlens checks these before level-1, but you can turn the check off.

I have started an ensemble comparison/benchmarking notebook and expect to have the first version finished today. I am comparing several Python-based ensemble packages, including mlxtend and mlens, and their different methods of ensembling. Hopefully this will be of some use.
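An input check of the kind mentioned above typically scans the matrix passed between layers for non-finite values. A minimal illustrative version (not mlens's actual implementation; the function name and flag are assumptions):

```python
import math

# Illustrative input check: scan a (list-of-lists) prediction matrix for
# NaN/inf before it is passed to the next layer. Not mlens code; the
# name "check_array" and the "raise_on_exception" flag are hypothetical.

def check_array(rows, raise_on_exception=True):
    """Return True if every entry is finite; raise or return False otherwise."""
    for i, row in enumerate(rows):
        for j, v in enumerate(row):
            if not math.isfinite(v):
                if raise_on_exception:
                    raise ValueError(f"non-finite value at ({i}, {j}): {v}")
                return False
    return True
```

Disabling such a check trades safety for speed: a NaN produced by a base learner then propagates silently into the meta learner's training data.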
@flennerhag Thank you! Yes; however, the competition was not public; it was run through the company I work for. I will ask if it is OK to release the data, or at least my results and method. I can say that it was a regression task: predicting wind generation for each turbine on a wind farm, given >24h forecast weather data.

I have made a Jupyter notebook and directory for this (hopefully that is OK; I think notebooks are much more helpful for analyses!). You can see it on my fork, here. It started off as a benchmark against other packages, but they all scored very similarly, so in the end it became more of a quick comparison of packages. Is my use of the package correct here?

I have started a second Jupyter notebook to compare preprocessing functionality, deeper-layer stacking and scores. This is to come shortly.
One thing I have been unable to figure out is how to get a k-fold cross validation score for the whole ensemble.
I have used sklearn's built-in cross_val_score, but this is very slow (I think because it ends up doing CV while already inside another CV loop!).

How can I get a final k-fold cross-validation score for the final ensemble, please? (Great package btw :) )
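The nested-CV cost described above comes from treating the fitted ensemble as a single black-box estimator and refitting it on every outer fold. A minimal hand-rolled sketch of that outer loop (the toy estimator and helper names are illustrative; nothing here is mlens or sklearn API):

```python
# Hand-rolled k-fold MAE for any object with fit/predict.
# "MeanRegressor" is a toy stand-in for a full stacked ensemble.

def kfold_indices(n, k):
    """Yield (train_idx, test_idx) for k roughly equal contiguous folds."""
    fold = n // k
    for i in range(k):
        lo, hi = i * fold, (i + 1) * fold if i < k - 1 else n
        test = list(range(lo, hi))
        train = [j for j in range(n) if j < lo or j >= hi]
        yield train, test

class MeanRegressor:
    """Toy estimator: always predicts the training-set mean."""
    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self
    def predict(self, X):
        return [self.mean_] * len(X)

def cv_mae(model, X, y, k=5):
    """Mean absolute error averaged over k folds; refits `model` k times."""
    scores = []
    for train, test in kfold_indices(len(y), k):
        model.fit([X[i] for i in train], [y[i] for i in train])
        preds = model.predict([X[i] for i in test])
        scores.append(sum(abs(p - y[i]) for p, i in zip(preds, test)) / len(test))
    return sum(scores) / len(scores)
```

Each call refits the ensemble from scratch k times, so every base learner's own internal fold-fitting is repeated k times over, which is exactly why wrapping a stacked ensemble in an outer CV loop is so slow.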