For example, with random forests we could use criterion='gini' or criterion='entropy'. When we had these combined into a single search and let gridsearch choose which of the two was best, it doubled the size of the space that gridsearch had to work through, and gave us a classifier that placed 250th on the Kaggle Give Me Some Credit competition.
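Here's a minimal sketch of that combined setup, assuming the scikit-learn API; the data names (X_train, y_train) and the other hyperparameter values are placeholders, not the actual grid from this project:

```python
# Hypothetical combined search: both criteria live in one param grid,
# so the search space doubles. X_train / y_train are placeholder data.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {
    'criterion': ['gini', 'entropy'],  # both criteria in one search
    'n_estimators': [100, 300],        # placeholder values
    'max_depth': [None, 10, 20],       # placeholder values
}

combined_search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,
)
combined_search.fit(X_train, y_train)  # 2 * 2 * 3 = 12 combinations
```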
When I broke those out into two separate classifiers (holding all the other hyperparameters the same, but now running two separate grid searches, one for entropy and one for gini), the total training time was probably equivalent (we cut the search space in half for each classifier, but doubled the number of classifiers), and we got much better results. A sketch of the split version follows.
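Here is the same sketch with the criterion broken out, again assuming placeholder data and grid values:

```python
# Hypothetical split search: one GridSearchCV per criterion, each over
# half the space. Same total work, but each criterion yields its own
# tuned classifier. X_train / y_train are placeholder data.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

shared_grid = {
    'n_estimators': [100, 300],   # placeholder values
    'max_depth': [None, 10, 20],  # placeholder values
}

searches = {}
for criterion in ('gini', 'entropy'):
    search = GridSearchCV(
        RandomForestClassifier(criterion=criterion, random_state=0),
        shared_grid,  # 2 * 3 = 6 combinations per criterion
        cv=5,
    )
    search.fit(X_train, y_train)
    searches[criterion] = search
```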
It turns out that gini generalizes very well here, despite not scoring as well as entropy during gridsearch. Gini by itself placed 133rd, entropy continued to score around 238th (a slight improvement, even, it seems), and the ensembler placed in between, at 164th.
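The original post doesn't specify how the ensembler combines the two models; one plausible sketch is a soft-voting blend of the two tuned forests from the searches above:

```python
# Hypothetical ensembler: average the predicted probabilities of the
# two tuned classifiers. This is one possible blending strategy, not
# necessarily the one used in the project.
from sklearn.ensemble import VotingClassifier

ensemble = VotingClassifier(
    estimators=[
        ('gini', searches['gini'].best_estimator_),
        ('entropy', searches['entropy'].best_estimator_),
    ],
    voting='soft',  # average predict_proba across the two forests
)
ensemble.fit(X_train, y_train)
```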
So we nearly halved our leaderboard rank (from 250th to 133rd) simply by breaking out the classifiers, and I would not expect this to carry any kind of time penalty.