According to OpenML, I'm getting a really strange performance boost when using warm-start RandomForests:
Using OpenML:
from sklearn.ensemble import RandomForestClassifier
from openml import tasks,runs,datasets
task = tasks.get_task(145677)
clfs = [RandomForestClassifier(n_estimators=64, warm_start=True, n_jobs=-1),
        RandomForestClassifier(n_estimators=64, warm_start=False, n_jobs=-1)]
for clf in clfs:
    run = runs.run_task(task, clf)
    p = run.publish()
    print("Uploaded run 1 with id %s. Check it at www.openml.org/r/%s" % (run.run_id, run.run_id))
Output:
Uploaded run 1 with id 1852507. Check it at www.openml.org/r/1852507
Uploaded run 1 with id 1852508. Check it at www.openml.org/r/1852508
Looking up the results on OpenML, the first has an AUC of 0.9959, the second has AUC 0.8764.
This is a huge performance gap. Moreover, the classifier was not trained incrementally, so why does the warm start make a difference?
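For context on what warm_start is supposed to do: in scikit-learn it only matters when the same estimator object is fitted repeatedly with a growing n_estimators, in which case the existing trees are kept and only the new ones are trained. A minimal sketch of that intended behaviour (standalone, using a synthetic dataset rather than the OpenML task):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, random_state=0)

# With warm_start=True, a second fit() with a larger n_estimators
# keeps the 8 existing trees and only trains the 8 new ones.
clf = RandomForestClassifier(n_estimators=8, warm_start=True, random_state=0)
clf.fit(X, y)
clf.set_params(n_estimators=16)
clf.fit(X, y)
print(len(clf.estimators_))  # 16: the 8 old trees plus 8 new ones
```

With a fixed n_estimators and a single fit per estimator, as in the runs above, warm_start=True should therefore be a no-op, which is what makes the AUC gap so puzzling.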
Just to check whether this is OpenML-specific, here is the same experiment doing 10xCV locally:
from sklearn.model_selection import cross_val_score
import numpy as np

task = tasks.get_task(145677)
dataset = task.get_dataset()
X, y, categorical = dataset.get_data(target=dataset.default_target_attribute,
                                     return_categorical_indicator=True)
clfs = [RandomForestClassifier(n_estimators=64, warm_start=True, n_jobs=-1),
        RandomForestClassifier(n_estimators=64, warm_start=False, n_jobs=-1)]
for clf in clfs:
    scores = cross_val_score(clf, X, y, cv=10, scoring="roc_auc", n_jobs=-1)
    print("10xCV Score (AUC): %s" % (np.mean(scores)))
Output:
10xCV Score (AUC): 0.87225053074
10xCV Score (AUC): 0.87225053074
Any clue about what's going on? Are we somehow maintaining information from classifiers trained in previous runs? Even then, a score of 0.9959 would be unexpected. Is there some form of information leakage going on?
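One thing that argues against state carrying over between folds: cross_val_score clones the estimator before fitting each fold, and sklearn.base.clone copies only constructor parameters, dropping all fitted state. A quick sanity check of that (standalone sketch on synthetic data, not the OpenML task):

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, random_state=0)
clf = RandomForestClassifier(n_estimators=64, warm_start=True, random_state=0)
clf.fit(X, y)

# clone() copies only the constructor parameters, so the cloned
# warm-start forest has no trees and starts fitting from scratch.
fresh = clone(clf)
print(hasattr(fresh, "estimators_"))  # False: no trees carried over
```

So if the fitted trees really were leaking between runs, it would have to happen somewhere that bypasses this cloning.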
Seems like a case for @amueller @mfeurer @janvanrijn ;)