Overall Model Performance Over Time: Error #544

Closed

henrylinarundo opened this issue Sep 19, 2018 · 8 comments

henrylinarundo commented Sep 19, 2018

The script given by @mfeurer in issue #337 (Overall Model Performance Over Time) no longer runs. Some of the problems are:

  1. It requires tmp/.auto-sklearn/predictions_test/*, which only exists when X_test and y_test are given at fit time (see the sketch after this list).
  2. The temp and output folders cannot be the same (I tried to fix this myself, but it is assumed in several places).
  3. It assumes that unknown datasets use a classification metric.

Can someone provide an updated version? Thanks.
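
For reference, a minimal sketch of a fit call that produces those prediction files, assuming X_train, y_train, X_test, y_test are given and using argument names from the 2018-era auto-sklearn API (treat them as assumptions for other versions); note the distinct temp and output folders per item 2.

import autosklearn.classification

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=3600,
    tmp_folder='/tmp/autosklearn_tmp',     # hypothetical paths;
    output_folder='/tmp/autosklearn_out',  # must differ from tmp_folder
    delete_tmp_folder_after_terminate=False,
)
# Passing the test split is what makes auto-sklearn write the
# per-iteration predictions_test_*.npy files under <tmp>/.auto-sklearn/.
automl.fit(X_train, y_train, X_test=X_test, y_test=y_test)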
mfeurer (Contributor) commented Sep 25, 2018

Please check whether the script given in #543 solves your issue.

henrylinarundo (Author) commented
That code works, thanks. I believe that's the error rate of each classifier generated by Bayesian optimization / SMAC, not of the ensemble (the n best so far). That's fine, but what I need is to reproduce Figure 4 of the NIPS 2015 paper, i.e. the overall model performance over time as in #337, whose script no longer works. The only solution I found is to move predictions_ensemble_1_xx.npy and predictions_test_1_xx.npy into the corresponding folders one by one, then run automl.fit_ensemble() and automl.predict() in a script. Are there any pitfalls to this solution?

P.S. I fixed some bugs in the script provided in #337, but I think the ensemble building is no longer correct anyway.

henrylinarundo (Author) commented
My solution seems to work, but I'm calling automl.refit() and automl.predict() at every iteration, which is slow. Since the test predictions are already on disk, can I get the ensemble prediction directly, as in the script in #337? Thanks!

mfeurer (Contributor) commented Sep 26, 2018

> That code works, thanks. I believe that's the error rate of each classifier generated by Bayesian optimization / SMAC, not of the ensemble (the n best so far).

Correct.

> That's fine, but what I need is to reproduce Figure 4 of the NIPS 2015 paper.

@herilalaina was working on scripts to accommodate this. Did you make any progress?

> The only solution I found is to move predictions_ensemble_1_xx.npy and predictions_test_1_xx.npy into the corresponding folders one by one.

That is basically the solution we used to produce the figure back then.

> My solution seems to work, but I'm calling automl.refit() and automl.predict() at every iteration, which is slow. Since the test predictions are already there, can I get the ensemble prediction directly, as in the script in #337?

If you copy the model files as well, it should work without the refit (a sketch follows below).
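
For illustration, a minimal sketch of restoring one iteration's files, model included, before rebuilding the ensemble; the *.bak directories and the seed.num_run model naming follow the layout described later in this thread and are assumptions.

import os
import shutil

def restore_iteration(tmp_autoskl_dir, seed, iteration):
    # (subdirectory, file name) pairs for this iteration's artifacts.
    files = [
        ('predictions_ensemble',
         'predictions_ensemble_%d_%d.npy' % (seed, iteration)),
        ('predictions_test',
         'predictions_test_%d_%d.npy' % (seed, iteration)),
        ('models', '%d.%d.model' % (seed, iteration)),
    ]
    for subdir, fname in files:
        src = os.path.join(tmp_autoskl_dir, subdir + '.bak', fname)
        dst = os.path.join(tmp_autoskl_dir, subdir, fname)
        if os.path.exists(src):  # e.g. models may not have been backed up
            shutil.copy(src, dst)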

henrylinarundo (Author) commented
> > That code works, thanks. I believe that's the error rate of each classifier generated by Bayesian optimization / SMAC, not of the ensemble (the n best so far).
>
> Correct.
>
> > That's fine, but what I need is to reproduce Figure 4 of the NIPS 2015 paper.
>
> @herilalaina was working on scripts to accommodate this. Did you make any progress?

My script (below) seems to work, but it is slow; I'm trying to make it faster (see below).

> > The only solution I found is to move predictions_ensemble_1_xx.npy and predictions_test_1_xx.npy into the corresponding folders one by one.
>
> That is basically the solution we used to produce the figure back then.
>
> > My solution seems to work, but I'm calling automl.refit() and automl.predict() at every iteration, which is slow. Since the test predictions are already there, can I get the ensemble prediction directly, as in the script in #337?
>
> If you copy the model files as well, it should work without the refit.

I left the model files 1.xx.model inside .auto-sklearn/models/; would that work?

Moving the prediction files one by one is fine, but calling refit() for thousands of iterations is slow. EnsembleBuilder already reads .auto-sklearn/predictions_test, and EnsembleBuilder.predict() reuses those predictions directly without calling the expensive refit(). Do I have to call EnsembleBuilder instead of AutoSklearnClassifier for this, and what's the proper way of doing this?

Here's my current script; it assumes tmp_folder and output_folder, the training and test data, and an eval_autoskl_perf helper (sketched after the script) are given.

import glob
import os

import pandas as pd

seed = 1  # Focus on one seed; could extend to * in shared mode.
tmp_autoskl_dir = os.path.join(tmp_folder, '.auto-sklearn')
start_time_file = os.path.join(tmp_autoskl_dir, 'start_time_%d' % seed)
with open(start_time_file, 'r') as fh:
    starttime = float(fh.read())
test_mtime0 = starttime
time_perf = []
pred_ensemble_bak_pattern = os.path.join(
    tmp_autoskl_dir, 'predictions_ensemble.bak',
    'predictions_ensemble_%d_*.npy' % seed)
# Sort the backed-up ensemble predictions by modification time so the
# iterations are replayed in the order they were produced.
pred_ensemble_bak_files = sorted(
    (os.path.getmtime(f), f)
    for f in glob.glob(pred_ensemble_bak_pattern))
for ensemble_mtime, ensemble_bak_path in pred_ensemble_bak_files:
    # Derive the matching test-prediction file name.
    ensemble_basename = os.path.basename(ensemble_bak_path)
    split = ensemble_basename.split('_')
    seed = int(split[-2])
    iteration = int(split[-1].split('.')[0])
    test_basename = 'predictions_test_%d_%d.npy' % (seed, iteration)
    ensemble_path = os.path.join(
        tmp_autoskl_dir, 'predictions_ensemble', ensemble_basename)
    test_bak_path = os.path.join(
        tmp_autoskl_dir, 'predictions_test.bak', test_basename)
    test_path = os.path.join(
        tmp_autoskl_dir, 'predictions_test', test_basename)
    # Check that the test-prediction files follow the same temporal
    # order; print the (negative) gap if they do not.
    test_mtime1 = os.path.getmtime(test_bak_path)
    if test_mtime1 <= test_mtime0:
        print(test_mtime1 - test_mtime0)
    test_mtime0 = test_mtime1
    # Move the two prediction files into the auto-sklearn folders.
    os.rename(ensemble_bak_path, ensemble_path)
    os.rename(test_bak_path, test_path)
    # Evaluate the performance of this iteration (helper sketched below).
    test_acc = eval_autoskl_perf(tmp_folder, output_folder,
                                 X_train, y_train, X_test, y_test)
    result = {'seconds': ensemble_mtime - starttime,
              'seed': seed,
              'iteration': iteration,
              'test_acc': test_acc}
    time_perf.append(result)
    print(result)
time_perf_df = pd.DataFrame(time_perf)
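
For completeness, here is a hypothetical sketch of the eval_autoskl_perf helper the script calls, reconstructed from the steps described in this thread (fit_ensemble(), then the expensive refit() and predict()); the constructor and method arguments follow the 2018-era API and should be treated as assumptions.

import autosklearn.classification
from sklearn.metrics import accuracy_score

def eval_autoskl_perf(tmp_folder, output_folder,
                      X_train, y_train, X_test, y_test):
    automl = autosklearn.classification.AutoSklearnClassifier(
        tmp_folder=tmp_folder,
        output_folder=output_folder,
        shared_mode=True,  # reuse the models already on disk
        delete_tmp_folder_after_terminate=False,
        delete_output_folder_after_terminate=False,
    )
    # Rebuild the ensemble from the prediction files currently present.
    automl.fit_ensemble(y_train, ensemble_size=50)
    # Retrain the ensemble members on the training data, then score on
    # the test set; refit() is the slow step this thread tries to avoid.
    automl.refit(X_train, y_train)
    y_hat = automl.predict(X_test)
    return accuracy_score(y_test, y_hat)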

herilalaina (Contributor) commented
I submitted a PR as a starting point for reproducing Figure 3. I just adapted @mfeurer's script from here to the current version of Auto-Sklearn. Please let me know if I missed anything.

henrylinarundo (Author) commented
@herilalaina Thanks. score_ensemble.py is adapted from @mfeurer's script in #337, and here are the issues I ran into with it:

  1. Regression is not supported, and although it's easy to add r2 or mean_squared_error, the numbers I get are all over the place. Each iteration's predictions_test*.npy looks reasonable, but y_hat_ensemble and y_hat_test (the final predictions at that time) keep growing in mean and variance.

  2. To follow up on the above: y_hat_ensemble and y_hat_test are derived from validation_predictions and test_predictions via linear combinations, but ensemble_selection.weights_ has a different dimension, which seems wrong (see the sketch after this list).

  3. Classification results are in a reasonable range (e.g. 0.7~0.9), but test-set performance is always better than training-set performance, and quite different from what I get by calling AutoSklearnClassifier directly.
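
For reference, a minimal sketch of the linear combination point 2 expects, with illustrative names that are not from score_ensemble.py; the weight vector must have exactly one entry per ensemble member, which is the dimension mismatch reported above.

import numpy as np

def combine_predictions(predictions, weights):
    # predictions: one test-prediction array per ensemble member, shape
    # (n_models, n_samples, n_classes) for classification or
    # (n_models, n_samples) for regression.
    predictions = np.asarray(predictions)
    weights = np.asarray(weights)
    assert weights.shape[0] == predictions.shape[0], \
        'need exactly one weight per model'
    # Weighted average over the model axis.
    return np.average(predictions, axis=0, weights=weights)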

mfeurer (Contributor) commented Sep 2, 2020

Hey, we finally have a good interface and an example for doing this without any extra scripts: https://automl.github.io/auto-sklearn/master/examples/40_advanced/example_pandas_train_test.html#sphx-glr-examples-40-advanced-example-pandas-train-test-py (a sketch follows below).
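
For reference, a minimal sketch of what the linked example demonstrates; the attribute and argument names follow recent auto-sklearn releases and are assumptions for older versions.

import autosklearn.classification
import sklearn.datasets
import sklearn.model_selection

X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = \
    sklearn.model_selection.train_test_split(X, y, random_state=1)

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=120)
# Passing the test split lets auto-sklearn track test scores over time.
automl.fit(X_train, y_train, X_test=X_test, y_test=y_test)

# performance_over_time_ is a pandas DataFrame of train/test scores over
# wall-clock time, which can be plotted directly.
print(automl.performance_over_time_)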

Please reopen if this doesn't solve the issue or you have further questions/requests.

mfeurer closed this as completed Sep 2, 2020