Overall Model Performance Over Time: Error #544

Closed

henrylinarundo opened this issue Sep 19, 2018 · 8 comments

henrylinarundo commented Sep 19, 2018

The script given by @mfeurer in issue #337 (Overall Model Performance Over Time) no longer runs. Some of the problems are:

  1. It requires tmp/.auto-sklearn/predictions_test/*, which only exists when X_test and y_test are given at fit time (see the sketch after this list).
  2. The temp and output folders cannot be the same (I tried to fix this myself, but it is assumed in several places).
  3. It assumes that unknown datasets use a classification metric.

Can someone provide an updated version? Thanks.
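
For reference, a minimal sketch of a fit call that produces those prediction files, assuming X_train, y_train, X_test, y_test are given and using argument names from the 2018-era auto-sklearn API (treat them as assumptions for other versions); note the distinct temp and output folders per item 2.

import autosklearn.classification

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=3600,
    tmp_folder='/tmp/autosklearn_tmp',     # hypothetical paths;
    output_folder='/tmp/autosklearn_out',  # must differ from tmp_folder
    delete_tmp_folder_after_terminate=False,
)
# Passing the test split is what makes auto-sklearn write the
# per-iteration predictions_test_*.npy files under <tmp>/.auto-sklearn/.
automl.fit(X_train, y_train, X_test=X_test, y_test=y_test)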
mfeurer (Contributor) commented Sep 25, 2018

Please check whether the script given in #543 solves your issue.

henrylinarundo (Author) commented
That code works, thanks. I believe that's the error rate of each classifier generated by Bayesian optimization / SMAC, not of the ensemble (the n best so far). That's fine, but what I need is to reproduce Figure 4 of the NIPS 2015 paper, i.e. the overall model performance over time as in #337, whose script no longer works. The only solution I found is to move predictions_ensemble_1_xx.npy and predictions_test_1_xx.npy into the corresponding folders one by one, then run automl.fit_ensemble() and automl.predict() in a script. Are there any pitfalls to this solution?

P.S. I fixed some bugs in the script provided in #337, but I think the ensemble building is no longer correct anyway.

henrylinarundo (Author) commented
My solution seems to work, but I'm calling automl.refit() and automl.predict() at every iteration, which is slow. Since the test predictions are already on disk, can I get the ensemble prediction directly, as in the script in #337? Thanks!

mfeurer (Contributor) commented Sep 26, 2018

> That code works, thanks. I believe that's the error rate of each classifier generated by Bayesian optimization / SMAC, not of the ensemble (the n best so far).

Correct.

> That's fine, but what I need is to reproduce Figure 4 of the NIPS 2015 paper.

@herilalaina was working on scripts to accommodate this. Did you make any progress?

> The only solution I found is to move predictions_ensemble_1_xx.npy and predictions_test_1_xx.npy into the corresponding folders one by one.

That is basically the solution we used to produce the figure back then.

> My solution seems to work, but I'm calling automl.refit() and automl.predict() at every iteration, which is slow. Since the test predictions are already there, can I get the ensemble prediction directly, as in the script in #337?

If you copy the model files as well, it should work without the refit (a sketch follows below).
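
For illustration, a minimal sketch of restoring one iteration's files, model included, before rebuilding the ensemble; the *.bak directories and the seed.num_run model naming follow the layout described later in this thread and are assumptions.

import os
import shutil

def restore_iteration(tmp_autoskl_dir, seed, iteration):
    # (subdirectory, file name) pairs for this iteration's artifacts.
    files = [
        ('predictions_ensemble',
         'predictions_ensemble_%d_%d.npy' % (seed, iteration)),
        ('predictions_test',
         'predictions_test_%d_%d.npy' % (seed, iteration)),
        ('models', '%d.%d.model' % (seed, iteration)),
    ]
    for subdir, fname in files:
        src = os.path.join(tmp_autoskl_dir, subdir + '.bak', fname)
        dst = os.path.join(tmp_autoskl_dir, subdir, fname)
        if os.path.exists(src):  # e.g. models may not have been backed up
            shutil.copy(src, dst)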

henrylinarundo (Author) commented
> > That code works, thanks. I believe that's the error rate of each classifier generated by Bayesian optimization / SMAC, not of the ensemble (the n best so far).
>
> Correct.
>
> > That's fine, but what I need is to reproduce Figure 4 of the NIPS 2015 paper.
>
> @herilalaina was working on scripts to accommodate this. Did you make any progress?

My script (below) seems to work, but it is slow; I'm trying to make it faster (see below).

> > The only solution I found is to move predictions_ensemble_1_xx.npy and predictions_test_1_xx.npy into the corresponding folders one by one.
>
> That is basically the solution we used to produce the figure back then.
>
> > My solution seems to work, but I'm calling automl.refit() and automl.predict() at every iteration, which is slow. Since the test predictions are already there, can I get the ensemble prediction directly, as in the script in #337?
>
> If you copy the model files as well, it should work without the refit.

I left the model files 1.xx.model inside .auto-sklearn/models/; would that work?

Moving the prediction files one by one is fine, but calling refit() for thousands of iterations is slow. EnsembleBuilder already reads .auto-sklearn/predictions_test, and EnsembleBuilder.predict() reuses those predictions directly without calling the expensive refit(). Do I have to call EnsembleBuilder instead of AutoSklearnClassifier for this, and what's the proper way of doing this?

Here's my current script; it assumes tmp_folder and output_folder, the training and test data, and an eval_autoskl_perf helper (sketched after the script) are given.

import glob
import os

import pandas as pd

seed = 1  # Focus on one seed; could extend to * in shared mode.
tmp_autoskl_dir = os.path.join(tmp_folder, '.auto-sklearn')
start_time_file = os.path.join(tmp_autoskl_dir, 'start_time_%d' % seed)
with open(start_time_file, 'r') as fh:
    starttime = float(fh.read())
test_mtime0 = starttime
time_perf = []
pred_ensemble_bak_pattern = os.path.join(
    tmp_autoskl_dir, 'predictions_ensemble.bak',
    'predictions_ensemble_%d_*.npy' % seed)
# Sort the backed-up ensemble predictions by modification time so the
# iterations are replayed in the order they were produced.
pred_ensemble_bak_files = sorted(
    (os.path.getmtime(f), f)
    for f in glob.glob(pred_ensemble_bak_pattern))
for ensemble_mtime, ensemble_bak_path in pred_ensemble_bak_files:
    # Derive the matching test-prediction file name.
    ensemble_basename = os.path.basename(ensemble_bak_path)
    split = ensemble_basename.split('_')
    seed = int(split[-2])
    iteration = int(split[-1].split('.')[0])
    test_basename = 'predictions_test_%d_%d.npy' % (seed, iteration)
    ensemble_path = os.path.join(
        tmp_autoskl_dir, 'predictions_ensemble', ensemble_basename)
    test_bak_path = os.path.join(
        tmp_autoskl_dir, 'predictions_test.bak', test_basename)
    test_path = os.path.join(
        tmp_autoskl_dir, 'predictions_test', test_basename)
    # Check that the test-prediction files follow the same temporal
    # order; print the (negative) gap if they do not.
    test_mtime1 = os.path.getmtime(test_bak_path)
    if test_mtime1 <= test_mtime0:
        print(test_mtime1 - test_mtime0)
    test_mtime0 = test_mtime1
    # Move the two prediction files into the auto-sklearn folders.
    os.rename(ensemble_bak_path, ensemble_path)
    os.rename(test_bak_path, test_path)
    # Evaluate the performance of this iteration (helper sketched below).
    test_acc = eval_autoskl_perf(tmp_folder, output_folder,
                                 X_train, y_train, X_test, y_test)
    result = {'seconds': ensemble_mtime - starttime,
              'seed': seed,
              'iteration': iteration,
              'test_acc': test_acc}
    time_perf.append(result)
    print(result)
time_perf_df = pd.DataFrame(time_perf)
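
For completeness, here is a hypothetical sketch of the eval_autoskl_perf helper the script calls, reconstructed from the steps described in this thread (fit_ensemble(), then the expensive refit() and predict()); the constructor and method arguments follow the 2018-era API and should be treated as assumptions.

import autosklearn.classification
from sklearn.metrics import accuracy_score

def eval_autoskl_perf(tmp_folder, output_folder,
                      X_train, y_train, X_test, y_test):
    automl = autosklearn.classification.AutoSklearnClassifier(
        tmp_folder=tmp_folder,
        output_folder=output_folder,
        shared_mode=True,  # reuse the models already on disk
        delete_tmp_folder_after_terminate=False,
        delete_output_folder_after_terminate=False,
    )
    # Rebuild the ensemble from the prediction files currently present.
    automl.fit_ensemble(y_train, ensemble_size=50)
    # Retrain the ensemble members on the training data, then score on
    # the test set; refit() is the slow step this thread tries to avoid.
    automl.refit(X_train, y_train)
    y_hat = automl.predict(X_test)
    return accuracy_score(y_test, y_hat)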

herilalaina (Contributor) commented
I submitted a PR as a starting point for reproducing Figure 3. I just adapted @mfeurer's script from here to the current version of Auto-Sklearn. Please let me know if I missed anything.

henrylinarundo (Author) commented
@herilalaina Thanks. score_ensemble.py is adapted from @mfeurer's script in #337, and here are the issues I ran into with it:

  1. Regression is not supported, and although it's easy to add r2 or mean_squared_error, the numbers I get are all over the place. Each iteration's predictions_test*.npy looks reasonable, but y_hat_ensemble and y_hat_test (the final predictions at that time) keep growing in mean and variance.

  2. To follow up on the above: y_hat_ensemble and y_hat_test are derived from validation_predictions and test_predictions via linear combinations, but ensemble_selection.weights_ has a different dimension, which seems wrong (see the sketch after this list).

  3. Classification results are in a reasonable range (e.g. 0.7~0.9), but test-set performance is always better than training-set performance, and quite different from what I get by calling AutoSklearnClassifier directly.
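
For reference, a minimal sketch of the linear combination point 2 expects, with illustrative names that are not from score_ensemble.py; the weight vector must have exactly one entry per ensemble member, which is the dimension mismatch reported above.

import numpy as np

def combine_predictions(predictions, weights):
    # predictions: one test-prediction array per ensemble member, shape
    # (n_models, n_samples, n_classes) for classification or
    # (n_models, n_samples) for regression.
    predictions = np.asarray(predictions)
    weights = np.asarray(weights)
    assert weights.shape[0] == predictions.shape[0], \
        'need exactly one weight per model'
    # Weighted average over the model axis.
    return np.average(predictions, axis=0, weights=weights)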

mfeurer (Contributor) commented Sep 2, 2020

Hey, we finally have a good interface and an example for doing this without any extra scripts: https://automl.github.io/auto-sklearn/master/examples/40_advanced/example_pandas_train_test.html#sphx-glr-examples-40-advanced-example-pandas-train-test-py (a sketch follows below).
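
For reference, a minimal sketch of what the linked example demonstrates; the attribute and argument names follow recent auto-sklearn releases and are assumptions for older versions.

import autosklearn.classification
import sklearn.datasets
import sklearn.model_selection

X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = \
    sklearn.model_selection.train_test_split(X, y, random_state=1)

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=120)
# Passing the test split lets auto-sklearn track test scores over time.
automl.fit(X_train, y_train, X_test=X_test, y_test=y_test)

# performance_over_time_ is a pandas DataFrame of train/test scores over
# wall-clock time, which can be plotted directly.
print(automl.performance_over_time_)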

Please reopen if this doesn't solve the issue or you have further questions/requests.

mfeurer closed this as completed Sep 2, 2020