Conversation
mfeurer (Contributor) commented Apr 9, 2021

This PR fixes a weird edge case in iterative CV. Assume we have 3 folds: two folds converge early to a bad score and are terminated via early stopping, while the third fold keeps training and reaches a really good score. Due to a bug, Auto-sklearn would ignore the bad scores and report this unstable configuration as a really good one. With this PR, the average over all fold scores is computed correctly.
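
For intuition, here is a minimal sketch of the failure mode with made-up numbers; the fold scores and variable names below are hypothetical illustrations, not auto-sklearn internals:

# Hypothetical per-fold balanced-accuracy scores: folds 1 and 2 were
# terminated early with bad scores, fold 3 kept training and reached
# a really good score.
fold_scores = [0.55, 0.57, 0.93]

# Buggy behaviour (sketch): the early-stopped folds' bad scores are
# ignored, so the configuration looks as good as its best fold.
buggy_score = fold_scores[-1]
print(f"buggy: {buggy_score:.3f}")  # 0.930 -- spuriously good

# Fixed behaviour: average over all folds, as k-fold CV should.
fixed_score = sum(fold_scores) / len(fold_scores)
print(f"fixed: {fixed_score:.3f}")  # 0.683 -- reflects the unstable folds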

Can be approximately reproduced via:

if __name__ == '__main__':

    import sys
    import autosklearn.metrics
    from autosklearn.classification import AutoSklearnClassifier

    # Make the helper from auto-sklearn's scripts/ directory importable.
    sys.path.append('../scripts')
    from update_metadata_util import load_task

    # Load the OpenML task used for the reproduction (train/test data,
    # categorical indicators, task type, and dataset name).
    X_train, y_train, X_test, y_test, cat, task_type, dataset_name = load_task(189905)

    # Score with balanced accuracy; give each run up to 360 seconds.
    automl = AutoSklearnClassifier(
        per_run_time_limit=360,
        metric=autosklearn.metrics.balanced_accuracy,
    )

    # An alternative, near-identical configuration (commented out):
    # config = {
    #   "balancing:strategy": "weighting",
    #   "classifier:__choice__": "gradient_boosting",
    #   "data_preprocessing:categorical_transformer:categorical_encoding:__choice__": "no_encoding",
    #   "data_preprocessing:categorical_transformer:category_coalescence:__choice__": "minority_coalescer",
    #   "data_preprocessing:numerical_transformer:imputation:strategy": "most_frequent",
    #   "data_preprocessing:numerical_transformer:rescaling:__choice__": "robust_scaler",
    #   "feature_preprocessor:__choice__": "no_preprocessing",
    #   "classifier:gradient_boosting:early_stop": "train",
    #   "classifier:gradient_boosting:l2_regularization": 5.205915999429948e-09,
    #   "classifier:gradient_boosting:learning_rate": 0.861159596651958,
    #   "classifier:gradient_boosting:loss": "auto",
    #   "classifier:gradient_boosting:max_bins": 255,
    #   "classifier:gradient_boosting:max_depth": "None",
    #   "classifier:gradient_boosting:max_leaf_nodes": 195,
    #   "classifier:gradient_boosting:min_samples_leaf": 16,
    #   "classifier:gradient_boosting:scoring": "loss",
    #   "classifier:gradient_boosting:tol": 1e-07,
    #   "data_preprocessing:categorical_transformer:category_coalescence:minority_coalescer:minimum_fraction": 0.06191658766042969,
    #   "data_preprocessing:numerical_transformer:rescaling:robust_scaler:q_max": 0.7493495185188641,
    #   "data_preprocessing:numerical_transformer:rescaling:robust_scaler:q_min": 0.2995969400106461,
    #   "classifier:gradient_boosting:n_iter_no_change": 19
    # }
    # Configuration that triggers the edge case; note that early stopping
    # is enabled via "classifier:gradient_boosting:early_stop".
    config = {
      "balancing:strategy": "weighting",
      "classifier:__choice__": "gradient_boosting",
      "data_preprocessing:categorical_transformer:categorical_encoding:__choice__": "no_encoding",
      "data_preprocessing:categorical_transformer:category_coalescence:__choice__": "minority_coalescer",
      "data_preprocessing:numerical_transformer:imputation:strategy": "median",
      "data_preprocessing:numerical_transformer:rescaling:__choice__": "robust_scaler",
      "feature_preprocessor:__choice__": "no_preprocessing",
      "classifier:gradient_boosting:early_stop": "train",
      "classifier:gradient_boosting:l2_regularization": 5.205915999429948e-09,
      "classifier:gradient_boosting:learning_rate": 0.861159596651958,
      "classifier:gradient_boosting:loss": "auto",
      "classifier:gradient_boosting:max_bins": 255,
      "classifier:gradient_boosting:max_depth": "None",
      "classifier:gradient_boosting:max_leaf_nodes": 204,
      "classifier:gradient_boosting:min_samples_leaf": 16,
      "classifier:gradient_boosting:scoring": "loss",
      "classifier:gradient_boosting:tol": 1e-07,
      "data_preprocessing:categorical_transformer:category_coalescence:minority_coalescer:minimum_fraction": 0.0668155830580962,
      "data_preprocessing:numerical_transformer:rescaling:robust_scaler:q_max": 0.7495444113816928,
      "data_preprocessing:numerical_transformer:rescaling:robust_scaler:q_min": 0.2995969400106461,
      "classifier:gradient_boosting:n_iter_no_change": 20
    }

    # Evaluate this single configuration with 3-fold iterative CV, the
    # setting in which the averaging bug manifests.
    pipeline, run_info, run_value = automl.fit_pipeline(
        X=X_train,
        y=y_train,
        dataset_name=dataset_name,
        resampling_strategy='cv-iterative-fit',
        folds=3,
        feat_type=cat,
        X_test=X_test,
        y_test=y_test,
        pynisher_context='spawn',
        cutoff=3600,
        config=config,
        disable_file_output=True,
    )
    # Inspect the fitted pipeline and the reported run value (cost).
    print(pipeline)
    print(run_info)
    print(run_value)

codecov bot commented Apr 9, 2021

Codecov Report

Merging #1121 (66700cd) into development (f518e9a) will increase coverage by 0.03%.
The diff coverage is 0.00%.


@@               Coverage Diff               @@
##           development    #1121      +/-   ##
===============================================
+ Coverage        85.46%   85.50%   +0.03%     
===============================================
  Files              137      137              
  Lines            10557    10557              
===============================================
+ Hits              9023     9027       +4     
+ Misses            1534     1530       -4     
Impacted Files                                           Coverage Δ
autosklearn/evaluation/train_evaluator.py                73.46% <0.00%> (ø)
...n/pipeline/components/classification/libsvm_svc.py    89.28% <0.00%> (+1.19%) ⬆️
...eline/components/feature_preprocessing/fast_ica.py    97.82% <0.00%> (+6.52%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f518e9a...66700cd.

mfeurer merged commit 4d3bb06 into development on Apr 9, 2021
mfeurer deleted the fix_iterative_cv branch on April 9, 2021 at 16:37