Changelog

Future Releases

Enhancements
- Added accuracy as a standard objective 624
- Added verbose parameter to load_fraud 560
- Added Balanced Accuracy metric for binary, multiclass 612 661
- Added XGBoost regressor and XGBoost regression pipeline 666
- Added Accuracy metric for multiclass 672
- Added objective name in AutoBase.describe_pipeline 686
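For reference, balanced accuracy is commonly defined as the mean of per-class recall, which makes it less sensitive to class imbalance than plain accuracy. The sketch below illustrates that definition in plain Python. The function name `balanced_accuracy` is hypothetical, and this is not evalml's implementation.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (illustrative helper, not evalml's code)."""
    correct = defaultdict(int)  # correctly predicted samples per true class
    total = defaultdict(int)    # all samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Imbalanced toy labels: four samples of class 0, two of class 1.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
score = balanced_accuracy(y_true, y_pred)  # recall 0.75 and 0.5 average to 0.625
```

Plain accuracy on the same labels would be 4/6, because the majority class dominates the count.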
Fixes
- Removed direct access to cls.component_graph 595
- Add testing files to .gitignore 625
- Remove circular dependencies from Makefile 637
- Add error case for normalize_confusion_matrix() 640
- Fixed XGBoostClassifier and XGBoostRegressor bug with feature names that contain [, ], or < 659
- Update make_pipeline_graph to not accidentally create empty file when testing if path is valid 649
- Fix pip installation warning about docutils version, from boto dependency 664
- Removed zero division warning for F1/precision/recall metrics 671
- Fixed summary for pipelines without estimators 707
Changes
- Updated default objective for binary/multiclass classification to log loss 613
- Created classification and regression pipeline subclasses and removed objective as an attribute of pipeline classes 405
- Changed the output of score to return one dictionary 429
- Created binary and multiclass objective subclasses 504
- Updated objectives API 445
- Removed call to get_plot_data from AutoML 615
- Set raise_error to default to True for AutoML classes 638
- Remove unnecessary "u" prefixes on some unicode strings 641
- Changed one-hot encoder to return uint8 dtypes instead of ints 653
- Pipeline _name field changed to custom_name 650
- Removed graphs.py and moved methods into PipelineBase 657, 665
- Remove s3fs as a dev dependency 664
- Changed requirements-parser to be a core dependency 673
- Replace supported_problem_types field on pipelines with problem_type attribute on base classes 678
- Changed AutoML to only show best results for a given pipeline template in rankings, added full_rankings property to show all 682
- Update ModelFamily values: don't list xgboost/catboost as classifiers now that we have regression pipelines for them 677
- Changed AutoML's describe_pipeline to get problem type from pipeline instead 685
- Standardize import_or_raise error messages 683
- Updated argument order of objectives to align with sklearn's 698
- Renamed pipeline.feature_importance_graph to pipeline.graph_feature_importances 700
- Moved ROC and confusion matrix methods to evalml.pipelines.plot_utils 704
- Renamed MultiClassificationObjective to MulticlassClassificationObjective, to align with pipeline naming scheme 715
Documentation Changes
- Fixed some sphinx warnings 593
- Fixed docstring for AutoClassificationSearch with correct command 599
- Limit readthedocs formats to pdf, not htmlzip and epub 594 600
- Clean up objectives API documentation 605
- Fixed function on Exploring search results page 604
- Update release process doc 567
- AutoClassificationSearch and AutoRegressionSearch show inherited methods in API reference 651
- Fixed improperly formatted code in breaking changes for changelog 655
- Added configuration to treat Sphinx warnings as errors 660
- Removed separate plotting section for pipelines in API reference 657, 665
- Have leads example notebook load S3 files using https, so we can delete s3fs dev dependency 664
- Categorized components in API reference and added descriptions for each category 663
- Fixed Sphinx warnings about BalancedAccuracy objective 669
- Updated API reference to include missing components and clean up pipeline docstrings 689
- Reorganize API ref, and clarify pipeline sub-titles 688
- Add and update preprocessing utils in API reference 687
- Added inheritance diagrams to API reference 695
- Documented which default objective AutoML optimizes for 699
- Create separate install page 701
- Include more utils in API ref, like import_or_raise 704
- Add more color to pipeline documentation 705
Testing Changes
- Matched install commands of check_latest_dependencies test and its GitHub action 578
- Added GitHub app to auto assign PR author as assignee 477
- Removed unneeded conda installation of xgboost in windows checkin tests 618
- Update graph tests to always use tmpfile dir 649
- Changelog checkin test workaround for release PRs: If 'future release' section is empty of PR refs, pass check 658

Warning

Breaking Changes

Pipelines will now no longer take an objective parameter during instantiation, and will no longer have an objective attribute.
fit() and predict() now use an optional objective parameter, which is only used in binary classification pipelines to fit for a specific objective.
score() will now use a required objectives parameter that is used to determine all the objectives to score on. This differs from the previous behavior, where the pipeline's objective was scored on regardless.
score() will now return one dictionary of all objective scores.
ROC and ConfusionMatrix plot methods via Auto(*).plot have been removed by 615 and are replaced by roc_curve and confusion_matrix in evalml.pipelines.plot_utils in 704
normalize_confusion_matrix has been moved to evalml.pipelines.plot_utils 704
Pipelines _name field changed to custom_name
Pipelines supported_problem_types field is removed because it is no longer necessary 678
Updated argument order of objectives' objective_function to align with sklearn 698
pipeline.feature_importance_graph has been renamed to pipeline.graph_feature_importances in 700
Removed unsupported MSLE objective 704
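The scoring changes above can be pictured with a toy stand-in. The class `ToyPipeline`, the metric functions, and the use of a name-to-function dictionary for `objectives` are all hypothetical and exist only to show the shape of the call: no objective at instantiation, a required `objectives` argument to `score()`, metric functions taking `(y_true, y_predicted)` in that order, and a single dictionary returned. evalml's real pipelines and objective classes differ.

```python
def accuracy(y_true, y_predicted):
    """Fraction of matching labels; note the (y_true, y_predicted) order."""
    return sum(t == p for t, p in zip(y_true, y_predicted)) / len(y_true)


def error_rate(y_true, y_predicted):
    """Complement of accuracy, included so there are two objectives to score."""
    return 1.0 - accuracy(y_true, y_predicted)


class ToyPipeline:
    """Hypothetical stand-in: always predicts the majority training label."""

    def fit(self, X, y):
        self._label = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self._label for _ in X]

    def score(self, X, y, objectives):
        # `objectives` is required: every listed objective is scored,
        # and all results come back in one dictionary keyed by name.
        y_predicted = self.predict(X)
        return {name: fn(y, y_predicted) for name, fn in objectives.items()}


X = [[0], [1], [2], [3]]
y = [1, 1, 1, 0]
pipeline = ToyPipeline().fit(X, y)  # no objective passed at instantiation
scores = pipeline.score(X, y, objectives={"accuracy": accuracy, "error_rate": error_rate})
```

Here `scores` holds one entry per requested objective, rather than a score for a single objective stored on the pipeline.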

v0.8.0 Apr. 1, 2020

Enhancements
- Add normalization option and information to confusion matrix 484
- Add util function to drop rows with NaN values 487
- Renamed PipelineBase.name as PipelineBase.summary and redefined PipelineBase.name as class property 491
- Added access to parameters in Pipelines with PipelineBase.parameters (used to be return of PipelineBase.describe) 501
- Added fill_value parameter for SimpleImputer 509
- Added functionality to override component hyperparameters and made pipelines take hyperparameters from components 516
- Allow numpy.random.RandomState for random_state parameters 556
Fixes
- Removed unused dependency matplotlib, and move category_encoders to test reqs 572
Changes
- Undo version cap in XGBoost placed in 402 and allowed all releases of XGBoost 407
- Support pandas 1.0.0 486
- Made all references to the logger static 503
- Refactored model_type parameter for components and pipelines to model_family 507
- Refactored problem_types for pipelines and components into supported_problem_types 515
- Moved pipelines/utils.save_pipeline and pipelines/utils.load_pipeline to PipelineBase.save and PipelineBase.load 526
- Limit number of categories encoded by OneHotEncoder 517
Documentation Changes
- Updated API reference to remove PipelinePlot and added the moved PipelineBase plotting methods 483
- Add code style and github issue guides 463 512
- Updated API reference to surface class variables for pipelines and components 537
- Fixed README documentation link 535
- Unhid PR references in changelog 656
Testing Changes
- Added automated dependency check PR 482, 505
- Updated automated dependency check comment 497
- Have build_docs job use python executor, so that env vars are set properly 547
- Added simple test to make sure OneHotEncoder's top_n works with large number of categories 552
- Run windows unit tests on PRs 557

Warning

Breaking Changes

AutoClassificationSearch and AutoRegressionSearch's model_types parameter has been refactored into allowed_model_families
ModelTypes enum has been changed to ModelFamily
Components and Pipelines now have a model_family field instead of model_type
get_pipelines utility function now accepts model_families as an argument instead of model_types
PipelineBase.name no longer returns structure of pipeline and has been replaced by PipelineBase.summary
PipelineBase.problem_types and Estimator.problem_types have been renamed to supported_problem_types
pipelines/utils.save_pipeline and pipelines/utils.load_pipeline moved to PipelineBase.save and PipelineBase.load
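When updating calling code for the keyword renames above, a small mapping can help. The helper `migrate_kwargs` and the `RENAMES` table below are hypothetical (they are not part of evalml), and the argument values are placeholders. The sketch only rewrites the two keyword names mentioned in this list and leaves everything else untouched.

```python
# Old keyword name -> new keyword name, per call site, taken from the list above.
RENAMES = {
    # AutoClassificationSearch / AutoRegressionSearch constructors
    "search": {"model_types": "allowed_model_families"},
    # get_pipelines utility function
    "get_pipelines": {"model_types": "model_families"},
}


def migrate_kwargs(kwargs, target):
    """Return a copy of `kwargs` with renamed keywords for the given call site."""
    renames = RENAMES[target]
    return {renames.get(key, key): value for key, value in kwargs.items()}


# Placeholder values; only the keyword names matter here.
old_search_kwargs = {"model_types": ["placeholder_family"], "n_jobs": 2}
new_search_kwargs = migrate_kwargs(old_search_kwargs, "search")
new_get_pipelines_kwargs = migrate_kwargs({"model_types": ["placeholder_family"]}, "get_pipelines")
```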

v0.7.0 Mar. 9, 2020

Enhancements
- Added emacs buffers to .gitignore 350
- Add CatBoost (gradient-boosted trees) classification and regression components and pipelines 247
- Added Tuner abstract base class 351
- Added n_jobs as parameter for AutoClassificationSearch and AutoRegressionSearch 403
- Changed colors of confusion matrix to shades of blue and updated axis order to match scikit-learn's 426
- Added PipelineBase graph and feature_importance_graph methods, moved from previous location 423
- Added support for python 3.8 462
Fixes
- Fixed ROC and confusion matrix plots not being calculated if user passed own additional_objectives 276
- Fixed ReadtheDocs FileNotFoundError exception for fraud dataset 439
Changes
- Added n_estimators as a tunable parameter for XGBoost 307
- Remove unused parameter ObjectiveBase.fit_needs_proba 320
- Remove extraneous parameter component_type from all components 361
- Remove unused rankings.csv file 397
- Downloaded demo and test datasets so unit tests can run offline 408
- Remove _needs_fitting attribute from Components 398
- Changed plot.feature_importance to show only non-zero feature importances by default, added optional parameter to show all 413
- Refactored PipelineBase to take in parameter dictionary and moved pipeline metadata to class attribute 421
- Dropped support for Python 3.5 438
- Removed unused apply.py file 449
- Clean up requirements.txt to remove unused deps 451
- Support installation without all required dependencies 459
Documentation Changes
- Update release.md with instructions to release to internal license key 354
Testing Changes
- Added tests for utils (and moved current utils to gen_utils) 297
- Moved XGBoost install into its own separate step on Windows using Conda 313
- Rewind pandas version to before 1.0.0, to diagnose test failures for that version 325
- Added dependency update checkin test 324
- Rewind XGBoost version to before 1.0.0 to diagnose test failures for that version 402
- Update dependency check to use a whitelist 417
- Update unit test jobs to not install dev deps 455

Warning

Breaking Changes

Python 3.5 will not be actively supported.

v0.6.0 Dec. 16, 2019

Enhancements
- Added ability to create a plot of feature importances 133
- Add early stopping to AutoML using patience and tolerance parameters 241
- Added ROC and confusion matrix metrics and plot for classification problems and introduce PipelineSearchPlots class 242
- Enhanced AutoML results with search order 260
- Added utility function to show system and environment information 300
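The early stopping entry above names `patience` and `tolerance` parameters. One common way such parameters interact is sketched below: stop once `patience` consecutive iterations fail to beat the best score so far by more than `tolerance`. This is an assumption about the general technique, written as a standalone function (`early_stop_index` is a hypothetical name, and higher scores are assumed to be better). evalml's exact rule may differ.

```python
def early_stop_index(scores, patience, tolerance=0.0):
    """Return the iteration index at which the search would stop, or None.

    Assumes higher scores are better.
    """
    best = None
    stale = 0  # consecutive iterations without a significant improvement
    for i, score in enumerate(scores):
        if best is None or score > best + tolerance:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return i
    return None


history = [0.70, 0.75, 0.751, 0.749, 0.752, 0.90]
stop_at = early_stop_index(history, patience=3, tolerance=0.01)
never = early_stop_index(history, patience=5, tolerance=0.01)
```

With `patience=3` the search stops before reaching the late jump to 0.90, which illustrates the trade-off: a small patience saves time but can miss late improvements.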
Fixes
- Lower botocore requirement 235
- Fixed decision_function calculation for FraudCost objective 254
- Fixed return value of Recall metrics 264
- Components return self on fit 289
Changes
- Renamed automl classes to AutoRegressionSearch and AutoClassificationSearch 287
- Updating demo datasets to retain column names 223
- Moving pipeline visualization to PipelinePlots class 228
- Standardizing inputs as pd.DataFrame / pd.Series 130
- Enforcing that pipelines must have an estimator as last component 277
- Added ipywidgets as a dependency in requirements.txt 278
- Added Random and Grid Search Tuners 240
Documentation Changes
- Adding class properties to API reference 244
- Fix and filter FutureWarnings from scikit-learn 249, 257
- Adding Linear Regression to API reference and cleaning up some Sphinx warnings 227
Testing Changes
- Added support for testing on Windows with CircleCI 226
- Added support for doctests 233

Warning

Breaking Changes

The fit() method for AutoClassifier and AutoRegressor has been renamed to search().
AutoClassifier has been renamed to AutoClassificationSearch
AutoRegressor has been renamed to AutoRegressionSearch
AutoClassificationSearch.results and AutoRegressionSearch.results are now dictionaries with pipeline_results and search_order keys. pipeline_results gives access to a dictionary identical to the old .results dictionary, while search_order returns a list of pipeline_id values in the order they were searched.
Pipelines now require an estimator as the last component in component_list. Slicing pipelines now throws a NotImplementedError to avoid returning pipelines without an estimator.
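A sketch of reading the new results layout described above. The dictionary below is hand-built for illustration: only the `pipeline_results` and `search_order` keys come from this changelog, while the per-pipeline fields (`pipeline_name`, `score`) and their values are made-up placeholders.

```python
# Hand-built example of the new layout (placeholder fields and values).
results = {
    "pipeline_results": {
        0: {"pipeline_name": "example A", "score": 0.81},
        1: {"pipeline_name": "example B", "score": 0.86},
        2: {"pipeline_name": "example C", "score": 0.79},
    },
    "search_order": [0, 1, 2],
}

# Walk the pipelines in the order they were searched.
ordered = [results["pipeline_results"][pid] for pid in results["search_order"]]

# Code written against the old flat `.results` can use the nested dict directly.
old_style_results = results["pipeline_results"]
best_id = max(old_style_results, key=lambda pid: old_style_results[pid]["score"])
```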

v0.5.2 Nov. 18, 2019

Enhancements
- Adding basic pipeline structure visualization 211
Documentation Changes
- Added notebooks to build process 212

v0.5.1 Nov. 15, 2019

Enhancements
- Added basic outlier detection guardrail 151
- Added basic ID column guardrail 135
- Added support for unlimited pipelines with a max_time limit 70
- Updated .readthedocs.yaml to successfully build 188
Fixes
- Removed MSLE from default additional objectives 203
- Fixed random_state passed in pipelines 204
- Fixed slow down in RFRegressor 206
Changes
- Pulled information for describe_pipeline from pipeline's new describe method 190
- Refactored pipelines 108
- Removed guardrails from Auto(*) 202, 208
Documentation Changes
- Updated documentation to show max_time enhancements 189
- Updated release instructions for RTD 193
- Added notebooks to build process 212
- Added contributing instructions 213
- Added new content 222

v0.5.0 Oct. 29, 2019

Enhancements
- Added basic one hot encoding 73
- Use enums for model_type 110
- Support for splitting regression datasets 112
- Auto-infer multiclass classification 99
- Added support for other units in max_time 125
- Detect highly null columns 121
- Added additional regression objectives 100
- Show an interactive iteration vs. score plot when using fit() 134
Fixes
- Reordered describe_pipeline 94
- Added type check for model_type 109
- Fixed s units when setting string max_time 132
- Fix objectives not appearing in API documentation 150
Changes
- Reorganized tests 93
- Moved logging to its own module 119
- Show progress bar history 111
- Using cloudpickle instead of pickle to allow unloading of custom objectives 113
- Removed render.py 154
Documentation Changes
- Update release instructions 140
- Include additional_objectives parameter 124
- Added Changelog 136
Testing Changes
- Code coverage 90
- Added CircleCI tests for other Python versions 104
- Added doc notebooks as tests 139
- Test metadata for CircleCI and 2 core parallelism 137

v0.4.1 Sep. 16, 2019

Enhancements
- Added AutoML for classification and regression using Autobase and Skopt 7 9
- Implemented standard classification and regression metrics 7
- Added logistic regression, random forest, and XGBoost pipelines 7
- Implemented support for custom objectives 15
- Feature importance for pipelines 18
- Serialization for pipelines 19
- Allow fitting on objectives for optimal threshold 27
- Added detect label leakage 31
- Implemented callbacks 42
- Allow for multiclass classification 21
- Added support for additional objectives 79
Fixes
- Fixed feature selection in pipelines 13
- Made random_seed usage consistent 45
Documentation Changes
- Added docstrings 6
- Created notebooks for docs 6
- Initialized readthedocs EvalML 6
- Added favicon 38
Testing Changes
- Added testing for loading data 39

v0.2.0 Aug. 13, 2019

Enhancements
- Created fraud detection objective 4

v0.1.0 July 31, 2019

First Release
Enhancements
- Added lead scoring objective 1
- Added basic classifier 1
Documentation Changes
- Initialized Sphinx for docs 1
