09 Dec 18:07

chukarsten

389d4f6

v0.39.0

v0.39.0 Dec. 9, 2021

Enhancements

Renamed DelayedFeatureTransformer to TimeSeriesFeaturizer and enhanced it to compute rolling features #3028
Added ability to impute only specific columns in PerColumnImputer #3123
Added TimeSeriesParametersDataCheck to verify the time series parameters are valid given the number of splits in cross validation #3111
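
As a rough illustration of what "delayed" and "rolling" features mean in the TimeSeriesFeaturizer entry above, here is a minimal sketch in plain Python. It is not EvalML's implementation: the helper names `lag` and `rolling_mean` are hypothetical, and the choice of a trailing window that includes the current value is an assumption made for simplicity (the release note does not say how the component aligns its windows).

```python
from statistics import mean


def lag(values, k):
    """Shift a series forward by k steps, padding the start with None."""
    return ([None] * k + list(values))[: len(values)]


def rolling_mean(values, window):
    """Mean over each trailing window; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(mean(values[i + 1 - window : i + 1]))
    return out


target = [10, 12, 11, 15, 14]
lag_1 = lag(target, 1)            # previous observation for each row
roll_3 = rolling_mean(target, 3)  # 3-step trailing average
```

Both helpers return a list the same length as the input, so the derived columns line up row by row with the original series.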

Fixes

Default parameters for RFRegressorSelectFromModel and RFClassifierSelectFromModel have been fixed to avoid selecting all features #3110

Changes

Removed reliance on a datetime index for ARIMARegressor and ProphetRegressor #3104
Included target leakage check when fitting ARIMARegressor to account for the lack of TimeSeriesFeaturizer in ARIMARegressor based pipelines #3104
Cleaned up and refactored InvalidTargetDataCheck implementation and docstring #3122
Removed indices information from the output of HighlyNullDataCheck's validate() method #3092
Added ReplaceNullableTypes component to prepare for handling pandas nullable types #3090
Removed unused EnsembleMissingPipelinesError exception definition #3131

Documentation Changes

Testing Changes

Refactored tests to avoid using importorskip #3126
Added skip_during_conda test marker to skip tests that are not supposed to run during conda build #3127
Added skip_if_39 test marker to skip tests that are not supposed to run during python 3.9 #3133

Breaking Changes

Renamed DelayedFeatureTransformer to TimeSeriesFeaturizer #3028
ProphetRegressor now requires a datetime column in X represented by the date_index parameter #3104
Renamed module evalml.data_checks.invalid_target_data_check to evalml.data_checks.invalid_targets_data_check #3122
Removed unused EnsembleMissingPipelinesError exception definition #3131
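
Code that has to run against releases on both sides of a rename like the ones above can try the new name first and fall back to the old one. The helper below is a generic sketch, not part of EvalML; `import_first` is a hypothetical name, and it is demonstrated with the standard library so that it runs without EvalML installed.

```python
import importlib


def import_first(candidates):
    """Return the first attribute importable from a list of (module, name) pairs."""
    errors = []
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{module_name}.{attr}: {exc}")
    raise ImportError("none of the candidates could be imported:\n" + "\n".join(errors))


# Standard-library demonstration: the first candidate does not exist,
# so the helper falls through to the second one.
OrderedDict = import_first(
    [
        ("collections", "NoSuchName"),
        ("collections", "OrderedDict"),
    ]
)
```

For the renames in this release, the candidate list would name TimeSeriesFeaturizer before DelayedFeatureTransformer, or evalml.data_checks.invalid_targets_data_check before evalml.data_checks.invalid_target_data_check. The package path from which the component classes are imported is not given in these notes, so check it against the installed version.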

29 Nov 19:36

chukarsten

v0.38.0

5de7049

v0.38.0 Nov. 29, 2021

Enhancements

Added data_check_name attribute to the data check action class #3034
Added NumWords and NumCharacters primitives to TextFeaturizer and renamed TextFeaturizer to NaturalLanguageFeaturizer #3030
Added support for scikit-learn > 1.0.0 #3051
Required the date_index parameter to be specified for time series problems in AutoMLSearch #3041
Allowed time series pipelines to predict on test datasets whose length is less than or equal to the forecast_horizon. Also allowed the test set index to start at 0. #3071
Enabled time series pipelines to predict on data with features that are not known in advance #3094

Fixes

Added an error message when fit and predict/predict_proba data types are different #3036
Fixed bug where ensembling components could not get converted to JSON format #3049
Fixed bug where components with tuned integer hyperparameters could not get converted to JSON format #3049
Included confusion matrix at the pipeline threshold for find_confusion_matrix_per_threshold #3080
Fixed bug where One Hot Encoder would error out if a non-categorical feature had a missing value #3083
Fixed bug where features created from categorical columns by Delayed Feature Transformer would be inferred as categorical #3083

Changes

Deleted predict_uses_y estimator attribute #3069
Changed DateTimeFeaturizer to use corresponding Featuretools primitives #3081
Updated TargetDistributionDataCheck to return metadata details as floats rather than strings #3085
Removed dependency on psutil package #3093

Documentation Changes

Updated docs to use data check action methods rather than manually cleaning data #3050

Testing Changes

Updated integration tests to use make_pipeline_from_actions instead of private method #3047

Breaking Changes

Added data_check_name attribute to the data check action class #3034
Renamed TextFeaturizer to NaturalLanguageFeaturizer #3030
Updated the Pipeline.graph_json function to return a dictionary of "from" and "to" edges instead of tuples #3049
Deleted predict_uses_y estimator attribute #3069
Changed time series problems in AutoMLSearch to require a non-None date_index #3041
Changed the DelayedFeatureTransformer to throw a ValueError during fit if the date_index is None #3041
Passing X=None to DelayedFeatureTransformer is deprecated #3041
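
For consumers of Pipeline.graph_json, the edge format change above can be bridged with a small conversion. This is a sketch under an assumption: the note only says edges are now dictionaries of "from" and "to", so the exact layout used here (one dictionary per edge) and the helper names are illustrative rather than taken from the library.

```python
def edges_to_dicts(edges):
    """Convert (source, destination) tuples to {"from": ..., "to": ...} dicts."""
    return [{"from": src, "to": dst} for src, dst in edges]


def dicts_to_edges(edge_dicts):
    """Convert {"from": ..., "to": ...} dicts back to (source, destination) tuples."""
    return [(edge["from"], edge["to"]) for edge in edge_dicts]


# Hypothetical edge list in the old tuple style.
old_edges = [("X", "Imputer"), ("Imputer", "Random Forest Classifier")]
new_edges = edges_to_dicts(old_edges)
round_trip = dicts_to_edges(new_edges)
```

Code that previously unpacked each edge as a tuple would instead read the "from" and "to" keys, or call a helper like `dicts_to_edges` at the boundary.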

10 Nov 17:30

chukarsten

v0.37.0

25808fb

v0.37.0 Nov. 10, 2021

Enhancements

Added find_confusion_matrix_per_threshold to Model Understanding #2972
Limited computationally intensive models during AutoMLSearch for certain multiclass problems; they can be opted back in with the allow_long_running_models parameter #2982
Added support for stacked ensemble pipelines to prediction explanations module #2971
Added integration tests for data checks and data checks actions workflow #2883
Added a change in pipeline structure to handle categorical columns separately for pipelines in DefaultAlgorithm #2986
Added an algorithm to DelayedFeatureTransformer to select better lags #3005
Added AutoML function to access an ensemble pipeline's input pipeline IDs #3011
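
To make the idea behind find_confusion_matrix_per_threshold concrete, the sketch below computes binary confusion-matrix counts at several decision thresholds in plain Python. It does not call EvalML and does not reproduce the function's signature or output format; the helper name and the `>=` comparison at the threshold are assumptions.

```python
def confusion_at_threshold(y_true, y_prob, threshold):
    """Count TP/FP/TN/FN when predicting positive for prob >= threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, prob in zip(y_true, y_prob):
        predicted = prob >= threshold
        if predicted and actual:
            counts["tp"] += 1
        elif predicted and not actual:
            counts["fp"] += 1
        elif not predicted and actual:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


y_true = [1, 0, 1, 1, 0]
y_prob = [0.9, 0.4, 0.35, 0.8, 0.1]
per_threshold = {t: confusion_at_threshold(y_true, y_prob, t) for t in (0.3, 0.5, 0.7)}
```

Lowering the threshold trades false negatives for false positives, which is why inspecting the matrix across thresholds is useful when tuning a binary classifier.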

Fixes

Fixed bug where Oversampler didn't consider boolean columns to be categorical #2980
Fixed permutation importance failing when target is categorical #3017
Updated estimator and pipelines' predict, predict_proba, transform, inverse_transform methods to preserve input indices #2979
Updated demo dataset link for daily min temperatures #3023

Changes

Updated OutliersDataCheck and UniquenessDataCheck and allowed for the suspension of the Nullable types error #3018

Documentation Changes

Fixed cost benefit matrix demo formatting #2990
Updated ReadMe.md with new badge links and updated installation instructions for conda #2998
Added more comprehensive doctests #3002

27 Oct 22:14

chukarsten

v0.36.0

59b6664

v0.36.0 Oct. 27, 2021

Enhancements

Added LIME as an algorithm option for explain_predictions and explain_predictions_best_worst #2905
Standardized data check messages and added default "rows" and "columns" to data check message details dictionary #2869
Added rows_of_interest to pipeline utils #2908
Added support for woodwork version 0.8.2 #2909
Enhanced the DateTimeFeaturizer to handle NaNs in date features #2909
Added support for woodwork logical types PostalCode, SubRegionCode, and CountryCode in model understanding tools #2946
Added Vowpal Wabbit regressor and classifiers #2846

Fixes

Fixed bug where partial dependence was not respecting the ww schema #2929
Fixed calculate_permutation_importance for datetimes on StandardScaler #2938
Fixed SelectColumns to only select available features for feature selection in DefaultAlgorithm #2944
Fixed DropColumns component not receiving parameters in DefaultAlgorithm #2945
Fixed bug where trained binary thresholds were not being returned by get_pipeline or clone #2948
Fixed bug where Oversampler selected ww logical categorical instead of ww semantic category #2946

Changes

Changed make_pipeline function to place the DateTimeFeaturizer prior to the Imputer so that NaN dates can be imputed #2909
Refactored OutliersDataCheck and HighlyNullDataCheck to add more descriptive metadata #2907

Documentation Changes

Added back Future Release section to release notes #2927
Updated CI to run doctest (docstring tests) and apply necessary fixes to docstrings #2933
Added documentation for BinaryClassificationPipeline thresholding #2937

Testing Changes

Fixed dependency checker to catch full names of packages #2930
Refactored build_conda_pkg to work from a local recipe #2925

Breaking Changes

Standardized data check messages and added default "rows" and "columns" to data check message details dictionary. This may change the number of messages returned from a data check. #2869

15 Oct 02:32

chukarsten

v0.35.0

c4475d9

v0.35.0 Oct. 14, 2021

Enhancements

Added human-readable pipeline explanations to model understanding #2861
Updated to support Featuretools 1.0.0 and nlp-primitives 2.0.0 #2848

Fixes

Fixed bug where long mode for the top level search method was not respected #2875
Pinned cmdstan to 0.28.0 in cmdstan-builder to prevent future breaking of support for Prophet #2880
Added Jarque-Bera to the TargetDistributionDataCheck #2891
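
For readers unfamiliar with the Jarque-Bera test mentioned above: it measures how far a sample's skewness and kurtosis are from those of a normal distribution. The sketch below computes the textbook statistic in plain Python; how TargetDistributionDataCheck applies it (sample-size cutoffs, significance level) is not stated in these notes and is not modeled here.

```python
import math


def jarque_bera(values):
    """Return (statistic, asymptotic p-value) for the Jarque-Bera normality test."""
    n = len(values)
    mu = sum(values) / n
    # Biased central moments of order 2, 3 and 4.
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("Jarque-Bera is undefined for constant data")
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Under normality the statistic is asymptotically chi-squared with
    # 2 degrees of freedom, whose survival function is exp(-x / 2).
    p_value = math.exp(-statistic / 2)
    return statistic, p_value


jb, p = jarque_bera([1, 2, 3, 4, 5])
```

A large statistic (small p-value) suggests the data are not normally distributed; the chi-squared approximation is only reliable for reasonably large samples.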

Changes

Updated pipelines to use a label encoder component instead of doing encoding on the pipeline level #2821
Deleted scikit-learn ensembler #2819
Refactored pipeline building logic out of AutoMLSearch and into IterativeAlgorithm #2854
Refactored names for methods in ComponentGraph and PipelineBase #2902

Documentation Changes

Updated install.ipynb to reflect flexibility for cmdstan version installation #2880
Updated the conda section of our contributing guide #2899

Testing Changes

Updated test_all_estimators to account for Prophet being allowed for Python 3.9 #2892
Updated linux tests to use cmdstan-builder==0.0.8 #2880

Breaking Changes

Updated pipelines to use a label encoder component instead of doing encoding on the pipeline level. This means that pipelines will no longer automatically encode non-numerical targets. Please use a label encoder if working with classification problems and non-numeric targets. #2821
Deleted scikit-learn ensembler #2819
IterativeAlgorithm now takes X, y, and problem_type as required arguments, and sampler_name, allowed_model_families, allowed_component_graphs, max_batches, and verbose as optional arguments #2854
Changed method names of fit_features and compute_final_component_features to fit_and_transform_all_but_final and transform_all_but_final in ComponentGraph, and compute_estimator_features to transform_all_but_final in pipeline classes #2902
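
Because pipelines no longer encode non-numeric targets automatically, the encoding step has to be explicit. The sketch below shows what label encoding does, in plain Python; it is not EvalML's label encoder component, and `fit_label_encoder` is a hypothetical helper.

```python
def fit_label_encoder(labels):
    """Map each distinct label to an integer, in sorted label order."""
    classes = sorted(set(labels))
    to_int = {label: index for index, label in enumerate(classes)}
    return classes, to_int


y = ["spam", "ham", "spam", "eggs"]
classes, to_int = fit_label_encoder(y)

encoded = [to_int[label] for label in y]      # integers for the estimator
decoded = [classes[index] for index in encoded]  # back to the original labels
```

Keeping the `classes` list around is what allows predictions to be mapped back to the original labels after the estimator returns integers.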

01 Oct 19:22

chukarsten

v0.34.1rc1

8dc4d47

v0.34.1rc1 Pre-release

v0.34.1rc1 Oct. 1, 2021

Enhancements

Updated to support Featuretools 1.0.0 and nlp-primitives 2.0.0 #2848

01 Oct 16:46

chukarsten

v0.34.0

40ad6f5

v0.34.0 Oct. 1, 2021

Enhancements

Updated to work with Woodwork 0.8.1 #2783
Added validation that training_data and training_target are not None in prediction explanations #2787
Added support for training-only components in pipelines and component graphs #2776
Added default argument for the parameters value for ComponentGraph.instantiate #2796
Added TIME_SERIES_REGRESSION to LightGBMRegressor's supported problem types #2793
Added validation to holdout data passed to predict and predict_proba for time series #2804
Added information about which row indices are outliers in OutliersDataCheck #2818
Added verbose flag to top level search() method #2813
Added support for linting Jupyter notebooks and clearing the executed cells and empty cells #2829 #2837
Added "DROP_ROWS" action to output of OutliersDataCheck.validate() #2820
Added the ability of AutoMLSearch to accept a SequentialEngine instance as engine input #2838
Added new label encoder component to EvalML #2853
Added our own partial dependence implementation #2834

Fixes

Fixed bug where calculate_permutation_importance was not calculating the right value for pipelines with target transformers #2782
Fixed bug where transformed target values were not used in fit for time series pipelines #2780
Fixed bug where score_pipelines method of AutoMLSearch would not work for time series problems #2786
Removed TargetTransformer class #2833
Added tests to verify ComponentGraph support by pipelines #2830
Fixed incorrect parameter for baseline regression pipeline in AutoMLSearch #2847

Changes

Changed woodwork initialization to use partial schemas #2774
Made Transformer.transform() an abstract method #2744
Deleted EmptyDataChecks class #2794
Removed data check for checking log distributions in make_pipeline #2806
Changed the minimum woodwork version to 0.8.0 #2783
Pinned woodwork version to 0.8.0 #2832
Removed model_family attribute from ComponentBase and transformers #2828
Limited scikit-learn until new features and errors can be addressed #2842
Added a DeprecationWarning that is shown when Sklearn Ensemblers are called #2859

Testing Changes

Updated matched assertion message regarding monotonic indices in polynomial detrender tests #2811
Added a test to make sure pip versions match conda versions #2851

Breaking Changes

Made Transformer.transform() an abstract method #2744
Deleted EmptyDataChecks class #2794
Removed data check for checking log distributions in make_pipeline #2806
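
The practical effect of making Transformer.transform() abstract is that custom transformers must define their own transform. The sketch below shows the general Python mechanism with the standard `abc` module; the class names are hypothetical stand-ins, and whether EvalML enforces this in exactly the same way (at instantiation time) is an assumption.

```python
from abc import ABC, abstractmethod


class TransformerSketch(ABC):
    """Hypothetical stand-in for a transformer base class."""

    def fit(self, X, y=None):
        return self

    @abstractmethod
    def transform(self, X, y=None):
        """Subclasses must provide their own transform."""


class NoTransform(TransformerSketch):
    pass  # does not override transform, so it stays abstract


class Doubler(TransformerSketch):
    def transform(self, X, y=None):
        return [2 * value for value in X]


try:
    NoTransform()
    instantiated = True
except TypeError:
    instantiated = False  # abstract classes cannot be instantiated

doubled = Doubler().fit([1, 2]).transform([1, 2, 3])
```

A subclass that previously relied on an inherited default transform would need to add its own implementation, as `Doubler` does here.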

15 Sep 20:25

chukarsten

v0.33.0

ce3fc7a

v0.33.0 Sep. 15, 2021

Enhancements

Fixes

Fixed bug where warnings during make_pipeline were not being raised to the user #2765

Changes

Refactored and removed SamplerBase class #2775

Documentation Changes

Added docstring linting packages pydocstyle and darglint to make-lint command #2670

10 Sep 21:03

chukarsten

v0.32.1

ca2bd17

v0.32.1 Sep. 10, 2021

Enhancements

Added verbose flag to AutoMLSearch to run search in silent mode by default #2645
Added label encoder to XGBoostClassifier to remove the warning #2701
Set eval_metric to logloss for XGBoostClassifier #2741
Added support for woodwork versions 0.7.0 and 0.7.1 #2743
Changed explain_predictions functions to display original feature values #2759
Added X_train and y_train to graph_prediction_vs_actual_over_time and get_prediction_vs_actual_over_time_data #2762
Added forecast_horizon as a required parameter to time series pipelines and AutoMLSearch #2697
Added predict_in_sample and predict_proba_in_sample methods to time series pipelines to predict on data where the target is known, e.g. cross-validation #2697

Fixes

Fixed bug where _catch_warnings assumed all warnings were PipelineNotUsed #2753
Fixed bug where Imputer.transform would erase ww typing information prior to handing data to the SimpleImputer #2752
Fixed bug where Oversampler could not be copied #2755

Changes

Deleted drop_nan_target_rows utility method #2737
Removed default logging setup and debugging log file #2645
Changed the default n_jobs value for XGBoostClassifier and XGBoostRegressor to 12 #2757
Changed TimeSeriesBaselineEstimator to only work on a time series pipeline with a DelayedFeatureTransformer #2697
Added X_train and y_train as optional parameters to pipeline predict, predict_proba. Only used for time series pipelines #2697
Added training_data and training_target as optional parameters to explain_predictions and explain_predictions_best_worst to support time series pipelines #2697
Changed time series pipeline predictions to no longer output series/dataframes padded with NaNs. A prediction will be returned for every row in the X input #2697

Documentation Changes

Specified installation steps for Prophet #2713
Added documentation for data exploration on data check actions #2696
Added a user guide entry for time series modelling #2697

Testing Changes

Fixed flaky TargetDistributionDataCheck test for very_lognormal distribution #2748

Breaking Changes

Removed default logging setup and debugging log file #2645
Added X_train and y_train to graph_prediction_vs_actual_over_time and get_prediction_vs_actual_over_time_data #2762
Added forecast_horizon as a required parameter to time series pipelines and AutoMLSearch #2697
Changed TimeSeriesBaselineEstimator to only work on a time series pipeline with a DelayedFeatureTransformer #2697
Added X_train and y_train as required parameters for predict and predict_proba in time series pipelines #2697
Added training_data and training_target as required parameters to explain_predictions and explain_predictions_best_worst for time series pipelines #2697

02 Sep 00:42

chukarsten

v0.32.0

9352922

v0.32.0 Sep. 1, 2021

Enhancements

Allowed a string for the engine parameter of AutoMLSearch #2667
Added ProphetRegressor to AutoML #2619
Integrated DefaultAlgorithm into AutoMLSearch #2634
Removed SVM "linear" and "precomputed" kernel hyperparameter options, and improved default parameters #2651
Updated ComponentGraph initialization to raise ValueError when user attempts to use .y for a component that does not produce a tuple output #2662
Updated to support Woodwork 0.6.0 #2690
Updated pipeline graph() to distinguish X and y edges #2654
Added DropRowsTransformer component #2692
Added DROP_ROWS to _make_component_list_from_actions and clean up metadata #2694

Fixes

Updated Oversampler logic to select best SMOTE based on component input instead of pipeline input #2695
Added ability to explicitly close DaskEngine resources to improve runtime and reduce Dask warnings #2667
Fixed partial dependence bug for ensemble pipelines #2714
Updated TargetLeakageDataCheck to maintain user-selected logical types #2711

Changes

Replaced SMOTEOversampler, SMOTENOversampler and SMOTENCOversampler with consolidated Oversampler component #2695
Removed LinearRegressor from the list of default AutoMLSearch estimators due to poor performance #2660

Documentation Changes

Updated documentation to make parallelization of AutoML clearer #2667

Testing Changes

Removed the process-level parallelism from the test_cancel_job test #2666
Installed numba 0.53 in windows CI to prevent problems installing version 0.54 #2710

Breaking Changes

Renamed the current top level search method to search_iterative and defined a new search method for the DefaultAlgorithm #2634
Replaced SMOTEOversampler, SMOTENOversampler and SMOTENCOversampler with consolidated Oversampler component #2695
Removed LinearRegressor from the list of default AutoMLSearch estimators due to poor performance #2660

Releases: alteryx/evalml
