v0.15.0

@dsherry released this 29 Oct 22:59
1ec2ee4

v0.15.0 Oct. 29, 2020

Enhancements

  • Added stacked ensemble component classes (StackedEnsembleClassifier, StackedEnsembleRegressor) #1134
  • Added stacked ensemble components to AutoMLSearch #1253
  • Added DecisionTreeClassifier and DecisionTreeRegressor to AutoML #1255
  • Added graph_prediction_vs_actual in model_understanding for regression problems #1252
  • Added parameter to OneHotEncoder to select which features to encode #1249
  • Added percent-better-than-baseline for all objectives to automl.results #1244
  • Added HighVarianceCVDataCheck and replaced synonymous warning in AutoMLSearch #1254
  • Added PCA Transformer component for dimensionality reduction #1270
  • Added generate_pipeline_code and generate_component_code to allow for code generation given a pipeline or component instance #1306
  • Updated AutoMLSearch to support Woodwork data structures #1299
  • Added cv_folds to ClassImbalanceDataCheck and added this check to DefaultDataChecks #1333
  • Made max_batches argument to AutoMLSearch.search public #1320
  • Added text support to automl search #1062
  • Added _pipelines_per_batch as a private argument to AutoMLSearch #1355
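
The percent-better-than-baseline entry (#1244) can be read as a relative improvement over a baseline score. The sketch below is only an illustration of that idea under stated assumptions: the function name, the sign convention, and the zero-baseline handling are hypothetical and are not taken from the EvalML source.

```python
import math


def percent_better_than_baseline(score, baseline, greater_is_better=True):
    """Illustrative only: relative improvement over a baseline, in percent.

    This is an assumed definition, not EvalML's implementation.
    """
    if baseline == 0:
        # Relative change is undefined against a zero baseline.
        return math.nan
    change = (score - baseline) / abs(baseline)
    if not greater_is_better:
        # For objectives such as log loss, a lower score is an improvement.
        change = -change
    return 100.0 * change


# A score of 0.9 against a baseline of 0.5 on a greater-is-better objective.
auc_gain = percent_better_than_baseline(0.9, 0.5, greater_is_better=True)

# A score of 0.2 against a baseline of 0.8 on a lower-is-better objective.
log_loss_gain = percent_better_than_baseline(0.2, 0.8, greater_is_better=False)
```

Under this assumed definition, a positive value means the pipeline beat the baseline regardless of whether the objective is maximized or minimized.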

Fixes

  • Fixed ML performance issue with ordered datasets: always shuffle data in automl's default CV splits #1265
  • Fixed broken evalml info CLI command #1293
  • Fixed boosting type='rf' for LightGBM Classifier, as well as num_leaves error #1302
  • Fixed bug in explain_predictions_best_worst where a custom index in the target variable would cause a ValueError #1318
  • Added stacked ensemble estimators to evalml.pipelines.__init__ file #1326
  • Fixed bug in OHE where calls to transform were not deterministic if top_n was less than the number of categories in a column #1324
  • Fixed LightGBM warning messages during AutoMLSearch #1342
  • Fixed warnings thrown during AutoMLSearch in HighVarianceCVDataCheck #1346
  • Fixed bug where TrainingValidationSplit would return invalid location indices for dataframes with a custom index #1348
  • Fixed bug where the AutoMLSearch random_state was not being passed to the created pipelines #1321
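
To see why the ordered-dataset fix (#1265) matters, the following sketch splits a toy target that is sorted by class into contiguous folds, with and without shuffling. It uses only the standard library; the helper `k_fold_indices` is hypothetical and does not reproduce how EvalML builds its default CV splits.

```python
import random


def k_fold_indices(n_rows, n_folds, shuffle, seed=0):
    """Hypothetical helper: split row positions into equally sized folds.

    Rows left over when n_rows is not divisible by n_folds are dropped,
    which keeps the sketch short.
    """
    indices = list(range(n_rows))
    if shuffle:
        # A seeded generator keeps the shuffled split reproducible.
        random.Random(seed).shuffle(indices)
    fold_size = n_rows // n_folds
    return [indices[i * fold_size:(i + 1) * fold_size] for i in range(n_folds)]


# Toy target sorted by class: six rows of class 0, then six rows of class 1.
y = [0] * 6 + [1] * 6

ordered_folds = k_fold_indices(len(y), n_folds=2, shuffle=False)
shuffled_folds = k_fold_indices(len(y), n_folds=2, shuffle=True)

# Without shuffling, each fold contains a single class, so a model trained
# on one fold never sees the class it is validated on.
ordered_classes = [sorted({y[i] for i in fold}) for fold in ordered_folds]
print(ordered_classes)

# With shuffling, the rows of each class are spread across folds at random.
shuffled_classes = [sorted({y[i] for i in fold}) for fold in shuffled_folds]
print(shuffled_classes)
```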

Changes

  • Allow add_to_rankings to be called before AutoMLSearch is called #1250
  • Removed Graphviz from test-requirements to add to requirements.txt #1327
  • Removed max_pipelines parameter from AutoMLSearch #1264
  • Include editable installs in all install make targets #1335
  • Made pip dependencies featuretools and nlp_primitives core dependencies #1062
  • Removed PartOfSpeechCount from TextFeaturizer transform primitives #1062
  • Added warning for partial_dependency when the feature includes null values #1352

Documentation Changes

  • Fixed and updated code blocks in Release Notes #1243
  • Added DecisionTree estimators to API Reference #1246
  • Changed class inheritance display to flow vertically #1248
  • Updated cost-benefit tutorial to use a holdout/test set #1159
  • Added evalml info command to documentation #1293
  • Miscellaneous doc updates #1269
  • Removed conda pre-release testing from the release process document #1282
  • Updates to contributing guide #1310
  • Added Alteryx footer to docs with Twitter and Github link #1312
  • Added documentation for evalml installation for Python 3.6 #1322
  • Added documentation changes to make the API Docs easier to understand #1323
  • Fixed documentation for feature_importance #1353
  • Added tutorial for running AutoML with text data #1357
  • Added documentation for woodwork integration with automl search #1361

Testing Changes

  • Added tests for jupyter_check to handle IPython #1256
  • Cleaned up make_pipeline tests to test for all estimators #1257
  • Added a test to check conda build after merge to main #1247
  • Removed unnecessary code for __main__.py that was lacking codecov coverage #1293
  • Codecov: round coverage up instead of down #1334
  • Added DockerHub credentials to CI testing environment #1356
  • Added DockerHub credentials to conda testing environment #1363

Breaking Changes

  • Renamed LabelLeakageDataCheck to TargetLeakageDataCheck #1319
  • max_pipelines parameter has been removed from AutoMLSearch. Please use max_iterations instead. #1264
  • AutoMLSearch.search() will now log a warning if the input is not a Woodwork data structure (for example, pandas or numpy input) #1299
  • Made max_batches argument to AutoMLSearch.search public #1320
  • Removed unused argument feature_types from AutoMLSearch.search #1062
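
For callers that build their AutoMLSearch arguments as dictionaries, the breaking changes above could be handled with a small adapter along these lines. This is a minimal sketch: both helper functions are hypothetical, they only rewrite plain dictionaries, and they do not import or call EvalML. The example keys other than `max_pipelines`, `max_iterations`, and `feature_types` are illustrative.

```python
def migrate_automl_kwargs(kwargs):
    """Hypothetical helper: adapt pre-0.15.0 AutoMLSearch keyword arguments."""
    migrated = dict(kwargs)  # copy so the caller's dict is left untouched
    if "max_pipelines" in migrated:
        value = migrated.pop("max_pipelines")
        # max_pipelines was removed (#1264); max_iterations replaces it.
        # An explicitly supplied max_iterations takes precedence.
        migrated.setdefault("max_iterations", value)
    return migrated


def migrate_search_kwargs(kwargs):
    """Hypothetical helper: drop the removed feature_types argument (#1062)."""
    migrated = dict(kwargs)
    migrated.pop("feature_types", None)
    return migrated


old_init = {"problem_type": "binary", "max_pipelines": 5}
new_init = migrate_automl_kwargs(old_init)

old_search = {"feature_types": ["numeric"], "show_iteration_plot": False}
new_search = migrate_search_kwargs(old_search)
```

The rename of LabelLeakageDataCheck to TargetLeakageDataCheck (#1319) is not covered by this adapter and needs a change at the import site.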