Releases: google/yggdrasil-decision-forests

Python API 0.4.3

08 May 13:53

Python API - Changelog

Feature

  • Add model.to_jax_function() function to convert a YDF model into a JAX
    function that can be combined with other JAX operations.
  • Print warnings when categorical features look like numbers.
  • Add support for Python 3.12.
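
The warning about categorical features that look like numbers can be pictured with a small heuristic. The sketch below is only an illustration: the helper name looks_numerical and the 95% threshold are assumptions made for this example, not the check YDF actually performs.

```python
def looks_numerical(values, threshold=0.95):
    """Return True if at least `threshold` of the non-empty strings parse as floats.

    Hypothetical helper for illustration; not part of the YDF API.
    """
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return False
    parsed = 0
    for v in non_empty:
        try:
            float(v)
            parsed += 1
        except ValueError:
            pass  # Not a number; leave the counter unchanged.
    return parsed / len(non_empty) >= threshold


zip_codes = ["94043", "10001", "60601"]  # Strings, but every value parses as a number.
colors = ["red", "green", "blue"]

print(looks_numerical(zip_codes))  # True
print(looks_numerical(colors))     # False
```

A column such as zip_codes is a typical case where a warning is useful: the values are categorical in meaning even though they parse as numbers.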

Fix

  • Fix cross-validation for non-classification learners.
  • Fix missing ydf/model/tree/plotter.js
  • Solve dependency collision of YDF Proto between PYDF and TF-DF.

Python API 0.4.1

19 Apr 13:21

Python API - Changelog

Fix

  • Solve a dependency collision on YDF between PYDF and TF-DF. If TF-DF was
    installed after PYDF, importing YDF failed with a "has no attribute 'DType'" error.
  • Allow for training on cached TensorFlow dataset.

Python API 0.4.0

12 Apr 20:41

Python API - 0.4.0 - 2024-04-10

Feature

  • Multi-dimensional features can be selected / configured with the features=
    training argument.
  • Programmatic access to partial dependence plots and variable importances.
  • Add model.to_tensorflow_function() function to convert a YDF model into a
    TensorFlow function that can be combined with other TensorFlow operations.
    This function is compatible with Keras 2 and Keras 3.
  • Add arguments servo_api=False and feed_example_proto=False for
    model.to_tensorflow_function(mode="tf") to export a TensorFlow SavedModel
    that, respectively, follows the Servo API and consumes serialized
    TensorFlow Example protos.
  • Add pre_processing and post_processing arguments to the
    model.to_tensorflow_function function to pack pre/post processing
    operations in a TensorFlow SavedModel.

Tutorials

Python API 0.3.0

Python API 0.3.0 - 2024-03-15

Breaking

  • Custom losses must now provide the gradient instead of the negative of the
    gradient.
  • Clarified that YDF may modify numpy arrays returned by a custom loss
    function.
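
The sign convention in the first item can be illustrated with squared error. This is a minimal sketch in plain Python; the function names are hypothetical and the code does not use the YDF custom loss API. It only shows the difference between the gradient, which custom losses are now expected to return, and its negative.

```python
def squared_error_loss(label, prediction):
    # L(y, p) = 0.5 * (p - y)^2
    return 0.5 * (prediction - label) ** 2


def gradient(label, prediction):
    # dL/dp = p - y. This is the quantity the changelog entry refers to.
    return prediction - label


def negative_gradient(label, prediction):
    # -dL/dp = y - p. This is the opposite sign convention.
    return label - prediction


def numerical_gradient(label, prediction, eps=1e-6):
    # Central finite difference, used only to check the analytical gradient.
    up = squared_error_loss(label, prediction + eps)
    down = squared_error_loss(label, prediction - eps)
    return (up - down) / (2 * eps)


label, prediction = 3.0, 5.0
print(gradient(label, prediction))           # 2.0
print(negative_gradient(label, prediction))  # -2.0
```

A finite-difference check such as numerical_gradient is a quick way to confirm that a hand-written gradient has the expected sign.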

Features

  • Allow using Jax for custom loss definitions.
  • Allow setting may_trigger_gc on custom losses.
  • Add support for MHLD oblique decision trees.
  • Expose hyperparameter sparse_oblique_max_num_projections.
  • HTML plots for trees with model.plot_tree().
  • Pin protobuf to version 4.24.3 to resolve some incompatibilities when
    using conda.
  • Allow listing compatible engines with model.list_compatible_engines().
  • Allow choosing a fast engine with model.force_engine(...).

Fix

  • Fix slow engine creation for some combinations of oblique splits.
  • Improve error message when feeding multi-dimensional labels.

Documentation

  • Clarified documentation of hyperparameters for oblique splits.
  • Fix plots, typos.

Release music

Doctor Gradus ad Parnassum from "Children's Corner" (L. 113). Claude Debussy

v1.9.0

12 Mar 16:15

1.9.0 - 2024-03-12

Feature

  • Add "parallel_trials" parameter in the hyper-parameter tuner to control the number of trials to run in parallel.
  • Add support for custom losses.
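
The idea behind "parallel_trials" can be sketched with a thread pool whose size caps how many trials run at once. This is not the YDF tuner; run_trial, tune, and the toy objective are hypothetical names introduced for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_trial(params):
    # Hypothetical objective: higher is better, with the best score at x == 3.
    return -((params["x"] - 3) ** 2)


def tune(candidates, parallel_trials):
    # At most `parallel_trials` trials are evaluated at the same time.
    with ThreadPoolExecutor(max_workers=parallel_trials) as pool:
        scores = list(pool.map(run_trial, candidates))
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best], scores[best]


candidates = [{"x": x} for x in range(6)]
best_params, best_score = tune(candidates, parallel_trials=2)
print(best_params, best_score)
```

The result does not depend on parallel_trials; only the number of trials in flight at any time changes.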

v1.9.0rc0

06 Mar 09:01
Pre-release

1.9.0rc0 - 2024-02-26

Feature

  • Add "parallel_trials" parameter in the hyper-parameter tuner to control the number of trials to run in parallel.
  • Add support for custom losses.

PYDF 0.1.0

0.1.0 - 2024-01-25

Features

  • Added model validation evaluation (for GBTs) and OOB evaluation (for RFs).
  • Expose winner-takes-all for Random Forests.
  • Added model self evaluation.
  • Added ydf.from_tensorflow_decision_forests() for importing TF-DF models.
  • Allow feeding datasets as sequence of strings.
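
Winner-takes-all voting can be contrasted with averaging the per-tree class distributions. The sketch below uses plain Python lists and hypothetical function names; it illustrates the two aggregation schemes in general and is not taken from the YDF implementation.

```python
def predict_winner_takes_all(tree_distributions):
    # Each tree casts a single vote for its most probable class.
    num_classes = len(tree_distributions[0])
    votes = [0] * num_classes
    for dist in tree_distributions:
        winner = max(range(num_classes), key=lambda c: dist[c])
        votes[winner] += 1
    total = len(tree_distributions)
    return [v / total for v in votes]


def predict_averaged(tree_distributions):
    # Each tree contributes its full class distribution.
    num_classes = len(tree_distributions[0])
    total = len(tree_distributions)
    return [sum(d[c] for d in tree_distributions) / total for c in range(num_classes)]


# Three trees, two classes. One tree is very confident in class 0,
# while two trees lean slightly toward class 1.
trees = [[0.9, 0.1], [0.4, 0.6], [0.4, 0.6]]
print(predict_winner_takes_all(trees))  # Class 1 gets two of the three votes.
print(predict_averaged(trees))          # Class 0 has the larger average probability.
```

In this example the two schemes pick different classes, which is why exposing the choice can matter.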

Fixes

  • Fix a plotting issue for GBTs without validation loss.

Release music

Flötenuhren von 1772 und 1793 - Vivace (Hob XIX:13). Joseph Haydn

v1.8.0

19 Jan 09:32

1.8.0 - 2023-11-17

Feature

  • Support for GBT distances.
  • Remove old snapshots automatically for GBT training.

Fix

  • Regression with Mean Squared Error loss and Mean Average Error loss
    incorrectly clamped the gradients, leading to incorrect predictions.
  • Change dependency from boost to boost_math for faster builds.

Note

The commit associated with this release has a typo in its description.

1.7.0 - 2023-10-20

Feature

  • Add support for Mean average error (MAE) loss for GBT.
  • Add pairwise distance between examples.
  • By default, only keep the last three snapshots when training with a working
    cache to be resilient to training interruptions.
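
The snapshot retention policy in the last item can be sketched as a simple pruning rule. The function below is a hypothetical illustration that works on a list of snapshot indices; it does not touch any files and is not the YDF implementation.

```python
def prune_snapshots(snapshot_indices, keep=3):
    """Split snapshot indices into (kept, removed), keeping the `keep` most recent.

    Hypothetical helper for illustration; larger indices are assumed to be newer.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    ordered = sorted(snapshot_indices)
    return ordered[-keep:], ordered[:-keep]


kept, removed = prune_snapshots([10, 20, 30, 40, 50])
print(kept)     # [30, 40, 50]
print(removed)  # [10, 20]
```

Keeping more than one snapshot means that training can still resume if the most recent snapshot was only partially written when the interruption happened.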

New interface

  • Check out the new Python interface in port/python! It is still
    experimental, but you can already install it from PyPI with
    pip install ydf.

v1.6.0

28 Sep 14:20

Breaking changes

  • The dependency on the distributed gradient boosted trees learner is renamed
    from
    //third_party/yggdrasil_decision_forests/learner/distributed_gradient_boosted_trees
    to
    //third_party/yggdrasil_decision_forests/learner/distributed_gradient_boosted_trees:dgbt.
    Note that in most cases, importing the learners with
    //third_party/yggdrasil_decision_forests/learner:all_learners is
    recommended.
  • The training configuration must contain a label. A missing label is no
    longer interpreted as the label being the input feature "".

Feature

  • Add support for monotonic constraints for gradient boosted trees.
  • Improve speed of dataset reading and writing.

Fix

  • Proper error message when using distributed training on more than 2^31
    (i.e., ~2B) examples while compiling YDF with a 32-bit example index.
  • Fix Windows compilation with Visual Studio 2019.
  • Improved error messages for invalid training configuration
  • Replaced outdated dependencies
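
The 2^31 limit in the first item follows from the range of a 32-bit index. The sketch below assumes a signed 32-bit index, which is an assumption made for this illustration; the helper name is hypothetical.

```python
INT32_MAX = 2**31 - 1  # Largest value of a signed 32-bit integer.


def fits_in_int32_index(num_examples):
    # Examples are indexed from 0 to num_examples - 1.
    return num_examples - 1 <= INT32_MAX


print(2**31)                               # 2147483648
print(fits_in_int32_index(2**31))          # True
print(fits_in_int32_index(2**31 + 1))      # False
```

With exactly 2^31 examples the largest index is 2^31 - 1, which still fits; one more example does not.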

1.5.0

04 Jul 11:12

Feature

  • Rename experimental_analyze_model_and_dataset to analyze_model_and_dataset
  • Add new GBT loss function POISSON for Poisson log likelihood.
  • Go API: Categorical string values available for inspection.
  • Improved training speed for unit-weight datasets.
  • Support for MHLD oblique decision trees.
  • Multi-threaded RMSE computation.
  • Added Uint8 inference engine.
  • Added multi-task learning, where the outputs of models trained as
    "secondary" are used as inputs for the models trained as "primary".
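
For the POISSON loss item, the textbook Poisson negative log-likelihood with a log link gives a feel for what such a loss optimizes. This is a generic sketch with hypothetical function names; the constant log(label!) term is dropped, and the exact scaling used by YDF may differ.

```python
import math


def poisson_loss(label, score):
    # Negative log-likelihood of a Poisson with mean exp(score),
    # omitting the log(label!) term, which does not depend on the score.
    return math.exp(score) - label * score


def poisson_gradient(label, score):
    # Derivative of poisson_loss with respect to the raw score.
    return math.exp(score) - label


def numerical_gradient(label, score, eps=1e-6):
    # Central finite difference, used only to check the analytical gradient.
    return (poisson_loss(label, score + eps) - poisson_loss(label, score - eps)) / (2 * eps)


label = 4.0
best_score = math.log(label)  # The gradient vanishes when exp(score) equals the label.
print(poisson_gradient(label, best_score))
print(poisson_gradient(label, 0.5), numerical_gradient(label, 0.5))
```

The model predicts exp(score), so predictions stay positive, which suits count-valued labels.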

Fix

  • Go API: fixed typo on OutOfVocabulary constant.
  • Error messages for Uplift models.
  • Remove owner leakage in the model compiler.
  • Fix buggy restriction for SelGB sampling
  • Improve documentation.