In [None]:
import tensorflow_decision_forests as tfdf
import ydf
import tensorflow as tf
import pandas as pd
import numpy as np

# Migration from TensorFlow Decision Forests

Yggdrasil Decision Forests (YDF) and TensorFlow Decision Forests (TF-DF) are both front-ends to the same high-performance C++ implementation of decision forest algorithms. This guide shows how to migrate a TF-DF pipeline to YDF.

### Benefits

YDF offers a number of benefits over TF-DF:

*   Drastically improved speed of training and inference,
*   Rich analysis and evaluation capabilities,
*   Advanced support for inspecting and modifying models,
*   Export to TensorFlow SavedModel and other model formats,
*   A clean and simple API,
*   Access to all upcoming features of YDF.

### Do I have to migrate?

**TensorFlow Decision Forests will continue to be supported and users are not required to migrate their pipelines!** If TF-DF and the Keras API work well for you, feel free to stay with TF-DF. Our team will continue to release new versions and support users through our various support channels.

## Converting TF-DF models to YDF

TF-DF models can be imported into YDF **with the model predictions intact**. This means that, given the same dataset, the imported YDF model will generate the same predictions as the TF-DF model.

### Limitations

It is possible to create combined Neural Network / Decision Forest models with TF-DF as outlined in [this tutorial](https://www.tensorflow.org/decision_forests/tutorials/model_composition_colab). Such models cannot be directly converted to YDF since YDF's importer only considers the decision forest model and ignores the Neural Network. Similarly, TF-DF models containing multiple decision forests cannot be converted to YDF.

### Step 0: Create a TF-DF model

For this tutorial, we train and save a new model, but loading an existing model from disk works just as well.

In [None]:
# Download a dataset for the model
!wget -q https://storage.googleapis.com/download.tensorflow.org/data/palmer_penguins/penguins.csv -O /tmp/penguins.csv
dataset_df = pd.read_csv("/tmp/penguins.csv")

def split_dataset(dataset, test_ratio=0.30):
  """Randomly splits a pandas DataFrame into a train part and a test part."""
  test_indices = np.random.rand(len(dataset)) < test_ratio
  return dataset[~test_indices], dataset[test_indices]

train_ds_pd, test_ds_pd = split_dataset(dataset_df)

# Train only on the training split so the test split stays unseen.
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(train_ds_pd, label="species")
test_ds = tfdf.keras.pd_dataframe_to_tf_dataset(test_ds_pd, label="species")

# Create, train and save a TensorFlow Decision Forests model.
tfdf_model_path = "/tmp/tfdf_model"
tfdf_model = tfdf.keras.GradientBoostedTreesModel()
tfdf_model.fit(train_ds)
tfdf_model.save(tfdf_model_path)

### Step 1: Import the TF-DF model in YDF

When importing the TF-DF model, it is important to provide the same model path that was used when saving. Do not, for instance, provide the path to the `assets` subdirectory of the model.
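
A wrong path can be caught before the import with a small check. The sketch below assumes the model was saved in the TensorFlow SavedModel format, whose root directory contains a `saved_model.pb` file. The helper name `check_saved_model_root` is hypothetical and not part of YDF or TF-DF.

```python
import os
import tempfile


def check_saved_model_root(path):
    """Returns `path` if it looks like a SavedModel root, else raises ValueError.

    Hypothetical helper for illustration; not part of YDF or TF-DF.
    """
    if not os.path.isfile(os.path.join(path, "saved_model.pb")):
        hint = ""
        # If the parent directory holds saved_model.pb, a subdirectory such as
        # `assets` was probably passed by mistake.
        parent = os.path.dirname(os.path.normpath(path))
        if os.path.isfile(os.path.join(parent, "saved_model.pb")):
            hint = f" Did you mean {parent!r}?"
        raise ValueError(f"{path!r} does not contain saved_model.pb.{hint}")
    return path


# Demo on a throwaway directory that mimics the SavedModel layout.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "assets"))
    open(os.path.join(root, "saved_model.pb"), "wb").close()

    check_saved_model_root(root)  # The model root passes the check.
    try:
        check_saved_model_root(os.path.join(root, "assets"))
    except ValueError as error:
        print(error)  # The `assets` subdirectory is rejected with a hint.
```

In this notebook, the check could run as `check_saved_model_root(tfdf_model_path)` right before the import call.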

In [None]:
ydf_model = ydf.from_tensorflow_decision_forests(tfdf_model_path)

### Step 2: Check and inspect the model

The YDF model and the TF-DF model now refer to the same decision forest and therefore generate the same predictions.
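
This equality can also be checked programmatically rather than by eye. The following is a minimal sketch, with synthetic arrays standing in for the two prediction arrays; the helper name `assert_predictions_match` and the tolerance of `1e-5` are assumptions chosen for illustration. The explicit shape check is there because NumPy broadcasting can otherwise hide a shape mismatch when subtracting the arrays.

```python
import numpy as np


def assert_predictions_match(a, b, atol=1e-5):
    """Returns the maximum absolute difference, or raises if it exceeds atol.

    Hypothetical helper for illustration; not part of YDF or TF-DF.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Compare shapes first: subtracting arrays of shape (n, 1) and (n,)
    # would broadcast to (n, n) instead of failing.
    if a.shape != b.shape:
        raise AssertionError(f"Shape mismatch: {a.shape} vs {b.shape}")
    max_diff = float(np.max(np.abs(a - b))) if a.size else 0.0
    if max_diff > atol:
        raise AssertionError(f"Predictions differ by up to {max_diff}")
    return max_diff


# Synthetic stand-ins for the TF-DF and YDF prediction arrays.
reference = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]], dtype=np.float32)
imported = reference + np.float32(1e-7)
print(assert_predictions_match(reference, imported))
```

With real models, the two arguments would be the outputs of the respective `predict` calls on the same examples.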

**Warning**: In some rare cases, importing a model that should not be imported (e.g., a model combining a neural network and a decision forest) does not raise an error. However, the resulting YDF model then only contains the decision forest, not the neural network. If unsure, please check the TF-DF model with `model.summary()` before converting it.

In [None]:
# Load the TF-DF model
loaded_tfdf_model = tf.keras.models.load_model(tfdf_model_path)
# Check that it is a single-layer TF-DF model. Note that tfdf_model.summary() of
# a freshly trained TF-DF model provides a longer report.
loaded_tfdf_model.summary()

In [None]:
# Check that the predictions of the TF-DF model and YDF model match.
tfdf_predictions = loaded_tfdf_model.predict(test_ds)
ydf_predictions = ydf_model.predict(test_ds_pd)

print(f"The maximum difference of predictions is {np.max(np.abs(tfdf_predictions - ydf_predictions))}")

In [None]:
# Print a rich description of the YDF model.
ydf_model.describe()

## Training models with YDF

YDF and TF-DF use the exact same hyperparameters (down to the random seed). Existing training pipelines can therefore be converted by using the same hyperparameters and following the [Getting Started Guide](getting_started) as well as the other documentation.

## Exporting YDF models to TensorFlow SavedModel format

YDF models can be exported to the TensorFlow SavedModel format for serving with TF-Serving and other tools. Check out [the detailed guide](tf_serving) for additional information.

**Note**: Categorical columns in the converted model must be saved as strings instead of integers.

In [None]:
ydf_model.to_tensorflow_saved_model('/tmp/new_tf_saved_model')