# Train a decision forest and a neural network together

Previous examples used a pretrained neural network (NN) to process the text features before passing them to the random forest.

Here we train both the neural network and the random forest from scratch.

TF-DF decision forests do not backpropagate gradients, so training has to be done in two stages:
1. Train the neural network as a standard classification task.
2. Replace the neural network head (the last layer and the softmax) with a random forest, then train the random forest as usual.
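
One way to see why a tree cannot pass gradients back to the network: a tree's output is constant on each side of a split, so its slope with respect to the input is zero almost everywhere. The sketch below uses a hypothetical one-split tree (`stump`) and a finite-difference estimate; it is an illustration only, not how TF-DF is implemented.

```python
def stump(x, threshold=0.5, left=0.0, right=1.0):
    """A hypothetical one-split decision tree: constant on each side of the threshold."""
    return left if x < threshold else right


def finite_difference(f, x, eps=1e-4):
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


# Away from the threshold the output does not change, so the slope is zero
# and there is no gradient signal to send back to the network.
grad_left = finite_difference(stump, 0.2)
grad_right = finite_difference(stump, 0.8)
print(grad_left, grad_right)  # 0.0 0.0
```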


In [1]:
import tensorflow_decision_forests as tfdf

import os
import numpy as np
import pandas as pd
import tensorflow as tf
import math

In [2]:
dataset_df = pd.read_csv("../../dataset/penguins.csv")

In [3]:
dataset_df.head()

Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex,year
0,Adelie,Torgersen,39.1,18.7,181.0,3750.0,male,2007
1,Adelie,Torgersen,39.5,17.4,186.0,3800.0,female,2007
2,Adelie,Torgersen,40.3,18.0,195.0,3250.0,female,2007
3,Adelie,Torgersen,,,,,,2007
4,Adelie,Torgersen,36.7,19.3,193.0,3450.0,female,2007


### Prepare the dataset for training

Replace the numerical NaNs (which represent missing values in a pandas DataFrame) with 0s, because neural networks do not handle NaN inputs well.
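
A small plain-Python illustration of the problem: any arithmetic that touches a NaN yields NaN, so a single missing value contaminates a neuron's whole weighted sum. The weights and bias below are made up for the example.

```python
import math

weights = [0.5, -0.2, 0.1]  # hypothetical weights of a single neuron
bias = 0.3


def neuron(inputs):
    """Weighted sum of the inputs plus a bias (no activation)."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


with_nan = neuron([39.1, math.nan, 181.0])
zero_filled = neuron([39.1, 0.0, 181.0])

print(math.isnan(with_nan))        # True: the NaN propagates to the output
print(math.isfinite(zero_filled))  # True: zero-filling keeps the output finite
```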

In [6]:
label = "species"

for col in dataset_df.columns:
    if dataset_df[col].dtype not in [str, object]:
        dataset_df[col] = dataset_df[col].fillna(0)

In [5]:
dataset_df.head()

Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex,year
0,Adelie,Torgersen,39.1,18.7,181.0,3750.0,male,2007
1,Adelie,Torgersen,39.5,17.4,186.0,3800.0,female,2007
2,Adelie,Torgersen,40.3,18.0,195.0,3250.0,female,2007
3,Adelie,Torgersen,0.0,0.0,0.0,0.0,,2007
4,Adelie,Torgersen,36.7,19.3,193.0,3450.0,female,2007


Split the dataset into training and testing sets.

In [7]:
def split_dataset(dataset, test_ratio=0.30):
    """Splits a pandas DataFrame in two."""
    test_indices = np.random.rand(len(dataset)) < test_ratio
    return dataset[~test_indices], dataset[test_indices]

train_ds_pd, test_ds_pd = split_dataset(dataset_df)
print("{} examples in training, {} examples for testing".format(
    len(train_ds_pd), len(test_ds_pd)
))

257 examples in training, 87 examples for testing
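
Because `split_dataset` draws fresh random numbers on every call, the counts above change from run to run. A possible reproducible variant is sketched below. The name `split_dataset_seeded`, the `seed` parameter, and the tiny `demo_df` stand-in are assumptions introduced for this example, not part of the original notebook.

```python
import numpy as np
import pandas as pd


def split_dataset_seeded(dataset, test_ratio=0.30, seed=42):
    """Same random split as split_dataset, but reproducible for a given seed."""
    rng = np.random.default_rng(seed)
    test_indices = rng.random(len(dataset)) < test_ratio
    return dataset[~test_indices], dataset[test_indices]


# Tiny stand-in dataframe; the notebook itself would pass dataset_df.
demo_df = pd.DataFrame({"x": range(100)})
train_a, test_a = split_dataset_seeded(demo_df)
train_b, test_b = split_dataset_seeded(demo_df)

# Two calls with the same seed select the same rows.
print(list(test_a.index) == list(test_b.index))
```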


Convert the pandas DataFrames into TensorFlow datasets.


In [8]:
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(train_ds_pd, label=label)
test_ds = tfdf.keras.pd_dataframe_to_tf_dataset(test_ds_pd, label=label)



### Build the models

Next, create the neural network model using the Keras functional API.

We will use a simple model with only two inputs.

In [9]:
input_1 = tf.keras.Input(shape=(1,), name="bill_length_mm", dtype="float")
input_2 = tf.keras.Input(shape=(1,), name="island", dtype="string")

nn_raw_inputs = [input_1, input_2]

Use preprocessing layers to convert the raw inputs into inputs appropriate for the neural network.

In [10]:
# normalization

Normalization = tf.keras.layers.Normalization
CategoryEncoding = tf.keras.layers.CategoryEncoding
StringLookup = tf.keras.layers.StringLookup

# Adapt the normalizer on the same feature that feeds input_1 (bill_length_mm).
values = train_ds_pd["bill_length_mm"].values[:, tf.newaxis]
input_1_normalizer = Normalization()
input_1_normalizer.adapt(values)

values = train_ds_pd["island"].values
# max_tokens=32 caps the vocabulary size; the dataset has far fewer islands.
input_2_indexer = StringLookup(max_tokens=32)
input_2_indexer.adapt(values)

input_2_onehot = CategoryEncoding(output_mode="binary", max_tokens=32)

normalized_input_1 = input_1_normalizer(input_1)
normalized_input_2 = input_2_onehot(input_2_indexer(input_2))

nn_processed_inputs = [normalized_input_1, normalized_input_2]
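
To make the preprocessing less opaque, here is a rough plain-Python approximation of what these three layers compute, assuming their default settings as I understand them (population statistics for `Normalization`, index 0 reserved for out-of-vocabulary strings in `StringLookup`, and a length-32 indicator vector from `CategoryEncoding`). The sample values and helper names are illustrative only.

```python
import statistics
from collections import Counter

bill_lengths = [39.1, 39.5, 40.3, 36.7]  # sample values from the table above
islands = ["Biscoe", "Torgersen", "Biscoe", "Dream", "Biscoe", "Torgersen"]  # illustrative

# Normalization: subtract the mean and divide by the standard deviation.
mean = statistics.mean(bill_lengths)
std = statistics.pstdev(bill_lengths)
normalized = [(v - mean) / std for v in bill_lengths]

# StringLookup: index 0 is kept for unknown strings; known tokens get
# 1, 2, ... in order of decreasing frequency.
vocab = [token for token, _ in Counter(islands).most_common()]
token_to_index = {token: i + 1 for i, token in enumerate(vocab)}


def lookup(token):
    """Map a string to its integer index, or 0 if it was never seen."""
    return token_to_index.get(token, 0)


def multi_hot(index, num_tokens=32):
    """Indicator vector of length num_tokens with a 1.0 at the given index."""
    vector = [0.0] * num_tokens
    vector[index] = 1.0
    return vector


encoded = multi_hot(lookup("Torgersen"))
print(lookup("Torgersen"), lookup("Atlantis"))  # 2 0
```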



Build the body of the neural network.

In [12]:
y = tf.keras.layers.Concatenate()(nn_processed_inputs)
y = tf.keras.layers.Dense(16, activation=tf.nn.relu6)(y)
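last_layer = tf.keras.layers.Dense(8, activation=tf.nn.relu, name="last")(y)

# 3 output units for the three label classes.
# For a binary classification the output dimension would be 1.
# The head is attached to last_layer (not y) so that the features later fed
# to the random forest are trained together with the classifier.
classification_output = tf.keras.layers.Dense(3)(last_layer)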
nn_model = tf.keras.models.Model(nn_raw_inputs, classification_output)

This `nn_model` directly produces classification logits.
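
Logits are unnormalized scores. To read them as class probabilities, apply a softmax. A minimal plain-Python sketch follows; the logit values are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 0.5, -1.0]  # hypothetical network output for one penguin
probs = softmax(logits)
predicted_class = probs.index(max(probs))  # index of the most likely class
print(predicted_class)  # 0
```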

Next, we create a decision forest model. It will operate on the high-level features that the neural network extracts in the last layer before the classification head.

To reduce the risk of mistakes, group the decision forest and the neural network in a single Keras model.

In [14]:
nn_without_head = tf.keras.models.Model(inputs=nn_model.inputs, outputs=last_layer)
df_and_nn_model = tfdf.keras.RandomForestModel(preprocessing=nn_without_head)

Use /var/folders/sk/f7k402kx1wvdmcz91gdz6hs00000gn/T/tmp63e61uc4 as temporary training directory


## Train and evaluate the models

The model is trained in two stages.

First, train the neural network with its own classification head.
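
The cell below compiles the network with `SparseCategoricalCrossentropy(from_logits=True)`. As a rough guide to what that loss measures for a single example, the sketch here computes the negative log-probability of the true class directly from logits. It is a plain-Python approximation for intuition, not the Keras implementation, and the logit values are made up.

```python
import math


def sparse_crossentropy_from_logits(logits, label):
    """Negative log-probability of the true class, computed from raw logits."""
    shift = max(logits)  # stabilise the log-sum-exp
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[label]


confident_right = sparse_crossentropy_from_logits([4.0, 0.0, 0.0], label=0)
confident_wrong = sparse_crossentropy_from_logits([4.0, 0.0, 0.0], label=2)
uniform = sparse_crossentropy_from_logits([0.0, 0.0, 0.0], label=1)

# A uniform guess over three classes costs log(3); confident correct
# predictions cost less, confident wrong ones cost more.
print(confident_right < uniform < confident_wrong)  # True
```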

In [15]:
nn_model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

nn_model.fit(x=train_ds, validation_data=test_ds, epochs=10)
nn_model.summary()

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Model: "model_1"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 island (InputLayer)            [(None, 1)]          0           []                               
                                                                                                  
 bill_length_mm (InputLayer)    [(None, 1)]          0           []                               
                                                                                                  
 string_lookup (StringLookup)   (None, 1)            0           ['island[0][0]']                 
                                                                                                  
 normalization (Normalization)  (None, 1)            3           ['bill_length_mm[0][0]']         
                                                          

The neural network layers are shared between the two models. Now that the neural network is trained, the decision forest model is fit on the output of those trained layers.

In [16]:
df_and_nn_model.fit(x=train_ds)

Reading training dataset...
Training dataset read in 0:00:02.564237. Found 257 examples.
Training model...
Model trained in 0:00:00.056126
Compiling model...
[INFO kernel.cc:1176] Loading model from path /var/folders/sk/f7k402kx1wvdmcz91gdz6hs00000gn/T/tmp63e61uc4/model/ with prefix 1fe5533d0a174295
[INFO decision_forest.cc:639] Model loaded with 300 root(s), 4518 node(s), and 6 input feature(s).
[INFO abstract_model.cc:1249] Engine "RandomForestGeneric" built
[INFO kernel.cc:1022] Use fast generic engine
Model compiled.

<keras.callbacks.History at 0x1398060a0>

Evaluate the composed model.

In [17]:
df_and_nn_model.compile(metrics=["accuracy"])
print("Evaluation:", df_and_nn_model.evaluate(test_ds))

Evaluation: [0.0, 0.977011501789093]




Compare it to the neural network alone.

In [18]:
print("Evaluation:", nn_model.evaluate(test_ds))

Evaluation: [1.8836822509765625, 0.16091954708099365]
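
To put the two results side by side, the arithmetic below converts each reported accuracy into a count of correct predictions. It assumes that the second element of each `evaluate` result is the accuracy metric passed to `compile`, and that both evaluations ran on the 87-example test split printed earlier.

```python
test_examples = 87  # size of the test split reported earlier

forest_accuracy = 0.977011501789093   # composed NN + random forest model
nn_accuracy = 0.16091954708099365     # neural network with its own head

forest_correct = round(forest_accuracy * test_examples)
nn_correct = round(nn_accuracy * test_examples)

print(forest_correct, nn_correct)  # 85 14
```

Under those assumptions, the composed model classifies 85 of the 87 test penguins correctly, while the neural network alone gets 14 right in this run.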
