model.predict() method returning only 0 when unpickling models and using TFDF 1.2.0 #169

Open
waral opened this issue Mar 22, 2023 · 5 comments


waral commented Mar 22, 2023

Hello,

I have the following problem: with the new version (1.2.0) of TensorFlow Decision Forests, if I pickle a trained model (gradient boosted trees), then load it and run the predict method, I always get 0.0, regardless of the input. When I run the exact same code with an older version of the package (1.0.1), the problem doesn't seem to occur.

I also noticed that TFDF 1.0.1 gets installed with TensorFlow 2.10.1, while TFDF 1.2.0 gets installed with TensorFlow 2.11.1. I'm giving you this info in case the problem is more generally related to the TensorFlow version.
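
For anyone comparing the two environments, a small sketch like the following can record which versions are actually installed. The helper name `installed_versions` is hypothetical, and the distribution names passed to it in the usage note are assumptions about how the packages are registered with pip.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions
```

Calling `installed_versions(["tensorflow", "tensorflow_decision_forests"])` in both environments and attaching the result to the report makes the version pairing explicit.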

Are you aware of this issue? Is pickling/unpickling supposed to work with TFDF models?

I understand that pickle is not the usual go-to saving method when working with TensorFlow models. However, I'm working on a project where different frameworks are used, so I need a general way of saving models regardless of the framework; that's why I'm exploring this.

Thank you so much and let me know if you have any additional questions.

Best,
Michal

Collaborator

rstz commented Mar 22, 2023

Hi Michal,

I'm not really familiar with the way pickle works (in fact, I've never used it), but I'm happy to take a closer look if you can give me a minimal working example. So if this is my model:

!pip install tensorflow_decision_forests -U -qq
import tensorflow as tf
import tensorflow_decision_forests as tfdf
import pandas as pd

# Download the dataset, load it into a pandas dataframe and convert it to TensorFlow format.
!wget -q https://storage.googleapis.com/download.tensorflow.org/data/palmer_penguins/penguins.csv -O /tmp/penguins.csv
dataset_df = pd.read_csv("/tmp/penguins.csv")
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(dataset_df, label="species")

# Create and train the model
model = tfdf.keras.GradientBoostedTreesModel()
model.fit(train_ds)

What steps do I need to take to pickle the model, unpickle it, and get the predictions?

Author

waral commented Mar 22, 2023

Hi,

thanks so much for the quick reply!

So the full minimal example would look like the following. I tested it locally and am getting the same problem: with TFDF 1.2.0 I get an array of 0's as the output, while with 1.0.1 the output seems correct. Note that in this specific example the shapes don't match either (perhaps because it's multi-class classification; I'm not sure whether TFDF works properly with multi-class anyway?). That is, there are three numbers per input when using 1.0.1 and only one number (0) when using 1.2.0:

!pip install tensorflow_decision_forests -U -qq
import tensorflow as tf
import tensorflow_decision_forests as tfdf
import pandas as pd

# Download the dataset, load it into a pandas dataframe and convert it to TensorFlow format.
!wget -q https://storage.googleapis.com/download.tensorflow.org/data/palmer_penguins/penguins.csv -O /tmp/penguins.csv
dataset_df = pd.read_csv("/tmp/penguins.csv")
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(dataset_df, label="species")

# Create and train the model
model = tfdf.keras.GradientBoostedTreesModel()
model.fit(train_ds)

import pickle

with open("model.pkl", "wb") as model_file:
    pickle.dump(model, model_file)

with open("model.pkl", "rb") as file:
    model = pickle.load(file)

val_ds = tfdf.keras.pd_dataframe_to_tf_dataset(dataset_df.drop(columns=["species"]))

print(model.predict(val_ds))
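
To make the regression visible in a single run, the predictions can be compared before and after the pickle round trip. Below is a minimal sketch of that check; `pickle_roundtrip_matches` and its `predict` callback are hypothetical names, and the callback is assumed to return plain Python lists so that `==` compares values element by element.

```python
import pickle


def pickle_roundtrip_matches(model, predict, inputs):
    """Compare predictions made before and after pickling `model`.

    `predict(model, inputs)` must return a plain Python object (e.g. a list)
    that supports `==`. Returns a tuple (matches, before, after).
    """
    before = predict(model, inputs)
    # Serialize and deserialize in memory; no file on disk is needed.
    restored = pickle.loads(pickle.dumps(model))
    after = predict(restored, inputs)
    return before == after, before, after
```

With the example above, the callback could be something like `lambda m, ds: m.predict(ds).tolist()`. A result of `False` for `matches` would correspond to the behavior described for 1.2.0.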

Best,
Michal

Collaborator

rstz commented Mar 22, 2023

Hi Michal,

Thank you for the example. Pickling a model is not supported by TF-DF, so proceed at your own risk :) I'll open an internal bug to discuss this further, but the use case looks a bit niche to me (anyone reading this, feel free to express support for pickling via the emojis).

On the positive side, I was able to get your example to work by calling model.compile() on the model before pickling. Without looking into it much further, it seems like this solves the problem :)
EDIT: This was unfortunately not a solution :(

TF-DF supports multi-class classification; if the model is correctly loaded, this shouldn't cause issues.
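
Since pickling the model object itself is not supported, one possible workaround for a framework-agnostic pipeline is to save the model to a directory with the regular Keras save API and then pickle the bytes of that directory. The sketch below only covers the packing step and uses the standard library; `pack_dir` and `unpack_dir` are hypothetical helpers, and the TF-DF calls in the trailing comment are an assumption that has not been verified against 1.2.0 here.

```python
import io
import os
import zipfile


def pack_dir(path):
    """Zip the directory tree under `path` in memory and return the bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(path):
            # Write directory entries too, so empty folders survive.
            for name in dirs + files:
                full = os.path.join(root, name)
                zf.write(full, os.path.relpath(full, path))
    return buf.getvalue()


def unpack_dir(data, target):
    """Extract bytes produced by `pack_dir` into `target` and return `target`."""
    os.makedirs(target, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        zf.extractall(target)
    return target


# Hypothetical usage with TF-DF (assumed, not verified here):
#   model.save("/tmp/tfdf_model")
#   blob = pack_dir("/tmp/tfdf_model")   # plain bytes, safe to pickle
#   restored = tf.keras.models.load_model(unpack_dir(blob, "/tmp/tfdf_restored"))
```

The resulting `blob` is an ordinary `bytes` object, so it can go through the same pickle-based storage as models from other frameworks.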

Best,
Richard

Author

waral commented Mar 23, 2023

Hi Richard,

thanks for your help. Unfortunately, compiling the model before pickling doesn't seem to resolve the issue on my side; I'm still getting the same result (I just added model.compile() right before pickling). In any case, I'll be careful when saving and loading models like that. Let me know if this gets resolved internally, though.

Thanks,
Michal


Arnold1 commented Mar 23, 2023

Hi, I also pickle TF-DF models, same as @waral, and would like to increase the priority of that internal ticket.
