Predictions do not function as documented. #139

Open
TheJeran opened this issue Oct 8, 2022 · 2 comments

TheJeran commented Oct 8, 2022

Prediction with TFDF is extremely under-documented.

According to https://www.tensorflow.org/decision_forests/api_docs/python/tfdf/keras/RandomForestModel#predict you should be able to predict on numpy arrays, tensors, or datasets. Yet any attempt to do so has failed. It seems PrefetchDatasets are the only option.

On top of this, prediction is dreadfully slow. My current use case is ensemble prediction on images. Each image has 144k pixels, and one model needs ~20 seconds to make a prediction. Pixelwise prediction with normal TF can be near instantaneous with predict_on_batch, which TFDF models are supposed to support. But PrefetchDatasets aren't compatible with it. So the answer is to use NumPy arrays. But those again are incompatible. All of this is said to be supported in the documentation, but it appears to be unimplementable.

I would like to stick with the TFDF method for my work, but it is unreasonably slow.

How can I implement faster prediction when it seems it's an under-documented area?

Collaborator

rstz commented Oct 18, 2022

Hi,

thank you for your input. I agree that this area is under-documented and that the documentation you linked is particularly confusing. We are actively working on improving this, including a "How to predict" colab, which will land soon.

Collaborator

rstz commented Oct 20, 2022

Our new How to predict colab should explain predictions with TF-DF in more detail.

In essence, TF-DF uses the Keras API for predictions, which allows for some flexibility and interoperability, but also constrains us in terms of speed and input format. For advanced use cases, we recommend checking out the Keras API docs.

If speed is crucial for your use case, please take a look at the serving APIs offered by Yggdrasil Decision Forests. YDF powers TF-DF, and models trained in TF-DF can be served with the other APIs and vice versa. The C++ serving API can be an order of magnitude faster than TF-DF and is very stable (used in production for years).
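
For the pixelwise image use case described above, here is a minimal sketch of how the input could be prepared, not a definitive recipe. It assumes the model accepts a dictionary of per-feature arrays through a Keras-style `predict()`, and that each image channel corresponds to one feature named at training time. The helper names, the feature names, and the `MeanModel` stand-in are hypothetical and only exist to make the sketch runnable without a trained model.

```python
import numpy as np


def image_to_feature_dict(image, feature_names):
    """Flatten an (H, W, C) image into {feature_name: array of H*W values}."""
    height, width, channels = image.shape
    if channels != len(feature_names):
        raise ValueError("need exactly one feature name per channel")
    pixels = image.reshape(height * width, channels)
    # One 1-D column per feature, keyed by the (assumed) training-time name.
    return {name: pixels[:, i] for i, name in enumerate(feature_names)}


def predict_image(model, image, feature_names, batch_size=65536):
    """Predict every pixel in large batches and reshape back to (H, W, ...)."""
    features = image_to_feature_dict(image, feature_names)
    # Assumption: `model` exposes a Keras-style predict(); with TF-DF this
    # would be a trained model such as tfdf.keras.RandomForestModel.
    predictions = np.asarray(model.predict(features, batch_size=batch_size))
    height, width = image.shape[:2]
    return predictions.reshape(height, width, -1)


class MeanModel:
    """Hypothetical stand-in for a trained model (averages the features)."""

    def predict(self, features, batch_size=None):
        columns = np.stack(list(features.values()), axis=1)
        return columns.mean(axis=1, keepdims=True)


# Tiny synthetic image; the feature names are illustrative only.
image = np.random.default_rng(0).random((4, 5, 3)).astype(np.float32)
result = predict_image(MeanModel(), image, ["red", "green", "blue"])
```

A large `batch_size` keeps the number of separate calls into the model small, which is often where pixelwise prediction loses time. Whether this closes the gap for a given model is something to measure rather than assume.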
