
load model from file and predict #11

Closed
headupinclouds opened this issue Jan 8, 2018 · 14 comments · Fixed by #274

Comments

@headupinclouds

I'm interested in using treelite as a fast and lightweight C/C++ prediction module. It seems the preferred workflow is to load a standard supported format, compile/bake an optimized C file/library, and then integrate the generated C file into an application for prediction. Is it possible to use treelite to load a model from file at runtime for prediction? (If this isn't possible through the current API, would you accept this feature?) This would provide some flexibility and code-size reductions compared to using xgboost or lightgbm directly.
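For context, the embed step looks roughly like the sketch below: treelite compiles the model to a C source file, and the application links against it and calls the generated entry point. The `Entry` union and `predict()` signature here are assumptions mirroring the shape of treelite's generated header and should be checked against what treelite actually emits for your model/version.

```cpp
// Minimal sketch of calling into a treelite-generated model source
// (compile this together with the generated mymodel.c / mymodel.h).
// NOTE: the Entry union and predict() signature below are assumptions
// mirroring the shape of treelite's generated code; verify against the
// actual generated header for your treelite version.
#include <cstdio>

union Entry {
    int missing;   // set to -1 to mark the feature as missing
    float fvalue;  // feature value when present
};

// Declared in the generated header; defined in the generated model source.
extern "C" float predict(Entry* data, int pred_margin);

int main() {
    Entry row[2];           // one slot per feature of the model
    row[0].fvalue = 0.5f;   // feature 0 present
    row[1].missing = -1;    // feature 1 missing
    std::printf("score = %f\n", predict(row, /*pred_margin=*/0));
    return 0;
}
```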

@hcho3
Collaborator

hcho3 commented Jan 8, 2018

I can see why one would want to load a model at runtime instead of baking it into executable code. One way to achieve this is to have treelite convert a model (in any supported format) into a common format. Then, at deployment time, you'd ship prediction logic that loads the common-format model at runtime. The format conversion is necessary so that the prediction logic deals with a single format instead of several (xgboost, lightgbm, etc.).

Let me get back to you when I have a time estimate for this feature.

@headupinclouds
Author

> Let me get back to you when I have a time estimate for this feature.

Great. Let me know if I can help.

> One way to achieve this is to have treelite convert a model (in any supported format) into a common format.

Perhaps the trees could be loaded and then serialized with something like cereal, which provides a lot of format flexibility: portable binary archives, XML, JSON, etc. Thoughts?
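To make the idea concrete, here is a minimal cereal sketch with a toy node struct; the fields are illustrative only, not treelite's actual tree representation.

```cpp
// Toy example of round-tripping a flattened tree through cereal's
// portable binary archive. The Node struct is illustrative only, not
// treelite's actual tree representation.
#include <cereal/archives/portable_binary.hpp>
#include <cereal/types/vector.hpp>
#include <fstream>
#include <vector>

struct Node {
    int left, right;   // child indices into the node array; -1 marks a leaf
    int split_index;   // feature id tested at this node
    float threshold;   // go left if feature value < threshold
    float leaf_value;  // prediction if this node is a leaf

    template <class Archive>
    void serialize(Archive& ar) {
        ar(left, right, split_index, threshold, leaf_value);
    }
};

int main() {
    // Root splits on feature 0 at 0.5; both children are leaves.
    std::vector<Node> tree = {{1, 2, 0, 0.5f, 0.0f},
                              {-1, -1, 0, 0.0f, -1.2f},
                              {-1, -1, 0, 0.0f, 3.4f}};
    {
        std::ofstream os("tree.bin", std::ios::binary);
        cereal::PortableBinaryOutputArchive ar(os);
        ar(tree);  // write the whole vector in one call
    }
    std::vector<Node> loaded;
    {
        std::ifstream is("tree.bin", std::ios::binary);
        cereal::PortableBinaryInputArchive ar(is);
        ar(loaded);  // read it back
    }
    return 0;
}
```

Swapping `PortableBinaryOutputArchive` for `JSONOutputArchive` (from `<cereal/archives/json.hpp>`) would give the JSON variant with no change to the struct.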

@headupinclouds
Author

> Let me get back to you when I have a time estimate for this feature.

Any updates on this one?

Background: I have a few regression/detection tasks for mobile devices where this could significantly streamline xgboost model deployment.

@headupinclouds
Author

Seems to be fixed by #45?

@hcho3
Collaborator

hcho3 commented Sep 17, 2018

@headupinclouds I don't think it goes all the way toward what you want, since models will still need to be compiled in order to be fed into the runtime.

@hcho3
Collaborator

hcho3 commented Sep 17, 2018

@headupinclouds Conceptually, to do what you want, we'd need to add a new runtime that directly reads from Protobuf. Any reason why the current approach (generating shared libs) doesn't work for your scenario?

@headupinclouds
Author

> since models will still need to be compiled in order to be fed into the runtime.

I see, thanks for the clarification.

> Any reason why the current approach (generating shared libs) doesn't work for your scenario?

I'm interested in using treelite as a general-purpose alternative to directly statically linking xgboost in a couple of real-time, cross-platform computer vision applications, although I see my requirements may not be in line with current project goals. I'd like to have a single universal prediction module that supports a variety of pretrained regression and classification tasks, where different model sizes may be required due to varying application-specific and platform constraints: a mobile model <= 1 MB may be required for some builds, while a more accurate 8 MB model may be appropriate for desktop platforms. You mentioned shared libs, although I'm guessing this isn't a hard constraint. For iOS, one would have to generate a shared lib in the form of a dynamic framework, which takes some work. In my use case, static linking is generally preferable.

@hcho3
Collaborator

hcho3 commented Sep 18, 2018

@headupinclouds You can use static linking for the prediction runtime. Currently, the model itself will be encoded as a .so file, since Treelite produces executable code for the model. What is the advantage of reading a Protobuf file instead of a .so file? Either way, you need to read one file per model.
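Concretely, "reading a .so file" can be as simple as `dlopen()`ing the compiled model and looking up the generated entry point, as in the sketch below; the exported symbol name and signature are assumptions mirroring treelite's generated code.

```cpp
// Sketch: load a treelite-compiled model library at runtime instead of
// linking it at build time. Build with -ldl on Linux. The "predict"
// symbol and its signature are assumptions mirroring treelite's
// generated code; verify against the generated header.
#include <cstdio>
#include <dlfcn.h>

union Entry { int missing; float fvalue; };
using PredictFn = float (*)(Entry*, int);

int main() {
    void* handle = dlopen("./mymodel.so", RTLD_NOW);
    if (!handle) { std::fprintf(stderr, "%s\n", dlerror()); return 1; }

    auto predict = reinterpret_cast<PredictFn>(dlsym(handle, "predict"));
    if (!predict) { std::fprintf(stderr, "%s\n", dlerror()); return 1; }

    Entry row[2];
    row[0].fvalue = 0.5f;  // feature 0 present
    row[1].missing = -1;   // feature 1 missing
    std::printf("score = %f\n", predict(row, /*pred_margin=*/0));

    dlclose(handle);
    return 0;
}
```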

@headupinclouds
Author

I can have one protobuf file that will work on Linux, Windows, macOS, iOS, Android and Raspberry Pi, but I would need 6 model.so files to accomplish this. Right?

@hcho3
Collaborator

hcho3 commented Sep 18, 2018

@headupinclouds Yes, you'd need to prepare multiple .so/.dll/.dylib files to target multiple platforms. I now understand the benefit of using Protobuf. As I mentioned earlier, to use Protobuf, we'd need to create a new runtime that takes Protobuf as input.

@headupinclouds
Author

Okay, thanks for the input. Treelite is a great idea.

@karimkhanvi

@headupinclouds and @hcho3 Thanks for the interesting discussion. I followed the tutorial at https://treelite.readthedocs.io/en/latest/ and generated a .so file.

But now I have no idea how to convert it into a Protobuf file so that I can use the same model on iOS and Android. Any reference link is very much appreciated.

@headupinclouds
Author

> But now I have no idea how to convert it into a Protobuf file so that I can use the same model on iOS and Android. Any reference link is very much appreciated.

That isn't how this library is designed. From the above discussion, this would have to be added as a new feature, but it is quite orthogonal to the current design, so it isn't clear it will necessarily be added. The .so you created is your model, so you need to generate one for each architecture. In the case of iOS, you would have to create a shared framework (essentially a wrapper around an iOS *.dylib).

@hcho3
Collaborator

hcho3 commented Oct 14, 2020

I decided to remove Protobuf entirely from Treelite. For the use case you described, I suggest that you write a custom runtime that operates on the Treelite C++ objects (https://github.com/dmlc/treelite/blob/mainline/include/treelite/tree.h). For example, the Forest Inferencing Library (FIL) in cuML uses this approach.
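A rough sketch of what such a custom runtime might look like, assuming the accessors in include/treelite/tree.h; the names and template parameters vary by treelite version, and missing-value / comparison-operator handling is elided.

```cpp
// Rough sketch of a custom runtime over treelite's in-memory model objects,
// in the spirit of cuML's FIL. The accessor names (IsLeaf, SplitIndex,
// Threshold, LeftChild, RightChild, LeafValue) follow include/treelite/tree.h,
// but exact template parameters differ across treelite versions; verify
// against the header before relying on this.
#include <treelite/tree.h>
#include <vector>

template <typename ThresholdT, typename LeafT>
LeafT ScoreTree(const treelite::Tree<ThresholdT, LeafT>& tree,
                const std::vector<ThresholdT>& row) {
    int nid = 0;  // start at the root node
    while (!tree.IsLeaf(nid)) {
        // Assumes every split is numerical and uses operator< ;
        // a real runtime must dispatch on the node's comparison operator
        // and follow the default child for missing values.
        nid = (row[tree.SplitIndex(nid)] < tree.Threshold(nid))
                  ? tree.LeftChild(nid)
                  : tree.RightChild(nid);
    }
    return tree.LeafValue(nid);
}
```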
