New fastai inference API #32

Closed
tcapelle opened this issue Jan 20, 2021 · 3 comments

Comments

@tcapelle

tcapelle commented Jan 20, 2021

Hey Zach,
Let's use this issue to prototype the inference work.
Here is what I would like to have (my letter to Santa):

  • Streamlined TorchScript support for all fastai models: simple models should be compatible with jit.trace, and more complex ones with data-dependent control flow should work with jit.script. The folks at Facebook may be able to help here; they are super interested in this right now. The user should have simple image preprocessing/postprocessing so that inference works in plain PyTorch once the model is exported (see the TorchScript sketch after this list). If Jeremy splits the fastai lib into core/vision/etc., we could depend only on fastai core.
  • ONNX: export for all models. Image encoders should work out of the box, but some layers are missing for U-Nets (PixelShuffle). Tabular should work as well (see the ONNX sketch after this list). Without being an expert, I would expect TorchScript to replace the ONNX pipeline in the future, one less layer.
  • TensorRT: we should probably start talking to NVIDIA, as the TensorRT framework is super fast for GPU inference. This could be done later, once we have ONNX exports; I have a contact at NVIDIA who could help us export to TensorRT.
  • DeepStream? Stas Bekman is a guru on this topic; we could ask him what he thinks about it.
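
For the TorchScript item, here is a minimal sketch of what the export path could look like, assuming an existing vision `learn`; the dummy input shape, file name, and try/except fallback are assumptions, not a final API:

```python
# Minimal TorchScript export sketch: trace simple models, fall back to
# scripting when the forward pass has data-dependent control flow.
import torch

model = learn.model.eval().cpu()      # assumes an existing fastai `learn`
dummy = torch.randn(1, 3, 224, 224)   # assumed input shape

try:
    exported = torch.jit.trace(model, dummy)
except Exception:
    exported = torch.jit.script(model)  # needed when forward() branches

exported.save("model_jit.pt")

# Later, plain-PyTorch inference without fastai installed:
loaded = torch.jit.load("model_jit.pt")
with torch.no_grad():
    preds = loaded(dummy)               # pre/post-processing is up to the user
```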

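And a sketch of the ONNX item, using torch.onnx.export; the input shape, opset version, and file name are assumptions, and U-Net-specific layers such as PixelShuffle may still fail depending on the opset:

```python
# Minimal ONNX export sketch for a fastai model.
import torch

model = learn.model.eval().cpu()      # assumes an existing fastai `learn`
dummy = torch.randn(1, 3, 224, 224)   # assumed input shape

torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    opset_version=12,                 # assumed; layer support varies by opset
)
```
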
We should have tests that periodically verify that this functionality is not broken and that performance is maintained. This is something fastai does not have right now and needs: for example, fastai's unet is slower than it used to be, which I noticed the other day.
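
A sketch of what such a regression test could look like (pytest style); the `learn` fixture, input shape, and tolerance are assumptions:

```python
# Sketch of an export-consistency regression test.
import torch


def test_torchscript_matches_eager(learn):   # `learn` assumed to come from a fixture
    model = learn.model.eval().cpu()
    x = torch.randn(2, 3, 224, 224)          # assumed input shape
    traced = torch.jit.trace(model, x)
    with torch.no_grad():
        eager_out, traced_out = model(x), traced(x)
    assert torch.allclose(eager_out, traced_out, atol=1e-5)
```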

Another cool thing would be to serve the model with TorchServe directly from fastai, something like:

learn.serve(port=5151)

and get a service running that answers inference requests over HTTP.
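
A minimal sketch of what such a `learn.serve` could look like, monkey-patched onto Learner with fastcore's `patch`; passing the raw request bytes straight to `learn.predict`, the port default, and the JSON response shape are all assumptions:

```python
# Sketch only: not the real fastai API, and not production-ready.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

from fastai.learner import Learner
from fastcore.all import patch


@patch
def serve(self: Learner, port: int = 5151):
    "Serve `self.predict` over HTTP on `port`."
    learn = self

    class _Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Read the raw request body (e.g. an image file's bytes)
            length = int(self.headers.get("Content-Length", 0))
            payload = self.rfile.read(length)
            pred, _, probs = learn.predict(payload)   # assumes predict() accepts bytes
            body = json.dumps({"prediction": str(pred),
                               "probs": probs.tolist()}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)

    HTTPServer(("0.0.0.0", port), _Handler).serve_forever()
```

With something like this, `learn.serve(port=5151)` would block and answer requests such as `curl -X POST --data-binary @img.jpg localhost:5151`.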

@muellerzr
Owner

muellerzr commented Jan 20, 2021 via email

@muellerzr
Owner

muellerzr commented Jan 20, 2021 via email

@tcapelle
Author

You don't sleep!
