Performance issues with onnx_tf #353
Try this: #271. That issue and the related patch make tf_rep.run significantly faster. As for …
I have tried looking into the solution suggested in that thread, but I'm not certain where to include the following changes:
It would be really great if you could guide me on this.
That comment was written because inference without batching is inefficient (due to the lack of parallelism). People usually export models with an explicit batch size (often 1), and this can be a performance bottleneck. Moreover, session creation is also very time-consuming, which is what the related PR was trying to resolve. However, that PR has become a bit outdated and may not work out of the box. @fumihwh, can you try updating your patch (#273)? We should probably merge this PR ASAP since it's quite useful.
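To illustrate the batching point, here is a minimal sketch assuming the exported model accepts a flexible (or sufficiently large) batch dimension; the input shapes are placeholders, not taken from the original model:

```python
import numpy as np

# One tf_rep.run call per sample pays the dispatch overhead 100 times;
# stacking the samples into a single batch pays it once.
samples = [np.random.randn(224, 224, 3).astype(np.float32) for _ in range(100)]
batch = np.stack(samples, axis=0)  # shape (100, 224, 224, 3)

outputs = tf_rep.run(batch)  # a single run over the whole batch
```

Note that this only helps if the model was not exported with a fixed batch size of 1, which is exactly the bottleneck described above.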
It would be great if this PR were merged; I am facing the exact problem it would solve.
@Terizian Can you try exporting the graph (using tf_rep.export_graph) and using it directly? If I remember correctly, that improves performance significantly.
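A minimal sketch of that approach, assuming TensorFlow 1.x (current at the time of this thread); the tensor names are assumptions, since the actual names depend on the ONNX model's input and output names:

```python
import tensorflow as tf

# Export the TensorFlow graph once, up front.
tf_rep.export_graph('model.pb')

# Load the frozen graph and build one long-lived session, instead of
# paying graph-conversion and session-creation costs on every prediction.
with tf.gfile.GFile('model.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name='')

sess = tf.Session(graph=graph)
input_tensor = graph.get_tensor_by_name('input:0')    # assumed name
output_tensor = graph.get_tensor_by_name('output:0')  # assumed name

# Reuse the same session for every prediction; x is the input array
# from the original code.
result = sess.run(output_tensor, feed_dict={input_tensor: x})
```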
I have developed a model in MATLAB and saved it in the ONNX format. The model is 25 MB, with opset version 6. I am currently trying to move this model to production using the supported Python libraries. In my code, I have the following:
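Presumably this is the standard onnx-tf loading path, along these lines (the filename is an assumption):

```python
import onnx
from onnx_tf.backend import prepare

model = onnx.load('model.onnx')  # filename assumed
tf_rep = prepare(model)          # conversion/setup step
```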
This takes 25.71 seconds to run, which is far too slow for production use.
Additionally, when running the predictions:

```python
output = tf_rep.run(x)
```
Each prediction takes 4 seconds on average. My target is 100 predictions per second, and I'm finding that impossible with the framework. What can I try to speed it up?