Bulk Prediction for multi input in C# #20148
Comments
The input/output dimensions of your model dictate what you can process. You also need to stop leaking memory in the loop. Please read https://onnxruntime.ai/docs/tutorials/csharp/basic_csharp.html
I guess I'm asking how to predict in batches, rather than one at a time in a loop. Thanks.
I currently save the model like so:
import tf2onnx
onnx_model, _ = tf2onnx.convert.from_keras(model4, input_signature, opset=18)
How could I change it for multi-batch? Thanks.
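One common way to make a converted model accept batches is to declare a dynamic batch axis (`None`) in the input signature before calling tf2onnx. A minimal sketch, assuming this model's two image inputs and one feature input; the input names (`image_a`, `image_b`, `features`) are hypothetical, and the TensorFlow/tf2onnx calls are shown as comments since they need the actual Keras model:

```python
# Sketch: a leading dimension of None means the batch size is decided at
# run time, so one exported model serves both single and bulk prediction.
image_shape = [None, 512, 512, 3]  # None = dynamic batch axis
features_shape = [None, 5]

# import tensorflow as tf
# import tf2onnx
# input_signature = [
#     tf.TensorSpec(image_shape, tf.float32, name="image_a"),
#     tf.TensorSpec(image_shape, tf.float32, name="image_b"),
#     tf.TensorSpec(features_shape, tf.float32, name="features"),
# ]
# onnx_model, _ = tf2onnx.convert.from_keras(model4, input_signature, opset=18)

print(image_shape[0] is None)
```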
There are no special capabilities for bulk predictions. If your model is constructed in a way that it accepts the shapes in your example, then you simply feed that data. C# is just a thin layer on top of the native library; there is no difference between Python and C# in terms of capabilities.
Beautiful, @yuslepukhin. Thanks. I figured it out. For future people, what yuslepukhin says is key. There's no difference. You need a mix of these two threads: just change the batch dimension to N, and then provide the data you need. Got it working in Python, so I should be able to in C# now. Will report back if not, but otherwise this is closed.
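For future readers, the batching step the thread converges on can be sketched in Python with NumPy. This is an illustrative sketch, not the poster's actual code: the input names are hypothetical and must match the converted model's, and the ONNX Runtime call is left as a comment since it requires the exported model file:

```python
import numpy as np

# Stack N single-sample inputs into one batched array per model input,
# instead of calling run() once per sample in a loop.
n = 4  # batch size; would be 1500 in the original question
images_a = np.stack([np.zeros((512, 512, 3), np.float32) for _ in range(n)])
images_b = np.stack([np.zeros((512, 512, 3), np.float32) for _ in range(n)])
features = np.stack([np.zeros(5, np.float32) for _ in range(n)])

# With a dynamic batch axis in the model, a single run() scores the batch:
# import onnxruntime as ort
# sess = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
# outputs = sess.run(None, {"image_a": images_a, "image_b": images_b,
#                           "features": features})

print(images_a.shape, features.shape)
```

The same pattern carries over to C#: build one `DenseTensor` per input with N as the leading dimension and call `Run` once.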
Describe the issue
4090 GPU, 13900K CPU. How can I predict in bulk using sess.run in C#?
Python appears to run an order of magnitude faster, as I can bulk predict in tensorflow doing model(input) or model.predict(input).
See reproduce below, thanks
To reproduce
Hi,
My model takes in two 1D image arrays and 5 features.
I set up:
int gpuDeviceId = 0; // The GPU device ID to execute on
using var gpuSessionOptions = SessionOptions.MakeSessionOptionWithCudaProvider(gpuDeviceId);
// then I go through all of the inputs and predict one at a time
This takes much longer on the GPU (4090) than on my CPU, whereas the same prediction in Python is over an order of magnitude faster (yes, this is after I do an initial prediction so the model is warmed up). I imagine this is because it's predicting one at a time, and each call requires GPU-to-CPU transfers that are extremely slow when repeated.
How can I change my input or code so the ONNX session does a bulk prediction in C#?
I believe if I had just one image input of shape (1, 512, 512, 3) I could just change the first dimension to get (1500, 512, 512, 3), but I'm not sure what to do in my case with multiple inputs.
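With multiple inputs the idea is the same: every input gets the same leading batch dimension. A NumPy sketch of turning the per-sample feeds from the loop into one batched feed (shapes taken from the question; input names are hypothetical):

```python
import numpy as np

# Hypothetical per-sample inputs, as they would be fed one at a time:
n = 8  # 1500 in the question; kept small here
samples = [
    {"image_a": np.zeros((1, 512, 512, 3), np.float32),
     "image_b": np.zeros((1, 512, 512, 3), np.float32),
     "features": np.zeros((1, 5), np.float32)}
    for _ in range(n)
]

# Concatenate along axis 0 so each input carries the whole batch:
batched = {name: np.concatenate([s[name] for s in samples])
           for name in samples[0]}

print(batched["image_a"].shape, batched["features"].shape)
```

The resulting dictionary maps each input name to an array whose first dimension is the batch size, which is exactly what a session with a dynamic batch axis expects in a single run call.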
Thanks so much!
Urgency
No response
Platform
Windows
OS Version
10
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.17.1
ONNX Runtime API
C#
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
11