
Bulk Prediction for multi input in C# #20148

Closed
roushrsh opened this issue Mar 29, 2024 · 5 comments
Labels
ep:CUDA (issues related to the CUDA execution provider)
platform:windows (issues related to the Windows platform)

Comments

@roushrsh

roushrsh commented Mar 29, 2024

Describe the issue

Hardware: 4090 GPU, 13900K CPU. How can I predict in bulk using InferenceSession.Run in C#?

Python appears to run an order of magnitude faster, since in TensorFlow I can bulk predict with model(input) or model.predict(input).

See the reproduction below, thanks.

To reproduce

Hi,

My model takes in two 1D array images and five features.

var inputsB = new List<NamedOnnxValue> {
    namedInputValue1B, namedInputValue2B, namedInputValue4, namedInputValue5,
    namedInputValue6, namedInputValue7, namedInputValue8
};
listTopredOn.Add(inputsB);

I set up:

int gpuDeviceId = 0; // The GPU device ID to execute on
using var gpuSessionOptions = SessionOptions.MakeSessionOptionWithCudaProvider(gpuDeviceId);

var globalSessionX = new InferenceSession(ONNX_MODEL_PATH, gpuSessionOptions);

// Then I go through all of them and predict one at a time:

for (int i = 0; i < listTopredOn.Count; i++)
{
    var ff1 = globalSessionX.Run(listTopredOn[i])[0].AsTensor<float>();
    var resultValue1 = ff1.GetValue(0);
    var prediction11 = Convert.ToDouble(resultValue1);
}

This takes much longer on the GPU (4090) than on my CPU, whereas the same prediction in Python is over an order of magnitude faster (yes, this is after an initial warm-up prediction so the model is set up). I imagine this is because it predicts one sample at a time, and each call incurs CPU-to-GPU transfers, which are extremely slow when repeated.

How can I change my input or code so the ONNX session does a bulk prediction in C#?
I believe if I had just one image input of shape (1, 512, 512, 3) I could simply change the first dimension to get (1500, 512, 512, 3), but I'm not sure what to do in my case.

Thanks so much!

Urgency

No response

Platform

Windows

OS Version

10

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.17.1

ONNX Runtime API

C#

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

11

@github-actions github-actions bot added the ep:CUDA and platform:windows labels Mar 29, 2024
@yuslepukhin
Member

The input/output dimensions of your model dictate what you can process.
This does not change with the API you choose: the model remains the same, and so does the implementation behind the API.

You also need to stop leaking memory in the loop. Please read https://onnxruntime.ai/docs/tutorials/csharp/basic_csharp.html
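
For reference, a minimal sketch of the disposal pattern the linked tutorial describes, reusing the variable names from the snippet above (Run returns a disposable collection, so the native output buffers need to be released each iteration):

// requires: using System.Linq; (for First())
for (int i = 0; i < listTopredOn.Count; i++)
{
    // Run returns an IDisposableReadOnlyCollection; without the using,
    // every iteration leaks the native memory backing the outputs
    using var results = globalSessionX.Run(listTopredOn[i]);
    var ff1 = results.First().AsTensor<float>();
    var prediction = Convert.ToDouble(ff1.GetValue(0));
}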

@roushrsh
Author

I guess I'm asking how to predict in batches, rather than one at a time in a loop. Thanks.

@roushrsh
Author

I currently save the model, which is built as:

model4 = keras.models.Model(inputs=[spectra1, spectra2, ratioAProtS, ratioBProtS, isoS, hitA, hitB, sizeS], outputs=output)

like so:

import tf2onnx
import tensorflow as tf
import onnx

input_signature = [
    tf.TensorSpec((1, 1001), tf.float32),
    tf.TensorSpec((1, 1001), tf.float32),
    tf.TensorSpec((1, 1001), tf.float32),
    tf.TensorSpec((1,), tf.float32),
    tf.TensorSpec((1,), tf.float32),
    tf.TensorSpec((1,), tf.float32),
    tf.TensorSpec((1,), tf.float32),
    tf.TensorSpec((1,), tf.float32),
    tf.TensorSpec((1,), tf.float32),
]

onnx_model, _ = tf2onnx.convert.from_keras(model4, input_signature, opset=18)
onnx.save(onnx_model, "../../Desktop/testModel.onnx")

How could I change it for multi-batch prediction? Thanks.

@yuslepukhin
Member

yuslepukhin commented Mar 29, 2024

There are no special capabilities for bulk predictions. If your model is constructed so that it accepts the shapes in your example, then you simply feed that data. C# is just a thin layer on top of the native library.

There is no difference between Python and C# in terms of capabilities.

@roushrsh
Author

Beautiful, @yuslepukhin. Thanks, I figured it out.

For future readers, what yuslepukhin says is key: there is no difference between the APIs.

You need a mix of these two threads:

#9867
onnx/onnx#2182

You just need to change the batch dimension to N (dynamic) and then provide whatever batch you need.

Got it working in Python; I should be able to do it in C# now. Will report back if not, but otherwise this is closed.
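
For anyone landing here later, a minimal sketch of what the batched call can look like in C#, assuming the model was re-exported with a dynamic batch dimension (e.g. tf.TensorSpec((None, 1001), tf.float32) in the input signature above). The input name "spectra1", the samples collection, and the Spectrum1 field are illustrative placeholders, not from this thread:

// requires: using System.Linq; using Microsoft.ML.OnnxRuntime; using Microsoft.ML.OnnxRuntime.Tensors;

int n = samples.Count; // number of samples to predict in one call

// Stack the n single-sample arrays into one (n, 1001) tensor per input,
// instead of n separate (1, 1001) tensors
var spectra1 = new DenseTensor<float>(new[] { n, 1001 });
for (int i = 0; i < n; i++)
    for (int j = 0; j < 1001; j++)
        spectra1[i, j] = samples[i].Spectrum1[j];

var inputs = new List<NamedOnnxValue>
{
    NamedOnnxValue.CreateFromTensor("spectra1", spectra1),
    // ...build and add the remaining inputs the same way, each with leading dimension n...
};

// One Run call for the whole batch: a single CPU-to-GPU round trip instead of n
using var results = globalSessionX.Run(inputs);
var predictions = results.First().AsTensor<float>(); // shape (n, 1)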
