Description
Describe the issue
Hello,
I'm using ONNX Runtime with DirectML for inference. I do the pre-processing on the GPU, but my GPU usage never exceeds 55%. I could try multi-threading, but there may be more efficient approaches. How can I make sure I'm using the GPU's full capacity?
Here's how I set it up:
To reproduce
```cpp
global_session_options.SetIntraOpNumThreads(1);
global_session_options.SetGraphOptimizationLevel(ORT_ENABLE_ALL);
// Add the DirectML execution provider
try {
    s_ortApi = &Ort::GetApi();
    int adapterIndex = 0;
    Dx12::ChangeAdapter(adapterIndex);
    // Create the OrtMemoryInfo for DML only once
    OrtMemoryInfo* raw_memory_info;
    Ort::ThrowOnError(s_ortApi->CreateMemoryInfo("DML", OrtAllocatorType::OrtDeviceAllocator, 0, OrtMemType::OrtMemTypeDefault, &raw_memory_info));
    s_dmlMemoryInfoWrapper = std::unique_ptr<OrtMemoryInfo, std::function<void(OrtMemoryInfo*)>>(
        raw_memory_info,
        [api_ptr = s_ortApi](OrtMemoryInfo* p) {
            if (api_ptr)
                api_ptr->ReleaseMemoryInfo(p);
        }
    );
    s_dmlMemoryInfo = s_dmlMemoryInfoWrapper.get();
    ONNIX::_dmlApi->SessionOptionsAppendExecutionProvider_DML1(global_session_options, Dx12::_dmlDevice.get(), Dx12::_d3d12CommandQueue.get());
    OutputDebugStringA("DirectML Execution Provider successfully enabled.\n");
}
catch (const Ort::Exception& e) {
    OutputDebugStringA(("DirectML provider not available: " + std::string(e.what()) + "\n").c_str());
}
std::string modelPath = getExecutableDirectory() + "best.onnx";
if (!std::filesystem::exists(modelPath)) {
    ImGui::Text("ONNX file not found!");
    return;
}
std::wstring wModelPath(modelPath.begin(), modelPath.end());
global_session_ptr = std::make_unique<Ort::Session>(*global_env_ptr, wModelPath.c_str(), global_session_options);
ONNIX::OnnxIni = true;
```
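A note on the setup above (not part of the original issue, a sketch under assumptions): with the DirectML EP, low GPU utilization is often caused by per-inference CPU↔GPU tensor copies rather than by thread count. One common mitigation is `Ort::IoBinding`, which lets inputs and outputs stay in DML device memory across `Run()` calls. The input/output names (`"images"`, `"output0"`) and the pre-created device-resident tensor `gpuInput` below are placeholders for illustration, not names from the model in this issue.

```cpp
#include <onnxruntime_cxx_api.h>
#include <vector>

// Sketch: bind a GPU-resident input and ask ORT to allocate the output
// directly in DML device memory, avoiding staging copies on every Run().
// `gpuInput` is assumed to be an Ort::Value already created on the DML device.
void RunWithIoBinding(Ort::Session& session, Ort::Value& gpuInput)
{
    Ort::IoBinding binding{session};

    // No host->device upload happens at Run() time; the data is already there.
    binding.BindInput("images", gpuInput);

    // Bind the output to DML memory so the result is not copied back to the
    // CPU unless explicitly requested.
    Ort::MemoryInfo dmlMemInfo{"DML", OrtDeviceAllocator, /*device id*/ 0, OrtMemTypeDefault};
    binding.BindOutput("output0", dmlMemInfo);

    session.Run(Ort::RunOptions{}, binding);

    // Outputs remain on the GPU; fetch them only when post-processing needs them.
    std::vector<Ort::Value> outputs = binding.GetOutputValues();
}
```

Separately, the DirectML EP documentation recommends `session_options.DisableMemPattern()` and sequential execution mode when appending the DML provider; it may be worth checking whether those are set in the code above.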
Urgency
No response
Platform
Windows
OS Version
win 11 pro 24H2
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
opset 12
ONNX Runtime API
C++
Architecture
X64
Execution Provider
DirectML
Execution Provider Library Version
Microsoft.ML.OnnxRuntime.DirectML 1.22.0
Model File
No response
Is this a quantized model?
Yes