[Performance] Maximize GPU Usage ONNX+directML C++ #25123

Open
@vortexModding

Description

Describe the issue

Hello,
I'm using ONNX Runtime + DirectML for inference. I do the pre-processing on the GPU, but my GPU usage doesn't exceed 55%. I could perhaps add multi-threading, but there may be more efficient approaches. How can I ensure I'm using the GPU's full capacity?

Here's how I set it up:

To reproduce

```cpp
global_session_options.SetIntraOpNumThreads(1);
global_session_options.SetGraphOptimizationLevel(ORT_ENABLE_ALL);

// ADD THE DIRECTML EXECUTION PROVIDER

try {
    s_ortApi = &Ort::GetApi();
    int adapterIndex = 0;

    Dx12::ChangeAdapter(adapterIndex);

    // Create the OrtMemoryInfo for DML only once
    OrtMemoryInfo* raw_memory_info;
    Ort::ThrowOnError(s_ortApi->CreateMemoryInfo("DML", OrtAllocatorType::OrtDeviceAllocator, 0, OrtMemType::OrtMemTypeDefault, &raw_memory_info));
    s_dmlMemoryInfoWrapper = std::unique_ptr<OrtMemoryInfo, std::function<void(OrtMemoryInfo*)>>(
        raw_memory_info,
        [api_ptr = s_ortApi](OrtMemoryInfo* p) {
            if (api_ptr)
                api_ptr->ReleaseMemoryInfo(p);
        }
    );
    s_dmlMemoryInfo = s_dmlMemoryInfoWrapper.get();

    ONNIX::_dmlApi->SessionOptionsAppendExecutionProvider_DML1(global_session_options, Dx12::_dmlDevice.get(), Dx12::_d3d12CommandQueue.get());
    OutputDebugStringA("DirectML Execution Provider successfully enabled.\n");
}
catch (const Ort::Exception& e) {
    OutputDebugStringA(("DirectML provider not available: " + std::string(e.what()) + "\n").c_str());
}

std::string modelPath = getExecutableDirectory() + "best.onnx";
if (!std::filesystem::exists(modelPath)) {
    ImGui::Text("ONNX file not found!");
    return;
}

// Note: this naive narrow-to-wide conversion is only correct for ASCII paths
std::wstring wModelPath(modelPath.begin(), modelPath.end());
global_session_ptr = std::make_unique<Ort::Session>(*global_env_ptr, wModelPath.c_str(), global_session_options);
ONNIX::OnnxIni = true;
```

Urgency

No response

Platform

Windows

OS Version

win 11 pro 24H2

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

opset 12

ONNX Runtime API

C++

Architecture

X64

Execution Provider

DirectML

Execution Provider Library Version

Microsoft.ML.OnnxRuntime.DirectML 1.22.0

Model File

No response

Is this a quantized model?

Yes

Labels

ep:DML (issues related to the DirectML execution provider), performance (issues related to performance regressions)