[Performance] Maximize GPU Usage ONNX+directML C++ #25123

Open
@vortexModding

Description

Describe the issue

Hello,
I'm using ONNX Runtime + DirectML for inference. I do the pre-processing on the GPU, but my GPU usage doesn't exceed 55%. I could perhaps add multi-threading, but there may be more efficient approaches. How can I ensure I'm using the GPU's full capacity?

Here's how I set it up:

To reproduce

```cpp
global_session_options.SetIntraOpNumThreads(1);
global_session_options.SetGraphOptimizationLevel(ORT_ENABLE_ALL);

// ADD THE DIRECTML EXECUTION PROVIDER

try {
    s_ortApi = &Ort::GetApi();
    int adapterIndex = 0;

    Dx12::ChangeAdapter(adapterIndex);

    // Create the OrtMemoryInfo for DML only once
    OrtMemoryInfo* raw_memory_info;
    Ort::ThrowOnError(s_ortApi->CreateMemoryInfo("DML", OrtAllocatorType::OrtDeviceAllocator, 0, OrtMemType::OrtMemTypeDefault, &raw_memory_info));
    s_dmlMemoryInfoWrapper = std::unique_ptr<OrtMemoryInfo, std::function<void(OrtMemoryInfo*)>>(
        raw_memory_info,
        [api_ptr = s_ortApi](OrtMemoryInfo* p) {
            if (api_ptr)
                api_ptr->ReleaseMemoryInfo(p);
        }
    );
    s_dmlMemoryInfo = s_dmlMemoryInfoWrapper.get();

    ONNIX::_dmlApi->SessionOptionsAppendExecutionProvider_DML1(global_session_options, Dx12::_dmlDevice.get(), Dx12::_d3d12CommandQueue.get());
    OutputDebugStringA("DirectML Execution Provider successfully enabled.\n");
}
catch (const Ort::Exception& e) {
    OutputDebugStringA(("DirectML provider not available: " + std::string(e.what()) + "\n").c_str());
}

std::string modelPath = getExecutableDirectory() + "best.onnx";
if (!std::filesystem::exists(modelPath)) {
    ImGui::Text("ONNX file not found!");
    return;
}

// Note: this naive narrow-to-wide conversion is only correct for ASCII paths
std::wstring wModelPath(modelPath.begin(), modelPath.end());
global_session_ptr = std::make_unique<Ort::Session>(*global_env_ptr, wModelPath.c_str(), global_session_options);
ONNIX::OnnxIni = true;
```

Urgency

No response

Platform

Windows

OS Version

win 11 pro 24H2

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

opset 12

ONNX Runtime API

C++

Architecture

X64

Execution Provider

DirectML

Execution Provider Library Version

Microsoft.ML.OnnxRuntime.DirectML 1.22.0

Model File

No response

Is this a quantized model?

Yes

Labels

ep:DML (issues related to the DirectML execution provider), performance (issues related to performance regressions)