DirectML Execution Provider (Preview)
DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning on Windows. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers.
When used standalone, the DirectML API is a low-level DirectX 12 library suitable for high-performance, low-latency applications such as frameworks, games, and other real-time applications. Its seamless interoperability with Direct3D 12, low overhead, and conformance across hardware make DirectML ideal for accelerating machine learning when high performance is desired and the reliability and predictability of results across hardware is critical.
The DirectML Execution Provider is an optional component of ONNX Runtime that uses DirectML to accelerate inference of ONNX models. The DirectML execution provider is capable of greatly improving evaluation time of models using commodity GPU hardware, without sacrificing broad hardware support or requiring vendor-specific extensions to be installed.
The DirectML Execution Provider is currently in preview.
The DirectML execution provider requires any DirectX 12 capable device. Almost all commercially-available graphics cards released in the last several years support DirectX 12. Examples of compatible hardware include:
- NVIDIA Kepler (GTX 600 series) and above
- AMD GCN 1st Gen (Radeon HD 7000 series) and above
- Intel Haswell (4th-gen core) HD Integrated Graphics and above
DirectML is compatible with Windows 10, version 1709 (10.0.16299; RS3, "Fall Creators Update") and newer.
Building from source
For general information about building onnxruntime, see BUILD.md.
Requirements for building the DirectML execution provider:
- Visual Studio 2017 toolchain (see cmake configuration instructions)
- The Windows 10 SDK (10.0.18362.0) for Windows 10, version 1903 (or newer)
To build onnxruntime with the DML EP included, supply the --use_dml parameter to the build script, for example:
build.bat --config RelWithDebInfo --build_shared_lib --parallel --use_dml
The DirectML execution provider supports building for both x64 (default) and x86 architectures.
Note that building onnxruntime with the DirectML execution provider enabled causes the DirectML redistributable package to be downloaded automatically as part of the build. This package contains a pre-release version of DirectML, and its use is governed by a license whose text may be found as part of the NuGet package.
Using the DirectML execution provider
When using the C API with a DML-enabled build of onnxruntime (see Building from source), the DirectML execution provider can be enabled using one of the two factory functions included in
Creates a DirectML Execution Provider which executes on the hardware adapter with the given device_id, also known as the adapter index. The device ID corresponds to the enumeration order of hardware adapters as given by IDXGIFactory::EnumAdapters. A device_id of 0 always corresponds to the default adapter, which is typically the primary display GPU installed on the system. A negative device_id is invalid.
OrtStatus* OrtSessionOptionsAppendExecutionProvider_DML( _In_ OrtSessionOptions* options, int device_id );
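To illustrate how device_id indexes into the adapter enumeration order, here is a minimal C++ sketch. AdapterInfo and select_adapter are hypothetical names and are not part of onnxruntime or DXGI. The vector stands in for the adapters that IDXGIFactory::EnumAdapters would report, and treating an out-of-range index as invalid is an assumption made for this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one adapter reported by enumeration.
struct AdapterInfo {
    std::string description;
};

// Returns the adapter at `device_id` in enumeration order, or nullptr when
// the index is negative or (by assumption) past the end of the list.
const AdapterInfo* select_adapter(const std::vector<AdapterInfo>& adapters,
                                  int device_id) {
    if (device_id < 0) {
        return nullptr;  // negative device IDs are invalid
    }
    const std::size_t index = static_cast<std::size_t>(device_id);
    return index < adapters.size() ? &adapters[index] : nullptr;
}
```

With this helper, index 0 resolves to the first enumerated adapter, which matches the description of the default adapter above.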
Creates a DirectML Execution Provider using the given DirectML device, and which executes work on the supplied D3D12 command queue. The DirectML device and D3D12 command queue must have the same parent ID3D12Device, or an error will be returned. The D3D12 command queue must be of type COMPUTE (see D3D12_COMMAND_LIST_TYPE). If this function succeeds, the inference session once created will maintain a strong reference on both the DirectML device and the D3D12 command queue.
OrtStatus* OrtSessionOptionsAppendExecutionProviderEx_DML( _In_ OrtSessionOptions* options, _In_ IDMLDevice* dml_device, _In_ ID3D12CommandQueue* cmd_queue );
ONNX opset support
The DirectML execution provider currently supports ONNX opset 9 (ONNX v1.4). Evaluating models which require a higher opset version is not supported, and may produce unexpected results.
Multi-threading and supported session options
The DirectML execution provider does not support the use of memory pattern optimizations or parallel execution in onnxruntime. When supplying session options during InferenceSession creation, these options must be disabled or an error will be returned.
If using the onnxruntime C API, you must call the DisableMemPattern and SetSessionExecutionMode functions to set the options required by the DirectML execution provider.
OrtStatus*(ORT_API_CALL* DisableMemPattern)(_Inout_ OrtSessionOptions* options)NO_EXCEPTION;
OrtStatus*(ORT_API_CALL* SetSessionExecutionMode)(_Inout_ OrtSessionOptions* options, ExecutionMode execution_mode)NO_EXCEPTION;
If creating the onnxruntime InferenceSession object directly, you must set the appropriate fields on the onnxruntime::SessionOptions struct. Specifically, execution_mode must be set to sequential execution, and enable_mem_pattern must be false.
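The two constraints described in this section can be expressed as a small pre-flight check. The sketch below is illustrative only: ExecutionModeSketch, SessionOptionsSketch, and dml_option_error are hypothetical names that mirror the two settings discussed here, not the real onnxruntime types, and the default values are chosen purely for the example.

```cpp
#include <string>

// Hypothetical mirror of the execution mode setting; not the onnxruntime enum.
enum class ExecutionModeSketch { Sequential, Parallel };

// Hypothetical mirror of the two session option fields named in the text.
struct SessionOptionsSketch {
    ExecutionModeSketch execution_mode = ExecutionModeSketch::Sequential;
    bool enable_mem_pattern = true;
};

// Returns an empty string when both constraints hold, otherwise a
// description of the first violated constraint.
std::string dml_option_error(const SessionOptionsSketch& opts) {
    if (opts.enable_mem_pattern) {
        return "memory pattern optimization must be disabled";
    }
    if (opts.execution_mode != ExecutionModeSketch::Sequential) {
        return "parallel execution must be disabled";
    }
    return "";
}
```

Running a check like this before creating the session would surface a misconfiguration early, rather than relying on the error returned at session creation.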
Additionally, as the DirectML execution provider does not support parallel execution, it does not support multi-threaded calls to Run on the same inference session. That is, if an inference session is using the DirectML execution provider, only one thread may call Run at a time. Multiple threads are permitted to call Run simultaneously if they operate on different inference session objects.
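One way an application might enforce the one-thread-at-a-time rule is to guard each session with its own mutex. The following is a minimal sketch under that assumption. SerializedSession and with_lock are hypothetical names, and the Session template parameter stands in for whatever session object the application holds; onnxruntime does not provide this wrapper.

```cpp
#include <mutex>
#include <utility>

// Wraps a session-like object and serializes every access to it.
// Each wrapper owns its own mutex, so different sessions never block
// each other, while calls on the same session run one at a time.
template <typename Session>
class SerializedSession {
public:
    explicit SerializedSession(Session session)
        : session_(std::move(session)) {}

    // Invokes fn(session) while holding the lock and returns its result.
    template <typename Fn>
    auto with_lock(Fn&& fn) -> decltype(fn(std::declval<Session&>())) {
        std::lock_guard<std::mutex> guard(mutex_);
        return std::forward<Fn>(fn)(session_);
    }

private:
    std::mutex mutex_;
    Session session_;
};
```

A caller would pass a lambda that invokes Run on the wrapped session. Threads sharing one wrapper are then serialized, and threads using separate wrappers proceed independently.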
A complete sample of onnxruntime using the DirectML execution provider can be found under samples/c_cxx/fns_candy_style_transfer.