UID

title

description

helpviewer_keywords

old-location

tech.root

ms.assetid

ms.date

req.header

req.include-header

req.target-type

req.target-min-winverclnt

req.target-min-winversvr

req.kmdf-ver

req.umdf-ver

req.ddi-compliance

req.unicode-ansi

req.idl

req.max-support

req.namespace

req.assembly

req.type-library

req.lib

req.dll

req.irql

targetos

req.typenames

req.redist

ms.custom

f1_keywords

dev_langs

topic_type

api_type

api_location

api_name

NS:directml.DML_GEMM_OPERATOR_DESC

DML_GEMM_OPERATOR_DESC

Performs a general matrix multiplication function of the form `Output = FusedActivation(Alpha * TransA(A) x TransB(B) + Beta * C)`, where `x` denotes matrix multiplication, and `*` denotes multiplication with a scalar.

DML_GEMM_OPERATOR_DESC

DML_GEMM_OPERATOR_DESC structure

direct3d12.dml_gemm_operator_desc

directml/DML_GEMM_OPERATOR_DESC

direct3d12\dml_gemm_operator_desc.htm

directml

11482420-678E-4914-90F0-9F952BC09FF7

12/01/2022

directml.h

Windows

19H1

DML_GEMM_OPERATOR_DESC

directml/DML_GEMM_OPERATOR_DESC

c++

APIRef

kbSyntax

HeaderDef

DirectML.h

DML_GEMM_OPERATOR_DESC

-description

Performs a general matrix multiplication function of the form `Output = FusedActivation(Alpha * TransA(A) x TransB(B) + Beta * C)`, where `x` denotes matrix multiplication, and `*` denotes multiplication with a scalar.

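At DML_FEATURE_LEVEL_1_0, this operator requires 4D tensors with layout { BatchCount, ChannelCount, Height, Width }; from DML_FEATURE_LEVEL_4_0, tensors with 2 to 4 dimensions are supported (see Tensor support). The operator performs BatchCount * ChannelCount independent matrix multiplications.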

For example, if ATensor has Sizes of { BatchCount, ChannelCount, M, K }, and BTensor has Sizes of { BatchCount, ChannelCount, K, N }, and OutputTensor has Sizes of { BatchCount, ChannelCount, M, N }, then this operator performs BatchCount * ChannelCount independent matrix multiplications of dimensions {M,K} x {K,N} = {M,N}.
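The formula and batching described above can be sketched as a CPU reference routine. This is only an illustration of the arithmetic, not how DirectML evaluates the operator: `ReferenceGemm` and the `Transform` enum are hypothetical names, and the tensors are assumed to be densely packed in row-major order with no custom strides.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for DML_MATRIX_TRANSFORM (not the DirectML enum).
enum class Transform { None, Transpose };

// Computes Output = activation(alpha * TransA(A) x TransB(B) + beta * C)
// for matrixCount (= BatchCount * ChannelCount) independent matrices.
// Assumes densely packed, row-major data. An empty `c` models a null CTensor.
std::vector<float> ReferenceGemm(
    const std::vector<float>& a,
    const std::vector<float>& b,
    const std::vector<float>& c,
    std::size_t matrixCount,
    std::size_t m, std::size_t k, std::size_t n,
    Transform transA, Transform transB,
    float alpha, float beta,
    const std::function<float(float)>& activation = nullptr)
{
    std::vector<float> out(matrixCount * m * n, 0.0f);
    for (std::size_t batch = 0; batch < matrixCount; ++batch) {
        const float* pa = a.data() + batch * m * k;
        const float* pb = b.data() + batch * k * n;
        for (std::size_t i = 0; i < m; ++i) {
            for (std::size_t j = 0; j < n; ++j) {
                float acc = 0.0f;
                for (std::size_t p = 0; p < k; ++p) {
                    // A is stored as {M, K} without a transpose, {K, M} with one.
                    const float av = (transA == Transform::None) ? pa[i * k + p] : pa[p * m + i];
                    // B is stored as {K, N} without a transpose, {N, K} with one.
                    const float bv = (transB == Transform::None) ? pb[p * n + j] : pb[j * k + p];
                    acc += av * bv;
                }
                float value = alpha * acc;
                if (!c.empty()) {
                    value += beta * c[(batch * m + i) * n + j];  // Beta is ignored without C
                }
                out[(batch * m + i) * n + j] = activation ? activation(value) : value;
            }
        }
    }
    return out;
}
```

For instance, with A = [[1, 2], [3, 4]], B = [[5, 6], [7, 8]], C filled with ones, Alpha = 1, and Beta = 2, the routine produces [[21, 24], [45, 52]].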

-struct-fields

-field ATensor

Type: const DML_TENSOR_DESC*

A tensor containing the A matrix. This tensor's Sizes should be { BatchCount, ChannelCount, M, K } if TransA is DML_MATRIX_TRANSFORM_NONE, or { BatchCount, ChannelCount, K, M } if TransA is DML_MATRIX_TRANSFORM_TRANSPOSE.

-field BTensor

Type: const DML_TENSOR_DESC*

A tensor containing the B matrix. This tensor's Sizes should be { BatchCount, ChannelCount, K, N } if TransB is DML_MATRIX_TRANSFORM_NONE, or { BatchCount, ChannelCount, N, K } if TransB is DML_MATRIX_TRANSFORM_TRANSPOSE.

-field CTensor

Type: _Maybenull_ const DML_TENSOR_DESC*

A tensor containing the C matrix, or nullptr. If CTensor is not provided, its values are treated as 0. If provided, this tensor's Sizes should be { BatchCount, ChannelCount, M, N }.

-field OutputTensor

Type: const DML_TENSOR_DESC*

The tensor to write the results to. This tensor's Sizes are { BatchCount, ChannelCount, M, N }.

-field TransA

Type: DML_MATRIX_TRANSFORM

The transform to be applied to ATensor; either a transpose, or no transform.

-field TransB

Type: DML_MATRIX_TRANSFORM

The transform to be applied to BTensor; either a transpose, or no transform.
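The relationship between TransA, TransB, and the expected Sizes of each tensor can be sketched with a small helper. `Transform`, `GemmSizes`, and `MakeGemmSizes` are hypothetical names introduced for illustration; they are not part of DirectML, and the sketch covers only the 4D case.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for DML_MATRIX_TRANSFORM (not the DirectML enum).
enum class Transform { None, Transpose };

// Hypothetical container for the four 4D Sizes arrays.
struct GemmSizes {
    std::array<std::uint32_t, 4> a;
    std::array<std::uint32_t, 4> b;
    std::array<std::uint32_t, 4> c;
    std::array<std::uint32_t, 4> output;
};

// Builds the Sizes that the field descriptions call for, given the logical
// GEMM dimensions M, K, N and the transforms applied to A and B.
GemmSizes MakeGemmSizes(std::uint32_t batchCount, std::uint32_t channelCount,
                        std::uint32_t m, std::uint32_t k, std::uint32_t n,
                        Transform transA, Transform transB)
{
    const bool ta = (transA == Transform::Transpose);
    const bool tb = (transB == Transform::Transpose);

    GemmSizes sizes{};
    sizes.a = {batchCount, channelCount, ta ? k : m, ta ? m : k};  // {M,K} or {K,M}
    sizes.b = {batchCount, channelCount, tb ? n : k, tb ? k : n};  // {K,N} or {N,K}
    sizes.c = {batchCount, channelCount, m, n};                    // unaffected by transforms
    sizes.output = sizes.c;
    return sizes;
}
```

Note that only the stored shapes of ATensor and BTensor change with the transforms; CTensor and OutputTensor are always { BatchCount, ChannelCount, M, N }.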

-field Alpha

Type: FLOAT

The value of the scalar multiplier for the product of inputs ATensor and BTensor.

-field Beta

Type: FLOAT

The value of the scalar multiplier for the optional input CTensor. If CTensor is not provided, then this value is ignored.

-field FusedActivation

Type: _Maybenull_ const DML_OPERATOR_DESC*

An optional fused activation layer to apply after the GEMM. For more info, see Using fused operators for improved performance.

Availability

This operator was introduced in DML_FEATURE_LEVEL_1_0.

Tensor constraints

ATensor, BTensor, CTensor, and OutputTensor must have the same DataType and DimensionCount.
CTensor and OutputTensor must have the same Sizes.
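The two constraints above could be checked before creating the operator. The following is a minimal sketch, assuming a simplified `TensorInfo` struct in place of the real tensor descriptions; `DataType`, `TensorInfo`, and `CheckGemmConstraints` are hypothetical names, not DirectML types or functions.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical simplified stand-ins for the DirectML tensor description types.
enum class DataType { Float32, Float16 };

struct TensorInfo {
    DataType dataType;
    std::vector<std::uint32_t> sizes;  // sizes.size() plays the role of DimensionCount
};

// Returns an empty string if the constraints hold, otherwise a description
// of the first violation found. Pass nullptr for `c` when CTensor is omitted.
std::string CheckGemmConstraints(const TensorInfo& a, const TensorInfo& b,
                                 const TensorInfo* c, const TensorInfo& output)
{
    std::vector<const TensorInfo*> tensors = {&a, &b, &output};
    if (c != nullptr) {
        tensors.push_back(c);
    }
    for (const TensorInfo* tensor : tensors) {
        if (tensor->dataType != a.dataType) {
            return "All tensors must have the same DataType.";
        }
        if (tensor->sizes.size() != a.sizes.size()) {
            return "All tensors must have the same DimensionCount.";
        }
    }
    if (c != nullptr && c->sizes != output.sizes) {
        return "CTensor and OutputTensor must have the same Sizes.";
    }
    return {};
}
```

The sketch checks only the constraints listed in this section; it does not verify that the inner dimensions of A and B agree.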

Tensor support

DML_FEATURE_LEVEL_4_0 and above

| Tensor | Kind | Dimensions | Supported dimension counts | Supported data types |
| ------ | ---- | ---------- | -------------------------- | -------------------- |
| ATensor | Input | { [BatchCount], [ChannelCount], M, K } | 2 to 4 | FLOAT32, FLOAT16 |
| BTensor | Input | { [BatchCount], [ChannelCount], K, N } | 2 to 4 | FLOAT32, FLOAT16 |
| CTensor | Optional input | { [BatchCount], [ChannelCount], M, N } | 2 to 4 | FLOAT32, FLOAT16 |
| OutputTensor | Output | { [BatchCount], [ChannelCount], M, N } | 2 to 4 | FLOAT32, FLOAT16 |

DML_FEATURE_LEVEL_1_0 and above

| Tensor | Kind | Dimensions | Supported dimension counts | Supported data types |
| ------ | ---- | ---------- | -------------------------- | -------------------- |
| ATensor | Input | { BatchCount, ChannelCount, M, K } | 4 | FLOAT32, FLOAT16 |
| BTensor | Input | { BatchCount, ChannelCount, K, N } | 4 | FLOAT32, FLOAT16 |
| CTensor | Optional input | { BatchCount, ChannelCount, M, N } | 4 | FLOAT32, FLOAT16 |
| OutputTensor | Output | { BatchCount, ChannelCount, M, N } | 4 | FLOAT32, FLOAT16 |
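The dimension-count rules in the two tables can be summarized in code. This is a sketch under two assumptions: the feature level is reduced to a single flag (4_0 and above versus anything lower), and the bracketed dimensions, when omitted, are treated as a count of 1. `IsSupportedDimensionCount` and `IndependentMatrixCount` are hypothetical helper names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// True if a tensor with `dimensionCount` dimensions is accepted according to
// the tables above. `featureLevel4_0OrAbove` is a simplification of the real
// feature level value.
bool IsSupportedDimensionCount(std::size_t dimensionCount, bool featureLevel4_0OrAbove)
{
    if (featureLevel4_0OrAbove) {
        return dimensionCount >= 2 && dimensionCount <= 4;
    }
    return dimensionCount == 4;
}

// Number of independent matrix multiplications implied by a Sizes array:
// the product of every dimension before the final two (the matrix itself).
// Returns 0 for arrays too short to hold a matrix.
std::uint64_t IndependentMatrixCount(const std::vector<std::uint32_t>& sizes)
{
    if (sizes.size() < 2) {
        return 0;
    }
    std::uint64_t count = 1;
    for (std::size_t i = 0; i + 2 < sizes.size(); ++i) {
        count *= sizes[i];
    }
    return count;
}
```

Under these assumptions, a 4D tensor with Sizes { 2, 3, M, K } yields 6 independent multiplications, and a 2D tensor { M, K } yields 1.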

-see-also

Using fused operators for improved performance
