Skip to content

Latest commit

 

History

History
137 lines (99 loc) · 5.79 KB

ns-directml-dml_gemm_operator_desc.md

File metadata and controls

137 lines (99 loc) · 5.79 KB
UID title description helpviewer_keywords old-location tech.root ms.assetid ms.date req.header req.include-header req.target-type req.target-min-winverclnt req.target-min-winversvr req.kmdf-ver req.umdf-ver req.ddi-compliance req.unicode-ansi req.idl req.max-support req.namespace req.assembly req.type-library req.lib req.dll req.irql targetos req.typenames req.redist ms.custom f1_keywords dev_langs topic_type api_type api_location api_name
NS:directml.DML_GEMM_OPERATOR_DESC
DML_GEMM_OPERATOR_DESC
Performs a general matrix multiplication function of the form `Output = FusedActivation(Alpha * TransA(A) x TransB(B) + Beta * C)`, where `x` denotes matrix multiplication, and `*` denotes multiplication with a scalar.
DML_GEMM_OPERATOR_DESC
DML_GEMM_OPERATOR_DESC structure
direct3d12.dml_gemm_operator_desc
directml/DML_GEMM_OPERATOR_DESC
direct3d12\dml_gemm_operator_desc.htm
directml
11482420-678E-4914-90F0-9F952BC09FF7
12/01/2022
directml.h
Windows
Windows
19H1
DML_GEMM_OPERATOR_DESC
directml/DML_GEMM_OPERATOR_DESC
c++
APIRef
kbSyntax
HeaderDef
DirectML.h
DML_GEMM_OPERATOR_DESC

-description

Performs a general matrix multiplication function of the form Output = FusedActivation(Alpha * TransA(A) x TransB(B) + Beta * C), where x denotes matrix multiplication, and * denotes multiplication with a scalar.

This operator requires 4D tensors with layout { BatchCount, ChannelCount, Height, Width }, and it will perform BatchCount * ChannelCount number of independent matrix multiplications.

For example, if ATensor has Sizes of { BatchCount, ChannelCount, M, K }, and BTensor has Sizes of { BatchCount, ChannelCount, K, N }, and OutputTensor has Sizes of { BatchCount, ChannelCount, M, N }, then this operator performs BatchCount * ChannelCount independent matrix multiplications of dimensions {M,K} x {K,N} = {M,N}.

-struct-fields

-field ATensor

Type: const DML_TENSOR_DESC*

A tensor containing the A matrix. This tensor's Sizes should be { BatchCount, ChannelCount, M, K } if TransA is DML_MATRIX_TRANSFORM_NONE, or { BatchCount, ChannelCount, K, M } if TransA is DML_MATRIX_TRANSFORM_TRANSPOSE.

-field BTensor

Type: const DML_TENSOR_DESC*

A tensor containing the B matrix. This tensor's Sizes should be { BatchCount, ChannelCount, K, N } if TransB is DML_MATRIX_TRANSFORM_NONE, or { BatchCount, ChannelCount, N, K } if TransB is DML_MATRIX_TRANSFORM_TRANSPOSE.

-field CTensor

Type: _Maybenull_ const DML_TENSOR_DESC*

A tensor containing the C matrix, or nullptr. Values default to 0 when not provided. If provided, this tensor's Sizes should be { BatchCount, ChannelCount, M, N }.

-field OutputTensor

Type: const DML_TENSOR_DESC*

The tensor to write the results to. This tensor's Sizes are { BatchCount, ChannelCount, M, N }.

-field TransA

Type: DML_MATRIX_TRANSFORM

The transform to be applied to ATensor; either a transpose, or no transform.

-field TransB

Type: DML_MATRIX_TRANSFORM

The transform to be applied to BTensor; either a transpose, or no transform.

-field Alpha

Type: FLOAT

The value of the scalar multiplier for the product of inputs ATensor and BTensor.

-field Beta

Type: FLOAT

The value of the scalar multiplier for the optional input CTensor. If CTensor is not provided, then this value is ignored.

-field FusedActivation

Type: _Maybenull_ const DML_OPERATOR_DESC*

An optional fused activation layer to apply after the GEMM. For more info, see Using fused operators for improved performance.

Availability

This operator was introduced in DML_FEATURE_LEVEL_1_0.

Tensor constraints

  • ATensor, BTensor, CTensor, and OutputTensor must have the same DataType and DimensionCount.
  • CTensor and OutputTensor must have the same Sizes.

Tensor support

DML_FEATURE_LEVEL_4_0 and above

Tensor Kind Dimensions Supported dimension counts Supported data types
ATensor Input { [BatchCount], [ChannelCount], M, K } 2 to 4 FLOAT32, FLOAT16
BTensor Input { [BatchCount], [ChannelCount], K, N } 2 to 4 FLOAT32, FLOAT16
CTensor Optional input { [BatchCount], [ChannelCount], M, N } 2 to 4 FLOAT32, FLOAT16
OutputTensor Output { [BatchCount], [ChannelCount], M, N } 2 to 4 FLOAT32, FLOAT16

DML_FEATURE_LEVEL_1_0 and above

Tensor Kind Dimensions Supported dimension counts Supported data types
ATensor Input { BatchCount, ChannelCount, M, K } 4 FLOAT32, FLOAT16
BTensor Input { BatchCount, ChannelCount, K, N } 4 FLOAT32, FLOAT16
CTensor Optional input { BatchCount, ChannelCount, M, N } 4 FLOAT32, FLOAT16
OutputTensor Output { BatchCount, ChannelCount, M, N } 4 FLOAT32, FLOAT16

-see-also