Supporting Intel AMX instructions in quantized GEMM #14042

chenfucn · 2022-12-21T18:37:35Z

Description

Use Intel AMX int8 instructions to accelerate quantized GEMM.
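For readers unfamiliar with the operation, here is a minimal scalar sketch of what a u8s8 quantized GEMM computes (unsigned 8-bit A times signed 8-bit B, accumulated in 32-bit integers). The function name `ReferenceGemmU8S8` and the row-major layout are assumptions made for illustration; this is not the MLAS kernel, and it leaves out zero points, B prepacking, and the tiling that an AMX kernel would rely on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar reference: C (M x N, int32) = A (M x K, uint8) * B (K x N, int8).
// All matrices are row-major. Zero points, packing, and tiling are omitted.
inline std::vector<std::int32_t> ReferenceGemmU8S8(
    const std::vector<std::uint8_t>& a,
    const std::vector<std::int8_t>& b,
    std::size_t m, std::size_t n, std::size_t k) {
    assert(a.size() == m * k);
    assert(b.size() == k * n);

    std::vector<std::int32_t> c(m * n, 0);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            std::int32_t acc = 0;
            for (std::size_t p = 0; p < k; ++p) {
                // Widen both operands to 32 bits before multiplying.
                acc += static_cast<std::int32_t>(a[i * k + p]) *
                       static_cast<std::int32_t>(b[p * n + j]);
            }
            c[i * n + j] = acc;
        }
    }
    return c;
}
```

A reference like this is mainly useful for checking an accelerated kernel's output on small inputs.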

Motivation and Context

AMX instructions accelerate quantized GEMM significantly. In the measurements below, latency is roughly 2.2x to 4.2x lower than with the AVX512-VNNI kernel:

Prepacked B perf numbers (latency in ns)

GEMM Config	AVX512Vnni	AMX
M:384/N:1024/K:1024/Batch:1/Threads:4	1057511	285393
M:384/N:1024/K:3072/Batch:1/Threads:4	2643929	700397
M:384/N:1024/K:4096/Batch:1/Threads:4	3784750	890701
M:384/N:4096/K:1024/Batch:1/Threads:4	2378139	887251
M:384/N:1024/K:1024/Batch:1/Threads:16	307137	138481
M:384/N:1024/K:3072/Batch:1/Threads:16	855730	295027
M:384/N:1024/K:4096/Batch:1/Threads:16	1126878	317395
M:384/N:4096/K:1024/Batch:1/Threads:16	781963	237014
M:1536/N:1024/K:1024/Batch:1/Threads:16	538864	181459
M:1536/N:1024/K:3072/Batch:1/Threads:16	1681002	561600
M:1536/N:1024/K:4096/Batch:1/Threads:16	2158127	717470
M:1536/N:4096/K:1024/Batch:1/Threads:16	2428622	896140
M:3072/N:1024/K:1024/Batch:1/Threads:16	1058029	357031
M:3072/N:1024/K:3072/Batch:1/Threads:16	3138504	1095857
M:3072/N:1024/K:4096/Batch:1/Threads:16	4155640	1386183
M:3072/N:4096/K:1024/Batch:1/Threads:16	4679030	1778624

snnn · 2022-12-22T02:16:11Z

onnxruntime/core/mlas/lib/qgemm_kernel_amx.cpp

+};
+
+constexpr size_t MLAS_GEMM_U8S8_KERNEL_AMX::PackedK;
+constexpr MLAS_GEMM_QUANT_STRIDES MLAS_GEMM_U8S8_KERNEL_AMX::Strides;


You may try this: https://www.codingame.com/playgrounds/2205/7-features-of-c17-that-will-simplify-your-code/inline-variables

Hmm, this pattern appears a lot in older MLAS code. Let me open a separate PR for it.
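The suggestion above can be sketched as follows, assuming the code is built as C++17 or later. `ExampleKernel` and `ExampleStrides` are hypothetical stand-ins with made-up values, not the MLAS types. Since C++17, a `static constexpr` data member is implicitly inline, so the in-class declaration is also the definition and the out-of-class `constexpr` redeclarations shown in the diff become redundant.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a strides struct; not the MLAS definition.
struct ExampleStrides {
    std::size_t M;
    std::size_t N;
    std::size_t K;
};

// Hypothetical kernel traits type; the values are illustrative only.
struct ExampleKernel {
    // Since C++17, static constexpr data members are implicitly inline,
    // so these in-class declarations are also the definitions.
    static constexpr std::size_t PackedK = 4;
    static constexpr ExampleStrides Strides{32, 512, 2048};
};

// Before C++17, odr-using these members required out-of-class definitions:
//   constexpr std::size_t ExampleKernel::PackedK;
//   constexpr ExampleStrides ExampleKernel::Strides;
// C++17 still accepts such redeclarations but deprecates them.

// std::min takes its arguments by const reference, which odr-uses PackedK.
inline std::size_t ClampToPackedK(std::size_t k) {
    return std::min(k, ExampleKernel::PackedK);
}

// Binding Strides to a const reference odr-uses it as well.
inline std::size_t StrideK() {
    const ExampleStrides& s = ExampleKernel::Strides;
    return s.K;
}

// Small self-check that exercises both helpers.
inline void SelfCheck() {
    assert(ClampToPackedK(10) == 4);
    assert(StrideK() == 2048);
}
```

Under C++14 or earlier, the same odr-uses without the out-of-class definitions can fail at link time, which is presumably why the older code carries them.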

yihonglyu and others added 2 commits December 19, 2022 15:02

QGEMM u8s8 AMX kernel

28be17d

compile benchmark

b987f74

chenfucn requested a review from a team as a code owner December 21, 2022 18:37

jchen351 previously approved these changes Dec 21, 2022

snnn reviewed Dec 22, 2022

chenfucn dismissed jchen351’s stale review via 2d87e51 December 22, 2022 17:42

avoid uninitialized var warning

20d013f

chenfucn force-pushed the cfu_amx_merge branch from 90e2141 to 20d013f Compare December 22, 2022 21:20

chenfucn added 2 commits December 28, 2022 13:12

avoid building amx on older compilers

18994b3

amx only on gcc 11

73bf259

jchen351 previously approved these changes Dec 29, 2022

yufenglee reviewed Jan 9, 2023

cmake/onnxruntime_mlas.cmake

yufenglee reviewed Jan 9, 2023

onnxruntime/core/mlas/lib/qgemm_kernel_amx.cpp (outdated)

comments

ecdc008

chenfucn dismissed jchen351’s stale review via ecdc008 January 9, 2023 22:24

yufenglee reviewed Jan 10, 2023

onnxruntime/core/mlas/lib/amd64/QgemmU8S8KernelAmx.asm

yufenglee approved these changes Jan 10, 2023

chenfucn merged commit 9014289 into microsoft:main Jan 10, 2023

chenfucn mentioned this pull request Feb 3, 2023

Support for AMX instruction #13586

Closed

jchia mentioned this pull request Mar 15, 2024

[Feature Request] Support AMX BF16 #19937

Open

marenz2569 mentioned this pull request Nov 24, 2024

AMX Extension tud-zih-energy/FIRESTARTER#93

Draft
