Add detection for Intel Advanced Matrix Extensions (AMX) instructions #231

Merged
merged 1 commit into from
Mar 28, 2024
56 changes: 56 additions & 0 deletions include/cpuinfo.h
@@ -812,6 +812,10 @@ struct cpuinfo_x86_isa {
	bool avx512vp2intersect;
	bool avx512_4vnniw;
	bool avx512_4fmaps;
	bool amx_bf16;
	bool amx_tile;
Contributor

Remove amx_tile?
Is tile useful? All CPUs that support amx_bf16 or amx_int8 will support amx_tile, and amx_tile by itself is not useful?

I would suggest we keep these low-level flags mapped exactly to the underlying CPU ISA feature bits. We can probably add helper functions such as has_amx_support at a higher level for ease of use.

Contributor Author

It is true that the existing platforms that support amx_bf16 or amx_int8 also support amx_tile. However, I prefer to leave a low-level flag here in case of future changes.
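
For illustration, the higher-level helper suggested in this thread could look roughly like the sketch below. It is not part of this PR: `struct amx_flags` is a hypothetical stand-in for the four AMX fields added to `struct cpuinfo_x86_isa`, and `has_amx_support` is only the name floated in the review comment, with semantics assumed here (tile support plus at least one TMUL data type).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the four AMX fields of struct cpuinfo_x86_isa. */
struct amx_flags {
	bool amx_tile;
	bool amx_bf16;
	bool amx_int8;
	bool amx_fp16;
};

/*
 * Hypothetical helper (assumed semantics): AMX is usable when tiles are
 * available together with at least one TMUL data type.
 */
static inline bool has_amx_support(const struct amx_flags* flags) {
	return flags->amx_tile && (flags->amx_bf16 || flags->amx_int8 || flags->amx_fp16);
}
```

Keeping the per-bit flags underneath means a helper like this can change its policy later without touching the detection code.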

	bool amx_int8;
	bool amx_fp16;
	bool hle;
	bool rtm;
	bool xtest;
@@ -1328,6 +1332,58 @@ static inline bool cpuinfo_has_x86_avx512_4fmaps(void) {
#endif
}

/* [NOTE] Intel Advanced Matrix Extensions (AMX) detection
 *
 * I. AMX is an extension to the x86 ISA for operating on matrices. It consists of:
 *     1) 2-dimensional registers (tiles), which hold sub-matrices of larger matrices in memory
 *     2) an accelerator called Tile Matrix Multiply (TMUL), which provides instructions operating on tiles
 *
 * II. Platforms that support AMX:
 * +-----------------+-----+----------+----------+----------+----------+
 * | Platforms       | Gen | amx-bf16 | amx-tile | amx-int8 | amx-fp16 |
 * +-----------------+-----+----------+----------+----------+----------+
 * | Sapphire Rapids | 4th | YES      | YES      | YES      | NO       |
 * +-----------------+-----+----------+----------+----------+----------+
 * | Emerald Rapids  | 5th | YES      | YES      | YES      | NO       |
 * +-----------------+-----+----------+----------+----------+----------+
 * | Granite Rapids  | 6th | YES      | YES      | YES      | YES      |
 * +-----------------+-----+----------+----------+----------+----------+
 *
 * Reference: https://www.intel.com/content/www/us/en/products/docs
 * /accelerator-engines/advanced-matrix-extensions/overview.html
 */
static inline bool cpuinfo_has_x86_amx_bf16(void) {
#if CPUINFO_ARCH_X86 || CPUINFO_ARCH_X86_64
	return cpuinfo_isa.amx_bf16;
#else
	return false;
#endif
}

static inline bool cpuinfo_has_x86_amx_tile(void) {
#if CPUINFO_ARCH_X86 || CPUINFO_ARCH_X86_64
	return cpuinfo_isa.amx_tile;
#else
	return false;
#endif
}

static inline bool cpuinfo_has_x86_amx_int8(void) {
#if CPUINFO_ARCH_X86 || CPUINFO_ARCH_X86_64
	return cpuinfo_isa.amx_int8;
#else
	return false;
#endif
}

static inline bool cpuinfo_has_x86_amx_fp16(void) {
#if CPUINFO_ARCH_X86 || CPUINFO_ARCH_X86_64
	return cpuinfo_isa.amx_fp16;
#else
	return false;
#endif
}

static inline bool cpuinfo_has_x86_hle(void) {
#if CPUINFO_ARCH_X86 || CPUINFO_ARCH_X86_64
	return cpuinfo_isa.hle;
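
As a rough sketch of how a caller might consume the four accessors added in include/cpuinfo.h, the function below picks a code path per data type. Everything in it is hypothetical and not part of this PR: the `gemm_dtype` and `gemm_path` enums and `select_gemm_path` are illustration-only names, and the booleans are passed in as parameters so the sketch stays self-contained instead of linking against cpuinfo.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; none of these exist in cpuinfo. */
enum gemm_dtype { GEMM_DTYPE_INT8, GEMM_DTYPE_BF16, GEMM_DTYPE_FP16 };
enum gemm_path { GEMM_PATH_GENERIC, GEMM_PATH_AMX };

/*
 * In real code the four booleans would come from cpuinfo_has_x86_amx_tile(),
 * cpuinfo_has_x86_amx_int8(), cpuinfo_has_x86_amx_bf16() and
 * cpuinfo_has_x86_amx_fp16(), called after cpuinfo_initialize() succeeds.
 */
static enum gemm_path select_gemm_path(
	enum gemm_dtype dtype, bool tile, bool int8, bool bf16, bool fp16) {
	/* Without tile registers no TMUL instruction can be used. */
	if (!tile) {
		return GEMM_PATH_GENERIC;
	}
	switch (dtype) {
		case GEMM_DTYPE_INT8:
			return int8 ? GEMM_PATH_AMX : GEMM_PATH_GENERIC;
		case GEMM_DTYPE_BF16:
			return bf16 ? GEMM_PATH_AMX : GEMM_PATH_GENERIC;
		case GEMM_DTYPE_FP16:
			return fp16 ? GEMM_PATH_AMX : GEMM_PATH_GENERIC;
	}
	return GEMM_PATH_GENERIC;
}
```

With the flag values listed for Sapphire Rapids in the note (tile, int8, and bf16 set, fp16 clear), this selects the AMX path for INT8 and BF16 and falls back to the generic path for FP16.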
24 changes: 24 additions & 0 deletions src/x86/isa.c
@@ -537,6 +537,30 @@ struct cpuinfo_x86_isa cpuinfo_x86_detect_isa(
	 */
	isa.avx512bf16 = avx512_regs && !!(structured_feature_info1.eax & UINT32_C(0x00000020));

	/*
	 * AMX_BF16 instructions:
	 * - Intel: edx[bit 22] in structured feature info (ecx = 0).
	 */
	isa.amx_bf16 = avx512_regs && !!(structured_feature_info0.edx & UINT32_C(0x00400000));

	/*
	 * AMX_TILE instructions:
	 * - Intel: edx[bit 24] in structured feature info (ecx = 0).
	 */
	isa.amx_tile = avx512_regs && !!(structured_feature_info0.edx & UINT32_C(0x01000000));
Contributor

can you confirm this works on gnr256 with avx10 but not avx512?

Contributor Author

@mingfeima mingfeima Mar 29, 2024

Are you referring to this:

Granite Rapids
        AMXBF16: yes
        AMXTILE: yes
        AMXINT8: yes
        AMXFP16: yes
Granite Rapids (AVX10.1 / 256VL)
        AMXBF16: yes
        AMXTILE: yes
        AMXINT8: yes
        AMXFP16: yes

Results collected with the Intel Software Development Emulator (SDE).

Quote from https://www.tomshardware.com/news/intels-new-avx10-brings-avx-512-capabilities-to-e-cores

Intel will support AVX10 version 1 (AVX10.1) beginning with its sixth-gen Xeon "Granite Rapids" chips, but that generation will only support 512-bit vector instructions, and not the new converged 256-bit vector instructions. Instead, this first gen will serve as the transition chip from AVX-512 to AVX10.


	/*
	 * AMX_INT8 instructions:
	 * - Intel: edx[bit 25] in structured feature info (ecx = 0).
	 */
	isa.amx_int8 = avx512_regs && !!(structured_feature_info0.edx & UINT32_C(0x02000000));

	/*
	 * AMX_FP16 instructions:
	 * - Intel: eax[bit 21] in structured feature info (ecx = 1).
	 */
	isa.amx_fp16 = avx512_regs && !!(structured_feature_info1.eax & UINT32_C(0x00200000));

	/*
	 * HLE instructions:
	 * - Intel: ebx[bit 4] in structured feature info (ecx = 0).
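
The bit tests in src/x86/isa.c can be read in isolation as the sketch below, which decodes the same four bits from raw register values. It assumes the caller has already executed CPUID leaf 7 with ecx = 0 and ecx = 1 (the "structured feature info" in the comments above) and passes in the resulting edx and eax. `struct amx_bits` and `decode_amx_bits` are hypothetical names, and `regs_enabled` only mirrors the `avx512_regs` gate used in the diff; it is not a statement about which OS-enabled state AMX actually requires.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type; the PR stores these in struct cpuinfo_x86_isa. */
struct amx_bits {
	bool bf16;
	bool tile;
	bool int8;
	bool fp16;
};

/*
 * leaf7_0_edx: edx from structured feature info with ecx = 0.
 * leaf7_1_eax: eax from structured feature info with ecx = 1.
 * regs_enabled: stand-in for the avx512_regs gate used in the diff.
 */
static struct amx_bits decode_amx_bits(uint32_t leaf7_0_edx, uint32_t leaf7_1_eax, bool regs_enabled) {
	struct amx_bits bits;
	bits.bf16 = regs_enabled && !!(leaf7_0_edx & (UINT32_C(1) << 22)); /* 0x00400000 */
	bits.tile = regs_enabled && !!(leaf7_0_edx & (UINT32_C(1) << 24)); /* 0x01000000 */
	bits.int8 = regs_enabled && !!(leaf7_0_edx & (UINT32_C(1) << 25)); /* 0x02000000 */
	bits.fp16 = regs_enabled && !!(leaf7_1_eax & (UINT32_C(1) << 21)); /* 0x00200000 */
	return bits;
}
```

Writing the masks as shifts makes it easy to check each one against the bit number quoted in the corresponding comment.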
4 changes: 4 additions & 0 deletions tools/isa-info.c
@@ -70,6 +70,10 @@ int main(int argc, char** argv) {
	printf("\tAVX512VP2INTERSECT: %s\n", cpuinfo_has_x86_avx512vp2intersect() ? "yes" : "no");
	printf("\tAVX512_4VNNIW: %s\n", cpuinfo_has_x86_avx512_4vnniw() ? "yes" : "no");
	printf("\tAVX512_4FMAPS: %s\n", cpuinfo_has_x86_avx512_4fmaps() ? "yes" : "no");
	printf("\tAMX_BF16: %s\n", cpuinfo_has_x86_amx_bf16() ? "yes" : "no");
	printf("\tAMX_TILE: %s\n", cpuinfo_has_x86_amx_tile() ? "yes" : "no");
	printf("\tAMX_INT8: %s\n", cpuinfo_has_x86_amx_int8() ? "yes" : "no");
	printf("\tAMX_FP16: %s\n", cpuinfo_has_x86_amx_fp16() ? "yes" : "no");
	printf("\tAVXVNNI: %s\n", cpuinfo_has_x86_avxvnni() ? "yes" : "no");

	printf("Multi-threading extensions:\n");