
[BesTLA] Support fp16 for compute_dtype and scale_dtype #292

Merged

19 commits merged into main on Jun 21, 2024

Conversation

@luoyu-intel (Contributor) commented on Jun 13, 2024

Type of Change

  1. support scale_dtype=fp16
  2. support compute_dtype=fp16, available only on GNR (Granite Rapids) with AMX_FP16 (see the detection sketch after this list)
  3. split kernels into multiple ISA files
  4. move the intrinsic code from MHA into the BesTLA kernels
  5. change the default scale_dtype to fp16 and the default compute_dtype to int8

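Since compute_dtype=fp16 requires the AMX_FP16 extension, here is a minimal sketch of how a caller might gate that choice at runtime. The ComputeDtype enum and select_compute_dtype() helper are illustrative only and are not part of the BesTLA API; the CPUID bit used (leaf 7, subleaf 1, EAX bit 21) is the documented AMX-FP16 feature flag.

```cpp
// Hypothetical sketch: pick fp16 compute only when the CPU reports AMX-FP16,
// otherwise fall back to the new default compute_dtype of int8.
// These names are illustrative, not BesTLA interfaces.
#include <cpuid.h>   // GCC/Clang helper for the CPUID instruction
#include <cstdio>

enum class ComputeDtype { FP16, INT8 };

static bool cpu_has_amx_fp16() {
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
  // CPUID.(EAX=7, ECX=1):EAX bit 21 reports AMX-FP16 support.
  if (!__get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx)) return false;
  // A production check would also confirm the OS has enabled AMX tile state.
  return (eax >> 21) & 1u;
}

static ComputeDtype select_compute_dtype() {
  return cpu_has_amx_fp16() ? ComputeDtype::FP16 : ComputeDtype::INT8;
}

int main() {
  std::printf("compute_dtype = %s\n",
              select_compute_dtype() == ComputeDtype::FP16 ? "fp16" : "int8");
  return 0;
}
```

scale_dtype=fp16, in contrast, only changes how the quantization scales are stored, so it presumably does not depend on AMX_FP16, which would explain why it can become the default regardless of ISA.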
bestla/bestla/bestla_prologue_b.h — 3 code scanning alerts fixed
@zhewang1-intc (Contributor) left a comment:

What does "move intrinsic code from mha to bestla kernel" mean? I didn't notice any MHA-related UT (unit test).

@zhewang1-intc (Contributor)

Will this PR support GEMV with a bf16 activation?

@luoyu-intel (Contributor, Author)

What does "move intrinsic code from mha to bestla kernel" mean? I didn't notice any MHA-related UT (unit test).

It moves that intrinsic code under the unified ISA control.
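To illustrate what unified ISA control means in practice, here is a hypothetical sketch: ISA-specific kernels sit behind a single dispatcher, so callers such as the MHA path no longer open-code intrinsics. All names below (ISA, scale_*, detect_isa) are made up for illustration and are not the actual BesTLA dispatch machinery.

```cpp
// Hypothetical sketch of unified ISA dispatch: each variant would live in its
// own ISA-specific source file (as this PR splits kernels into multiple ISA
// files); here they are plain stubs so the example compiles anywhere.
#include <cstdio>

enum class ISA { AVX2, AVX512F, AMX_FP16 };

static void scale_avx2(float* x, int n, float s)     { for (int i = 0; i < n; ++i) x[i] *= s; }
static void scale_avx512f(float* x, int n, float s)  { for (int i = 0; i < n; ++i) x[i] *= s; }
static void scale_amx_fp16(float* x, int n, float s) { for (int i = 0; i < n; ++i) x[i] *= s; }

static ISA detect_isa() { return ISA::AVX2; }  // placeholder; real code would query CPUID

// Callers go through this single entry point, so adding an AMX_FP16 variant
// only touches the kernel layer, not every call site.
static void scale(float* x, int n, float s) {
  switch (detect_isa()) {
    case ISA::AMX_FP16: scale_amx_fp16(x, n, s); break;
    case ISA::AVX512F:  scale_avx512f(x, n, s);  break;
    default:            scale_avx2(x, n, s);     break;
  }
}

int main() {
  float v[4] = {1.0f, 2.0f, 3.0f, 4.0f};
  scale(v, 4, 0.5f);
  std::printf("%g %g %g %g\n", v[0], v[1], v[2], v[3]);
  return 0;
}
```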

@luoyu-intel (Contributor, Author)

Will this PR support GEMV with a bf16 activation?

Not in this PR.

@luoyu-intel merged commit 9652017 into main on Jun 21, 2024
15 of 16 checks passed
3 participants