
[BesTLA] Support fp16 for compute_dtype and scale_dtype #292

Merged

19 commits merged into main on Jun 21, 2024

Conversation

@luoyu-intel (Contributor) commented on Jun 13, 2024

Type of Change

  1. support scale_dtype=fp16
  2. support compute_dtype=fp16, available only on GNR (Granite Rapids) with AMX_FP16 (see the detection sketch after this list)
  3. split kernels into multiple ISA files
  4. move the intrinsic code from MHA into the BesTLA kernels
  5. change the default scale_dtype to fp16 and the default compute_dtype to int8

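Since compute_dtype=fp16 requires the AMX_FP16 extension, here is a minimal sketch of how a caller might gate that choice at runtime. The ComputeDtype enum and select_compute_dtype() helper are illustrative only and are not part of the BesTLA API; the CPUID bit used (leaf 7, subleaf 1, EAX bit 21) is the documented AMX-FP16 feature flag.

```cpp
// Hypothetical sketch: pick fp16 compute only when the CPU reports AMX-FP16,
// otherwise fall back to the new default compute_dtype of int8.
// These names are illustrative, not BesTLA interfaces.
#include <cpuid.h>   // GCC/Clang helper for the CPUID instruction
#include <cstdio>

enum class ComputeDtype { FP16, INT8 };

static bool cpu_has_amx_fp16() {
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
  // CPUID.(EAX=7, ECX=1):EAX bit 21 reports AMX-FP16 support.
  if (!__get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx)) return false;
  // A production check would also confirm the OS has enabled AMX tile state.
  return (eax >> 21) & 1u;
}

static ComputeDtype select_compute_dtype() {
  return cpu_has_amx_fp16() ? ComputeDtype::FP16 : ComputeDtype::INT8;
}

int main() {
  std::printf("compute_dtype = %s\n",
              select_compute_dtype() == ComputeDtype::FP16 ? "fp16" : "int8");
  return 0;
}
```

scale_dtype=fp16, in contrast, only changes how the quantization scales are stored, so it presumably does not depend on AMX_FP16, which would explain why it can become the default regardless of ISA.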
bestla/bestla/bestla_prologue_b.h — 3 code scanning alerts fixed
@zhewang1-intc (Contributor) left a comment:

What does "move intrinsic code from mha to bestla kernel" mean? I didn't notice any MHA-related UT (unit test).

@zhewang1-intc (Contributor)

Will this PR support GEMV with a bf16 activation?

@luoyu-intel (Contributor, Author)

What does "move intrinsic code from mha to bestla kernel" mean? I didn't notice any MHA-related UT (unit test).

It moves that intrinsic code under the unified ISA control.
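To illustrate what unified ISA control means in practice, here is a hypothetical sketch: ISA-specific kernels sit behind a single dispatcher, so callers such as the MHA path no longer open-code intrinsics. All names below (ISA, scale_*, detect_isa) are made up for illustration and are not the actual BesTLA dispatch machinery.

```cpp
// Hypothetical sketch of unified ISA dispatch: each variant would live in its
// own ISA-specific source file (as this PR splits kernels into multiple ISA
// files); here they are plain stubs so the example compiles anywhere.
#include <cstdio>

enum class ISA { AVX2, AVX512F, AMX_FP16 };

static void scale_avx2(float* x, int n, float s)     { for (int i = 0; i < n; ++i) x[i] *= s; }
static void scale_avx512f(float* x, int n, float s)  { for (int i = 0; i < n; ++i) x[i] *= s; }
static void scale_amx_fp16(float* x, int n, float s) { for (int i = 0; i < n; ++i) x[i] *= s; }

static ISA detect_isa() { return ISA::AVX2; }  // placeholder; real code would query CPUID

// Callers go through this single entry point, so adding an AMX_FP16 variant
// only touches the kernel layer, not every call site.
static void scale(float* x, int n, float s) {
  switch (detect_isa()) {
    case ISA::AMX_FP16: scale_amx_fp16(x, n, s); break;
    case ISA::AVX512F:  scale_avx512f(x, n, s);  break;
    default:            scale_avx2(x, n, s);     break;
  }
}

int main() {
  float v[4] = {1.0f, 2.0f, 3.0f, 4.0f};
  scale(v, 4, 0.5f);
  std::printf("%g %g %g %g\n", v[0], v[1], v[2], v[3]);
  return 0;
}
```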

@luoyu-intel (Contributor, Author)

Will this PR support GEMV with a bf16 activation?

Not in this PR.

@luoyu-intel merged commit 9652017 into main on Jun 21, 2024
15 of 16 checks passed
3 participants