Commit fd89556
authored
[SYCL] Add BF16 support to GET_ROWS operation (#21391)
Add GGML_TYPE_BF16 to the SYCL backend's GET_ROWS operation, both in
supports_op and in the kernel dispatch. This fixes a performance
regression where models using BF16 embedding tensors (e.g., Gemma4's
per_layer_token_embd.weight) fall back to CPU for the GET_ROWS op,
causing a full GPU-to-CPU tensor transfer every token.
The fix reuses the existing get_rows_sycl_float template with
sycl::ext::oneapi::bfloat16, matching the pattern already used for
sycl::half (F16) and float (F32).1 parent 6048993 commit fd89556
2 files changed
Lines changed: 5 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
183 | 183 | | |
184 | 184 | | |
185 | 185 | | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
186 | 190 | | |
187 | 191 | | |
188 | 192 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4974 | 4974 | | |
4975 | 4975 | | |
4976 | 4976 | | |
| 4977 | + | |
4977 | 4978 | | |
4978 | 4979 | | |
4979 | 4980 | | |
| |||
0 commit comments