vulkan: small mul_mat_vec optimizations #10665
Merged
44 commits
b7ad234 dot and delta optimization (netrunnereve)
be2d004 server : fix default draft model parameters (#10586) (ggerganov)
ed8649f github : minify link [no ci] (ggerganov)
ca7c213 github : minify link [no ci] (revert) (ggerganov)
d6753d7 metal : small-batch mat-mul kernels (#10581) (ggerganov)
d37b7e0 readme : add option, update default value, fix formatting (#10271) (pothitos)
f697baf llama : add missing LLAMA_API for llama_chat_builtin_templates (#10636) (ngxson)
e92a46b metal : add `GGML_OP_CONV_TRANSPOSE_1D` kernels (ggml/1026) (PABannier)
2b15590 feat: add `GGML_UNARY_OP_ARGMAX` Metal kernel (ggml/1019) (PABannier)
0df0452 CUDA: remove unnecessary warp reduce in FA (ggml/1032) (mahorozte)
69c7f20 sync : ggml (ggerganov)
f8fe71a scripts : remove amx sync (ggerganov)
70f0346 server : (web ui) Various improvements, now use vite as bundler (#10599) (ngxson)
fa9abd6 vulkan: optimize and reenable split_k (#10637) (jeffbolznv)
0fa9dc4 clip : add sycl support (#10574) (piDack)
0a81a82 Add docs for creating a static build (#10268) (#10630) (mostlygeek)
9075271 Avoid using __fp16 on ARM with old nvcc (#10616) (frankier)
4153d57 fix typo of README.md (#10605) (WrRan)
e147054 SYCL : Move to compile time oneMKL interface backend selection for NV… (s-Nick)
062f256 remove a multiply (netrunnereve)
fe81134 merge (netrunnereve)
c403d89 Merge https://github.com/ggerganov/llama.cpp into vulkan (netrunnereve)
5fbaf12 remove a multiply (netrunnereve)
2f56bac additional small optimizations (netrunnereve)
591894a Merge https://github.com/ggerganov/llama.cpp into vulkan (netrunnereve)
0b1b7c8 Merge branch 'ggerganov:master' into vulkan (netrunnereve)
4eefebc Merge branch 'vulkan' of https://github.com/netrunnereve/llama.cpp in… (netrunnereve)
32b994e Merge branch 'ggerganov:master' into vulkan (netrunnereve)
4b65c6b Merge branch 'vulkan' of https://github.com/netrunnereve/llama.cpp in… (netrunnereve)
f5a15fc remove ifdefs (netrunnereve)
bd17bc4 cleanup (netrunnereve)
4a185ad double the number of rows per workgroup (netrunnereve)
984d470 Update ggml-vulkan.cpp (netrunnereve)
595c1a7 Vulkan: Add VK_EXT_subgroup_size_control support to ensure full subgr… (0cc4m)
6de2866 only increase the number of rows for amd and subgroup size 64 (netrunnereve)
bfecabe merge (netrunnereve)
1c16367 fix missing NUM_ROWS for mul_mat_vec_iq4_nl_f16_f32, untested (netrunnereve)
8972f1d Merge branch '0cc4m/vulkan-subgroup-size-control' of https://github.c… (netrunnereve)
c7bc42c Merge https://github.com/ggerganov/llama.cpp into vulkan (netrunnereve)
9af9e80 use subgroup min and max to check for gcn (requires https://github.co… (netrunnereve)
d9c6bf1 manual merge ggml-vulkan.cpp (netrunnereve)
8b13f2d fix conflict (netrunnereve)
1aa26d7 set min and max subgroup size in any case (netrunnereve)
20b47d4 Also double the number of rows for Intel GPUs (0cc4m)