Pull requests: ggml-org/llama.cpp
Pull requests list
server: fix "--grammar-file" parameter
examples
server
#12285
opened Mar 9, 2025 by
dodekapod
vulkan: Pad N dimension of B matrix for coopmat2 perf, to avoid bounds checking
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12273
opened Mar 8, 2025 by
jeffbolznv
vulkan: fix coopmat shader generation when cross-compiling
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12272
opened Mar 8, 2025 by
Icenowy
metal: Cache compiled library at device level
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#12265
opened Mar 8, 2025 by
BB-fat
doc: add text-based diagram of software architecture in toplevel README.md
#12263
opened Mar 8, 2025 by
zhouwg
vulkan: optimization proposals for coopmat1 mul_mm
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12260
opened Mar 7, 2025 by
remyoudompheng
Draft
vulkan: Adjust coopmat2 tile sizes and selection heuristic
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12258
opened Mar 7, 2025 by
jeffbolznv
server : Add verbose output to OAI compatible chat endpoint.
android
Issues specific to Android
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
build
Compilation issues
devops
improvements to build systems and github actions
documentation
Improvements or additions to documentation
examples
ggml
changes relating to the ggml tensor library for machine learning
Kompute
https://github.com/KomputeProject/kompute/
nix
Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
script
Script related
server
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#12246
opened Mar 7, 2025 by
mglambda
Fix rocWMMA build documentation
documentation
Improvements or additions to documentation
#12243
opened Mar 7, 2025 by
Headcrabed
Issues while enabling MMA support on AIX machines
ggml
changes relating to the ggml tensor library for machine learning
#12241
opened Mar 7, 2025 by
mehendarkarprajwal
tests: use adaptive number of threads
testing
Everything test related
#12236
opened Mar 6, 2025 by
JohannesGaessler
Optimized DeepSeek V2/V3 implementation (MLA + flash attention)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
opencl: use OpenCL C standard supported by the device
ggml
changes relating to the ggml tensor library for machine learning
#12221
opened Mar 6, 2025 by
linehill
feat(CMakeLists): Add MSVC-specific compiler warning flags in CMake configuration
ggml
changes relating to the ggml tensor library for machine learning
#12206
opened Mar 5, 2025 by
25077667
SYCL: Rename oneMKL to oneMath
documentation
Improvements or additions to documentation
examples
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#12192
opened Mar 5, 2025 by
Rbiessy
libfuse3 supported mounting split gguf's to a single in-memory file
examples
#12189
opened Mar 5, 2025 by
matbee-eth
Draft
vulkan: double buffer scale caches
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12188
opened Mar 4, 2025 by
netrunnereve
fix: AVX2 intrinsics, const correctness, and SIMD headers
build
Compilation issues
ggml
changes relating to the ggml tensor library for machine learning
#12186
opened Mar 4, 2025 by
sandboxyer
CUDA: Improve flash decoding kernel GPU occupancy for BS=1 case
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#12183
opened Mar 4, 2025 by
gaugarg-nv
1 of 3 tasks