Pull requests: ggml-org/llama.cpp
sync : ggml
Labels: ggml (changes relating to the ggml tensor library for machine learning), script (Script related)
#12645 opened Mar 29, 2025 by ggerganov
llama-server : implement universal assisted decoding
Labels: examples, server
#12635 opened Mar 28, 2025 by g2mt
llama : support BailingMoE (Ling)
Labels: model (Model specific), python (python script changes)
#12634 opened Mar 28, 2025 by CISC
vulkan: Hybrid waitForFences/getFenceStatus to reduce fence latency
Labels: ggml, Vulkan (Issues specific to the Vulkan backend)
#12630 opened Mar 28, 2025 by jeffbolznv
vulkan: Implement split_k for coopmat2 flash attention.
Labels: ggml, testing (Everything test related), Vulkan
#12627 opened Mar 28, 2025 by jeffbolznv
opencl: remove a self-referential macro
Labels: ggml
#12626 opened Mar 28, 2025 by linehill
sycl: allow ggml-sycl configuration and compilation using Visual Studio project/solution
Labels: documentation (Improvements or additions to documentation), ggml, SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language)
#12625 opened Mar 28, 2025 by s-Nick
opencl: Add support for multiple devices
Labels: ggml
Add Yandex instruct model template support
Labels: testing
#12621 opened Mar 28, 2025 by vorobyov01 (1 of 3 tasks)
musa: fix all warnings, re-enable -DLLAMA_FATAL_WARNINGS=ON in ci and update doc
Labels: ggml, Nvidia GPU (Issues specific to Nvidia GPUs), devops (improvements to build systems and github actions)
#12611 opened Mar 27, 2025 by yeahdongcn (2 tasks done)
Enable MMA for BF16 data types on Powerpc
Labels: ggml
#12565 opened Mar 25, 2025 by shalinib-ibm (Draft)
vulkan: Implement grouped query attention in the coopmat2 FA shader
Labels: ggml, Vulkan
#12559 opened Mar 25, 2025 by jeffbolznv
ggml-quants : weighted rounding algorithms with cumulative search
Labels: generation quality (Quality of model output), ggml, Less than 4 bits (Efforts related to viable quantized models using <4 bits), research 🔬, Review Complexity : Medium (Generally require more time to grok but manageable by beginner to medium expertise level), Tensor Encoding Scheme (https://github.com/ggerganov/llama.cpp/wiki/Tensor-Encoding-Schemes)
#12557 opened Mar 25, 2025 by compilade
Add Trillion 7B model support
Labels: python
#12556 opened Mar 25, 2025 by juyoung-trl (1 of 3 tasks)
Draft: vulkan: Add bfloat16 support
Labels: ggml, Vulkan
#12554 opened Mar 24, 2025 by jeffbolznv
llama-map to support hugepage feature of pagesize 2M or 1G which can …
#12552 opened Mar 24, 2025 by nickhuang99
perplexity: Add option to ignore context window overflow errors and continue score calculation
Labels: examples
#12512 opened Mar 22, 2025 by EAddario
quantize: Handle user-defined quantization levels for additional tensors
Labels: examples
#12511 opened Mar 22, 2025 by EAddario
cmake: Allow to configure GGML_BUILD_NUMBER with file
Labels: ggml