ggml-org / llama.cpp Public

Pull requests: ggml-org/llama.cpp

401 Open 5,264 Closed

Pull requests list

gguf-split now respects dry-run option examples

#12681 opened Mar 31, 2025 by nickhuang99

WIP: Add support for CogAgent examples python

python script changes

server

#12679 opened Mar 31, 2025 by Tianyue-Zhao • Draft

vocab : BailingMoE : change possessive quantifiers to greedy

#12677 opened Mar 31, 2025 by CISC

[CANN]get_rows and dup optimization Ascend NPU

issues specific to Ascend NPUs

ggml

changes relating to the ggml tensor library for machine learning

#12671 opened Mar 31, 2025 by noemotiovon

update rope_multi: ggml

changes relating to the ggml tensor library for machine learning

#12665 opened Mar 31, 2025 by foldl

contrib: support modelscope community examples

#12664 opened Mar 31, 2025 by tastelikefeet

llama : nit, DeepSeek V1 MoE is 16B and GigaChat is 20B

#12652 opened Mar 30, 2025 by CISC

opencl : fix memory allocation size ggml

changes relating to the ggml tensor library for machine learning

#12649 opened Mar 30, 2025 by sparkleholic

tts : implement sesame CSM + Mimi decoder examples python

python script changes

#12648 opened Mar 29, 2025 by ngxson

llama-server : implement universal assisted decoding examples server

#12635 opened Mar 28, 2025 by g2mt

vulkan: Hybrid waitForFences/getFenceStatus to reduce fence latency ggml

changes relating to the ggml tensor library for machine learning

Vulkan

Issues specific to the Vulkan backend

#12630 opened Mar 28, 2025 by jeffbolznv

vulkan: Implement split_k for coopmat2 flash attention. ggml

changes relating to the ggml tensor library for machine learning

testing

Everything test related

Vulkan

Issues specific to the Vulkan backend

#12627 opened Mar 28, 2025 by jeffbolznv

opencl: remove a self-referential macro ggml

changes relating to the ggml tensor library for machine learning

#12626 opened Mar 28, 2025 by linehill

sycl: allow ggml-sycl configuration and compilation using Visual Studio project/solution documentation

Improvements or additions to documentation

ggml

changes relating to the ggml tensor library for machine learning

SYCL

https://en.wikipedia.org/wiki/SYCL - GPU programming language

#12625 opened Mar 28, 2025 by s-Nick

opencl: Add support for multiple devices ggml

changes relating to the ggml tensor library for machine learning

#12622 opened Mar 28, 2025 by linehill • Draft

Enable MMA for BF16 data types on Powerpc ggml

changes relating to the ggml tensor library for machine learning

#12565 opened Mar 25, 2025 by shalinib-ibm • Draft

vulkan: Implement grouped query attention in the coopmat2 FA shader ggml

changes relating to the ggml tensor library for machine learning

Vulkan

Issues specific to the Vulkan backend

#12559 opened Mar 25, 2025 by jeffbolznv

ggml-quants : weighted rounding algorithms with cumulative search generation quality

Quality of model output

ggml

changes relating to the ggml tensor library for machine learning

Less than 4 bits

Efforts related to viable quantized models using <4 bits

research 🔬 Review Complexity : Medium

Generally require more time to grok but manageable by beginner to medium expertise level

Tensor Encoding Scheme

https://github.com/ggerganov/llama.cpp/wiki/Tensor-Encoding-Schemes

#12557 opened Mar 25, 2025 by compilade • Draft

Draft: vulkan: Add bfloat16 support ggml

changes relating to the ggml tensor library for machine learning

Vulkan

Issues specific to the Vulkan backend

#12554 opened Mar 24, 2025 by jeffbolznv

llama-map to support hugepage feature of pagesize 2M or 1G which can …

#12552 opened Mar 24, 2025 by nickhuang99

Vulkan: Remove dedicated aligned matrix matrix multiplication shaders ggml

changes relating to the ggml tensor library for machine learning

testing

Everything test related

Vulkan

Issues specific to the Vulkan backend

#12515 opened Mar 22, 2025 by 0cc4m • Draft

llama-tts : precompute irFFT theta examples

#12514 opened Mar 22, 2025 by marcoStocchi

perplexity: Add option to ignore context window overflow errors and continue score calculation examples

#12512 opened Mar 22, 2025 by EAddario

quantize: Handle user-defined quantization levels for additional tensors examples

#12511 opened Mar 22, 2025 by EAddario

cmake: Allow to configure GGML_BUILD_NUMBER with file ggml

changes relating to the ggml tensor library for machine learning

#12509 opened Mar 22, 2025 by booxter • Draft
