ggml-org / llama.cpp Public

Fork 11.2k
Star 77.2k


Pull requests: ggml-org/llama.cpp

Labels 72 Milestones 0


406 Open 5,202 Closed


Pull requests list

convert : fix squeeze for ssm_conv tensors python

python script changes

#12573 opened Mar 25, 2025 by ggerganov


metal : refactor mat-vec code Apple Metal

https://en.wikipedia.org/wiki/Metal_(API)

ggml

changes relating to the ggml tensor library for machine learning

#12569 opened Mar 25, 2025 by ggerganov


clip: Fix llama-llava-clip-quantize-cli quantization error under CUDA backend examples

#12566 opened Mar 25, 2025 by Ivy233


Enable MMA for BF16 data types on Powerpc ggml

changes relating to the ggml tensor library for machine learning

#12565 opened Mar 25, 2025 by shalinib-ibm


vulkan: Implement grouped query attention in the coopmat2 FA shader ggml

changes relating to the ggml tensor library for machine learning

Vulkan

Issues specific to the Vulkan backend

#12559 opened Mar 25, 2025 by jeffbolznv


ggml-quants : weighted rounding algorithms with cumulative search generation quality

Quality of model output

ggml

changes relating to the ggml tensor library for machine learning

Less than 4 bits

Efforts related to viable quantized models using <4 bits

research 🔬 Review Complexity : Medium

Generally require more time to grok but manageable by beginner to medium expertise level

Tensor Encoding Scheme

https://github.com/ggerganov/llama.cpp/wiki/Tensor-Encoding-Schemes

#12557 opened Mar 25, 2025 by compilade


Add Trillion 7B model support python

python script changes

#12556 opened Mar 25, 2025 by juyoung-trl


1 of 3 tasks

Draft: vulkan: Add bfloat16 support ggml

changes relating to the ggml tensor library for machine learning

Vulkan

Issues specific to the Vulkan backend

#12554 opened Mar 24, 2025 by jeffbolznv


llama-map to support hugepage feature of pagesize 2M or 1G which can …

#12552 opened Mar 24, 2025 by nickhuang99


(draft) tts: Sesame support examples python

python script changes

#12549 opened Mar 24, 2025 by pminev • Draft

ggml : fix MUL_MAT_ID repack with Q8_K ggml

changes relating to the ggml tensor library for machine learning

#12544 opened Mar 24, 2025 by ggerganov


ggml : riscv: add 128-bit RVV support ggml

changes relating to the ggml tensor library for machine learning

#12530 opened Mar 23, 2025 by xctan


cmake: fix ccache conflict ggml

changes relating to the ggml tensor library for machine learning

#12522 opened Mar 23, 2025 by BusyJay


Vulkan: Remove dedicated aligned matrix matrix multiplication shaders ggml

changes relating to the ggml tensor library for machine learning

testing

Everything test related

Vulkan

Issues specific to the Vulkan backend

#12515 opened Mar 22, 2025 by 0cc4m • Draft

llama-tts : precompute irFFT theta examples

#12514 opened Mar 22, 2025 by marcoStocchi


perplexity: Add option to ignore context window overflow errors and continue score calculation examples

#12512 opened Mar 22, 2025 by EAddario


quantize: Handle user-defined quantization levels for additional tensors examples

#12511 opened Mar 22, 2025 by EAddario


cmake: Allow to configure GGML_BUILD_NUMBER with file ggml

changes relating to the ggml tensor library for machine learning

#12509 opened Mar 22, 2025 by booxter • Draft

llama: support Qwen3 python

python script changes

#12501 opened Mar 21, 2025 by CISC • Draft

rpc : send hash when tensor data is above some fixed threshold examples ggml

changes relating to the ggml tensor library for machine learning

#12496 opened Mar 21, 2025 by rgerganov


llamafile : ppc64le MMA implementation for Q4_0. ggml

changes relating to the ggml tensor library for machine learning

#12489 opened Mar 21, 2025 by amritahs-ibm


Evenly and stably pinning thread pool ggml

changes relating to the ggml tensor library for machine learning

#12488 opened Mar 21, 2025 by zts9989 • Draft

(draft) tts: Orpheus support ggml

changes relating to the ggml tensor library for machine learning

python

python script changes

#12487 opened Mar 21, 2025 by jamorphy • Draft

Metal TQ2_0 Apple Metal

https://en.wikipedia.org/wiki/Metal_(API)

ggml

changes relating to the ggml tensor library for machine learning

#12485 opened Mar 20, 2025 by dmahurin


Nomic Embed Text V2 with Mixture-of-Experts (MoE) architecture python

python script changes

#12466 opened Mar 19, 2025 by manyoso • Draft
