Pull requests: ggml-org/llama.cpp
Pull requests list
CUDA: add bilinear interpolation for upscale
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#14563
opened Jul 7, 2025 by
am17an
SYCL: Initial set_rows kernel implementation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#14562
opened Jul 7, 2025 by
qnixsynapse
musa: fix build warnings (unused variable)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#14561
opened Jul 7, 2025 by
yeahdongcn
Add PLaMo-2 model
examples
python
python script changes
#14560
opened Jul 7, 2025 by
mitmul
vulkan: optimizations for deepseek prompt processing
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#14555
opened Jul 6, 2025 by
jeffbolznv
vulkan: optimize flash attention split_k_reduce
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#14554
opened Jul 6, 2025 by
jeffbolznv
CUDA: add set rows for f32 and f16
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#14551
opened Jul 6, 2025 by
am17an
opencl: add set_rows for f16 and f32
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
#14547
opened Jul 6, 2025 by
lhez
server: Add ability to mount server at prefix
examples
server
#14544
opened Jul 5, 2025 by
oluwandabira
OpenCL: add tiled mul_mat_f16_f32
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
#14535
opened Jul 4, 2025 by
rmatif
llama: add initial support for Falcon-H1 model family
python
python script changes
#14534
opened Jul 4, 2025 by
ibrahimkhadraoui
common: detect and prefer big cores on AArch64 hybrid CPU on linux
#14532
opened Jul 4, 2025 by
kiwi142857
ggml: fix typo in ggml.c
ggml
changes relating to the ggml tensor library for machine learning
#14531
opened Jul 4, 2025 by
zhouwg
webui : add a preset feature to the settings
examples
server
#14523
opened Jul 3, 2025 by
gabriellarson
train: add simple loading already tokenized data from parquet dataset
build
Compilation issues
examples
#14522
opened Jul 3, 2025 by
lexasub
ggml: Add initial WebGPU backend
devops
improvements to build systems and github actions
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
python
python script changes
#14521
opened Jul 3, 2025 by
reeselevine
mtmd : Fix 32-bit narrowing issue in export-lora and mtmd clip
examples
#14503
opened Jul 2, 2025 by
kiwi142857
MUSA: upgrade musa sdk to <<TBD>>
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#14498
opened Jul 2, 2025 by
yeahdongcn
Draft
Compute buffer and KV-cache aware layer distribution for multi-GPU inference
#14484
opened Jul 1, 2025 by
borebot
server : (webui) let server send locally-defined default webui settings
examples
server
#14468
opened Jun 30, 2025 by
woof-dog
Chore: batch prompts, extract tensors specific layer
examples
#14463
opened Jun 30, 2025 by
VakantieModus