Insights: ggml-org/llama.cpp

Overview
32 Releases published by 1 person
- b4800 published Mar 2, 2025
- b4801 published Mar 3, 2025
- b4803 published Mar 3, 2025
- b4804 published Mar 3, 2025
- b4805 published Mar 3, 2025
- b4806 published Mar 3, 2025
- b4818 published Mar 3, 2025
- b4819 published Mar 3, 2025
- b4820 published Mar 4, 2025
- b4821 published Mar 4, 2025
- b4823 published Mar 4, 2025
- b4824 published Mar 5, 2025
- b4826 published Mar 5, 2025
- b4827 published Mar 5, 2025
- b4829 published Mar 5, 2025
- b4830 published Mar 5, 2025
- b4831 published Mar 5, 2025
- b4832 published Mar 6, 2025
- b4833 published Mar 6, 2025
- b4834 published Mar 6, 2025
- b4835 published Mar 6, 2025
- b4836 published Mar 6, 2025
- b4837 published Mar 6, 2025
- b4846 published Mar 7, 2025
- b4847 published Mar 7, 2025
- b4848 published Mar 7, 2025
- b4849 published Mar 7, 2025
- b4851 published Mar 7, 2025
- b4853 published Mar 7, 2025
- b4854 published Mar 7, 2025
- b4855 published Mar 7, 2025
- b4856 published Mar 8, 2025
47 Pull requests merged by 26 people
- authors : update (#12271), merged Mar 8, 2025
- ggml-backend : make path_str compatible with C++20 (#12269), merged Mar 8, 2025
- server : infill gen ends on new line (#12254), merged Mar 7, 2025
- ggml : skip intermediate .air file when compiling .metallib (#12247), merged Mar 7, 2025
- sync : ggml (#12248), merged Mar 7, 2025
- ggml-cpu: faster AVX2 variant for IQ1_M (#12216), merged Mar 7, 2025
- ci : fix save-load test invocations (#12245), merged Mar 7, 2025
- server: log original chat template parsing error (#12233), merged Mar 7, 2025
- sync : minja (support QwQ-32B) (#12235), merged Mar 7, 2025
- metal : simplify kernel arguments using a struct (#3229) (#12194), merged Mar 7, 2025
- Fix HIP rocWMMA CI build break (#12230), merged Mar 7, 2025
- metal : fix default.metallib build (#12224), merged Mar 7, 2025
- opencl: Noncontiguous norm, rms_norm; disable fp16 for some ops (#12217), merged Mar 7, 2025
- cmake : fix undefined reference errors for std::filesystem in ggml (#12092) (#12094), merged Mar 6, 2025
- Update README.md (#12229), merged Mar 6, 2025
- CUDA: fix FA logic for PTX 7.0 and CC >= 7.5 (#12222), merged Mar 6, 2025
- HIP: rocWMMA documentation and enabling in workflow builds (#12179), merged Mar 6, 2025
- docs: update function-calling.md w/ template override needed by functionary-small-v3.2 (#12214), merged Mar 6, 2025
- llava: add big-endian conversion for image encoder (#12218), merged Mar 6, 2025
- HIP/CUDA: set the parameter value in maintain_cuda_graph instead of replacing it (#12209), merged Mar 6, 2025
- android : Calculate required KV cache size by summing up tokens size and response token length (#12211) (#12212), merged Mar 6, 2025
- opencl: Fix not enough space in the buffer (#12197), merged Mar 6, 2025
- opencl: Fix ulong kernel args being set from int variables (#12174), merged Mar 6, 2025
- opencl: Fix profile-related errors (#12095), merged Mar 6, 2025
- ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (#12154), merged Mar 6, 2025
- SYCL: Disable f16 Unary OPs as not supported by the kernels (#12201), merged Mar 5, 2025
- ggml : refactor metal library loading to avoid GGMLMetalClass ODR (#12200), merged Mar 5, 2025
- ci : add fetch-depth to xcframework upload (#12195), merged Mar 5, 2025
- tool-call : fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034), merged Mar 5, 2025
- ci : fix xcframework artifact tag (#12191), merged Mar 5, 2025
- ci : remove xcframework upload (#12190), merged Mar 5, 2025
- Server: Cache position calculation error (#12160) (#12161), merged Mar 5, 2025
- llama : add xcframework build script (#11996), merged Mar 5, 2025
- Some portability improvements from trying to build with Visual Studio 2017 (#12150), merged Mar 4, 2025
- readme : fix roadmap link (#12185), merged Mar 4, 2025
- main: allow preloading conversation with -p and add -st / --single-turn (#12145), merged Mar 4, 2025
- server : fix response_format w/ json_schema.schema (#12168), merged Mar 4, 2025
- Add GGML_HIP_ROCWMMA_FATTN to enable rocWMMA for FlashAttention (#12032), merged Mar 3, 2025
- sync : ggml (#12104), merged Mar 3, 2025
- ci : set GITHUB_ACTIONS to true for server tests (#12162), merged Mar 3, 2025
- tts: add speaker file support (#12048), merged Mar 3, 2025
- test-backend-ops : add option -p to filter by op params (#12155), merged Mar 3, 2025
- Fix kleidiai build (#12159), merged Mar 3, 2025
- Adding UTF-8 support to linenoise.cpp (#12111), merged Mar 3, 2025
- webui : add ?m=... and ?q=... params (#12148), merged Mar 3, 2025
- SYCL: Move CPY kernels to a separate file and add a few missing kernels (#12133), merged Mar 3, 2025
- ggml-backend : keep paths in native string type when possible (#12144), merged Mar 2, 2025
28 Pull requests opened by 24 people
- build: fix build error when building source code on Windows (#12157), opened Mar 3, 2025
- llama : refactor llama_context, llama_kv_cache, llm_build_context (v2) (#12181), opened Mar 4, 2025
- CUDA: Improve flash decoding kernel GPU occupancy for BS=1 case (#12183), opened Mar 4, 2025
- fix: AVX2 intrinsics, const correctness, and SIMD headers (#12186), opened Mar 4, 2025
- vulkan: double buffer scale caches (#12188), opened Mar 4, 2025
- libfuse3: support mounting split GGUFs as a single in-memory file (#12189), opened Mar 5, 2025
- SYCL: Rename oneMKL to oneMath (#12192), opened Mar 5, 2025
- feat(CMakeLists): Add MSVC-specific compiler warning flags in CMake configuration (#12206), opened Mar 5, 2025
- build: build llama.cpp + ggml-qnn in pure command line mode on x86-64 Windows (#12215), opened Mar 6, 2025
- opencl: use OpenCL C standard supported by the device (#12221), opened Mar 6, 2025
- Optimized DeepSeek V2/V3 implementation (MLA + flash attention) (#12227), opened Mar 6, 2025
- tests: use adaptive number of threads (#12236), opened Mar 6, 2025
- Issues while enabling MMA support on AIX machines (#12241), opened Mar 7, 2025
- Fix rocWMMA build documentation (#12243), opened Mar 7, 2025
- clang-tidy : disable bugprone-branch-clone (#12244), opened Mar 7, 2025
- server : Add verbose output to OAI compatible chat endpoint (#12246), opened Mar 7, 2025
- main : add -sysf / --system-prompt-file (#12249) (#12250), opened Mar 7, 2025
- vulkan: Adjust coopmat2 tile sizes and selection heuristic (#12258), opened Mar 7, 2025
- vulkan: optimization proposals for coopmat1 mul_mm (#12260), opened Mar 7, 2025
- Add simple-tts example (#12261), opened Mar 8, 2025
- doc: add text-based diagram of software architecture in toplevel README.md (#12263), opened Mar 8, 2025
- metal: Cache compiled library at device level (#12265), opened Mar 8, 2025
- vulkan: fix coopmat shader generation when cross-compiling (#12272), opened Mar 8, 2025
- vulkan: Pad N dimension of B matrix for coopmat2 perf, to avoid bounds checking (#12273), opened Mar 8, 2025
- (research) experiment with phi-4-multimodal vision support (#12274), opened Mar 8, 2025
- Refactoring '-o' option (#12278), opened Mar 9, 2025
- server: fix "--grammar-file" parameter (#12285), opened Mar 9, 2025
29 Issues closed by 18 people
- Misc. bug: GPU Support Missing in Version >=0.3.5 on Windows with CUDA 12.4 and RTX 3090 (#12283), closed Mar 9, 2025
- Compile bug: Unable to make the ./main folder and run fine-tuned model (#12257), closed Mar 7, 2025
- Support for AMD iGPU? (#12239), closed Mar 7, 2025
- llama : add test for saving/loading sessions to the CI (#2631), closed Mar 7, 2025
- Eval bug: Jinja parser not working with QwQ-32B (#12231), closed Mar 7, 2025
- Misc. bug: llama.swiftui simulator error (#12219), closed Mar 7, 2025
- metal : simplify kernel arguments using a struct (#3229), closed Mar 7, 2025
- Eval bug: llama.cpp returns gibberish on Intel Core Ultra 7 (155H) with ARC iGPU (#12096), closed Mar 6, 2025
- [Solved] Model generation speed significantly slows down when using MiroStat V2 (#12220), closed Mar 6, 2025
- Misc. bug: llama-cli's inference result seems incorrect on 64-bit Windows (#12226), closed Mar 6, 2025
- Eval bug: Granite Vision 3.1 and 3.2 Surgery Script Found 0 Tensors to Extract (#12202), closed Mar 6, 2025
- Eval bug: Incorrect KV cache calculation in llama.android example (#12211), closed Mar 6, 2025
- Compile bug: RISC-V compilation help (#12170), closed Mar 6, 2025
- Misc. bug: SYCL out of memory error (#11044), closed Mar 6, 2025
- EoS Tokenization issue for Nemo 12b (#11299), closed Mar 6, 2025
- Misc. bug: Q4_0 repacking results in double RAM usage (#12149), closed Mar 5, 2025
- Misc. bug: Corrupted HF models (#12207), closed Mar 5, 2025
- Feature Request: Split model over multiple Vulkan GPUs (#11004), closed Mar 4, 2025
- Misc. bug: strange reducing of memsize type to 32-bit without dev comment (#11293), closed Mar 4, 2025
- Bug: Flash Attention performs worse under ROCm (#10439), closed Mar 3, 2025
- Compile bug: How to build llama.android example with -DGGML_VULKAN=ON through Android Studio (#12085), closed Mar 3, 2025
- Feature Request: Avoid XML use in tool call instructions (#12153), closed Mar 3, 2025
- Eval bug: llama.cpp becomes slower as the number of threads -t increases (#11247), closed Mar 3, 2025
- qwen model quantized with AWQ and LoRA weights (#11277), closed Mar 3, 2025
- Misc. bug: --file flag not working (#12138), closed Mar 2, 2025
- Eval bug: Crash with filesystem error when run while in a directory containing files with certain names (#11198), closed Mar 2, 2025
- Misc. bug: hipGraph causes a crash in hipGraphDestroy (#11949), closed Mar 2, 2025
41 Issues opened by 37 people
- Misc. bug: An error occurred when committing using the pre-commit config in the project (#12284), opened Mar 9, 2025
- Eval bug: BLAS backend crash in ggml_backend_dev_backend_reg (#12282), opened Mar 9, 2025
- Compile bug: undefined reference to `ggml_set_f32_nd' (#12281), opened Mar 9, 2025
- Misc. bug: tool call issues with hf unsloth/Qwen2.5-Coder-7B-Instruct-128K-GGUF (#12279), opened Mar 9, 2025
- Eval bug: GPU Hang Error on Metal backend (#12277), opened Mar 8, 2025
- Misc. bug: QwQ 32B doesn't put the reasoning content in `message.reasoning_content` (#12275), opened Mar 8, 2025
- Feature Request: Add support for InstellaForCausalLM model architecture (#12270), opened Mar 8, 2025
- Misc. bug: cannot convert GLM-4v-9B (glm-4v-9b) to GGUF format (#11263) (#12266), opened Mar 8, 2025
- Eval bug: server API endpoint not respecting `n_predict` with `-2` (until context filled) (#12264), opened Mar 8, 2025
- Eval bug: Segfault at the end of the cache (cache defragmentation?) (#12259), opened Mar 7, 2025
- Misc. bug: convert_hf_to_gguf failing for deepseek-r1 full (#12255), opened Mar 7, 2025
- Eval bug: garbage output right after kv-cache defragmentation for CPU backend (#12253), opened Mar 7, 2025
- Misc. bug: llama-server: SegFault with json_schema containing unsupported pattern (#12252), opened Mar 7, 2025
- Eval bug: QWQ generates repeated text when running with reduced context length (#12251), opened Mar 7, 2025
- Misc. bug: (#12249), opened Mar 7, 2025
- Feature Request: Convert deepseek-v3's mtp module to gguf and quantize to q4km (#12242), opened Mar 7, 2025
- Issues while enabling MMA support on AIX machines (#12240), opened Mar 7, 2025
- FA bug causes Memory access fault by GPU (#12238), opened Mar 7, 2025
- Prompt eval is 5x slower than in Ollama and maxes out the CPU (#12237), opened Mar 6, 2025
- Eval bug: Excessive stack usage during tool calling (#12234), opened Mar 6, 2025
- Eval bug: Phi-4 mini in iOS with xcframework (#12232), opened Mar 6, 2025
- Eval bug: Failed to Load Granite Vision on s390x (llama-llava-cli) (#12225), opened Mar 6, 2025
- Misc. bug: ALL gguf models fail to run (no log, docker exit code 139) (#12205), opened Mar 5, 2025
- Misc. bug: llama-rpc crashes when deciding memory on CPU with CUDA_VISIBLE_DEVICES="" (#12203), opened Mar 5, 2025
- [Metal] Context init optimization opportunity: metal library is compiled for every llama context (#12199), opened Mar 5, 2025
- Eval bug: Crash on lazy grammar (#12196), opened Mar 5, 2025
- CUDA: Improve flash decoding kernel occupancy for BS=1 case (#12182), opened Mar 4, 2025
- Misc. bug: [Server] Crashes with a coredump during termination (#12180), opened Mar 4, 2025
- Eval bug: Server returns 500 error on /api/generate and /api/chat requests (#12176), opened Mar 4, 2025
- Misc. bug: The inference speed of llama-server is one-third of that of llama-cli (#12171), opened Mar 4, 2025
- Compile bug: issue compiling in Ubuntu (desktop and server version) using VirtualBox (#12164), opened Mar 3, 2025
- Misc. bug: Calculating the position of kv cache error in llama server (#12160), opened Mar 3, 2025
- Eval bug: The answers have some problems with the example/llama.android (#12158), opened Mar 3, 2025
- CUDA: HIP: maintain_cuda_graph use of cudaGraphKernelNodeGetParams is incorrect (#12152), opened Mar 2, 2025
- Replacement for deprecated codecvt string conversion (#12151), opened Mar 2, 2025
74 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- llama-tts : add -o option (#12042), commented on Mar 8, 2025 • 12 new comments
- tool-call: add support for tool-calls using Model Context Protocol (#11556), commented on Mar 9, 2025 • 4 new comments
- Vulkan: Add DP4A MMQ and Q8_1 quantization shader (#12135), commented on Mar 4, 2025 • 1 new comment
- Eval bug: <think> tag with DeepSeek-R1-Distill-Qwen-1.5B-Q5_K_M.gguf (#11325), commented on Mar 9, 2025 • 0 new comments
- Feature Request: allow to run on CPU despite backend initialization failure (#11584), commented on Mar 9, 2025 • 0 new comments
- Feature Request: allow setting jinja chat template from server webui (#11689), commented on Mar 9, 2025 • 0 new comments
- Eval bug: Inconsistent <think> Tag Output in simple-chat vs. llama-cli with DeepSeek-R1-Distill-Qwen-7B-Q4_K_M Model (#11702), commented on Mar 9, 2025 • 0 new comments
- Misc. bug: The test-chat fails with std::runtime_error (#11705), commented on Mar 9, 2025 • 0 new comments
- Feature Request: (webui) read data from /props endpoint and use it on the webui (#11717), commented on Mar 9, 2025 • 0 new comments
- Feature Request: (webui) add import / export function for ALL conversations (#11718), commented on Mar 9, 2025 • 0 new comments
- Misc. bug: add tool_calls id in response in server (#11992), commented on Mar 8, 2025 • 0 new comments
- Feature Request: Support for Phi4MMForCausalLM Architecture (#12117), commented on Mar 8, 2025 • 0 new comments
- Misc. bug: Missing <think> tag in response (DeepSeek R1) (#11861), commented on Mar 8, 2025 • 0 new comments
- ggml : add ANE backend (#10453), commented on Mar 8, 2025 • 0 new comments
- Misc. bug: vulkan on 6900xt (#12147), commented on Mar 8, 2025 • 0 new comments
- Regarding llama-bench and llama-parallel commands (#12106), commented on Mar 8, 2025 • 0 new comments
- Misc. bug: ggml files conflict between llama.cpp and whisper.cpp (#11303), commented on Mar 8, 2025 • 0 new comments
- Compile bug: Vulkan cannot work on Android (cross-compilation from Linux) - Aborted without explanation (#11327), commented on Mar 8, 2025 • 0 new comments
- Eval bug: using rpc, report error [Inferior 1 (process 290070) detached] (#11431), commented on Mar 8, 2025 • 0 new comments
- Eval bug: Segmentation fault on image encoder quantization (#11683), commented on Mar 8, 2025 • 0 new comments
- sycl: cleanup oneDNN related code (#12097), commented on Mar 4, 2025 • 0 new comments
- vulkan: subgroup size test (#12087), commented on Mar 8, 2025 • 0 new comments
- [WIP] backend: Integrating QNN (Qualcomm AI Engine Direct) as a dedicated backend for Qualcomm NPUs (#12063), commented on Mar 8, 2025 • 0 new comments
- ggml-cpu: add arm64 CPU feature check for OpenBSD, FreeBSD (#11939), commented on Mar 5, 2025 • 0 new comments
- llama : private llama_batch (#11875), commented on Mar 7, 2025 • 0 new comments
- Add support for Janus vision encoder and projector [WIP] (#11646), commented on Mar 7, 2025 • 0 new comments
- Add support for Deepseek-R1 flash attention (#11557), commented on Mar 7, 2025 • 0 new comments
- Optimized DeepSeek V2/V3 implementation (MLA) (#11446), commented on Mar 6, 2025 • 0 new comments
- llama : add option to override model tensor buffers (#11397), commented on Mar 6, 2025 • 0 new comments
- llama : second attempt to refactor vision API (#11292), commented on Mar 9, 2025 • 0 new comments
- SYCL: Fixes for building SYCL backend for AMD GPUs (#10851), commented on Mar 6, 2025 • 0 new comments
- Introduce Graph Profiler (#9659), commented on Mar 6, 2025 • 0 new comments
- Changes for the existing quant strategies / FTYPEs and new ones (#8836), commented on Mar 8, 2025 • 0 new comments
- Rebalancing Metal threads workload in dot product kernel kernel_mul_mv_f16_f32_l4 (#7522), commented on Mar 8, 2025 • 0 new comments
- support MiniCPM-V-2 (#6919), commented on Mar 7, 2025 • 0 new comments
- Feature Request: Enable cuda 11.4 and cuda arch 3.7 (#12140), commented on Mar 9, 2025 • 0 new comments
- Research: Performance differences between Metal (macOS) and Vulkan (Linux) (#10982), commented on Mar 9, 2025 • 0 new comments
- Compile bug: Nix + cross compilation + Vulkan doesn't work (#11654), commented on Mar 8, 2025 • 0 new comments
- Feature Request: YuE (music gen) (#11467), commented on Mar 5, 2025 • 0 new comments
- Misc. bug: Converting safetensors model (#11497), commented on Mar 5, 2025 • 0 new comments
- Feature Request: resize an existing context (#11577), commented on Mar 5, 2025 • 0 new comments
- Llama-3.2 11B Vision Support (#9643), commented on Mar 4, 2025 • 0 new comments
- Misc. bug: llama-server throws "Unsupported param: tools" (#10920), commented on Mar 4, 2025 • 0 new comments
- Misc. bug: model warmup doesn't work correctly for MoE models (#11163), commented on Mar 4, 2025 • 0 new comments
- Eval bug: -sm row performance on NVIDIA multi-GPU config is extremely low on long contexts after b3990 (#11510), commented on Mar 4, 2025 • 0 new comments
- Feature Request: add --no-warmup to llama-qwen2vl-cli (#11526), commented on Mar 4, 2025 • 0 new comments
- Misc. bug: File "train-text-from-scratch" missing (#11561), commented on Mar 4, 2025 • 0 new comments
- Flash attention implementations do not handle case where value vectors have different dimension from query vectors (#7343), commented on Mar 3, 2025 • 0 new comments
- changelog : `libllama` API (#9289), commented on Mar 3, 2025 • 0 new comments
- changelog : `llama-server` REST API (#9291), commented on Mar 3, 2025 • 0 new comments
- Feature Request: mixed ROCm+CUDA possible? (#11506), commented on Mar 3, 2025 • 0 new comments
- Misc. bug: server api endpoint /completion ignoring grammar parameter (#11544), commented on Mar 3, 2025 • 0 new comments
- Compile bug: _WIN32_WINNT not, or not correctly, set when compiling with clang and the "MinGW Makefiles" generator (#11542), commented on Mar 3, 2025 • 0 new comments
- Misc. bug: error when converting lora to gguf (ERROR:lora-to-gguf:Unexpected name 'base_model.model.lm_head.weight': Not a lora_A or lora_B tensor) (#11554), commented on Mar 3, 2025 • 0 new comments
- Misc. bug: Quantization of deepseek r1 qwen models fails when using K quants (#11560), commented on Mar 3, 2025 • 0 new comments
- Misc. bug: vulkan on Adreno GPU (#12139), commented on Mar 2, 2025 • 0 new comments
- Misc. bug: Sporadic MUL_MAT Failures in test-backend-ops for Nvidia backend (#11972), commented on Mar 7, 2025 • 0 new comments
- Feature Request: Support for Deepseek Janus-Pro-7B & Janus-1.3B (#11490), commented on Mar 7, 2025 • 0 new comments
- Feature Request: Qwen 2.5 VL (#11483), commented on Mar 7, 2025 • 0 new comments
- Eval bug: ~~Q2_K and Q3_K~~ Q8_0 not working on Vulkan anymore on RX 5700XT (#10710), commented on Mar 7, 2025 • 0 new comments
- Misc. bug: AMD ROCm command error only with cli tools (#11509), commented on Mar 7, 2025 • 0 new comments
- Compile bug: ARMv7 NEON FP16 Intrinsic Errors When Cross-Compiling with Android NDK r26b (#11636), commented on Mar 7, 2025 • 0 new comments
- Eval bug: qwen2-vl failed to process while using HIP on Windows 11 (#11638), commented on Mar 7, 2025 • 0 new comments
- Eval bug: CANNOT LINK EXECUTABLE "./llama-cli": library "libomp.so" not found: needed by main executable (#11979), commented on Mar 7, 2025 • 0 new comments
- Feature Request: Proposing User-Customizable RAG Integration in llama.cpp: A Path to Enhanced Contextual Retrieval (#12129), commented on Mar 6, 2025 • 0 new comments
- Eval bug: assertion error when using a gguf quantized model at inference: "GGML_ASSERT(n_outputs_enc > 0 && "call llama_encode() first") failed" (#12080), commented on Mar 6, 2025 • 0 new comments
- Compile bug: llama.cpp-master/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp:80:54: error: '_mm256_set_m128i' was not declared in this scope (#11385), commented on Mar 6, 2025 • 0 new comments
- Misc. bug: Vulkan Q4_K_M inference speed degradation (#11559), commented on Mar 6, 2025 • 0 new comments
- Feature Request: Ship llama.cpp binaries in AppImage format (#11579), commented on Mar 6, 2025 • 0 new comments
- Compile bug: Not compilable with MACOSX_DEPLOYMENT_TARGET < 10.15 (#11612), commented on Mar 6, 2025 • 0 new comments
- Misc. bug: CTRL-ENTER not visibly shown in web UI prompt (#11586), commented on Mar 6, 2025 • 0 new comments
- Misc. bug: Webui: Default light theme code blocks are not visible (#11623), commented on Mar 6, 2025 • 0 new comments
- Refactor: llama-impl.h should be private (#11630), commented on Mar 6, 2025 • 0 new comments
- Compile bug: undefined reference to std::filesystem (#10978), commented on Mar 5, 2025 • 0 new comments