Insights: ggml-org/llama.cpp

Overview
32 Releases published by 1 person
- b4800 published Mar 2, 2025
- b4801 published Mar 3, 2025
- b4803 published Mar 3, 2025
- b4804 published Mar 3, 2025
- b4805 published Mar 3, 2025
- b4806 published Mar 3, 2025
- b4818 published Mar 3, 2025
- b4819 published Mar 3, 2025
- b4820 published Mar 4, 2025
- b4821 published Mar 4, 2025
- b4823 published Mar 4, 2025
- b4824 published Mar 5, 2025
- b4826 published Mar 5, 2025
- b4827 published Mar 5, 2025
- b4829 published Mar 5, 2025
- b4830 published Mar 5, 2025
- b4831 published Mar 5, 2025
- b4832 published Mar 6, 2025
- b4833 published Mar 6, 2025
- b4834 published Mar 6, 2025
- b4835 published Mar 6, 2025
- b4836 published Mar 6, 2025
- b4837 published Mar 6, 2025
- b4846 published Mar 7, 2025
- b4847 published Mar 7, 2025
- b4848 published Mar 7, 2025
- b4849 published Mar 7, 2025
- b4851 published Mar 7, 2025
- b4853 published Mar 7, 2025
- b4854 published Mar 7, 2025
- b4855 published Mar 7, 2025
- b4856 published Mar 8, 2025
47 Pull requests merged by 26 people
- authors : update (#12271), merged Mar 8, 2025
- ggml-backend : make path_str compatible with C++20 (#12269), merged Mar 8, 2025
- server : infill gen ends on new line (#12254), merged Mar 7, 2025
- ggml : skip intermediate .air file when compiling .metallib (#12247), merged Mar 7, 2025
- sync : ggml (#12248), merged Mar 7, 2025
- ggml-cpu: faster AVX2 variant for IQ1_M (#12216), merged Mar 7, 2025
- ci : fix save-load test invocations (#12245), merged Mar 7, 2025
- server: log original chat template parsing error (#12233), merged Mar 7, 2025
- sync : minja (support QwQ-32B) (#12235), merged Mar 7, 2025
- metal : simplify kernel arguments using a struct (#3229) (#12194), merged Mar 7, 2025
- Fix HIP rocWMMA CI build break (#12230), merged Mar 7, 2025
- metal : fix default.metallib build (#12224), merged Mar 7, 2025
- opencl: Noncontiguous norm, rms_norm; disable fp16 for some ops (#12217), merged Mar 7, 2025
- cmake : fix undefined reference errors for std::filesystem in ggml (#12092) (#12094), merged Mar 6, 2025
- Update README.md (#12229), merged Mar 6, 2025
- CUDA: fix FA logic for PTX 7.0 and CC >= 7.5 (#12222), merged Mar 6, 2025
- HIP: rocWMMA documentation and enabling in workflow builds (#12179), merged Mar 6, 2025
- docs: update function-calling.md w/ template override needed by functionary-small-v3.2 (#12214), merged Mar 6, 2025
- llava: add big-endian conversion for image encoder (#12218), merged Mar 6, 2025
- HIP/CUDA: set the parameter value in maintain_cuda_graph instead of replacing it (#12209), merged Mar 6, 2025
- android : Calculate required KV cache size by summing up tokens size and response token length (#12211) (#12212), merged Mar 6, 2025
- opencl: Fix not enough space in the buffer (#12197), merged Mar 6, 2025
- opencl: Fix ulong kernel args being set from int variables (#12174), merged Mar 6, 2025
- opencl: Fix profile-related errors (#12095), merged Mar 6, 2025
- ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (#12154), merged Mar 6, 2025
- SYCL: Disable f16 Unary OPs as not supported by the kernels (#12201), merged Mar 5, 2025
- ggml : refactor metal library loading to avoid GGMLMetalClass ODR (#12200), merged Mar 5, 2025
- ci : add fetch-depth to xcframework upload (#12195), merged Mar 5, 2025
- tool-call : fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034), merged Mar 5, 2025
- ci : fix xcframework artifact tag (#12191), merged Mar 5, 2025
- ci : remove xcframework upload (#12190), merged Mar 5, 2025
- Server: Cache position calculation error (#12160) (#12161), merged Mar 5, 2025
- llama : add xcframework build script (#11996), merged Mar 5, 2025
- Some portability improvements from trying to build with Visual Studio 2017 (#12150), merged Mar 4, 2025
- readme : fix roadmap link (#12185), merged Mar 4, 2025
- main: allow preloading conversation with -p and add -st / --single-turn (#12145), merged Mar 4, 2025
- server : fix response_format w/ json_schema.schema (#12168), merged Mar 4, 2025
- Add GGML_HIP_ROCWMMA_FATTN to enable rocWMMA for FlashAttention (#12032), merged Mar 3, 2025
- sync : ggml (#12104), merged Mar 3, 2025
- ci : set GITHUB_ACTIONS to true for server tests (#12162), merged Mar 3, 2025
- tts: add speaker file support (#12048), merged Mar 3, 2025
- test-backend-ops : add option -p to filter by op params (#12155), merged Mar 3, 2025
- Fix kleidiai build (#12159), merged Mar 3, 2025
- Adding UTF-8 support to linenoise.cpp (#12111), merged Mar 3, 2025
- webui : add ?m=... and ?q=... params (#12148), merged Mar 3, 2025
- SYCL: Move CPY kernels to a separate file and add a few missing kernels (#12133), merged Mar 3, 2025
- ggml-backend : keep paths in native string type when possible (#12144), merged Mar 2, 2025
28 Pull requests opened by 24 people
- build: fix build error when building source code on Windows (#12157), opened Mar 3, 2025
- llama : refactor llama_context, llama_kv_cache, llm_build_context (v2) (#12181), opened Mar 4, 2025
- CUDA: Improve flash decoding kernel GPU occupancy for BS=1 case (#12183), opened Mar 4, 2025
- fix: AVX2 intrinsics, const correctness, and SIMD headers (#12186), opened Mar 4, 2025
- vulkan: double buffer scale caches (#12188), opened Mar 4, 2025
- libfuse3: support mounting split GGUFs as a single in-memory file (#12189), opened Mar 5, 2025
- SYCL: Rename oneMKL to oneMath (#12192), opened Mar 5, 2025
- feat(CMakeLists): Add MSVC-specific compiler warning flags in CMake configuration (#12206), opened Mar 5, 2025
- build: build llama.cpp + ggml-qnn in pure command line mode on x86-64 Windows (#12215), opened Mar 6, 2025
- opencl: use OpenCL C standard supported by the device (#12221), opened Mar 6, 2025
- Optimized DeepSeek V2/V3 implementation (MLA + flash attention) (#12227), opened Mar 6, 2025
- tests: use adaptive number of threads (#12236), opened Mar 6, 2025
- Issues while enabling MMA support on AIX machines (#12241), opened Mar 7, 2025
- Fix rocWMMA build documentation (#12243), opened Mar 7, 2025
- clang-tidy : disable bugprone-branch-clone (#12244), opened Mar 7, 2025
- server : Add verbose output to OAI compatible chat endpoint (#12246), opened Mar 7, 2025
- main : add -sysf / --system-prompt-file (#12249) (#12250), opened Mar 7, 2025
- vulkan: Adjust coopmat2 tile sizes and selection heuristic (#12258), opened Mar 7, 2025
- vulkan: optimization proposals for coopmat1 mul_mm (#12260), opened Mar 7, 2025
- Add simple-tts example (#12261), opened Mar 8, 2025
- doc: add text-based diagram of software architecture in toplevel README.md (#12263), opened Mar 8, 2025
- metal: Cache compiled library at device level (#12265), opened Mar 8, 2025
- vulkan: fix coopmat shader generation when cross-compiling (#12272), opened Mar 8, 2025
- vulkan: Pad N dimension of B matrix for coopmat2 perf, to avoid bounds checking (#12273), opened Mar 8, 2025
- (research) experiment with phi-4-multimodal vision support (#12274), opened Mar 8, 2025
- Refactoring '-o' option (#12278), opened Mar 9, 2025
- server: fix "--grammar-file" parameter (#12285), opened Mar 9, 2025
29 Issues closed by 18 people
- Misc. bug: GPU Support Missing in Version >=0.3.5 on Windows with CUDA 12.4 and RTX 3090 (#12283), closed Mar 9, 2025
- Compile bug: Unable to make the ./main folder and run fine-tuned model (#12257), closed Mar 7, 2025
- Support for AMD iGPU? (#12239), closed Mar 7, 2025
- llama : add test for saving/loading sessions to the CI (#2631), closed Mar 7, 2025
- Eval bug: Jinja parser not working with QwQ-32B (#12231), closed Mar 7, 2025
- Misc. bug: llama.swiftui simulator error (#12219), closed Mar 7, 2025
- metal : simplify kernel arguments using a struct (#3229), closed Mar 7, 2025
- Eval bug: llama.cpp returns gibberish on Intel Core Ultra 7 (155H) with ARC iGPU (#12096), closed Mar 6, 2025
- [Solved] Model generation speed significantly slows down when using MiroStat V2 (#12220), closed Mar 6, 2025
- Misc. bug: llama-cli's inference result seems incorrect on 64-bit Windows (#12226), closed Mar 6, 2025
- Eval bug: Granite Vision 3.1 and 3.2 Surgery Script Found 0 Tensors to Extract (#12202), closed Mar 6, 2025
- Eval bug: Incorrect KV cache calculation in llama.android example (#12211), closed Mar 6, 2025
- Compile bug: RISC-V compilation help (#12170), closed Mar 6, 2025
- Misc. bug: SYCL out of memory error (#11044), closed Mar 6, 2025
- EoS Tokenization issue for Nemo 12b (#11299), closed Mar 6, 2025
- Misc. bug: Q4_0 repacking results in double RAM usage (#12149), closed Mar 5, 2025
- Misc. bug: Corrupted HF models (#12207), closed Mar 5, 2025
- Feature Request: Split model over multiple Vulkan GPUs (#11004), closed Mar 4, 2025
- Misc. bug: strange reducing of memsize type to 32-bit without dev comment (#11293), closed Mar 4, 2025
- Bug: Flash Attention performs worse under ROCm (#10439), closed Mar 3, 2025
- Compile bug: How to build llama.android example with -DGGML_VULKAN=ON through Android Studio (#12085), closed Mar 3, 2025
- Feature Request: Avoid XML use in tool call instructions (#12153), closed Mar 3, 2025
- Eval bug: llama.cpp becomes slower as the number of threads -t increases (#11247), closed Mar 3, 2025
- qwen model quantized with AWQ and LoRA weights (#11277), closed Mar 3, 2025
- Misc. bug: --file flag not working (#12138), closed Mar 2, 2025
- Eval bug: Crash with filesystem error when run while in a directory containing files with certain names (#11198), closed Mar 2, 2025
- Misc. bug: hipGraph causes a crash in hipGraphDestroy (#11949), closed Mar 2, 2025
41 Issues opened by 37 people
- Misc. bug: An error occurred when committing using the pre-commit config in the project (#12284), opened Mar 9, 2025
- Eval bug: BLAS backend crash in ggml_backend_dev_backend_reg (#12282), opened Mar 9, 2025
- Compile bug: undefined reference to `ggml_set_f32_nd' (#12281), opened Mar 9, 2025
- Misc. bug: tool call issues with hf unsloth/Qwen2.5-Coder-7B-Instruct-128K-GGUF (#12279), opened Mar 9, 2025
- Eval bug: GPU Hang Error on Metal backend (#12277), opened Mar 8, 2025
- Misc. bug: QwQ 32B doesn't put the reasoning content in `message.reasoning_content` (#12275), opened Mar 8, 2025
- Feature Request: Add support for InstellaForCausalLM model architecture (#12270), opened Mar 8, 2025
- Misc. bug: cannot convert GLM-4v-9B (glm-4v-9b) to GGUF format (#11263) (#12266), opened Mar 8, 2025
- Eval bug: server API endpoint not respecting `n_predict` with `-2` (until context filled) (#12264), opened Mar 8, 2025
- Eval bug: Segfault at the end of the cache (cache defragmentation?) (#12259), opened Mar 7, 2025
- Misc. bug: convert_hf_to_gguf failing for deepseek-r1 full (#12255), opened Mar 7, 2025
- Eval bug: garbage output right after kv-cache defragmentation for CPU backend (#12253), opened Mar 7, 2025
- Misc. bug: llama-server: SegFault with json_schema containing unsupported pattern (#12252), opened Mar 7, 2025
- Eval bug: QWQ generates repeated text when running with reduced context length (#12251), opened Mar 7, 2025
- Misc. bug: (#12249), opened Mar 7, 2025
- Feature Request: Convert deepseek-v3's mtp module to gguf and quantize to q4km (#12242), opened Mar 7, 2025
- Issues while enabling MMA support on AIX machines (#12240), opened Mar 7, 2025
- FA bug causes Memory access fault by GPU (#12238), opened Mar 7, 2025
- Prompt eval is 5x slower than in Ollama and maxes out the CPU (#12237), opened Mar 6, 2025
- Eval bug: Excessive stack usage during tool calling (#12234), opened Mar 6, 2025
- Eval bug: Phi-4 mini in iOS with xcframework (#12232), opened Mar 6, 2025
- Eval bug: Failed to Load Granite Vision on s390x (llama-llava-cli) (#12225), opened Mar 6, 2025
- Misc. bug: ALL gguf models fail to run (no log, docker exit code 139) (#12205), opened Mar 5, 2025
- Misc. bug: llama-rpc crashes when deciding memory on CPU with CUDA_VISIBLE_DEVICES="" (#12203), opened Mar 5, 2025
- [Metal] Context init optimization opportunity: metal library is compiled for every llama context (#12199), opened Mar 5, 2025
- Eval bug: Crash on lazy grammar (#12196), opened Mar 5, 2025
- CUDA: Improve flash decoding kernel occupancy for BS=1 case (#12182), opened Mar 4, 2025
- Misc. bug: [Server] Crashes with a coredump during termination (#12180), opened Mar 4, 2025
- Eval bug: Server returns 500 error on /api/generate and /api/chat requests (#12176), opened Mar 4, 2025
- Misc. bug: The inference speed of llama-server is one-third of that of llama-cli (#12171), opened Mar 4, 2025
- Compile bug: issue compiling in Ubuntu (desktop and server version) using VirtualBox (#12164), opened Mar 3, 2025
- Misc. bug: Calculating the position of kv cache error in llama server (#12160), opened Mar 3, 2025
- Eval bug: The answers have some problems with the example/llama.android (#12158), opened Mar 3, 2025
- CUDA: HIP: maintain_cuda_graph use of cudaGraphKernelNodeGetParams is incorrect (#12152), opened Mar 2, 2025
- Replacement for deprecated codecvt string conversion (#12151), opened Mar 2, 2025
74 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- llama-tts : add -o option (#12042), commented on Mar 8, 2025 • 12 new comments
- tool-call: add support for tool-calls using Model Context Protocol (#11556), commented on Mar 9, 2025 • 4 new comments
- Vulkan: Add DP4A MMQ and Q8_1 quantization shader (#12135), commented on Mar 4, 2025 • 1 new comment
- Eval bug: <think> tag with DeepSeek-R1-Distill-Qwen-1.5B-Q5_K_M.gguf (#11325), commented on Mar 9, 2025 • 0 new comments
- Feature Request: allow to run on CPU despite backend initialization failure (#11584), commented on Mar 9, 2025 • 0 new comments
- Feature Request: allow setting jinja chat template from server webui (#11689), commented on Mar 9, 2025 • 0 new comments
- Eval bug: Inconsistent <think> Tag Output in simple-chat vs. llama-cli with DeepSeek-R1-Distill-Qwen-7B-Q4_K_M Model (#11702), commented on Mar 9, 2025 • 0 new comments
- Misc. bug: The test-chat fails with std::runtime_error (#11705), commented on Mar 9, 2025 • 0 new comments
- Feature Request: (webui) read data from /props endpoint and use it on the webui (#11717), commented on Mar 9, 2025 • 0 new comments
- Feature Request: (webui) add import / export function for ALL conversations (#11718), commented on Mar 9, 2025 • 0 new comments
- Misc. bug: add tool_calls id in response in server (#11992), commented on Mar 8, 2025 • 0 new comments
- Feature Request: Support for Phi4MMForCausalLM Architecture (#12117), commented on Mar 8, 2025 • 0 new comments
- Misc. bug: Missing <think> tag in response (DeepSeek R1) (#11861), commented on Mar 8, 2025 • 0 new comments
- ggml : add ANE backend (#10453), commented on Mar 8, 2025 • 0 new comments
- Misc. bug: vulkan on 6900xt (#12147), commented on Mar 8, 2025 • 0 new comments
- Regarding llama-bench and llama-parallel commands (#12106), commented on Mar 8, 2025 • 0 new comments
- Misc. bug: ggml files conflict between llama.cpp and whisper.cpp (#11303), commented on Mar 8, 2025 • 0 new comments
- Compile bug: Vulkan cannot work on Android (cross-compilation from Linux) - Aborted without explanation (#11327), commented on Mar 8, 2025 • 0 new comments
- Eval bug: using rpc, report error [Inferior 1 (process 290070) detached] (#11431), commented on Mar 8, 2025 • 0 new comments
- Eval bug: Segmentation fault on image encoder quantization (#11683), commented on Mar 8, 2025 • 0 new comments
- sycl: cleanup oneDNN related code (#12097), commented on Mar 4, 2025 • 0 new comments
- vulkan: subgroup size test (#12087), commented on Mar 8, 2025 • 0 new comments
- [WIP] backend: Integrating QNN (Qualcomm AI Engine Direct) as a dedicated backend for Qualcomm NPUs (#12063), commented on Mar 8, 2025 • 0 new comments
- ggml-cpu: add arm64 CPU feature check for OpenBSD, FreeBSD (#11939), commented on Mar 5, 2025 • 0 new comments
- llama : private llama_batch (#11875), commented on Mar 7, 2025 • 0 new comments
- Add support for Janus vision encoder and projector [WIP] (#11646), commented on Mar 7, 2025 • 0 new comments
- Add support for Deepseek-R1 flash attention (#11557), commented on Mar 7, 2025 • 0 new comments
- Optimized DeepSeek V2/V3 implementation (MLA) (#11446), commented on Mar 6, 2025 • 0 new comments
- llama : add option to override model tensor buffers (#11397), commented on Mar 6, 2025 • 0 new comments
- llama : second attempt to refactor vision API (#11292), commented on Mar 9, 2025 • 0 new comments
- SYCL: Fixes for building SYCL backend for AMD GPUs (#10851), commented on Mar 6, 2025 • 0 new comments
- Introduce Graph Profiler (#9659), commented on Mar 6, 2025 • 0 new comments
- Changes for the existing quant strategies / FTYPEs and new ones (#8836), commented on Mar 8, 2025 • 0 new comments
- Rebalancing Metal threads workload in dot product kernel kernel_mul_mv_f16_f32_l4 (#7522), commented on Mar 8, 2025 • 0 new comments
- support MiniCPM-V-2 (#6919), commented on Mar 7, 2025 • 0 new comments
- Feature Request: Enable cuda 11.4 and cuda arch 3.7 (#12140), commented on Mar 9, 2025 • 0 new comments
- Research: Performance differences between Metal (macOS) and Vulkan (Linux) (#10982), commented on Mar 9, 2025 • 0 new comments
- Compile bug: Nix + cross compilation + Vulkan doesn't work (#11654), commented on Mar 8, 2025 • 0 new comments
- Feature Request: YuE (music gen) (#11467), commented on Mar 5, 2025 • 0 new comments
- Misc. bug: Converting safetensors model (#11497), commented on Mar 5, 2025 • 0 new comments
- Feature Request: resize an existing context (#11577), commented on Mar 5, 2025 • 0 new comments
- Llama-3.2 11B Vision Support (#9643), commented on Mar 4, 2025 • 0 new comments
- Misc. bug: llama-server throws "Unsupported param: tools" (#10920), commented on Mar 4, 2025 • 0 new comments
- Misc. bug: model warmup doesn't work correctly for MoE models (#11163), commented on Mar 4, 2025 • 0 new comments
- Eval bug: -sm row performance on NVIDIA multi-GPU config is extremely low on long contexts after b3990 (#11510), commented on Mar 4, 2025 • 0 new comments
- Feature Request: add --no-warmup to llama-qwen2vl-cli (#11526), commented on Mar 4, 2025 • 0 new comments
- Misc. bug: File "train-text-from-scratch" missing (#11561), commented on Mar 4, 2025 • 0 new comments
- Flash attention implementations do not handle case where value vectors have different dimension from query vectors (#7343), commented on Mar 3, 2025 • 0 new comments
- changelog : `libllama` API (#9289), commented on Mar 3, 2025 • 0 new comments
- changelog : `llama-server` REST API (#9291), commented on Mar 3, 2025 • 0 new comments
- Feature Request: mixed ROCm+CUDA possible? (#11506), commented on Mar 3, 2025 • 0 new comments
- Misc. bug: server api endpoint /completion ignoring grammar parameter (#11544), commented on Mar 3, 2025 • 0 new comments
- Compile bug: _WIN32_WINNT not, or not correctly, set when compiling with clang and the "MinGW Makefiles" generator (#11542), commented on Mar 3, 2025 • 0 new comments
- Misc. bug: error when converting lora to gguf (ERROR:lora-to-gguf:Unexpected name 'base_model.model.lm_head.weight': Not a lora_A or lora_B tensor) (#11554), commented on Mar 3, 2025 • 0 new comments
- Misc. bug: Quantization of deepseek r1 qwen models fails when using K quants (#11560), commented on Mar 3, 2025 • 0 new comments
- Misc. bug: vulkan on Adreno GPU (#12139), commented on Mar 2, 2025 • 0 new comments
- Misc. bug: Sporadic MUL_MAT Failures in test-backend-ops for Nvidia backend (#11972), commented on Mar 7, 2025 • 0 new comments
- Feature Request: Support for Deepseek Janus-Pro-7B & Janus-1.3B (#11490), commented on Mar 7, 2025 • 0 new comments
- Feature Request: Qwen 2.5 VL (#11483), commented on Mar 7, 2025 • 0 new comments
- Eval bug: ~~Q2_K and Q3_K~~ Q8_0 not working on Vulkan anymore on RX 5700XT (#10710), commented on Mar 7, 2025 • 0 new comments
- Misc. bug: AMD ROCm command error only with cli tools (#11509), commented on Mar 7, 2025 • 0 new comments
- Compile bug: ARMv7 NEON FP16 Intrinsic Errors When Cross-Compiling with Android NDK r26b (#11636), commented on Mar 7, 2025 • 0 new comments
- Eval bug: qwen2-vl failed to process while using HIP on Windows 11 (#11638), commented on Mar 7, 2025 • 0 new comments
- Eval bug: CANNOT LINK EXECUTABLE "./llama-cli": library "libomp.so" not found: needed by main executable (#11979), commented on Mar 7, 2025 • 0 new comments
- Feature Request: Proposing User-Customizable RAG Integration in llama.cpp: A Path to Enhanced Contextual Retrieval (#12129), commented on Mar 6, 2025 • 0 new comments
- Eval bug: assertion error when using a gguf quantized model at inference: "GGML_ASSERT(n_outputs_enc > 0 && "call llama_encode() first") failed" (#12080), commented on Mar 6, 2025 • 0 new comments
- Compile bug: llama.cpp-master/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp:80:54: error: '_mm256_set_m128i' was not declared in this scope (#11385), commented on Mar 6, 2025 • 0 new comments
- Misc. bug: Vulkan Q4_K_M inference speed degradation (#11559), commented on Mar 6, 2025 • 0 new comments
- Feature Request: Ship llama.cpp binaries in AppImage format (#11579), commented on Mar 6, 2025 • 0 new comments
- Compile bug: Not compilable with MACOSX_DEPLOYMENT_TARGET < 10.15 (#11612), commented on Mar 6, 2025 • 0 new comments
- Misc. bug: CTRL-ENTER not visibly shown in web UI prompt (#11586), commented on Mar 6, 2025 • 0 new comments
- Misc. bug: Webui: Default light theme code blocks are not visible (#11623), commented on Mar 6, 2025 • 0 new comments
- Refactor: llama-impl.h should be private (#11630), commented on Mar 6, 2025 • 0 new comments
- Compile bug: undefined reference to std::filesystem (#10978), commented on Mar 5, 2025 • 0 new comments