Releases · ggml-org/llama.cpp
b5734
b5733
CUDA: add mean operation (#14313)

* CUDA: add mean operation
* add back sum_rows_f32_cuda
* Review: early exit if col!=0
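A minimal sketch of how a row-wise mean kernel like this is commonly structured: threads in a block cooperatively sum one row, then only the thread with `col == 0` divides and writes the result (matching the "early exit if col!=0" review note). The kernel name and layout here are illustrative, not the actual code from #14313.

```cuda
#include <cuda_runtime.h>

// One block per row; threads stride across the columns, reduce in shared
// memory, then thread 0 writes sum / ncols. Launch as <<<nrows, 256>>>.
__global__ void mean_rows_f32(const float * x, float * dst, const int ncols) {
    const int row = blockIdx.x;
    const int col = threadIdx.x;

    float sum = 0.0f;
    for (int i = col; i < ncols; i += blockDim.x) {
        sum += x[row * ncols + i];
    }

    // block-wide tree reduction (blockDim.x must be a power of two <= 256)
    __shared__ float buf[256];
    buf[col] = sum;
    __syncthreads();
    for (int offset = blockDim.x / 2; offset > 0; offset >>= 1) {
        if (col < offset) {
            buf[col] += buf[col + offset];
        }
        __syncthreads();
    }

    if (col != 0) {
        return; // early exit: only one thread writes the result
    }
    dst[row] = buf[0] / ncols;
}
```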
b5731
Add support for VK_EXT_debug_utils to add labels to Vulkan objects. (…
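For context, labeling Vulkan objects with VK_EXT_debug_utils goes through `vkSetDebugUtilsObjectNameEXT`, loaded at runtime since it is an extension entry point. The helper below is an illustrative sketch, not the code from this PR:

```cpp
#include <vulkan/vulkan.h>

// Attach a human-readable name to a Vulkan object so it shows up in
// validation-layer messages and tools such as RenderDoc.
static void vk_name_object(VkInstance instance, VkDevice device,
                           VkObjectType type, uint64_t handle, const char * name) {
    auto pfn = (PFN_vkSetDebugUtilsObjectNameEXT)
        vkGetInstanceProcAddr(instance, "vkSetDebugUtilsObjectNameEXT");
    if (!pfn) {
        return; // VK_EXT_debug_utils not available
    }
    VkDebugUtilsObjectNameInfoEXT info = {};
    info.sType        = VK_STRUCTURE_TYPE_DEBUG_UTILS_OBJECT_NAME_INFO_EXT;
    info.objectType   = type;
    info.objectHandle = handle;
    info.pObjectName  = name;
    pfn(device, &info);
}

// usage (illustrative):
// vk_name_object(instance, device, VK_OBJECT_TYPE_BUFFER,
//                (uint64_t) buffer, "ggml tensor buffer");
```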
b5729
metal : fix thread-safety (#14300) ggml-ci
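The commit message does not detail the fix, but thread-safety fixes in a backend typically amount to serializing access to shared state. A generic illustrative pattern (not the actual change in #14300):

```cpp
#include <mutex>

// Shared backend state plus a mutex guarding it.
struct backend_state {
    std::mutex mtx;
    int n_in_flight = 0;
};

void backend_submit(backend_state & st) {
    std::lock_guard<std::mutex> lock(st.mtx); // one thread mutates at a time
    st.n_in_flight++;
    // ... touch other shared backend state here ...
}
```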
b5728
memory : rename interface to llama_memory_context_i (#14296)

* memory : rename interface to llama_memory_context_i ggml-ci
* cont : fix comments
* cont : use "mctx" for referencing a memory context ggml-ci
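Only the interface name `llama_memory_context_i` and the `mctx` naming convention come from the commit; the members in this sketch are hypothetical, just to show the shape of an abstract interface in the codebase's C++ style:

```cpp
#include <memory>

// Hypothetical sketch of an abstract memory-context interface.
struct llama_memory_context_i {
    virtual ~llama_memory_context_i() = default;

    // illustrative operations a memory context might expose
    virtual bool apply() = 0;
    virtual void reset() = 0;
};

// per the commit, instances are referenced as "mctx":
// std::unique_ptr<llama_memory_context_i> mctx = ...;
```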
b5726
sync : ggml ggml-ci
b5723
CUDA: add conv_2d_transpose (#14287)

* CUDA: add conv_2d_transpose
* remove direct include of cuda_fp16
* Review: add brackets for readability, remove ggml_set_param and add asserts
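A transposed 2D convolution with stride s maps an in_w × in_h input to an (in_w − 1)·s + k_w wide (and analogously tall) output. A naive single-channel gather formulation, purely illustrative and not the implementation from #14287:

```cuda
#include <cuda_runtime.h>

// One thread per output element; each thread gathers every (input, kernel)
// pair that would scatter into its output position.
__global__ void conv_2d_transpose_f32(
        const float * input, const float * kernel, float * output,
        const int in_w, const int in_h,
        const int k_w,  const int k_h, const int stride) {
    const int out_w = (in_w - 1) * stride + k_w;
    const int out_h = (in_h - 1) * stride + k_h;

    const int ox = blockIdx.x * blockDim.x + threadIdx.x;
    const int oy = blockIdx.y * blockDim.y + threadIdx.y;
    if (ox >= out_w || oy >= out_h) {
        return;
    }

    float sum = 0.0f;
    for (int iy = 0; iy < in_h; ++iy) {
        for (int ix = 0; ix < in_w; ++ix) {
            const int kx = ox - ix * stride;
            const int ky = oy - iy * stride;
            if (kx >= 0 && kx < k_w && ky >= 0 && ky < k_h) {
                sum += input[iy * in_w + ix] * kernel[ky * k_w + kx];
            }
        }
    }
    output[oy * out_w + ox] = sum;
}
```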
b5722
lint : remove trailing whitespace (#14304)
b5721
vocab : prevent tokenizer overflow (#14301)

* vocab : prevent stack overflow in tokenize
* vocab : return error instead of aborting on oversized token count
* vocab : INT32_MIN from llama_tokenize on overflow
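The commit describes the contract: when the token count cannot be represented in the int32 return value of `llama_tokenize`, return the sentinel `INT32_MIN` instead of aborting. A minimal sketch of that guard, with the wrapper name being illustrative:

```cpp
#include <climits>
#include <cstdint>
#include <vector>

// Report the token count, or INT32_MIN if it overflows the int32 API.
int32_t token_count_checked(const std::vector<int32_t> & tokens) {
    if (tokens.size() > (size_t) INT32_MAX) {
        return INT32_MIN; // overflow sentinel per the commit message
    }
    return (int32_t) tokens.size();
}
```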
b5720
sycl: add usage of enqueue_functions extension (#14244)

* Add header and namespace to use enqueue_functions extension
* Convert submit and parallel_for to use new extension in convert.cpp
* Convert submit and parallel_for to use extension in ggml-sycl.cpp
* Convert submit and parallel_for to use extension in gla.cpp
* Convert submit and parallel_for in mmq.cpp
* Convert submit and parallel_for in mmvq.cpp
* Convert submit and parallel_for in remaining files
* Convert all simple parallel_for to nd_launch from enqueue_functions extension
* Wrap the extension in a general function that uses enqueue_functions when the compiler enables it, and otherwise calls the standard SYCL functions to launch kernels.

Signed-off-by: nscipione <nicolo.scipione@codeplay.com>
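A sketch of the general wrapper the commit describes: launch through the enqueue_functions extension's `nd_launch` when the compiler advertises it via the `SYCL_EXT_ONEAPI_ENQUEUE_FUNCTIONS` feature macro, otherwise fall back to the standard submit/parallel_for path. The wrapper name is illustrative:

```cpp
#include <sycl/sycl.hpp>

// Launch a kernel, preferring the enqueue_functions extension when available.
template <typename Kernel>
void launch_kernel(sycl::queue & q, sycl::nd_range<1> range, Kernel kernel) {
#ifdef SYCL_EXT_ONEAPI_ENQUEUE_FUNCTIONS
    // extension path: no handler/submit boilerplate
    sycl::ext::oneapi::experimental::nd_launch(q, range, kernel);
#else
    // standard SYCL fallback
    q.submit([&](sycl::handler & cgh) {
        cgh.parallel_for(range, kernel);
    });
#endif
}
```

Centralizing the launch in one function keeps the per-file conversions mechanical: each former submit/parallel_for call site becomes a single `launch_kernel` call, and the extension/fallback decision lives in one place.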