merge from upstream #77

l3utterfly · 2025-07-14T09:20:52Z

Make sure to read the contributing guidelines before submitting a PR

**Important** LFM2 was merged into transformers (huggingface/transformers#39340), but has not yet been released. To convert into gguf, install transformers from source:

```shell
pip install "transformers @ git+https://github.com/huggingface/transformers.git@main"
```
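Before running the conversion, it can help to confirm that the installed transformers is a source build rather than a release. A minimal sketch, assuming that builds from `main` report a version string containing a `.dev` segment (for example `4.54.0.dev0`); the helper names `looks_like_dev_build` and `installed_transformers_version` are hypothetical and not part of transformers:

```python
from importlib import metadata


def looks_like_dev_build(version: str) -> bool:
    """Return True if the version string carries a '.dev' segment.

    Assumption: source builds from the main branch are marked this way.
    """
    return ".dev" in version


def installed_transformers_version():
    """Return the installed transformers version string, or None if absent."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_transformers_version()
    if version is None:
        print("transformers is not installed")
    elif looks_like_dev_build(version):
        print(f"transformers {version} looks like a source build")
    else:
        print(f"transformers {version} looks like a release; LFM2 support may be missing")
```

The check reads package metadata only, so it does not import transformers itself.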

* vulkan: allow unclamped loads in coopmat2 mul_mat_id shader
* vulkan: increase coopmat2 mul_mat_id tile size
* vulkan: optimize mat_mul_id row_ids search to batch loads, and port to coopmat1 path
* vulkan: use smaller FA row size when head size is large. applies to both scalar and CM2 paths (CM1 isn't used due to shared memory limits)

* vulkan: support SET_ROWS

  Add variants of the copy_to_quant shader that do the SET_ROWS operation. Change these shaders to spread the work across the workgroup. The memory access pattern is probably not great (one thread per quant block), but should be fine for now.

* vulkan: optimize set_rows

  Larger workgroups for non-quant types. Set "norepeat" (there is manual repeat logic). Use fastmod.
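For readers unfamiliar with the operation, SET_ROWS can be thought of as a scatter of whole rows into a destination tensor. A minimal reference sketch in plain Python, assuming the semantics `dst[row_ids[i]] = src[i]`; it ignores quantization, broadcasting, and the workgroup layout mentioned in the commit messages, and `set_rows_reference` is a hypothetical name rather than anything from the PR:

```python
def set_rows_reference(dst, src, row_ids):
    """Copy src[i] into dst[row_ids[i]] for every i, modifying dst in place.

    dst and src are lists of equal-width rows; row_ids holds one
    destination row index per source row.
    """
    if len(src) != len(row_ids):
        raise ValueError("need exactly one row index per source row")
    for i, row in enumerate(row_ids):
        if not 0 <= row < len(dst):
            raise IndexError(f"row index {row} is outside the destination")
        if len(src[i]) != len(dst[row]):
            raise ValueError("source and destination rows differ in width")
        dst[row][:] = src[i]  # overwrite the whole destination row
    return dst


if __name__ == "__main__":
    dst = [[0, 0], [0, 0], [0, 0]]
    print(set_rows_reference(dst, [[1, 2], [3, 4]], [2, 0]))
```

Rows of `dst` that no index points to are left untouched, which is what distinguishes this from a plain copy.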

ggml-ci

* CUDA: add set rows for f32 and f16
* Review: change kernel params, use strides from host
* Use 1-d kernel
* Review: use int64_t for blockDim.x, rename nb->s for clarity
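The "1-d kernel" and "strides from host" items can be pictured with a small simulation. This is a sketch only, assuming one logical thread per element and row strides measured in elements and supplied by the caller; it is not the CUDA kernel from the PR, and every name in it is hypothetical:

```python
def set_rows_flat(src, dst, row_ids, n_cols, src_row_stride, dst_row_stride):
    """Scatter rows between flat buffers using a single 1-D index space.

    Each value of i stands in for one thread of a 1-D launch. The strides
    are passed in by the caller, so padded layouts work without the loop
    body recomputing them.
    """
    total = len(row_ids) * n_cols
    for i in range(total):
        row, col = divmod(i, n_cols)  # recover the 2-D position from the flat index
        src_offset = row * src_row_stride + col
        dst_offset = row_ids[row] * dst_row_stride + col
        dst[dst_offset] = src[src_offset]
    return dst


if __name__ == "__main__":
    # Two source rows of width 2, stored with one padding element per row.
    src = [1, 2, 9, 3, 4, 9]
    dst = [0] * 6  # three densely packed destination rows of width 2
    print(set_rows_flat(src, dst, [2, 0], n_cols=2, src_row_stride=3, dst_row_stride=2))
```

Because the padding elements are skipped through the stride rather than the column count, the same loop handles contiguous and non-contiguous sources.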

* readme : add LFM2 to models section
* fix copy paste...

…#14661) ggml-ci

* llama : add jinja template for rwkv-world

  Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* Update convert_hf_to_gguf.py

  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

slojosic-amd and others added 20 commits July 11, 2025 18:55

HIP : Add HIP 7.0+ compatibility for hipBLAS compute types (ggml-org#…

756aa10

…14634)

server : fix pooled embedding output (ggml-org#14645)

0c1df14

vulkan : implement ggml_roll (ggml/1290)

3e303b1

ggml-ci

vulkan : implement bilinear interpolation (ggml/1291)

74bb294

ggml-ci

sync : ggml

2155357

ggml-ci

vulkan : remove unused vars (#0)

3120413

ggml-ci

sync : ggml

8eff955

CUDA: add set rows for f32 and f16 (ggml-org#14551)

7de5c7c

* CUDA: add set rows for f32 and f16
* Review: change kernel params, use strides from host
* Use 1-d kernel
* Review: use int64_t for blockDim.x, rename nb->s for clarity

docs : add LFM2 to models section (ggml-org#14650)

67eade1

* readme : add LFM2 to models section
* fix copy paste...

tests : cover lfm2 cases in test_ssm_conv (ggml-org#14651)

c31e606

cmake : Add CMake presets for Linux and GCC (ggml-org#14656)

84b396e

metal : Add missing unary ops Metal support (ggml-org#14660)

dcf7f2e

ggml : add build-time message to remind about ggml_set_rows (ggml-org…

05fec5b

…#14661) ggml-ci

cuda : add ELU support (ggml-org#14657)

e743cdd

cuda : add set rows for bf16 (ggml-org#14664)

923e3ea

quantize : fix minor logic flaw in --tensor-type (ggml-org#14572)

982e347
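Among the commits listed above, `cuda : add ELU support` concerns the ELU activation. As a reference for what that function computes, here is a scalar sketch in Python using the standard definition with alpha = 1 (the default value is an assumption about the variant in use, not something stated in the PR):

```python
import math


def elu(x: float, alpha: float = 1.0) -> float:
    """Exponential linear unit: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    if x > 0.0:
        return x
    # expm1 keeps precision for x close to zero, where exp(x) - 1 would cancel.
    return alpha * math.expm1(x)


if __name__ == "__main__":
    for value in (-2.0, -0.5, 0.0, 1.5):
        print(value, elu(value))
```

The function is the identity for positive inputs and saturates towards `-alpha` for large negative inputs.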

l3utterfly merged commit 252164c into layla-build Jul 14, 2025
63 of 64 checks passed


12 participants
