metal: add support for opt_step_sgd #16539

cern1710 · 2025-10-12T19:19:14Z

This PR adds Metal backend support for the OPT_STEP_SGD operator.

Implementation

Added kernel_opt_step_sgd_f32 kernel in ggml-metal.metal
Implemented argument struct ggml_metal_kargs_opt_step_sgd
Storing np in the struct as a passed constant & args
Threadgroup size calculation, similar to ggml_metal_op_opt_step_sgd
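
For readers unfamiliar with the operator, here is a minimal CPU-side sketch of what one SGD step computes. It is not the Metal kernel from this PR. It assumes the update is plain SGD with decoupled weight decay, and the names `sgd_step_f32`, `n_threadgroups`, `alpha`, `wd` and `nth` are hypothetical, chosen only for illustration. `np` stands for the number of parameter elements, as in the list above.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical reference: one SGD step over np parameters.
// Assumed rule: x[i] = x[i] * (1 - alpha * wd) - alpha * g[i]
//   alpha = learning rate, wd = weight decay (both assumptions, not taken from the PR).
static void sgd_step_f32(float * x, const float * g, int64_t np, float alpha, float wd) {
    const float keep = 1.0f - alpha * wd; // fraction of each weight kept after decay
    for (int64_t i = 0; i < np; ++i) {
        x[i] = x[i] * keep - alpha * g[i];
    }
}

// Hypothetical helper: number of threadgroups needed to cover np elements
// when each threadgroup runs nth threads (ceiling division).
static int64_t n_threadgroups(int64_t np, int64_t nth) {
    assert(nth > 0);
    return (np + nth - 1) / nth;
}
```

In a GPU kernel each thread would handle one index `i`, and threads in the last threadgroup would need to skip indices that are `>= np`, which is one reason to pass `np` to the kernel.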

I've run ./test-opt and ./test-backend-ops to verify the validity of this implementation.

Note: The implementation is mostly identical to #16529, with the only major difference being the kernel itself.

* origin/master: (32 commits)
  metal : FA support F32 K and V and head size = 32 (ggml-org#16531)
  graph : support cacheless embeddings with FA and iSWA (ggml-org#16528)
  opencl: fix build targeting CL 2 (ggml-org#16554)
  CUDA: fix numerical issues in tile FA kernel (ggml-org#16540)
  ggml : fix build broken with -march=armv9-a on MacOS (ggml-org#16520)
  CANN: fix CPU memory leak in CANN backend (ggml-org#16549)
  fix: add remark plugin to render raw HTML as literal text (ggml-org#16505)
  metal: add support for opt_step_sgd (ggml-org#16539)
  ggml : fix scalar path for computing norm (ggml-org#16558)
  CANN: Update several operators to support FP16 data format (ggml-org#16251)
  metal : add opt_step_adamw and op_sum (ggml-org#16529)
  webui: remove client-side context pre-check and rely on backend for limits (ggml-org#16506)
  [SYCL] fix UT fault cases: count-equal, argsort, pad OPs (ggml-org#16521)
  ci : add Vulkan on Ubuntu with default packages build (ggml-org#16532)
  common : handle unicode during partial json parsing (ggml-org#16526)
  common : update presets (ggml-org#16504)
  ggml : Fix FP16 ELU positive branch (ggml-org#16519)
  hparams : add check for layer index in is_recurrent (ggml-org#16511)
  ggml: Correct SVE implementation in ggml_vec_dot_f16_unroll (ggml-org#16518)
  CUDA: faster tile FA, add oob checks, more HSs (ggml-org#16492)
  ...

metal: add support for opt_step_sgd

bb8d149

cern1710 requested a review from ggerganov as a code owner October 12, 2025 19:19

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and Apple Metal (https://en.wikipedia.org/wiki/Metal_(API)) labels Oct 12, 2025

add newline to pass EditorConfig check

248d9c5

ggerganov approved these changes Oct 13, 2025

ggerganov merged commit 3f750f8 into ggml-org:master Oct 13, 2025
67 checks passed

cern1710 deleted the metal-opt-step-sgd branch October 13, 2025 08:27

cern1710 mentioned this pull request Oct 13, 2025

metal: optimise GGML_OP_SUM #16559

Open
