Sync master with upstream release b9140 #515

Merged
jan-service-account merged 13 commits into dev from update-dev-from-master-2026-05-14-01-09
May 14, 2026

@jan-service-account
Updates dev branch with latest release (b9140) from ggml-org/llama.cpp

trivikram-reddy1 and others added 13 commits May 12, 2026 17:28
…22993)

* hexagon: add hvx_vec_repl helpers and use those for splat-from-vtcm usecase

* hmx-mm: optimize per-group scale handling

* hmx-fa: optimize slope load from vtcm

* hmx-fa: use aligned access where possible in hmx-utils

* hexagon: add hvx_vec_repl_2x_f16 helper and consolidate repl helpers

---------

Co-authored-by: Max Krasnyansky <maxk@qti.qualcomm.com>
…gml-org#22681)

* ggml-zendnn : add runtime env var GGML_ZENDNN_ADAPTIVE_FALLBACK to control adaptive fallback (default: enabled)

* ggml-zendnn : restore original fallback logic when adaptive fallback is disabled
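
A minimal sketch of how a default-enabled runtime toggle such as `GGML_ZENDNN_ADAPTIVE_FALLBACK` is commonly read. Only the variable name and the "default: enabled" behaviour come from the commit message. The function name and the set of accepted "off" spellings are assumptions for illustration, not the ggml-zendnn implementation.

```python
import os

# Variable name taken from the commit message above.
ENV_VAR = "GGML_ZENDNN_ADAPTIVE_FALLBACK"

# Hypothetical set of values treated as "disabled"; the real backend may
# accept a different set.
_OFF_VALUES = {"0", "false", "off", "no"}


def adaptive_fallback_enabled(environ=os.environ):
    """Return True unless the variable is set to an explicit 'off' value."""
    raw = environ.get(ENV_VAR)
    if raw is None:
        # Unset means enabled, matching "default: enabled" in the commit.
        return True
    return raw.strip().lower() not in _OFF_VALUES
```

Passing a plain dictionary instead of `os.environ` makes the toggle easy to exercise without touching the process environment.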
* spec : update CLI arguments for better consistency

* cont : fix CLI arg message
* ci: validate model naming convention

* bring back dedicated ec workflow

* add missing jobs

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
…org#22727)

* server, webui : support continue generation on reasoning models (ggml-org#22727)

Remove the throw blocking assistant prefill on reasoning models and
orchestrate thinking tags around the prefilled message so the parser
routes the next stream chunks correctly. WebUI drops the reasoning
guard on the Continue button, sends reasoning_content with the
prefilled message and persists partial reasoning on stop so the CoT
survives reload and resume.

Scope : templates with a simple thinking_start_tag / thinking_end_tag
pair. Channel-based templates like GPT-OSS are out of scope, pending
a per-template prefill API in common/chat.

First step toward ggml-org#21754.
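
The tag orchestration described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the server code: the function name, the `<think>` / `</think>` defaults, and the error type are hypothetical. Only the idea of wrapping the prefilled message in a `thinking_start_tag` / `thinking_end_tag` pair, and of rejecting templates without such a pair, comes from the commit messages.

```python
def build_assistant_prefill(reasoning, content,
                            start_tag="<think>", end_tag="</think>"):
    """Compose the text an assistant turn is resumed from.

    reasoning: partial or complete chain-of-thought saved on stop.
    content:   partial visible answer, empty if the stop happened mid-reasoning.
    start_tag / end_tag: the template's thinking tags; None stands in for a
    template (for example a channel-based one) that has no simple pair.
    """
    if not reasoning:
        # Nothing to wrap: plain prefill, as for non-reasoning models.
        return content
    if start_tag is None or end_tag is None:
        # Out of scope per the commit message: no simple tag pair available.
        raise ValueError("reasoning prefill needs a simple thinking tag pair")
    if content:
        # Reasoning already finished: close the block, then resume the answer.
        return f"{start_tag}{reasoning}{end_tag}{content}"
    # Stopped mid-reasoning: leave the block open so the parser keeps routing
    # the next chunks as reasoning.
    return f"{start_tag}{reasoning}"
```

For example, `build_assistant_prefill("step 1", "")` leaves the thinking block open, while supplying non-empty `content` closes it before the visible answer continues.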

* chore: update webui build output

* server: reject reasoning prefill on channel based templates
Updated OPENVINO.md with Validated models and quantizations

Co-authored-by: Haarika Madaka <haarika.madaka@intel.com>
* webui: preserve system message on edit cancel when content is not the placeholder

* chore: update webui build output
…ases in UI (ggml-org#22979)

* fix: Deduplicate aliases + display single alias instead of default name or 2+ aliases as tags

* refactor: Address review comments
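
One possible reading of the alias rule in the fix above, sketched in Python rather than the WebUI's own code. The function name and return shape are hypothetical, and the interpretation is an assumption: a single unique alias replaces the default name, while two or more aliases are shown as tags next to the default name.

```python
def display_model(default_name, aliases):
    """Return (label, tags) for a model entry.

    Duplicates and empty aliases are dropped while preserving first-seen order.
    """
    unique = list(dict.fromkeys(a for a in aliases if a))
    if len(unique) == 1:
        # A single alias is shown in place of the default name, with no tags.
        return unique[0], []
    # Zero aliases: default name only. Two or more: default name plus tags.
    return default_name, unique
```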
@jan-service-account jan-service-account merged commit dac4002 into dev May 14, 2026
14 checks passed
@jan-service-account jan-service-account deleted the update-dev-from-master-2026-05-14-01-09 branch May 14, 2026 01:19