Rebase to b6298 #10

kpouget · 2025-08-29T10:02:36Z

Summary by CodeRabbit

New Features
- Diffusion-based generation: new CLI and options for steps, algorithms, visual mode, CFG, etc.
- Enhanced chat templating: supports GPT-OSS and Granite, BOS/EOS control, extra template kwargs.
- Speculative decoding improvements with draft/target vocab compatibility and replacements.
- Model conversion toolkit: end-to-end examples, logits/embeddings tools, Makefile workflows.
- Experimental WebGPU build option.
- New CANN Docker images; updated Vulkan install flow; MUSA/ROCm version bumps.
Bug Fixes
- More robust tool-call argument parsing, safer tokenization errors, improved CLS/SEP handling, NaN checks.
Documentation
- Expanded build guides (Linux, s390x, Vulkan, WebGPU), new ops support table, multimodal how-tos.
Chores
- Legacy Makefile removed (use CMake). Major CI/templates cleanup.

* remove unnecessary conts and merge reshapes * restore necessary conts * merge more conts and reshapes * merge even more conts and reshapes

* chat : clarify the meaning of reasoning_format * add link to this PR

…5385) * Added VSX intrinsics for Power9+ systems Signed-off-by: mgiessing <marvin.giessing@gmail.com> * Manual unrolling for minor perf improvement Signed-off-by: mgiessing <marvin.giessing@gmail.com> * Update ggml/src/ggml-cpu/arch/powerpc/quants.c Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Signed-off-by: mgiessing <marvin.giessing@gmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

…15413) Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* optimize rope ops * amendment * delete trailing whitespace * change the variable name

* server : disable context shift by default ggml-ci * server : make scope of test parameters local

Fixes #15423.

…5375)

* musa: fix build warnings Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * fix warning: comparison of integers of different signs: 'const int' and 'unsigned int' [-Wsign-compare] Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> --------- Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* lookahead : add sample command to readme * cont : build-agnostic command

This commit removes the content from the Makefile and replaces the current deprecation message with a notice that `make` has been replaced by CMake. Invoking `make` now prints the following:

```console
$ make
Makefile:6: *** Build system changed: The Makefile build has been replaced by CMake. For build instructions see: https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md . Stop.
```

The motivation is that many, if not all, targets now fail to build after changes to the system, and `make` has been deprecated for some time.

* Update docker.yml: change the docker.yml file so that the workflow no longer runs periodically; it can still be started manually when needed.
* feat: Modify the header file include path
  1. There's no llava directory in the tools directory.
  2. Because the command `target_include_directories(mtmd PUBLIC .)` is used in the `mtmd` CMakeLists.txt file, other targets that link against `mtmd` automatically include the `mtmd` directory as a search path for header files. Therefore, you can remove `target_include_directories(${TARGET} PRIVATE ../llava)` or use `target_include_directories(${TARGET} PRIVATE ../mtmd)` to explicitly require the `llama-server` target to use header files from `mtmd`.
* Restore the docker.yml file

Signed-off-by: Jie Fu <jiefu@tencent.com>

This commit addresses an inconsistency during inference by adding a new member to the `templates_params` struct to indicate whether the chat is in inference mode. This allows the gpt-oss specific function `common_chat_params_init_gpt_oss` to check this flag and the `add_generation_prompt` flag to determine if it should replace the `<|return|>` token with the `<|end|>` token in the prompt.

The motivation for this change is to ensure that the formatted prompt of past messages in `common_chat_format_single` matches the output of the formatted new message. The issue is that the gpt-oss template returns different end tags: `<|return|>` when `add_generation_prompt` is false, and `<|end|>` when `add_generation_prompt` is true. This causes the substring function to start at an incorrect position, resulting in tokenization starting with 'tart|>' instead of '<|start|>'.

Resolves: ggml-org/llama.cpp#15417
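
The offset problem described in this commit message can be reproduced with a toy model. The sketch below is an illustration only: `render`, `normalize_end_tag`, and `format_single` are hypothetical helpers, the template is a drastically simplified stand-in for the real gpt-oss template, and the normalization step only approximates the replacement the commit describes. It shows why a 10-character `<|return|>` in the past-messages prefix, against a 7-character `<|end|>` in the full prompt, shifts the substring start by 3 characters.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using message = std::pair<std::string, std::string>; // (role, content)

// Toy stand-in for the gpt-oss template (an assumption, not the real template):
// a final assistant message is closed with <|return|> unless a generation
// prompt is requested; every other message is closed with <|end|>.
static std::string render(const std::vector<message> & msgs, bool add_generation_prompt) {
    std::string out;
    for (std::size_t i = 0; i < msgs.size(); ++i) {
        const bool last = (i + 1 == msgs.size());
        out += "<|start|>" + msgs[i].first + "<|message|>" + msgs[i].second;
        const bool final_assistant = last && msgs[i].first == "assistant" && !add_generation_prompt;
        out += final_assistant ? "<|return|>" : "<|end|>";
    }
    if (add_generation_prompt) {
        out += "<|start|>assistant";
    }
    return out;
}

// Hypothetical normalization: rewrite a trailing <|return|> as <|end|> so the
// past-messages prefix has the same length as it has inside the full prompt.
static std::string normalize_end_tag(std::string s) {
    const std::string ret = "<|return|>";
    if (s.size() >= ret.size() && s.compare(s.size() - ret.size(), ret.size(), ret) == 0) {
        s.replace(s.size() - ret.size(), ret.size(), "<|end|>");
    }
    return s;
}

// Returns the text contributed by new_msg, computed as "full prompt minus the
// formatted past messages" (the substring approach described above).
static std::string format_single(const std::vector<message> & past, const message & new_msg, bool normalize) {
    std::string fmt_past = render(past, /*add_generation_prompt=*/false);
    if (normalize) {
        fmt_past = normalize_end_tag(fmt_past);
    }
    std::vector<message> all = past;
    all.push_back(new_msg);
    const std::string fmt_all = render(all, /*add_generation_prompt=*/true);
    assert(fmt_all.size() >= fmt_past.size());
    return fmt_all.substr(fmt_past.size());
}
```

With a past conversation ending in an assistant message, calling `format_single` without normalization yields text beginning with `tart|>`, while the normalized variant begins with `<|start|>` as expected.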

These detailed strings were causing increased build time on gcc.

…eams (#15444)

…(#15346)

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

This commit removes references to `make` in the examples, as the build system has been updated to use CMake directly and using `make` will now generate an error since Commit 37f10f9 ("make : remove make in favor of CMake (#15449)").

* Fix webui crash after streaming * build webui

… UNUSED

…v1.1.1-remoting-0.1

…g a warning on failure

…_timer

…it reliable

…tgpu device cannot be open

…al_supports_op copied from ggml-metal

openshift-merge-robot · 2025-08-29T10:02:45Z

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

coderabbitai · 2025-08-29T10:02:48Z

Caution

Review failed

The pull request is closed.

Walkthrough

Repository-wide updates: major CI and GitHub workflow removals; new multi-stage Dockerfile for CANN plus container tweaks; Makefile disabled in favor of CMake with new presets; substantial CLI/API additions for chat templating, speculative decoding, diffusion, finetune/lr; new diffusion and model-conversion examples; multiple docs expansions; assorted config/script cleanups.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Build system overhaul `Makefile`, `CMakeLists.txt`, `CMakePresets.json` | Makefile replaced with hard error directing to CMake; CMake logs build type and adjusts install version; removes deprecated KOMPUTE alias; adds GCC and remoting presets. |
| Core params, sampling, finetune `common/common.h`, `common/common.cpp` | Adds diffusion, finetune LR scheduling, optimizer hooks, new params (api_prefix, cls_sep, kv_unified, no_extra_bufts, etc.); preserves non-printables; updates EOG bias handling; tokenization error guard; new helpers and LR logic. |
| CLI args and parsing `common/arg.cpp` | Centralizes tensor-buffer override parsing; new draft/main CPU-MoE controls; diffusion CLI options; server/API prefix, cls separator, template kwargs; init lr. |
| Chat templating and formats `common/chat.h`, `common/chat.cpp`, `common/chat-parser.cpp` | Extends chat API with Granite/GPT-OSS, BOS/EOS controls, template kwargs, extra context; robust tool_call arguments parsing; reasoning format helper. |
| Speculative decoding API `common/speculative.h`, `common/speculative.cpp` | Dual-context init and drafting; replacement mapping API; handles cross-vocab drafting and retokenization. |
| JSON schema grammar util `common/json-schema-to-grammar.cpp` | Migrates to std::string_view; removes local shim. |
| Conversion tools `convert_hf_to_gguf_update.py`, `convert_lora_to_gguf.py` | Adds --check-missing, model lists, stricter tokenizer checks, tolerant HF token; minor load_hparams signature usage change. |
| Examples: diffusion `examples/diffusion/*` | New diffusion CLI target, README, and implementation with algorithms, schedules, CFG, visualization. |
| Examples: model-conversion `examples/model-conversion/*`, `examples/CMakeLists.txt` | Adds logits tool, Makefile-driven workflows, scripts for run/convert/compare, requirements; integrates example subdirs. |
| Examples: misc fixes `examples/embedding/embedding.cpp`, `examples/eval-callback/eval-callback.cpp`, `examples/*.{sh}`, `examples/batched.swift/README.md`, `examples/lookahead/README.md`, `examples/llama.vim` | Embedding cls_sep handling and KV unify; eval-callback i64/NaN guard; shebangs to env bash; doc tweaks. |
| DevOps: Dockerfiles `.devops/cann.Dockerfile`, `.devops/cpu.Dockerfile`, `.devops/cuda.Dockerfile`, `.devops/musa.Dockerfile`, `.devops/rocm.Dockerfile`, `.devops/vulkan.Dockerfile` | New multi-stage CANN images (full/light/server); CPU build unifies arch handling; CUDA pip adds --break-system-packages; MUSA/ROCm version bumps; Vulkan switches to manual SDK install with env setup. |
| DevOps: scripts & nix `.devops/tools.sh`, `.devops/nix/package.nix`, `build.remoting.sh`, `build.backend.sh`, `ci/run.sh`, `ci/README.md` | Shebangs via env; Nix optionalAttrs for ROCm; new remoting build scripts; run.sh adds WebGPU flag and wget tweak; MUSA tag update in CI doc. |
| CI/workflows removal `.github/workflows/*` | Removes CI pipelines for build, release, server tests, cross, docker publish, python checks, labeler, editorconfig, bench, winget, close-issue. |
| GitHub templates/actions removal `.github/ISSUE_TEMPLATE/`, `.github/actions/`, `.github/labeler.yml`, `.github/pull_request_template.md` | Deletes multiple issue templates, composite actions (tag-name, windows setups), labeler config, and a line from PR template; clears issue config. |
| Configs & ownership `.clang-format`, `.gitignore`, `.gitmodules`, `CODEOWNERS`, `OWNERS` | Adjusts argument bin-pack and include categories; tracks models/templates; removes kompute submodule; updates code owners; adds OWNERS file. |
| Docs updates `docs/*` | Broad build doc revisions (Vulkan/WebGPU, CUDA notes, curl deps), backend docs tweaks (CANN/SYCL), s390x guide expansion, Docker images list, multimodal pages (MiniCPM, Voxtral), new ops matrix, model-adding HOWTO overhaul. |
| DevOps pipeline removal `.devops/cloud-v-pipeline` | Removes Jenkins-style RISC-V cross-compile/run pipeline. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User
  participant CLI as llama-diffusion-cli
  participant Llama as llama_model/ctx
  Note over CLI: Parse args (diffusion, CFG, schedule)
  User->>CLI: Run with model + prompt
  CLI->>Llama: load_model()
  CLI->>Llama: create_context()
  CLI->>Llama: tokenize(prompt)
  loop steps
    CLI->>Llama: eval(masked tokens)
    Llama-->>CLI: logits
    CLI->>CLI: score + select (alg/schedule)
    alt CFG enabled
      CLI->>Llama: eval(uncond)
      Llama-->>CLI: logits_uncond
      CLI->>CLI: combine(logits, logits_uncond, cfg_scale)
    end
    CLI->>CLI: update masked tokens
  end
  CLI-->>User: Print generated text
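
The `combine(logits, logits_uncond, cfg_scale)` step in the diagram above can be sketched as follows. This is a minimal illustration using the common textbook form of classifier-free guidance; the function name `combine_cfg` is hypothetical, and the exact formula used by the diffusion example may differ (some formulations scale by `1 + cfg_scale`, for instance).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Classifier-free guidance in its common textbook form (an assumption about
// what "combine" means here):
//   guided[i] = uncond[i] + cfg_scale * (cond[i] - uncond[i])
// cfg_scale == 1 reproduces the conditional logits, cfg_scale == 0 the
// unconditional ones, and larger values push further toward the condition.
static std::vector<float> combine_cfg(const std::vector<float> & logits,
                                      const std::vector<float> & logits_uncond,
                                      float cfg_scale) {
    assert(logits.size() == logits_uncond.size());
    std::vector<float> guided(logits.size());
    for (std::size_t i = 0; i < logits.size(); ++i) {
        guided[i] = logits_uncond[i] + cfg_scale * (logits[i] - logits_uncond[i]);
    }
    return guided;
}
```

Because the unconditional pass needs a second model evaluation per step, enabling guidance roughly doubles the evaluation cost of each diffusion step, which matches the extra `eval(uncond)` call in the diagram.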

sequenceDiagram
  autonumber
  participant App
  participant Spec as common_speculative
  participant Tgt as ctx_tgt
  participant Dft as ctx_dft
  App->>Spec: init(ctx_tgt, ctx_dft)
  App->>Spec: add_replacement(src→dst)
  App->>Spec: gen_draft(params, prompt_tgt, id_last_tgt)
  alt vocab compatible
    Spec->>Dft: decode(prompt_tgt)
  else incompatible
    Spec->>Spec: retokenize tgt→dft
    Spec->>Dft: decode(prompt_dft)
    Spec->>Spec: map draft tokens dft→tgt
  end
  Spec-->>App: draft tokens (≤ n_draft)
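
For the incompatible-vocabulary branch above, one plausible reading of the replacement mapping is a plain string substitution applied to the detokenized text before it is retokenized with the other model's vocabulary. The sketch below illustrates that idea only; `replacement` and `apply_replacements` are hypothetical names, and the real `common_speculative` code may work differently.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using replacement = std::pair<std::string, std::string>; // (source, destination)

// Applies each (source -> destination) rule to every occurrence in the text.
// Intended use (an assumption): rewrite special-token spellings of the target
// model into the draft model's spellings before retokenizing.
static std::string apply_replacements(std::string text, const std::vector<replacement> & rules) {
    for (const auto & rule : rules) {
        if (rule.first.empty()) {
            continue; // an empty pattern would match everywhere and never advance
        }
        std::string::size_type pos = 0;
        while ((pos = text.find(rule.first, pos)) != std::string::npos) {
            text.replace(pos, rule.first.size(), rule.second);
            pos += rule.second.size(); // skip past the inserted text
        }
    }
    return text;
}
```

Advancing `pos` past the inserted text keeps a rule from re-matching its own output, so a rule whose destination contains its source still terminates.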

sequenceDiagram
  autonumber
  participant Dev as docker build
  participant Build as CANN build stage
  participant Base as CANN base
  participant Final as Image target
  Dev->>Build: FROM ${CANN_BASE_IMAGE}
  Build->>Build: setup toolchain/env
  Build->>Build: cmake -DGGML_CANN=ON -DSOC_TYPE=...
  Build->>Build: make -j
  Build->>Build: collect /app/lib and /app/full
  Dev->>Base: FROM ${CANN_BASE_IMAGE}
  Base->>Base: set runtime env
  Base->>Final: COPY from build (libs/full)
  alt target=full
    Final-->>Dev: ENTRYPOINT /app/tools.sh
  else target=light
    Final-->>Dev: ENTRYPOINT /app/llama-cli
  else target=server
    Final-->>Dev: ENTRYPOINT /app/llama-server (+healthcheck)
  end

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120+ minutes

Possibly related PRs

OWNERS: add file for openshift CI #3 — Introduces the same OWNERS file structure; matches ownership configuration changes here.
remoting: improve the failure when no virtgpu is available #6 — Modifies/adds remoting build/run scripts similar to new build.remoting.sh and backend build changes.
OWNERS: Update #8 — Adjusts OWNERS metadata; directly related to ownership/governance updates.

Suggested reviewers

cfergeau
praveenkumar

Poem

Bunny taps the CMake drum,
CI clouds fade—new builds hum.
Diffuse a dream, draft in flight,
Granite chats by candlelight.
Docker carrots, Vulkan stew—
Hop, compile, then run anew.
Byte by byte, we chew. 🥕🐇

openshift-ci · 2025-08-29T10:02:51Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign gbraad for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

…16038) Initializing RESERVED_NAME in is_reserved_name() is not thread safe and leads to corrupted memory when used from multiple threads, as can be seen in the ASan trace below. This fixes the initialization to make it thread-safe.

    #0 0x000100abd018 in std::__1::pair<std::__1::__hash_iterator<std::__1::__hash_node<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, void*>*>, bool> std::__1::__hash_table<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::__emplace_unique_key_args<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) __hash_table:1565
    #1 0x000100ab0320 in SchemaConverter::visit(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) json-schema-to-grammar.cpp:802
    #2 0x000100aafc48 in std::__1::__function::__func<build_grammar(std::__1::function<void (common_grammar_builder const&)> const&, common_grammar_options const&)::$_2, std::__1::allocator<build_grammar(std::__1::function<void (common_grammar_builder const&)> const&, common_grammar_options const&)::$_2>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&)>::operator()(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&) function.h:319
    #3 0x000100a2c938 in std::__1::__function::__func<common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool)::$_0::operator()(common_grammar_builder const&) const::'lambda'(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&), std::__1::allocator<common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool)::$_0::operator()(common_grammar_builder const&) const::'lambda'(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&)>, void (nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&)>::operator()(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&) function.h:319
    #4 0x000100a139f8 in foreach_function(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&, std::__1::function<void (nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&)> const&) chat.cpp:762
    #5 0x000100a2a7f4 in std::__1::__function::__func<common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool)::$_0, std::__1::allocator<common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool)::$_0>, void (common_grammar_builder const&)>::operator()(common_grammar_builder const&) function.h:319
    #6 0x000100aa98f4 in build_grammar(std::__1::function<void (common_grammar_builder const&)> const&, common_grammar_options const&) json-schema-to-grammar.cpp:982
    #7 0x0001009c9314 in common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool) chat.cpp:1110
    #8 0x0001009b8afc in common_chat_templates_apply_jinja(common_chat_templates const*, common_chat_templates_inputs const&) chat.cpp:1992
    #9 0x0001009b533c in common_chat_templates_apply(common_chat_templates const*, common_chat_templates_inputs const&) chat.cpp:2074
    #10 0x000100810120 in llamacpp_apply_chat_template+0x724 (predict_oai-98384e17fb94e863:arm64+0x100090120)
    ...

    ==45482==Register values:
    x[0] = 0x00006020004147f8 x[1] = 0x00006080000013c8 x[2] = 0x0000000000000000 x[3] = 0x0000604006289738
    x[4] = 0x0000000000000002 x[5] = 0x0000000000000001 x[6] = 0x04034000004b4000 x[7] = 0x0000000000000001
    x[8] = 0xbebebebebebebebe x[9] = 0x17d7d7d7d7d7d7d7 x[10] = 0x00000c04000828ff x[11] = 0x0000000000000001
    x[12] = 0x000000002018d383 x[13] = 0x0000000000000000 x[14] = 0xfa0000000000fafa x[15] = 0x000010700001ffff
    x[16] = 0x000000019dc012c0 x[17] = 0x00000001021284f8 x[18] = 0x0000000000000000 x[19] = 0x00000001700acdc0
    x[20] = 0x0000000000000002 x[21] = 0x000000002018d384 x[22] = 0x16dd16fd2e731151 x[23] = 0x0000007000020000
    x[24] = 0x0000000100c69c08 x[25] = 0x0000000100c69c20 x[26] = 0x00006080000013c7 x[27] = 0x0000000100c69c00
    x[28] = 0x00000001700acd60 fp = 0x00000001700aceb0 lr = 0x0000000100abce30 sp = 0x00000001700acd60
    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV __hash_table:1565 in std::__1::pair<std::__1::__hash_iterator<std::__1::__hash_node<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, void*>*>, bool> std::__1::__hash_table<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::__emplace_unique_key_args<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)

    Thread T5 created by T0 here:
    #0 0x0001020b99d4 in pthread_create+0x5c (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x359d4)
    #1 0x000100873910 in std::sys::pal::unix::thread::Thread::new::h77254fdd87a28e05+0x118 (predict_oai-98384e17fb94e863:arm64+0x1000f3910)
    #2 0x0001007c7a1c in test::run_test::haeb3c2bcd5ed6cf6+0x76c (predict_oai-98384e17fb94e863:arm64+0x100047a1c)
    #3 0x0001007aedb0 in test::console::run_tests_console::he9d142d704f3a986+0x149c (predict_oai-98384e17fb94e863:arm64+0x10002edb0)
    #4 0x0001007c5758 in test::test_main::hf86a5e20735245b9+0x118 (predict_oai-98384e17fb94e863:arm64+0x100045758)
    #5 0x0001007c5da0 in test::test_main_static::h61ee9c8fd30abca0+0x54 (predict_oai-98384e17fb94e863:arm64+0x100045da0)
    ...

    ==45482==ABORTING
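
The crash above is in a hash-table insert, which is consistent with several threads populating a shared set at the same time. One common way to make such initialization thread-safe is a function-local `static` built by an immediately invoked lambda, since C++11 guarantees that this initialization runs exactly once even when the first calls are concurrent. The sketch below illustrates that pattern only; it is not taken from the actual fix, and the set contents are placeholder values.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Racy pattern, shown for contrast: a shared set that is filled lazily.
// Two threads can both observe empty() and insert concurrently.
//
//   static std::unordered_set<std::string> reserved;
//   if (reserved.empty()) { reserved.insert("root"); /* ... */ }

// Thread-safe variant: the set is fully built inside the initializer of a
// function-local static, which the language initializes exactly once.
// After that it is const and only read, so concurrent lookups are safe.
static bool is_reserved_name(const std::string & name) {
    static const std::unordered_set<std::string> reserved = [] {
        std::unordered_set<std::string> s;
        s.insert("root");    // placeholder entries, chosen for illustration
        s.insert("boolean");
        s.insert("number");
        return s;
    }();
    return reserved.find(name) != reserved.end();
}
```

An alternative with the same guarantee is `std::call_once` with a `std::once_flag`; the local-static form needs no extra state and keeps the set `const`.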

CISC and others added 30 commits August 18, 2025 19:30

llama : merge conts and reshapes and remove unnecessary cont (#15380)

baa9255

* remove unnecessary conts and merge reshapes * restore necessary conts * merge more conts and reshapes * merge even more conts and reshapes

scripts : update sync scripts

f0c541d

sync : ggml

60212f1

codeowners : remove mmv.*

6d7f111

mtmd : clean up clip_n_output_tokens (#15391)

f08c4c0

batched-bench : use rand tokens (#15398)

f0d3c74

server : remove swa_full warning (#15399)

9d262f4

chat : clarify the meaning of reasoning_format (#15408)

e9288e8

* chat : clarify the meaning of reasoning_format * add link to this PR

musa: handle __hgt2_mask, available starting from MUSA SDK rc4.3.0 (#…

67f09a3

…15413) Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

CANN: optimize rope operator (#15335)

a6d3cfe

* optimize rope ops * amendment * delete trailing whitespace * change the variable name

server : disable context shift by default (#15416)

d2fcd91

* server : disable context shift by default ggml-ci * server : make scope of test parameters local

common : Add top-nsigma sampler to help globally (#15428)

1e19f5d

Fixes #15423.

model : add gpt-oss type strings (#15424)

9ef6b0b

opencl: mark argsort unsupported if cols exceed workgroup limit (#1…

fb22dd0

…5375)

lookahead : add sample command to readme (#15447)

2f37014

* lookahead : add sample command to readme * cont : build-agnostic command

common : fix context shift help message (#15448)

ec5ab1a

Signed-off-by: Jie Fu <jiefu@tencent.com>

vulkan: shorten pipeline name strings (#15431)

fec9519

These detailed strings were causing increased build time on gcc.

CUDA: replace GGML_CUDA_F16 with CUDA arch checks (#15433)

7a6e91a

CUDA: refactor FA support/selection code (#15454)

13aeb7a

server: fix OpenAI API compatibility for usage statistics in chat str…

1bc664a

…eams (#15444)

sched : copy only the used experts when offloading prompt processing …

5682a37

…(#15346)

musa: add GGML_UNUSED_VARS (#15446)

8ad038c

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

server : fix webui (#15462)

1b0db8f

* Fix webui crash after streaming * build webui

ggml : fix condition of im2col on Metal backend (#15460)

945e1f1

kpouget added 20 commits August 27, 2025 13:51

ggml: src: ggml-remotingbackend/backend-dispatched-metal: add missing…

2292e2e

… UNUSED

ggml-remotingbackend: allow saving the hypervisor logs to a file

28280f1

remotingbackend: update the VIRGL_APIR indexes to match virglrenderer …

516933f

…v1.1.1-remoting-0.1

remotingfrontend/virtgpu: give more time to load the libraries and lo…

87d71f2

…g a warning on failure

OWNERS: add file for openshift CI

3bd97bc

ggml-remotingbackend/shared/apir_backend: return the duration in stop…

1fa1c2d

…_timer

ggml-remotingfrontend/virtgpu: rewrite the timeout mechanism to make …

d1b255f

…it reliable

ggml-remotingfrontend: improve the timers display

38d49bd

ggml-remotingfrontend: add an ERROR log level

2a2b19b

ggml-remotingfrontend: turn some INFO logs into MESSAGE (always printed)

2ec8784

ggml-remotingfrontend: turn an INFO log into ERROR

2f8f3ee

ggml: src: ggml-remotingfrontend/virtgpu: correctly fail when the vir…

135ff21

…tgpu device cannot be open

run.remoting: update to run llama-server

ebb5fb2

OWNERS: Update

cb0fca5

remoting: improve the frontend<>backend error handling

910e3fc

remoting: improve the frontend<>backend return code exchange

3b9b455

ggml: src: ggml-remotingfrontend/virtgpu: fix typo

8725fdd

update the build scripts

871c4c8

ggml: src: ggml-remotingfrontend/ggml-metal-remoting: update ggml_met…

e5c6771

…al_supports_op copied from ggml-metal

update the build scripts

ab02d59

openshift-merge-robot added the needs-rebase label Aug 29, 2025

openshift-ci bot requested a review from praveenkumar August 29, 2025 10:02

openshift-ci bot requested a review from vyasgun August 29, 2025 10:02

kpouget closed this Aug 29, 2025

kpouget deleted the rebase-b6298 branch August 29, 2025 10:07

kpouget restored the rebase-b6298 branch August 29, 2025 10:07


45 participants
