diff --git a/CLAUDE.md b/CLAUDE.md
index d0ae5647..05f9f111 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 Java bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp) via JNI, providing a high-level API for LLM inference in Java. The Java layer communicates with a native C++ library through JNI.
 
-Current llama.cpp pinned version: **b8778**
+Current llama.cpp pinned version: **b8808**
 
 ## Upgrading CUDA Version
 
@@ -105,33 +105,38 @@ jllama.cpp / server.hpp / utils.hpp
 └── mtmd-helper.h
 ```
 
-**Priority-ordered review list** (highest break risk first):
+**Priority-ordered review list for upgrade diffs** (highest break risk first)
+
+The top 8 rows cover all known breaking changes from b5022 → b8808.
+For future upgrades, provide diffs for at least these 8 files rather than the full patch.
 
 | File | What to watch for |
 |------|-------------------|
+| `common/common.h` | `common_params`/`common_params_speculative` struct fields, `model_alias` container type, `common_init_result` shape, `build_info` symbol |
+| `common/chat.h` | `common_chat_parser_params` (was `common_chat_syntax`), `to_json_oaicompat`, `common_chat_msg_diff_to_json_oaicompat`, `set_tool_call_ids` |
+| `common/speculative.h` | `common_speculative_init`, `common_speculative_draft`, `common_speculative_accept` signatures, struct names |
+| `tools/mtmd/mtmd.h` | `mtmd_context_params` fields, `image_marker`/`media_marker` API, deprecated symbols (was `common/mtmd.h` before ~b8190) |
+| `include/llama-cpp.h` | `common_init_result_ptr` type, access pattern changes (`.get()` vs `->method()`) |
+| `common/arg.h` | `n_parallel` sentinel value, what moved to `download.h` across versions |
 | `include/llama.h` | Core llama_ function signatures, token types, `llama_model_ptr`, renamed structs |
-| `include/llama-cpp.h` | C++ smart pointer types: `llama_model_ptr`, `common_init_result_ptr`; access pattern changes (`.get()` vs `->method()`) |
-| `common/common.h` | `common_params` / `common_params_speculative` struct fields, `model_alias` type, `common_init_result` shape |
+| `common/download.h` | `common_remote_params` struct, `headers` field format (string vs key-value pair) |
 | `common/common.cpp` | Implementation of any inline API used directly |
-| `common/speculative.h` | `common_speculative_init`, `common_speculative_draft`, `common_speculative_accept` signatures |
 | `common/speculative.cpp` | Speculative decoding implementation details |
-| `common/chat.h` | `common_chat_parser_params` (was `common_chat_syntax`), `to_json_oaicompat`, `common_chat_msg_diff_to_json_oaicompat`, `set_tool_call_ids` |
 | `common/chat.cpp` | Chat parsing implementation |
-| `common/arg.h` | Parameter parsing; check what moved to `download.h` across versions |
-| `common/download.h` | `common_remote_params` struct, `headers` field format (string vs key-value pair) |
 | `common/sampling.h` | Sampler API, `common_sampler_*` functions |
 | `common/log.h` | Log macro signatures |
-| `common/mtmd.h` | Multimodal API, `mtmd_init_params` fields |
-| `common/mtmd-helper.h` | Multimodal helper functions |
+| `tools/mtmd/mtmd-helper.h` | Multimodal helper functions |
 | `common/json-schema-to-grammar.h` | Grammar API |
 | `ggml/include/ggml.h` | `ggml_type` enum values (e.g. `GGML_TYPE_F16`), tensor primitives |
 | `ggml/include/ggml-backend.h` | Backend/device abstraction types |
 | `ggml/include/ggml-opt.h` | Optimizer params pulled in via `common.h` |
 
-**Safe to skip** (stable leaf headers, not used directly by project code):
+**Safe to skip** (have never caused a break; not used directly by project code):
+`common/sampling.h`, `common/log.h`, `tools/mtmd/mtmd-helper.h`, `common/json-schema-to-grammar.h`,
+`ggml/include/ggml.h`, `ggml/include/ggml-backend.h`, `ggml/include/ggml-opt.h`,
 `ggml-alloc.h`, `ggml-cpu.h`, `peg-parser.h`, `base64.hpp`
 
-**Known breaking changes by version range** (b5022 → b8190):
+**Known breaking changes by version range** (b5022 → b8808):
 
 | Version | File | Change |
 |---------|------|--------|
@@ -145,6 +150,7 @@ jllama.cpp / server.hpp / utils.hpp
 | ~b7858–b7864 | `server.hpp` (internal) | `slot_action.slot_id` → `slot_action.id_slot`; `llama_init_dft` removed from `server_context`; `model_dft` changed from `llama_model*` to `llama_model_ptr`; `slot.ctx_tgt`/`ctx_dft` removed |
 | ~b7864 | `common/mtmd.h` | `mtmd_init_params.verbosity` field removed |
 | ~b7904–b8190 | `common/common.h` | `params_base.model_alias` changed from `std::string` to a container; use `*model_alias.begin()` instead of direct string cast |
+| ~b8778–b8808 | `tools/mtmd/mtmd.h` | `MTMD_DEFAULT_IMAGE_MARKER` macro removed; `mtmd_image_tokens_get_nx/ny` deprecated; new `mtmd_decoder_pos` struct + `mtmd_image_tokens_get_decoder_pos()`; `mtmd_context_params_default()` now sets `image_marker = nullptr` (throws `"custom image_marker is not supported anymore"` if non-null); upstream server adds randomized `get_media_marker()` in `server-common.h` — our `server.hpp` is unaffected since it does not include that header and uses `mtmd_default_marker()` consistently |
 
 ## Build Commands
 
diff --git a/CMakeLists.txt b/CMakeLists.txt
index c1eec724..fe5482e1 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -97,7 +97,7 @@ set(GGML_AVX512 OFF CACHE BOOL "" FORCE)
 FetchContent_Declare(
     llama.cpp
     GIT_REPOSITORY https://github.com/ggerganov/llama.cpp.git
-    GIT_TAG b8778
+    GIT_TAG b8808
 )
 FetchContent_MakeAvailable(llama.cpp)
 
diff --git a/README.md b/README.md
index 53f5de7d..1f157d59 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
 ![Java 8+](https://img.shields.io/badge/Java-8%2B-informational)
-[![llama.cpp b8778](https://img.shields.io/badge/llama.cpp-%23b8778-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b8778)
+[![llama.cpp b8808](https://img.shields.io/badge/llama.cpp-%23b8808-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b8808)
 
 # Java Bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp)
 