Claude/identify cpp boilerplate xvi2a#76
Merged
bernardladenthin merged 34 commits into master on Apr 5, 2026
Conversation
… helpers

Finding 1 (pointer extraction):
- Add get_jllama_context_impl() to jni_helpers.hpp beside get_server_context_impl(), documenting the deliberate difference: it returns the outer jllama_context* (not .server), and does NOT throw on null (a delete no-op is valid).
- Add a get_jllama_context() convenience wrapper in the jllama.cpp namespace.
- Replace the raw GetLongField + reinterpret_cast in Java_..._delete with get_jllama_context(), removing the only raw extraction outside the helpers.

Finding 2 (repeated catch blocks):
- Add a static throw_invalid_request(env, e) helper in the jllama.cpp namespace.
- Replace 8 verbatim 3-line catch blocks across requestCompletion, handleChatCompletions (×2), requestChatCompletion (×2), handleCompletions, handleCompletionsOai, and handleInfill with a single call to the helper.

Tests:
- Add 4 new GoogleTest cases in test_jni_helpers.cpp for get_jllama_context_impl: a null handle returns nullptr without throwing, a valid handle returns the wrapper (not the inner server), and a contract-comparison test asserts that the two helpers differ in null-handle throw behaviour.
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
Extract the verbatim 13-line result-collection loop that appeared identically in handleCompletions, handleCompletionsOai, and handleInfill into a single static collect_task_results() helper in jllama.cpp. The helper takes (env, ctx_server, task_ids, out):
- iterates, receiving one server_task_result_ptr per id
- on error: cleans up waiting ids, throws via JNI, returns false
- on success: fills out, cleans up waiting ids, returns true

Callers reduce to a single guarded line:
    if (!collect_task_results(env, ctx_server, task_ids, results)) return nullptr;

Tests:
- test_server.cpp: documents why a C++ unit test is not feasible (anonymous-namespace linkage + server.hpp dependency) and where coverage actually comes from (LlamaModelTest success path; ErrorHandlingTest error path).
- ErrorHandlingTest.java: three new tests covering the std::exception catch guard that sits immediately before collect_task_results is reached — missing "prompt" in handleCompletions, handleCompletionsOai, and a well-formed handleInfill (confirms no uncaught C++ exception escapes the JNI boundary on the refactored path).
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
The previous implementation buried collect_task_results in jllama.cpp's anonymous namespace and excused the lack of unit tests with a comment. That is a design smell: testability should be a first-class concern.

Fix: introduce src/main/cpp/jni_server_helpers.hpp — a new header with server.hpp in its include surface (distinct from the server-free jni_helpers.hpp). collect_task_results_impl() lives there, taking explicit (server_response&, jclass) parameters so it is fully testable without module-level globals. The thin collect_task_results() wrapper in jllama.cpp's namespace supplies ctx_server->queue_results and c_llama_error as before; call sites are unchanged.

Tests (test_jni_server_helpers.cpp, registered in CMakeLists.txt):
- single success result → out filled, no throw, returns true
- single error result → ThrowNew called with the correct message, returns false
- multiple success results → all collected, returns true
- first ok / second error → stops on the error, throws
- success path: waiting_task_ids cleared after collect
- error path: waiting_task_ids cleared after the error

Tests use a pre-seeded server_response (send() before recv() avoids blocking) and the same mock JNIEnv stub pattern as test_jni_helpers.cpp. Also removes the "NOT covered here" apology comment from test_server.cpp.
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
server.hpp has no #pragma once or header guard, so including it from jni_server_helpers.hpp caused redefinition errors in jllama.cpp which also includes server.hpp directly. Fix: remove #include "server.hpp" from jni_server_helpers.hpp and document the required include order in a header comment. Update test_jni_server_helpers.cpp to include server.hpp explicitly before jni_server_helpers.hpp. https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
….cpp jni_server_helpers.hpp requires server.hpp to be included first. Move #include "jni_server_helpers.hpp" to after #include "server.hpp". https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
oaicompat_completion_params_parse() throws std::runtime_error when "prompt" is absent. In handleCompletionsOai the call was outside any try-catch block, so the exception escaped the JNI boundary and called std::terminate, crashing the JVM. Wrap the call in a try-catch using throw_invalid_request, matching the identical guard pattern already present in handleChatCompletions and requestChatCompletion. The bug was pre-existing; testHandleCompletionsOai MissingPromptThrows in ErrorHandlingTest exposed it. https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
… loops
The 17-line task-construction try-block (tokenise prompt, loop to build
server_task per token sequence, set type/oaicompat/id fields) was
copy-pasted across six JNI functions with only the task_type and
oaicompat enum values differing.
Fix: add build_completion_tasks_impl() to jni_server_helpers.hpp — same
header / same _impl+wrapper pattern as collect_task_results_impl.
The thin build_completion_tasks() wrapper in jllama.cpp supplies
c_llama_error so call sites stay clean.
Six call sites replaced (one line each):
requestCompletion → type (COMPLETION or INFILL) / NONE
handleChatCompletions → COMPLETION / CHAT
requestChatCompletion → COMPLETION / NONE (template pre-applied, comment kept)
handleCompletions → COMPLETION / NONE
handleCompletionsOai → COMPLETION / COMPLETION
handleInfill → INFILL / NONE
The original analysis listed 5 functions; requestCompletion was also
using the identical block and is included here.
Tests (test_jni_server_helpers.cpp):
- MissingPrompt_ReturnsFalseAndThrows: data without "prompt" → ThrowNew,
returns false, tasks empty. ctx_server=nullptr is safe because
data.at("prompt") is evaluated in its own statement before any
ctx_server member is accessed.
- MissingPrompt_TaskTypeDoesNotAffectErrorBehaviour: same with INFILL type.
Success path requires a live model and is covered by LlamaModelTest.
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…tion recv blocks

Each of the four handleSlotAction switch cases (LIST/SAVE/RESTORE/ERASE) contained the same 9-line recv → check → return block. Extract the pattern into recv_slot_task_result_impl (jni_server_helpers.hpp) and a thin wrapper recv_slot_task_result (jllama.cpp), then replace all four duplicates with a single call.
Unit tests added in test_jni_server_helpers.cpp:
- success path returns a non-null jstring and calls NewStringUTF with the result JSON
- error path returns nullptr and calls ThrowNew with the error message
- the waiting task id is removed from the queue on both success and error paths
Requires stub_NewStringUTF in the mock JNI table (added alongside the tests).
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
All five helper functions whose return value encodes success/failure or carries
a non-owning pointer must not have their result silently discarded. Adding
[[nodiscard]] turns an accidental discard into a compile-time diagnostic:
jni_helpers.hpp:
get_server_context_impl — returns nullptr on error (exception pending)
get_jllama_context_impl — returns nullptr when handle is 0
jni_server_helpers.hpp:
build_completion_tasks_impl — returns false on error (exception pending)
recv_slot_task_result_impl — returns nullptr on error (exception pending)
collect_task_results_impl — returns false on error (exception pending)
jllama.cpp (wrappers):
get_server_context, get_jllama_context, build_completion_tasks,
recv_slot_task_result, collect_task_results — all marked [[nodiscard]]
No behaviour change. C++17 is already required by the build (cxx_std_17 on
jllama_test; jllama itself uses cxx_std_11 but [[nodiscard]] is C++17 and GCC/
Clang accept it silently as an extension under -std=c++11 with no errors).
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…c_log_format delete

JNI_OnLoad creates a NewGlobalRef for both c_log_level (line 335) and c_log_format (line 336). JNI_OnUnload had two consecutive DeleteGlobalRef calls for c_log_level and none for c_log_format:
- double-deleting a global ref is undefined behaviour (JNI spec §4.4)
- c_log_format was leaked on every library unload

Fix: replace the duplicate c_log_level line with c_log_format so each of the 14 global class refs created in JNI_OnLoad is deleted exactly once in JNI_OnUnload.
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
During Finding 3, three handlers were updated to use collect_task_results(). handleChatCompletions was overlooked and still contained the identical 13-line manual recv loop. Replace it with the same collect_task_results() call used by handleCompletions, handleCompletionsOai, and handleInfill. No behaviour change; collect_task_results_impl is already unit-tested. https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…ng+json::parse pairs

Every JNI handler that receives a jstring parameter begins with the same two lines:
    std::string c_params = parse_jstring(env, jparams);
    json data = json::parse(c_params);
Extract to a one-liner helper parse_json_params(JNIEnv*, jstring) placed immediately after parse_jstring in the anonymous namespace, so the related helpers are co-located.
Replace all 9 call sites: requestCompletion, applyTemplate, handleChatCompletions, requestChatCompletion, handleCompletions, handleCompletionsOai, handleInfill, handleEmbeddings, configureParallelInference.
The ordered_json variant in jsonSchemaToGrammarBytes uses a different type and is intentionally left unchanged.
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…t_list_id+post triples

Every batch dispatch point in the file repeats the same three lines:
    ctx_server->queue_results.add_waiting_tasks(tasks);
    auto task_ids = server_task::get_list_id(tasks);
    ctx_server->queue_tasks.post(std::move(tasks));
Extract to dispatch_tasks(server_context*, vector<server_task>&), placed between build_completion_tasks and recv_slot_task_result to maintain the natural pipeline order: build → dispatch → recv.
The single function covers all nine call sites — completion (requestCompletion, handleChatCompletions, requestChatCompletion, handleCompletions, handleCompletionsOai, handleInfill) and embedding/rerank (embed, handleRerank, handleEmbeddings) — because the three-line body is identical in every case.
Side-fix: the handleEmbeddings loop bound changed from tasks.size() to task_ids.size() — tasks is in a moved-from state after dispatch_tasks, so reading its size was technically UB; task_ids has the same count and remains valid.
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…lisation blocks
Four handlers (handleChatCompletions, handleCompletions, handleCompletionsOai,
handleInfill) each ended with an identical 9-line block that assembles a JSON
response from a results vector and returns it as a jstring:
json response;
if (results.size() == 1) { response = results[0]->to_json(); }
else { response = json::array(); for (...) response.push_back(...); }
std::string s = response.dump();
return env->NewStringUTF(s.c_str());
Extract to results_to_jstring_impl (jni_server_helpers.hpp) with a thin
results_to_jstring wrapper in jllama.cpp placed immediately after
collect_task_results, keeping collect → serialise helpers adjacent.
The helper is NOT applied to receiveCompletionJson (streaming path that also
sets a "stop" field) or handleRerank (builds a different JSON schema directly
into a results_json array) — those retain their own serialisation logic.
Unit tests added in test_jni_server_helpers.cpp:
- single result → bare JSON object with correct content field
- multiple results → JSON array with correct elements in order
- empty vector → empty JSON array (degenerate, documents the contract)
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
The embed function had three code-quality issues left from an earlier refactor:
1. `json responses = json::array();` — declared after dispatch_tasks but
never read; the function builds no array of responses.
2. `json error = nullptr;` — declared but never assigned or read.
3. `result->to_json()` was called three times (lines 727, 729, 739) for the
same result object. Lines 727 (`json response_str`) and 739 (`out_res`)
both captured the return value; line 729 called it a third time inline
just to read ["message"]. Removed the unused `response_str` alias so
`out_res` is now the single authoritative capture.
No behaviour change.
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…uard blocks
requestCompletion and requestChatCompletion both ended with the same 5-line
pattern that verifies exactly one task was produced and returns its id:
if (task_ids.size() != 1) {
env->ThrowNew(c_llama_error, "multitasking currently not supported");
return 0;
}
return *task_ids.begin();
Extract to require_single_task_id_impl (jni_helpers.hpp) with the _impl /
wrapper convention used by the other helpers, placed directly after
get_jllama_context_impl so all handle/id validation helpers are co-located.
Thin wrapper require_single_task_id in jllama.cpp supplies c_llama_error.
Unit tests added to test_jni_helpers.cpp:
- exactly one id → returns that id, no throw
- empty set → returns 0, ThrowNew("multitasking currently not supported")
- multiple ids → returns 0, ThrowNew("multitasking currently not supported")
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…in 2 functions
decodeBytes and handleDetokenize both contained the same 5-line conditional:
if (!ctx_server->is_vocab_only()) {
text = tokens_to_str(ctx_server->ctx, tokens.cbegin(), tokens.cend());
} else {
text = tokens_to_str(ctx_server->vocab, tokens.cbegin(), tokens.cend());
}
Extract to a static detokenize(server_context*, vector<llama_token>&) helper
placed immediately before decodeBytes so the helper and its two call sites are
all adjacent.
Side-clean in decodeBytes: ReleaseIntArrayElements now comes before the
detokenize call (tokens vector already copied), which was the correct order
before the change; moving it next to the array acquisition makes the lifetime
clearer.
No behaviour change. No new tests: the dispatch itself depends on the
server_context runtime state (is_vocab_only) and is covered by the existing
integration test suite.
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…+post blocks in handleSlotAction

Each of the four handleSlotAction switch cases (LIST/SAVE/RESTORE/ERASE) ended with the same 4-line pattern:
    int tid = task.id;
    ctx_server->queue_results.add_waiting_task_id(tid);
    ctx_server->queue_tasks.post(std::move(task)[, priority]);
    return recv_slot_task_result(env, ctx_server, tid);
Extract to dispatch_single_task(server_context*, server_task&, bool priority), placed immediately after dispatch_tasks so the single-task and batch dispatch helpers are adjacent. The priority parameter (default false) covers case 0 (LIST/metrics), which uses post(task, true).
All four cases now resolve to task construction plus one dispatch_single_task call wrapped in recv_slot_task_result, eliminating the repeated bookkeeping.
No behaviour change; no new unit tests (the dispatch itself is integration-level, exercised by the existing slot-action Java tests).
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…ch blocks
handleChatCompletions and requestChatCompletion both contained the same 9-line
try/catch for parsing the OAI chat body:
json data;
try {
std::vector<raw_buffer> files;
data = oaicompat_chat_params_parse(body, ctx_server->oai_parser_opt, files);
} catch (const std::exception &e) {
throw_invalid_request(env, e);
return [nullptr | 0];
}
The only difference was the error sentinel (nullptr vs 0). Extract to
parse_oai_chat_params(env, ctx_server, body, json& out) → bool, placed
immediately after throw_invalid_request so both error-handling helpers are
adjacent. Callers become:
json data;
if (!parse_oai_chat_params(env, ctx_server, body, data)) return [sentinel];
No behaviour change; no new unit tests (the underlying oaicompat_chat_params_
parse is llama.cpp internals; error-path coverage is provided by the existing
Java integration tests).
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…odiscard casts, applyTemplate guard
Three issues from the CI build:
1. Build error: oaicompat_chat_params_parse takes json& (non-const) but
parse_oai_chat_params declared its body parameter as const json&, causing
a qualifier-discard error. Fix: remove const from the body parameter.
2. applyTemplate called oaicompat_chat_params_parse without a try/catch,
leaving parse errors to escape the JNI boundary. Now routes through
parse_oai_chat_params, matching handleChatCompletions and
requestChatCompletion. Also removes the redundant jtok_str local.
3. Four test call sites deliberately discard [[nodiscard]] return values
(the test only cares about the side-effects, not the return). Add (void)
casts to silence the -Wunused-result warnings:
- test_jni_helpers.cpp: 2 × get_server_context_impl
- test_jni_server_helpers.cpp: 2 × collect_task_results_impl,
2 × recv_slot_task_result_impl
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
The five-line commented block (duplicate of the active LOG_INF call immediately above it) served no purpose and would never be re-enabled in its current form. Deleted to reduce noise. https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…UTF pattern

Five JNI functions (receiveCompletionJson, handleRerank, handleEmbeddings, handleTokenize, handleDetokenize) each contained the identical two-liner:
    std::string response_str = some_json.dump();
    return env->NewStringUTF(response_str.c_str());
Extracted to json_to_jstring_impl in jni_server_helpers.hpp (placed next to results_to_jstring_impl, the analogous helper for result vectors) and a thin json_to_jstring wrapper in jllama.cpp. All five call sites updated. Three unit tests added in test_jni_server_helpers.cpp.
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
The SAVE (case 1) and RESTORE (case 2) branches of handleSlotAction were identical 8-line blocks differing only in task type and error message. Extracted to exec_slot_file_task(), placed next to dispatch_single_task so the two dispatch helpers sit together. Each case now reduces to a single call expression. https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…string pipeline Four JNI handlers (handleChatCompletions, handleCompletions, handleCompletionsOai, handleInfill) each contained the identical four-line pipeline: std::vector<server_task_result_ptr> results; results.reserve(task_ids.size()); if (!collect_task_results(...)) return nullptr; return results_to_jstring(...); Extracted to collect_and_serialize(), placed next to collect_task_results and results_to_jstring so the three serialisation helpers are co-located. All four call sites updated to a single-expression return. https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…sult exec_slot_file_task called parse_jstring and recv_slot_task_result but was placed before both definitions, causing 'use of undeclared identifier' errors on Clang (macOS CI). Moved to after parse_json_params where both helpers are already visible; note added in the doc comment explaining the placement. https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…cess path recv_slot_task_result_impl had the last remaining inline dump()+NewStringUTF in the header — the same 2-line pattern that json_to_jstring_impl (Finding L) was extracted to replace. Now all serialisation in jni_server_helpers.hpp goes through json_to_jstring_impl. https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
results_to_jstring_impl combined two concerns: building the JSON value (single-object vs array) and serialising it to a jstring. The construction logic is now in a standalone results_to_json_impl() that requires no JNI, placed immediately before results_to_jstring_impl which is reduced to a one-liner delegating to results_to_json_impl + json_to_jstring_impl. Three unit tests added in test_jni_server_helpers.cpp covering single, multiple, and empty result sets — no JNI mock needed. https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…ma_token> decodeBytes and handleDetokenize both contained the identical 4-line pattern: GetArrayLength / GetIntArrayElements / vector(elems, elems+len) / Release Extracted to jint_array_to_tokens_impl in jni_helpers.hpp (no server types needed), placed next to parse_jbytes. Uses JNI_ABORT uniformly — decodeBytes previously used flag 0 which caused an unnecessary copy-writeback on some JVM implementations. Thin jint_array_to_tokens wrapper added in jllama.cpp. Three unit tests added in test_jni_helpers.cpp with array-stub fixture verifying element copy and that JNI_ABORT is passed to Release. https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
… extraction Five sites used the identical expression: result->to_json()["message"].get<std::string>() Extracted to get_result_error_message() in jni_server_helpers.hpp, placed before collect_task_results_impl (the first caller in the header). Updated in receiveCompletionJson, embed, handleRerank (jllama.cpp) and both _impl functions (jni_server_helpers.hpp). Two unit tests added. https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
loadModel had two near-identical 4-line error paths (vocab-only load failure and full-model load failure) differing only in the error message: delete ctx_server; delete jctx; llama_backend_free(); env->ThrowNew(...); Replaced with a local fail_load lambda defined once after jctx is allocated, capturing env/ctx_server/jctx by reference. Each call site is now a single self-documenting expression. https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…ity check handleInfill contained a 10-line block that checked for the three FIM tokens (prefix, suffix, middle) and threw a descriptive error if any were missing. Extracted to check_infill_support_impl() in jni_server_helpers.hpp (vocab and error_class passed explicitly), with a thin check_infill_support wrapper in jllama.cpp placed immediately before handleInfill. The handler body is now a single guard line: if (!check_infill_support(env, ctx_server)) return nullptr; Note: unit testing check_infill_support_impl requires a real llama_vocab (llama_vocab_fim_* are opaque C functions with no link-time seam), so no standalone test is included — coverage comes via the integration test suite. https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…e checks
handleInfill had two identical 3-line guards:
if (!data.contains("input_prefix")) { ThrowNew(...); return nullptr; }
if (!data.contains("input_suffix")) { ThrowNew(...); return nullptr; }
Extracted to require_json_field_impl() in jni_helpers.hpp (json + error_class
passed explicitly), with a thin require_json_field wrapper in jllama.cpp placed
next to parse_json_params. Both call sites now reduce to single guard lines.
Three unit tests added in test_jni_helpers.cpp covering: present field (no throw),
missing field (throws with correct message), empty json object.
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
…allers

get_result_error_message and json_to_jstring_impl were defined after recv_slot_task_result_impl and results_to_jstring_impl, which call them, causing 'use of undeclared identifier' errors on Clang (macOS CI). Reordered to the correct dependency sequence:
1. get_result_error_message (leaf — no deps)
2. json_to_jstring_impl (leaf — no deps)
3. build_completion_tasks_impl
4. recv_slot_task_result_impl (uses 1 + 2)
5. collect_task_results_impl (uses 1)
6. results_to_json_impl
7. results_to_jstring_impl (uses 2 + 6)
8. check_infill_support_impl
Added a declaration-order comment block at the top of the file to prevent future ordering regressions.
https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
Saves the sharpened duplication-analysis query as a reusable slash command. Covers 8 categories: short repeated expressions, multi-line copy-paste, pipeline compositions, near-identical switch arms, mixed JNI/logic concerns, inconsistent helper usage, local cleanup sequences, and dead code. Reports only — no file modifications. https://claude.ai/code/session_01FuMTQYYhDX9bPB8WPNwGda
No description provided.