Add regression tests for issues #80, #95, #98, #102 by bernardladenthin · Pull Request #185 · bernardladenthin/java-llama.cpp

bernardladenthin · 2026-05-22T20:52:28Z

Summary

Added four JUnit regression tests covering upstream issues #80, #95, #98, and #102 to confirm the fixes and prevent regressions
Updated docs/history/49be664_open_issues.md to reflect test coverage and upgrade verdicts from LIKELY FIXED → FIXED (pending CI green)
Extended .github/workflows/publish.yml to download nomic-embed-text-v1.5.f16.gguf for the embedding test on linux-x86_64
Added CLAUDE.md section documenting local Java test build workflow and optional model properties

Changes

New regression tests (commit 713d426):

MemoryManagementTest#testOpenCloseLoopDoesNotLeak: 20-iteration open/close loop; asserts VmRSS growth < 200 MB on Linux (smoke test on other platforms). Covers issue #102.
MemoryManagementTest#testOpenCloseWithoutGeneration: 20 open + immediate-close cycles without generation; guards against the half-initialized-race segfault. Covers issue #80.
LlamaEmbeddingsTest#testNomicEmbedLoads: loads nomic-embed-text-v1.5.f16.gguf with enableEmbedding() and asserts 768-dim output. Gated on the net.ladenthin.llama.nomic.path property. Covers issue #98.
LlamaModelTest#testIteratorTerminatesOnRepetitivePrompt: drives the iterator with a repetitive prompt at nPredict=30 and asserts termination within nPredict+1 outputs. Covers issue #95.
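
For readers unfamiliar with the VmRSS check mentioned for testOpenCloseLoopDoesNotLeak, the following is a minimal sketch of how such a measurement could be taken in plain Java. It is not the code from commit 713d426: the class name VmRssProbe and all of its methods are hypothetical, and the only details taken from the description above are the use of /proc/self/status, the 200 MB budget, and the fallback to a smoke test where that file is absent.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical helper for illustration; not taken from the PR. */
final class VmRssProbe {

    /** Extracts the VmRSS value (in kB) from /proc/self/status-style lines. */
    static OptionalLong parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // Typical form: "VmRSS:      123456 kB"
                String[] parts = line.substring("VmRSS:".length()).trim().split("\\s+");
                try {
                    return OptionalLong.of(Long.parseLong(parts[0]));
                } catch (NumberFormatException e) {
                    return OptionalLong.empty();
                }
            }
        }
        return OptionalLong.empty();
    }

    /** Reads the current process RSS; empty where /proc/self/status is absent. */
    static OptionalLong currentVmRssKb() {
        Path status = Paths.get("/proc/self/status");
        if (!Files.isReadable(status)) {
            return OptionalLong.empty();
        }
        try {
            return parseVmRssKb(Files.readAllLines(status));
        } catch (IOException e) {
            return OptionalLong.empty();
        }
    }

    /** True when growth stays under the limit, or when either sample is missing. */
    static boolean withinBudget(OptionalLong beforeKb, OptionalLong afterKb, long limitMb) {
        if (!beforeKb.isPresent() || !afterKb.isPresent()) {
            return true; // smoke-test mode: nothing to compare
        }
        long deltaKb = afterKb.getAsLong() - beforeKb.getAsLong();
        return deltaKb < limitMb * 1024L;
    }
}
```

A test built on this sketch would sample currentVmRssKb() before and after the open/close loop and assert withinBudget(before, after, 200). Returning true when a sample is missing is what turns the check into a no-crash smoke test on non-Linux hosts.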

Documentation updates:

docs/history/49be664_open_issues.md: Updated issue summaries to reference test names and commit 713d426; updated bottom-line summary and status table to reflect "LIKELY FIXED → FIXED on CI green" verdicts.
CLAUDE.md: Added "Building the native library for local Java tests" section with platform-specific library paths, end-to-end workflow, and optional model property table.
TestConstants.java: Added PROP_NOMIC_MODEL_PATH and NOMIC_EMBED_DIM constants.

CI integration:

.github/workflows/publish.yml: Added NOMIC_EMBED_MODEL_URL and NOMIC_EMBED_MODEL_NAME env vars; linux-x86_64 Java test job downloads the model and passes -Dnet.ladenthin.llama.nomic.path=… to mvn test.

Test plan

All four new JUnit tests compile and self-skip cleanly when their model files are absent
Tests pass locally with required models present
CI will run all four tests green on linux-x86_64 (model auto-downloaded); other platforms skip gracefully
Existing tests remain unaffected
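
The self-skip behaviour in the test plan depends on checking for an optional model file before the test body runs. The sketch below shows one way that lookup could be written with only the standard library. The class OptionalModel and its resolve method are hypothetical names introduced for illustration; only the property name net.ladenthin.llama.nomic.path comes from this PR.

```java
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical helper for illustration; not taken from the PR. */
final class OptionalModel {

    /** Property name as given in the PR description. */
    static final String PROP_NOMIC_MODEL_PATH = "net.ladenthin.llama.nomic.path";

    /** Returns the model path if the property is set and points at a regular file. */
    static Optional<Path> resolve(String propertyName) {
        String value = System.getProperty(propertyName);
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty(); // property unset: caller should skip
        }
        try {
            Path path = Paths.get(value.trim());
            return Files.isRegularFile(path) ? Optional.of(path) : Optional.empty();
        } catch (InvalidPathException e) {
            return Optional.empty(); // malformed path: treat as absent
        }
    }
}
```

In a JUnit 4 test, the result could feed org.junit.Assume.assumeTrue(OptionalModel.resolve(OptionalModel.PROP_NOMIC_MODEL_PATH).isPresent()), so that jobs without the -D flag report the test as skipped rather than failed.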

Related issues / PRs

Closes #80, #95, #98, #102 (pending first green CI run)

Checklist

I have read CONTRIBUTING.md and CODE_OF_CONDUCT.md
My commits follow Conventional Commits
No security-sensitive changes

https://claude.ai/code/session_01LR7Gw1pyKS7wvxXfZjnxNW

Adds four small JUnit tests proposed in the verification plan section of docs/history/49be664_open_issues.md to upgrade the corresponding upstream issues from LIKELY FIXED to FIXED:

- MemoryManagementTest#testOpenCloseLoopDoesNotLeak (#102) - 20-iteration open/close loop; on Linux asserts VmRSS delta < 200 MB. Degenerates to a no-crash smoke test on non-Linux hosts where /proc/self/status is absent.
- MemoryManagementTest#testOpenCloseWithoutGeneration (#80) - 20 open + immediate close without any generation, exercises the half-initialised worker race closed by the double server.terminate() in jllama.cpp.
- LlamaModelTest#testIteratorTerminatesOnRepetitivePrompt (#95) - asserts the iterator terminates within nPredict+1 steps on a deliberately repetitive prompt.
- LlamaEmbeddingsTest#testNomicEmbedLoads (#98) - gated on system property net.ladenthin.llama.nomic.path; reproduces the reporter's batch/ubatch config plus the fix (enableEmbedding()), and asserts a 768-dim vector for nomic-embed-text-v1.5.

Wires up the optional nomic GGUF download in the linux-x86_64 Java test job in .github/workflows/publish.yml. Other test jobs cleanly self-skip via Assume because the system property is unset.

Documents the local native-build workflow in CLAUDE.md - per-host output paths, mvn-cmake handoff, optional model handling, and the restricted-network caveat for environments that block huggingface.co.

https://claude.ai/code/session_01LR7Gw1pyKS7wvxXfZjnxNW
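
The nPredict+1 bound used by the iterator test can be checked without risking a hang if the iterator never terminates. The sketch below is a generic illustration under that assumption, not the shipped test: BoundedDrain and drain are hypothetical names, and any java.util.Iterator stands in for the model's output iterator.

```java
import java.util.Iterator;

/** Hypothetical helper for illustration; not taken from the PR. */
final class BoundedDrain {

    /**
     * Consumes at most maxOutputs items from the iterator.
     * Returns the number of items consumed, or -1 if the iterator
     * still has items after maxOutputs have been pulled.
     */
    static int drain(Iterator<?> outputs, int maxOutputs) {
        int count = 0;
        while (outputs.hasNext()) {
            if (count == maxOutputs) {
                return -1; // bound exceeded: stop instead of looping forever
            }
            outputs.next();
            count++;
        }
        return count;
    }
}
```

With nPredict=30, a test along these lines would call drain(iterator, 31) and fail on a return value of -1.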

Updates docs/history/49be664_open_issues.md to reflect that the four JUnit regression tests called for in the verification plan have been added on this branch:

- Deep-dive verdict guide now lists each test name and self-skip behaviour next to its issue bullet
- Per-issue Status blocks for #80, #95, #98, #102 annotated as "LIKELY FIXED -> FIXED on CI green" with the covering test
- Status overview table rows for the same four issues updated
- "What the original issues actually contain" feasibility table marks all four as DONE with the commit reference
- "Concrete test plan" gains a status callout noting the as-shipped implementation matches the sketches
- "Recommended sequencing" step 1 marked DONE and enumerates what shipped; remaining steps (#86 docs, #103/#34 typed image API, Android emulator CI) carried forward as the next deliverables

No code or behaviour change, documentation only.

https://claude.ai/code/session_01LR7Gw1pyKS7wvxXfZjnxNW

sonarqubecloud · 2026-05-22T20:53:03Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

* docs: mark #80/#95/#98/#102 as FIXED now that PR #185 is merged

PR #185 (commit cba693c) merged the four regression tests sketched in the 49be664 open-issues verification plan. Update the per-issue blocks, the status overview table, the top-level deep-dive verdict guide, and the recommended-sequencing section to reflect that #80, #95, #98 and #102 are now FIXED (no longer "LIKELY FIXED → FIXED on CI green").

https://claude.ai/code/session_01R3jVWHsB3zymwAQtj8GT43

* docs: add README "Choosing the right classifier" section

Closes the documentation gap for issue #86 (does the CUDA jar fall back to CPU?) and the 32-bit Android tail of #121 (armeabi-v7a not published). The new section enumerates the three published classifiers (default CPU, cuda13-linux-x86-64, opencl-android-aarch64), their backends, target platforms, and runtime requirements. It explicitly states that the CUDA JAR is CUDA-only at runtime (it dlopens libcudart.so.13/libcublas.so.13 and has no automatic CPU fallback) and that Android armeabi-v7a is not shipped as a released artifact.

Updates docs/history/49be664_open_issues.md to mark #86 as FIXED-AS-DOCUMENTED and #121 as FIXED (64-bit) with the 32-bit limitation now documented.

https://claude.ai/code/session_01R3jVWHsB3zymwAQtj8GT43

---------

Co-authored-by: Claude <noreply@anthropic.com>

claude added 2 commits May 22, 2026 20:24

bernardladenthin had a problem deploying to startgate May 22, 2026 20:52 — with GitHub Actions Error

bernardladenthin merged commit cba693c into main May 22, 2026
4 of 9 checks passed

bernardladenthin deleted the claude/sweet-fermi-1yrRK branch May 22, 2026 20:52

bernardladenthin mentioned this pull request May 22, 2026

Document classifier selection and mark 4 issues FIXED via PR #185 #186

Merged

5 tasks

bernardladenthin merged 2 commits into main from claude/sweet-fermi-1yrRK
