feat(benchmarks): standardize LLM/LoRA artifact preflight for benchmarks by Copilot · Pull Request #4665 · makr-code/ThemisDB

Copilot · 2026-04-15T05:17:34Z

LLM/LoRA benchmarks were silently producing invalid results (or crashing): MultiLoRAManager::loadLoRA() validates that the file exists, but the benchmarks passed hardcoded /loras/... paths that never exist. Missing-artifact failures also gave no actionable guidance.

Changes

New: Unit tests for preflight utilities (`tests/test_artifact_preflight.cpp`, PF-01..PF-14)

The tests cover modelBaseDir() env-var priority and fallbacks, stubModelsEnabled() case-insensitive parsing, resolveModelPath()/resolveLoraPath() path resolution, and the LLMArtifactPreflight::create() success, failure, and LoRA-required modes. They also verify that error messages contain setup guidance.
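The header itself is not shown in this PR description, so the following is only a minimal sketch of the kind of behavior those tests describe: an environment variable that wins when set, a fallback otherwise, and case-insensitive flag parsing. `THEMIS_MODEL_DIR` comes from the error message in this PR; the `sketch` namespace, the default directory `models`, the accepted flag spellings, and the `loras/stub-adapter.bin` layout are assumptions made for illustration, not the project's actual values.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <filesystem>
#include <string>

namespace sketch {

// Hypothetical default used when THEMIS_MODEL_DIR is unset or empty.
constexpr const char* kDefaultModelDir = "models";

// Case-insensitive truthiness check. The accepted spellings are an assumption.
inline bool parseFlag(const char* raw) {
    if (raw == nullptr) return false;
    std::string v(raw);
    std::transform(v.begin(), v.end(), v.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return v == "1" || v == "true" || v == "on" || v == "yes";
}

// The env value wins when it is set and non-empty; otherwise use the default.
inline std::filesystem::path baseDirFrom(const char* env_value) {
    if (env_value != nullptr && *env_value != '\0') {
        return std::filesystem::path(env_value);
    }
    return std::filesystem::path(kDefaultModelDir);
}

// Reads the real environment and delegates to the pure helper above.
inline std::filesystem::path modelBaseDir() {
    return baseDirFrom(std::getenv("THEMIS_MODEL_DIR"));
}

// Hypothetical layout: <base>/loras/stub-adapter.bin.
inline std::filesystem::path loraPathUnder(const std::filesystem::path& base) {
    return base / "loras" / "stub-adapter.bin";
}

}  // namespace sketch
```

Splitting the environment lookup from the pure `baseDirFrom` helper keeps the priority and fallback logic testable without mutating the process environment, which is one plausible way to structure tests like PF-01..PF-14.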

`tests/CMakeLists.txt`

Added ${CMAKE_SOURCE_DIR}/benchmarks to include_directories so tests can include benchmark_artifact_preflight.h.

`benchmarks/bench_llm_inference_performance.cpp` — 9 functions

`benchmarks/bench_lora_auto_binding.cpp` — 12 functions

`benchmarks/bench_lora_inline.cpp` — 5 functions

All three files now:

- `#include "benchmark_artifact_preflight.h"`
- Guard each LoRA-dependent function with THEMIS_BENCH_SKIP_IF_ARTIFACT_MISSING
- Replace every hardcoded /loras/....bin with resolveLoraPath()

// Before – silently benchmarks a failed load
mgr.loadLoRA("adapter-a", "/loras/adapter-a.bin", base_model, 1.0f);

// After – skips with actionable guidance if stub isn't present
THEMIS_BENCH_SKIP_IF_ARTIFACT_MISSING(state, themis::bench::resolveLoraPath(), "LoRA adapter");
const std::string lora_path = themis::bench::resolveLoraPath();
mgr.loadLoRA("adapter-a", lora_path, base_model, 1.0f);

On missing artifact the benchmark emits: "LLM artefact preflight FAILED: LoRA adapter not found. Run scripts/download_models.sh --stub-only or set THEMIS_MODEL_DIR. See docs/BENCHMARK_RUNBOOK.md §LLM/LoRA Model Setup."
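For readers unfamiliar with the pattern, here is a hedged sketch of how a skip guard of this shape could work. It is not the definition of THEMIS_BENCH_SKIP_IF_ARTIFACT_MISSING, which this PR description does not show. `FakeState` is a stand-in for the benchmark framework's state object (the commit message mentions SkipWithError, which `benchmark::State` in Google Benchmark provides), and `SKETCH_SKIP_IF_ARTIFACT_MISSING`, `missingArtifactMessage`, and `BM_LoadAdapter` are hypothetical names.

```cpp
#include <filesystem>
#include <string>

namespace sketch {

// Stand-in for the framework's state object; only the one call the guard needs.
struct FakeState {
    bool skipped = false;
    std::string error;
    void SkipWithError(const std::string& msg) {
        skipped = true;
        error = msg;
    }
};

// Builds a message in the style quoted in this PR, with the artifact label filled in.
inline std::string missingArtifactMessage(const std::string& label) {
    return "LLM artefact preflight FAILED: " + label +
           " not found. Run scripts/download_models.sh --stub-only or set "
           "THEMIS_MODEL_DIR. See docs/BENCHMARK_RUNBOOK.md §LLM/LoRA Model Setup.";
}

}  // namespace sketch

// Hypothetical guard: report the error on the state and return early from the
// enclosing (void) benchmark function when the artifact is missing.
#define SKETCH_SKIP_IF_ARTIFACT_MISSING(state, path, label)               \
    do {                                                                  \
        if (!std::filesystem::exists(path)) {                             \
            (state).SkipWithError(sketch::missingArtifactMessage(label)); \
            return;                                                       \
        }                                                                 \
    } while (0)

namespace sketch {

// Example benchmark body: the load is only counted when the guard passes.
inline void BM_LoadAdapter(FakeState& state,
                           const std::filesystem::path& lora_path,
                           int& loads) {
    SKETCH_SKIP_IF_ARTIFACT_MISSING(state, lora_path, "LoRA adapter");
    ++loads;  // a real benchmark would call mgr.loadLoRA(...) here
}

}  // namespace sketch
```

Because the guard returns before any timed work, a missing adapter shows up as an explicitly skipped benchmark with setup guidance rather than as a measurement of a failed load.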

Type of Change

Breaking Change Checklist

MAJOR version bump planned in VERSION and CMakeLists.txt
Migration guide added in docs/migration/
Announcement prepared for GitHub Discussions (≥ 2 weeks before release)
CHANGELOG ### Removed / ### Changed section updated

Testing

Unit tests added/updated
Integration tests added/updated
Manual testing performed
Benchmarks run (if performance-sensitive change)

📚 Research & Knowledge (if applicable)

Is this PR based on scientific paper(s) or best practices?
- If YES: research files created in /docs/research/?
- If YES: linked in the module README under "Wissenschaftliche Grundlagen" (scientific foundations)?
- If YES: recorded in /docs/research/implementation_influence/?

Relevant sources:

Paper:
Best Practice:
Architecture Decision:

Checklist

Code follows project style guidelines (clang-format / clang-tidy)
Self-review completed
Documentation updated (if needed)
CHANGELOG.md updated under [Unreleased]
No new warnings introduced
Security-sensitive paths reviewed by security maintainer (if applicable)

- Create tests/test_artifact_preflight.cpp (PF-01..PF-14) with unit tests for LLMArtifactPreflight, modelBaseDir(), stubModelsEnabled(), resolveModelPath() and resolveLoraPath()
- Add ${CMAKE_SOURCE_DIR}/benchmarks to tests/CMakeLists.txt include dirs so test_artifact_preflight.cpp can include the preflight header
- bench_llm_inference_performance.cpp: include benchmark_artifact_preflight.h, replace all hardcoded /loras/... paths with resolveLoraPath(), add THEMIS_BENCH_SKIP_IF_ARTIFACT_MISSING guards to every LoRA benchmark
- bench_lora_auto_binding.cpp: same treatment – 12 functions updated
- bench_lora_inline.cpp: same treatment – 5 functions updated

Missing artifact errors now produce actionable SkipWithError messages pointing to scripts/download_models.sh --stub-only and docs/BENCHMARK_RUNBOOK.md §LLM/LoRA Model Setup.

Agent-Logs-Url: https://github.com/makr-code/ThemisDB/sessions/c473afc7-d6a4-4be6-8b54-1c300da149c2
Co-authored-by: makr-code <150588092+makr-code@users.noreply.github.com>

Initial plan

624f32e

Copilot AI assigned Copilot and makr-code Apr 15, 2026

Copilot started work on behalf of makr-code April 15, 2026 05:17 View session

Copilot AI linked an issue Apr 15, 2026 that may be closed by this pull request

[Agentic AI][LLM] Standardize model artifact preflight for benchmarks #4540

Closed

Copilot AI changed the title ~~[WIP] Copilot Request~~ feat(benchmarks): standardize LLM/LoRA artifact preflight for benchmarks Apr 15, 2026

Copilot AI requested a review from makr-code April 15, 2026 06:23

Copilot finished work on behalf of makr-code April 15, 2026 06:23

makr-code marked this pull request as ready for review April 15, 2026 06:23

makr-code merged commit 58bc3c4 into develop Apr 15, 2026
10 checks passed

makr-code merged 2 commits into develop from copilot/fix-150588092-1085539157-da6bd74d-5390-4327-b0df-ba53149f330a
