feat(benchmarks): standardize LLM/LoRA artifact preflight for benchmarks#4665

Merged
makr-code merged 2 commits into develop from
copilot/fix-150588092-1085539157-da6bd74d-5390-4327-b0df-ba53149f330a
Apr 15, 2026

Copilot AI commented Apr 15, 2026

LLM/LoRA benchmarks were silently producing invalid results (or crashing): MultiLoRAManager::loadLoRA() validates file existence, but the benchmarks passed hardcoded /loras/... paths that never exist, so they measured a failed load. Missing-artifact failures also gave no actionable guidance.

Changes

New: Unit tests for preflight utilities (tests/test_artifact_preflight.cpp, PF-01..PF-14)

Tests cover modelBaseDir() env-var priority and fallbacks, stubModelsEnabled() case-insensitive parsing, resolveModelPath()/resolveLoraPath() path resolution, and LLMArtifactPreflight::create() in its success, failure, and LoRA-required modes. They also verify that error messages contain setup guidance.
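The behaviour these tests exercise can be pictured with a minimal sketch. It is not the code in benchmark_artifact_preflight.h: the `sketch` namespace, the "models" fallback directory, the "loras" subdirectory, and the accepted truthy spellings are all assumptions made for illustration. Only the helper names and the THEMIS_MODEL_DIR variable appear in this PR, and the signatures below are stand-ins chosen so the logic can be tested without touching the real environment.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <filesystem>
#include <optional>
#include <string>

namespace sketch {

// Reads an environment variable; an empty value is treated as unset.
inline std::optional<std::string> readEnv(const char* name) {
    assert(name != nullptr);
    const char* value = std::getenv(name);
    if (value == nullptr || *value == '\0') return std::nullopt;
    return std::string(value);
}

// Picks the base directory: an explicit override wins, otherwise a fallback.
// "models" is a hypothetical default, not taken from the PR.
inline std::filesystem::path modelBaseDir(
        const std::optional<std::string>& override_dir) {
    if (override_dir && !override_dir->empty()) return *override_dir;
    return "models";
}

// Case-insensitive flag parsing; the accepted spellings are assumed.
inline bool isTruthy(std::string value) {
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return value == "1" || value == "true" || value == "on" || value == "yes";
}

// Joins the base directory with an adapter file name.
// The "loras" subdirectory is an assumed layout.
inline std::filesystem::path resolveLoraPath(
        const std::filesystem::path& base, const std::string& file_name) {
    return base / "loras" / file_name;
}

}  // namespace sketch
```

A caller would typically combine these as `sketch::resolveLoraPath(sketch::modelBaseDir(sketch::readEnv("THEMIS_MODEL_DIR")), "adapter-a.bin")`. Passing the environment value in as a parameter keeps the priority and fallback logic testable without mutating process state.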

tests/CMakeLists.txt

Added ${CMAKE_SOURCE_DIR}/benchmarks to include_directories so tests can include benchmark_artifact_preflight.h.

benchmarks/bench_llm_inference_performance.cpp — 9 functions

benchmarks/bench_lora_auto_binding.cpp — 12 functions

benchmarks/bench_lora_inline.cpp — 5 functions

All three files now:

  • #include "benchmark_artifact_preflight.h"
  • Guard each LoRA-dependent function with THEMIS_BENCH_SKIP_IF_ARTIFACT_MISSING
  • Replace every hardcoded /loras/....bin with resolveLoraPath()

```cpp
// Before – silently benchmarks a failed load
mgr.loadLoRA("adapter-a", "/loras/adapter-a.bin", base_model, 1.0f);

// After – skips with actionable guidance if stub isn't present
THEMIS_BENCH_SKIP_IF_ARTIFACT_MISSING(state, themis::bench::resolveLoraPath(), "LoRA adapter");
const std::string lora_path = themis::bench::resolveLoraPath();
mgr.loadLoRA("adapter-a", lora_path, base_model, 1.0f);
```

On missing artifact the benchmark emits: "LLM artefact preflight FAILED: LoRA adapter not found. Run scripts/download_models.sh --stub-only or set THEMIS_MODEL_DIR. See docs/BENCHMARK_RUNBOOK.md §LLM/LoRA Model Setup."
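One way such a guard could be structured is sketched below. This is an illustration under stated assumptions, not the definition of THEMIS_BENCH_SKIP_IF_ARTIFACT_MISSING: `FakeState`, `missingArtifactMessage`, `skipIfArtifactMissing`, and `SKETCH_SKIP_IF_ARTIFACT_MISSING` are hypothetical names, and `FakeState` only imitates the `SkipWithError` call that Google Benchmark's `benchmark::State` provides. The message text is copied from the PR description.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace sketch {

// Stand-in for benchmark::State; exposes only the call the guard needs.
struct FakeState {
    bool skipped = false;
    std::string error;
    void SkipWithError(const std::string& msg) {
        skipped = true;
        error = msg;
    }
};

// Builds the guidance text quoted in the PR description.
inline std::string missingArtifactMessage(const std::string& label) {
    assert(!label.empty());
    return "LLM artefact preflight FAILED: " + label + " not found. "
           "Run scripts/download_models.sh --stub-only or set THEMIS_MODEL_DIR. "
           "See docs/BENCHMARK_RUNBOOK.md §LLM/LoRA Model Setup.";
}

// Returns true and marks the state as skipped when the file is absent.
template <typename State>
bool skipIfArtifactMissing(State& state, const std::filesystem::path& artifact,
                           const std::string& label) {
    std::error_code ec;
    if (std::filesystem::exists(artifact, ec) && !ec) return false;
    state.SkipWithError(missingArtifactMessage(label));
    return true;
}

}  // namespace sketch

// Hypothetical macro shape: leave the benchmark function early on a skip.
#define SKETCH_SKIP_IF_ARTIFACT_MISSING(state, path, label)            \
    do {                                                               \
        if (sketch::skipIfArtifactMissing((state), (path), (label))) { \
            return;                                                    \
        }                                                              \
    } while (0)
```

Keeping the check in a function template and the early `return` in a thin macro means the logic can be unit-tested with a fake state, while the macro only does the one thing a function cannot: exit the calling benchmark.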

Type of Change

  • Bug fix (non-breaking)
  • New feature (non-breaking)
  • Refactoring (non-breaking)
  • Documentation
  • Breaking change (requires MAJOR version bump — see VERSIONING.md)
  • Security fix
  • Other:

Breaking Change Checklist

  • MAJOR version bump planned in VERSION and CMakeLists.txt
  • Migration guide added in docs/migration/
  • Announcement prepared for GitHub Discussions (≥ 2 weeks before release)
  • CHANGELOG ### Removed / ### Changed section updated

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • Benchmarks run (if performance-sensitive change)

📚 Research & Knowledge (if applicable)

  • Is this PR based on scientific paper(s) or best practices?
    • If YES: research files created in /docs/research/?
    • If YES: linked in the module README under "Wissenschaftliche Grundlagen" (scientific foundations)?
    • If YES: recorded in /docs/research/implementation_influence/?

Relevant sources:

  • Paper:
  • Best Practice:
  • Architecture Decision:

Checklist

  • Code follows project style guidelines (clang-format / clang-tidy)
  • Self-review completed
  • Documentation updated (if needed)
  • CHANGELOG.md updated under [Unreleased]
  • No new warnings introduced
  • Security-sensitive paths reviewed by security maintainer (if applicable)

- Create tests/test_artifact_preflight.cpp (PF-01..PF-14) with unit
  tests for LLMArtifactPreflight, modelBaseDir(), stubModelsEnabled(),
  resolveModelPath() and resolveLoraPath()
- Add ${CMAKE_SOURCE_DIR}/benchmarks to tests/CMakeLists.txt include
  dirs so test_artifact_preflight.cpp can include the preflight header
- bench_llm_inference_performance.cpp: include benchmark_artifact_preflight.h,
  replace all hardcoded /loras/... paths with resolveLoraPath(), add
  THEMIS_BENCH_SKIP_IF_ARTIFACT_MISSING guards to every LoRA benchmark
- bench_lora_auto_binding.cpp: same treatment – 12 functions updated
- bench_lora_inline.cpp: same treatment – 5 functions updated

Missing artifact errors now produce actionable SkipWithError messages
pointing to scripts/download_models.sh --stub-only and
docs/BENCHMARK_RUNBOOK.md §LLM/LoRA Model Setup.

Agent-Logs-Url: https://github.com/makr-code/ThemisDB/sessions/c473afc7-d6a4-4be6-8b54-1c300da149c2

Co-authored-by: makr-code <150588092+makr-code@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Copilot Request" to "feat(benchmarks): standardize LLM/LoRA artifact preflight for benchmarks" Apr 15, 2026
Copilot AI requested a review from makr-code April 15, 2026 06:23
@makr-code makr-code marked this pull request as ready for review April 15, 2026 06:23
@makr-code makr-code merged commit 58bc3c4 into develop Apr 15, 2026
10 checks passed

Development

Successfully merging this pull request may close these issues.

[Agentic AI][LLM] Standardize model artifact preflight for benchmarks
