fix: resolve intermittent LLM unavailable prompts with layered defense by blackaxgit · Pull Request #7 · blackaxgit/clx

blackaxgit · 2026-03-17T04:23:30Z

Summary

- Shorter health check timeout (2s instead of 60s): reduces the delay when Ollama is unreachable from ~60s to ~2s
- default_decision fallback: all three LLM failure paths (client creation, health check, generation) now respect the user's configured default_decision instead of hardcoding "ask"
- SQLite decision cache: previously validated commands skip Ollama entirely on repeat invocations (Allow TTL: 1h, Ask TTL: 15m)
- File-based health cache: shares Ollama health status between short-lived hook processes via ~/.clx/data/ollama_health (30s TTL)
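To illustrate the first item, here is a minimal sketch of a bounded reachability probe. It only opens a TCP connection, whereas the real check in `clx-core/src/ollama.rs` presumably talks HTTP to Ollama. The function name `is_reachable` and the `max_attempts` parameter are hypothetical, and how the PR counts its "1 retry" is not specified here.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Hypothetical probe (not the PR's code): returns true if a TCP connection
/// to `addr` succeeds within `timeout` on any of `max_attempts` tries.
/// Worst-case latency is roughly `timeout * max_attempts`.
/// Note: `connect_timeout` returns an error for a zero `timeout`.
pub fn is_reachable(addr: SocketAddr, timeout: Duration, max_attempts: u32) -> bool {
    (0..max_attempts).any(|_| TcpStream::connect_timeout(&addr, timeout).is_ok())
}
```

A caller might pass `Duration::from_secs(2)` to mirror the 2s budget described above; the point of the short timeout is that a dead endpoint costs seconds rather than a minute.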

Details

Each hook invocation is a separate OS process with no shared in-memory state. When Ollama (Docker) was briefly unreachable, every command that reached the LLM check triggered a "requires confirmation" prompt. This PR implements four layered defenses per the runbook (docs/runbook-llm-unavailable-fix.md).
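Because separate processes cannot share memory, the health status has to live somewhere both can read. The following is a rough sketch of a file-backed health cache with a 30s TTL. The on-disk format (`<status> <unix_seconds>`), the `Health` enum, and both function names are assumptions for illustration; the actual format used by `clx-core/src/ollama_health.rs` is not shown in this PR description.

```rust
use std::fs;
use std::path::Path;

/// Assumed status type; the PR only states that a stale file yields Unknown.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Health {
    Healthy,
    Unhealthy,
    Unknown,
}

/// 30s TTL, as stated in the summary.
const TTL_SECS: u64 = 30;

/// Hypothetical writer: stores "<status> <unix_seconds>" at `path`.
pub fn write_health(path: &Path, healthy: bool, at_secs: u64) -> std::io::Result<()> {
    let status = if healthy { "healthy" } else { "unhealthy" };
    fs::write(path, format!("{status} {at_secs}"))
}

/// Hypothetical reader: a missing, malformed, or stale (>30s) file is Unknown.
pub fn read_health(path: &Path, now_secs: u64) -> Health {
    let Ok(text) = fs::read_to_string(path) else {
        return Health::Unknown;
    };
    let mut parts = text.split_whitespace();
    let (Some(status), Some(ts)) = (parts.next(), parts.next()) else {
        return Health::Unknown;
    };
    let Ok(ts) = ts.parse::<u64>() else {
        return Health::Unknown;
    };
    if now_secs.saturating_sub(ts) > TTL_SECS {
        return Health::Unknown;
    }
    match status {
        "healthy" => Health::Healthy,
        "unhealthy" => Health::Unhealthy,
        _ => Health::Unknown,
    }
}
```

Passing the current time in as a parameter keeps the staleness rule testable without sleeping; a real hook would presumably derive it from `SystemTime::now()`.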

Files changed (10 files, +856/-30)

| File | Change |
| --- | --- |
| `clx-core/src/ollama.rs` | 2s health check timeout, 1 retry max |
| `clx-core/src/config.rs` | `cache_enabled`, `cache_allow_ttl_secs`, `cache_ask_ttl_secs` fields |
| `clx-core/src/storage/migration.rs` | Schema v3→v4, `validation_cache` table |
| `clx-core/src/storage/validation_cache.rs` | NEW: cache CRUD + cleanup (7 tests) |
| `clx-core/src/ollama_health.rs` | NEW: file-based health cache (5 tests) |
| `clx-core/src/lib.rs` | Export `ollama_health` module |
| `clx-hook/src/hooks/pre_tool_use.rs` | Integrate all 4 fixes + LLM generation failure handling |
| `clx-hook/tests/integration.rs` | 3 new integration tests for default_decision fallback |

Test plan

- All workspace tests pass (`cargo test --workspace`, 394+ tests)
- `cargo clippy --workspace` clean
- Live test: unreachable Ollama + default_decision: allow → auto-allows
- Live test: unreachable Ollama + default_decision: deny → auto-denies
- Live test: whitelisted commands still auto-allow via L0 (no regression)
- Live test: dangerous commands still blocked by L0 rules
- Live test: health check completes in ~1-2s (was 60s+)
- Schema migration v3→v4 verified on real database

🤖 Generated with Claude Code

Each CLX hook invocation is a separate OS process, so the in-memory ValidationCache is always empty. Add a persistent SQLite-backed cache so previously-validated commands skip Ollama entirely.

- Migrate schema to v4 with validation_cache table and expiry index
- Add get_cached_decision, cache_decision, cleanup_expired_cache methods
- Export CachedDecision struct from storage module
- Add cache_enabled, cache_allow_ttl_secs, cache_ask_ttl_secs config fields with env var overrides and sensible defaults
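The expiry rule behind this cache can be sketched without a database. The block below is an in-memory stand-in for the `validation_cache` table, not the SQLite implementation itself: a `HashMap` plays the role of the table, and an insert overwrites like an upsert. The method names come from this commit message, but their signatures, the `Decision` and `CacheConfig` types, and the exact boundary behaviour at expiry are assumptions. The default TTLs (1h for Allow, 15m for Ask) are taken from the PR summary.

```rust
use std::collections::HashMap;

/// Only Allow and Ask are modelled, since those are the TTLs the PR mentions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Ask,
}

/// Assumed shape of the three config fields named in the commit.
pub struct CacheConfig {
    pub cache_enabled: bool,
    pub cache_allow_ttl_secs: u64,
    pub cache_ask_ttl_secs: u64,
}

impl Default for CacheConfig {
    fn default() -> Self {
        Self { cache_enabled: true, cache_allow_ttl_secs: 3_600, cache_ask_ttl_secs: 900 }
    }
}

/// In-memory stand-in for the table: command -> (decision, expires_at).
#[derive(Default)]
pub struct DecisionCache {
    rows: HashMap<String, (Decision, u64)>,
}

impl DecisionCache {
    /// Upsert a decision with a TTL chosen by decision kind.
    pub fn cache_decision(&mut self, cfg: &CacheConfig, command: &str, decision: Decision, now: u64) {
        if !cfg.cache_enabled {
            return;
        }
        let ttl = match decision {
            Decision::Allow => cfg.cache_allow_ttl_secs,
            Decision::Ask => cfg.cache_ask_ttl_secs,
        };
        self.rows.insert(command.to_owned(), (decision, now + ttl));
    }

    /// Return the decision only while it has not expired.
    pub fn get_cached_decision(&self, command: &str, now: u64) -> Option<Decision> {
        self.rows.get(command).filter(|(_, exp)| *exp > now).map(|(d, _)| *d)
    }

    /// Drop expired rows and report how many were removed.
    pub fn cleanup_expired_cache(&mut self, now: u64) -> usize {
        let before = self.rows.len();
        self.rows.retain(|_, (_, exp)| *exp > now);
        before - self.rows.len()
    }
}
```

Giving Ask a shorter TTL than Allow means a cautious verdict is re-evaluated sooner, while a permissive one is reused for longer.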

- Reduce health check timeout from 60s to 2s with 1 retry (Fix 1)
- Use config default_decision instead of hardcoded "ask" on Ollama failure (Fix 2)
- Integrate SQLite decision cache in PreToolUse hook so repeated commands skip Ollama entirely (Fix 3D)
- Add file-based health cache to share Ollama status across hook processes (Fix 4)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

evaluate_with_llm returns Ask("LLM unavailable") when ollama.generate() fails — a third failure path distinct from is_available() and client creation. Apply the same default_decision fallback and update health cache to prevent repeated timeouts.
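The shared fallback can be pictured as a single function that every failure path funnels through. This is a sketch under assumptions: the `Decision` and `LlmFailure` enums and the `resolve` function are hypothetical names, and the real code in `clx-hook/src/hooks/pre_tool_use.rs` may be structured quite differently.

```rust
/// Assumed decision type; allow, ask, and deny all appear as
/// default_decision values in this PR's tests.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Ask,
    Deny,
}

/// The three LLM failure paths named in the PR (hypothetical enum).
#[derive(Debug)]
pub enum LlmFailure {
    ClientCreation,
    HealthCheck,
    Generation,
}

/// Use the LLM's verdict when there is one; otherwise fall back to the
/// user's configured default_decision instead of a hardcoded Ask.
pub fn resolve(llm_result: Result<Decision, LlmFailure>, default_decision: Decision) -> Decision {
    match llm_result {
        Ok(decision) => decision,
        Err(failure) => {
            eprintln!("LLM unavailable ({failure:?}); using default_decision");
            default_decision
        }
    }
}
```

Routing all three errors through one match arm is what keeps the behaviour consistent: a new failure path added later only needs a new enum variant, not another copy of the fallback logic.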

blackaxgit and others added 7 commits March 16, 2026 18:55

test(clx-core): add unit tests for validation cache storage

8ac7089

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

test: add tests for validation cache and default_decision fallback

b8d6a12

- 7 unit tests for SQLite validation cache (hit/miss/expiry/upsert/cleanup)
- 2 integration tests verifying default_decision config is respected when Ollama is unavailable (allow and deny fallbacks)

test: add backward-compat ask fallback and stale health cache tests

56b243f

- Integration test for default_decision: ask when Ollama unavailable
- Unit test for stale health file (>30s) returning Unknown

chore: fix rustfmt formatting for CI

53b7aad

blackaxgit merged commit 64a3114 into main Mar 18, 2026
7 checks passed

blackaxgit deleted the fix/llm-unavailable-fallback branch March 18, 2026 03:50

blackaxgit mentioned this pull request Mar 19, 2026

chore: release v0.1.0 #9

Closed
