## Problem
Running the same eval suite repeatedly makes redundant LLM API calls for identical inputs. This wastes money and time when iterating on evaluator logic.
## Proposal
Cache provider responses keyed by `hash(provider + model + input + config)`:
```yaml
execution:
  cache: true               # default: false
  cache_path: .agentv/cache # default location
```
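For illustration, here is a minimal sketch of how that key might be derived, assuming canonical-JSON encoding and SHA-256; the function name and field handling are hypothetical, not settled implementation details:

```python
import hashlib
import json

def cache_key(provider: str, model: str, input_text: str, config: dict) -> str:
    """Derive a stable cache key from the fields that identify a provider call."""
    # Canonical JSON (sorted keys, fixed separators) keeps the hash stable
    # regardless of dict ordering; SHA-256 gives collision-resistant filenames.
    payload = json.dumps(
        {"provider": provider, "model": model, "input": input_text, "config": config},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```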
## CLI
```bash
agentv run --target my-agent evals/            # uses cache if configured
agentv run --target my-agent evals/ --no-cache # bypass cache
```
## Rules
- Cache agent/provider responses only (the expensive LLM calls)
- Never cache evaluator results (evaluator logic may change between runs)
- Responses generated with temperature > 0 are not cached by default (non-deterministic)
- The cache is a directory of hashed response files: portable and inspectable (see the sketch after this list)
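Building on the `cache_key` sketch above, here is a rough outline of the lookup path these rules imply, assuming a JSON-serializable response; `cached_call` and its signature are hypothetical:

```python
import json
from pathlib import Path

CACHE_DIR = Path(".agentv/cache")  # the proposed cache_path default

def cached_call(provider: str, model: str, input_text: str, config: dict, call_fn):
    """Serve a provider response from the cache when the rules allow it."""
    # Rule: temperature > 0 is non-deterministic, so skip the cache by default.
    if config.get("temperature", 0) > 0:
        return call_fn()

    key = cache_key(provider, model, input_text, config)
    path = CACHE_DIR / f"{key}.json"  # one hashed file per response

    if path.exists():  # cache hit: no API call
        return json.loads(path.read_text())

    response = call_fn()  # cache miss: call the provider and store the result
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(response))
    return response
```

Only the provider call is wrapped; evaluator results are always recomputed, per the rules above. Plain files under the cache directory keep the cache portable and easy to inspect with standard tools.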
## Why No `agentv cache` Subcommand?
Per design principle #5 (AI-First), minimize commands. `rm -rf .agentv/cache` clears the cache, so no dedicated command is needed.
## Design Principles Alignment
- ✅ Lightweight Core — infrastructure concern, intercepts provider layer
- ✅ Non-Breaking Extension — opt-in via config, existing behavior unchanged
- ✅ AI-First — fewer commands, simple mental model
## Acceptance Criteria
- `--no-cache` flag bypasses the cache