Add LLM cache for query expansion and reranking

Context
QMD caches LLM outputs to keep query expansion and reranking consistent and inexpensive on repeated queries.

Goal
Reduce latency and cost for repeated queries while keeping outputs consistent.

Scope
- Cache expansion and rerank results keyed by query, model, and version.
- Define cache invalidation rules and a retention policy.
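A minimal sketch of how the two scope items could fit together, assuming a hashed key and a time-based retention policy. The names `cache_key` and `TTLCache`, the `kind` field, and the choice of TTL expiry are illustrative assumptions, not QMD's actual implementation.

```python
import hashlib
import json
import time


def cache_key(kind: str, query: str, model: str, version: str) -> str:
    """Derive a stable key from the operation kind, query, model, and version.

    Hypothetical helper: changing the model or version produces a different
    key, so stale entries are simply never looked up again.
    """
    payload = json.dumps([kind, query, model, version], ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds (assumed policy)."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (expires_at, value)

    def get(self, key: str):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Retention policy: drop the expired entry and report a miss.
            del self._entries[key]
            return None
        return value

    def put(self, key: str, value) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)
```

In this sketch, bumping `version` acts as the invalidation rule and the TTL acts as the retention policy; a persistent store could replace the dictionary without changing the key scheme.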

Acceptance Criteria
- Repeated queries are served from the cache instead of triggering redundant LLM calls, with no correctness regressions.
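One way this criterion could be checked is to wrap the LLM call in a caching layer that counts real calls. The sketch below assumes a hypothetical `CachingExpander` wrapper and a stand-in `fake_expand` function in place of an actual model; neither name comes from QMD.

```python
from typing import Callable, Dict, List, Tuple


class CachingExpander:
    """Wraps an expansion function and counts how often it is really invoked."""

    def __init__(self, expand_fn: Callable[[str], List[str]], model: str, version: str) -> None:
        self._expand_fn = expand_fn
        self._model = model
        self._version = version
        self._cache: Dict[Tuple[str, str, str], List[str]] = {}
        self.llm_calls = 0  # number of cache misses that reached the LLM

    def expand(self, query: str) -> List[str]:
        key = (query, self._model, self._version)
        if key not in self._cache:
            self.llm_calls += 1
            self._cache[key] = self._expand_fn(query)
        # Return a copy so callers cannot mutate the cached value.
        return list(self._cache[key])


def fake_expand(query: str) -> List[str]:
    """Stand-in for an LLM call, used only for illustration."""
    return [query, query + " tutorial"]


expander = CachingExpander(fake_expand, model="example-model", version="v1")
first = expander.expand("llm cache")
second = expander.expand("llm cache")
assert first == second          # repeated query returns the same result
assert expander.llm_calls == 1  # and triggers only one underlying call
```

A similar counter around the reranker would cover the second half of the criterion.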

References
- https://github.com/tobi/qmd
