Release v2.2.0 — Native async retrieval + CacheMonitor · irfanalidv/ragfallback

What's new

Native async retrieval

AdaptiveRAGRetriever.aquery_with_fallback() is a real coroutine built on LangChain's ainvoke(). It enables true concurrent evaluation in GoldenRunner and in production FastAPI backends. If the underlying model doesn't implement ainvoke, it falls back to a thread pool automatically.
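The fallback behaviour described above can be illustrated with a minimal sketch. This is not ragfallback's implementation: `ainvoke_or_thread`, `AsyncModel`, and `SyncOnlyModel` are hypothetical names, and the only assumption taken from the notes is that a model exposes either `ainvoke` or a blocking `invoke`.

```python
import asyncio


async def ainvoke_or_thread(model, query):
    """Await model.ainvoke when present; otherwise run model.invoke in a worker thread."""
    ainvoke = getattr(model, "ainvoke", None)
    if ainvoke is not None:
        # Native path: the model provides its own coroutine.
        return await ainvoke(query)
    # Fallback path: push the blocking call onto the default thread pool
    # so the event loop stays free for other coroutines.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, model.invoke, query)


class AsyncModel:
    """Hypothetical stand-in for a model with native async support."""

    async def ainvoke(self, query):
        return f"async:{query}"


class SyncOnlyModel:
    """Hypothetical stand-in for a model that only has a blocking call."""

    def invoke(self, query):
        return f"sync:{query}"


def demo():
    async def main():
        return [
            await ainvoke_or_thread(AsyncModel(), "q"),
            await ainvoke_or_thread(SyncOnlyModel(), "q"),
        ]

    return asyncio.run(main())
```

Callers await the same function in both cases, so the choice between the native path and the thread pool stays hidden from them.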

CacheMonitor

ragfallback.tracking.CacheMonitor wraps any LangChain retriever and tracks hit rate, latency per category (hit vs. miss), TTL expiry, and LRU eviction. It adds no new dependencies and uses only the standard library. Pass it to GoldenRunner via cache_monitor= and cache stats appear in GoldenReport alongside RAGAS scores and P95 latency.
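To make the tracked quantities concrete, here is a stdlib-only sketch of a cache wrapper that counts hits, misses, TTL expiries, and LRU evictions. It is an illustration under assumptions, not the CacheMonitor API: `TinyCacheMonitor`, its constructor arguments, and the plain-callable `retrieve` are all hypothetical.

```python
import time
from collections import OrderedDict


class TinyCacheMonitor:
    """Hypothetical sketch: an LRU + TTL cache around a retrieval callable, with counters."""

    def __init__(self, retrieve, max_size=2, ttl_seconds=60.0, clock=time.monotonic):
        self._retrieve = retrieve
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = OrderedDict()  # query -> (stored_at, result)
        self.hits = self.misses = self.expired = self.evicted = 0
        self.latency = {"hit": [], "miss": []}  # seconds, per category

    def get(self, query):
        start = time.perf_counter()
        entry = self._cache.get(query)
        if entry is not None and self._clock() - entry[0] >= self._ttl:
            # Entry is too old: drop it and treat the lookup as a miss.
            del self._cache[query]
            self.expired += 1
            entry = None
        if entry is not None:
            self._cache.move_to_end(query)  # mark as most recently used
            self.hits += 1
            kind, result = "hit", entry[1]
        else:
            result = self._retrieve(query)
            self._cache[query] = (self._clock(), result)
            if len(self._cache) > self._max_size:
                self._cache.popitem(last=False)  # evict least recently used
                self.evicted += 1
            self.misses += 1
            kind = "miss"
        self.latency[kind].append(time.perf_counter() - start)
        return result

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

The injectable `clock` argument is a design choice for this sketch only: it lets TTL expiry be exercised in tests without sleeping.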

GoldenRunner upgrade

run_async() now uses the native aquery_with_fallback(), so 75 queries run concurrently instead of serializing through a thread pool.
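The concurrency pattern can be sketched with asyncio.gather. This is an assumption about the general approach, not GoldenRunner's code: `run_all` and `fake_aquery` are hypothetical, and `asyncio.sleep` stands in for the I/O wait of a real retrieval call.

```python
import asyncio


async def run_all(aquery, queries):
    """Start every query coroutine at once and collect results in input order."""
    return await asyncio.gather(*(aquery(q) for q in queries))


def demo(n=75):
    in_flight = 0
    peak = 0

    async def fake_aquery(query):
        # Hypothetical stand-in for a native async retrieval call.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)  # record how many calls overlap
        await asyncio.sleep(0.01)    # simulated I/O wait
        in_flight -= 1
        return f"answer:{query}"

    results = asyncio.run(run_all(fake_aquery, [f"q{i}" for i in range(n)]))
    return results, peak
```

In this sketch `peak` reaches the number of queries because every coroutine starts before any of them finishes its wait. A real runner might cap this with a semaphore to respect provider rate limits.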

Install

pip install ragfallback==2.2.0

Numbers

102 unit tests passing (Python 3.9 / 3.10 / 3.11)
CI regression gate green on SQuAD golden dataset
21 new tests added this release

Full changelog

See CHANGELOG.md for complete details.
