feat: add run_async to CacheChecker #11271
Adds an async counterpart to `CacheChecker.run()` that calls `filter_documents_async` on the document store for each cache item, enabling `CacheChecker` to be used in `AsyncPipeline` without blocking the event loop. Follows the same pattern used by `FilterRetriever.run_async`.
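As a rough illustration of the pattern described above, the sketch below runs one async filter lookup per cache item and sorts the items into hits and misses. It does not import Haystack: `FakeAsyncStore` and `check_cache_async` are hypothetical stand-ins, documents are plain dicts, and the `hits`/`misses` output shape is an assumption about what `run()` returns.

```python
import asyncio
from typing import Any


class FakeAsyncStore:
    """Hypothetical stand-in for a document store; not a Haystack class."""

    def __init__(self, docs: list[dict[str, Any]]) -> None:
        self.docs = docs

    async def filter_documents_async(self, filters: dict[str, Any]) -> list[dict[str, Any]]:
        # Only supports the single "==" comparison used by the loop below.
        return [d for d in self.docs if d.get(filters["field"]) == filters["value"]]


async def check_cache_async(store: Any, cache_field: str, items: list[Any]) -> dict[str, list[Any]]:
    """Look up each item in the store; assumed to mirror the sync run() logic."""
    hits: list[Any] = []
    misses: list[Any] = []
    for item in items:
        filters = {"field": cache_field, "operator": "==", "value": item}
        found = await store.filter_documents_async(filters=filters)
        if found:
            hits.extend(found)
        else:
            misses.append(item)
    return {"hits": hits, "misses": misses}


store = FakeAsyncStore([{"url": "a", "content": "A"}, {"url": "b", "content": "B"}])
result = asyncio.run(check_cache_async(store, "url", ["a", "c"]))
```

Because each lookup is awaited, the event loop can schedule other work while the store is busy, which is the property the description is after.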
Hi @Aftabbs, thank you for opening this PR. There is a format error in releasenotes/notes/add-run-async-for-CacheChecker-a42fa8062c33466b.yaml:4: Found single backticks. Use double backticks (
    for item in items:
        filters = {"field": self.cache_field, "operator": "==", "value": item}
        # 'ignore' since filter_documents_async is not defined in the Protocol but exists in the implementations
        found = await self.document_store.filter_documents_async(filters=filters)  # type: ignore[attr-defined]
In other parts of the code base we similarly assume that an implementation of filter_documents_async exists, for example here:
In other parts of the code base (DocumentWriter), we raise an error:
    if not hasattr(self.document_store, "write_documents_async"):
        raise TypeError(f"Document store {type(self.document_store).__name__} does not provide async support.")
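The same guard could be applied to the filter path. A minimal sketch, assuming a hypothetical helper `ensure_async_support` and two stand-in store classes (none of these names exist in Haystack):

```python
from typing import Any


class SyncOnlyStore:
    """Hypothetical store that only offers the synchronous method."""

    def filter_documents(self, filters: dict[str, Any]) -> list[Any]:
        return []


class AsyncCapableStore(SyncOnlyStore):
    """Hypothetical store that also offers the async method."""

    async def filter_documents_async(self, filters: dict[str, Any]) -> list[Any]:
        return []


def ensure_async_support(document_store: Any, method_name: str) -> None:
    """Raise TypeError early if the store lacks a callable async method."""
    if not callable(getattr(document_store, method_name, None)):
        raise TypeError(
            f"Document store {type(document_store).__name__} does not provide async support."
        )


# Passes silently for a store that has the async method.
ensure_async_support(AsyncCapableStore(), "filter_documents_async")

# Fails fast, with the store's class name in the message, for one that does not.
try:
    ensure_async_support(SyncOnlyStore(), "filter_documents_async")
    error_message = ""
except TypeError as exc:
    error_message = str(exc)
```

Failing with a clear TypeError trades flexibility for an explicit error, instead of an AttributeError surfacing from inside the loop.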
An alternative would be to use a fallback:
If document_store has callable filter_documents_async: await it.
Else: await asyncio.to_thread(document_store.filter_documents, filters=filters).
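The fallback described above could look roughly like this. The helper name `filter_with_fallback` and both store classes are hypothetical and only serve to make the sketch runnable. `asyncio.to_thread` is standard library (Python 3.9+).

```python
import asyncio
from typing import Any


class SyncOnlyStore:
    """Hypothetical store with only a blocking filter method."""

    def filter_documents(self, filters: dict[str, Any]) -> list[str]:
        return ["sync"]


class AsyncStore(SyncOnlyStore):
    """Hypothetical store that also provides a native async filter method."""

    async def filter_documents_async(self, filters: dict[str, Any]) -> list[str]:
        return ["async"]


async def filter_with_fallback(document_store: Any, filters: dict[str, Any]) -> list[Any]:
    async_filter = getattr(document_store, "filter_documents_async", None)
    if callable(async_filter):
        # Preferred path: the store has a native async implementation.
        return await async_filter(filters=filters)
    # Fallback: run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(document_store.filter_documents, filters=filters)


filters = {"field": "url", "operator": "==", "value": "a"}
sync_result = asyncio.run(filter_with_fallback(SyncOnlyStore(), filters))
async_result = asyncio.run(filter_with_fallback(AsyncStore(), filters))
```

With this approach, stores without async support still work in an async pipeline, at the cost of a thread hop per lookup.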
Looks good to me now. @Aftabbs Thank you for opening this pull request! I applied smaller changes directly and will merge the PR now.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- Adds a `CacheChecker.run_async()` method that mirrors `run()` but uses `filter_documents_async` on the document store
- Enables `CacheChecker` to participate in `AsyncPipeline` without blocking the event loop
- Follows the same pattern as `FilterRetriever.run_async` (including the `# type: ignore[attr-defined]` comment since `filter_documents_async` is not in the `DocumentStore` protocol but exists in all concrete implementations)

Changes
- `haystack/components/caching/cache_checker.py`: added `run_async` method
- `test/components/caching/test_cache_checker_async.py`: new test file with 4 async tests covering hits, misses, all-hits, all-misses, and filter syntax verification

Test plan
- `python -m pytest test/components/caching/ -v` passes (11 tests, all green)

Related
- `CacheChecker` was the only caching component and one of the few document-store-backed components still missing `run_async`. This brings it in line with `FilterRetriever`, `AutoMergingRetriever`, and `SentenceWindowRetriever`.