feat(celery Wave 6 #37): wire EmbeddingService.embed_image into application cache#1735
Merged
…cation cache

Wave 5 P2 chunk 4 (#1733) landed the canonical multimodal embedding API surface (`EmbeddingService.embed_image(image_bytes, alt_text)`) but left the call uncached, so every vision-modality embed hits the LiteLLM provider, even for an image that was already embedded. This wires it into the canonical `aperag.cache.NAMESPACE_EMBEDDING` infra (PR #1734), mirroring the existing text `_embed_batch` pattern.

Cache key shape (per `aperag/cache/README.md` no-raw-bytes policy):

    {
        "kind": "image",
        "provider": ...,
        "model": ...,
        "api_base": ...,
        "api_key_hash": sha256(api_key),
        "file_hash": sha256(image_bytes),
        "alt_text": ...,
        "multimodal": True,
    }

Image bytes are identified by their sha256 hex digest so the Redis key stays bounded. alt_text is part of the key because providers that accept paired text+image inputs return a different vector when the textual hint changes (alt_text="" collapses to one key for image-only callers).

Tests
-----
New `tests/unit_test/llm/test_embed_image_cache.py` (7 tests):

* identical (bytes, alt_text) → second call hits cache (no upstream)
* same bytes + different alt_text → distinct keys, both compute
* different bytes + same alt_text → distinct keys, both compute
* key shape uses sha256 file_hash, raw bytes never appear in key
* `caching=False` bypasses cache (always upstream)
* `multimodal=False` raises EmbeddingError (defense-in-depth)
* empty image_bytes raises EmptyTextError

Full unit suite: 1022 passed, 29 skipped, ruff + format clean.

Out of scope (per task #37 boundary)
------------------------------------
Provider-specific multimodal embedder format variations (Voyage / Jina v3 / OpenAI multimodal SDK input shapes) stay on task #39 per PM dispatch + simple-stable directive (`feedback_simple_stable_zero_maintenance.md`).
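The key shape described in the commit message can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code from this PR: `build_image_cache_key` and `serialize_key` are hypothetical helper names, and JSON with sorted keys is only an assumed way of turning the dict into a deterministic string (how `aperag.cache` actually derives the Redis key is not shown in this PR text).

```python
import hashlib
import json


def build_image_cache_key(provider, model, api_base, api_key, image_bytes, alt_text=""):
    """Hypothetical helper: assemble the key payload listed in the commit message."""
    return {
        "kind": "image",
        "provider": provider,
        "model": model,
        "api_base": api_base,
        # Only digests go into the key: neither the API key nor the image bytes.
        "api_key_hash": hashlib.sha256(api_key.encode("utf-8")).hexdigest(),
        "file_hash": hashlib.sha256(image_bytes).hexdigest(),
        "alt_text": alt_text,
        "multimodal": True,
    }


def serialize_key(payload):
    """Assumed serialization: sorted-key JSON gives the same string for the same payload."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":"))
```

Because `file_hash` is a fixed 64-character hex digest, the serialized key has the same size whether the image is 1 KB or 10 MB, which is the point of the no-raw-bytes policy.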
Summary
Task #37 (Wave 6, multimodal embedding cache wiring): a ~30 LOC production change that wraps `EmbeddingService.embed_image` with the canonical `aperag.cache.NAMESPACE_EMBEDDING` infra (PR #1734). It mirrors the existing text `_embed_batch` cache pattern.

Wave 5 P2 chunk 4 landed the `embed_image(image_bytes, alt_text)` API; this PR adds the cache layer so identical image embeds short-circuit the LiteLLM provider call.

Cache key shape
Per the `aperag/cache/README.md` no-raw-bytes policy:

    {
        "kind": "image",
        "provider": ...,
        "model": ...,
        "api_base": ...,
        "api_key_hash": sha256(api_key),
        "file_hash": sha256(image_bytes),
        "alt_text": ...,
        "multimodal": True,
    }

* `alt_text` is part of the key (paired text+image inputs return different vectors when the hint changes).
* `kind="image"` discriminates these keys from text `embed_query` keys in the same `embedding` namespace.

Tests
New `tests/unit_test/llm/test_embed_image_cache.py` (7 tests), including:

* `caching=False` bypasses the cache (always upstream)
* `multimodal=False` raises EmbeddingError (defense-in-depth)

Full unit suite: 1022 passed, 29 skipped, ruff + format clean.
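The behaviour these tests pin down can be illustrated with a small stand-in. This is a sketch under assumptions, not the `EmbeddingService` implementation: `SketchImageEmbedder` is a hypothetical class, a plain dict replaces the Redis-backed `embedding` namespace, the two exception classes are local stand-ins for the project's `EmbeddingError` and `EmptyTextError`, and the order of the guard checks is a guess.

```python
import hashlib
import json


class EmbeddingError(Exception):
    """Local stand-in for the project's EmbeddingError."""


class EmptyTextError(Exception):
    """Local stand-in for the project's EmptyTextError."""


class SketchImageEmbedder:
    """Hypothetical cache-wrapped image embedder; a dict plays the cache."""

    def __init__(self, upstream, model="demo-model", caching=True, multimodal=True):
        self.upstream = upstream  # callable(image_bytes, alt_text) -> list of floats
        self.model = model
        self.caching = caching
        self.multimodal = multimodal
        self.cache = {}

    def _key(self, image_bytes, alt_text):
        # Reduced version of the key shape: digest of the bytes, never the bytes.
        payload = {
            "kind": "image",
            "model": self.model,
            "file_hash": hashlib.sha256(image_bytes).hexdigest(),
            "alt_text": alt_text,
            "multimodal": True,
        }
        return json.dumps(payload, sort_keys=True)

    def embed_image(self, image_bytes, alt_text=""):
        if not self.multimodal:
            raise EmbeddingError("model is not configured as multimodal")
        if not image_bytes:
            raise EmptyTextError("image_bytes is empty")
        if not self.caching:
            return self.upstream(image_bytes, alt_text)  # bypass: always upstream
        key = self._key(image_bytes, alt_text)
        if key not in self.cache:
            self.cache[key] = self.upstream(image_bytes, alt_text)
        return self.cache[key]


calls = []


def fake_upstream(image_bytes, alt_text):
    """Fake provider that records each call instead of contacting a model."""
    calls.append(alt_text)
    return [float(len(image_bytes)), float(len(alt_text))]


svc = SketchImageEmbedder(fake_upstream)
svc.embed_image(b"img-1", "a cat")  # miss: upstream is called
svc.embed_image(b"img-1", "a cat")  # hit: served from the dict
svc.embed_image(b"img-1", "a dog")  # different alt_text: new key, upstream again
print(len(calls))  # 2
```

Counting calls to a fake upstream is one straightforward way to assert a cache hit without touching a real provider.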
Out of scope
Provider-specific multimodal embedder format variations (Voyage / Jina v3 / OpenAI multimodal SDK input shapes) stay on task #39 per PM dispatch + simple-stable directive.
Test plan
* `uv run pytest tests/unit_test/`: 1022 passed
* `uvx ruff check` + format: clean