feat(celery Wave 6 #37): wire EmbeddingService.embed_image into application cache #1735

Merged: earayu merged 1 commit into main from ming-shu/wave6-37-multimodal-cache on Apr 27, 2026

earayu (Collaborator) commented on Apr 27, 2026

Summary

Task #37 (Wave 6, multimodal embedding cache wiring) — ~30 LOC production change wrapping EmbeddingService.embed_image with the canonical aperag.cache.NAMESPACE_EMBEDDING infra (PR #1734). Mirrors the existing text _embed_batch cache pattern.

Wave 5 P2 chunk 4 landed embed_image(image_bytes, alt_text) API; this PR adds the cache layer so identical image embeds short-circuit the LiteLLM provider call.
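
The short-circuit described above is a cache-aside lookup. The following is a minimal sketch of that idea only, not the code from this PR: `cached_embed`, `fake_provider`, and the in-memory dict are hypothetical stand-ins for the real cache backend and the LiteLLM call.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the real cache backend: a plain in-memory dict.
_cache: Dict[str, List[float]] = {}


def cached_embed(key: str, compute: Callable[[], List[float]]) -> List[float]:
    """Return the cached vector for `key`; compute and store it on a miss."""
    if key in _cache:
        return _cache[key]  # hit: the provider is never called
    vector = compute()      # miss: exactly one upstream call
    _cache[key] = vector
    return vector


calls: List[int] = []


def fake_provider() -> List[float]:
    """Hypothetical upstream call; records each invocation."""
    calls.append(1)
    return [0.1, 0.2, 0.3]


first = cached_embed("image:abc", fake_provider)
second = cached_embed("image:abc", fake_provider)
```

After both calls `calls` holds a single entry, which is the behavior the first test in this PR pins: the second identical embed never reaches the provider.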

Cache key shape

Per aperag/cache/README.md no-raw-bytes policy:

{
    "kind": "image",
    "provider": ...,
    "model": ...,
    "api_base": ...,
    "api_key_hash": sha256(api_key),
    "file_hash": sha256(image_bytes),
    "alt_text": ...,
    "multimodal": True,
}
  • Image bytes identified by sha256 hex digest → bounded Redis key.
  • alt_text part of key (paired text+image inputs return different vectors when hint changes).
  • kind="image" discriminates from text embed_query keys in the same embedding namespace.
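
A minimal sketch of how a key of this shape could be assembled with the standard library. `build_image_cache_key` and `serialize_key` are hypothetical names; UTF-8 encoding of the API key and JSON serialization with sorted keys are assumptions, since the PR does not show how the real key is encoded.

```python
import hashlib
import json
from typing import Any, Dict


def build_image_cache_key(
    provider: str,
    model: str,
    api_base: str,
    api_key: str,
    image_bytes: bytes,
    alt_text: str = "",
) -> Dict[str, Any]:
    """Build the key dict; only digests of the secret and the image are kept."""
    return {
        "kind": "image",
        "provider": provider,
        "model": model,
        "api_base": api_base,
        # Assumption: the API key is hashed as UTF-8 text.
        "api_key_hash": hashlib.sha256(api_key.encode("utf-8")).hexdigest(),
        # 64 hex characters regardless of image size, so the key stays bounded.
        "file_hash": hashlib.sha256(image_bytes).hexdigest(),
        "alt_text": alt_text,
        "multimodal": True,
    }


def serialize_key(key: Dict[str, Any]) -> str:
    """Assumed serialization: sorted keys make the string order-independent."""
    return json.dumps(key, sort_keys=True, separators=(",", ":"))
```

With this shape, two calls that differ only in `alt_text` or only in `image_bytes` serialize to different strings, and neither the raw bytes nor the plain API key appears in the result.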

Tests

New tests/unit_test/llm/test_embed_image_cache.py (7 tests):

  • ✅ identical (bytes, alt_text) → second call hits cache (no upstream)
  • ✅ same bytes + different alt_text → distinct keys, both compute
  • ✅ different bytes + same alt_text → distinct keys, both compute
  • ✅ key shape uses sha256 file_hash, raw bytes never appear in key
  • ✅ caching=False bypasses cache (always upstream)
  • ✅ multimodal=False raises EmbeddingError (defense-in-depth)
  • ✅ empty image_bytes raises EmptyTextError

Full unit suite: 1022 passed, 29 skipped, ruff + format clean.
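
The behaviors pinned by these tests can be illustrated with a toy model. This is a sketch under stated assumptions, not the `EmbeddingService` implementation: `SketchEmbedder` is hypothetical, the two exception classes are local stand-ins that only borrow the names used above, and the order of the guards is a guess.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Vector = List[float]


class EmbeddingError(Exception):
    """Local stand-in for the project's EmbeddingError."""


class EmptyTextError(Exception):
    """Local stand-in for the project's EmptyTextError."""


class SketchEmbedder:
    """Toy model of the tested behavior; not the real service."""

    def __init__(
        self,
        upstream: Callable[[bytes, str], Vector],
        multimodal: bool = True,
        caching: bool = True,
    ) -> None:
        self.upstream = upstream
        self.multimodal = multimodal
        self.caching = caching
        self._cache: Dict[Tuple[str, str], Vector] = {}

    def embed_image(self, image_bytes: bytes, alt_text: str = "") -> Vector:
        # Guard order is an assumption; the PR only states that both raise.
        if not self.multimodal:
            raise EmbeddingError("model is not configured as multimodal")
        if not image_bytes:
            raise EmptyTextError("image_bytes is empty")
        if not self.caching:
            return self.upstream(image_bytes, alt_text)  # always upstream
        # The key holds the digest, never the raw bytes.
        key = (hashlib.sha256(image_bytes).hexdigest(), alt_text)
        if key not in self._cache:
            self._cache[key] = self.upstream(image_bytes, alt_text)
        return self._cache[key]
```

Counting calls to `upstream` across repeated `embed_image` invocations reproduces the hit, miss, and bypass cases in the list above.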

Out of scope

Provider-specific multimodal embedder format variations (Voyage / Jina v3 / OpenAI multimodal SDK input shapes) stay on task #39 per PM dispatch + simple-stable directive.

Test plan

  • uv run pytest tests/unit_test/ — 1022 passed
  • uvx ruff check + format — clean
  • Cache key shape verified via unit test (no raw bytes in key)
  • Cache hit/miss + alt_text + bytes-difference behavior pinned

…cation cache

Wave 5 P2 chunk 4 (#1733) landed the canonical multimodal embedding
API surface (`EmbeddingService.embed_image(image_bytes, alt_text)`)
but left the call un-cached — every vision modality embed now hits
the LiteLLM provider, even for the same image. This wires it into
the canonical `aperag.cache.NAMESPACE_EMBEDDING` infra (PR #1734)
mirroring the existing text `_embed_batch` pattern.

Cache key shape (per `aperag/cache/README.md` no-raw-bytes policy):

    {
      "kind": "image",
      "provider": ...,
      "model": ...,
      "api_base": ...,
      "api_key_hash": sha256(api_key),
      "file_hash": sha256(image_bytes),
      "alt_text": ...,
      "multimodal": True,
    }

Image bytes are identified by their sha256 hex digest so the Redis
key stays bounded; alt_text is part of the key because providers
that accept paired text+image inputs return a different vector when
the textual hint changes (alt_text="" collapses to one key for
image-only callers).

Tests
-----
New `tests/unit_test/llm/test_embed_image_cache.py` (7 tests):

* identical (bytes, alt_text) → second call hits cache (no upstream)
* same bytes + different alt_text → distinct keys, both compute
* different bytes + same alt_text → distinct keys, both compute
* key shape uses sha256 file_hash, raw bytes never appear in key
* `caching=False` bypasses cache (always upstream)
* `multimodal=False` raises EmbeddingError (defense-in-depth)
* empty image_bytes raises EmptyTextError

Full unit suite: 1022 passed, 29 skipped, ruff + format clean.

Out of scope (per task #37 boundary)
------------------------------------
Provider-specific multimodal embedder format variations (Voyage /
Jina v3 / OpenAI multimodal SDK input shapes) stay on task #39 per
PM dispatch + simple-stable directive
(`feedback_simple_stable_zero_maintenance.md`).

earayu merged commit 80fff83 into main on Apr 27, 2026 (4 checks passed)
earayu deleted the ming-shu/wave6-37-multimodal-cache branch on April 27, 2026 11:25