
CU-86d2e1ka4: simplify inference module for image embedding generation only #262

Merged
vizsatiz merged 3 commits into develop from CU-86d2e1ka4-Simplify-inference-module on Apr 6, 2026

Conversation

@vizsatiz (Member) commented on Mar 26, 2026

Summary by CodeRabbit

  • New Features

    • Added batch image-embeddings endpoint and simplified single-image embeddings API.
    • App now preloads embedding service at startup.
  • Removed

    • Generic model inference endpoint and related preprocessing/model-inference functionality.
    • Image quality/clarity analysis service.
    • Cloud provider configuration for AWS/GCP and related cloud model-loading integration.
  • Chores

    • Switched PostgreSQL driver to psycopg2-binary.
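For illustration, a client of the new batch endpoint would base64-encode each image before posting. A minimal sketch of building such a request body; the `image_batch` field name is taken from the review walkthrough, and the exact schema is an assumption:

```python
import base64
import json

def build_batch_payload(raw_images: list[bytes]) -> str:
    # Base64-encode every image so it survives JSON transport; the server
    # side is expected to decode each entry back to bytes before embedding.
    encoded = [base64.b64encode(img).decode("ascii") for img in raw_images]
    return json.dumps({"image_batch": encoded})

payload = build_batch_payload([b"fake-png-bytes", b"fake-jpeg-bytes"])
```

The resulting string would be sent as the JSON body of a POST to /v1/query/embeddings/batch.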

@coderabbitai bot commented on Mar 26, 2026

📝 Walkthrough

Removed cloud storage and model-inference infrastructure; streamlined the app to image-embedding only, added a batch embeddings endpoint and shared base64 decoding helper, removed image-clarity/preprocessing/model-repo services, and updated DI/startup to preload the image-embedding singleton.

Changes

Cohort / File(s) and Summary

• Configuration
  wavefront/server/apps/inference_app/inference_app/config.ini
  Removed the [aws], [gcp], and [cloud_config] sections and related keys.
• Dependency Injection Container
  wavefront/server/apps/inference_app/inference_app/inference_app_container.py
  Removed providers: cache_manager, cloud_storage_manager, model_repository, model_inference, image_analyser; retained the image_embedding singleton.
• Controller
  wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py
  Removed the generic model-inference route. Added ImagePayload and ImageBatchPayload models, an extract_decoded_image_data(...) helper, a /v1/query/embeddings single-image handler, and a /v1/query/embeddings/batch batch handler with base64 validation.
• Services — Removed
  wavefront/server/apps/inference_app/inference_app/service/image_analyser.py, wavefront/server/apps/inference_app/inference_app/service/model_inference.py, wavefront/server/apps/inference_app/inference_app/service/model_repository.py
  Deleted entire modules: image clarity analysis, model inference and preprocessing, and model repository/cloud-loading logic.
• Service — Image Embedding
  wavefront/server/apps/inference_app/inference_app/service/image_embedding.py
  Introduced module-level CLIP_MODEL_NAME/DINO_MODEL_NAME constants and added query_embed_batch(image_batch: list[bytes]) to decode images, run batched processing, L2-normalize, and return per-model batched embeddings.
• Startup / Server
  wavefront/server/apps/inference_app/inference_app/server.py
  Added a FastAPI lifespan asynccontextmanager to preload image_embedding on startup; updated app initialization accordingly.
• Dependency Changes (pyproject)
  wavefront/server/apps/floconsole/pyproject.toml, wavefront/server/apps/inference_app/pyproject.toml, wavefront/server/modules/insights_module/pyproject.toml
  Replaced psycopg2 with psycopg2-binary in three pyproject files (same version constraints).

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant API as FastAPI App
  participant Ctrl as InferenceController
  participant Embed as ImageEmbedding
  participant DI as InferenceAppContainer

  Note over DI,Embed: App startup
  DI->>Embed: preload image_embedding()
  Note over DI,Embed: models/embedders initialized

  Client->>API: POST /v1/query/embeddings/batch (image_batch)
  API->>Ctrl: forward request payload
  Ctrl->>Ctrl: extract_decoded_image_data on each entry
  Ctrl->>Embed: query_embed_batch(list[bytes])
  Embed->>Embed: decode images, batch-process, normalize
  Embed-->>Ctrl: batched embeddings
  Ctrl-->>API: 200 OK with embeddings
  API-->>Client: response
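The "decode images, batch-process, normalize" step in the diagram hinges on L2-normalizing each embedding row. A small sketch with NumPy standing in for the CLIP/DINO forward pass (the real query_embed_batch first decodes the raw bytes with PIL):

```python
import numpy as np

def l2_normalize(batch: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    # Divide each row (one embedding per image) by its Euclidean norm so
    # that cosine similarity downstream reduces to a plain dot product.
    norms = np.linalg.norm(batch, axis=1, keepdims=True)
    return batch / np.maximum(norms, eps)

# Stand-in for a model forward pass over a decoded image batch.
fake_embeddings = np.array([[3.0, 4.0], [0.0, 5.0]])
normalized = l2_normalize(fake_embeddings)
# Each row now has unit length: [[0.6, 0.8], [0.0, 1.0]]
```

The eps floor guards against division by zero for a degenerate all-zeros embedding.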

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Suggested reviewers

  • vishnurk6247

Poem

🐰 Hopped through configs, nibbled out the cloud,
Batch embeddings now sing, simple, and proud.
No more long pipelines or cached dusty lore,
Just pixels to vectors — quick, batch, explore! 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
• Docstring Coverage: ⚠️ Warning. Docstring coverage is 14.29%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
• Description Check: ✅ Passed. Check skipped: CodeRabbit's high-level summary is enabled.
• Title Check: ✅ Passed. The title directly and clearly describes the main objective of the changeset: simplifying the inference module to focus exclusively on image embedding generation.



@coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
wavefront/server/apps/inference_app/inference_app/service/image_embedding.py (1)

17-26: ⚠️ Potential issue | 🔴 Critical

AttributeError: class references non-existent attributes self.CLIP_MODEL_NAME and self.DINO_MODEL_NAME.

The model name constants were moved to module-level (lines 8-9), but the __init__ method still references them via self.CLIP_MODEL_NAME and self.DINO_MODEL_NAME. This will raise AttributeError at runtime.

🐛 Proposed fix
-        self.clip_processor = CLIPProcessor.from_pretrained(self.CLIP_MODEL_NAME)
-        self.clip_model = CLIPModel.from_pretrained(self.CLIP_MODEL_NAME).to(
+        self.clip_processor = CLIPProcessor.from_pretrained(CLIP_MODEL_NAME)
+        self.clip_model = CLIPModel.from_pretrained(CLIP_MODEL_NAME).to(
             self.device
         )
         self.clip_model.eval()

-        self.dino_processor = AutoImageProcessor.from_pretrained(self.DINO_MODEL_NAME)
+        self.dino_processor = AutoImageProcessor.from_pretrained(DINO_MODEL_NAME)
         self.dino_model = AutoModel.from_pretrained(
-            self.DINO_MODEL_NAME, trust_remote_code=True
+            DINO_MODEL_NAME, trust_remote_code=True
         ).to(self.device)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `wavefront/server/apps/inference_app/inference_app/service/image_embedding.py`
around lines 17-26, the __init__ method references non-existent instance attributes
self.CLIP_MODEL_NAME and self.DINO_MODEL_NAME; update the CLIP and DINO model
construction to use the module-level constants CLIP_MODEL_NAME and
DINO_MODEL_NAME instead. Specifically, change the calls that create
self.clip_processor (CLIPProcessor.from_pretrained), self.clip_model
(CLIPModel.from_pretrained), self.dino_processor
(AutoImageProcessor.from_pretrained) and self.dino_model
(AutoModel.from_pretrained) to pass the module-level names CLIP_MODEL_NAME and
DINO_MODEL_NAME rather than self.CLIP_MODEL_NAME/self.DINO_MODEL_NAME so the
models initialize without raising AttributeError.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py`:
- Around line 72-75: The helper extract_decoded_image_data currently calls
base64.b64decode which can raise binascii.Error on malformed input; wrap the
decode call in a try/except that catches binascii.Error (import binascii) and
re-raise a clear ValueError or custom exception (e.g., "Invalid base64 image
data") so caller endpoints can return a 4xx response instead of an unhandled
500; update any call sites (the single-image and multi-image endpoints) to catch
that ValueError and convert it to an appropriate HTTP error response.
- Around line 51-69: The batch endpoint image_embedding_batch currently always
returns HTTP 200 and doesn't handle decode errors; update image_embedding_batch
to (1) catch binascii.Error (raised by extract_decoded_image_data) and return an
HTTP 400 using response_formatter.buildSuccessResponse or an appropriate error
response, and (2) after calling image_embedding_service.query_embed_batch, check
if embeddings is empty or falsy and return HTTP 400 (mirroring the single-image
flow) instead of always returning 200; reference extract_decoded_image_data,
query_embed_batch, image_embedding_batch, and
response_formatter.buildSuccessResponse to locate where to add the try/except
and the empty-result conditional.

In `wavefront/server/apps/inference_app/inference_app/inference_app_container.py`:
- Around line 6-9: The failure is caused by server.py calling
InferenceAppContainer(cache_manager=None) but the container no longer declares a
cache_manager provider; restore compatibility by adding a dependency provider
named cache_manager to InferenceAppContainer (e.g., add a line cache_manager =
providers.Dependency() in the class) so the existing instantiation with
cache_manager=None succeeds, or alternatively remove the cache_manager argument
from the call site in server.py; update the symbol InferenceAppContainer to
include cache_manager = providers.Dependency() if you choose the former.

In `wavefront/server/apps/inference_app/inference_app/service/image_embedding.py`:
- Around line 92-100: The current loop that opens images (the block that calls
Image.open on items from image_batch and appends to images) silently returns []
on any decode error, making failures indistinguishable from an empty input;
change this to raise a descriptive exception instead: when an exception occurs
while opening an image (use the same except block that catches Exception as e),
log the error and then raise a ValueError (or a custom DecodeError) that
includes the failing index (idx) and the original exception (e) so callers can
distinguish a decode failure from an empty batch and handle it explicitly.

---

Outside diff comments:
In `wavefront/server/apps/inference_app/inference_app/service/image_embedding.py`:
- Around line 17-26: The __init__ references non-existent instance attributes
self.CLIP_MODEL_NAME and self.DINO_MODEL_NAME; update the CLIP and DINO model
construction to use the module-level constants CLIP_MODEL_NAME and
DINO_MODEL_NAME instead. Specifically, change the calls that create
self.clip_processor (CLIPProcessor.from_pretrained), self.clip_model
(CLIPModel.from_pretrained), self.dino_processor
(AutoImageProcessor.from_pretrained) and self.dino_model
(AutoModel.from_pretrained) to pass the module-level names CLIP_MODEL_NAME and
DINO_MODEL_NAME rather than self.CLIP_MODEL_NAME/self.DINO_MODEL_NAME so the
models initialize without raising AttributeError.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3329984f-e9a3-4146-8a0d-c1c121a392b4

📥 Commits

Reviewing files that changed from the base of the PR and between 90760a1 and 5d07db0.

📒 Files selected for processing (7)
  • wavefront/server/apps/inference_app/inference_app/config.ini
  • wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py
  • wavefront/server/apps/inference_app/inference_app/inference_app_container.py
  • wavefront/server/apps/inference_app/inference_app/service/image_analyser.py
  • wavefront/server/apps/inference_app/inference_app/service/image_embedding.py
  • wavefront/server/apps/inference_app/inference_app/service/model_inference.py
  • wavefront/server/apps/inference_app/inference_app/service/model_repository.py
💤 Files with no reviewable changes (4)
  • wavefront/server/apps/inference_app/inference_app/service/image_analyser.py
  • wavefront/server/apps/inference_app/inference_app/config.ini
  • wavefront/server/apps/inference_app/inference_app/service/model_repository.py
  • wavefront/server/apps/inference_app/inference_app/service/model_inference.py

Comment on lines +72 to +75
def extract_decoded_image_data(image_data: str) -> bytes:
    parts = image_data.split(',')
    base64_data = parts[1] if len(parts) == 2 else parts[0]
    return base64.b64decode(base64_data)
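For reference, the helper as written accepts both a bare base64 string and a data-URL form. A self-contained check of that behavior, mirroring the snippet above:

```python
import base64

def extract_decoded_image_data(image_data: str) -> bytes:
    # Mirrors the helper under review: strip an optional data-URL prefix
    # (e.g. "data:image/png;base64,"), then base64-decode the remainder.
    parts = image_data.split(',')
    base64_data = parts[1] if len(parts) == 2 else parts[0]
    return base64.b64decode(base64_data)

raw = b"hello"
encoded = base64.b64encode(raw).decode("ascii")
assert extract_decoded_image_data(encoded) == raw
assert extract_decoded_image_data("data:image/png;base64," + encoded) == raw
```

As the review notes, malformed input makes base64.b64decode raise binascii.Error, which is what the proposed fix below the snippet wraps.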

⚠️ Potential issue | 🟡 Minor

extract_decoded_image_data can raise unhandled binascii.Error on malformed base64.

base64.b64decode raises binascii.Error for invalid base64 input. This function is called directly without try/except in both endpoints. For the single-image endpoint (line 36), this would cause an HTTP 500. Consider adding error handling here or at call sites.

💡 Option: Add validation with a clearer error
+import binascii
+
+class InvalidImageDataError(Exception):
+    pass
+
 def extract_decoded_image_data(image_data: str) -> bytes:
     parts = image_data.split(',')
     base64_data = parts[1] if len(parts) == 2 else parts[0]
-    return base64.b64decode(base64_data)
+    try:
+        return base64.b64decode(base64_data)
+    except binascii.Error as e:
+        raise InvalidImageDataError('Invalid base64 encoded image data') from e
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-def extract_decoded_image_data(image_data: str) -> bytes:
-    parts = image_data.split(',')
-    base64_data = parts[1] if len(parts) == 2 else parts[0]
-    return base64.b64decode(base64_data)
+import binascii
+
+class InvalidImageDataError(Exception):
+    pass
+
+def extract_decoded_image_data(image_data: str) -> bytes:
+    parts = image_data.split(',')
+    base64_data = parts[1] if len(parts) == 2 else parts[0]
+    try:
+        return base64.b64decode(base64_data)
+    except binascii.Error as e:
+        raise InvalidImageDataError('Invalid base64 encoded image data') from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py`
around lines 72 - 75, The helper extract_decoded_image_data currently calls
base64.b64decode which can raise binascii.Error on malformed input; wrap the
decode call in a try/except that catches binascii.Error (import binascii) and
re-raise a clear ValueError or custom exception (e.g., "Invalid base64 image
data") so caller endpoints can return a 4xx response instead of an unhandled
500; update any call sites (the single-image and multi-image endpoints) to catch
that ValueError and convert it to an appropriate HTTP error response.

@coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py`:
- Around line 63-74: The batch endpoint currently only catches binascii.Error
when decoding images but not ValueError raised by
image_embedding_service.query_embed_batch (per image_embedding.py) when PIL
fails to decode a corrupted image; update the try/except around creating and
embedding the image batch to also catch ValueError (in addition to
binascii.Error) and return the same JSONResponse with status
HTTP_400_BAD_REQUEST and the 'Invalid base64 image data in batch' error via
response_formatter.buildErrorResponse so corrupted-but-base64 images produce 400
instead of 500; you should reference extract_decoded_image_data,
payload.image_batch, and image_embedding_service.query_embed_batch when locating
the code to change.
- Around line 36-38: The single-image endpoint calls
extract_decoded_image_data(payload.image_data) and then
image_embedding_service.query_embed(image_data) without handling malformed
base64; wrap the call to extract_decoded_image_data in a try/except that catches
binascii.Error (same as the batch endpoint) and convert it into a user-facing
400 response (e.g., raise HTTPException(status_code=400, detail="Malformed
base64 image data") or return the same error object used by the batch path) so
malformed inputs return 400 instead of propagating to the global 500 handler.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7cd5d164-49ab-450d-8fc8-9c2d0b8b4232

📥 Commits

Reviewing files that changed from the base of the PR and between 5d07db0 and 6c053ef.

⛔ Files ignored due to path filters (1)
  • wavefront/server/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • wavefront/server/apps/floconsole/pyproject.toml
  • wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py
  • wavefront/server/apps/inference_app/inference_app/server.py
  • wavefront/server/apps/inference_app/inference_app/service/image_embedding.py
  • wavefront/server/apps/inference_app/pyproject.toml
  • wavefront/server/modules/insights_module/pyproject.toml
✅ Files skipped from review due to trivial changes (3)
  • wavefront/server/modules/insights_module/pyproject.toml
  • wavefront/server/apps/floconsole/pyproject.toml
  • wavefront/server/apps/inference_app/pyproject.toml

Comment on lines +36 to +38
    # 1. Decode Base64 string
    image_data = extract_decoded_image_data(payload.image_data)
    embeddings = image_embedding_service.query_embed(image_data)

⚠️ Potential issue | 🟠 Major

Single endpoint missing binascii.Error handling.

The batch endpoint (lines 63-73) wraps the decode in a try/except for binascii.Error, but the single endpoint calls extract_decoded_image_data without similar protection. Malformed base64 input will propagate to the global exception handler and return HTTP 500 instead of a user-friendly 400.

🛡️ Proposed fix for consistent error handling
 @inference_app_router.post('/v1/query/embeddings')
 @inject
 async def image_embedding(
     payload: ImagePayload,
     response_formatter: ResponseFormatter = Depends(
         Provide[CommonContainer.response_formatter]
     ),
     image_embedding_service: ImageEmbedding = Depends(
         Provide[InferenceAppContainer.image_embedding]
     ),
 ):
-    # 1. Decode Base64 string
-    image_data = extract_decoded_image_data(payload.image_data)
+    try:
+        image_data = extract_decoded_image_data(payload.image_data)
+    except binascii.Error:
+        return JSONResponse(
+            status_code=status.HTTP_400_BAD_REQUEST,
+            content=response_formatter.buildErrorResponse(
+                'Invalid base64 image data'
+            ),
+        )
     embeddings = image_embedding_service.query_embed(image_data)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py`
around lines 36 - 38, The single-image endpoint calls
extract_decoded_image_data(payload.image_data) and then
image_embedding_service.query_embed(image_data) without handling malformed
base64; wrap the call to extract_decoded_image_data in a try/except that catches
binascii.Error (same as the batch endpoint) and convert it into a user-facing
400 response (e.g., raise HTTPException(status_code=400, detail="Malformed
base64 image data") or return the same error object used by the batch path) so
malformed inputs return 400 instead of propagating to the global 500 handler.

vizsatiz merged commit 7a6e647 into develop on Apr 6, 2026
10 checks passed
vizsatiz deleted the CU-86d2e1ka4-Simplify-inference-module branch on April 6, 2026 at 14:03