
CU-86d2e1ka4: simplify inference module for image embedding generation only #262

Merged
vizsatiz merged 3 commits into develop from CU-86d2e1ka4-Simplify-inference-module on Apr 6, 2026

Conversation

@vizsatiz (Member) commented on Mar 26, 2026

Summary by CodeRabbit

  • New Features

    • Added batch image-embeddings endpoint and simplified single-image embeddings API.
    • App now preloads embedding service at startup.
  • Removed

    • Generic model inference endpoint and related preprocessing/model-inference functionality.
    • Image quality/clarity analysis service.
    • Cloud provider configuration for AWS/GCP and related cloud model-loading integration.
  • Chores

    • Switched PostgreSQL driver to psycopg2-binary.
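For illustration, a client of the new batch endpoint would base64-encode each image before posting. A minimal sketch of building such a request body; the `image_batch` field name is taken from the review walkthrough, and the exact schema is an assumption:

```python
import base64
import json

def build_batch_payload(raw_images: list[bytes]) -> str:
    # Base64-encode every image so it survives JSON transport; the server
    # side is expected to decode each entry back to bytes before embedding.
    encoded = [base64.b64encode(img).decode("ascii") for img in raw_images]
    return json.dumps({"image_batch": encoded})

payload = build_batch_payload([b"fake-png-bytes", b"fake-jpeg-bytes"])
```

The resulting string would be sent as the JSON body of a POST to /v1/query/embeddings/batch.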

@coderabbitai bot commented on Mar 26, 2026

📝 Walkthrough

Removed cloud storage and model-inference infrastructure; streamlined the app to image-embedding only, added a batch embeddings endpoint and shared base64 decoding helper, removed image-clarity/preprocessing/model-repo services, and updated DI/startup to preload the image-embedding singleton.

Changes

Cohort / File(s) and Summary

• Configuration
  wavefront/server/apps/inference_app/inference_app/config.ini
  Removed the [aws], [gcp], and [cloud_config] sections and related keys.
• Dependency Injection Container
  wavefront/server/apps/inference_app/inference_app/inference_app_container.py
  Removed providers: cache_manager, cloud_storage_manager, model_repository, model_inference, image_analyser; retained the image_embedding singleton.
• Controller
  wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py
  Removed the generic model-inference route. Added ImagePayload and ImageBatchPayload models, an extract_decoded_image_data(...) helper, a /v1/query/embeddings single-image handler, and a /v1/query/embeddings/batch batch handler with base64 validation.
• Services — Removed
  wavefront/server/apps/inference_app/inference_app/service/image_analyser.py, wavefront/server/apps/inference_app/inference_app/service/model_inference.py, wavefront/server/apps/inference_app/inference_app/service/model_repository.py
  Deleted entire modules: image clarity analysis, model inference and preprocessing, and model repository/cloud-loading logic.
• Service — Image Embedding
  wavefront/server/apps/inference_app/inference_app/service/image_embedding.py
  Introduced module-level CLIP_MODEL_NAME/DINO_MODEL_NAME constants and added query_embed_batch(image_batch: list[bytes]) to decode images, run batched processing, L2-normalize, and return per-model batched embeddings.
• Startup / Server
  wavefront/server/apps/inference_app/inference_app/server.py
  Added a FastAPI lifespan asynccontextmanager to preload image_embedding on startup; updated app initialization accordingly.
• Dependency Changes (pyproject)
  wavefront/server/apps/floconsole/pyproject.toml, wavefront/server/apps/inference_app/pyproject.toml, wavefront/server/modules/insights_module/pyproject.toml
  Replaced psycopg2 with psycopg2-binary in three pyproject files (same version constraints).

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant API as FastAPI App
  participant Ctrl as InferenceController
  participant Embed as ImageEmbedding
  participant DI as InferenceAppContainer

  Note over DI,Embed: App startup
  DI->>Embed: preload image_embedding()
  Note over DI,Embed: models/embedders initialized

  Client->>API: POST /v1/query/embeddings/batch (image_batch)
  API->>Ctrl: forward request payload
  Ctrl->>Ctrl: extract_decoded_image_data on each entry
  Ctrl->>Embed: query_embed_batch(list[bytes])
  Embed->>Embed: decode images, batch-process, normalize
  Embed-->>Ctrl: batched embeddings
  Ctrl-->>API: 200 OK with embeddings
  API-->>Client: response
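The "decode images, batch-process, normalize" step in the diagram hinges on L2-normalizing each embedding row. A small sketch with NumPy standing in for the CLIP/DINO forward pass (the real query_embed_batch first decodes the raw bytes with PIL):

```python
import numpy as np

def l2_normalize(batch: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    # Divide each row (one embedding per image) by its Euclidean norm so
    # that cosine similarity downstream reduces to a plain dot product.
    norms = np.linalg.norm(batch, axis=1, keepdims=True)
    return batch / np.maximum(norms, eps)

# Stand-in for a model forward pass over a decoded image batch.
fake_embeddings = np.array([[3.0, 4.0], [0.0, 5.0]])
normalized = l2_normalize(fake_embeddings)
# Each row now has unit length: [[0.6, 0.8], [0.0, 1.0]]
```

The eps floor guards against division by zero for a degenerate all-zeros embedding.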

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Suggested reviewers

  • vishnurk6247

Poem

🐰 Hopped through configs, nibbled out the cloud,
Batch embeddings now sing, simple, and proud.
No more long pipelines or cached dusty lore,
Just pixels to vectors — quick, batch, explore! 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
• Docstring Coverage: ⚠️ Warning. Docstring coverage is 14.29%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
• Description Check: ✅ Passed. Check skipped: CodeRabbit's high-level summary is enabled.
• Title Check: ✅ Passed. The title directly and clearly describes the main objective of the changeset: simplifying the inference module to focus exclusively on image embedding generation.



@coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
wavefront/server/apps/inference_app/inference_app/service/image_embedding.py (1)

17-26: ⚠️ Potential issue | 🔴 Critical

AttributeError: class references non-existent attributes self.CLIP_MODEL_NAME and self.DINO_MODEL_NAME.

The model name constants were moved to module-level (lines 8-9), but the __init__ method still references them via self.CLIP_MODEL_NAME and self.DINO_MODEL_NAME. This will raise AttributeError at runtime.

🐛 Proposed fix
-        self.clip_processor = CLIPProcessor.from_pretrained(self.CLIP_MODEL_NAME)
-        self.clip_model = CLIPModel.from_pretrained(self.CLIP_MODEL_NAME).to(
+        self.clip_processor = CLIPProcessor.from_pretrained(CLIP_MODEL_NAME)
+        self.clip_model = CLIPModel.from_pretrained(CLIP_MODEL_NAME).to(
             self.device
         )
         self.clip_model.eval()

-        self.dino_processor = AutoImageProcessor.from_pretrained(self.DINO_MODEL_NAME)
+        self.dino_processor = AutoImageProcessor.from_pretrained(DINO_MODEL_NAME)
         self.dino_model = AutoModel.from_pretrained(
-            self.DINO_MODEL_NAME, trust_remote_code=True
+            DINO_MODEL_NAME, trust_remote_code=True
         ).to(self.device)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `wavefront/server/apps/inference_app/inference_app/service/image_embedding.py`
around lines 17-26, the __init__ method references non-existent instance attributes
self.CLIP_MODEL_NAME and self.DINO_MODEL_NAME; update the CLIP and DINO model
construction to use the module-level constants CLIP_MODEL_NAME and
DINO_MODEL_NAME instead. Specifically, change the calls that create
self.clip_processor (CLIPProcessor.from_pretrained), self.clip_model
(CLIPModel.from_pretrained), self.dino_processor
(AutoImageProcessor.from_pretrained) and self.dino_model
(AutoModel.from_pretrained) to pass the module-level names CLIP_MODEL_NAME and
DINO_MODEL_NAME rather than self.CLIP_MODEL_NAME/self.DINO_MODEL_NAME so the
models initialize without raising AttributeError.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py`:
- Around line 72-75: The helper extract_decoded_image_data currently calls
base64.b64decode which can raise binascii.Error on malformed input; wrap the
decode call in a try/except that catches binascii.Error (import binascii) and
re-raise a clear ValueError or custom exception (e.g., "Invalid base64 image
data") so caller endpoints can return a 4xx response instead of an unhandled
500; update any call sites (the single-image and multi-image endpoints) to catch
that ValueError and convert it to an appropriate HTTP error response.
- Around line 51-69: The batch endpoint image_embedding_batch currently always
returns HTTP 200 and doesn't handle decode errors; update image_embedding_batch
to (1) catch binascii.Error (raised by extract_decoded_image_data) and return an
HTTP 400 using response_formatter.buildSuccessResponse or an appropriate error
response, and (2) after calling image_embedding_service.query_embed_batch, check
if embeddings is empty or falsy and return HTTP 400 (mirroring the single-image
flow) instead of always returning 200; reference extract_decoded_image_data,
query_embed_batch, image_embedding_batch, and
response_formatter.buildSuccessResponse to locate where to add the try/except
and the empty-result conditional.

In `wavefront/server/apps/inference_app/inference_app/inference_app_container.py`:
- Around line 6-9: The failure is caused by server.py calling
InferenceAppContainer(cache_manager=None) but the container no longer declares a
cache_manager provider; restore compatibility by adding a dependency provider
named cache_manager to InferenceAppContainer (e.g., add a line cache_manager =
providers.Dependency() in the class) so the existing instantiation with
cache_manager=None succeeds, or alternatively remove the cache_manager argument
from the call site in server.py; update the symbol InferenceAppContainer to
include cache_manager = providers.Dependency() if you choose the former.

In `wavefront/server/apps/inference_app/inference_app/service/image_embedding.py`:
- Around line 92-100: The current loop that opens images (the block that calls
Image.open on items from image_batch and appends to images) silently returns []
on any decode error, making failures indistinguishable from an empty input;
change this to raise a descriptive exception instead: when an exception occurs
while opening an image (use the same except block that catches Exception as e),
log the error and then raise a ValueError (or a custom DecodeError) that
includes the failing index (idx) and the original exception (e) so callers can
distinguish a decode failure from an empty batch and handle it explicitly.

---

Outside diff comments:
In `wavefront/server/apps/inference_app/inference_app/service/image_embedding.py`:
- Around line 17-26: The __init__ references non-existent instance attributes
self.CLIP_MODEL_NAME and self.DINO_MODEL_NAME; update the CLIP and DINO model
construction to use the module-level constants CLIP_MODEL_NAME and
DINO_MODEL_NAME instead. Specifically, change the calls that create
self.clip_processor (CLIPProcessor.from_pretrained), self.clip_model
(CLIPModel.from_pretrained), self.dino_processor
(AutoImageProcessor.from_pretrained) and self.dino_model
(AutoModel.from_pretrained) to pass the module-level names CLIP_MODEL_NAME and
DINO_MODEL_NAME rather than self.CLIP_MODEL_NAME/self.DINO_MODEL_NAME so the
models initialize without raising AttributeError.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3329984f-e9a3-4146-8a0d-c1c121a392b4

📥 Commits

Reviewing files that changed from the base of the PR and between 90760a1 and 5d07db0.

📒 Files selected for processing (7)
  • wavefront/server/apps/inference_app/inference_app/config.ini
  • wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py
  • wavefront/server/apps/inference_app/inference_app/inference_app_container.py
  • wavefront/server/apps/inference_app/inference_app/service/image_analyser.py
  • wavefront/server/apps/inference_app/inference_app/service/image_embedding.py
  • wavefront/server/apps/inference_app/inference_app/service/model_inference.py
  • wavefront/server/apps/inference_app/inference_app/service/model_repository.py
💤 Files with no reviewable changes (4)
  • wavefront/server/apps/inference_app/inference_app/service/image_analyser.py
  • wavefront/server/apps/inference_app/inference_app/config.ini
  • wavefront/server/apps/inference_app/inference_app/service/model_repository.py
  • wavefront/server/apps/inference_app/inference_app/service/model_inference.py

Comment on lines +72 to +75
def extract_decoded_image_data(image_data: str) -> bytes:
    parts = image_data.split(',')
    base64_data = parts[1] if len(parts) == 2 else parts[0]
    return base64.b64decode(base64_data)
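For reference, the helper as written accepts both a bare base64 string and a data-URL form. A self-contained check of that behavior, mirroring the snippet above:

```python
import base64

def extract_decoded_image_data(image_data: str) -> bytes:
    # Mirrors the helper under review: strip an optional data-URL prefix
    # (e.g. "data:image/png;base64,"), then base64-decode the remainder.
    parts = image_data.split(',')
    base64_data = parts[1] if len(parts) == 2 else parts[0]
    return base64.b64decode(base64_data)

raw = b"hello"
encoded = base64.b64encode(raw).decode("ascii")
assert extract_decoded_image_data(encoded) == raw
assert extract_decoded_image_data("data:image/png;base64," + encoded) == raw
```

As the review notes, malformed input makes base64.b64decode raise binascii.Error, which is what the proposed fix below the snippet wraps.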

⚠️ Potential issue | 🟡 Minor

extract_decoded_image_data can raise unhandled binascii.Error on malformed base64.

base64.b64decode raises binascii.Error for invalid base64 input. This function is called directly without try/except in both endpoints. For the single-image endpoint (line 36), this would cause an HTTP 500. Consider adding error handling here or at call sites.

💡 Option: Add validation with a clearer error
+import binascii
+
+class InvalidImageDataError(Exception):
+    pass
+
 def extract_decoded_image_data(image_data: str) -> bytes:
     parts = image_data.split(',')
     base64_data = parts[1] if len(parts) == 2 else parts[0]
-    return base64.b64decode(base64_data)
+    try:
+        return base64.b64decode(base64_data)
+    except binascii.Error as e:
+        raise InvalidImageDataError('Invalid base64 encoded image data') from e
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-def extract_decoded_image_data(image_data: str) -> bytes:
-    parts = image_data.split(',')
-    base64_data = parts[1] if len(parts) == 2 else parts[0]
-    return base64.b64decode(base64_data)
+import binascii
+
+class InvalidImageDataError(Exception):
+    pass
+
+def extract_decoded_image_data(image_data: str) -> bytes:
+    parts = image_data.split(',')
+    base64_data = parts[1] if len(parts) == 2 else parts[0]
+    try:
+        return base64.b64decode(base64_data)
+    except binascii.Error as e:
+        raise InvalidImageDataError('Invalid base64 encoded image data') from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py`
around lines 72 - 75, The helper extract_decoded_image_data currently calls
base64.b64decode which can raise binascii.Error on malformed input; wrap the
decode call in a try/except that catches binascii.Error (import binascii) and
re-raise a clear ValueError or custom exception (e.g., "Invalid base64 image
data") so caller endpoints can return a 4xx response instead of an unhandled
500; update any call sites (the single-image and multi-image endpoints) to catch
that ValueError and convert it to an appropriate HTTP error response.

@coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py`:
- Around line 63-74: The batch endpoint currently only catches binascii.Error
when decoding images but not ValueError raised by
image_embedding_service.query_embed_batch (per image_embedding.py) when PIL
fails to decode a corrupted image; update the try/except around creating and
embedding the image batch to also catch ValueError (in addition to
binascii.Error) and return the same JSONResponse with status
HTTP_400_BAD_REQUEST and the 'Invalid base64 image data in batch' error via
response_formatter.buildErrorResponse so corrupted-but-base64 images produce 400
instead of 500; you should reference extract_decoded_image_data,
payload.image_batch, and image_embedding_service.query_embed_batch when locating
the code to change.
- Around line 36-38: The single-image endpoint calls
extract_decoded_image_data(payload.image_data) and then
image_embedding_service.query_embed(image_data) without handling malformed
base64; wrap the call to extract_decoded_image_data in a try/except that catches
binascii.Error (same as the batch endpoint) and convert it into a user-facing
400 response (e.g., raise HTTPException(status_code=400, detail="Malformed
base64 image data") or return the same error object used by the batch path) so
malformed inputs return 400 instead of propagating to the global 500 handler.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7cd5d164-49ab-450d-8fc8-9c2d0b8b4232

📥 Commits

Reviewing files that changed from the base of the PR and between 5d07db0 and 6c053ef.

⛔ Files ignored due to path filters (1)
  • wavefront/server/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • wavefront/server/apps/floconsole/pyproject.toml
  • wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py
  • wavefront/server/apps/inference_app/inference_app/server.py
  • wavefront/server/apps/inference_app/inference_app/service/image_embedding.py
  • wavefront/server/apps/inference_app/pyproject.toml
  • wavefront/server/modules/insights_module/pyproject.toml
✅ Files skipped from review due to trivial changes (3)
  • wavefront/server/modules/insights_module/pyproject.toml
  • wavefront/server/apps/floconsole/pyproject.toml
  • wavefront/server/apps/inference_app/pyproject.toml

Comment on lines +36 to +38
    # 1. Decode Base64 string
    image_data = extract_decoded_image_data(payload.image_data)
    embeddings = image_embedding_service.query_embed(image_data)

⚠️ Potential issue | 🟠 Major

Single endpoint missing binascii.Error handling.

The batch endpoint (lines 63-73) wraps the decode in a try/except for binascii.Error, but the single endpoint calls extract_decoded_image_data without similar protection. Malformed base64 input will propagate to the global exception handler and return HTTP 500 instead of a user-friendly 400.

🛡️ Proposed fix for consistent error handling
 @inference_app_router.post('/v1/query/embeddings')
 @inject
 async def image_embedding(
     payload: ImagePayload,
     response_formatter: ResponseFormatter = Depends(
         Provide[CommonContainer.response_formatter]
     ),
     image_embedding_service: ImageEmbedding = Depends(
         Provide[InferenceAppContainer.image_embedding]
     ),
 ):
-    # 1. Decode Base64 string
-    image_data = extract_decoded_image_data(payload.image_data)
+    try:
+        image_data = extract_decoded_image_data(payload.image_data)
+    except binascii.Error:
+        return JSONResponse(
+            status_code=status.HTTP_400_BAD_REQUEST,
+            content=response_formatter.buildErrorResponse(
+                'Invalid base64 image data'
+            ),
+        )
     embeddings = image_embedding_service.query_embed(image_data)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py`
around lines 36 - 38, The single-image endpoint calls
extract_decoded_image_data(payload.image_data) and then
image_embedding_service.query_embed(image_data) without handling malformed
base64; wrap the call to extract_decoded_image_data in a try/except that catches
binascii.Error (same as the batch endpoint) and convert it into a user-facing
400 response (e.g., raise HTTPException(status_code=400, detail="Malformed
base64 image data") or return the same error object used by the batch path) so
malformed inputs return 400 instead of propagating to the global 500 handler.

vizsatiz merged commit 7a6e647 into develop on Apr 6, 2026
10 checks passed
vizsatiz deleted the CU-86d2e1ka4-Simplify-inference-module branch on April 6, 2026 at 14:03