
add HF detection processor#377

Merged
dangusev merged 19 commits into main from add-transformers-detection
Mar 6, 2026

Conversation

@maxkahan
Contributor

@maxkahan maxkahan commented Feb 25, 2026

This pull request introduces major enhancements to the HuggingFace plugin for Vision Agents, expanding support for both cloud and local inference, adding new utilities for video annotation, and providing comprehensive documentation and examples. The most important changes are grouped below by theme.

API and Core Enhancements

  • Added DetectedObject type to the core events module (vision_agents/core/events/base.py, vision_agents/core/events/__init__.py) to standardize object detection results.
  • Introduced a shared annotation utility function annotate_image for video processor plugins, allowing easy drawing of bounding boxes and labels on images (vision_agents/core/utils/annotation.py).
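As a rough sketch of what the new DetectedObject type standardizes (the field names below are assumed for illustration, not copied from base.py), a TypedDict-based shape might look like:

```python
from typing import TypedDict

# Hypothetical sketch of the DetectedObject contract described above;
# the real field names live in vision_agents/core/events/base.py.
class DetectedObject(TypedDict):
    label: str                               # class name, e.g. "person"
    confidence: float                        # score in [0.0, 1.0]
    bbox: tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels

obj: DetectedObject = {
    "label": "person",
    "confidence": 0.91,
    "bbox": (10.0, 20.0, 110.0, 220.0),
}
print(obj["label"], obj["confidence"])
```

Because TypedDict is purely a static-typing construct, consumers of detection events get editor/type-checker support with zero runtime overhead.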

Plugin and Documentation Updates

  • Updated the HuggingFace plugin README to clarify installation and usage for both cloud and local inference, including new sections for local LLM, VLM, and object detection with quantization support (plugins/huggingface/README.md).
  • Added supervision and opencv-python as dependencies for local annotation and detection features (plugins/huggingface/pyproject.toml).

Examples and Usage

  • Added new example scripts demonstrating usage of the HuggingFace plugin for cloud VLM inference (including Baseten integration), local object detection via Transformers, and event subscription for detection results (plugins/huggingface/examples/inference_api/baseten_vlm_example.py, plugins/huggingface/examples/inference_api/inference_api_vlm_example.py, plugins/huggingface/examples/transformers/transformers_detection_example.py).

Summary by CodeRabbit

  • New Features

    • Local Transformers-based object-detection processor with optional on-frame annotation, class filtering, and emitted detection-completed events.
    • Centralized image-annotation utility and a new public DetectedObject type.
    • Option to supply a custom HuggingFace API base URL for cloud/self-hosted inference.
  • Documentation

    • Expanded HuggingFace plugin guide with cloud vs. local workflows, configuration, and examples.
  • Tests

    • Added unit and integration tests covering detection processing and class filtering.
  • Examples

    • New example scripts demonstrating local and cloud VLM/LLM and detection workflows.

Note

Medium Risk
Medium risk due to a cross-cutting API change to LLM.simple_response (removing the processors argument) that touches many plugins/tests and could break downstream integrations. New local detection/annotation dependencies (opencv-python, supervision) and threaded inference add some runtime/performance uncertainty.

Overview
Adds local HuggingFace Transformers object detection. Introduces TransformersDetectionProcessor with optional frame annotation, class filtering, and a new DetectionCompletedEvent carrying detected objects + image metadata; includes new examples and tests covering detection output/annotation behavior.

Expands HuggingFace cloud configuration. HuggingFaceLLM/HuggingFaceVLM now accept base_url for custom OpenAI-compatible endpoints (mutually exclusive with provider), and the plugin exposes the new processor/event in __init__ while adding opencv-python/supervision optional deps.
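The provider/base_url contract described above can be sketched as follows. This is illustrative only: resolve_endpoint and the router URL are assumptions for the sketch, not the plugin's actual implementation.

```python
from typing import Optional

# Illustrative sketch of the mutual-exclusivity rule described above:
# a custom OpenAI-compatible base_url cannot be combined with a provider.
def resolve_endpoint(provider: Optional[str] = None,
                     base_url: Optional[str] = None) -> str:
    if provider is not None and base_url is not None:
        raise ValueError(
            "provider and base_url are mutually exclusive; provide only one"
        )
    if base_url is not None:
        return base_url  # e.g. a Baseten or other self-hosted endpoint
    # Placeholder default route when only a provider (or nothing) is given.
    return f"https://inference.example/{provider or 'auto'}"

print(resolve_endpoint(base_url="https://model.example.co/v1"))
```

Raising ValueError eagerly at construction time surfaces misconfiguration immediately, rather than silently preferring one argument over the other.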

Simplifies the core LLM interface. Removes the unused processors argument from LLM.simple_response and updates all provider implementations, agent call sites, and tests accordingly (plus minor example/doc cleanup and a small Roboflow annotation utility refactor).

Written by Cursor Bugbot for commit 1bd4e1d. This will update automatically on new commits. Configure here.

@coderabbitai

coderabbitai bot commented Feb 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds a DetectedObject TypedDict and exports, a DetectionCompletedEvent, a TransformersDetectionProcessor for local HuggingFace object detection (with annotation utilities), new examples and tests, optional supervision/opencv extras, base_url support for HF LLM/VLM, and Roboflow processors updated to use the centralized annotator.

Changes

Cohort / File(s) Summary
Core Events & Exports
agents-core/vision_agents/core/events/base.py, agents-core/vision_agents/core/events/__init__.py
Add DetectedObject TypedDict and export it via package __all__.
Core Utilities (new)
agents-core/vision_agents/core/utils/annotation.py
New annotate_image helper using supervision/OpenCV/NumPy for masks, boxes, labels, optional dimming and mask opacity; guarded imports and configurable label styling.
HuggingFace Detection Processor (new)
plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py
Introduce TransformersDetectionProcessor and DetectionResources: model loading, device/dtype handling, threaded per-frame inference, class filtering, optional annotation via core helper, event emission, lifecycle (warmup, publish/stop, unload).
HuggingFace Plugin Events
plugins/huggingface/vision_agents/plugins/huggingface/events.py
Add DetectionCompletedEvent (extends VideoProcessorDetectionEvent) with objects, image_width, image_height, and detection_count set in __post_init__.
HuggingFace Plugin Exports
plugins/huggingface/vision_agents/plugins/huggingface/__init__.py
Conditionally import/export TransformersDetectionProcessor and DetectionCompletedEvent; note supervision among optional dependencies in ImportError guidance.
HuggingFace LLM/VLM changes
plugins/huggingface/vision_agents/plugins/huggingface/huggingface_llm.py, .../huggingface_vlm.py
Add base_url parameter, enforce mutual exclusivity with provider, adjust client initialization paths, and update simple_response typing to LLMResponseEvent[Any].
HuggingFace Examples
plugins/huggingface/examples/.../*.py
Add example agents: Baseten VLM, inference API VLM, and local Transformers detection examples with create_agent/join_call and Runner entrypoints.
HuggingFace Tests
plugins/huggingface/tests/test_transformers_detection.py
New tests and fixtures validating TransformersDetectionProcessor: class filtering, detection outputs, event emission, and edge cases.
HuggingFace Config
plugins/huggingface/pyproject.toml
Add optional extras to transformers group: supervision>=0.21.0 and opencv-python>=4.8.0.
HuggingFace Docs
plugins/huggingface/README.md
Rewrite README to separate cloud (API) and local (Transformers) usage, update examples, parameters, and detection usage guidance.
Roboflow: annotate_image relocation
plugins/roboflow/vision_agents/.../roboflow_cloud_processor.py, .../roboflow_local_processor.py, plugins/roboflow/vision_agents/plugins/roboflow/utils.py
Remove local annotate_image; processors now import centralized agents-core annotation utility; local function deleted.
Other small changes
plugins/openai/examples/qwen_vl_example/qwen_vl_example.py, CLAUDE.md
Minor example model/string and call-flow tweaks; two documentation guidelines added to CLAUDE.md.
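The DetectionCompletedEvent row above notes that detection_count is derived in __post_init__. A toy, self-contained reconstruction of that pattern (field names assumed from this summary, not copied from events.py) looks like:

```python
from dataclasses import dataclass, field
from typing import Any

# Toy reconstruction of the DetectionCompletedEvent contract: the caller
# supplies the detected objects; detection_count is computed, not passed in.
@dataclass
class DetectionCompletedEvent:
    objects: list[dict[str, Any]] = field(default_factory=list)
    image_width: int = 0
    image_height: int = 0
    detection_count: int = 0

    def __post_init__(self) -> None:
        # Keep the count consistent with the objects list automatically.
        self.detection_count = len(self.objects)

evt = DetectionCompletedEvent(
    objects=[{"label": "cat"}, {"label": "dog"}],
    image_width=640,
    image_height=480,
)
print(evt.detection_count)
```

Deriving the count in __post_init__ means subscribers can branch on `detection_count` without the emitter ever passing an inconsistent value.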

Sequence Diagram

sequenceDiagram
    participant Agent
    participant Processor as TransformersDetectionProcessor
    participant Track as VideoTrack
    participant Model as DetectionModel
    participant EventMgr as EventManager

    Agent->>Processor: attach_agent()
    Agent->>Processor: process_video(track)
    Track->>Processor: av.VideoFrame
    Processor->>Processor: _process_frame()
    Processor->>Model: _run_inference(image)
    Model-->>Processor: raw detections
    Processor->>Processor: _detect() → list[DetectedObject]
    alt annotate enabled
        Processor->>Processor: _annotate(image, objects) → annotated image
        Processor->>Track: output annotated frame
    end
    Processor->>EventMgr: emit DetectionCompletedEvent(objects, image_width, image_height)
    EventMgr->>Agent: event notification

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

The lens keeps threshing, a bright knife;
boxes bloom like small verdicts across the face.
Confidence bleeds into the white of the frame,
names nailed to moving shadows, cool as bone.
The machine catalogs the dark—no mercy, no grace.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 33.33%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Title check — ✅ Passed: the title 'add HF detection processor' clearly refers to the main change: adding a HuggingFace detection processor (TransformersDetectionProcessor) to enable local object detection.
  • Description check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch add-transformers-detection

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 11

🧹 Nitpick comments (7)
plugins/openai/examples/qwen_vl_example/qwen_vl_example.py (1)

26-31: Indentation bleeds into the instructions string.

The triple-quoted block includes a leading newline and 8 spaces of indentation before each line, all of which are passed verbatim to Agent. Use textwrap.dedent to strip them, or collapse to a plain concatenated string.

♻️ Proposed fix using `textwrap.dedent`
+import textwrap
 ...
-        instructions="""
-        You're a helpful video AI assistant. 
-        Analyze the video frames and respond to user questions about what you see. 
-        Keep responses to one sentence. 
-        Be concise and direct.
-        """,
+        instructions=textwrap.dedent("""\
+            You're a helpful video AI assistant.
+            Analyze the video frames and respond to user questions about what you see.
+            Keep responses to one sentence.
+            Be concise and direct.
+        """),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/openai/examples/qwen_vl_example/qwen_vl_example.py` around lines 26 -
31, The multi-line instructions string assigned to instructions includes a
leading newline and indentation that will be passed verbatim to Agent; wrap the
string with textwrap.dedent (and import textwrap at top) or replace with a
concatenated single-line string so the Agent receives no extra leading
whitespace—update the assignment to use textwrap.dedent("""...""") around the
block (or collapse to "You're a helpful video AI assistant. Analyze..." ) and
ensure the import for textwrap is added.
plugins/huggingface/examples/inference_api/baseten_vlm_example.py (1)

44-45: Validate env vars with explicit errors instead of raw KeyError.

Using os.environ[...] fails with a less actionable traceback when variables are missing. Validate first and raise a descriptive ValueError.

🛠️ Suggested validation improvement
 async def create_agent(**kwargs) -> Agent:
     """Create the agent with a Baseten-hosted VLM."""
+    base_url = os.getenv("BASETEN_BASE_URL")
+    api_key = os.getenv("BASETEN_API_KEY")
+    if not base_url:
+        raise ValueError("BASETEN_BASE_URL environment variable is required.")
+    if not api_key:
+        raise ValueError("BASETEN_API_KEY environment variable is required.")
+
     agent = Agent(
         edge=getstream.Edge(),
         agent_user=User(name="Baseten VLM Agent", id="agent"),
@@
         llm=huggingface.VLM(
             model="Qwen/Qwen2.5-VL-72B-Instruct",
-            base_url=os.environ["BASETEN_BASE_URL"],
-            api_key=os.environ["BASETEN_API_KEY"],
+            base_url=base_url,
+            api_key=api_key,
             fps=1,
             frame_buffer_seconds=3,
         ),
As per coding guidelines, "Raise `ValueError` with a descriptive message for invalid constructor arguments; prefer custom domain exceptions over generic ones".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/huggingface/examples/inference_api/baseten_vlm_example.py` around
lines 44 - 45, Replace direct os.environ[...] access with explicit validation
that raises ValueError with a clear message when env vars are missing: check the
environment for "BASETEN_BASE_URL" and "BASETEN_API_KEY" before constructing the
client (the parameters base_url and api_key where os.environ is currently used),
and if either is absent raise ValueError("BASETEN_BASE_URL is required") or
ValueError("BASETEN_API_KEY is required") respectively so callers see a
descriptive error instead of a raw KeyError.
plugins/huggingface/README.md (1)

18-53: Consider adding a base_url cloud example for OpenAI-compatible endpoints.

The code now supports base_url in huggingface.LLM/huggingface.VLM; a short snippet here would make that path discoverable (e.g., Baseten-style usage).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/huggingface/README.md` around lines 18 - 53, Add an additional small
example showing how to initialize huggingface.LLM and huggingface.VLM with the
base_url parameter for OpenAI-compatible endpoints (e.g., Baseten-style usage);
update README.md to include a short snippet demonstrating passing
base_url="https://your-openai-compatible-endpoint" alongside model and provider
(referencing LLM and VLM constructors) and show calling simple_response so users
can discover the cloud base_url path.
plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py (3)

131-132: Local imports in _load_model_sync and _annotate violate the top-of-module import rule.

Lines 132 and 328–330 use local imports for transformers and supervision. These are optional dependencies, so if the intent is lazy loading for optional extras, consider guarding them at the top of the module (e.g., inside a try/except ImportError block at module level, similar to how __init__.py already handles the optional import of this entire module). That way the guideline is satisfied and the lazy-load intent is preserved.

As per coding guidelines, "Do not use local imports; import at the top of the module".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py`
around lines 131 - 132, The local imports of optional dependencies inside
_load_model_sync and _annotate (AutoImageProcessor/AutoModelForObjectDetection
from transformers and supervision) violate the top-of-module import rule; move
these optional imports to the top of the module inside a guarded try/except
ImportError block (e.g., attempt to import transformers and supervision at
module scope and set module-level flags or None for missing packages), then
update _load_model_sync and _annotate to reference those module-level names or
raise a clear error if the optional dependency is unavailable so the lazy-load
intent is preserved while keeping imports at module top.
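The guarded top-of-module import the comment describes can be sketched like this; `heavy_optional_dep` is a placeholder standing in for transformers/supervision, and the error message is illustrative.

```python
# Guarded module-level import, as the comment recommends: attempt the
# optional dependency once at import time, record availability, and fail
# with a clear message only when the feature is actually used.
try:
    import heavy_optional_dep  # type: ignore  # hypothetical optional extra
    HAS_DETECTION = True
except ImportError:
    heavy_optional_dep = None
    HAS_DETECTION = False

def load_model(name: str):
    """Load a detection model, or explain which extra is missing."""
    if not HAS_DETECTION:
        raise ImportError(
            "Detection dependencies are not installed; "
            "install the plugin's [transformers] extra."
        )
    return heavy_optional_dep.load(name)

print(HAS_DETECTION)
```

This keeps imports at module top (satisfying the guideline) while preserving lazy, optional behavior: modules that never call `load_model` import cleanly even without the extra installed.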

339-344: unload is a public method placed after all private helpers.

Per the class method ordering guideline (__init__, public lifecycle, properties, public feature methods, private helpers, dunder methods), unload should sit with the other public lifecycle/feature methods (near close/stop_processing), not after the private helpers.

As per coding guidelines, "Order class methods as: __init__, public lifecycle methods, properties, public feature methods, private helpers, dunder methods".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py`
around lines 339 - 344, unload is a public lifecycle method but is placed after
private helpers; move the unload method so it sits with the other public
lifecycle/feature methods (near close and stop_processing) following __init__,
then public lifecycle methods, properties, public features, private helpers,
then dunder methods; locate the unload definition and cut/paste it into the
class section where close/stop_processing are implemented to restore the correct
method ordering.

196-202: close() should be async-first with resource cleanup in finally.

If stop_processing() or unload() raises, subsequent cleanup steps are skipped. Wrapping in try/finally ensures the executor and video track are always shut down, even on failure. This aligns with the guideline on cleaning up resources in finally blocks.

Proposed fix
     async def close(self) -> None:
-        await self.stop_processing()
-        self._closed = True
-        self._executor.shutdown(wait=False)
-        self._video_track.stop()
-        self.unload()
-        logger.info("Transformers detection processor closed")
+        try:
+            await self.stop_processing()
+        finally:
+            self._closed = True
+            self._executor.shutdown(wait=False)
+            self._video_track.stop()
+            self.unload()
+            logger.info("Transformers detection processor closed")

As per coding guidelines, "Use asyncio.Lock, asyncio.Task, asyncio.gather for concurrency; clean up resources in finally blocks".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py`
around lines 196 - 202, The close() method currently runs cleanup sequentially
so if await self.stop_processing() or self.unload() raises, subsequent shutdown
steps are skipped; refactor close() to be async-first and ensure critical
cleanup always runs by surrounding the awaited work in try/finally: await
self.stop_processing() and self.unload() in the try block (or perform awaited
operations first), set self._closed = True there, and in the finally block
always call self._executor.shutdown(wait=False) and self._video_track.stop()
(and log afterward) so the executor and video track are shut down even on
errors; reference the close(), stop_processing(), unload(),
self._executor.shutdown and self._video_track.stop symbols when making the
change.
plugins/huggingface/vision_agents/plugins/huggingface/__init__.py (1)

13-18: Silent pass on inner ImportError gives no signal when detection deps are missing.

If a user has torch and transformers installed but is missing supervision or opencv-python, TransformersDetectionProcessor silently vanishes from the public API with no warning. The outer block helpfully warns — consider doing the same here so users know to install the [transformers] extra with detection dependencies.

Proposed fix
     try:
         from .transformers_detection import TransformersDetectionProcessor

         __all__ += ["TransformersDetectionProcessor"]
-    except ImportError:
-        pass
+    except ImportError as exc:
+        import warnings
+
+        warnings.warn(
+            f"Optional dependency '{exc.name}' is not installed. "
+            "Install the [transformers] extra with detection support.",
+            stacklevel=2,
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/huggingface/vision_agents/plugins/huggingface/__init__.py` around
lines 13 - 18, Replace the silent "except ImportError: pass" around the import
of TransformersDetectionProcessor with the same warning behavior used in the
outer block: catch ImportError as e, emit a clear warning (via warnings.warn or
the module logger) that detection dependencies are missing and advise installing
the "[transformers]" extra, include the exception message for context, and keep
__all__ unchanged when the import fails so the public API reflects availability
correctly; target the import of TransformersDetectionProcessor and the
surrounding __all__ manipulation in __init__.py.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agents-core/vision_agents/core/utils/annotation.py`:
- Line 66: The current list comprehension labels = [classes[class_id] for
class_id in detected_class_ids] can KeyError for class_ids missing from classes;
update the lookup in annotation.py to guard against missing keys by using
classes.get(class_id, fallback) or by filtering detected_class_ids first (e.g.,
valid_ids = [id for id in detected_class_ids if id in classes]) and then mapping
to labels; also add a small log or placeholder label (e.g.,
f"unknown_{class_id}") for any skipped/missing ids so annotation-related code
(detected_class_ids, classes, labels) won’t crash when the model returns unseen
class IDs.
- Around line 48-53: The current check "if dim_factor:" ignores valid 0.0
inputs; update the logic in the annotation routine that builds the mask from
detections.xyxy and applies dimming so it uses an explicit "dim_factor is not
None" guard, validate that dim_factor is a number within [0.0, 1.0] (raise
ValueError on invalid), and then apply the multiplication even when dim_factor
== 0.0 so pixels outside bounding boxes become fully black; keep the existing
mask creation and cv2.rectangle calls but replace the truthy check with the
explicit None check and bounds validation before using dim_factor.
- Line 7: Remove the forbidden "from __future__ import annotations" line and
move the runtime imports (cv2, numpy, supervision) out of the function into
module-level imports guarded by optional-dependency checks: for type-only
imports use TYPE_CHECKING from typing, and for runtime packages wrap imports in
try/except ImportError to raise a clear error or set a sentinel if missing;
update any function-local import sites (the imports currently inside functions
in this module) to use the module-level names (e.g., cv2, np, supervision) so
functions like the ones referencing those local imports use the guarded
module-level symbols.

In `@plugins/huggingface/examples/inference_api/baseten_vlm_example.py`:
- Around line 13-16: The Requirements docstring is missing BASETEN_BASE_URL
while the script reads BASETEN_BASE_URL (environment variable) later; update the
top-of-file requirements block (the docstring listing required env vars) to
include BASETEN_BASE_URL so readers know to set it, ensuring the same exact env
var name BASETEN_BASE_URL is added alongside BASETEN_API_KEY, STREAM_API_KEY,
STREAM_API_SECRET, and DEEPGRAM_API_KEY in the docstring above the script.

In `@plugins/huggingface/tests/test_transformers_detection.py`:
- Around line 115-119: The test event handlers use future:
asyncio.Future[DetectionCompletedEvent] and call future.set_result(event)
unconditionally, which raises asyncio.InvalidStateError if the handler runs more
than once; update each on_event handler (the functions subscribed via
events_manager.subscribe around lines with future declarations) to check
future.done() before calling future.set_result (or catch InvalidStateError) so
the second and subsequent events are ignored and the test future is only
resolved once; apply the same guard to all three handlers referenced in the
comment.
- Line 7: Replace MagicMock usage in this test: remove the MagicMock import and
stop mocking in the agent_mock fixture and TestDetectClassFiltering; instead
implement minimal concrete stubs that satisfy the interfaces used by the tests.
For agent_mock, construct a simple stub or instantiate a real Agent with only
the events dependency (provide a lightweight events object) so tests call real
methods rather than a MagicMock; for TestDetectClassFiltering create tiny
FakeModel and FakeProcessor classes that implement the specific methods used by
the detection code (e.g., predict/process/forward) and return deterministic,
testable outputs rather than mocking the whole model; update the fixture and
test code to use these stubs and remove MagicMock references.

In `@plugins/huggingface/vision_agents/plugins/huggingface/huggingface_llm.py`:
- Around line 51-52: The constructor currently allows both provider and base_url
(and similar overloads) and silently prefers base_url; update
HuggingFaceLLM.__init__ (and any other constructors/factory functions mentioned
around the same areas) to validate that provider and base_url are mutually
exclusive: if both are provided, raise a ValueError with a clear message (e.g.,
"provider and base_url are mutually exclusive; provide only one"). Also apply
the same mutual-exclusion check where client and base_url/provider combinations
are accepted (referencing AsyncInferenceClient parameter) to ensure consistent
validation and raise the same descriptive ValueError when invalid argument
combinations are passed.

In `@plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py`:
- Around line 57-58: The constructor of HuggingFaceVLM (the __init__ method
accepting provider and base_url) must enforce mutual exclusivity instead of
silently preferring one: validate that not both provider and base_url are
provided and not both are None when one is required, and raise a ValueError with
a clear message (e.g., "provider and base_url are mutually exclusive; provide
exactly one") when the contract is violated; apply the same validation logic to
the other constructor/initialization places that accept these params (the
alternative overloads/initializers around the provider/base_url handling) so all
entry points raise ValueError on invalid argument combinations.
- Line 12: The code imports the internal type alias PROVIDER_OR_POLICY_T from
huggingface_hub.inference._providers; replace this with the public stable type
Optional[str] by removing the private import, adding "from typing import
Optional" if not present, and updating any annotations that use
PROVIDER_OR_POLICY_T (e.g., function parameters or variables named provider) to
use Optional[str] instead.

In
`@plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py`:
- Around line 221-226: Replace the bare except in the inference block that wraps
the call to self._run_inference(image) so it only catches expected failures
(e.g., RuntimeError and ValueError) instead of all Exceptions; update the except
clause(s) to catch those specific exceptions, keep the logger.exception("Frame
detection failed") call (or include the caught exception info) and still call
await self._video_track.add_frame(frame) before returning; refer to the
try/except around self._run_inference, logger.exception, and
self._video_track.add_frame to locate the exact spot.
- Line 21: Remove the forbidden statement "from __future__ import annotations"
from the top of transformers_detection.py and ensure any postponed/string
annotations are converted back to real evaluated types or handled via
typing.TYPE_CHECKING or explicit imports; inspect dataclass fields and any
isinstance/type checks in this module (e.g., any dataclass definitions or uses
of annotations) and replace string-literal type annotations with actual types or
guarded imports so runtime behavior and isinstance checks remain correct.

---

Nitpick comments:
In `@plugins/huggingface/examples/inference_api/baseten_vlm_example.py`:
- Around line 44-45: Replace direct os.environ[...] access with explicit
validation that raises ValueError with a clear message when env vars are
missing: check the environment for "BASETEN_BASE_URL" and "BASETEN_API_KEY"
before constructing the client (the parameters base_url and api_key where
os.environ is currently used), and if either is absent raise
ValueError("BASETEN_BASE_URL is required") or ValueError("BASETEN_API_KEY is
required") respectively so callers see a descriptive error instead of a raw
KeyError.

In `@plugins/huggingface/README.md`:
- Around line 18-53: Add an additional small example showing how to initialize
huggingface.LLM and huggingface.VLM with the base_url parameter for
OpenAI-compatible endpoints (e.g., Baseten-style usage); update README.md to
include a short snippet demonstrating passing
base_url="https://your-openai-compatible-endpoint" alongside model and provider
(referencing LLM and VLM constructors) and show calling simple_response so users
can discover the cloud base_url path.

In `@plugins/huggingface/vision_agents/plugins/huggingface/__init__.py`:
- Around line 13-18: Replace the silent "except ImportError: pass" around the
import of TransformersDetectionProcessor with the same warning behavior used in
the outer block: catch ImportError as e, emit a clear warning (via warnings.warn
or the module logger) that detection dependencies are missing and advise
installing the "[transformers]" extra, include the exception message for
context, and keep __all__ unchanged when the import fails so the public API
reflects availability correctly; target the import of
TransformersDetectionProcessor and the surrounding __all__ manipulation in
__init__.py.

In
`@plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py`:
- Around line 131-132: The local imports of optional dependencies inside
_load_model_sync and _annotate (AutoImageProcessor/AutoModelForObjectDetection
from transformers and supervision) violate the top-of-module import rule; move
these optional imports to the top of the module inside a guarded try/except
ImportError block (e.g., attempt to import transformers and supervision at
module scope and set module-level flags or None for missing packages), then
update _load_model_sync and _annotate to reference those module-level names or
raise a clear error if the optional dependency is unavailable so the lazy-load
intent is preserved while keeping imports at module top.
- Around line 339-344: unload is a public lifecycle method but is placed after
private helpers; move the unload method so it sits with the other public
lifecycle/feature methods (near close and stop_processing) following __init__,
then public lifecycle methods, properties, public features, private helpers,
then dunder methods; locate the unload definition and cut/paste it into the
class section where close/stop_processing are implemented to restore the correct
method ordering.
- Around line 196-202: The close() method currently runs cleanup sequentially so
if await self.stop_processing() or self.unload() raises, subsequent shutdown
steps are skipped; refactor close() to be async-first and ensure critical
cleanup always runs by surrounding the awaited work in try/finally: await
self.stop_processing() and self.unload() in the try block (or perform awaited
operations first), set self._closed = True there, and in the finally block
always call self._executor.shutdown(wait=False) and self._video_track.stop()
(and log afterward) so the executor and video track are shut down even on
errors; reference the close(), stop_processing(), unload(),
self._executor.shutdown and self._video_track.stop symbols when making the
change.
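The guarded module-top import pattern asked for in the first comment above can be sketched as follows. This is illustrative, not the PR's code: `frobnicator` is a deliberately fake package standing in for `transformers`/`supervision`, and the flag/function names are assumptions.

```python
# Guarded optional import at module top: the module always imports cleanly,
# and features needing the dependency raise a clear error only on use.
try:
    import frobnicator  # hypothetical optional dependency (stand-in)
    _FROBNICATOR_AVAILABLE = True
except ImportError:
    frobnicator = None  # type: ignore[assignment]
    _FROBNICATOR_AVAILABLE = False


def load_model(name: str):
    """Lazily use the optional dependency, failing with an actionable message."""
    if not _FROBNICATOR_AVAILABLE:
        raise ImportError(
            "frobnicator is not installed; install the plugin extras to use "
            "local detection (e.g. pip install 'plugin[local]')"
        )
    return frobnicator.load(name)
```

This keeps the lazy-load intent (no heavy import cost unless the feature is used) while satisfying the top-of-module import rule.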
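The try/finally `close()` shape from the last comment can likewise be sketched with a toy class (the `Processor`, `stop_processing`, and track-stop names here are illustrative stand-ins, not the PR's implementation). Even when `stop_processing()` raises, the executor and track still shut down:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


class Processor:
    """Illustrative processor whose close() always releases resources."""

    def __init__(self) -> None:
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._closed = False
        self.track_stopped = False

    async def stop_processing(self) -> None:
        raise RuntimeError("simulated failure mid-shutdown")

    async def close(self) -> None:
        try:
            await self.stop_processing()  # may raise...
            self._closed = True
        finally:
            # ...but critical cleanup always runs
            self._executor.shutdown(wait=False)
            self.track_stopped = True  # stands in for self._video_track.stop()


processor = Processor()
try:
    asyncio.run(processor.close())
except RuntimeError:
    pass  # the error still propagates, but cleanup has already run
```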

In `@plugins/openai/examples/qwen_vl_example/qwen_vl_example.py`:
- Around line 26-31: The multi-line instructions string assigned to instructions
includes a leading newline and indentation that will be passed verbatim to
Agent; wrap the string with textwrap.dedent (and import textwrap at top) or
replace with a concatenated single-line string so the Agent receives no extra
leading whitespace—update the assignment to use textwrap.dedent("""...""")
around the block (or collapse to "You're a helpful video AI assistant.
Analyze..." ) and ensure the import for textwrap is added.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f684ece and d5e8dd2.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (15)
  • agents-core/vision_agents/core/events/__init__.py
  • agents-core/vision_agents/core/events/base.py
  • agents-core/vision_agents/core/utils/annotation.py
  • plugins/huggingface/README.md
  • plugins/huggingface/examples/inference_api/baseten_vlm_example.py
  • plugins/huggingface/examples/inference_api/inference_api_vlm_example.py
  • plugins/huggingface/examples/transformers/transformers_detection_example.py
  • plugins/huggingface/pyproject.toml
  • plugins/huggingface/tests/test_transformers_detection.py
  • plugins/huggingface/vision_agents/plugins/huggingface/__init__.py
  • plugins/huggingface/vision_agents/plugins/huggingface/events.py
  • plugins/huggingface/vision_agents/plugins/huggingface/huggingface_llm.py
  • plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py
  • plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py
  • plugins/openai/examples/qwen_vl_example/qwen_vl_example.py

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py (1)

29-30: Import order violates guidelines: local package import placed after relative import.

The import from vision_agents.core.utils.annotation belongs with the other local package imports (lines 22–27), not after the relative import from .events.

🔧 Suggested fix
 from vision_agents.core.warmup import Warmable
+from vision_agents.core.utils.annotation import annotate_image

 from .events import DetectedObject, DetectionCompletedEvent
-from vision_agents.core.utils.annotation import annotate_image

As per coding guidelines: "Order imports as: stdlib, third-party, local package, relative."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py`
around lines 29 - 30, Move the local package import for annotate_image into the
block with other local package imports (so it appears before the relative import
from .events); specifically, place the import of
vision_agents.core.utils.annotation.annotate_image alongside the existing local
imports and keep the relative imports for DetectedObject and
DetectionCompletedEvent after them to satisfy the required import order (stdlib,
third-party, local package, relative).
plugins/huggingface/tests/test_transformers_detection.py (1)

51-96: Test logic is sound, but accesses private method _detect.

This test directly invokes processor._detect(image) which is a private method (prefixed with _). While this is acceptable for unit-testing internal filtering logic that's difficult to verify via integration tests alone, be aware this couples the test to implementation details.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/huggingface/tests/test_transformers_detection.py` around lines 51 -
96, The test calls the private method _detect on TransformersDetectionProcessor
(coupling the test to internals); change the test to use a public API instead by
replacing processor._detect(image) with processor.detect(image), and if
TransformersDetectionProcessor lacks a detect(...) public method add a simple
public wrapper detect(self, image) that delegates to the existing _detect(...)
so tests exercise the public contract rather than a private implementation
detail; update TestDetectClassFiltering accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/huggingface/examples/inference_api/baseten_vlm_example.py`:
- Line 33: The functions create_agent and join_call use untyped variadic kwargs;
update their signatures to annotate **kwargs with typing.Any (e.g., change
create_agent(**kwargs) and join_call(**kwargs) to create_agent(**kwargs: Any)
and join_call(**kwargs: Any)) and add the necessary import (from typing import
Any) so the module has explicit type annotations for these parameters.
- Around line 43-46: Add explicit validation for the environment variables used
in the VLM instantiation: check for "BASETEN_BASE_URL" and "BASETEN_API_KEY"
before calling huggingface.VLM (the block creating llm=huggingface.VLM(...)) and
if any are missing raise a ValueError listing which variables are absent so the
script fails fast with a descriptive message rather than a KeyError.
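The fail-fast validation asked for above could look like this; `require_env` is a hypothetical helper sketched for illustration, not part of the plugin:

```python
import os


def require_env(*names: str) -> dict[str, str]:
    """Return the named environment variables, raising a ValueError
    that lists exactly which ones are missing or empty."""
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        raise ValueError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    return {name: os.environ[name] for name in names}
```

The example script would then call `require_env("BASETEN_BASE_URL", "BASETEN_API_KEY")` before constructing `huggingface.VLM(...)`, so a misconfigured environment produces one descriptive error instead of a bare `KeyError`.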

---

Nitpick comments:
In `@plugins/huggingface/tests/test_transformers_detection.py`:
- Around line 51-96: The test calls the private method _detect on
TransformersDetectionProcessor (coupling the test to internals); change the test
to use a public API instead by replacing processor._detect(image) with
processor.detect(image), and if TransformersDetectionProcessor lacks a
detect(...) public method add a simple public wrapper detect(self, image) that
delegates to the existing _detect(...) so tests exercise the public contract
rather than a private implementation detail; update TestDetectClassFiltering
accordingly.

In `@plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py`:
- Around line 29-30: Move the local package import for annotate_image into the
block with other local package imports (so it appears before the relative import
from .events); specifically, place the import of
vision_agents.core.utils.annotation.annotate_image alongside the existing local
imports and keep the relative imports for DetectedObject and
DetectionCompletedEvent after them to satisfy the required import order (stdlib,
third-party, local package, relative).


📥 Commits

Reviewing files that changed from the base of the PR and between d5e8dd2 and c37f4bb.

📒 Files selected for processing (11)
  • CLAUDE.md
  • agents-core/vision_agents/core/utils/annotation.py
  • plugins/huggingface/examples/inference_api/baseten_vlm_example.py
  • plugins/huggingface/tests/test_transformers_detection.py
  • plugins/huggingface/vision_agents/plugins/huggingface/__init__.py
  • plugins/huggingface/vision_agents/plugins/huggingface/huggingface_llm.py
  • plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py
  • plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py
  • plugins/roboflow/vision_agents/plugins/roboflow/roboflow_cloud_processor.py
  • plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py
  • plugins/roboflow/vision_agents/plugins/roboflow/utils.py
💤 Files with no reviewable changes (1)
  • plugins/roboflow/vision_agents/plugins/roboflow/utils.py
✅ Files skipped from review due to trivial changes (2)
  • plugins/roboflow/vision_agents/plugins/roboflow/roboflow_cloud_processor.py
  • CLAUDE.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • agents-core/vision_agents/core/utils/annotation.py

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py (3)

93-93: Use modern X | None union syntax instead of Optional[X].

The coding guidelines specify modern type annotation syntax with X | Y unions. Replace Optional[list[str]] with list[str] | None.

♻️ Proposed fix
     def __init__(
         self,
         model: str = "PekingU/rtdetr_v2_r101vd",
         conf_threshold: float = 0.5,
         fps: int = 10,
-        classes: Optional[list[str]] = None,
+        classes: list[str] | None = None,
         device: DeviceType = "auto",
         annotate: bool = True,
     ):

As per coding guidelines, "Use type annotations everywhere with modern syntax: X | Y unions".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py`
at line 93, Replace the legacy Optional typing for the parameter named "classes"
in transformers_detection.py: change its annotation from Optional[list[str]] to
the modern union form list[str] | None; update any other occurrences in the same
function or nearby signature (e.g., the function/method where "classes:
Optional[list[str]] = None" appears) to use the X | None syntax consistently.

160-165: Use modern union syntax in method signature.

♻️ Proposed fix
     async def process_video(
         self,
         track: "aiortc.VideoStreamTrack",
-        participant_id: Optional[str],
-        shared_forwarder: Optional[VideoForwarder] = None,
+        participant_id: str | None,
+        shared_forwarder: VideoForwarder | None = None,
     ) -> None:

As per coding guidelines, "Use type annotations everywhere with modern syntax: X | Y unions".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py`
around lines 160 - 165, Update the process_video method signature to use modern
union syntax instead of typing.Optional: replace "participant_id: Optional[str]"
with "participant_id: str | None" and "shared_forwarder:
Optional[VideoForwarder] = None" with "shared_forwarder: VideoForwarder | None =
None" in the async def process_video(...) declaration (keep the same parameter
order and default values); ensure any type imports remain valid for
VideoStreamTrack and VideoForwarder and adjust import of Optional if it's now
unused.

108-113: Use modern X | None union syntax for instance attributes.

The attribute type hints use Optional[T] but should use T | None per project guidelines.

♻️ Proposed fix
-        self._resources: Optional[DetectionResources] = None
-        self._events: Optional[EventManager] = None
+        self._resources: DetectionResources | None = None
+        self._events: EventManager | None = None

         self._closed = False
         self._last_log_time: float = 0.0
-        self._video_forwarder: Optional[VideoForwarder] = None
+        self._video_forwarder: VideoForwarder | None = None

As per coding guidelines, "Use type annotations everywhere with modern syntax: X | Y unions".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py`
around lines 108 - 113, The instance attribute type hints use Optional[T];
update them to modern union syntax using T | None by replacing
Optional[DetectionResources] with DetectionResources | None for _resources,
Optional[EventManager] with EventManager | None for _events, and
Optional[VideoForwarder] with VideoForwarder | None for _video_forwarder (leave
non-nullable primitives like _closed and _last_log_time unchanged) so the
annotations in the class/init match the project's `X | None` style.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/huggingface/vision_agents/plugins/huggingface/huggingface_llm.py`:
- Line 50: In the HuggingFaceLLM constructor (__init__) validate the base_url
parameter by checking if base_url is not None and base_url.strip() is non-empty;
if the check fails raise a ValueError with a clear message (e.g. "base_url must
be a non-empty string"). Apply the same non-empty strip validation for any other
constructor/config-entry handling in the same class (lines referenced around
75-79) so invalid/empty URLs are rejected early; reference the HuggingFaceLLM
class and its __init__ (and any factory/from_config methods) when making the
change.

In `@plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py`:
- Around line 57-58: The constructor in huggingface_vlm.py should validate that
the base_url parameter is not just non-None but also a non-empty string; inside
the class __init__ (and any other initializer/factory handling base_url around
the lines referenced) add a check like: if base_url is not None and not
base_url.strip(): raise ValueError("base_url must be a non-empty string when
provided") (or raise the project-specific domain exception if one exists) so
callers fail fast instead of deferring to request time; apply the same
non-empty-string validation to the other initializer block mentioned (lines
~85-89) that accepts base_url.

---

Nitpick comments:
In
`@plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py`:
- Line 93: Replace the legacy Optional typing for the parameter named "classes"
in transformers_detection.py: change its annotation from Optional[list[str]] to
the modern union form list[str] | None; update any other occurrences in the same
function or nearby signature (e.g., the function/method where "classes:
Optional[list[str]] = None" appears) to use the X | None syntax consistently.
- Around line 160-165: Update the process_video method signature to use modern
union syntax instead of typing.Optional: replace "participant_id: Optional[str]"
with "participant_id: str | None" and "shared_forwarder:
Optional[VideoForwarder] = None" with "shared_forwarder: VideoForwarder | None =
None" in the async def process_video(...) declaration (keep the same parameter
order and default values); ensure any type imports remain valid for
VideoStreamTrack and VideoForwarder and adjust import of Optional if it's now
unused.
- Around line 108-113: The instance attribute type hints use Optional[T]; update
them to modern union syntax using T | None by replacing
Optional[DetectionResources] with DetectionResources | None for _resources,
Optional[EventManager] with EventManager | None for _events, and
Optional[VideoForwarder] with VideoForwarder | None for _video_forwarder (leave
non-nullable primitives like _closed and _last_log_time unchanged) so the
annotations in the class/init match the project's `X | None` style.


📥 Commits

Reviewing files that changed from the base of the PR and between 5c16ab8 and cf69846.

📒 Files selected for processing (3)
  • plugins/huggingface/vision_agents/plugins/huggingface/huggingface_llm.py
  • plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py
  • plugins/huggingface/vision_agents/plugins/huggingface/transformers_detection.py

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
plugins/huggingface/tests/test_transformers_detection.py (1)

7-7: ⚠️ Potential issue | 🟠 Major

Replace MagicMock usage with concrete test stubs.

MagicMock is still used in this test module (fixture and class-filtering test), which violates the repo’s test rules. Please switch to lightweight fake classes/objects (e.g., FakeAgent, FakeModel, FakeImageProcessor) with deterministic behavior.

Suggested direction
-from unittest.mock import MagicMock
@@
-@pytest.fixture()
-def agent_mock(events_manager: EventManager) -> Agent:
-    agent = MagicMock()
-    agent.events = events_manager
-    return agent
+class FakeAgent:
+    def __init__(self, events: EventManager) -> None:
+        self.events = events
+
+@pytest.fixture()
+def agent_mock(events_manager: EventManager) -> Agent:
+    return FakeAgent(events_manager)  # or real Agent(...) if lightweight

@@
-        model = MagicMock()
-        model.config.id2label = {0: "cat", 1: "dog", 2: "person"}
+        class FakeConfig:
+            id2label = {0: "cat", 1: "dog", 2: "person"}
+
+        class FakeModel:
+            config = FakeConfig()

-        image_processor = MagicMock()
+        class FakeImageProcessor:
+            def __call__(self, *args, **kwargs):
+                return {"pixel_values": torch.zeros(1, 3, 224, 224)}
+
+            def post_process_object_detection(self, *args, **kwargs):
+                return [{
+                    "scores": torch.tensor([0.95, 0.90, 0.85]),
+                    "labels": torch.tensor([0, 1, 2]),
+                    "boxes": torch.tensor([
+                        [10.0, 20.0, 100.0, 200.0],
+                        [50.0, 60.0, 150.0, 250.0],
+                        [200.0, 100.0, 400.0, 300.0],
+                    ]),
+                }]
+        image_processor = FakeImageProcessor()

As per coding guidelines, "Never mock in tests; use pytest for testing" and "Use pytest framework; never mock tests".

Also applies to: 45-48, 60-83

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/huggingface/tests/test_transformers_detection.py` at line 7, Replace
uses of MagicMock in the test_transformers_detection module with small concrete
test stubs: implement lightweight classes (e.g., FakeAgent, FakeModel,
FakeImageProcessor) that expose the same public methods/attributes the tests
expect and return deterministic, hard-coded values; update the fixture(s) and
the class-filtering test to instantiate and use those fakes instead of MagicMock
(remove any unittest.mock imports and references to MagicMock) and ensure the
fakes simulate edge cases the tests cover so behavior remains identical but
deterministic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/huggingface/tests/test_transformers_detection.py`:
- Around line 51-57: Tests for the same SUT are split across
TestDetectClassFiltering and TestTransformersDetectionProcessor; consolidate
them into a single test class (for example, keep
TestTransformersDetectionProcessor) by moving all test methods from
TestDetectClassFiltering into that class, ensuring imports, fixtures, and any
shared setup/teardown used by TransformersDetectionProcessor tests are unified
and duplicate names removed so all unit tests for TransformersDetectionProcessor
live under one test class.

---

Duplicate comments:
In `@plugins/huggingface/tests/test_transformers_detection.py`:
- Line 7: Replace uses of MagicMock in the test_transformers_detection module
with small concrete test stubs: implement lightweight classes (e.g., FakeAgent,
FakeModel, FakeImageProcessor) that expose the same public methods/attributes
the tests expect and return deterministic, hard-coded values; update the
fixture(s) and the class-filtering test to instantiate and use those fakes
instead of MagicMock (remove any unittest.mock imports and references to
MagicMock) and ensure the fakes simulate edge cases the tests cover so behavior
remains identical but deterministic.


📥 Commits

Reviewing files that changed from the base of the PR and between cf69846 and 48108af.

📒 Files selected for processing (1)
  • plugins/huggingface/tests/test_transformers_detection.py

@maxkahan maxkahan force-pushed the add-transformers-detection branch from 41e327b to 92a22fc Compare March 3, 2026 15:21
@dangusev dangusev force-pushed the add-transformers-detection branch from 92a22fc to 468105f Compare March 6, 2026 11:55
At this point, duplication is a better option than messing with dependencies in core
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Duplicated annotate_image utility across two plugins
    • I moved the shared implementation to agents-core/vision_agents/core/utils/annotation.py and changed both plugin annotation modules to import and re-export that single core utility.
  • ✅ Fixed: cv2 missing from optional dependency import guard
    • I added cv2 to the optional dependency set in the HuggingFace plugin import guard so missing OpenCV now triggers the friendly extras-install warning path.

Preview (5f0d6014c3)
diff --git a/agents-core/vision_agents/core/utils/annotation.py b/agents-core/vision_agents/core/utils/annotation.py
new file mode 100644
--- /dev/null
+++ b/agents-core/vision_agents/core/utils/annotation.py
@@ -1,0 +1,62 @@
+"""Annotation utilities for drawing detection results on video frames."""
+
+from typing import Iterable, Optional
+
+import cv2
+import numpy as np
+import supervision as sv
+
+
+def annotate_image(
+    image: np.ndarray,
+    detections: sv.Detections,
+    classes: dict[int, str],
+    dim_factor: Optional[float] = None,
+    mask_opacity: Optional[float] = None,
+    text_scale: float = 0.75,
+    text_padding: int = 1,
+    box_thickness: int = 2,
+    text_position: sv.Position | None = None,
+) -> np.ndarray:
+    """Draw bounding boxes, labels, and optional mask overlays on an image.
+
+    Args:
+        image: RGB image as a numpy array (H, W, 3).
+        detections: Supervision ``Detections`` object.
+        classes: Mapping of class ID to label string.
+        dim_factor: Dim background outside detected boxes (0–1). ``None`` to skip.
+        mask_opacity: Mask overlay opacity (0–1). ``None`` to skip mask annotation.
+            Only applied when ``detections.mask`` is not ``None``.
+        text_scale: Label text scale.
+        text_padding: Label text padding.
+        box_thickness: Bounding box line thickness.
+        text_position: Label position relative to box.
+    """
+    if text_position is None:
+        text_position = sv.Position.TOP_CENTER
+
+    annotated = image.copy()
+
+    if dim_factor:
+        mask = np.zeros(annotated.shape[:2], dtype=np.uint8)
+        for xyxy in detections.xyxy:
+            x1, y1, x2, y2 = xyxy.astype(int)
+            cv2.rectangle(mask, (x1, y1), (x2, y2), 255, -1)
+        annotated[mask == 0] = (annotated[mask == 0] * dim_factor).astype(np.uint8)
+
+    if mask_opacity is not None and detections.mask is not None:
+        annotated = sv.MaskAnnotator(opacity=mask_opacity).annotate(annotated, detections)
+
+    annotated = sv.BoxAnnotator(thickness=box_thickness).annotate(annotated, detections)
+    detected_class_ids: Iterable[int] = (
+        detections.class_id if detections.class_id is not None else []
+    )
+    labels = [
+        classes.get(int(class_id), str(int(class_id))) for class_id in detected_class_ids
+    ]
+    annotated = sv.LabelAnnotator(
+        text_position=text_position,
+        text_scale=text_scale,
+        text_padding=text_padding,
+    ).annotate(annotated, detections, labels)
+    return annotated

diff --git a/plugins/huggingface/vision_agents/plugins/huggingface/__init__.py b/plugins/huggingface/vision_agents/plugins/huggingface/__init__.py
--- a/plugins/huggingface/vision_agents/plugins/huggingface/__init__.py
+++ b/plugins/huggingface/vision_agents/plugins/huggingface/__init__.py
@@ -13,7 +13,15 @@
 except ImportError as e:
     import warnings
 
-    optional = {"torch", "transformers", "av", "aiortc", "jinja2", "supervision"}
+    optional = {
+        "torch",
+        "transformers",
+        "av",
+        "aiortc",
+        "jinja2",
+        "supervision",
+        "cv2",
+    }
     if e.name in optional:
         warnings.warn(
             f"Optional dependency '{e.name}' is not installed. "

diff --git a/plugins/huggingface/vision_agents/plugins/huggingface/annotation.py b/plugins/huggingface/vision_agents/plugins/huggingface/annotation.py
--- a/plugins/huggingface/vision_agents/plugins/huggingface/annotation.py
+++ b/plugins/huggingface/vision_agents/plugins/huggingface/annotation.py
@@ -1,65 +1,3 @@
-"""Annotation utilities for drawing detection results on video frames."""
+from vision_agents.core.utils.annotation import annotate_image
 
-from typing import Iterable, Optional
-
-import cv2
-import numpy as np
-import supervision as sv
-
-
-def annotate_image(
-    image: np.ndarray,
-    detections: sv.Detections,
-    classes: dict[int, str],
-    dim_factor: Optional[float] = None,
-    mask_opacity: Optional[float] = None,
-    text_scale: float = 0.75,
-    text_padding: int = 1,
-    box_thickness: int = 2,
-    text_position: sv.Position | None = None,
-) -> np.ndarray:
-    """Draw bounding boxes, labels, and optional mask overlays on an image.
-
-    Args:
-        image: RGB image as a numpy array (H, W, 3).
-        detections: Supervision ``Detections`` object.
-        classes: Mapping of class ID to label string.
-        dim_factor: Dim background outside detected boxes (0–1). ``None`` to skip.
-        mask_opacity: Mask overlay opacity (0–1). ``None`` to skip mask annotation.
-            Only applied when ``detections.mask`` is not ``None``.
-        text_scale: Label text scale.
-        text_padding: Label text padding.
-        box_thickness: Bounding box line thickness.
-        text_position: Label position relative to box.
-    """
-    if text_position is None:
-        text_position = sv.Position.TOP_CENTER
-
-    annotated = image.copy()
-
-    if dim_factor:
-        mask = np.zeros(annotated.shape[:2], dtype=np.uint8)
-        for xyxy in detections.xyxy:
-            x1, y1, x2, y2 = xyxy.astype(int)
-            cv2.rectangle(mask, (x1, y1), (x2, y2), 255, -1)
-        annotated[mask == 0] = (annotated[mask == 0] * dim_factor).astype(np.uint8)
-
-    if mask_opacity is not None and detections.mask is not None:
-        annotated = sv.MaskAnnotator(opacity=mask_opacity).annotate(
-            annotated, detections
-        )
-
-    annotated = sv.BoxAnnotator(thickness=box_thickness).annotate(annotated, detections)
-    detected_class_ids: Iterable[int] = (
-        detections.class_id if detections.class_id is not None else []
-    )
-    labels = [
-        classes.get(int(class_id), str(int(class_id)))
-        for class_id in detected_class_ids
-    ]
-    annotated = sv.LabelAnnotator(
-        text_position=text_position,
-        text_scale=text_scale,
-        text_padding=text_padding,
-    ).annotate(annotated, detections, labels)
-    return annotated
+__all__ = ["annotate_image"]

diff --git a/plugins/roboflow/vision_agents/plugins/roboflow/annotation.py b/plugins/roboflow/vision_agents/plugins/roboflow/annotation.py
--- a/plugins/roboflow/vision_agents/plugins/roboflow/annotation.py
+++ b/plugins/roboflow/vision_agents/plugins/roboflow/annotation.py
@@ -1,65 +1,3 @@
-"""Annotation utilities for drawing detection results on video frames."""
+from vision_agents.core.utils.annotation import annotate_image
 
-from typing import Iterable, Optional
-
-import cv2
-import numpy as np
-import supervision as sv
-
-
-def annotate_image(
-    image: np.ndarray,
-    detections: sv.Detections,
-    classes: dict[int, str],
-    dim_factor: Optional[float] = None,
-    mask_opacity: Optional[float] = None,
-    text_scale: float = 0.75,
-    text_padding: int = 1,
-    box_thickness: int = 2,
-    text_position: sv.Position | None = None,
-) -> np.ndarray:
-    """Draw bounding boxes, labels, and optional mask overlays on an image.
-
-    Args:
-        image: RGB image as a numpy array (H, W, 3).
-        detections: Supervision ``Detections`` object.
-        classes: Mapping of class ID to label string.
-        dim_factor: Dim background outside detected boxes (0–1). ``None`` to skip.
-        mask_opacity: Mask overlay opacity (0–1). ``None`` to skip mask annotation.
-            Only applied when ``detections.mask`` is not ``None``.
-        text_scale: Label text scale.
-        text_padding: Label text padding.
-        box_thickness: Bounding box line thickness.
-        text_position: Label position relative to box.
-    """
-    if text_position is None:
-        text_position = sv.Position.TOP_CENTER
-
-    annotated = image.copy()
-
-    if dim_factor:
-        mask = np.zeros(annotated.shape[:2], dtype=np.uint8)
-        for xyxy in detections.xyxy:
-            x1, y1, x2, y2 = xyxy.astype(int)
-            cv2.rectangle(mask, (x1, y1), (x2, y2), 255, -1)
-        annotated[mask == 0] = (annotated[mask == 0] * dim_factor).astype(np.uint8)
-
-    if mask_opacity is not None and detections.mask is not None:
-        annotated = sv.MaskAnnotator(opacity=mask_opacity).annotate(
-            annotated, detections
-        )
-
-    annotated = sv.BoxAnnotator(thickness=box_thickness).annotate(annotated, detections)
-    detected_class_ids: Iterable[int] = (
-        detections.class_id if detections.class_id is not None else []
-    )
-    labels = [
-        classes.get(int(class_id), str(int(class_id)))
-        for class_id in detected_class_ids
-    ]
-    annotated = sv.LabelAnnotator(
-        text_position=text_position,
-        text_scale=text_scale,
-        text_padding=text_padding,
-    ).annotate(annotated, detections, labels)
-    return annotated
+__all__ = ["annotate_image"]

@dangusev dangusev dismissed Nash0x7E2’s stale review March 6, 2026 14:15

Changes were made

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.


dangusev added 2 commits March 6, 2026 16:26
The param has never been used, and tests were failing because of some missing imports
@dangusev dangusev merged commit c755488 into main Mar 6, 2026
6 checks passed
@dangusev dangusev deleted the add-transformers-detection branch March 6, 2026 15:39