Refactor screenshot capture module for cohesion and extensibility #202

Merged
Jeomon merged 7 commits into CursorTouch:main from JezaChen:refactor/screenshot-capture on Apr 11, 2026
Conversation

@JezaChen

Collaborator

Summary

  • Decouple screenshot logic from Desktop: Move screenshot-related state and functions out of Desktop (service.py) into the screenshot module, eliminating unused delegation methods and reducing Desktop's responsibilities.
  • Simplify capture() parameters: Remove unnecessary callable/injection parameters (crop_screenshot, get_monitors_rect, dxcam_module, mss_module, camera_cache) that were passed through solely for testability but made the API confusing and brittle. Each backend now accesses what it needs directly.
  • Introduce ScreenshotBackend class hierarchy: Replace the flat capture_with_* functions and manual if/elif dispatch with an OOP design using __init_subclass__ for automatic backend registration. Adding a new backend now only requires defining a single class with name and priority.
  • Add comprehensive unit tests: New test_screenshot_capture.py with 38 test cases covering the capture() API, each backend class, the auto-registration mechanism, and edge cases like fallback chains and error recovery.
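The auto-registration idea from the third bullet can be sketched as follows. This is a minimal illustration, not the code from the PR: the class names drop the leading underscore, the `registry` attribute and the `backends_by_priority()` helper are hypothetical, and only the `name`/`priority` values are taken from the description.

```python
from typing import ClassVar


class ScreenshotBackend:
    """Base class; concrete subclasses register themselves when defined."""

    # Hypothetical registry layout: backend name -> backend class.
    registry: ClassVar[dict[str, type["ScreenshotBackend"]]] = {}
    name: ClassVar[str]
    priority: ClassVar[int]

    def __init_subclass__(cls, **kwargs: object) -> None:
        super().__init_subclass__(**kwargs)
        # Register only classes that declare both attributes themselves.
        if "name" in cls.__dict__ and "priority" in cls.__dict__:
            ScreenshotBackend.registry[cls.name] = cls


class DxcamBackend(ScreenshotBackend):
    name = "dxcam"
    priority = 10


class PillowBackend(ScreenshotBackend):
    name = "pillow"
    priority = 100


class MssBackend(ScreenshotBackend):
    name = "mss"
    priority = 20


def backends_by_priority() -> list[str]:
    """Return registered backend names, lowest priority value first."""
    ordered = sorted(ScreenshotBackend.registry.values(), key=lambda c: c.priority)
    return [c.name for c in ordered]


print(backends_by_priority())  # ['dxcam', 'mss', 'pillow']
```

Definition order does not matter here: the ordering comes only from `priority`, which is why a new backend needs nothing beyond its own class body.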

Motivation

The Desktop class had accumulated several screenshot-related methods (_capture_with_dxcam, _capture_with_pillow, _get_dxcam_camera, _resolve_dxcam_region, _get_screenshot_backend, _crop_screenshot, _build_crop_box) that were thin wrappers delegating to screenshot.py functions. These wrappers added no value — they simply forwarded calls — yet inflated Desktop's surface area and obscured where the real logic lived.

The capture() function signature had grown to accept 7 parameters, most of which were dependency-injection hooks introduced purely for unit testing:

# Before: confusing signature with test-only parameters
def capture(
    capture_rect,
    crop_screenshot: Callable,      # always Desktop._crop_screenshot
    get_monitors_rect: Callable,    # always uia.GetMonitorsRect
    camera_cache: dict,             # always Desktop._dxcam_cameras
    backend: str | None = None,
    dxcam_module=None,              # always the dxcam module
    mss_module=None,                # always the mss module
) -> tuple[Image.Image, str]:

This created several problems:

  • Unnecessary indirection: Every call site passed the same objects. If the actual argument never varies, the function should just use it directly.
  • Backend-specific parameters leaking into the generic API: camera_cache and dxcam_module only matter for the dxcam backend; passing them through the generic capture() is misleading — callers might expect them to affect other backends.
  • Module-level globals introduced for testability: _DXCAM_CAMERA_CACHE was a module global dict that existed primarily so tests could reset it, making the code harder to follow.

What changed

screenshot.py — Major restructure:

  • Introduced _ScreenshotBackend base class with __init_subclass__ auto-registration into registry.
  • Three backend subclasses: _DxcamBackend (priority=10), _MssBackend (priority=20), _PillowBackend (priority=100).
  • Each backend encapsulates its own availability check (is_available) and capture logic (capture).
  • _DxcamBackend owns its camera cache as an instance attribute and its monitor region resolution as a static method.
  • capture() reduced to 2 parameters: capture(capture_rect, backend=None).
  • get_screenshot_backend() validates against the dynamic registry instead of a hardcoded set.
  • Removed: capture_with_dxcam(), capture_with_mss(), capture_with_pillow(), get_dxcam_camera(), _DXCAM_CAMERA_CACHE, resolve_dxcam_region(), _auto_backend_chain().

service.py — Simplified:

  • Removed Desktop._dxcam_cameras, _get_screenshot_backend, _resolve_dxcam_region, _get_dxcam_camera, _capture_with_dxcam, _capture_with_pillow, _crop_screenshot, _build_crop_box.
  • Removed module-level dxcam = screenshot_capture.dxcam / mss = screenshot_capture.mss aliases.
  • Desktop.get_screenshot() now simply calls screenshot_capture.capture(capture_rect).

test_snapshot_display_filter.py — Updated mock targets:

  • All patch("windows_mcp.desktop.service.dxcam", ...) / service.mss references replaced with screenshot.* targets.
  • _DXCAM_CAMERA_CACHE mock replaced with _backend_instances reset.

test_screenshot_capture.py — New file, 38 test cases:

  • TestBackendRegistry: auto-registration, priority ordering, incomplete subclass handling.
  • TestGetScreenshotBackend: valid/invalid env vars, defaults, case insensitivity.
  • TestDxcamBackend: availability checks, region resolution (exact match, sub-region coordinates, cross-monitor), camera caching, capture success/failure.
  • TestMssBackend: availability, monitor dict construction, full-screen path.
  • TestPillowBackend: always-available, error fallback with/without capture rect.
  • TestCapture: explicit backend, unknown backend error, auto chain fallback, exception recovery, safety fallback, image content verification.
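The environment-variable cases listed for TestGetScreenshotBackend (valid and invalid values, defaults, case insensitivity) could look roughly like the sketch below. The variable name `DEMO_SCREENSHOT_BACKEND` is hypothetical, the `REGISTERED` set stands in for the dynamic registry, and falling back to "auto" on an unknown value is an assumption; the PR text does not say what the real function does in that case.

```python
import os

ENV_VAR = "DEMO_SCREENSHOT_BACKEND"  # hypothetical variable name, not from the PR
REGISTERED = {"dxcam", "mss", "pillow"}  # stand-in for the names in the dynamic registry


def get_screenshot_backend() -> str:
    """Read the backend choice from the environment, defaulting to "auto"."""
    raw = os.environ.get(ENV_VAR, "").strip().lower()
    if raw == "" or raw == "auto":
        return "auto"
    if raw in REGISTERED:
        return raw
    # Assumption: an unrecognized value falls back to "auto" instead of raising.
    return "auto"


os.environ[ENV_VAR] = "MSS"
print(get_screenshot_backend())  # mss
os.environ[ENV_VAR] = "not-a-backend"
print(get_screenshot_backend())  # auto
del os.environ[ENV_VAR]
print(get_screenshot_backend())  # auto
```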

Test plan

  • ruff check . && ruff format --check . — lint passes
  • pytest tests/test_screenshot_capture.py -v — 38 new tests pass
  • pytest tests/test_snapshot_display_filter.py -v — 18 existing tests pass (no regressions)

JezaChen added 6 commits April 3, 2026 21:24
Move dxcam/mss references, camera cache, crop helpers, and backend
selection out of Desktop into screenshot.py to reduce Desktop's
responsibilities. Update test patches to target screenshot module
and isolate _DXCAM_CAMERA_CACHE between tests.

…h targets

Remove Desktop._crop_screenshot and Desktop._build_crop_box that were
duplicated in screenshot.py. Update test to call module-level
_crop_screenshot and patch screenshot.uia.GetVirtualScreenRect instead
of service.uia.GetVirtualScreenRect.

Have resolve_dxcam_region call uia.GetMonitorsRect() directly instead
of receiving it as a callback. Remove unused Callable import and update
test mock targets to screenshot.uia.GetMonitorsRect.

…tration

Replace flat capture_with_* functions with an OOP design:
- ScreenshotBackend base class with __init_subclass__ auto-registration
- DxcamBackend (priority=10), MssBackend (priority=20), PillowBackend (priority=100)
- capture() now iterates registered backends by priority instead of if/elif chain
- get_screenshot_backend() validates against dynamic registry
- DxcamBackend encapsulates camera cache and resolve_region logic
- Adding a new backend only requires defining a subclass with name and priority

Add tests/test_screenshot_capture.py with 35 test cases covering:
- Backend auto-registration via __init_subclass__
- Environment variable parsing in get_screenshot_backend()
- DxcamBackend: is_available, _resolve_region coordinate math, camera cache
- MssBackend: availability check, monitor dict construction
- PillowBackend: always-available, error fallback paths
- capture() API: explicit backend, unknown backend, auto chain fallback,
  exception recovery, and image content verification

- Verify _get_backend singleton caching behavior
- Assert exact kwargs sequence in pillow primary-screen fallback
- Test explicit backend capture failure triggers pillow safety fallback
- Test explicit dxcam with cross-monitor rect falls back to pillow
Copilot AI review requested due to automatic review settings April 11, 2026 06:42
@qodo-code-review


Review Summary by Qodo

Refactor screenshot capture with OOP backends and simplified API

✨ Enhancement 🧪 Tests


Walkthroughs

Description
• Refactor screenshot module with OOP backend hierarchy using __init_subclass__ auto-registration
• Simplify capture() API from 7 parameters to 2 by eliminating test-only dependency injection
• Move screenshot logic out of Desktop class, reducing its responsibilities and surface area
• Add 35+ comprehensive unit tests covering backends, registry, and capture API edge cases
Diagram
flowchart LR
  A["Desktop.get_screenshot"] -->|calls| B["capture()"]
  B -->|selects backend| C["_ScreenshotBackend registry"]
  C -->|instantiates| D["_DxcamBackend"]
  C -->|instantiates| E["_MssBackend"]
  C -->|instantiates| F["_PillowBackend"]
  D -->|captures| G["Image"]
  E -->|captures| G
  F -->|captures| G
  B -->|returns| H["Image + backend_name"]

File Changes

1. src/windows_mcp/desktop/screenshot.py ✨ Enhancement +209/-131

Introduce OOP backend hierarchy with auto-registration

• Introduced _ScreenshotBackend base class with __init_subclass__ auto-registration mechanism
• Created three backend subclasses: _DxcamBackend (priority=10), _MssBackend (priority=20),
 _PillowBackend (priority=100)
• Moved _crop_screenshot() and _build_crop_box() from Desktop into module-level utilities
• Simplified capture() signature from 7 parameters to 2 (capture_rect, backend)
• Replaced flat capture_with_* functions with OOP backend classes encapsulating their own logic
• Added _get_backend() singleton caching for backend instances
• Updated get_screenshot_backend() to validate against dynamic registry instead of hardcoded set

src/windows_mcp/desktop/screenshot.py


2. src/windows_mcp/desktop/service.py ✨ Enhancement +1/-56

Remove screenshot delegation methods from Desktop

• Removed _dxcam_cameras instance attribute from Desktop.__init__
• Deleted wrapper methods: _get_screenshot_backend(), _resolve_dxcam_region(),
 _get_dxcam_camera(), _capture_with_dxcam(), _capture_with_pillow()
• Deleted utility methods: _crop_screenshot(), _build_crop_box()
• Removed module-level aliases dxcam and mss that were only used for delegation
• Simplified get_screenshot() to call screenshot_capture.capture(capture_rect) with no
 dependency injection parameters

src/windows_mcp/desktop/service.py


3. tests/test_screenshot_capture.py 🧪 Tests +443/-0

Add comprehensive screenshot capture unit tests

• Added 35+ unit tests covering backend auto-registration, environment variable parsing, and each
 backend class
• Tests verify _DxcamBackend coordinate math, camera caching, and cross-monitor detection
• Tests verify _MssBackend availability checks and monitor dict construction
• Tests verify _PillowBackend fallback behavior on capture errors
• Tests verify capture() API: explicit backend selection, unknown backend errors, auto-chain
 fallback, exception recovery
• Tests verify singleton caching of backend instances and image content verification

tests/test_screenshot_capture.py


4. tests/test_snapshot_display_filter.py 🧪 Tests +11/-15

Update test patches to target screenshot module

• Updated test fixture to remove desktop._dxcam_cameras = {} initialization (no longer exists)
• Updated patch targets from service.dxcam and service.mss to screenshot.dxcam and
 screenshot.mss
• Updated patch targets from service.uia.GetVirtualScreenRect to
 screenshot.uia.GetVirtualScreenRect
• Changed desktop._crop_screenshot() calls to module-level _crop_screenshot() function
• Added _backend_instances reset in dxcam test to isolate backend state
• Added get_screenshot_backend() mock to ensure Pillow path is tested

tests/test_snapshot_display_filter.py



@qodo-code-review

qodo-code-review Bot commented Apr 11, 2026


Code Review by Qodo

🐞 Bugs (2)   📘 Rule violations (3)   📎 Requirement gaps (0)   🎨 UX Issues (0)
🐞 ≡ Correctness (1) ☼ Reliability (1)
📘 ⚙ Maintainability (2) ➹ Performance (1)



Action required

1. No 1920x1080 cap enforced 📘
Description
Screenshot capture paths can return images wider than 1920px (e.g., multi-monitor virtual screen)
because no resize/cap is applied before returning. This violates the documented maximum screenshot
resolution requirement and can increase payload/token usage significantly.
Code

src/windows_mcp/desktop/screenshot.py[R158-178]

+    def capture(self, capture_rect: uia.Rect | None) -> Image.Image:
+        grab_kwargs: dict[str, object] = {"all_screens": True}
        if capture_rect is not None:
-            logger.warning(
-                "Failed to capture selected region directly, falling back to virtual screen crop"
+            grab_kwargs["bbox"] = (
+                capture_rect.left,
+                capture_rect.top,
+                capture_rect.right,
+                capture_rect.bottom,
            )
-            return crop_screenshot(ImageGrab.grab(all_screens=True), capture_rect)
-        logger.warning("Failed to capture virtual screen, using primary screen")
-        screenshot = ImageGrab.grab()
-    return crop_screenshot(screenshot, capture_rect)
+        try:
+            screenshot = ImageGrab.grab(**grab_kwargs)
+        except (OSError, RuntimeError, ValueError):
+            if capture_rect is not None:
+                logger.warning(
+                    "Failed to capture selected region directly, "
+                    "falling back to virtual screen crop"
+                )
+                return _crop_screenshot(ImageGrab.grab(all_screens=True), capture_rect)
+            logger.warning("Failed to capture virtual screen, using primary screen")
+            screenshot = ImageGrab.grab()
+        return _crop_screenshot(screenshot, capture_rect)
Evidence
PR Compliance ID 5 requires screenshot outputs be capped at 1920x1080. _PillowBackend.capture()
returns the grabbed image (or cropped image) without any resizing/capping, and the new tests
explicitly assert a 3840x1080 result from capture(), demonstrating outputs can exceed 1920x1080.

CLAUDE.md
src/windows_mcp/desktop/screenshot.py[158-178]
tests/test_screenshot_capture.py[414-430]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Screenshot capture can return images larger than 1920x1080 (e.g., 3840x1080), violating the maximum resolution requirement.

## Issue Context
`_PillowBackend.capture()` returns the grabbed image (and fallback crop) without any resizing/capping, and `capture()` returns that image unchanged.

## Fix Focus Areas
- src/windows_mcp/desktop/screenshot.py[152-178]
- src/windows_mcp/desktop/screenshot.py[230-261]
- tests/test_screenshot_capture.py[414-430]
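One possible shape for the cap is a small helper that computes the target size while preserving aspect ratio. This is only a sketch: `capped_size()` is a hypothetical name, and whether the project wants proportional scaling rather than cropping is an assumption.

```python
def capped_size(
    width: int, height: int, max_width: int = 1920, max_height: int = 1080
) -> tuple[int, int]:
    """Return (width, height) scaled down to fit the limits, keeping aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Never scale up: the factor is clamped to 1.0.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(capped_size(3840, 1080))  # (1920, 540)
print(capped_size(1280, 720))   # (1280, 720)
```

With Pillow, the result could then be applied via `image.resize(capped_size(*image.size))`, or the same effect obtained with `Image.thumbnail((1920, 1080))`, which resizes in place and also preserves aspect ratio.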



2. MSS crops wrong region 🐞
Description
_MssBackend.capture() grabs an already-cropped region image (local coords start at (0,0)) but then
applies _crop_screenshot() using global screen coordinates, which can crop out-of-bounds and return
black/incorrect pixels. This can silently produce incorrect screenshots for region captures (and
current tests don’t assert against the corrupted output).
Code

src/windows_mcp/desktop/screenshot.py[R190-205]

+    def capture(self, capture_rect: uia.Rect | None) -> Image.Image:
+        if mss is None:
+            raise RuntimeError("mss is not available")
+        with mss.mss() as sct:
+            if capture_rect is None:
+                monitor = sct.monitors[0]
+            else:
+                monitor = {
+                    "left": capture_rect.left,
+                    "top": capture_rect.top,
+                    "width": capture_rect.right - capture_rect.left,
+                    "height": capture_rect.bottom - capture_rect.top,
+                }
+            raw = sct.grab(monitor)
+            image = Image.frombytes("RGB", raw.size, raw.rgb)
+        return _crop_screenshot(image, capture_rect)
Evidence
MSS capture constructs a region-sized image from raw.size and then calls _crop_screenshot(image,
capture_rect), but _crop_screenshot computes the crop box from global coordinates relative to the
virtual screen origin. For any capture_rect not aligned to the virtual origin, that crop box won’t
match the region image’s (0,0)-based coordinate system, producing out-of-bounds crops (black padding
/ shifted content). The added unit test scaffolding explicitly models MSS returning raw.size equal
to the requested region size, which makes this mismatch deterministic under the project’s own
assumptions.

/src/windows_mcp/desktop/screenshot.py[181-206]
/src/windows_mcp/desktop/screenshot.py[26-40]
/tests/test_screenshot_capture.py[217-245]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`_MssBackend.capture()` builds an image from `mss.grab()` for a *specific region* and then calls `_crop_screenshot(image, capture_rect)`. That second crop uses global coordinates (via `GetVirtualScreenRect` offsets) against an image whose origin is already the region’s top-left, causing mis-cropping/out-of-bounds crops.

### Issue Context
For `capture_rect != None`, the `monitor` dict passed to `sct.grab()` already specifies left/top/width/height, so the returned `raw` and resulting `image` are expected to be exactly that region.

### Fix Focus Areas
- src/windows_mcp/desktop/screenshot.py[190-206]
- src/windows_mcp/desktop/screenshot.py[26-40]

### Suggested change
In `_MssBackend.capture()`:
- If `capture_rect is None`: return `image`.
- If `capture_rect is not None` and `image.size == (capture_rect.width(), capture_rect.height())`: return `image` directly (no crop).
- Otherwise (defensive fallback if some platform/library returns a full virtual-screen image): apply `_crop_screenshot(image, capture_rect)`.

### Test hardening
Add an assertion that the returned image content is not shifted/blank for a non-zero `capture_rect.left/top` case (e.g., validate a known pixel pattern or at least `getbbox()` plus a pixel check).
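The coordinate mismatch behind this issue can be shown with plain tuples. The sketch below assumes, as the evidence above describes, that the crop box is the capture rect translated by the virtual-screen origin; `build_crop_box()` and `box_fits()` are illustrative helpers, not the module's functions.

```python
Box = tuple[int, int, int, int]  # (left, top, right, bottom)


def build_crop_box(capture_rect: Box, virtual_origin: tuple[int, int]) -> Box:
    """Translate a global-coordinate rect into full-virtual-screen image coordinates."""
    ox, oy = virtual_origin
    left, top, right, bottom = capture_rect
    return (left - ox, top - oy, right - ox, bottom - oy)


def box_fits(box: Box, image_size: tuple[int, int]) -> bool:
    """True when the crop box lies entirely inside an image of the given size."""
    left, top, right, bottom = box
    width, height = image_size
    return 0 <= left < right <= width and 0 <= top < bottom <= height


# Second monitor to the right of a 1920x1080 primary; virtual origin at (0, 0).
capture_rect = (1920, 0, 3840, 1080)
box = build_crop_box(capture_rect, virtual_origin=(0, 0))

full_virtual_screen = (3840, 1080)  # size of a full virtual-screen grab
region_grab = (1920, 1080)          # size of an image already grabbed for capture_rect

print(box_fits(box, full_virtual_screen))  # True: the crop is valid here
print(box_fits(box, region_grab))          # False: the box falls outside the region image
```

The same box that is correct for a full virtual-screen image lands entirely outside an image that was already grabbed for the region, which is where the black or shifted output comes from.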




Remediation recommended

3. capture() docstring non-Google 📘
Description
The public function capture() has a brief docstring but it does not follow Google-style (missing
Args: / Returns: sections). This violates the requirement for consistent public API
documentation.
Code

src/windows_mcp/desktop/screenshot.py[R230-235]

def capture(
-    capture_rect,
-    crop_screenshot: Callable[[Image.Image, object], Image.Image],
-    get_monitors_rect: Callable[[], list],
-    camera_cache: dict[int, object],
+    capture_rect: uia.Rect | None,
    backend: str | None = None,
-    dxcam_module=None,
-    mss_module=None,
) -> tuple[Image.Image, str]:
+    """Capture a screenshot and return ``(image, backend_name_used)``."""
    selected = backend or get_screenshot_backend()
-    chain = _auto_backend_chain() if selected == "auto" else [selected]
Evidence
PR Compliance ID 4 requires Google-style docstrings for public functions/classes. capture() is
public and its docstring does not include Google-style sections such as Args: and Returns:.

CLAUDE.md
src/windows_mcp/desktop/screenshot.py[230-235]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`capture()` is a public API but its docstring is not Google-style.

## Issue Context
Compliance requires Google-style docstrings for public functions/classes.

## Fix Focus Areas
- src/windows_mcp/desktop/screenshot.py[230-235]



4. Test functions missing type hints 📘
Description
New test functions/methods were added without parameter and return type annotations. This violates
the requirement that all new/modified function signatures include type hints.
Code

tests/test_screenshot_capture.py[R33-37]

+@pytest.fixture(autouse=True)
+def _isolate_backend_instances(monkeypatch):
+    """Ensure each test gets a fresh backend instance pool."""
+    monkeypatch.setattr(screenshot, "_backend_instances", {})
+
Evidence
PR Compliance ID 3 requires type hints on new/modified function signatures. The new test module adds
multiple functions/methods (including a fixture and backend test methods) with unannotated
parameters and missing return type annotations.

CLAUDE.md
tests/test_screenshot_capture.py[33-37]
tests/test_screenshot_capture.py[44-58]
tests/test_screenshot_capture.py[104-115]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
New test functions/methods are missing type hints for parameters and return types.

## Issue Context
Compliance requires type annotations on all new/modified function signatures.

## Fix Focus Areas
- tests/test_screenshot_capture.py[33-37]
- tests/test_screenshot_capture.py[44-58]
- tests/test_screenshot_capture.py[104-115]



5. is_available exceptions uncaught 🐞
Description
capture() calls inst.is_available() outside the try/except, so any exception in a backend
availability check aborts capture() and prevents fallback to other backends. This undermines the
reliability guarantees of the auto fallback chain.
Code

src/windows_mcp/desktop/screenshot.py[R247-250]

+    for backend_cls in chain:
+        inst = _get_backend(backend_cls.name)
+        if not inst.is_available(capture_rect):
+            continue
Evidence
The backend loop checks inst.is_available(capture_rect) before entering the guarded try: that
catches backend failures. _DxcamBackend.is_available() calls _resolve_region(), which calls
uia.GetMonitorsRect(). GetMonitorsRect() makes direct Windows API calls via
ctypes.windll.user32.EnumDisplayMonitors without any error handling, so any exception raised
during monitor enumeration would bubble out of is_available() and bypass the fallback logic.

/src/windows_mcp/desktop/screenshot.py[230-259]
/src/windows_mcp/desktop/screenshot.py[101-130]
/src/windows_mcp/uia/core.py[643-675]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`capture()` calls `inst.is_available()` outside the existing try/except, so exceptions raised during availability checks prevent fallback to later backends.

### Issue Context
`_DxcamBackend.is_available()` calls into `uia.GetMonitorsRect()` via `_resolve_region()`. Monitor enumeration uses Windows API calls via `ctypes.windll` and is not exception-handled in `uia`, so unexpected failures can bubble up.

### Fix Focus Areas
- src/windows_mcp/desktop/screenshot.py[246-259]
- src/windows_mcp/desktop/screenshot.py[124-130]

### Suggested change
Wrap the availability check in the same fallback behavior as capture failures, e.g.:

```python
for backend_cls in chain:
   inst = _get_backend(backend_cls.name)
   try:
       if not inst.is_available(capture_rect):
           continue
   except Exception:
       logger.warning(
           "Screenshot backend '%s' availability check failed; trying next backend",
           inst.name,
           exc_info=selected != "auto",
       )
       continue

   try:
       return inst.capture(capture_rect), inst.name
   except (OSError, RuntimeError, ValueError):
       ...
```

Optionally also make `_DxcamBackend.is_available()` catch exceptions and return `False` to keep backends self-contained.


Copilot AI left a comment


Pull request overview

Refactors the Windows screenshot capture implementation to be more cohesive and extensible by moving screenshot responsibilities out of Desktop and into a backend-driven screenshot module, along with updating/adding tests around the new API.

Changes:

  • Introduces an OOP backend registry (_ScreenshotBackend + subclasses) and simplifies the public capture() API.
  • Removes screenshot-related delegation/state from Desktop and routes Desktop.get_screenshot() directly through screenshot.capture().
  • Updates existing tests to patch the new module targets and adds a new dedicated screenshot capture test suite.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/windows_mcp/desktop/screenshot.py | Replaces function-based capture with backend classes + registry + simplified capture() entrypoint. |
| src/windows_mcp/desktop/service.py | Removes screenshot helpers/state from Desktop; delegates screenshot acquisition to screenshot.capture(). |
| tests/test_snapshot_display_filter.py | Updates patch targets to the new screenshot module and adapts cropping/screenshot tests. |
| tests/test_screenshot_capture.py | Adds a new unit test suite covering backend registration, backend behavior, and capture() fallback logic. |
Comments suppressed due to low confidence (1)

tests/test_snapshot_display_filter.py:201

  • In test_get_screenshot_falls_back_to_pillow_when_dxcam_region_is_unsupported, the mocked ImageGrab.grab() returns a (1920, 1080) image even though the bbox passed is (0, 0, 3840, 1080), and the assertion expects the final screenshot size (3840, 1080). This mismatch can mask issues (e.g., relying on PIL crop padding). Consider making the mock return an image whose size matches the requested bbox (or asserting directly on the grab call args instead of the returned size) so the test reflects real ImageGrab behavior.
        with patch("windows_mcp.desktop.screenshot.ImageGrab.grab") as mock_grab:
            mock_grab.return_value = Image.new("RGB", (1920, 1080), "white")
            screenshot = desktop.get_screenshot(capture_rect=capture_rect)

        assert screenshot.size == (3840, 1080)
        assert fake_dxcam.create.call_count == 0
        assert mock_grab.call_args.kwargs == {
            "bbox": (0, 0, 3840, 1080),
            "all_screens": True,
        }


Comment thread src/windows_mcp/desktop/screenshot.py Outdated
                return _crop_screenshot(ImageGrab.grab(all_screens=True), capture_rect)
            logger.warning("Failed to capture virtual screen, using primary screen")
            screenshot = ImageGrab.grab()
        return _crop_screenshot(screenshot, capture_rect)

Copilot AI Apr 11, 2026


When capture_rect is provided, this backend already requests only that region via ImageGrab.grab(bbox=...). Cropping the returned image again using _crop_screenshot(..., capture_rect) applies absolute virtual-screen coordinates to an image whose origin is already (0,0) for the requested region, which can yield padded/blank output. A concrete fix is to return screenshot directly on the success path when bbox was used, and only call _crop_screenshot in the fallback path where you grab the full virtual screen and then need to crop.

Suggested change
-        return _crop_screenshot(screenshot, capture_rect)
+        return screenshot

Collaborator Author


The double-crop issue in _PillowBackend and _MssBackend is a pre-existing bug (late discovery) — it existed in the original capture_with_pillow and capture_with_mss functions before this refactoring. Our refactoring moved the logic into class methods as-is without introducing or fixing it.

Root cause: When capture_rect is provided, both ImageGrab.grab(bbox=...) and mss.grab(monitor) already return an image cropped to the requested region (origin at (0,0)). Calling _crop_screenshot() afterwards applies absolute virtual-screen coordinates to this already-cropped image, which produces out-of-bounds/black output when the region doesn't start at the virtual screen origin.

Practical impact is very limited in the current codebase: the auto chain tries dxcam first (a required dependency), and the dxcam path does not call _crop_screenshot. The bug only manifests when dxcam is unavailable or fails and the capture rect targets a non-primary monitor — an unlikely combination in normal usage.

I'll verify and fix this in a follow-up PR to keep the refactoring scope clean.

                    "height": capture_rect.bottom - capture_rect.top,
                }
            raw = sct.grab(monitor)
            image = Image.frombytes("RGB", raw.size, raw.rgb)

Copilot AI Apr 11, 2026


When capture_rect is provided, mss.grab() is already capturing exactly that rectangle (based on the monitor dict). Applying _crop_screenshot(image, capture_rect) again uses absolute virtual-screen coordinates against an image whose origin is the capture rectangle, which can crop to the wrong pixels and/or introduce black padding. Consider returning image directly when capture_rect is not None, and reserving _crop_screenshot only for cases where the captured image is the full virtual screen.

Suggested change
-            image = Image.frombytes("RGB", raw.size, raw.rgb)
+            image = Image.frombytes("RGB", raw.size, raw.rgb)
+            if capture_rect is not None:
+                return image

Collaborator Author


same as above

Comment on lines +158 to +178
+    def capture(self, capture_rect: uia.Rect | None) -> Image.Image:
+        grab_kwargs: dict[str, object] = {"all_screens": True}
         if capture_rect is not None:
-            logger.warning(
-                "Failed to capture selected region directly, falling back to virtual screen crop"
+            grab_kwargs["bbox"] = (
+                capture_rect.left,
+                capture_rect.top,
+                capture_rect.right,
+                capture_rect.bottom,
             )
-            return crop_screenshot(ImageGrab.grab(all_screens=True), capture_rect)
-        logger.warning("Failed to capture virtual screen, using primary screen")
-        screenshot = ImageGrab.grab()
-    return crop_screenshot(screenshot, capture_rect)
+        try:
+            screenshot = ImageGrab.grab(**grab_kwargs)
+        except (OSError, RuntimeError, ValueError):
+            if capture_rect is not None:
+                logger.warning(
+                    "Failed to capture selected region directly, "
+                    "falling back to virtual screen crop"
+                )
+                return _crop_screenshot(ImageGrab.grab(all_screens=True), capture_rect)
+            logger.warning("Failed to capture virtual screen, using primary screen")
+            screenshot = ImageGrab.grab()
+        return _crop_screenshot(screenshot, capture_rect)


Action required

1. No 1920x1080 cap enforced 📘 Rule violation ➹ Performance

Screenshot capture paths can return images wider than 1920px (e.g., multi-monitor virtual screen)
because no resize/cap is applied before returning. This violates the documented maximum screenshot
resolution requirement and can increase payload/token usage significantly.

Comment on lines +190 to +205
+    def capture(self, capture_rect: uia.Rect | None) -> Image.Image:
+        if mss is None:
+            raise RuntimeError("mss is not available")
+        with mss.mss() as sct:
+            if capture_rect is None:
+                monitor = sct.monitors[0]
+            else:
+                monitor = {
+                    "left": capture_rect.left,
+                    "top": capture_rect.top,
+                    "width": capture_rect.right - capture_rect.left,
+                    "height": capture_rect.bottom - capture_rect.top,
+                }
+            raw = sct.grab(monitor)
+            image = Image.frombytes("RGB", raw.size, raw.rgb)
+        return _crop_screenshot(image, capture_rect)


Action required

2. Mss crops wrong region 🐞 Bug ≡ Correctness

_MssBackend.capture() grabs an already-cropped region image (local coords start at (0,0)) but then
applies _crop_screenshot() using global screen coordinates, which can crop out-of-bounds and return
black/incorrect pixels. This can silently produce incorrect screenshots for region captures (and
current tests don’t assert against the corrupted output).

Collaborator Author


same as above

Use cls.__dict__ instead of hasattr in __init_subclass__ so only
classes that explicitly define name and priority are registered.
Add duplicate name detection that raises ValueError on conflicts.
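The difference between `hasattr` and `cls.__dict__` in this commit can be seen in a short sketch. The class names and the `registry` attribute are hypothetical, and the error message text is made up for the example.

```python
class Backend:
    registry: dict[str, type] = {}

    def __init_subclass__(cls, **kwargs: object) -> None:
        super().__init_subclass__(**kwargs)
        # cls.__dict__ holds only what this class defines itself; hasattr()
        # would also see attributes inherited from a registered parent.
        if "name" not in cls.__dict__ or "priority" not in cls.__dict__:
            return
        if cls.name in Backend.registry:
            raise ValueError(f"Duplicate backend name: {cls.name!r}")
        Backend.registry[cls.name] = cls


class Pillow(Backend):
    name = "pillow"
    priority = 100


class PillowTestDouble(Pillow):
    """Inherits name/priority: hasattr() is True, but nothing is registered."""


print(hasattr(PillowTestDouble, "name"))     # True
print(Backend.registry["pillow"] is Pillow)  # True

try:
    class AnotherPillow(Backend):
        name = "pillow"
        priority = 50
except ValueError as exc:
    print(exc)  # Duplicate backend name: 'pillow'
```

With a `hasattr` check, `PillowTestDouble` would have been treated as a second "pillow" backend just because it inherits the attributes; checking `cls.__dict__` keeps helper subclasses out of the registry, and the duplicate check surfaces real name clashes at class-definition time.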