Refactor screenshot capture module for cohesion and extensibility by JezaChen · Pull Request #202 · CursorTouch/Windows-MCP

JezaChen · 2026-04-11T06:42:02Z

Summary

Decouple screenshot logic from Desktop: Move screenshot-related state and functions out of Desktop (service.py) into the screenshot module, eliminating unused delegation methods and reducing Desktop's responsibilities.
Simplify capture() parameters: Remove unnecessary callable/injection parameters (crop_screenshot, get_monitors_rect, dxcam_module, mss_module, camera_cache) that were passed through solely for testability but made the API confusing and brittle. Each backend now accesses what it needs directly.
Introduce ScreenshotBackend class hierarchy: Replace the flat capture_with_* functions and manual if/elif dispatch with an OOP design using __init_subclass__ for automatic backend registration. Adding a new backend now only requires defining a single class with name and priority.
Add comprehensive unit tests: New test_screenshot_capture.py with 38 test cases covering the capture() API, each backend class, the auto-registration mechanism, and edge cases like fallback chains and error recovery.

Motivation

The Desktop class had accumulated several screenshot-related methods (_capture_with_dxcam, _capture_with_pillow, _get_dxcam_camera, _resolve_dxcam_region, _get_screenshot_backend, _crop_screenshot, _build_crop_box) that were thin wrappers delegating to screenshot.py functions. These wrappers added no value — they simply forwarded calls — yet inflated Desktop's surface area and obscured where the real logic lived.

The capture() function signature had grown to accept 7 parameters, most of which were dependency-injection hooks introduced purely for unit testing:

# Before: confusing signature with test-only parameters
def capture(
    capture_rect,
    crop_screenshot: Callable,      # always Desktop._crop_screenshot
    get_monitors_rect: Callable,    # always uia.GetMonitorsRect
    camera_cache: dict,             # always Desktop._dxcam_cameras
    backend: str | None = None,
    dxcam_module=None,              # always the dxcam module
    mss_module=None,                # always the mss module
) -> tuple[Image.Image, str]:

This created several problems:

Unnecessary indirection: Every call site passed the same objects. If the actual argument never varies, the function should just use it directly.
Backend-specific parameters leaking into the generic API: camera_cache and dxcam_module only matter for the dxcam backend; passing them through the generic capture() is misleading — callers might expect them to affect other backends.
Module-level globals introduced for testability: _DXCAM_CAMERA_CACHE was a module global dict that existed primarily so tests could reset it, making the code harder to follow.

What changed

screenshot.py — Major restructure:

Introduced _ScreenshotBackend base class with __init_subclass__ auto-registration into a registry.
Three backend subclasses: _DxcamBackend (priority=10), _MssBackend (priority=20), _PillowBackend (priority=100).
Each backend encapsulates its own availability check (is_available) and capture logic (capture).
_DxcamBackend owns its camera cache as an instance attribute and its monitor region resolution as a static method.
capture() reduced to 2 parameters: capture(capture_rect, backend=None).
get_screenshot_backend() validates against the dynamic registry instead of a hardcoded set.
Removed: capture_with_dxcam(), capture_with_mss(), capture_with_pillow(), get_dxcam_camera(), _DXCAM_CAMERA_CACHE, resolve_dxcam_region(), _auto_backend_chain().
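The registration and priority ordering described in this list can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: the `registry` attribute, the `FastBackend` / `SafeBackend` classes, and the `auto_chain()` helper are hypothetical names, and the backends return strings instead of images so the sketch runs without any capture library. Only `__init_subclass__`, `name`, `priority`, `is_available`, and `capture` come from the description above.

```python
class ScreenshotBackend:
    """Base class; concrete subclasses register themselves when defined."""

    # Hypothetical registry: backend name -> backend class.
    registry: dict[str, type] = {}

    def __init_subclass__(cls, **kwargs: object) -> None:
        super().__init_subclass__(**kwargs)
        # Only classes that declare both attributes count as concrete backends.
        if "name" in cls.__dict__ and "priority" in cls.__dict__:
            ScreenshotBackend.registry[cls.name] = cls

    def is_available(self, capture_rect: object) -> bool:
        raise NotImplementedError

    def capture(self, capture_rect: object) -> str:
        raise NotImplementedError


class SafeBackend(ScreenshotBackend):
    name = "safe"
    priority = 100  # tried last

    def is_available(self, capture_rect: object) -> bool:
        return True

    def capture(self, capture_rect: object) -> str:
        return "image-from-safe"


class FastBackend(ScreenshotBackend):
    name = "fast"
    priority = 10  # tried first

    def is_available(self, capture_rect: object) -> bool:
        return capture_rect is None  # pretend it only handles full-screen grabs

    def capture(self, capture_rect: object) -> str:
        return "image-from-fast"


def auto_chain() -> list[type]:
    """Return registered backend classes, lowest priority value first."""
    return sorted(ScreenshotBackend.registry.values(), key=lambda c: c.priority)


print([cls.name for cls in auto_chain()])  # ['fast', 'safe']
```

With this shape, adding a backend is a matter of defining one more subclass; nothing else has to be edited for it to appear in the chain.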

service.py — Simplified:

Removed Desktop._dxcam_cameras, _get_screenshot_backend, _resolve_dxcam_region, _get_dxcam_camera, _capture_with_dxcam, _capture_with_pillow, _crop_screenshot, _build_crop_box.
Removed module-level dxcam = screenshot_capture.dxcam / mss = screenshot_capture.mss aliases.
Desktop.get_screenshot() now simply calls screenshot_capture.capture(capture_rect).

test_snapshot_display_filter.py — Updated mock targets:

All patch("windows_mcp.desktop.service.dxcam", ...) / service.mss references replaced with screenshot.* targets.
_DXCAM_CAMERA_CACHE mock replaced with _backend_instances reset.

test_screenshot_capture.py — New file, 38 test cases:

TestBackendRegistry: auto-registration, priority ordering, incomplete subclass handling.
TestGetScreenshotBackend: valid/invalid env vars, defaults, case insensitivity.
TestDxcamBackend: availability checks, region resolution (exact match, sub-region coordinates, cross-monitor), camera caching, capture success/failure.
TestMssBackend: availability, monitor dict construction, full-screen path.
TestPillowBackend: always-available, error fallback with/without capture rect.
TestCapture: explicit backend, unknown backend error, auto chain fallback, exception recovery, safety fallback, image content verification.

Test plan

ruff check . && ruff format --check . — lint passes
pytest tests/test_screenshot_capture.py -v — 38 new tests pass
pytest tests/test_snapshot_display_filter.py -v — 18 existing tests pass (no regressions)

Move dxcam/mss references, camera cache, crop helpers, and backend selection out of Desktop into screenshot.py to reduce Desktop's responsibilities. Update test patches to target screenshot module and isolate _DXCAM_CAMERA_CACHE between tests.

…h targets Remove Desktop._crop_screenshot and Desktop._build_crop_box that were duplicated in screenshot.py. Update test to call module-level _crop_screenshot and patch screenshot.uia.GetVirtualScreenRect instead of service.uia.GetVirtualScreenRect.

Have resolve_dxcam_region call uia.GetMonitorsRect() directly instead of receiving it as a callback. Remove unused Callable import and update test mock targets to screenshot.uia.GetMonitorsRect.

…tration Replace flat capture_with_* functions with an OOP design:
- ScreenshotBackend base class with __init_subclass__ auto-registration
- DxcamBackend (priority=10), MssBackend (priority=20), PillowBackend (priority=100)
- capture() now iterates registered backends by priority instead of if/elif chain
- get_screenshot_backend() validates against dynamic registry
- DxcamBackend encapsulates camera cache and resolve_region logic
- Adding a new backend only requires defining a subclass with name and priority

Add tests/test_screenshot_capture.py with 35 test cases covering:
- Backend auto-registration via __init_subclass__
- Environment variable parsing in get_screenshot_backend()
- DxcamBackend: is_available, _resolve_region coordinate math, camera cache
- MssBackend: availability check, monitor dict construction
- PillowBackend: always-available, error fallback paths
- capture() API: explicit backend, unknown backend, auto chain fallback, exception recovery, and image content verification

- Verify _get_backend singleton caching behavior
- Assert exact kwargs sequence in pillow primary-screen fallback
- Test explicit backend capture failure triggers pillow safety fallback
- Test explicit dxcam with cross-monitor rect falls back to pillow

qodo-code-review · 2026-04-11T06:42:23Z

Review Summary by Qodo

Refactor screenshot capture with OOP backends and simplified API

✨ Enhancement 🧪 Tests

Walkthroughs

Description

• Refactor screenshot module with OOP backend hierarchy using __init_subclass__ auto-registration
• Simplify capture() API from 7 parameters to 2 by eliminating test-only dependency injection
• Move screenshot logic out of Desktop class, reducing its responsibilities and surface area
• Add 35+ comprehensive unit tests covering backends, registry, and capture API edge cases

Diagram

flowchart LR
  A["Desktop.get_screenshot"] -->|calls| B["capture()"]
  B -->|selects backend| C["_ScreenshotBackend registry"]
  C -->|instantiates| D["_DxcamBackend"]
  C -->|instantiates| E["_MssBackend"]
  C -->|instantiates| F["_PillowBackend"]
  D -->|captures| G["Image"]
  E -->|captures| G
  F -->|captures| G
  B -->|returns| H["Image + backend_name"]

File Changes

1. src/windows_mcp/desktop/screenshot.py ✨ Enhancement +209/-131

Introduce OOP backend hierarchy with auto-registration

• Introduced _ScreenshotBackend base class with __init_subclass__ auto-registration mechanism
• Created three backend subclasses: _DxcamBackend (priority=10), _MssBackend (priority=20),
 _PillowBackend (priority=100)
• Moved _crop_screenshot() and _build_crop_box() from Desktop into module-level utilities
• Simplified capture() signature from 7 parameters to 2 (capture_rect, backend)
• Replaced flat capture_with_* functions with OOP backend classes encapsulating their own logic
• Added _get_backend() singleton caching for backend instances
• Updated get_screenshot_backend() to validate against dynamic registry instead of hardcoded set

2. src/windows_mcp/desktop/service.py ✨ Enhancement +1/-56

Remove screenshot delegation methods from Desktop

• Removed _dxcam_cameras instance attribute from Desktop.__init__
• Deleted wrapper methods: _get_screenshot_backend(), _resolve_dxcam_region(),
 _get_dxcam_camera(), _capture_with_dxcam(), _capture_with_pillow()
• Deleted utility methods: _crop_screenshot(), _build_crop_box()
• Removed module-level aliases dxcam and mss that were only used for delegation
• Simplified get_screenshot() to call screenshot_capture.capture(capture_rect) with no
 dependency injection parameters

3. tests/test_screenshot_capture.py 🧪 Tests +443/-0

Add comprehensive screenshot capture unit tests

• Added 35+ unit tests covering backend auto-registration, environment variable parsing, and each
 backend class
• Tests verify _DxcamBackend coordinate math, camera caching, and cross-monitor detection
• Tests verify _MssBackend availability checks and monitor dict construction
• Tests verify _PillowBackend fallback behavior on capture errors
• Tests verify capture() API: explicit backend selection, unknown backend errors, auto-chain
 fallback, exception recovery
• Tests verify singleton caching of backend instances and image content verification

4. tests/test_snapshot_display_filter.py 🧪 Tests +11/-15

Update test patches to target screenshot module

• Updated test fixture to remove desktop._dxcam_cameras = {} initialization (no longer exists)
• Updated patch targets from service.dxcam and service.mss to screenshot.dxcam and
 screenshot.mss
• Updated patch targets from service.uia.GetVirtualScreenRect to
 screenshot.uia.GetVirtualScreenRect
• Changed desktop._crop_screenshot() calls to module-level _crop_screenshot() function
• Added _backend_instances reset in dxcam test to isolate backend state
• Added get_screenshot_backend() mock to ensure Pillow path is tested

qodo-code-review · 2026-04-11T06:42:25Z

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (3) 📎 Requirement gaps (0) 🎨 UX Issues (0)

🐞 ≡ Correctness (1) ☼ Reliability (1)

📘 ⚙ Maintainability (2) ➹ Performance (1)

1. No 1920x1080 cap enforced 📘 ➹

Description

Screenshot capture paths can return images wider than 1920px (e.g., multi-monitor virtual screen)
because no resize/cap is applied before returning. This violates the documented maximum screenshot
resolution requirement and can increase payload/token usage significantly.

Code

src/windows_mcp/desktop/screenshot.py[R158-178]

+    def capture(self, capture_rect: uia.Rect | None) -> Image.Image:
+        grab_kwargs: dict[str, object] = {"all_screens": True}
        if capture_rect is not None:
-            logger.warning(
-                "Failed to capture selected region directly, falling back to virtual screen crop"
+            grab_kwargs["bbox"] = (
+                capture_rect.left,
+                capture_rect.top,
+                capture_rect.right,
+                capture_rect.bottom,
            )
-            return crop_screenshot(ImageGrab.grab(all_screens=True), capture_rect)
-        logger.warning("Failed to capture virtual screen, using primary screen")
-        screenshot = ImageGrab.grab()
-    return crop_screenshot(screenshot, capture_rect)
+        try:
+            screenshot = ImageGrab.grab(**grab_kwargs)
+        except (OSError, RuntimeError, ValueError):
+            if capture_rect is not None:
+                logger.warning(
+                    "Failed to capture selected region directly, "
+                    "falling back to virtual screen crop"
+                )
+                return _crop_screenshot(ImageGrab.grab(all_screens=True), capture_rect)
+            logger.warning("Failed to capture virtual screen, using primary screen")
+            screenshot = ImageGrab.grab()
+        return _crop_screenshot(screenshot, capture_rect)

Evidence

PR Compliance ID 5 requires screenshot outputs be capped at 1920x1080. _PillowBackend.capture()
returns the grabbed image (or cropped image) without any resizing/capping, and the new tests
explicitly assert a 3840x1080 result from capture(), demonstrating outputs can exceed 1920x1080.
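The size arithmetic behind such a cap could look like the sketch below. It assumes the cap should preserve aspect ratio and never upscale, which the review does not state; `cap_size` and its default limits are hypothetical names chosen for illustration, and the function works on plain integers so it runs without Pillow.

```python
def cap_size(
    width: int,
    height: int,
    max_width: int = 1920,
    max_height: int = 1080,
) -> tuple[int, int]:
    """Return (width, height) scaled down to fit the cap, keeping aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Never upscale: the scale factor is at most 1.0.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(cap_size(3840, 1080))  # (1920, 540)
print(cap_size(1280, 720))   # (1280, 720)
```

The resulting tuple could then be passed to Pillow's `Image.resize` before the image is returned; whether the cap belongs in each backend or once in `capture()` is a design decision the review leaves open.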

CLAUDE.md
src/windows_mcp/desktop/screenshot.py[158-178]
tests/test_screenshot_capture.py[414-430]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Screenshot capture can return images larger than 1920x1080 (e.g., 3840x1080), violating the maximum resolution requirement.

## Issue Context
`_PillowBackend.capture()` returns the grabbed image (and fallback crop) without any resizing/capping, and `capture()` returns that image unchanged.

## Fix Focus Areas
- src/windows_mcp/desktop/screenshot.py[152-178]
- src/windows_mcp/desktop/screenshot.py[230-261]
- tests/test_screenshot_capture.py[414-430]

2. MSS crops wrong region 🐞 ≡

Description

_MssBackend.capture() grabs an already-cropped region image (local coords start at (0,0)) but then
applies _crop_screenshot() using global screen coordinates, which can crop out-of-bounds and return
black/incorrect pixels. This can silently produce incorrect screenshots for region captures (and
current tests don’t assert against the corrupted output).

Code

src/windows_mcp/desktop/screenshot.py[R190-205]

+    def capture(self, capture_rect: uia.Rect | None) -> Image.Image:
+        if mss is None:
+            raise RuntimeError("mss is not available")
+        with mss.mss() as sct:
+            if capture_rect is None:
+                monitor = sct.monitors[0]
+            else:
+                monitor = {
+                    "left": capture_rect.left,
+                    "top": capture_rect.top,
+                    "width": capture_rect.right - capture_rect.left,
+                    "height": capture_rect.bottom - capture_rect.top,
+                }
+            raw = sct.grab(monitor)
+            image = Image.frombytes("RGB", raw.size, raw.rgb)
+        return _crop_screenshot(image, capture_rect)

Evidence

MSS capture constructs a region-sized image from raw.size and then calls _crop_screenshot(image,
capture_rect), but _crop_screenshot computes the crop box from global coordinates relative to the
virtual screen origin. For any capture_rect not aligned to the virtual origin, that crop box won’t
match the region image’s (0,0)-based coordinate system, producing out-of-bounds crops (black padding
/ shifted content). The added unit test scaffolding explicitly models MSS returning raw.size equal
to the requested region size, which makes this mismatch deterministic under the project’s own
assumptions.
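The coordinate mismatch can be reproduced with plain arithmetic. The sketch below assumes a two-monitor layout with the virtual-screen origin at (0, 0) and models the crop-box computation as a simple translation by that origin; `Rect`, `build_crop_box`, and `box_fits` are hypothetical helpers written for this illustration, not the functions in screenshot.py.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def build_crop_box(rect: Rect, virtual_origin: tuple[int, int]) -> tuple[int, int, int, int]:
    """Translate global screen coordinates into virtual-screen image coordinates."""
    vx, vy = virtual_origin
    return (rect.left - vx, rect.top - vy, rect.right - vx, rect.bottom - vy)


def box_fits(box: tuple[int, int, int, int], image_size: tuple[int, int]) -> bool:
    """Return True when the crop box lies entirely inside an image of this size."""
    left, top, right, bottom = box
    width, height = image_size
    return 0 <= left < right <= width and 0 <= top < bottom <= height


capture_rect = Rect(1920, 0, 3840, 1080)  # second monitor, assumed layout
virtual_origin = (0, 0)
full_size = (3840, 1080)                  # full virtual-screen image
region_size = (
    capture_rect.right - capture_rect.left,
    capture_rect.bottom - capture_rect.top,
)                                         # image returned by a region grab

box = build_crop_box(capture_rect, virtual_origin)
print(box_fits(box, full_size))    # True
print(box_fits(box, region_size))  # False
```

The same box is valid against the full virtual-screen image but falls outside the 1920x1080 region image, which is the situation in which a second crop yields padding instead of pixels.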

/src/windows_mcp/desktop/screenshot.py[181-206]
/src/windows_mcp/desktop/screenshot.py[26-40]
/tests/test_screenshot_capture.py[217-245]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`_MssBackend.capture()` builds an image from `mss.grab()` for a *specific region* and then calls `_crop_screenshot(image, capture_rect)`. That second crop uses global coordinates (via `GetVirtualScreenRect` offsets) against an image whose origin is already the region’s top-left, causing mis-cropping/out-of-bounds crops.

### Issue Context
For `capture_rect != None`, the `monitor` dict passed to `sct.grab()` already specifies left/top/width/height, so the returned `raw` and resulting `image` are expected to be exactly that region.

### Fix Focus Areas
- src/windows_mcp/desktop/screenshot.py[190-206]
- src/windows_mcp/desktop/screenshot.py[26-40]

### Suggested change
In `_MssBackend.capture()`:
- If `capture_rect is None`: return `image`.
- If `capture_rect is not None` and `image.size == (capture_rect.width(), capture_rect.height())`: return `image` directly (no crop).
- Otherwise (defensive fallback if some platform/library returns a full virtual-screen image): apply `_crop_screenshot(image, capture_rect)`.

### Test hardening
Add an assertion that the returned image content is not shifted/blank for a non-zero `capture_rect.left/top` case (e.g., validate a known pixel pattern or at least `getbbox()` plus a pixel check).

3. capture() docstring non-Google 📘 ⚙

Description

The public function capture() has a brief docstring but it does not follow Google-style (missing
Args: / Returns: sections). This violates the requirement for consistent public API
documentation.

Code

src/windows_mcp/desktop/screenshot.py[R230-235]

def capture(
-    capture_rect,
-    crop_screenshot: Callable[[Image.Image, object], Image.Image],
-    get_monitors_rect: Callable[[], list],
-    camera_cache: dict[int, object],
+    capture_rect: uia.Rect | None,
    backend: str | None = None,
-    dxcam_module=None,
-    mss_module=None,
) -> tuple[Image.Image, str]:
+    """Capture a screenshot and return ``(image, backend_name_used)``."""
    selected = backend or get_screenshot_backend()
-    chain = _auto_backend_chain() if selected == "auto" else [selected]

Evidence

PR Compliance ID 4 requires Google-style docstrings for public functions/classes. capture() is
public and its docstring does not include Google-style sections such as Args: and Returns:.

CLAUDE.md
src/windows_mcp/desktop/screenshot.py[230-235]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`capture()` is a public API but its docstring is not Google-style.

## Issue Context
Compliance requires Google-style docstrings for public functions/classes.

## Fix Focus Areas
- src/windows_mcp/desktop/screenshot.py[230-235]

4. Test functions missing type hints 📘 ⚙

Description

New test functions/methods were added without parameter and return type annotations. This violates
the requirement that all new/modified function signatures include type hints.

Code

tests/test_screenshot_capture.py[R33-37]

+@pytest.fixture(autouse=True)
+def _isolate_backend_instances(monkeypatch):
+    """Ensure each test gets a fresh backend instance pool."""
+    monkeypatch.setattr(screenshot, "_backend_instances", {})
+

Evidence

PR Compliance ID 3 requires type hints on new/modified function signatures. The new test module adds
multiple functions/methods (including a fixture and backend test methods) with unannotated
parameters and missing return type annotations.

CLAUDE.md
tests/test_screenshot_capture.py[33-37]
tests/test_screenshot_capture.py[44-58]
tests/test_screenshot_capture.py[104-115]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
New test functions/methods are missing type hints for parameters and return types.

## Issue Context
Compliance requires type annotations on all new/modified function signatures.

## Fix Focus Areas
- tests/test_screenshot_capture.py[33-37]
- tests/test_screenshot_capture.py[44-58]
- tests/test_screenshot_capture.py[104-115]

5. is_available exceptions uncaught 🐞 ☼

Description

capture() calls inst.is_available() outside the try/except, so any exception in a backend
availability check aborts capture() and prevents fallback to other backends. This undermines the
reliability guarantees of the auto fallback chain.

Code

src/windows_mcp/desktop/screenshot.py[R247-250]

+    for backend_cls in chain:
+        inst = _get_backend(backend_cls.name)
+        if not inst.is_available(capture_rect):
+            continue

Evidence

The backend loop checks inst.is_available(capture_rect) before entering the guarded try: that
catches backend failures. _DxcamBackend.is_available() calls _resolve_region(), which calls
uia.GetMonitorsRect(). GetMonitorsRect() makes direct Windows API calls via
ctypes.windll.user32.EnumDisplayMonitors without any error handling, so any exception raised
during monitor enumeration would bubble out of is_available() and bypass the fallback logic.

/src/windows_mcp/desktop/screenshot.py[230-259]
/src/windows_mcp/desktop/screenshot.py[101-130]
/src/windows_mcp/uia/core.py[643-675]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`capture()` calls `inst.is_available()` outside the existing try/except, so exceptions raised during availability checks prevent fallback to later backends.

### Issue Context
`_DxcamBackend.is_available()` calls into `uia.GetMonitorsRect()` via `_resolve_region()`. Monitor enumeration uses Windows API calls via `ctypes.windll` and is not exception-handled in `uia`, so unexpected failures can bubble up.

### Fix Focus Areas
- src/windows_mcp/desktop/screenshot.py[246-259]
- src/windows_mcp/desktop/screenshot.py[124-130]

### Suggested change
Wrap the availability check in the same fallback behavior as capture failures, e.g.:

```python
for backend_cls in chain:
   inst = _get_backend(backend_cls.name)
   try:
       if not inst.is_available(capture_rect):
           continue
   except Exception:
       logger.warning(
           "Screenshot backend '%s' availability check failed; trying next backend",
           inst.name,
           exc_info=selected != "auto",
       )
       continue

   try:
       return inst.capture(capture_rect), inst.name
   except (OSError, RuntimeError, ValueError):
       ...
```

Optionally also make `_DxcamBackend.is_available()` catch exceptions and return `False` to keep backends self-contained.
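A self-contained way to exercise the suggested loop is sketched below. The three backend classes and `capture_with_fallback` are hypothetical stand-ins, not code from the PR: one backend raises during its availability check, one raises during capture, and the last succeeds, so both guarded paths are taken before a result is returned.

```python
import logging

logger = logging.getLogger("screenshot_sketch")


class BrokenProbeBackend:
    name = "broken-probe"

    def is_available(self, capture_rect: object) -> bool:
        raise OSError("monitor enumeration failed")

    def capture(self, capture_rect: object) -> str:
        return "unreachable"


class FailingCaptureBackend:
    name = "failing-capture"

    def is_available(self, capture_rect: object) -> bool:
        return True

    def capture(self, capture_rect: object) -> str:
        raise RuntimeError("grab failed")


class SafeBackend:
    name = "safe"

    def is_available(self, capture_rect: object) -> bool:
        return True

    def capture(self, capture_rect: object) -> str:
        return "image"


def capture_with_fallback(backends: list, capture_rect: object = None) -> tuple[str, str]:
    """Try each backend in order; a failed probe or capture moves on to the next."""
    for inst in backends:
        try:
            if not inst.is_available(capture_rect):
                continue
        except Exception:
            logger.warning("Backend %r availability check failed; trying next", inst.name)
            continue
        try:
            return inst.capture(capture_rect), inst.name
        except (OSError, RuntimeError, ValueError):
            logger.warning("Backend %r capture failed; trying next", inst.name)
    raise RuntimeError("no screenshot backend succeeded")


result = capture_with_fallback(
    [BrokenProbeBackend(), FailingCaptureBackend(), SafeBackend()]
)
print(result)  # ('image', 'safe')
```

Without the first `try`, the `OSError` from `BrokenProbeBackend.is_available()` would propagate out of the loop and the later backends would never be tried.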

Copilot

Pull request overview

Refactors the Windows screenshot capture implementation to be more cohesive and extensible by moving screenshot responsibilities out of Desktop and into a backend-driven screenshot module, along with updating/adding tests around the new API.

Changes:

Introduces an OOP backend registry (_ScreenshotBackend + subclasses) and simplifies the public capture() API.
Removes screenshot-related delegation/state from Desktop and routes Desktop.get_screenshot() directly through screenshot.capture().
Updates existing tests to patch the new module targets and adds a new dedicated screenshot capture test suite.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `src/windows_mcp/desktop/screenshot.py` | Replaces function-based capture with backend classes + registry + simplified `capture()` entrypoint. |
| `src/windows_mcp/desktop/service.py` | Removes screenshot helpers/state from `Desktop`; delegates screenshot acquisition to `screenshot.capture()`. |
| `tests/test_snapshot_display_filter.py` | Updates patch targets to the new screenshot module and adapts cropping/screenshot tests. |
| `tests/test_screenshot_capture.py` | Adds a new unit test suite covering backend registration, backend behavior, and `capture()` fallback logic. |

Comments suppressed due to low confidence (1)

tests/test_snapshot_display_filter.py:201

In test_get_screenshot_falls_back_to_pillow_when_dxcam_region_is_unsupported, the mocked ImageGrab.grab() returns a (1920, 1080) image even though the bbox passed is (0, 0, 3840, 1080), and the assertion expects the final screenshot size (3840, 1080). This mismatch can mask issues (e.g., relying on PIL crop padding). Consider making the mock return an image whose size matches the requested bbox (or asserting directly on the grab call args instead of the returned size) so the test reflects real ImageGrab behavior.

        with patch("windows_mcp.desktop.screenshot.ImageGrab.grab") as mock_grab:
            mock_grab.return_value = Image.new("RGB", (1920, 1080), "white")
            screenshot = desktop.get_screenshot(capture_rect=capture_rect)

        assert screenshot.size == (3840, 1080)
        assert fake_dxcam.create.call_count == 0
        assert mock_grab.call_args.kwargs == {
            "bbox": (0, 0, 3840, 1080),
            "all_screens": True,
        }

Copilot · 2026-04-11T06:45:55Z

+                return _crop_screenshot(ImageGrab.grab(all_screens=True), capture_rect)
+            logger.warning("Failed to capture virtual screen, using primary screen")
+            screenshot = ImageGrab.grab()
+        return _crop_screenshot(screenshot, capture_rect)


When capture_rect is provided, this backend already requests only that region via ImageGrab.grab(bbox=...). Cropping the returned image again using _crop_screenshot(..., capture_rect) applies absolute virtual-screen coordinates to an image whose origin is already (0,0) for the requested region, which can yield padded/blank output. A concrete fix is to return screenshot directly on the success path when bbox was used, and only call _crop_screenshot in the fallback path where you grab the full virtual screen and then need to crop.

Suggested change

-        return _crop_screenshot(screenshot, capture_rect)
+        return screenshot

The double-crop issue in _PillowBackend and _MssBackend is a pre-existing bug (late discovery) — it existed in the original capture_with_pillow and capture_with_mss functions before this refactoring. Our refactoring moved the logic into class methods as-is without introducing or fixing it.

Root cause: When capture_rect is provided, both ImageGrab.grab(bbox=...) and mss.grab(monitor) already return an image cropped to the requested region (origin at (0,0)). Calling _crop_screenshot() afterwards applies absolute virtual-screen coordinates to this already-cropped image, which produces out-of-bounds/black output when the region doesn't start at the virtual screen origin.

Practical impact is very limited in the current codebase: the auto chain tries dxcam first (a required dependency), and the dxcam path does not call _crop_screenshot. The bug only manifests when dxcam is unavailable or fails and the capture rect targets a non-primary monitor — an unlikely combination in normal usage.

I'll verify and fix this in a follow-up PR to keep the refactoring scope clean.

Copilot · 2026-04-11T06:45:55Z

+                    "height": capture_rect.bottom - capture_rect.top,
+                }
+            raw = sct.grab(monitor)
+            image = Image.frombytes("RGB", raw.size, raw.rgb)


When capture_rect is provided, mss.grab() is already capturing exactly that rectangle (based on the monitor dict). Applying _crop_screenshot(image, capture_rect) again uses absolute virtual-screen coordinates against an image whose origin is the capture rectangle, which can crop to the wrong pixels and/or introduce black padding. Consider returning image directly when capture_rect is not None, and reserving _crop_screenshot only for cases where the captured image is the full virtual screen.

Suggested change

-            image = Image.frombytes("RGB", raw.size, raw.rgb)
+            image = Image.frombytes("RGB", raw.size, raw.rgb)
+            if capture_rect is not None:
+                return image

same as above

qodo-code-review · 2026-04-11T06:48:30Z

+    def capture(self, capture_rect: uia.Rect | None) -> Image.Image:
+        grab_kwargs: dict[str, object] = {"all_screens": True}
        if capture_rect is not None:
-            logger.warning(
-                "Failed to capture selected region directly, falling back to virtual screen crop"
+            grab_kwargs["bbox"] = (
+                capture_rect.left,
+                capture_rect.top,
+                capture_rect.right,
+                capture_rect.bottom,
            )
-            return crop_screenshot(ImageGrab.grab(all_screens=True), capture_rect)
-        logger.warning("Failed to capture virtual screen, using primary screen")
-        screenshot = ImageGrab.grab()
-    return crop_screenshot(screenshot, capture_rect)
+        try:
+            screenshot = ImageGrab.grab(**grab_kwargs)
+        except (OSError, RuntimeError, ValueError):
+            if capture_rect is not None:
+                logger.warning(
+                    "Failed to capture selected region directly, "
+                    "falling back to virtual screen crop"
+                )
+                return _crop_screenshot(ImageGrab.grab(all_screens=True), capture_rect)
+            logger.warning("Failed to capture virtual screen, using primary screen")
+            screenshot = ImageGrab.grab()
+        return _crop_screenshot(screenshot, capture_rect)


1. No 1920x1080 cap enforced 📘 Rule violation ➹ Performance

Screenshot capture paths can return images wider than 1920px (e.g., multi-monitor virtual screen) because no resize/cap is applied before returning. This violates the documented maximum screenshot resolution requirement and can increase payload/token usage significantly.

qodo-code-review · 2026-04-11T06:48:30Z

+    def capture(self, capture_rect: uia.Rect | None) -> Image.Image:
+        if mss is None:
+            raise RuntimeError("mss is not available")
+        with mss.mss() as sct:
+            if capture_rect is None:
+                monitor = sct.monitors[0]
+            else:
+                monitor = {
+                    "left": capture_rect.left,
+                    "top": capture_rect.top,
+                    "width": capture_rect.right - capture_rect.left,
+                    "height": capture_rect.bottom - capture_rect.top,
+                }
+            raw = sct.grab(monitor)
+            image = Image.frombytes("RGB", raw.size, raw.rgb)
+        return _crop_screenshot(image, capture_rect)


2. Mss crops wrong region 🐞 Bug ≡ Correctness

_MssBackend.capture() grabs an already-cropped region image (local coords start at (0,0)) but then applies _crop_screenshot() using global screen coordinates, which can crop out-of-bounds and return black/incorrect pixels. This can silently produce incorrect screenshots for region captures (and current tests don’t assert against the corrupted output).

Agent Prompt

### Issue description `_MssBackend.capture()` builds an image from `mss.grab()` for a *specific region* and then calls `_crop_screenshot(image, capture_rect)`. That second crop uses global coordinates (via `GetVirtualScreenRect` offsets) against an image whose origin is already the region’s top-left, causing mis-cropping/out-of-bounds crops. ### Issue Context For `capture_rect != None`, the `monitor` dict passed to `sct.grab()` already specifies left/top/width/height, so the returned `raw` and resulting `image` are expected to be exactly that region. ### Fix Focus Areas - src/windows_mcp/desktop/screenshot.py[190-206] - src/windows_mcp/desktop/screenshot.py[26-40] ### Suggested change In `_MssBackend.capture()`: - If `capture_rect is None`: return `image`. - If `capture_rect is not None` and `image.size == (capture_rect.width(), capture_rect.height())`: return `image` directly (no crop). - Otherwise (defensive fallback if some platform/library returns a full virtual-screen image): apply `_crop_screenshot(image, capture_rect)`. ### Test hardening Add an assertion that the returned image content is not shifted/blank for a non-zero `capture_rect.left/top` case (e.g., validate a known pixel pattern or at least `getbbox()` plus a pixel check).


same as above

Use cls.__dict__ instead of hasattr in __init_subclass__ so only classes that explicitly define name and priority are registered. Add duplicate name detection that raises ValueError on conflicts.

JezaChen added 6 commits April 3, 2026 21:24

refactor: remove get_monitors_rect parameter from capture functions

6508ec2

Have resolve_dxcam_region call uia.GetMonitorsRect() directly instead of receiving it as a callback. Remove unused Callable import and update test mock targets to screenshot.uia.GetMonitorsRect.
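The change of mock target follows from where the name is looked up: once the function calls `uia.GetMonitorsRect()` itself, tests patch that attribute on the module object the function reads from. A minimal sketch, assuming a stand-in `uia` namespace, tuple-shaped monitor rects, and a hypothetical helper `monitor_containing` (none of these are taken from the PR's code):

```python
from types import SimpleNamespace
from unittest import mock

# Stand-in for the real `uia` module; the real one queries Windows.
uia = SimpleNamespace(GetMonitorsRect=lambda: [])


def monitor_containing(x, y):
    """Hypothetical helper: index of the monitor containing a point, or None.

    It calls uia.GetMonitorsRect() directly instead of receiving a callback.
    """
    for index, (left, top, right, bottom) in enumerate(uia.GetMonitorsRect()):
        if left <= x < right and top <= y < bottom:
            return index
    return None


# Patch the attribute on the object the function looks it up on.
fake_monitors = [(0, 0, 1920, 1080), (1920, 0, 3840, 1080)]
with mock.patch.object(uia, "GetMonitorsRect", return_value=fake_monitors):
    found = monitor_containing(2000, 100)

# Outside the `with` block the original attribute is restored.
after = monitor_containing(2000, 100)
```

In the PR's tests the equivalent target is the string path `screenshot.uia.GetMonitorsRect`, as the commit message states.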

Copilot AI review requested due to automatic review settings April 11, 2026 06:42

Copilot started reviewing on behalf of JezaChen April 11, 2026 06:42

Copilot AI reviewed Apr 11, 2026


qodo-code-review Bot reviewed Apr 11, 2026


fix: prevent inherited subclasses from overwriting backend registry

f7ff636

Use cls.__dict__ instead of hasattr in __init_subclass__ so only classes that explicitly define name and priority are registered. Add duplicate name detection that raises ValueError on conflicts.

Jeomon merged commit 4e189ea into CursorTouch:main Apr 11, 2026

JezaChen mentioned this pull request Apr 11, 2026

fix: remove double-crop in pillow and mss backends for region captures #203

Merged

2 tasks

This was referenced Jul 1, 2026

Fix/multimonitor snapshot annotations #303

Merged

Update dependencies #307

Draft

Refactor screenshot capture module for cohesion and extensibility #202

Jeomon merged 7 commits into CursorTouch:main from JezaChen:refactor/screenshot-capture

3 participants

	return _crop_screenshot(screenshot, capture_rect)
	return screenshot
