Refactor screenshot capture module for cohesion and extensibility#202
Conversation
Move dxcam/mss references, camera cache, crop helpers, and backend selection out of Desktop into screenshot.py to reduce Desktop's responsibilities. Update test patches to target screenshot module and isolate _DXCAM_CAMERA_CACHE between tests.
…h targets Remove Desktop._crop_screenshot and Desktop._build_crop_box that were duplicated in screenshot.py. Update test to call module-level _crop_screenshot and patch screenshot.uia.GetVirtualScreenRect instead of service.uia.GetVirtualScreenRect.
Have resolve_dxcam_region call uia.GetMonitorsRect() directly instead of receiving it as a callback. Remove unused Callable import and update test mock targets to screenshot.uia.GetMonitorsRect.
…tration Replace flat capture_with_* functions with an OOP design: - ScreenshotBackend base class with __init_subclass__ auto-registration - DxcamBackend (priority=10), MssBackend (priority=20), PillowBackend (priority=100) - capture() now iterates registered backends by priority instead of if/elif chain - get_screenshot_backend() validates against dynamic registry - DxcamBackend encapsulates camera cache and resolve_region logic - Adding a new backend only requires defining a subclass with name and priority
Add tests/test_screenshot_capture.py with 35 test cases covering: - Backend auto-registration via __init_subclass__ - Environment variable parsing in get_screenshot_backend() - DxcamBackend: is_available, _resolve_region coordinate math, camera cache - MssBackend: availability check, monitor dict construction - PillowBackend: always-available, error fallback paths - capture() API: explicit backend, unknown backend, auto chain fallback, exception recovery, and image content verification
- Verify _get_backend singleton caching behavior - Assert exact kwargs sequence in pillow primary-screen fallback - Test explicit backend capture failure triggers pillow safety fallback - Test explicit dxcam with cross-monitor rect falls back to pillow
Review Summary by QodoRefactor screenshot capture with OOP backends and simplified API
WalkthroughsDescription• Refactor screenshot module with OOP backend hierarchy using __init_subclass__ auto-registration • Simplify capture() API from 7 parameters to 2 by eliminating test-only dependency injection • Move screenshot logic out of Desktop class, reducing its responsibilities and surface area • Add 35+ comprehensive unit tests covering backends, registry, and capture API edge cases Diagramflowchart LR
A["Desktop.get_screenshot"] -->|calls| B["capture()"]
B -->|selects backend| C["_ScreenshotBackend registry"]
C -->|instantiates| D["_DxcamBackend"]
C -->|instantiates| E["_MssBackend"]
C -->|instantiates| F["_PillowBackend"]
D -->|captures| G["Image"]
E -->|captures| G
F -->|captures| G
B -->|returns| H["Image + backend_name"]
File Changes1. src/windows_mcp/desktop/screenshot.py
|
Code Review by Qodo
|
There was a problem hiding this comment.
Pull request overview
Refactors the Windows screenshot capture implementation to be more cohesive and extensible by moving screenshot responsibilities out of Desktop and into a backend-driven screenshot module, along with updating/adding tests around the new API.
Changes:
- Introduces an OOP backend registry (
_ScreenshotBackend+ subclasses) and simplifies the publiccapture()API. - Removes screenshot-related delegation/state from
Desktopand routesDesktop.get_screenshot()directly throughscreenshot.capture(). - Updates existing tests to patch the new module targets and adds a new dedicated screenshot capture test suite.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/windows_mcp/desktop/screenshot.py |
Replaces function-based capture with backend classes + registry + simplified capture() entrypoint. |
src/windows_mcp/desktop/service.py |
Removes screenshot helpers/state from Desktop; delegates screenshot acquisition to screenshot.capture(). |
tests/test_snapshot_display_filter.py |
Updates patch targets to the new screenshot module and adapts cropping/screenshot tests. |
tests/test_screenshot_capture.py |
Adds a new unit test suite covering backend registration, backend behavior, and capture() fallback logic. |
Comments suppressed due to low confidence (1)
tests/test_snapshot_display_filter.py:201
- In
test_get_screenshot_falls_back_to_pillow_when_dxcam_region_is_unsupported, the mockedImageGrab.grab()returns a(1920, 1080)image even though thebboxpassed is(0, 0, 3840, 1080), and the assertion expects the final screenshot size(3840, 1080). This mismatch can mask issues (e.g., relying on PIL crop padding). Consider making the mock return an image whose size matches the requested bbox (or asserting directly on thegrabcall args instead of the returned size) so the test reflects realImageGrabbehavior.
with patch("windows_mcp.desktop.screenshot.ImageGrab.grab") as mock_grab:
mock_grab.return_value = Image.new("RGB", (1920, 1080), "white")
screenshot = desktop.get_screenshot(capture_rect=capture_rect)
assert screenshot.size == (3840, 1080)
assert fake_dxcam.create.call_count == 0
assert mock_grab.call_args.kwargs == {
"bbox": (0, 0, 3840, 1080),
"all_screens": True,
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return _crop_screenshot(ImageGrab.grab(all_screens=True), capture_rect) | ||
| logger.warning("Failed to capture virtual screen, using primary screen") | ||
| screenshot = ImageGrab.grab() | ||
| return _crop_screenshot(screenshot, capture_rect) |
There was a problem hiding this comment.
When capture_rect is provided, this backend already requests only that region via ImageGrab.grab(bbox=...). Cropping the returned image again using _crop_screenshot(..., capture_rect) applies absolute virtual-screen coordinates to an image whose origin is already (0,0) for the requested region, which can yield padded/blank output. A concrete fix is to return screenshot directly on the success path when bbox was used, and only call _crop_screenshot in the fallback path where you grab the full virtual screen and then need to crop.
| return _crop_screenshot(screenshot, capture_rect) | |
| return screenshot |
There was a problem hiding this comment.
The double-crop issue in _PillowBackend and _MssBackend is a pre-existing bug (late discovery) — it existed in the original capture_with_pillow and capture_with_mss functions before this refactoring. Our refactoring moved the logic into class methods as-is without introducing or fixing it.
Root cause: When capture_rect is provided, both ImageGrab.grab(bbox=...) and mss.grab(monitor) already return an image cropped to the requested region (origin at (0,0)). Calling _crop_screenshot() afterwards applies absolute virtual-screen coordinates to this already-cropped image, which produces out-of-bounds/black output when the region doesn't start at the virtual screen origin.
Practical impact is very limited in the current codebase: the auto chain tries dxcam first (a required dependency), and the dxcam path does not call _crop_screenshot. The bug only manifests when dxcam is unavailable or fails and the capture rect targets a non-primary monitor — an unlikely combination in normal usage.
I'll verify and fix this in a follow-up PR to keep the refactoring scope clean.
| "height": capture_rect.bottom - capture_rect.top, | ||
| } | ||
| raw = sct.grab(monitor) | ||
| image = Image.frombytes("RGB", raw.size, raw.rgb) |
There was a problem hiding this comment.
When capture_rect is provided, mss.grab() is already capturing exactly that rectangle (based on the monitor dict). Applying _crop_screenshot(image, capture_rect) again uses absolute virtual-screen coordinates against an image whose origin is the capture rectangle, which can crop to the wrong pixels and/or introduce black padding. Consider returning image directly when capture_rect is not None, and reserving _crop_screenshot only for cases where the captured image is the full virtual screen.
| image = Image.frombytes("RGB", raw.size, raw.rgb) | |
| image = Image.frombytes("RGB", raw.size, raw.rgb) | |
| if capture_rect is not None: | |
| return image |
| def capture(self, capture_rect: uia.Rect | None) -> Image.Image: | ||
| grab_kwargs: dict[str, object] = {"all_screens": True} | ||
| if capture_rect is not None: | ||
| logger.warning( | ||
| "Failed to capture selected region directly, falling back to virtual screen crop" | ||
| grab_kwargs["bbox"] = ( | ||
| capture_rect.left, | ||
| capture_rect.top, | ||
| capture_rect.right, | ||
| capture_rect.bottom, | ||
| ) | ||
| return crop_screenshot(ImageGrab.grab(all_screens=True), capture_rect) | ||
| logger.warning("Failed to capture virtual screen, using primary screen") | ||
| screenshot = ImageGrab.grab() | ||
| return crop_screenshot(screenshot, capture_rect) | ||
| try: | ||
| screenshot = ImageGrab.grab(**grab_kwargs) | ||
| except (OSError, RuntimeError, ValueError): | ||
| if capture_rect is not None: | ||
| logger.warning( | ||
| "Failed to capture selected region directly, " | ||
| "falling back to virtual screen crop" | ||
| ) | ||
| return _crop_screenshot(ImageGrab.grab(all_screens=True), capture_rect) | ||
| logger.warning("Failed to capture virtual screen, using primary screen") | ||
| screenshot = ImageGrab.grab() | ||
| return _crop_screenshot(screenshot, capture_rect) |
There was a problem hiding this comment.
1. No 1920x1080 cap enforced 📘 Rule violation ➹ Performance
Screenshot capture paths can return images wider than 1920px (e.g., multi-monitor virtual screen) because no resize/cap is applied before returning. This violates the documented maximum screenshot resolution requirement and can increase payload/token usage significantly.
Agent Prompt
## Issue description
Screenshot capture can return images larger than 1920x1080 (e.g., 3840x1080), violating the maximum resolution requirement.
## Issue Context
`_PillowBackend.capture()` returns the grabbed image (and fallback crop) without any resizing/capping, and `capture()` returns that image unchanged.
## Fix Focus Areas
- src/windows_mcp/desktop/screenshot.py[152-178]
- src/windows_mcp/desktop/screenshot.py[230-261]
- tests/test_screenshot_capture.py[414-430]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| def capture(self, capture_rect: uia.Rect | None) -> Image.Image: | ||
| if mss is None: | ||
| raise RuntimeError("mss is not available") | ||
| with mss.mss() as sct: | ||
| if capture_rect is None: | ||
| monitor = sct.monitors[0] | ||
| else: | ||
| monitor = { | ||
| "left": capture_rect.left, | ||
| "top": capture_rect.top, | ||
| "width": capture_rect.right - capture_rect.left, | ||
| "height": capture_rect.bottom - capture_rect.top, | ||
| } | ||
| raw = sct.grab(monitor) | ||
| image = Image.frombytes("RGB", raw.size, raw.rgb) | ||
| return _crop_screenshot(image, capture_rect) |
There was a problem hiding this comment.
2. Mss crops wrong region 🐞 Bug ≡ Correctness
_MssBackend.capture() grabs an already-cropped region image (local coords start at (0,0)) but then applies _crop_screenshot() using global screen coordinates, which can crop out-of-bounds and return black/incorrect pixels. This can silently produce incorrect screenshots for region captures (and current tests don’t assert against the corrupted output).
Agent Prompt
### Issue description
`_MssBackend.capture()` builds an image from `mss.grab()` for a *specific region* and then calls `_crop_screenshot(image, capture_rect)`. That second crop uses global coordinates (via `GetVirtualScreenRect` offsets) against an image whose origin is already the region’s top-left, causing mis-cropping/out-of-bounds crops.
### Issue Context
For `capture_rect != None`, the `monitor` dict passed to `sct.grab()` already specifies left/top/width/height, so the returned `raw` and resulting `image` are expected to be exactly that region.
### Fix Focus Areas
- src/windows_mcp/desktop/screenshot.py[190-206]
- src/windows_mcp/desktop/screenshot.py[26-40]
### Suggested change
In `_MssBackend.capture()`:
- If `capture_rect is None`: return `image`.
- If `capture_rect is not None` and `image.size == (capture_rect.width(), capture_rect.height())`: return `image` directly (no crop).
- Otherwise (defensive fallback if some platform/library returns a full virtual-screen image): apply `_crop_screenshot(image, capture_rect)`.
### Test hardening
Add an assertion that the returned image content is not shifted/blank for a non-zero `capture_rect.left/top` case (e.g., validate a known pixel pattern or at least `getbbox()` plus a pixel check).
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
Use cls.__dict__ instead of hasattr in __init_subclass__ so only classes that explicitly define name and priority are registered. Add duplicate name detection that raises ValueError on conflicts.
Summary
Desktop: Move screenshot-related state and functions out ofDesktop(service.py) into thescreenshotmodule, eliminating unused delegation methods and reducingDesktop's responsibilities.capture()parameters: Remove unnecessary callable/injection parameters (crop_screenshot,get_monitors_rect,dxcam_module,mss_module,camera_cache) that were passed through solely for testability but made the API confusing and brittle. Each backend now accesses what it needs directly.ScreenshotBackendclass hierarchy: Replace the flatcapture_with_*functions and manual if/elif dispatch with an OOP design using__init_subclass__for automatic backend registration. Adding a new backend now only requires defining a single class withnameandpriority.test_screenshot_capture.pywith 38 test cases covering thecapture()API, each backend class, the auto-registration mechanism, and edge cases like fallback chains and error recovery.Motivation
The
Desktopclass had accumulated several screenshot-related methods (_capture_with_dxcam,_capture_with_pillow,_get_dxcam_camera,_resolve_dxcam_region,_get_screenshot_backend,_crop_screenshot,_build_crop_box) that were thin wrappers delegating toscreenshot.pyfunctions. These wrappers added no value — they simply forwarded calls — yet inflatedDesktop's surface area and obscured where the real logic lived.The
capture()function signature had grown to accept 7 parameters, most of which were dependency-injection hooks introduced purely for unit testing:This created several problems:
camera_cacheanddxcam_moduleonly matter for the dxcam backend; passing them through the genericcapture()is misleading — callers might expect them to affect other backends._DXCAM_CAMERA_CACHEwas a module global dict that existed primarily so tests could reset it, making the code harder to follow.What changed
screenshot.py— Major restructure:_ScreenshotBackendbase class with__init_subclass__auto-registration intoregistry._DxcamBackend(priority=10),_MssBackend(priority=20),_PillowBackend(priority=100).is_available) and capture logic (capture)._DxcamBackendowns its camera cache as an instance attribute and its monitor region resolution as a static method.capture()reduced to 2 parameters:capture(capture_rect, backend=None).get_screenshot_backend()validates against the dynamic registry instead of a hardcoded set.capture_with_dxcam(),capture_with_mss(),capture_with_pillow(),get_dxcam_camera(),_DXCAM_CAMERA_CACHE,resolve_dxcam_region(),_auto_backend_chain().service.py— Simplified:Desktop._dxcam_cameras,_get_screenshot_backend,_resolve_dxcam_region,_get_dxcam_camera,_capture_with_dxcam,_capture_with_pillow,_crop_screenshot,_build_crop_box.dxcam = screenshot_capture.dxcam/mss = screenshot_capture.mssaliases.Desktop.get_screenshot()now simply callsscreenshot_capture.capture(capture_rect).test_snapshot_display_filter.py— Updated mock targets:patch("windows_mcp.desktop.service.dxcam", ...)/service.mssreferences replaced withscreenshot.*targets._DXCAM_CAMERA_CACHEmock replaced with_backend_instancesreset.test_screenshot_capture.py— New file, 38 test cases:TestBackendRegistry: auto-registration, priority ordering, incomplete subclass handling.TestGetScreenshotBackend: valid/invalid env vars, defaults, case insensitivity.TestDxcamBackend: availability checks, region resolution (exact match, sub-region coordinates, cross-monitor), camera caching, capture success/failure.TestMssBackend: availability, monitor dict construction, full-screen path.TestPillowBackend: always-available, error fallback with/without capture rect.TestCapture: explicit backend, unknown backend error, auto chain fallback, exception recovery, safety fallback, image content verification.Test plan
ruff check . && ruff format --check .— lint passespytest tests/test_screenshot_capture.py -v— 38 new tests passpytest tests/test_snapshot_display_filter.py -v— 18 existing tests pass (no regressions)