feat: serve-centered local API bridge for Playwright/Tauri integration #1692
logoods wants to merge 1 commit into
Code Review
This pull request establishes the core infrastructure for the DeepSeek IDE extension layer, featuring a Python-based API server, a process manager for the TUI, and a Playwright wrapper for browser automation. Feedback highlights a critical race condition during the startup sequence and a path traversal vulnerability in the screenshot endpoint. Other recommendations focus on performance optimizations—such as parallelizing HTTP requests and avoiding synchronous file I/O—as well as improving the chat streaming implementation and error handling for malformed JSON requests.
    await server.start()

    # Start Playwright layer (waits for TUI to be ready)
    await pw.start()

There is a potential race condition here. The ExtensionServer is started at line 42, but the PlaywrightWrapper (which it depends on for almost all API calls) is not initialized until pw.start() completes at line 45. If an API request (e.g., /v1/status) arrives in this interval, it will likely result in an AttributeError because pw._session or other members are still None. Consider starting the server only after its dependencies are fully ready, or implement readiness checks in the API handlers.
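A readiness check along these lines could close the gap. This is a minimal standard-library sketch, not the PR's code: `ReadinessGate` and `status_handler` are hypothetical names, and the handler returns a `(status, body)` pair instead of a real aiohttp response.

```python
import asyncio
from typing import Any


class ReadinessGate:
    """Tracks whether a dependency (here, the Playwright layer) has finished starting."""

    def __init__(self) -> None:
        self._ready = asyncio.Event()

    def mark_ready(self) -> None:
        self._ready.set()

    @property
    def is_ready(self) -> bool:
        return self._ready.is_set()


async def status_handler(gate: ReadinessGate) -> tuple[int, dict[str, Any]]:
    # Hypothetical handler: answer 503 instead of touching uninitialized members.
    if not gate.is_ready:
        return 503, {"error": "playwright layer is still starting"}
    return 200, {"status": "ok"}


async def demo() -> list[int]:
    gate = ReadinessGate()
    before, _ = await status_handler(gate)  # request arrives during startup
    gate.mark_ready()                       # would be called once pw.start() returns
    after, _ = await status_handler(gate)
    return [before, after]
```

With a gate like this the server can keep starting first, since early requests get an explicit 503 rather than an `AttributeError`.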
    async def screenshot(self, path: str | None = None) -> dict[str, Any]:
        page = await self._require_page()
        output = pathlib.Path(path) if path else (pathlib.Path(__file__).resolve().parent / "logs" / "playwright-console.png")

This line is vulnerable to a Path Traversal attack. If a user provides an absolute path or a path containing .. via the API, they can write a screenshot to arbitrary locations on the filesystem. You should validate that the provided path is relative and does not escape the intended logs directory.
Suggested change:

    - output = pathlib.Path(path) if path else (pathlib.Path(__file__).resolve().parent / "logs" / "playwright-console.png")
    + if path:
    +     requested_path = pathlib.Path(path)
    +     # Ensure the filename is safe and restricted to the logs directory
    +     output = pathlib.Path(__file__).resolve().parent / "logs" / requested_path.name
    + else:
    +     output = pathlib.Path(__file__).resolve().parent / "logs" / "playwright-console.png"

    if self._page.url != self._ui_config["host_url"]:
        await self._page.goto(self._ui_config["host_url"], wait_until="domcontentloaded")

    script = CONTROL_PANEL_SCRIPT.read_text(encoding="utf-8")

Reading a file synchronously using pathlib.Path.read_text inside an async method blocks the event loop. This can degrade performance, especially if the file is large or the disk is slow. It is better to read this file once during initialization or use an asynchronous file library like aiofiles.
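One way to combine both ideas is to read the file at most once and push that single read off the event loop. The sketch swaps `aiofiles` for the standard library's `asyncio.to_thread` (Python 3.9+); `CachedScript` and `demo` are hypothetical names used only for illustration.

```python
import asyncio
import pathlib
import tempfile


class CachedScript:
    """Loads a script file lazily, once, without blocking the event loop."""

    def __init__(self, script_path: pathlib.Path) -> None:
        self._script_path = script_path
        self._script: str | None = None

    async def get(self) -> str:
        if self._script is None:
            # The blocking read runs in a worker thread; later calls hit the cache.
            self._script = await asyncio.to_thread(
                self._script_path.read_text, encoding="utf-8"
            )
        return self._script


async def demo() -> tuple[str, str]:
    with tempfile.TemporaryDirectory() as tmp:
        path = pathlib.Path(tmp) / "panel.js"
        path.write_text("console.log('v1');", encoding="utf-8")
        cache = CachedScript(path)
        first = await cache.get()
        path.write_text("console.log('v2');", encoding="utf-8")
        second = await cache.get()  # still the cached copy
        return first, second
```

The trade-off is that edits to the script on disk are not picked up until the process restarts, which is usually acceptable for a bundled asset.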
    # Suggestion: Read this once in __init__ and store it in self._control_panel_script
    script = self._control_panel_script

    async with self._session.get(
        f"{RUNTIME_BASE}/v1/runtime/info",
        timeout=aiohttp.ClientTimeout(total=5),
    ) as runtime_resp:
        runtime = await runtime_resp.json()

    async with self._session.get(
        f"{RUNTIME_BASE}/v1/workspace/status",
        timeout=aiohttp.ClientTimeout(total=5),
    ) as workspace_resp:
        workspace = await workspace_resp.json()

These two HTTP requests are performed sequentially. Since they are independent, they can be executed in parallel using asyncio.gather to reduce the total latency of the get_status call.
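When parallelizing, it may also be worth deciding what happens if only one of the two endpoints fails. The runnable sketch below uses stand-in coroutines instead of real HTTP calls and passes `return_exceptions=True` so that a failure comes back as a value. The `fetch_json` signature and the error shape are assumptions made for this illustration, not the PR's behavior.

```python
import asyncio
from typing import Any


async def fetch_json(name: str, fail: bool = False) -> dict[str, Any]:
    # Stand-in for an HTTP GET; a real version would use the shared client session.
    await asyncio.sleep(0.01)
    if fail:
        raise ConnectionError(f"{name} endpoint unreachable")
    return {"source": name}


async def get_status(workspace_down: bool = False) -> dict[str, Any]:
    # Both lookups run concurrently. return_exceptions=True hands failures back
    # as result values instead of raising the first one.
    results = await asyncio.gather(
        fetch_json("runtime"),
        fetch_json("workspace", fail=workspace_down),
        return_exceptions=True,
    )
    status: dict[str, Any] = {}
    for key, result in zip(("runtime", "workspace"), results):
        if isinstance(result, Exception):
            status[key] = {"error": str(result)}
        else:
            status[key] = result
    return status
```

Without `return_exceptions=True`, `asyncio.gather` propagates the first exception to the caller, which would turn a partial outage into a failed status call.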
    async def fetch_json(url):
        async with self._session.get(url, timeout=aiohttp.ClientTimeout(total=5)) as resp:
            return await resp.json()

    runtime, workspace = await asyncio.gather(
        fetch_json(f"{RUNTIME_BASE}/v1/runtime/info"),
        fetch_json(f"{RUNTIME_BASE}/v1/workspace/status"),
    )

    async def chat_stream(self, request: web.Request):
        payload = await request.json()
        message = (payload.get("message") or "").strip()
        if not message:
            return web.Response(status=400, text="message is required")

        content = await self._pw.send_message(message)
        return web.Response(text=content, content_type="text/plain")

    try:
        return await request.json()
    except Exception:
        return {}

Swallowing all exceptions during JSON parsing and returning an empty dictionary can mask client errors and lead to unexpected behavior in handlers (e.g., missing required fields). It is better to catch json.JSONDecodeError specifically and return a 400 Bad Request response to the client.
Suggested change:

    - try:
    -     return await request.json()
    - except Exception:
    -     return {}
    + try:
    +     return await request.json()
    + except json.JSONDecodeError:
    +     logger.warning("Malformed JSON received in request body")
    +     raise web.HTTPBadRequest(text="Invalid JSON body")
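
The same idea can be shown in a framework-free form. In this standard-library sketch, `BadRequest` and `parse_json_body` are hypothetical stand-ins for the aiohttp exception and the handler helper. Beyond the suggestion above, it also rejects bodies that are valid JSON but not an object, since handlers call `.get()` on the payload.

```python
import json
from typing import Any


class BadRequest(Exception):
    """Hypothetical stand-in for an HTTP 400 response."""

    status = 400


def parse_json_body(raw: bytes) -> dict[str, Any]:
    try:
        payload = json.loads(raw)
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        # Malformed JSON or undecodable bytes are client errors, not empty bodies.
        raise BadRequest("Invalid JSON body") from exc
    if not isinstance(payload, dict):
        raise BadRequest("JSON body must be an object")
    return payload
```

Catching only the decode errors leaves unrelated failures visible, so they surface as server errors instead of being reported as bad input.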
This PR was opened before the v0.8.41 rebrand and is now stale. Feel free to rebase onto current
Summary
This PR introduces a minimal, serve-centered integration loop that unifies:
- DeepSeek runtime API (`deepseek serve --http`, port 7878)
- Python extension API (port 3000)
- Playwright-injected control console
- Tauri IPC bridge
What’s Included
- Added local extension API endpoints for status, chat, process, and browser operations.
- Added an injected UI for:
  - runtime/process status
  - page summary
  - DOM inspection
  - element actions (`highlight`/`click`/`fill`/`focus`)
  - MCP browser task entry
- Extended the Tauri IPC commands to proxy the same extension API surface:
  - capabilities, browser state/summary
  - process start/stop/restart
  - navigate/reload UI/evaluate/screenshot
  - DOM inspect/element action/MCP task
- Added a duplicate-launch guard for app-server port conflicts:
  - Detects an existing listener and skips the relaunch, avoiding blind retry loops (`os error 10048`).
  - Improved attached-process lifecycle behavior for the stop/restart/shutdown paths.
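
The duplicate-launch guard can be pictured as a quick probe of the target port before spawning the process. The Python sketch below only illustrates that idea and is not the PR's implementation: `port_in_use` and `should_launch` are hypothetical names, and probing with a TCP connect is one of several possible detection strategies.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something already accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return probe.connect_ex((host, port)) == 0


def should_launch(port: int) -> bool:
    # Skip the launch when a listener exists, instead of retrying a bind
    # that keeps failing with an "address already in use" error.
    return not port_in_use(port)
```

A caller would check `should_launch(port)` once and attach to the existing instance when it returns `False`.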
Why
The current integration needs a stable local API contract for UI, IPC, and browser automation features.
This PR aligns all upper layers on one local API surface instead of scattered direct calls, which avoids the semantic drift that multiple direct entry points cause.
Validation
- Static checks passed for the modified Python/Rust/JS files.
- Local health/status endpoints returned the expected responses.
- Capability endpoints exposed the new browser-native fields.
- Process control (`start`/`stop`/`restart`) was verified through the extension API.
- A port-occupied scenario on port 8787 was observed and handled by the duplicate-launch guard.
Notes
This is a minimal closed-loop implementation focused on architecture convergence.
Future improvements can include:
- clearer 7878 vs 8787 control semantics in the UI
- contract tests for the extension API
- richer Tauri-native workbench UX