feat: serve-centered local API bridge for Playwright/Tauri integration #1692

Closed
logoods wants to merge 1 commit into Hmbown:main from logoods:feat/serve-centered-ai-native-bridge

logoods commented May 15, 2026

Summary
This PR introduces a minimal, serve-centered integration loop that unifies:

DeepSeek runtime API (deepseek serve --http, port 7878)
Python extension API (port 3000)
Playwright-injected control console
Tauri IPC bridge
What's Included

  1. Serve-centered extension layer
    Added local extension API endpoints for status, chat, process, and browser operations.
  2. Playwright control console
    Added an injected UI for:
      runtime/process status
      page summary
      DOM inspection
      element actions (highlight/click/fill/focus)
      MCP browser task entry
  3. Tauri IPC expansion
    Extended the IPC commands to proxy the same extension API surface:
      capabilities, browser state/summary
      process start/stop/restart
      navigate/reload UI/evaluate/screenshot
      DOM inspect/element action/MCP task
  4. Process manager hardening
    Added a duplicate-launch guard for app-server port conflicts.
    The guard detects an existing listener and skips the relaunch, which avoids blind retry loops (os error 10048, the Windows "address already in use" socket error).
    Improved attached-process lifecycle behavior on the stop/restart/shutdown paths.
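The duplicate-launch guard in item 4 can be illustrated with a short sketch. This is not the PR's implementation: the function names `is_port_in_use` and `should_launch`, and the choice to probe with a short TCP connect, are assumptions made for illustration.

```python
import socket


def is_port_in_use(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return probe.connect_ex((host, port)) == 0


def should_launch(host: str, port: int) -> bool:
    """Hypothetical guard: only spawn the app-server when the port is free."""
    return not is_port_in_use(host, port)
```

A process manager could call `should_launch` before spawning and attach to the existing listener when it returns False, rather than retrying the launch.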
Why
The current integration requires a stable local API contract for UI, IPC, and browser automation features.
This PR aligns all upper layers to one local API surface instead of scattered direct calls, which avoids the semantic drift that multiple direct entry points cause.

Validation
Static checks passed for the modified Python/Rust/JS files.

Local health/status endpoints returned the expected responses.

Capability endpoints exposed the new browser-native fields.

Process control (start/stop/restart) was verified through the extension API.

The port-occupied scenario (port 8787) was observed and handled by the duplicate-launch guard.

Notes
This is a minimal closed-loop implementation focused on architecture convergence.

Future improvements can include:

clearer 7878 vs 8787 control semantics in the UI
contract tests for the extension API
richer Tauri-native workbench UX

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request establishes the core infrastructure for the DeepSeek IDE extension layer, featuring a Python-based API server, a process manager for the TUI, and a Playwright wrapper for browser automation. Feedback highlights a critical race condition during the startup sequence and a path traversal vulnerability in the screenshot endpoint. Other recommendations focus on performance optimizations—such as parallelizing HTTP requests and avoiding synchronous file I/O—as well as improving the chat streaming implementation and error handling for malformed JSON requests.

Comment thread core/main.py
Comment on lines +42 to +45
    await server.start()

    # Start Playwright layer (waits for TUI to be ready)
    await pw.start()

high

There is a potential race condition here. The ExtensionServer is started at line 42, but the PlaywrightWrapper (which it depends on for almost all API calls) is not initialized until pw.start() completes at line 45. If an API request (e.g., /v1/status) arrives in this interval, it will likely result in an AttributeError because pw._session or other members are still None. Consider starting the server only after its dependencies are fully ready, or implement readiness checks in the API handlers.
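The readiness-check option mentioned above could look roughly like the following. It is a framework-free sketch: `ReadinessGate`, `handle_status`, and the 503 response body are hypothetical names and choices, not code from this PR.

```python
import asyncio


class ReadinessGate:
    """Tracks whether a dependency has finished starting."""

    def __init__(self) -> None:
        self._ready = asyncio.Event()

    def mark_ready(self) -> None:
        self._ready.set()

    def is_ready(self) -> bool:
        return self._ready.is_set()


async def handle_status(gate: ReadinessGate) -> tuple[int, dict]:
    """Hypothetical handler: answer 503 until the dependency is ready."""
    if not gate.is_ready():
        return 503, {"error": "starting"}
    return 200, {"status": "ok"}


async def demo() -> list[int]:
    gate = ReadinessGate()
    before, _ = await handle_status(gate)  # request arrives during startup
    gate.mark_ready()                      # e.g. called once pw.start() returns
    after, _ = await handle_status(gate)
    return [before, after]
```

Returning 503 during startup lets clients retry, instead of hitting an AttributeError on members that are still None.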


    async def screenshot(self, path: str | None = None) -> dict[str, Any]:
        page = await self._require_page()
        output = pathlib.Path(path) if path else (pathlib.Path(__file__).resolve().parent / "logs" / "playwright-console.png")

high

This line is vulnerable to a Path Traversal attack. If a user provides an absolute path or a path containing .. via the API, they can write a screenshot to arbitrary locations on the filesystem. You should validate that the provided path is relative and does not escape the intended logs directory.

Suggested change
-        output = pathlib.Path(path) if path else (pathlib.Path(__file__).resolve().parent / "logs" / "playwright-console.png")
+        if path:
+            requested_path = pathlib.Path(path)
+            # Ensure the filename is safe and restricted to the logs directory
+            output = pathlib.Path(__file__).resolve().parent / "logs" / requested_path.name
+        else:
+            output = pathlib.Path(__file__).resolve().parent / "logs" / "playwright-console.png"

        if self._page.url != self._ui_config["host_url"]:
            await self._page.goto(self._ui_config["host_url"], wait_until="domcontentloaded")

        script = CONTROL_PANEL_SCRIPT.read_text(encoding="utf-8")

medium

Reading a file synchronously using pathlib.Path.read_text inside an async method blocks the event loop. This can degrade performance, especially if the file is large or the disk is slow. It is better to read this file once during initialization or use an asynchronous file library like aiofiles.

        # Suggestion: Read this once in __init__ and store it in self._control_panel_script
        script = self._control_panel_script
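If the script has to be read lazily rather than in `__init__`, one standard-library alternative to aiofiles is to do the read in a worker thread and cache the result. This is a sketch under assumptions: `ScriptCache` is a hypothetical helper, and `asyncio.to_thread` requires Python 3.9 or newer.

```python
import asyncio
import pathlib


class ScriptCache:
    """Reads a text file once, off the event loop, and reuses the result."""

    def __init__(self, path: pathlib.Path) -> None:
        self._path = path
        self._text: str | None = None

    async def get(self) -> str:
        if self._text is None:
            # read_text blocks, so run it in a worker thread.
            self._text = await asyncio.to_thread(
                self._path.read_text, encoding="utf-8"
            )
        return self._text
```

The cached text is never refreshed in this sketch, which matches the "read once" suggestion but means edits to the file are not picked up until restart.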

Comment on lines +391 to +401
        async with self._session.get(
            f"{RUNTIME_BASE}/v1/runtime/info",
            timeout=aiohttp.ClientTimeout(total=5),
        ) as runtime_resp:
            runtime = await runtime_resp.json()

        async with self._session.get(
            f"{RUNTIME_BASE}/v1/workspace/status",
            timeout=aiohttp.ClientTimeout(total=5),
        ) as workspace_resp:
            workspace = await workspace_resp.json()

medium

These two HTTP requests are performed sequentially. Since they are independent, they can be executed in parallel using asyncio.gather to reduce the total latency of the get_status call.

        async def fetch_json(url):
            async with self._session.get(url, timeout=aiohttp.ClientTimeout(total=5)) as resp:
                return await resp.json()

        runtime, workspace = await asyncio.gather(
            fetch_json(f"{RUNTIME_BASE}/v1/runtime/info"),
            fetch_json(f"{RUNTIME_BASE}/v1/workspace/status")
        )

Comment thread core/api/server.py
Comment on lines +80 to +87
    async def chat_stream(self, request: web.Request):
        payload = await request.json()
        message = (payload.get("message") or "").strip()
        if not message:
            return web.Response(status=400, text="message is required")

        content = await self._pw.send_message(message)
        return web.Response(text=content, content_type="text/plain")

medium

The /v1/chat/stream endpoint is currently a "fake" stream. It awaits the full response from the Playwright wrapper (line 86) before returning it as a single block of text. For a true streaming experience, this should use web.StreamResponse and yield chunks as they are received from the runtime API.
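The difference between buffering and streaming can be sketched without any web framework. In the example below, `runtime_chunks` is a stand-in for the runtime API and `stream_reply` is a hypothetical helper. In an aiohttp handler, the `write` argument would plausibly be the `write` method of a `web.StreamResponse` that has already been prepared.

```python
import asyncio
from collections.abc import AsyncIterator, Awaitable, Callable


async def runtime_chunks(message: str) -> AsyncIterator[str]:
    """Stand-in for the runtime API: yields the reply piece by piece."""
    for word in message.split():
        await asyncio.sleep(0)  # simulate waiting on the upstream
        yield word + " "


async def stream_reply(
    message: str, write: Callable[[bytes], Awaitable[None]]
) -> int:
    """Forward each chunk as soon as it arrives; return the chunk count."""
    count = 0
    async for chunk in runtime_chunks(message):
        await write(chunk.encode("utf-8"))
        count += 1
    return count
```

The key point is that `write` is awaited once per chunk inside the loop, so the client sees output before the upstream reply is complete.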

Comment thread core/api/server.py
Comment on lines +161 to +164
        try:
            return await request.json()
        except Exception:
            return {}

medium

Swallowing all exceptions during JSON parsing and returning an empty dictionary can mask client errors and lead to unexpected behavior in handlers (e.g., missing required fields). It is better to catch json.JSONDecodeError specifically and return a 400 Bad Request response to the client.

Suggested change
-        try:
-            return await request.json()
-        except Exception:
-            return {}
+        try:
+            return await request.json()
+        except json.JSONDecodeError:  # assumes the module imports json
+            logger.warning("Malformed JSON received in request body")
+            raise web.HTTPBadRequest(text="Invalid JSON body")

Hmbown (Owner) commented May 23, 2026

This PR was opened before the v0.8.41 rebrand and is now stale. Feel free to rebase onto current main and reopen. The whale brothers are waiting for you 🐋

Hmbown closed this May 23, 2026