Merged
2 changes: 1 addition & 1 deletion Dockerfile
@@ -17,7 +17,7 @@ RUN pip install --no-cache-dir -r requirements.txt
RUN pip install --no-cache-dir uv
ENV PATH="/root/.local/bin:$PATH"

-COPY server.py config.py process_runner.py ./
+COPY server.py config.py process_runner.py builtin_tools.py ./
COPY frontend/ ./frontend/
COPY handlers/ ./handlers/

71 changes: 70 additions & 1 deletion README.md
@@ -18,6 +18,11 @@ Each tool **provider** is a single YAML file under `tools/`. The YAML contains:
runs `setup_commands`, then registers each tool automatically — no Python files to
maintain separately, no changes to `server.py` needed when adding new tools.

Two **built-in tools** (`mcpproxy-listfiles` and `mcpproxy-getfile`) are always registered
without any YAML config. They give LLMs read-only access to a configurable directory
(default: `.playwright-mcp`) — useful for retrieving screenshots and snapshots produced
by package providers such as Playwright MCP.

## Ports

| Port | Service |
@@ -38,6 +43,7 @@ maintain separately, no changes to `server.py` needed when adding new tools.
├── server.py
├── config.py ← shared env-var config (imported by all modules)
├── process_runner.py ← spawns & proxies any stdio MCP subprocess
├── builtin_tools.py ← built-in mcpproxy-listfiles / mcpproxy-getfile tools
├── frontend/
│ └── app.py ← FastAPI UI server (port 8889)
├── .env.example
@@ -531,7 +537,8 @@ pip install -r requirements.txt -r requirements-dev.txt
pytest tests/ -v
```

-Tests cover `server.py` (pure helpers) and `frontend/app.py` (all API endpoints).
+Tests cover `server.py` (pure helpers), `frontend/app.py` (all API endpoints), and
+`builtin_tools.py` (file listing and retrieval).
CI runs on every push via `.github/workflows/tests.yml`.

---
@@ -748,6 +755,68 @@ Write state to a well-known file path and read it on the next call.

---

### Part 9 — reading files produced by package providers

Package providers (e.g. Playwright MCP) often write files to disk — screenshots (PNG),
accessibility snapshots (JSON), downloaded pages (HTML) — that the LLM would otherwise
have no way to retrieve.

mcpproxy ships two **built-in utility tools** that are always registered, with no YAML
config file required:

| Tool | Description |
|---|---|
| `mcpproxy-listfiles` | List files and subdirectories inside the files base directory |
| `mcpproxy-getfile` | Read a file from the files base directory (UTF-8 text or base64) |

**Default base directory:** `.playwright-mcp` relative to the server's working directory
(i.e. `/app/.playwright-mcp` inside Docker). Override with the `MCPPROXY_FILES_DIR`
environment variable.

Only files **inside** the base directory are accessible — path-traversal attempts
(`../`) are rejected.
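
The containment rule above can be illustrated with a minimal sketch. The helper name `is_inside` is hypothetical and used only for this example; it assumes the check is done by resolving the candidate path and testing it against the resolved base directory with `Path.relative_to`.

```python
import tempfile
from pathlib import Path


def is_inside(base: Path, relative: str) -> bool:
    """Return True if *relative* resolves to a location under *base*.

    Hypothetical helper for illustration only.
    """
    base = base.resolve()
    target = (base / relative).resolve()
    try:
        # relative_to() raises ValueError when target is not under base.
        target.relative_to(base)
    except ValueError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    print(is_inside(base, "screenshot.png"))      # plain file name
    print(is_inside(base, "sub/../page.json"))    # ".." that stays inside the base
    print(is_inside(base, "../secret.txt"))       # escapes the base
    print(is_inside(base, "/etc/passwd"))         # absolute path outside the base
```

Because both paths are resolved before the comparison, a `..` segment that stays within the base directory is still accepted, while anything that ends up outside it is not.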

#### Example workflow with Playwright

1. Ask the LLM to navigate to a page and take a screenshot via the Playwright MCP provider.
2. Playwright writes `screenshot.png` to `.playwright-mcp/`.
3. Ask the LLM to call `mcpproxy-listfiles` — it returns the file list.
4. Ask the LLM to call `mcpproxy-getfile` with `path="screenshot.png"` — it returns the
PNG as a base64 string that the LLM can describe or pass to a vision model.
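
On the client side, the last step of this workflow might look like the sketch below. It assumes the tool result is available as a Python dict with the `ok`, `content`, and `encoding` fields described in this section; how a particular MCP client surfaces that dict is not covered here, and `save_getfile_result` is a hypothetical helper name.

```python
import base64
import tempfile
from pathlib import Path


def save_getfile_result(result: dict, dest: Path) -> int:
    """Write the content of a getfile-style result to *dest*.

    Returns the number of bytes written. Hypothetical helper for illustration.
    """
    if not result.get("ok"):
        raise RuntimeError(result.get("error", "unknown error"))
    if result["encoding"] == "base64":
        data = base64.b64decode(result["content"])
    else:
        # "text" results carry the decoded UTF-8 string.
        data = result["content"].encode("utf-8")
    dest.write_bytes(data)
    return len(data)


# Hand-built result in the documented shape (not captured server output).
png_signature = b"\x89PNG\r\n\x1a\n"
fake_result = {
    "ok": True,
    "path": "screenshot.png",
    "size": len(png_signature),
    "content": base64.b64encode(png_signature).decode("ascii"),
    "encoding": "base64",
}

with tempfile.TemporaryDirectory() as tmp:
    dest = Path(tmp) / "screenshot.png"
    written = save_getfile_result(fake_result, dest)
    print(written, dest.read_bytes() == png_signature)
```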

#### `mcpproxy-listfiles` parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `path` | string | No | `""` | Subdirectory to list, relative to the base dir. Omit to list the root. |

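Returns an object with `ok`, `base_dir`, `path`, and `entries` (a list of `{name, type, size}`, where `type` is `"file"` or `"directory"` and `size` is in bytes, `null` for directories). A path that does not exist yet yields an empty `entries` list, not an error.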
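
As a rough illustration of consuming this result, the sketch below picks out file paths that could then be passed to `mcpproxy-getfile`. The sample listing is hand-written to match the documented shape, and `files_with_suffix` is a hypothetical helper, not part of mcpproxy.

```python
def files_with_suffix(listing: dict, suffix: str) -> list[str]:
    """Return base-dir-relative paths of files in *listing* ending in *suffix*."""
    if not listing.get("ok"):
        return []
    prefix = listing.get("path", "").strip("/")
    paths = []
    for entry in listing.get("entries", []):
        if entry["type"] == "file" and entry["name"].lower().endswith(suffix):
            # Prepend the listed subdirectory so the path is relative to the base dir.
            paths.append(f"{prefix}/{entry['name']}" if prefix else entry["name"])
    return paths


# Hand-written sample in the documented shape (not captured server output).
sample = {
    "ok": True,
    "base_dir": "/app/.playwright-mcp",
    "path": "",
    "entries": [
        {"name": "page.json", "type": "file", "size": 2048},
        {"name": "screenshot.png", "type": "file", "size": 48213},
        {"name": "traces", "type": "directory", "size": None},
    ],
}

print(files_with_suffix(sample, ".png"))
```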

#### `mcpproxy-getfile` parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `path` | string | **Yes** | — | File path, relative to the base dir. |
| `encoding` | string | No | `"auto"` | `"auto"` tries UTF-8 and falls back to base64 for binary files. `"text"` forces UTF-8 and returns an error if the file is not valid UTF-8. `"base64"` always returns base64. |

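Returns an object with `ok`, `path`, `size`, `content`, and `encoding` (`"text"` or `"base64"`, whichever was actually used). On failure the object contains `ok: false` and an `error` message instead.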

#### Changing the base directory

```bash
# In docker-compose.override.yml or as -e flag
MCPPROXY_FILES_DIR=/app/data
```

Or mount a volume at the target path so files persist across container restarts:

```yaml
volumes:
  - ./playwright-output:/app/.playwright-mcp
```

---

### YAML provider reference

```yaml
151 changes: 151 additions & 0 deletions builtin_tools.py
@@ -0,0 +1,151 @@
"""
Built-in mcpproxy utility tools — registered automatically at startup,
no YAML config file required.

mcpproxy-listfiles List files/directories inside the mcpproxy files dir.
mcpproxy-getfile Read a file from the mcpproxy files dir (text or base64).

The *base directory* defaults to ``.playwright-mcp`` (relative to the server's
working directory) and can be overridden at runtime with the
``MCPPROXY_FILES_DIR`` environment variable. Only files **inside** the base
directory are accessible — path-traversal attempts are rejected.
"""

import base64
import os
from pathlib import Path
from typing import Any


def _base_dir() -> Path:
    """Return the resolved base directory for built-in file access.

    Evaluated on each call so that tests can override MCPPROXY_FILES_DIR
    with monkeypatch without restarting the process.
    """
    raw = os.environ.get("MCPPROXY_FILES_DIR", ".playwright-mcp")
    return Path(raw).resolve()


def _safe_resolve(relative: str | None) -> Path:
    """Resolve *relative* under the base dir; raise ValueError on traversal."""
    base = _base_dir()
    target = (base / (relative or "")).resolve()
    # relative_to() raises ValueError if target is not under base
    try:
        target.relative_to(base)
    except ValueError:
        # Suppress the chained pathlib error; the message below is self-contained.
        raise ValueError(
            f"Path '{relative}' is outside the allowed directory '{base}'"
        ) from None
    return target


# ---------------------------------------------------------------------------
# Tool handlers
# ---------------------------------------------------------------------------

async def list_files(
    context: dict[str, Any],
    path: str | None = None,
) -> dict[str, Any]:
    """List files and subdirectories at *path* inside the files base directory.

    Returns a JSON object with an ``entries`` list; each entry has ``name``,
    ``type`` (``"file"`` or ``"directory"``), and ``size`` (bytes, files only).
    If the directory does not exist yet the entries list is empty (not an error).
    """
    try:
        target = _safe_resolve(path)
        base = _base_dir()
        if not target.exists():
            return {
                "ok": True,
                "base_dir": str(base),
                "path": path or "",
                "entries": [],
            }
        if not target.is_dir():
            return {"ok": False, "error": f"'{path}' is not a directory"}
        entries: list[dict[str, Any]] = []
        for entry in sorted(target.iterdir()):
            entries.append(
                {
                    "name": entry.name,
                    "type": "directory" if entry.is_dir() else "file",
                    "size": entry.stat().st_size if entry.is_file() else None,
                }
            )
        return {
            "ok": True,
            "base_dir": str(base),
            "path": path or "",
            "entries": entries,
        }
    except Exception as exc:
        return {"ok": False, "error": str(exc)}


async def get_file(
    context: dict[str, Any],
    path: str,
    encoding: str = "auto",
) -> dict[str, Any]:
    """Read a file from the files base directory.

    *encoding* controls how the content is returned:
      ``"auto"`` (default): try UTF-8; fall back to base64 for binary files.
      ``"text"``: decode as UTF-8; error if the file is binary.
      ``"base64"``: always return base64-encoded bytes (safe for images etc.).

    Returns a JSON object with ``content`` (string), ``encoding`` used, and
    ``size`` (bytes).
    """
    try:
        target = _safe_resolve(path)
        if not target.exists():
            return {"ok": False, "error": f"File not found: {path}"}
        if not target.is_file():
            return {"ok": False, "error": f"Not a file: {path}"}

        raw = target.read_bytes()
        size = len(raw)

        if encoding == "base64":
            return {
                "ok": True,
                "path": path,
                "size": size,
                "content": base64.b64encode(raw).decode(),
                "encoding": "base64",
            }

        # encoding == "text" or "auto"
        try:
            text = raw.decode("utf-8")
            return {
                "ok": True,
                "path": path,
                "size": size,
                "content": text,
                "encoding": "text",
            }
        except UnicodeDecodeError:
            if encoding == "text":
                return {
                    "ok": False,
                    "error": (
                        f"File '{path}' is not valid UTF-8 text. "
                        "Try encoding='base64'."
                    ),
                }
            # "auto" fallback → base64
            return {
                "ok": True,
                "path": path,
                "size": size,
                "content": base64.b64encode(raw).decode(),
                "encoding": "base64",
            }
    except Exception as exc:
        return {"ok": False, "error": str(exc)}
6 changes: 6 additions & 0 deletions config.py
@@ -6,6 +6,12 @@
ENV_FILE = Path(os.environ.get("MCP_ENV_FILE", ".env"))
SERVER_NAME = os.environ.get("MCP_SERVER_NAME", "local-config-driven-mcp")

# Base directory exposed by the built-in mcpproxy-listfiles / mcpproxy-getfile tools.
# Defaults to .playwright-mcp (relative to the server's working directory) so that
# screenshots and snapshots produced by the Playwright MCP package provider are
# immediately accessible. Override with MCPPROXY_FILES_DIR.
FILES_DIR = Path(os.environ.get("MCPPROXY_FILES_DIR", ".playwright-mcp"))

UI_HOST = os.environ.get("MCP_UI_HOST", "0.0.0.0")
UI_PORT = int(os.environ.get("MCP_UI_PORT", "8889"))

87 changes: 87 additions & 0 deletions server.py
@@ -317,10 +317,97 @@ def run_provider_setup(spec: dict[str, Any]) -> None:
raise


# ---------------------------------------------------------------------------
# Built-in tools (always available, no YAML config required)
# ---------------------------------------------------------------------------

def register_builtin_tools() -> None:
    """Register the mcpproxy-listfiles and mcpproxy-getfile utility tools.

    These tools expose read-only access to the files directory (default:
    ``.playwright-mcp``, override with ``MCPPROXY_FILES_DIR``). They are
    always registered regardless of what YAML providers are loaded, giving
    LLMs a way to retrieve screenshots, JSON snapshots, and other files
    produced by package providers such as the Playwright MCP server.
    """
    try:
        from builtin_tools import get_file, list_files

        register_tool(
            {
                "name": "mcpproxy-listfiles",
                "description": (
                    "List files and directories inside the mcpproxy files directory "
                    "(default: .playwright-mcp, override with MCPPROXY_FILES_DIR). "
                    "Use this to discover screenshots, JSON snapshots, and other files "
                    "produced by package providers such as the Playwright MCP server. "
                    "Pass a subdirectory path to drill down."
                ),
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "path": {
                            "type": "string",
                            "description": (
                                "Subdirectory to list, relative to the base files directory. "
                                "Omit or pass an empty string to list the root."
                            ),
                            "default": "",
                        }
                    },
                    "required": [],
                },
            },
            list_files,
        )

        register_tool(
            {
                "name": "mcpproxy-getfile",
                "description": (
                    "Read the contents of a file from the mcpproxy files directory "
                    "(default: .playwright-mcp). "
                    "Returns UTF-8 text for text files (JSON, HTML, Markdown, …) or "
                    "base64-encoded bytes for binary files (PNG screenshots, …). "
                    "Use mcpproxy-listfiles first to discover available file paths."
                ),
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "path": {
                            "type": "string",
                            "description": "Path to the file, relative to the base files directory.",
                        },
                        "encoding": {
                            "type": "string",
                            "description": (
                                "How to encode the returned content. "
                                "'auto' (default) tries UTF-8 and falls back to base64. "
                                "'text' forces UTF-8 (error on binary). "
                                "'base64' always returns base64 (safe for images)."
                            ),
                            "default": "auto",
                        },
                    },
                    "required": ["path"],
                },
            },
            get_file,
        )

        print("Registered built-in tools: mcpproxy-listfiles, mcpproxy-getfile")
    except Exception as exc:
        print(f"register_builtin_tools error: {exc}")
        traceback.print_exc()
        raise


# ---------------------------------------------------------------------------
# Load all providers at import time
# ---------------------------------------------------------------------------

register_builtin_tools()

for provider_spec in load_provider_specs(CONFIG_DIR):
    register_provider(provider_spec)
    run_provider_setup(provider_spec)