MarketCanvas-Env is a computer-use environment for canvas interaction, built on a minimal 2D canvas simulation. This repo implements two distinct environments:
- **high-level**: semantic tool-calling actions (`add_element`, `move_element`, ...) with JSON state observation
- **low-level**: raw computer-use actions (`mouse_click`, `mouse_drag`, ...) with pixel (base64 PNG) observation
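To make the contrast concrete, here are two hypothetical action payloads. The field names and shapes below are illustrative only; the actual schemas are defined in each env's action space module and may differ.

```python
# Hypothetical payloads illustrating the two action spaces (names are
# assumptions, not the repo's actual schemas).

# high-level: one semantic tool call manipulates an element directly
high_level_action = {
    "tool": "add_element",
    "args": {"type": "rect", "x": 40, "y": 40, "width": 120, "height": 60},
}

# low-level: the same intent decomposed into raw computer-use actions
low_level_actions = [
    {"tool": "mouse_click", "args": {"x": 40, "y": 40}},
    {"tool": "mouse_drag", "args": {"from": [40, 40], "to": [160, 100]}},
]
```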
> [!NOTE]
> The env tooling (reward, tasks, scoring) is kept flat (no base classes or reusable abstractions) for the sake of simplicity.
```bash
uv sync
uv run python demo.py            # defaults to high_level
uv run python demo.py low_level  # computer-use mode
```

Runs a scripted sequence of actions. The final canvas is saved to `output.png`. Useful for quick sanity checks without spinning up the full agent loop.
```bash
export ANTHROPIC_API_KEY=your-key-here
uv run python run.py high_level
```

This starts the MCP server and the agent loop. Use `low_level` for computer-use mode. The final canvas state is saved to `output.png` after every episode.
Options:
- `--model`: LLM model to use (default `anthropic/claude-haiku-4-5-20251001`)
- `--max-steps`: max agent turns (default 20)
- `--mcp-port`: MCP server port (default 8000)
- `--context-window`: max messages to keep, 0 = unlimited (default 0)
> [!WARNING]
> For `low_level`, use `--context-window` to limit context size: base64 PNG observations consume tokens quickly.
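The trimming this option controls can be sketched as follows. This is a minimal, hypothetical version (the real logic lives in `orchestrator.py` and may handle the system prompt differently):

```python
def trim_context(messages: list[dict], context_window: int) -> list[dict]:
    """Keep the system message plus the most recent `context_window` messages.

    Hypothetical sketch: 0 means unlimited, matching the --context-window
    default described above.
    """
    if context_window <= 0:
        return messages
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-context_window:]
```

Pinning the system message while dropping the oldest observation-heavy turns keeps the prompt intact while bounding token usage.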
```bash
uv run python run.py low_level --context-window 6
```

```
run.py                      # CLI entry (argparse: --model, --max-steps, --context-window, --mcp-port)
 -> mcp_server/server.py    # Env-agnostic MCP server factory (FastMCP, SSE transport)
 -> envs/{env}/task.py      # Per-env system prompt + requirements, calls orchestrator

orchestrator.py             # Reusable LLM agent loop (litellm + FastMCP client)
                            # discovers tools via MCP, manages context window, logs with Rich

sim/                        # Pure canvas engine (no RL, no MCP dependency)
  elements.py               # Element dataclass + ElementType enum
  canvas.py                 # CRUD operations on elements
  renderer.py               # PIL renderer (OS-aware font paths)

envs/
  reward_fn.py              # Shared reward function (both envs use same scoring)
  high_level/               # Semantic action space, JSON observation
  low_level/                # Computer-use action space, base64 PNG observation
```
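The agent loop the orchestrator runs can be sketched roughly as below. Tool discovery and the LLM call are stubbed out as injected callables, and every name here is illustrative rather than the actual `orchestrator.py` API:

```python
def run_episode(discover_tools, call_llm, call_tool, system_prompt, max_steps=20):
    """Illustrative agent loop (not the real orchestrator API): discover tools,
    then alternate LLM turns and tool executions until the model stops calling
    tools or max_steps is reached."""
    tools = discover_tools()              # real code: FastMCP client over MCP
    messages = [{"role": "system", "content": system_prompt}]
    for _ in range(max_steps):
        reply = call_llm(messages, tools)  # real code: litellm completion
        messages.append({"role": "assistant", "content": reply})
        if "tool" not in reply:
            break                          # model finished without a tool call
        result = call_tool(reply["tool"], reply.get("args", {}))
        messages.append({"role": "tool", "content": result})
    return messages
```

Because the tool list is fetched at runtime rather than hard-coded, the same loop serves both the high-level and low-level environments unchanged.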
The orchestrator is fully decoupled from any specific environment: it discovers tools from MCP at runtime. Adding a new env means adding the following:
- `env.py`
- `action_space.py`
- `observation_space.py`
- `task.py`
- `reward_fn.py` (if applicable)
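A minimal skeleton for such a new env might look like the following. Since the repo deliberately avoids base classes, each module just exposes plain functions; every name below is a hypothetical sketch, not the actual interface:

```python
# my_env/env.py -- hypothetical skeleton for a new environment. The repo
# avoids base classes, so the env is plain dataclasses and functions.

from dataclasses import dataclass, field

@dataclass
class MyEnvState:
    elements: list = field(default_factory=list)
    steps: int = 0

def reset() -> MyEnvState:
    """Start a fresh episode with an empty canvas state."""
    return MyEnvState()

def step(state: MyEnvState, action: dict) -> MyEnvState:
    """Apply one action; action_space.py would define the valid shapes."""
    state.elements.append(action)
    state.steps += 1
    return state

def observe(state: MyEnvState) -> dict:
    """observation_space.py would serialize this (JSON or base64 PNG)."""
    return {"elements": state.elements, "steps": state.steps}
```

`task.py` would then wrap these in a system prompt and hand control to the orchestrator, reusing the shared `reward_fn.py` where the scoring applies.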