Pazuzzu/minicanva-task

MarketCanvas-Env

MarketCanvas-Env is a computer-use environment for canvas interaction, built on a minimal 2D canvas simulation. This repo implements two distinct environments:

  • high-level: semantic tool-calling actions (add_element, move_element, ...) with JSON state observation
  • low-level: raw computer-use actions (mouse_click, mouse_drag, ...) with pixel (base64 PNG) observation
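To make the contrast concrete, here is a hypothetical sketch of what an action/observation pair might look like in each mode. The tool names follow the examples above; the payload shapes are illustrative, not the exact schemas (those live in each env's `action_space.py` and `observation_space.py`):

```python
# Hypothetical payloads illustrating the two action/observation styles.
# Field names are assumptions for illustration only.

# high-level: semantic tool call, JSON state observation
high_level_action = {
    "tool": "add_element",
    "args": {"type": "rectangle", "x": 40, "y": 60, "width": 200, "height": 100},
}
high_level_observation = {
    "elements": [
        {"id": 1, "type": "rectangle", "x": 40, "y": 60, "width": 200, "height": 100},
    ]
}

# low-level: raw computer-use action, pixel observation
low_level_action = {"tool": "mouse_drag", "args": {"from": [40, 60], "to": [240, 160]}}
low_level_observation = {"image": "<base64-encoded PNG of the canvas>"}
```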

Note: The env tooling (reward, tasks, scoring) is deliberately kept flat, with no base classes or reusable abstractions, for the sake of simplicity.

Usage

1. Setup

uv sync

2. Demo

uv run python demo.py              # defaults to high_level
uv run python demo.py low_level    # computer-use mode

Runs a scripted sequence of actions. The final canvas is saved to output.png. Useful for quick sanity checks without spinning up the full agent loop.

3. Run

export ANTHROPIC_API_KEY=your-key-here
uv run python run.py high_level

This starts the MCP server and the agent loop. Use low_level for computer-use mode. The final canvas state is saved to output.png after every episode.

Options:

  • --model: LLM model to use (default anthropic/claude-haiku-4-5-20251001)
  • --max-steps: max agent turns (default 20)
  • --mcp-port: MCP server port (default 8000)
  • --context-window: max messages to keep, 0 = unlimited (default 0)

Warning

For low_level, use --context-window to limit context size; base64 PNG observations consume tokens quickly.

uv run python run.py low_level --context-window 6
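A minimal sketch of what a context-window cap like this typically does, assuming the common pattern of preserving the system prompt and keeping only the last N messages (the actual logic lives in orchestrator.py and may differ):

```python
# Sketch of message-window trimming (hypothetical; not the repo's exact code).
def trim_context(messages: list[dict], window: int) -> list[dict]:
    """Keep the system prompt plus the most recent `window` messages."""
    if window <= 0:  # 0 = unlimited, per the --context-window option
        return messages
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-window:]

msgs = [{"role": "system", "content": "You are a canvas agent."}] + [
    {"role": "user", "content": f"obs {i}"} for i in range(10)
]
trimmed = trim_context(msgs, 6)  # system prompt + last 6 messages
```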

Architecture

run.py                    # CLI entry (argparse: --model, --max-steps, --context-window, --mcp-port)
  -> mcp_server/server.py # Env-agnostic MCP server factory (FastMCP, SSE transport)
  -> envs/{env}/task.py   # Per-env system prompt + requirements, calls orchestrator

orchestrator.py           # Reusable LLM agent loop (litellm + FastMCP client)
                          #   discovers tools via MCP, manages context window, logs with Rich

sim/                      # Pure canvas engine (no RL, no MCP dependency)
  elements.py             # Element dataclass + ElementType enum
  canvas.py               # CRUD operations on elements
  renderer.py             # PIL renderer (OS-aware font paths)

envs/
  reward_fn.py            # Shared reward function (both envs use same scoring)
  high_level/             # Semantic action space, JSON observation
  low_level/              # Computer-use action space, base64 PNG observation
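The sim/ layer described above can be pictured as a plain dataclass plus a CRUD container. The sketch below is a hypothetical minimal version modeled on the file descriptions (elements.py, canvas.py); the real engine has more element fields and a PIL renderer on top:

```python
# Hypothetical mini version of the sim/ layer; names modeled on the tree above.
from dataclasses import dataclass
from enum import Enum
from itertools import count

class ElementType(Enum):
    RECTANGLE = "rectangle"
    TEXT = "text"

@dataclass
class Element:
    id: int
    type: ElementType
    x: int
    y: int

class Canvas:
    """CRUD over elements; no RL or MCP dependency, as in sim/canvas.py."""
    def __init__(self):
        self._ids = count(1)
        self.elements: dict[int, Element] = {}

    def add(self, type: ElementType, x: int, y: int) -> Element:
        el = Element(next(self._ids), type, x, y)
        self.elements[el.id] = el
        return el

    def move(self, element_id: int, x: int, y: int) -> None:
        el = self.elements[element_id]
        el.x, el.y = x, y

    def delete(self, element_id: int) -> None:
        del self.elements[element_id]

canvas = Canvas()
el = canvas.add(ElementType.RECTANGLE, 40, 60)
canvas.move(el.id, 100, 120)
```

Keeping the engine this plain is what lets both envs wrap the same canvas with different action and observation spaces.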

The orchestrator is fully decoupled from any specific environment; it discovers tools from MCP at runtime. Adding a new env means adding the following files:

  • env.py
  • action_space.py
  • observation_space.py
  • task.py
  • reward_fn.py (if applicable)
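A rough skeleton of what the env.py half of that list could look like. Everything here is illustrative (method and field names are assumptions); mirror an existing env such as envs/high_level/ for the real contract:

```python
# Hypothetical env.py skeleton for a new env; copy envs/high_level/ in practice.
class MyEnv:
    def __init__(self):
        self.state = {"elements": []}

    def step(self, action: dict) -> dict:
        # action_space.py defines which tools are legal; apply, then observe
        if action.get("tool") == "add_element":
            self.state["elements"].append(action["args"])
        return self.observe()

    def observe(self) -> dict:
        # observation_space.py decides the format (JSON here; base64 PNG for
        # a computer-use env)
        return self.state

env = MyEnv()
obs = env.step({"tool": "add_element", "args": {"type": "text"}})
```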

About

Minimal RL environment for canvas design agents: semantic and computer-use action spaces, MCP tooling, and a heuristic reward.
