A purple agent for the Design2Code benchmark that generates HTML code from screenshot images using GPT-4o Vision. This agent implements the A2A (Agent-to-Agent) protocol and can be submitted to the Design2Code leaderboard.
src/
├─ server.py # Server setup and agent card configuration
├─ executor.py # A2A request handling
├─ agent.py # Your agent implementation goes here
└─ messenger.py # A2A messaging utilities
tests/
└─ test_agent.py # Agent tests
Dockerfile # Docker configuration
pyproject.toml # Python dependencies
.github/
└─ workflows/
   └─ test-and-publish.yml # CI workflow
This Design2Code agent:
- Receives screenshot images embedded in messages via <screenshot_base64>...</screenshot_base64> tags
- Uses GPT-4o Vision (via LiteLLM) to analyze screenshots and generate HTML code
- Returns self-contained HTML that recreates the visual appearance of the screenshot
- Wraps HTML output in <html_code>...</html_code> tags as required by the evaluator
- Maintains conversation history for multi-turn interactions
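The tag handling described above can be sketched as follows. This is a minimal illustration, not the code in agent.py: the tag names come from this README, while the helper names `extract_screenshot` and `wrap_html` are hypothetical.

```python
import base64
import re
from typing import Optional

# Tag names are taken from the README; everything else here is illustrative.
SCREENSHOT_RE = re.compile(
    r"<screenshot_base64>(.*?)</screenshot_base64>", re.DOTALL
)


def extract_screenshot(message: str) -> Optional[bytes]:
    """Return the decoded image bytes from the first screenshot tag, if any."""
    match = SCREENSHOT_RE.search(message)
    if match is None:
        return None
    # Remove whitespace so payloads wrapped across lines still decode.
    payload = "".join(match.group(1).split())
    return base64.b64decode(payload)


def wrap_html(html: str) -> str:
    """Wrap generated HTML in the tags the evaluator expects."""
    return f"<html_code>{html.strip()}</html_code>"
```

In a real agent the decoded bytes would be passed to the vision model, and `wrap_html` would be applied to the model's reply before it is sent back.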
- Fork this repository to your GitHub account
- Set up environment variables:
  # Create a .env file (or export in your shell)
  echo "OPENAI_API_KEY=your-openai-api-key-here" > .env
- Install dependencies:
  uv sync
- Run the agent locally (see Running Locally below)
- Test your agent (see Testing below)
To submit this agent to the Design2Code leaderboard:
- Register your agent on AgentBeats:
  - Deploy your agent (see Publishing below for Docker deployment)
  - Register it on the AgentBeats platform to obtain your agentbeats_id
- Fork the leaderboard repository
- Configure your submission:
  - Edit scenario.toml in the leaderboard repository
  - Add your agentbeats_id under [[participants]]
  - Set name = "agent" (required)
  - Add your OPENAI_API_KEY as a GitHub secret
- Push to trigger evaluation:
  - Push your changes to the leaderboard repository
  - GitHub Actions will automatically run the evaluation
For detailed submission instructions, see the leaderboard repository README.
# Install dependencies
uv sync
# Run the server (default: http://127.0.0.1:9009)
uv run src/server.py
# Or with custom options
uv run src/server.py --host 0.0.0.0 --port 9009 --agent-llm openai/gpt-4o
Command-line options:
- --host: Host to bind the server (default: 127.0.0.1)
- --port: Port to bind the server (default: 9009)
- --card-url: External URL for the agent card (for deployment)
- --agent-llm: LLM model to use (default: openai/gpt-4o)
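For readers adapting the server, the options above could be parsed roughly as shown here. This is a sketch using the standard `argparse` module with the flag names and defaults listed in this README; it is not necessarily how src/server.py is written, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Run the Design2Code agent server")
    parser.add_argument("--host", default="127.0.0.1",
                        help="Host to bind the server")
    parser.add_argument("--port", type=int, default=9009,
                        help="Port to bind the server")
    parser.add_argument("--card-url", default=None,
                        help="External URL for the agent card (for deployment)")
    parser.add_argument("--agent-llm", default="openai/gpt-4o",
                        help="LLM model to use")
    return parser


# argparse exposes dashed flags with underscores, e.g. args.agent_llm.
args = build_parser().parse_args(["--host", "0.0.0.0", "--port", "9019"])
print(args.host, args.port, args.agent_llm)
```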
The agent requires the following environment variable:
OPENAI_API_KEY: Your OpenAI API key for GPT-4o Vision access
You can set this in a .env file (loaded automatically) or export it:
export OPENAI_API_KEY=your-api-key-here
# Build the image
docker build -t design2code-agent .
# Run the container (with API key from environment)
docker run -p 9009:9009 -e OPENAI_API_KEY=your-api-key-here design2code-agent
# Or with custom port
docker run -p 9019:9019 -e OPENAI_API_KEY=your-api-key-here design2code-agent --port 9019
The repository includes A2A conformance tests and a screenshot generation test.
# Install test dependencies
uv sync --extra test
# Start your agent in one terminal (see Running Locally above)
# Run all tests in another terminal
uv run pytest --agent-url http://localhost:9009
# Run specific test
uv run pytest tests/test_agent.py::test_screenshot_generation --agent-url http://localhost:9009
# Run with verbose output
uv run pytest -v --agent-url http://localhost:9009
Note: The screenshot test requires OPENAI_API_KEY to be set, as it makes real API calls to test the full generation pipeline.
The repository includes a GitHub Actions workflow that automatically builds, tests, and publishes a Docker image to GitHub Container Registry.
Add your API key as a repository secret:
- Go to Settings → Secrets and variables → Actions
- Click "New repository secret"
- Name: OPENAI_API_KEY
- Value: Your OpenAI API key
- Click "Add secret"
This secret will be available to CI/CD workflows and can be used when deploying.
- Push to main → publishes latest tag:
  ghcr.io/<your-username>/design2code-agent:latest
- Create a git tag (e.g. git tag v1.0.0 && git push origin v1.0.0) → publishes version tags:
  ghcr.io/<your-username>/design2code-agent:1.0.0
  ghcr.io/<your-username>/design2code-agent:1
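The mapping from a git tag to image tags shown above can be sketched as a small function. This only mirrors the example in this README (v1.0.0 yields 1.0.0 and 1); the actual workflow may derive its tags differently, and `image_tags` is a hypothetical helper.

```python
import re

# Accepts tags of the form vMAJOR.MINOR.PATCH, as in the README example.
SEMVER_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def image_tags(git_tag: str) -> list:
    """Return the full-version and major-version image tags for a git tag."""
    match = SEMVER_TAG.match(git_tag)
    if match is None:
        raise ValueError(f"not a semantic version tag: {git_tag!r}")
    major, minor, patch = match.groups()
    return [f"{major}.{minor}.{patch}", major]


print(image_tags("v1.0.0"))
```

A tag that does not match the pattern, such as `release-1`, raises `ValueError` in this sketch rather than producing a guess.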
Once the workflow completes, find your Docker image in the Packages section (right sidebar). Configure package visibility in package settings if needed.
After publishing, you can deploy the containerized agent:
# Pull and run from GitHub Container Registry
docker run -p 9009:9009 \
  -e OPENAI_API_KEY=your-api-key-here \
  ghcr.io/<your-username>/design2code-agent:latest
Note: Organization repositories may need package write permissions enabled manually (Settings → Actions → General). Version tags must follow semantic versioning (e.g., v1.0.0).
To participate in the Design2Code leaderboard, your agent must:
✅ Accept screenshots: Receive images via <screenshot_base64>...</screenshot_base64> tags
✅ Generate HTML: Produce HTML that recreates the visual appearance
✅ Format output: Wrap HTML in <html_code>...</html_code> tags
✅ Self-contained: Include all CSS within the HTML file (no external dependencies)
✅ Image placeholders: Use "rick.jpg" as placeholder for images
✅ Vision model: Use a vision-capable LLM (e.g., GPT-4o Vision)
✅ A2A compliance: Follow the A2A protocol format
This agent meets all these requirements and is ready for submission.
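If you modify the agent, a quick local check against the output-format requirements can catch regressions before submission. The sketch below is a heuristic under stated assumptions: it only checks for the <html_code> wrapper and for `<link>` or `<script>` elements pointing at absolute URLs, and `check_response` is a hypothetical helper, not part of the evaluator.

```python
import re

HTML_CODE_RE = re.compile(r"<html_code>(.*?)</html_code>", re.DOTALL)

# Heuristic: <link> or <script> whose href/src starts with http(s):// or //.
EXTERNAL_RE = re.compile(
    r"""<(?:link|script)\b[^>]*\b(?:href|src)\s*=\s*["']?(?:https?:)?//""",
    re.IGNORECASE,
)


def check_response(response: str) -> list:
    """Return a list of format problems found in an agent response."""
    match = HTML_CODE_RE.search(response)
    if match is None:
        return ["response is not wrapped in <html_code> tags"]
    problems = []
    if EXTERNAL_RE.search(match.group(1)):
        problems.append("HTML references an external stylesheet or script")
    return problems
```

The check does not look inside CSS (for example `@import` or `url(...)`), so a clean result is a necessary condition for self-contained output, not a guarantee.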
The Design2Code benchmark evaluates agents on five dimensions (each weighted 20%):
- Layout Coverage: Element size and area coverage matching
- Text Accuracy: Text content similarity using sequence matching
- Position Accuracy: Element positioning accuracy
- Color Accuracy: Color matching using CIEDE2000 color difference
- Visual Similarity: Overall visual similarity using CLIP model
Final score: 0.2 × (layout + text + position + color + visual) (range: 0.0 to 1.0)
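As a worked example of the formula above, the function below averages five sub-scores with equal weight. It assumes each sub-score is already normalized to the range 0.0 to 1.0, which the stated range of the final score implies but this README does not spell out; `final_score` is an illustrative name.

```python
def final_score(layout: float, text: float, position: float,
                color: float, visual: float) -> float:
    """Combine the five dimension scores with equal 20% weights."""
    scores = (layout, text, position, color, visual)
    for value in scores:
        # Assumption: every dimension is reported on a 0.0 to 1.0 scale.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"score out of range: {value}")
    return 0.2 * sum(scores)


# Sub-scores of 0.8, 0.9, 0.7, 0.6, and 0.5 sum to 3.5, giving about 0.7.
print(round(final_score(0.8, 0.9, 0.7, 0.6, 0.5), 3))
```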
See the leaderboard repository for more details on evaluation methodology.