An assessment framework for evaluating HTML generation from screenshots, built on A2A standards with optional Google ADK integration.
```
src/
├── green_agent/   # Assessment manager agent
├── white_agent/   # Target agent being tested
├── my_util/       # Utility functions
└── launcher.py    # Evaluation coordinator
data/              # HTML files and screenshots
```
- Install dependencies:

```bash
uv sync
```

- Install Playwright browser for screenshot generation:

```bash
uv run playwright install chromium
```

This downloads the Chromium browser (~130MB) needed for generating screenshots from HTML files during evaluation.
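For reference, the screenshot step amounts to rendering an HTML file in headless Chromium; a minimal sketch with Playwright's sync API (the function name and viewport size are illustrative, not this repo's code):

```python
# Minimal sketch of rendering an HTML file to a PNG with Playwright's
# sync API; the helper name and viewport size are illustrative only.
from pathlib import Path
from playwright.sync_api import sync_playwright

def screenshot_html(html_path: str, png_path: str) -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": 1280, "height": 800})
        page.goto(Path(html_path).resolve().as_uri())  # load the local file
        page.screenshot(path=png_path, full_page=True)
        browser.close()

screenshot_html("data/2.html", "data/2.png")
```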
Create a .env file with the following variables:
```
# Required: OpenAI API Key for white agent
OPENAI_API_KEY=your_openai_api_key_here

# Optional: Google API Key for ADK agents (only if using ADK)
GOOGLE_API_KEY=your_google_api_key_here

# Optional: Evaluation Debug Level
# Options: INFO, DEBUG, TRACE, or leave empty to disable detailed logging
# - INFO: High-level metrics (block counts, final scores)
# - DEBUG: Per-pair details, matching statistics
# - TRACE: All intermediate calculations, raw data
EVAL_DEBUG_LEVEL=INFO
```

After configuring .env, run:

```bash
# Launch complete evaluation
uv run python main.py launch
# Start green agent only
uv run python main.py green
# Start white agent only (A2A HTTP server)
uv run python main.py white
# Start white agent with LangServe web interface
uv run python main.py langserve
```

The Design2Code dataset contains HTML files paired with screenshots. The green agent loads these pairs and creates assessment tasks for the white agent to generate HTML from screenshots.
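For orientation, loading one pair and packaging it for an assessment task might look like the sketch below; the flat `<id>.html` / `<id>.png` layout under data/ is an assumption, not this repo's actual structure:

```python
# Hypothetical sketch of loading one Design2Code pair for an assessment
# task; file naming under data/ is an assumption.
import base64
from pathlib import Path

def load_pair(data_dir: str, sample_id: str) -> dict:
    """Load one HTML/screenshot pair and encode the PNG for transport."""
    root = Path(data_dir)
    html = (root / f"{sample_id}.html").read_text(encoding="utf-8")
    png_bytes = (root / f"{sample_id}.png").read_bytes()
    return {
        "reference_html": html,  # ground truth used for scoring
        "screenshot_base64": base64.b64encode(png_bytes).decode("ascii"),
    }

task = load_pair("data", "2")  # e.g. data/2.html + data/2.png
```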
The white agent is a native A2A HTTP server using OpenAI for HTML generation, with an optional ADK client wrapper for integration with ADK workflows.
```
┌─────────────────────────────────────┐
│   White Agent A2A HTTP Server       │
│   (OpenAI GPT-4o via LiteLLM)       │
│                                     │
│  ┌──────────────────────────────┐   │
│  │  HTML Generation Logic       │   │
│  │  - Vision model processing   │   │
│  │  - HTML/CSS generation       │   │
│  └──────────────────────────────┘   │
└──────────┬──────────────────────────┘
           │ HTTP (A2A Protocol)
           │
    ┌──────┴───────┐
    │              │
┌───▼────┐    ┌────▼─────────────┐
│  A2A   │    │   ADK Client     │
│Clients │    │    Wrapper       │
│(Green  │    │   (Optional)     │
│ Agent) │    └────┬─────────────┘
└────────┘         │
                   ▼
             ┌─────────────┐
             │  ADK Agents │
             │  (Gemini)   │
             └─────────────┘
```
Key Points:
- Primary: A2A HTTP server using OpenAI (no Google credentials needed)
- Optional: ADK client wrapper for calling from ADK workflows
- Separation: White agent doesn't depend on ADK
- Flexibility: Can be called by any HTTP client (A2A or custom)
Just start the server and call via A2A. The green agent does this automatically:
```bash
# Start white agent server
uv run python main.py white

# Run evaluation (green agent calls white agent)
uv run python main.py launch
```

Requirements: Only OPENAI_API_KEY needed
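Any HTTP client can do what the green agent does here; a minimal sketch, assuming the server implements the standard A2A JSON-RPC `message/send` method (the payload shape follows the A2A spec, not code from this repo):

```python
# Minimal sketch of a raw A2A call; part shapes follow the A2A spec,
# not this repo's code.
import uuid
import httpx

def send_screenshot(png_base64: str, url: str = "http://localhost:10002") -> dict:
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "message/send",
        "params": {
            "message": {
                "role": "user",
                "messageId": uuid.uuid4().hex,
                "parts": [
                    {"kind": "text", "text": "Generate HTML from this screenshot"},
                    {"kind": "file", "file": {"bytes": png_base64, "mimeType": "image/png"}},
                ],
            }
        },
    }
    resp = httpx.post(url, json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json()  # response carries the generated HTML
```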
The white agent runs as an HTTP server. ADK users can call it using the client wrapper:
```python
import asyncio

from src.white_agent import call_white_agent_http

async def generate_html():
    # White agent server must be running
    html_code = await call_white_agent_http(
        screenshot_base64="...",  # Your base64-encoded PNG
        white_agent_url="http://localhost:10002",
        description="Generate HTML from this screenshot",
    )
    return html_code

asyncio.run(generate_html())
```

```python
from google.adk.agents import Agent
from src.white_agent import create_white_agent_tool

# Create tool that calls white agent via HTTP
white_tool = create_white_agent_tool(
    white_agent_url="http://localhost:10002"
)

# Use in an ADK agent (Gemini for orchestration)
coordinator = Agent(
    model="gemini-2.0-flash",  # Gemini orchestrates
    name="design_coordinator",
    description="Coordinates design-to-code workflows",
    instruction="Use the HTML generation tool for design tasks.",
    tools=[white_tool],  # Delegates to white agent via HTTP
)
```

Requirements:

- OPENAI_API_KEY for the white agent server
- GOOGLE_API_KEY for your ADK agents (Gemini)
The project includes comprehensive examples:
```bash
# Start white agent server first
uv run python main.py white

# Then run examples (in another terminal)
uv run python -m src.white_agent.examples.basic_adk_usage
uv run python -m src.white_agent.examples.agent_tool_usage
uv run python -m src.white_agent.examples.a2a_compatibility
```

- ✅ Uses OpenAI: White agent uses OpenAI (no Google credentials needed)
- ✅ Simple: Standard A2A HTTP server
- ✅ ADK Compatible: Optional wrapper for ADK users
- ✅ Flexible: Mix different LLMs (Gemini for orchestration, OpenAI for vision)
- ✅ Independent: White agent doesn't depend on ADK
- ✅ Backward Compatible: Existing A2A clients work unchanged
- White Agent: A2A HTTP server using OpenAI via LiteLLM
- For A2A Clients: Call directly via A2A protocol
- For ADK Users: Use client wrapper to call HTTP server
- Multi-Agent: ADK agents (Gemini) delegate to white agent (OpenAI) via HTTP
The white agent now includes a LangChain/LangServe interface with a professional web UI and REST API, all using OpenAI GPT-4o Vision.
```bash
# Install dependencies (if not already done)
uv sync

# Start LangServe web interface
uv run python main.py langserve
```

Then open your browser to:
- Interactive Playground: http://localhost:8000/agent/playground
- API Documentation: http://localhost:8000/docs
- Simple Mode: http://localhost:8000/simple/playground
- ✅ Professional Web UI - Interactive chat interface with playground
- ✅ OpenAI GPT-4o Vision - Direct OpenAI integration (no Google dependency)
- ✅ Streaming Responses - Real-time response streaming (see the sketch after this list)
- ✅ REST API - Full API for programmatic access
- ✅ Agent Mode - Full reasoning and tool use capabilities
- ✅ Simple Mode - Direct HTML generation without agent overhead
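As referenced in the feature list, streaming goes through LangServe's standard /stream endpoint; a minimal sketch using the langserve client, with the /agent route taken from the URLs above:

```python
# Sketch of consuming streamed output; LangServe exposes /stream for each
# mounted runnable, and RemoteRunnable wraps it. Chunk shape depends on the
# underlying agent, so this just prints whatever arrives.
from langserve import RemoteRunnable

agent = RemoteRunnable("http://localhost:8000/agent/")
for chunk in agent.stream({"input": "Generate HTML from this screenshot: <base64_data>"}):
    print(chunk, flush=True)
```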
```python
import asyncio

from src.white_agent import generate_html_from_screenshot_impl

async def generate():
    html = generate_html_from_screenshot_impl(
        screenshot_base64="<your_base64_screenshot>",
        description="Generate a landing page",
    )
    return html

asyncio.run(generate())
```

```python
import asyncio

from src.white_agent import create_white_agent
async def use_agent():
    agent = create_white_agent()
    result = agent.invoke({
        "input": "Generate HTML from this screenshot: <base64_data>"
    })
    print(result["output"])

asyncio.run(use_agent())
```

```python
from src.white_agent import create_simple_chain
chain = create_simple_chain()
result = chain({
    "screenshot_base64": "<base64_data>",
    "description": "Generate HTML"
})
html = result["output"]
```

The LangServe API provides multiple endpoints:
```bash
# Agent endpoint (with reasoning)
curl -X POST "http://localhost:8000/agent/invoke" \
  -H "Content-Type: application/json" \
  -d '{"input": {"input": "Generate HTML from: <base64>"}}'

# Simple endpoint (direct generation)
curl -X POST "http://localhost:8000/simple/invoke" \
  -H "Content-Type: application/json" \
  -d '{"input": {"screenshot_base64": "<base64>", "description": "Generate HTML"}}'

# Health check
curl http://localhost:8000/health
```
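The same endpoints can also be called from Python with LangServe's RemoteRunnable client; a sketch, with input schemas assumed to match the curl examples above:

```python
# Sketch of calling the LangServe routes from Python; input schemas are
# assumed to match the curl examples above.
from langserve import RemoteRunnable

# Simple mode: direct HTML generation
simple = RemoteRunnable("http://localhost:8000/simple/")
html = simple.invoke({
    "screenshot_base64": "<base64>",
    "description": "Generate HTML",
})

# Agent mode: full reasoning and tool use
agent = RemoteRunnable("http://localhost:8000/agent/")
answer = agent.invoke({"input": "Generate HTML from: <base64>"})
```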
```bash
# Start LangServe first
uv run python main.py langserve

# In another terminal, run examples
uv run python -m src.white_agent.examples.langchain_usage
uv run python -m src.white_agent.examples.a2a_compatibility
```

| Interface | Port | Use Case | LLM |
|---|---|---|---|
| A2A Server | 10002 | Green agent evaluation, A2A clients | OpenAI GPT-4o |
| LangServe | 8000 | Web UI, REST API, LangChain agents | OpenAI GPT-4o |
| ADK Wrapper | - | ADK agents (calls A2A server) | Gemini (orchestration) + OpenAI (HTML gen) |
Recommendation: Use LangServe for:
- Interactive web interface
- Development and testing
- REST API access
- LangChain-based workflows
- When you want the best web experience with OpenAI
Use A2A server for:
- Green agent evaluation
- Production A2A workflows
- When you need A2A protocol compliance
- ✅ All OpenAI - No Google API keys needed
- ✅ Better UI - Professional web interface with playground
- ✅ Developer Friendly - Interactive docs, streaming, batch processing
- ✅ Production Ready - FastAPI backend, scalable
- ✅ Flexible - Agent mode for reasoning, simple mode for speed
- ✅ Observable - Compatible with LangSmith for monitoring
Requirements: Only OPENAI_API_KEY needed (no Google dependencies)