An AI-powered web validation framework that automatically detects spelling errors, grammar issues, and visual/UI problems on websites. It is built on a modular architecture using LangGraph and Playwright, and supports both Google's Gemini and OpenAI models.
- Spell Checker Agent: Detects spelling errors, grammar issues, and unclear phrasing in text content
- Visual QA Agent: Identifies layout issues, accessibility problems, responsive design flaws, and UI inconsistencies
- Real Browser Automation: Uses Playwright for accurate rendering of JavaScript-heavy sites
- AI-Powered Analysis: Leverages Google's Gemini 2.5 Flash or OpenAI models (GPT-4o, GPT-4 Turbo) with vision capabilities
- Authentication Support: Test authenticated websites with form-based or HTTP Basic authentication (see Authentication Guide)
- Multi-Language Support: Handles bidirectional text (Hebrew, Arabic, etc.)
- Flexible Configuration: YAML-based config for agents, targets, and output formats
- Multiple Output Formats: Text, JSON, and HTML reports with interactive dashboards
- Parallel Execution: Run multiple agents concurrently for faster validation
- Robust Error Handling: Retry logic with exponential backoff for reliability
- Slack Integration: Interact with agents directly from Slack (see Slack Integration Guide)
- LangSmith Observability: Monitor and debug agent runs with LangSmith integration (see LangSmith Guide)
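The "retry logic with exponential backoff" mentioned in the list above can be illustrated with a minimal sketch. This is not the framework's actual implementation; the function name `retry_with_backoff` and its parameters are assumptions made for illustration only.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on any exception.

    The wait before retry n (0-based) is base_delay * 2**n seconds.
    `sleep` is injectable so the delays can be observed in tests.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```

A flaky call such as a page load or a model request would be wrapped as `retry_with_backoff(lambda: load_page(url))`, where `load_page` stands in for whatever operation may fail transiently.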
The project uses a modular, extensible architecture:
agent-validator/
├── agents/ # Validation agents (spell_checker, visual_qa)
├── core/ # Core framework (orchestrator, config, exceptions)
├── reporters/ # Output formatters (text, JSON, HTML)
├── utils/ # Utilities (browser, validation, text processing)
├── tests/ # Unit tests
├── examples/ # Example configurations
└── main.py # CLI entry point
For detailed architecture documentation, see ARCHITECTURE.md.
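As a rough illustration of how an orchestrator could run several agents concurrently, here is a small sketch built on the standard library's thread pool. The function `run_agents_in_parallel` and the shape of the result dictionaries are hypothetical; the real orchestrator in core/ may be structured quite differently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_agents_in_parallel(
    agents: Dict[str, Callable[[str], dict]], url: str
) -> Dict[str, dict]:
    """Run every agent callable against the same URL, one thread per agent.

    A failing agent does not abort the others; its error is recorded instead.
    """
    results: Dict[str, dict] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        # Submit all agents first so they run concurrently.
        futures = {name: pool.submit(agent, url) for name, agent in agents.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:
                results[name] = {"status": "error", "error": str(exc)}
    return results
```

Threads suit this workload because agents spend most of their time waiting on the browser and on model API responses rather than on CPU work.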
- Python 3.10+
- An API key from one of:
  - Google AI API Key (free tier available at Google AI Studio)
  - OpenAI API Key (get it at OpenAI Platform)
- Clone the repository:
git clone <repository-url>
cd agent-validator
- Install dependencies with pipenv:
pipenv install
- Install Playwright browsers:
pipenv run install-playwright
- Set up your API key:
cp .env.example .env
# Edit .env and add your GOOGLE_API_KEY or OPENAI_API_KEY (or both)
Create a .env file with your API key(s):
# For Google Gemini (default)
GOOGLE_API_KEY=your_google_api_key_here
# For OpenAI (optional)
OPENAI_API_KEY=your_openai_api_key_here
Create or edit config.yaml to customize agents and targets:
agents:
  spell_checker:
    enabled: true
    provider: "gemini"  # Options: gemini, openai
    model: "gemini-2.5-flash"  # For OpenAI: gpt-4o, gpt-4-turbo, gpt-3.5-turbo
    temperature: 0
    max_text_length: 10000
  visual_qa:
    enabled: true
    provider: "gemini"  # Options: gemini, openai
    model: "gemini-2.5-flash"  # For OpenAI: gpt-4o, gpt-4-turbo (vision models)
    temperature: 0
    viewports:
      - width: 1920
        height: 1080
        name: "Desktop"
targets:
  - url: "https://example.com"
    agents: ["spell_checker", "visual_qa"]
  # Example with authentication
  - url: "https://app.example.com/dashboard"
    agents: ["spell_checker", "visual_qa"]
    auth:
      type: form  # or "basic" for HTTP Basic auth
      username: "${AUTH_USERNAME}"  # Use environment variables
      password: "${AUTH_PASSWORD}"
      selectors:  # Required for form auth
        username: 'input[name="username"]'
        password: 'input[name="password"]'
        submit: 'button[type="submit"]'
output:
  format: "html"  # Options: text, json, html
  path: "./reports"
  timestamp: true
See examples/ for more configuration examples and docs/AUTHENTICATION.md for authentication setup.
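The configuration above refers to credentials through `${AUTH_USERNAME}`-style placeholders. How the framework resolves them is not shown here, so the following is only a sketch of one plausible approach; `expand_env` is a hypothetical helper, not part of the project's API.

```python
import os
import re

# Matches ${NAME} where NAME is a valid environment variable identifier.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Recursively replace ${NAME} placeholders in strings.

    Works on nested dicts and lists as produced by a YAML loader.
    Placeholders whose variable is not set are left untouched.
    """
    env = os.environ if env is None else env
    if isinstance(value, str):
        return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    return value  # numbers, booleans, None pass through unchanged
```

Leaving unresolved placeholders in place (rather than substituting an empty string) makes a missing variable easy to spot in logs or error messages.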
# Run all agents on a URL
pipenv run python main.py --url https://example.com
# Run specific agent
pipenv run python main.py --url https://example.com --agents spell_checker
# Run with custom config
pipenv run python main.py --config examples/multi_site_check.yaml
# Output as HTML
pipenv run python main.py --url https://example.com --format html --output reports/
# Verbose logging
pipenv run python main.py --url https://example.com -v
# With authentication (form-based)
export AUTH_USERNAME="user@example.com"
export AUTH_PASSWORD="password123"
pipenv run python main.py --url https://app.example.com/dashboard \
  --auth-type form \
  --username '${AUTH_USERNAME}' \
  --password '${AUTH_PASSWORD}' \
  --username-selector 'input[name="username"]' \
  --password-selector 'input[name="password"]' \
  --submit-selector 'button[type="submit"]'
# With authentication (HTTP Basic)
pipenv run python main.py --url https://api.example.com/docs \
  --auth-type basic \
  --username admin \
  --password '${API_KEY}'
# Main CLI using validate script
pipenv run validate --url https://example.com
# With options
pipenv run validate --url https://example.com --format html --output reports/
from core.config_loader import ConfigLoader
from core.orchestrator import Orchestrator
from reporters import get_reporter
# Load configuration
config = ConfigLoader.load("config.yaml")
# Create orchestrator
orchestrator = Orchestrator(config._config)
# Run agents on a URL
results = orchestrator.run_multiple_agents(
    url="https://example.com",
    agent_names=["spell_checker", "visual_qa"],
)
# Generate report
reporter = get_reporter("html", timestamp=True)
report = reporter.format_report(results)
print(report)
The original standalone agents are still available:
pipenv run spell-check # Run spell checker only
pipenv run visual-check # Run visual QA only
Run validation agents directly from Slack! See the Slack Integration Guide for setup instructions.
# Start the Slack bot
pipenv run slack-bot
# With verbose logging
pipenv run slack-bot -v
Key Features:
- Chat with agents directly in Slack
- Interactive conversations to gather required arguments
- Formatted reports delivered to Slack channels
- Support for both direct messages and channel mentions
Example Slack conversation:
You: @Agent Validator Bot check spelling on https://example.com
Bot: 🚀 Starting spell_checker validation for: https://example.com
Bot: [Formatted report with spelling errors]
Console-friendly plain text reports.
Machine-readable format for integration with other tools:
{
  "timestamp": "2026-01-05 10:30:00",
  "results": [...],
  "summary": {
    "total_validations": 2,
    "passed": 1,
    "failed": 1
  }
}
Interactive dashboard with:
- Summary statistics
- Severity-based filtering
- Color-coded issues
- Responsive design
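Because the JSON report is machine-readable, its `summary` block can be consumed by other tools, for example a CI step that fails the build when any validation fails. The sketch below assumes only the `summary` fields shown in the JSON sample in this README; the helper names are illustrative, and the contents of `results` are not modeled.

```python
import json


def summarize_report(report_json: str) -> str:
    """Build a one-line summary from a JSON report string."""
    summary = json.loads(report_json)["summary"]
    return (
        f"{summary['passed']}/{summary['total_validations']} passed, "
        f"{summary['failed']} failed"
    )


def has_failures(report_json: str) -> bool:
    """Return True when at least one validation failed."""
    return json.loads(report_json)["summary"]["failed"] > 0


if __name__ == "__main__":
    # Sample mirroring the documented format; "results" is left empty here.
    sample = json.dumps({
        "timestamp": "2026-01-05 10:30:00",
        "results": [],
        "summary": {"total_validations": 2, "passed": 1, "failed": 1},
    })
    print(summarize_report(sample))
    raise SystemExit(1 if has_failures(sample) else 0)
```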
Run the test suite:
pipenv run test
pipenv run python main.py --config examples/multi_site_check.yaml
pipenv run python main.py --config examples/mobile_responsive.yaml
pipenv run python main.py --url https://example.com --agents spell_checker --format text
- agents/ - Validation agents
- core/ - Framework core (orchestrator, config, exceptions)
- reporters/ - Output formatters
- utils/ - Helper utilities
- tests/ - Unit tests
- examples/ - Example configurations
- Create agent class in agents/:
from agents.base_agent import BaseAgent

class MyAgent(BaseAgent):
    def build_workflow(self):
        # Define your workflow
        pass
- Register in orchestrator:
# In core/orchestrator.py
AGENT_REGISTRY = {
    "my_agent": MyAgent,
}
Contributions are welcome! Please feel free to submit a Pull Request.
[Your License Here]
- Google AI Studio - Get your API key
- Playwright Documentation - Browser automation
- LangGraph Documentation - Agent framework
For issues, questions, or suggestions, please open an issue on GitHub.
Built with ❤️ using LangGraph, Playwright, and Google Gemini AI
- Applies bidirectional text reordering for proper display of Hebrew/Arabic text
- Returns success or failure status with detailed error information
--- Launching browser and scraping: https://www.example.com ---
--- Gemini analyzing 8542 characters of live text ---
--- Summarizing findings ---
==============================
FAILED: Found 3 errors on page https://www.example.com:
1. Error: 'recieve' -> Correction: 'receive'
Context: "You will recieve an email confirmation within 24 hours."
2. Error: 'their' -> Correction: 'there'
Context: "Their are many options available for customization."
3. Error: 'alot' -> Correction: 'a lot'
Context: "We offer alot of features for our users."
- LangGraph: Orchestrates the multi-node agent workflow
- Playwright: Provides real browser automation for accurate content scraping
- AI Models: Supports both Google Gemini AI and OpenAI for intelligent analysis
- python-bidi: Handles bidirectional text rendering for RTL languages
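The reordering itself is handled by python-bidi, but it only matters for text that actually contains right-to-left characters. As a small illustration, the standard library can detect that case; `contains_rtl` below is a hypothetical helper and is not taken from the project's code.

```python
import unicodedata

# Unicode bidirectional classes for strongly right-to-left characters:
# "R" covers Hebrew-style letters, "AL" covers Arabic letters.
RTL_CLASSES = {"R", "AL"}


def contains_rtl(text: str) -> bool:
    """Return True if any character in text is strongly right-to-left."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)
```

A caller could use this check to skip the reordering step entirely for pages that are purely left-to-right.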
In your config.yaml, change the provider and model:
agents:
  spell_checker:
    provider: "openai"  # Switch to OpenAI
    model: "gpt-4o"  # Use GPT-4o model
Or use Gemini:
agents:
  spell_checker:
    provider: "gemini"
    model: "gemini-2.5-pro"  # Use Pro model for higher quality
Gemini Models:
- gemini-2.5-flash (default, fast and efficient)
- gemini-2.5-pro (higher quality)
- gemini-1.5-flash (legacy)
OpenAI Models:
- gpt-4o (latest multimodal model)
- gpt-4-turbo (vision support)
- gpt-3.5-turbo (text-only, faster)
Modify the character limit in the scraper node:
return {"raw_text": clean_text[:20000]}  # Increase to 20,000 characters
Refine the scraping to specific page sections:
visible_text = page.inner_text("main")  # Only scrape main content area
# or
visible_text = page.inner_text("article")  # Only scrape article content
Built with ❤️ using LangGraph, Playwright, Google Gemini AI, and OpenAI
MIT