MCP UI Screenshot Analyzer

An MCP (Model Context Protocol) server that integrates with GitHub Copilot to provide AI-powered UI analysis using local models only (zero API costs, privacy-first design).

Features

  • UI Screenshot Analysis: Semantic understanding of UI layouts and structure
  • Color Palette Extraction: Extract dominant colors using OpenCV k-means
  • Text Extraction: OCR capabilities via Gemma 3 vision
  • Bug Detection: Identify layout issues and accessibility problems
  • Depth Levels: Configurable analysis depth (quick/standard/deep)
  • Smart Caching: Performance optimization for repeated analyses

Current Status

Week 1 MVP - COMPLETE

  • ✅ Gemma 3 12B vision integration via Ollama
  • ✅ OpenCV color extraction
  • ✅ Result caching (1-hour TTL)
  • ✅ All 6 MCP tools implemented (4 fully functional)
  • ✅ Error handling and validation
  • ⏳ YOLOv8 component detection (Week 2)
  • ⏳ Code generation (Week 3)

Quick Start

Prerequisites

  • macOS or Linux
  • Python 3.10+
  • 8GB+ RAM (16GB recommended)
  • Ollama installed

Installation

# 1. Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# 2. Pull Gemma 3 12B model
ollama pull gemma3:12b

# 3. Create and activate a virtual environment, then install dependencies
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# 4. Run the MCP server
python server.py

Verify Installation

# Check Ollama is running
pgrep -x "ollama"

# Check models are available
ollama list

# Test server initialization
python -c "from server import mcp, gemma_analyzer; print('Server ready!')"

MCP Tools

1. analyze_ui_screenshot

Main analysis tool with configurable depth levels.

Parameters:

  • image_path (str): Absolute path to screenshot
  • depth (str): "quick" | "standard" | "deep"

Depth Levels:

  • quick: Gemma 3 only (2-4s) - Basic description
  • standard: Gemma 3 + colors (5-8s) - Detailed analysis
  • deep: Full pipeline (12-18s) - Comprehensive analysis

Example:

# Via GitHub Copilot Chat:
Analyze the UI screenshot at /Users/me/screenshot.png

# Programmatic:
result = analyze_ui_screenshot("/path/to/screenshot.png", depth="standard")
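
Conceptually, the depth parameter gates which pipeline stages run. The sketch below is illustrative only (the stage functions are stand-in stubs so it runs on its own), not the server's actual dispatch code:

# Stand-in stubs; the real server calls its analyzers and cache instead.
def run_gemma_vision(path):      return "layout description"
def extract_color_palette(path): return ["#ffffff", "#3366ff"]
def extract_ui_text(path):       return ["Sign in", "Cancel"]
def detect_ui_bugs(path):        return []

def analyze(image_path: str, depth: str = "standard") -> dict:
    result = {"description": run_gemma_vision(image_path)}    # quick: vision model only
    if depth in ("standard", "deep"):
        result["colors"] = extract_color_palette(image_path)  # standard: add color palette
    if depth == "deep":
        result["text"] = extract_ui_text(image_path)          # deep: full pipeline
        result["bugs"] = detect_ui_bugs(image_path)
    return result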

2. extract_color_palette

Extract dominant colors using OpenCV k-means clustering.

Parameters:

  • image_path (str): Absolute path to image
  • n_colors (int): Number of colors (2-10, default: 5)

Returns: Color palette with hex codes, RGB values, and percentages
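
The extraction approach can be illustrated directly with OpenCV. The snippet below is a minimal, self-contained sketch of k-means palette extraction (assuming opencv-python and numpy are installed), not the project's color_extractor.py implementation:

# Minimal sketch of k-means color extraction with OpenCV (illustrative only).
import cv2
import numpy as np

def dominant_colors(image_path: str, n_colors: int = 5):
    img = cv2.imread(image_path)                       # BGR image
    pixels = img.reshape(-1, 3).astype(np.float32)     # flatten to an Nx3 pixel array
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, centers = cv2.kmeans(pixels, n_colors, None, criteria, 3,
                                    cv2.KMEANS_RANDOM_CENTERS)
    counts = np.bincount(labels.flatten(), minlength=n_colors)
    palette = []
    for center, count in sorted(zip(centers, counts), key=lambda c: -c[1]):
        b, g, r = (int(x) for x in center)             # cluster center = dominant color
        palette.append({
            "hex": f"#{r:02x}{g:02x}{b:02x}",
            "rgb": (r, g, b),
            "percentage": round(100 * count / len(pixels), 1),
        })
    return palette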

3. extract_ui_text

Extract text from UI using Gemma 3 OCR capabilities.

Parameters:

  • image_path (str): Absolute path to screenshot

Returns: List of extracted text elements
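
As a rough illustration of OCR via a local vision model, here is a minimal sketch using the ollama Python client (assuming pip install ollama and a running Ollama daemon); the prompt and helper are illustrative, not the project's gemma_analyzer.py code:

# Minimal sketch of vision-based text extraction via the Ollama Python client.
import ollama

def extract_text(image_path: str) -> str:
    # Send the screenshot to the local Gemma 3 model and ask for visible text.
    response = ollama.chat(
        model="gemma3:12b",
        messages=[{
            "role": "user",
            "content": "List all visible text in this UI screenshot, one item per line.",
            "images": [image_path],
        }],
    )
    return response["message"]["content"]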

4. detect_ui_bugs

Detect layout issues and accessibility problems.

Parameters:

  • image_path (str): Absolute path to screenshot

Returns: List of issues with severity and suggestions
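
A direct call mirrors the other tools; the issue fields shown below (severity, suggestion) follow the description above and are illustrative:

# Illustrative usage; the exact shape of each returned issue may differ.
issues = detect_ui_bugs("/Users/me/app-screenshot.png")
for issue in issues:
    print(f"[{issue['severity']}] {issue['suggestion']}")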

5. detect_ui_components

Coming in Week 2 - YOLOv8 integration

6. generate_component_code

Coming in Week 3 - Code generation

GitHub Copilot Integration

Configuration

Create .vscode/mcp-settings.json:

{
  "mcpServers": {
    "ui-analyzer": {
      "command": "python",
      "args": ["/Users/manhhaycode/Developer/image-analysis/server.py"],
      "env": {}
    }
  }
}

Usage in Copilot Chat

Analyze the UI screenshot at /path/to/screenshot.png

Extract colors from /path/to/design.png

Detect bugs in ~/Desktop/app-screenshot.png

Performance

Operation            Time (CPU)   Time (GPU)   Cached
Quick analysis       2-4s         1-2s         <1s
Standard analysis    5-8s         2-3s         <1s
Deep analysis        12-18s       4-6s         <1s
Color extraction     <1s          <0.5s        <0.1s

Cache: Results cached for 1 hour, automatic invalidation

Project Structure

image-analysis/
├── server.py                  # MCP server entry point
├── config.yaml               # Configuration
├── analyzers/
│   ├── gemma_analyzer.py     # Ollama Gemma 3 integration
│   ├── color_extractor.py    # OpenCV color extraction
│   └── __init__.py
├── orchestrator/
│   ├── cache.py              # Result caching
│   └── __init__.py
├── tests/
│   ├── fixtures/             # Sample screenshots
│   └── __init__.py
├── utils/
│   └── __init__.py
└── venv/                     # Virtual environment

Configuration

Edit config.yaml to customize:

vision:
  model: "gemma3:12b"           # Primary model
  fallback: "gemma3:2b"          # Low RAM fallback

performance:
  enable_caching: true
  cache_ttl_seconds: 3600       # 1 hour

color_extraction:
  default_n_colors: 5

Troubleshooting

Server fails to start:

# Verify Ollama is running
pgrep -x "ollama" || ollama serve &

# Check Gemma 3 model
ollama list | grep gemma3:12b

# Reinstall dependencies
pip install -r requirements.txt

Out of memory:

# Use quantized model (6.6GB vs 9GB)
ollama pull gemma3:12b-q4

# Edit config.yaml:
vision:
  model: "gemma3:12b-q4"

Image not found errors:

  • Always use absolute paths
  • Verify file exists: ls -la /path/to/image.png
  • Check file permissions

Development

Running Tests

# (Tests to be implemented)
python -m pytest tests/

Adding New Tools

  1. Implement analyzer in analyzers/
  2. Add the MCP tool decorator in server.py (see the sketch after this list)
  3. Integrate caching
  4. Update documentation
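
For step 2, a new tool is registered on the server object. The sketch below assumes a FastMCP-style @mcp.tool() decorator (consistent with the mcp object imported in the verification step above) and a hypothetical gemma_analyzer.describe() method; adapt it to the actual analyzer API:

# Added inside server.py, where `mcp` and `gemma_analyzer` already exist.
# Assumes FastMCP-style registration; gemma_analyzer.describe() is a hypothetical method.
@mcp.tool()
def summarize_ui_layout(image_path: str) -> str:
    """Return a short, high-level layout summary for a screenshot."""
    return gemma_analyzer.describe(image_path)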

Roadmap

  • Week 1 (COMPLETE): MVP with Gemma 3 + color extraction + caching ✓
  • Week 2: YOLOv8 component detection
  • Week 3-4: Code generation, comprehensive testing, documentation

Performance Optimization

The system includes several optimizations:

  1. Smart Caching: MD5-based image hashing with a 1-hour TTL (see the sketch after this list)
  2. Depth Levels: User-controlled trade-off between speed and detail
  3. Lazy Loading: Components loaded only when needed
  4. Error Recovery: Graceful degradation if optional features fail
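
The caching design from point 1 can be sketched in a few lines; this is a minimal illustration assuming an in-memory store, not the project's orchestrator/cache.py:

# Minimal sketch of MD5-keyed result caching with a TTL (illustrative only).
import hashlib
import time

class ResultCache:
    """In-memory cache keyed by an MD5 hash of the image bytes plus the analysis depth."""

    def __init__(self, ttl_seconds: int = 3600):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (timestamp, result)

    @staticmethod
    def key_for(image_path: str, depth: str) -> str:
        with open(image_path, "rb") as f:
            digest = hashlib.md5(f.read()).hexdigest()  # hash the image content, not the path
        return f"{digest}:{depth}"

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        timestamp, result = entry
        if time.time() - timestamp > self.ttl:  # expired entry: evict and report a miss
            del self._store[key]
            return None
        return result

    def set(self, key: str, result) -> None:
        self._store[key] = (time.time(), result)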

Hardware Requirements

  • Minimum: 8GB RAM, CPU only (using quantized model)
  • Recommended: 16GB RAM, any GPU
  • Optimal: 32GB RAM, GPU with 8GB+ VRAM

License

MIT License

Contributing

Contributions welcome! Please open issues or PRs on GitHub.

Support

For issues or questions:

  • Check CLAUDE.md for detailed documentation
  • Review troubleshooting section above
  • Open a GitHub issue
