diff --git a/.claude/commands/prime.md b/.claude/commands/prime.md
index f2eecaaed2..b497f04fc8 100644
--- a/.claude/commands/prime.md
+++ b/.claude/commands/prime.md
@@ -10,9 +10,9 @@ git ls-files
@README.md
@pyproject.toml
-@docs/vision.md
-@docs/workflow.md
-@docs/architecture/repository.md
+@docs/concepts/vision.md
+@docs/workflows/overview.md
+@docs/reference/repository.md
## Read and Execute
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 233ff9de26..4bd0247407 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -38,7 +38,7 @@ This file provides guidance to GitHub Copilot when working with code in this rep
**Supported Libraries** (9 total):
- matplotlib, seaborn, plotly, bokeh, altair, plotnine, pygal, highcharts, lets-plot
-**Core Principle**: Community proposes plot ideas via GitHub Issues → AI generates code → Multi-LLM quality checks → Deployed.
+**Core Principle**: Community proposes plot ideas via GitHub Issues → AI generates code → AI quality review → Deployed.
## Development Setup
@@ -135,8 +135,8 @@ Examples: `scatter-basic`, `scatter-color-mapped`, `bar-grouped-horizontal`, `he
### PR Labels (set by workflows)
- **`approved`** - Human approved specification for merge
-- **`ai-approved`** - AI quality check passed (score >= 90)
-- **`ai-rejected`** - AI quality check failed (score < 90)
+- **`ai-approved`** - AI quality check passed (score >= 90, or >= 50 after 3 attempts)
+- **`ai-rejected`** - AI quality check failed (score < 90), triggers repair loop
- **`quality:XX`** - Quality score (e.g., `quality:92`)
**Specification Lifecycle:**
@@ -146,8 +146,8 @@ Examples: `scatter-basic`, `scatter-color-mapped`, `bar-grouped-horizontal`, `he
**Implementation PR Lifecycle:**
```
-[open] → impl-review → ai-approved → impl-merge → impl:{library}:done
-                     → ai-rejected → impl-repair (×3)
+[open] → impl-review → ai-approved (≥90) → impl-merge → impl:{library}:done
+                     → ai-rejected (<90) → impl-repair (×3) → ai-approved (≥50) or failed (<50)
```
## Code Standards
@@ -205,8 +205,7 @@ plt.savefig('plot.png', dpi=300, bbox_inches='tight')
### Anti-Patterns to Avoid
-- No `preview.png` files in repository (use GCS)
-- No `quality_report.json` files (use GitHub Issues)
+- No `preview.png` files in repository (stored in GCS)
- No hardcoded API keys (use environment variables)
## Tech Stack
diff --git a/.github/workflows/spec-create.yml b/.github/workflows/spec-create.yml
index ee6746d685..bfdd4aac51 100644
--- a/.github/workflows/spec-create.yml
+++ b/.github/workflows/spec-create.yml
@@ -117,7 +117,7 @@ jobs:
6. **Create specification files:**
- Read template: `prompts/templates/specification.md`
- Read metadata template: `prompts/templates/specification.yaml`
- - Read tagging guide: `docs/concepts/tagging-system.md`
+ - Read tagging guide: `docs/reference/tagging-system.md`
- Create directory: `plots/{specification-id}/`
- Create: `plots/{specification-id}/specification.md` (follow template structure)
- Create: `plots/{specification-id}/specification.yaml` with:
@@ -213,7 +213,7 @@ jobs:
6. **Create specification files:**
- Read template: `prompts/templates/specification.md`
- Read metadata template: `prompts/templates/specification.yaml`
- - Read tagging guide: `docs/concepts/tagging-system.md`
+ - Read tagging guide: `docs/reference/tagging-system.md`
- Create directory: `plots/{specification-id}/`
- Create: `plots/{specification-id}/specification.md` (follow template structure)
- Create: `plots/{specification-id}/specification.yaml` with:
diff --git a/CLAUDE.md b/CLAUDE.md
index aa6fd50966..5fc7559c08 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -109,7 +109,7 @@ done
- **highcharts** - Interactive web charts, stock charts (requires license for commercial use)
- **lets-plot** - ggplot2 grammar of graphics by JetBrains, interactive
-**Core Principle**: Community proposes plot ideas via GitHub Issues → AI generates code → Multi-LLM quality checks → Deployed.
+**Core Principle**: Community proposes plot ideas via GitHub Issues → AI generates code → AI quality review → Deployed.
## Essential Commands
@@ -261,7 +261,7 @@ Example: `plots/scatter-basic/` contains everything for the basic scatter plot.
1. **Repository Pattern**: Data access layer in `core/repositories/`
2. **Async Everything**: FastAPI + SQLAlchemy async + asyncpg
-3. **Clean Repo**: Only production code in git. Quality reports → GitHub Issues. Preview images → GCS.
+3. **Clean Repo**: Only production code in git. Quality reports → `metadata/{library}.yaml`. Preview images → GCS.
4. **Issue-Based Workflow**: GitHub Issues as state machine for plot lifecycle
### Metadata System
@@ -399,8 +399,8 @@ gs://pyplots-images/
- **Plotting**: matplotlib, seaborn, plotly, bokeh, altair, plotnine, pygal, highcharts, lets-plot
- **Package Manager**: uv (fast Python installer)
- **Infrastructure**: Google Cloud Run, Cloud SQL, Cloud Storage
-- **Automation**: GitHub Actions (code workflows) + n8n Cloud (external services)
-- **AI**: Claude (code generation), Vertex AI (multi-LLM quality checks)
+- **Automation**: GitHub Actions
+- **AI**: Claude (code generation + quality review)
## Code Standards
@@ -475,7 +475,6 @@ uv run python -c "from core.database import is_db_configured; print(is_db_config
- Implementation code (full Python source)
- Implementation metadata (library, variant, quality score, generation info from metadata/*.yaml)
- GCS URLs for preview images
-- Social media promotion queue
**What's in Repository** (source of truth):
- Everything in `plots/{specification-id}/`:
@@ -486,7 +485,6 @@ uv run python -c "from core.database import is_db_configured; print(is_db_config
**What's NOT Stored in DB**:
- Preview images (in GCS)
-- Detailed quality reports (in GitHub Issues, summary in metadata)
**Migrations**: Managed with Alembic
```bash
@@ -511,8 +509,7 @@ The `prompts/` directory contains AI agent prompts for code generation, quality
| `plot-generator.md` | Base rules for all plot implementations |
| `library/*.md` | Library-specific rules (9 files) |
| `quality-criteria.md` | Definition of code/visual quality |
-| `quality-evaluator.md` | Multi-LLM evaluation prompt |
-| `auto-tagger.md` | Automatic tagging across 5 dimensions |
+| `quality-evaluator.md` | AI quality evaluation prompt |
| `spec-validator.md` | Validates plot request issues |
| `spec-id-generator.md` | Assigns unique spec IDs |
@@ -918,12 +915,12 @@ pytest --pdb # Debug on failure
## Key Documentation Files
-- **docs/development.md**: Development setup, testing, deployment
-- **docs/workflow.md**: Automation flows (Discovery → Deployment → Social)
-- **docs/specs-guide.md**: How to write plot specifications
-- **docs/architecture/repository.md**: Directory structure
-- **docs/architecture/api.md**: API endpoints reference
-- **docs/architecture/database.md**: Database schema
+- **docs/contributing.md**: How to add/improve specs and implementations
+- **docs/workflows/overview.md**: Automation flows and label system
+- **docs/concepts/vision.md**: Product vision
+- **docs/reference/repository.md**: Directory structure
+- **docs/reference/api.md**: API endpoints reference
+- **docs/reference/database.md**: Database schema
- **prompts/README.md**: AI agent prompt system
## Project Philosophy
@@ -932,5 +929,5 @@ pytest --pdb # Debug on failure
- **Spec improvements over code fixes**: If a plot has issues, improve the spec, not the code
- **Your data first**: Examples work with real user data, not fake data
- **Community-driven**: Anyone can propose plots via GitHub Issues
-- **Multi-LLM quality**: Claude + Gemini + GPT ensure quality (score ≥90 required)
-- **Full transparency**: All feedback documented in GitHub Issues, not hidden in repo files
+- **AI quality review**: Claude evaluates quality (≥90 merges immediately, <90 enters the repair loop, ≥50 is the minimum to merge after 3 attempts)
+- **Full transparency**: All quality feedback stored in repository (`metadata/{library}.yaml`)
diff --git a/README.md b/README.md
index 3b5d92a271..80515b6617 100644
--- a/README.md
+++ b/README.md
@@ -18,7 +18,7 @@
maintains plotting examples. Browse hundreds of plots across all major Python libraries - matplotlib, seaborn, plotly,
bokeh, altair, plotnine, pygal, highcharts, and lets-plot.
-**Community-driven, AI-maintained** - Propose plot ideas via GitHub Issues, AI generates the code, multi-LLM quality
+**Community-driven, AI-maintained** - Propose plot ideas via GitHub Issues, AI generates the code, automated quality
checks ensure excellence. Zero manual coding required.
---
@@ -29,37 +29,11 @@ checks ensure excellence. Zero manual coding required.
- **Compare libraries** - View matplotlib, seaborn, plotly side-by-side for the same plot
- **Always current** - AI agents continuously update examples with latest library versions
- **Natural language search** - Find plots by asking "show correlation between variables"
-- **Multi-LLM quality checks** - Claude + Gemini + GPT ensure every plot meets quality standards
+- **AI quality review** - Claude evaluates every plot against quality standards (score ≥ 50 required)
- **Open source** - Community proposes ideas via Issues, AI generates the code
---
-## Quick Start
-
-```bash
-# Clone repository
-git clone https://github.com/MarkusNeusinger/pyplots.git
-cd pyplots
-
-# Install dependencies with uv (fast!)
-curl -LsSf https://astral.sh/uv/install.sh | sh
-uv sync --all-extras
-
-# Database setup (optional - API works without DB in limited mode)
-cp .env.example .env
-# Edit .env with your DATABASE_URL
-
-# Run migrations
-uv run alembic upgrade head
-
-# Start backend
-uv run uvicorn api.main:app --reload
-
-# Visit http://localhost:8000/docs
-```
-
----
-
## Architecture
**Specification-first design**: Every plot starts as a Markdown spec (library-agnostic), then AI generates
@@ -81,9 +55,9 @@ plots/scatter-basic/
**Issue-based workflow**: GitHub Issues as state machine for plot lifecycle. Status tracked via live-updating table (no sub-issues). Each library generates in parallel, creating PRs to a feature branch.
-**AI quality review**: Claude evaluates generated plots (score ≥ 90 required). Automatic feedback loops (max 3 attempts per library). Quality scores flow via PR labels → per-library metadata files.
+**AI quality review**: Claude evaluates generated plots. Score ≥ 90 → immediate merge. Score < 90 → repair loop (max 3 attempts). After 3 attempts: ≥ 50 → merge, < 50 → failed.
-See [docs/architecture/](docs/architecture/) for details.
+See [docs/reference/](docs/reference/) for details.
---
@@ -97,9 +71,9 @@ See [docs/architecture/](docs/architecture/) for details.
**Infrastructure**: Google Cloud Run • Cloud SQL • Cloud Storage
-**Automation**: GitHub Actions • n8n Cloud Pro
+**Automation**: GitHub Actions
-**AI**: Claude (Code Max) • Vertex AI (Multi-LLM)
+**AI**: Claude (code generation + quality review)
---
@@ -115,31 +89,28 @@ Most plotting libraries are fully open source. Note these exceptions:
```
pyplots/
-├── plots/ # Plot-centric directories (spec + metadata + implementations)
-│ └── {spec-id}/
-│ ├── specification.md
-│ ├── specification.yaml
-│ ├── metadata/
-│ └── implementations/
+├── plots/ # Plot specs + metadata + implementations
├── prompts/ # AI agent prompts
-├── core/ # Shared business logic
├── api/ # FastAPI backend
-├── app/ # React frontend (Vite + MUI)
-├── tests/ # Test suite (pytest)
-└── docs/ # Documentation
+├── app/ # React frontend
+├── core/ # Shared business logic
+├── automation/ # Workflow scripts (sync, labels)
+├── tests/ # Test suite (unit, integration, e2e)
+├── alembic/ # Database migrations
+├── docs/ # Documentation
+└── .github/workflows/ # GitHub Actions
```
-**For detailed structure and file organization**, see [Repository Structure](docs/architecture/repository.md)
+**For details**, see [Repository Structure](docs/reference/repository.md)
---
## Documentation
-- **[Vision](docs/vision.md)** - Product vision and mission
-- **[Workflow](docs/workflow.md)** - Automation flows (Discovery → Deployment → Social Media)
-- **[Development](docs/development.md)** - Local setup, testing, deployment
-- **[Specs Guide](docs/specs-guide.md)** - How to write plot specifications
-- **[Architecture](docs/architecture/)** - API, database, repository structure
+- **[Vision](docs/concepts/vision.md)** - Product vision and mission
+- **[Contributing](docs/contributing.md)** - How to add/improve specs and implementations
+- **[Workflows](docs/workflows/overview.md)** - Automation flows and label system
+- **[Reference](docs/reference/)** - API, database, repository structure
---
@@ -160,32 +131,19 @@ We welcome contributions! **All code is AI-generated** - you propose ideas, AI i
2. AI generates spec, creates feature branch
3. Maintainer reviews and adds `approved` label
4. 9 library implementations generate in parallel (tracked via live status table)
-5. AI quality review per library (score ≥ 90 required)
+5. AI quality review per library (≥ 90 merges immediately, < 90 enters the repair loop, ≥ 50 required after 3 attempts)
6. Auto-merge to feature branch, then to main
**Important**: Don't submit code directly! If a plot has quality issues, it means the spec needs improvement, not the
code.
-See [development.md](docs/development.md) for details.
+See [contributing.md](docs/contributing.md) for details.
---
## Development
-```bash
-# Install dependencies (uv is a fast Python package installer)
-uv sync --all-extras
-
-# Run tests
-uv run pytest
-
-# Start backend
-uv run uvicorn api.main:app --reload
-```
-
-**For detailed development setup, testing, and code quality tools**, see [Development Guide](docs/development.md)
-
-**Python versions**: 3.10+ | **Coverage target**: 90%+
+See **[Development Guide](docs/development.md)** for local setup instructions.
---
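The review rule that the documentation changes above describe (score ≥ 90 merges immediately, lower scores enter a repair loop of at most 3 attempts, and after those attempts ≥ 50 merges while < 50 fails) can be sketched as below. This is a minimal sketch, not the workflow's actual code: the function name, the constant names, and the reading of "after 3 attempts" as "three repair attempts already made" are assumptions for illustration.

```python
# Thresholds as stated in the updated docs; all names here are hypothetical.
INSTANT_MERGE_SCORE = 90   # score >= 90 merges immediately
FINAL_MERGE_SCORE = 50     # minimum score to merge once repairs are exhausted
MAX_REPAIR_ATTEMPTS = 3    # repair loop runs at most 3 times


def review_decision(score: int, repair_attempts: int) -> str:
    """Return 'merge', 'repair', or 'failed' for one review round.

    repair_attempts is the number of repair attempts already made
    (an assumed interpretation of "after 3 attempts").
    """
    if score >= INSTANT_MERGE_SCORE:
        return "merge"
    if repair_attempts < MAX_REPAIR_ATTEMPTS:
        return "repair"
    # Repairs exhausted: fall back to the lower final threshold.
    return "merge" if score >= FINAL_MERGE_SCORE else "failed"
```

Under these assumptions, a first review scoring 85 returns `"repair"`, while the same score after three repair attempts returns `"merge"` because it clears the final threshold of 50.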
diff --git a/docs/architecture/api.md b/docs/architecture/api.md
deleted file mode 100644
index 7967af4ccd..0000000000
--- a/docs/architecture/api.md
+++ /dev/null
@@ -1,786 +0,0 @@
-# 🔌 API Specification
-
-## Overview
-
-The pyplots API is a **FastAPI-based REST API** that serves as the central data access layer for all components: frontend, n8n workflows, and GitHub Actions.
-
-**Base URL**: `https://api.pyplots.ai`
-
-**Key Principle**: All database access goes through the API - no direct database connections from frontend or automation tools.
-
----
-
-## Authentication
-
-### Public Endpoints
-
-No authentication required:
-- Browse plots
-- View specs
-- Search
-
-### Authenticated Endpoints
-
-API key required (header):
-```http
-Authorization: Bearer {api_key}
-```
-
-Used for:
-- User data upload
-- Plot generation with custom data
-- Internal automation endpoints
-
----
-
-## Core Endpoints
-
-### 1. Specs
-
-#### GET `/specs`
-
-**Purpose**: List all plot specifications
-
-**Query Parameters**:
-- `tags` (optional): Comma-separated tags to filter by
-- `search` (optional): Search in title and description
-- `limit` (optional): Number of results (default: 50, max: 100)
-- `offset` (optional): Pagination offset
-
-**Response**:
-```json
-{
- "specs": [
- {
- "id": "scatter-basic-001",
- "title": "Basic 2D Scatter Plot",
- "description": "Create a simple scatter plot...",
- "tags": ["correlation", "bivariate", "basic"],
- "implementation_count": 3,
- "best_quality_score": 92.0,
- "created_at": "2025-01-15T10:00:00Z"
- }
- ],
- "total": 42,
- "limit": 50,
- "offset": 0
-}
-```
-
-**Example**:
-```bash
-GET /specs?tags=correlation,finance&limit=10
-```
-
----
-
-#### GET `/specs/{spec_id}`
-
-**Purpose**: Get detailed information about a specific spec
-
-**Response**:
-```json
-{
- "id": "scatter-basic-001",
- "title": "Basic 2D Scatter Plot",
- "description": "Create a simple scatter plot...",
- "data_requirements": [
- {
- "name": "x",
- "type": "numeric",
- "description": "X-axis values"
- },
- {
- "name": "y",
- "type": "numeric",
- "description": "Y-axis values"
- }
- ],
- "optional_params": [
- {
- "name": "color",
- "type": "string|column",
- "default": null,
- "description": "Point color or column for mapping"
- },
- {
- "name": "alpha",
- "type": "float",
- "default": 0.8,
- "description": "Transparency (0-1)"
- }
- ],
- "tags": ["correlation", "bivariate", "basic"],
- "use_cases": [
- "Correlation analysis",
- "Outlier detection"
- ],
- "implementations": [
- {
- "library": "matplotlib",
- "variant": "default",
- "quality_score": 92.0,
- "preview_url": "https://storage.googleapis.com/...",
- "python_version": "3.10+"
- },
- {
- "library": "seaborn",
- "variant": "default",
- "quality_score": 90.0,
- "preview_url": "https://storage.googleapis.com/...",
- "python_version": "3.10+"
- }
- ],
- "created_at": "2025-01-15T10:00:00Z",
- "updated_at": "2025-01-16T14:30:00Z"
-}
-```
-
----
-
-#### GET `/specs/{spec_id}/markdown`
-
-**Purpose**: Get the original spec as Markdown
-
-**Response**:
-```markdown
-# scatter-basic-001: Basic 2D Scatter Plot
-
-## Description
-
-Create a simple scatter plot showing the relationship...
-
-## Data Requirements
-
-- **x**: Numeric values for x-axis
-- **y**: Numeric values for y-axis
-
-...
-```
-
-**Content-Type**: `text/markdown`
-
----
-
-### 2. Implementations
-
-#### GET `/specs/{spec_id}/implementations`
-
-**Purpose**: Get all implementations for a spec
-
-**Query Parameters**:
-- `library` (optional): Filter by library (matplotlib, seaborn, etc.)
-- `variant` (optional): Filter by variant (default, ggplot_style, etc.)
-
-**Response**:
-```json
-{
- "spec_id": "scatter-basic-001",
- "implementations": [
- {
- "id": "550e8400-e29b-41d4-a716-446655440000",
- "library": "matplotlib",
- "library_name": "Matplotlib",
- "plot_function": "scatter",
- "variant": "default",
- "quality_score": 92.0,
- "preview_url": "https://storage.googleapis.com/...",
- "python_version": "3.10+",
- "tested": true,
- "created_at": "2025-01-15T10:30:00Z"
- }
- ]
-}
-```
-
----
-
-#### GET `/specs/{spec_id}/implementations/{library}/{variant}/code`
-
-**Purpose**: Get the implementation code
-
-**Response**:
-```python
-import matplotlib.pyplot as plt
-import pandas as pd
-
-
-def create_plot(data: pd.DataFrame, x: str, y: str, **kwargs):
-    """
-    Implementation for scatter-basic-001 using matplotlib
-
-    Args:
-        data: Input DataFrame
-        x: Column name for x-axis
-        y: Column name for y-axis
-        **kwargs: Additional parameters (color, size, alpha, etc.)
-
-    Returns:
-        matplotlib Figure object
-    """
-    fig, ax = plt.subplots(figsize=(10, 6))
-
-    ax.scatter(data[x], data[y], **kwargs)
-    ax.set_xlabel(x)
-    ax.set_ylabel(y)
-    ax.grid(True, alpha=0.3)
-
-    return fig
-```
-
-**Content-Type**: `text/x-python`
-
----
-
-### 3. Plot Generation
-
-#### POST `/plots/generate`
-
-**Purpose**: Generate plot with user's data
-
-**Authentication**: Required (API key)
-
-**Request**:
-```json
-{
- "spec_id": "scatter-basic-001",
- "library": "matplotlib",
- "variant": "default",
- "data": {
- "x": [1, 2, 3, 4, 5],
- "y": [2, 4, 6, 8, 10]
- },
- "params": {
- "color": "blue",
- "alpha": 0.8,
- "title": "My Scatter Plot"
- }
-}
-```
-
-**Alternative (CSV upload)**:
-```http
-POST /plots/generate
-Content-Type: multipart/form-data
-
-spec_id=scatter-basic-001
-library=matplotlib
-variant=default
-x=column1
-y=column2
-file={csv_file}
-```
-
-**Response**:
-```json
-{
- "image_url": "https://storage.googleapis.com/pyplots-images/generated/{session_id}/{plot_id}.png",
- "code": "import matplotlib.pyplot as plt\nimport pandas as pd\n\n...",
- "expires_at": "2025-01-19T10:00:00Z"
-}
-```
-
-**Notes**:
-- Image auto-deleted after 24 hours
-- No user data stored permanently
-- Maximum data size: 10 MB
-
----
-
-### 4. Libraries
-
-#### GET `/libraries`
-
-**Purpose**: List all supported plotting libraries
-
-**Response**:
-```json
-{
- "libraries": [
- {
- "id": "matplotlib",
- "name": "Matplotlib",
- "version": "3.8.0",
- "documentation_url": "https://matplotlib.org",
- "implementation_count": 42,
- "active": true
- },
- {
- "id": "seaborn",
- "name": "Seaborn",
- "version": "0.13.0",
- "documentation_url": "https://seaborn.pydata.org",
- "implementation_count": 38,
- "active": true
- }
- ]
-}
-```
-
----
-
-### 5. Search & Discovery
-
-#### GET `/search`
-
-**Purpose**: Full-text search across specs
-
-**Query Parameters**:
-- `q`: Search query
-- `tags` (optional): Filter by tags
-- `libraries` (optional): Filter by available libraries
-- `limit`: Results limit (default: 20)
-
-**Response**:
-```json
-{
- "results": [
- {
- "spec_id": "scatter-basic-001",
- "title": "Basic 2D Scatter Plot",
- "description": "Create a simple scatter plot...",
- "relevance_score": 0.95,
- "tags": ["correlation", "bivariate"],
- "preview_url": "https://storage.googleapis.com/..."
- }
- ],
- "total": 5,
- "query": "correlation analysis"
-}
-```
-
----
-
-#### GET `/tags`
-
-**Purpose**: Get all available tags with counts
-
-**Response**:
-```json
-{
- "tags": [
- {
- "tag": "correlation",
- "count": 15,
- "confidence": 1.0
- },
- {
- "tag": "finance",
- "count": 8,
- "confidence": 0.95
- }
- ]
-}
-```
-
----
-
-#### GET `/similar/{spec_id}`
-
-**Purpose**: Find similar plots (based on tags and description)
-
-**Response**:
-```json
-{
- "spec_id": "scatter-basic-001",
- "similar": [
- {
- "spec_id": "scatter-advanced-005",
- "similarity_score": 0.85,
- "title": "Advanced Scatter Plot with Regression",
- "preview_url": "https://storage.googleapis.com/..."
- }
- ]
-}
-```
-
----
-
-## Internal/Automation Endpoints
-
-### 6. Deployment Management
-
-#### POST `/internal/sync-from-repo`
-
-**Purpose**: Sync metadata from repository to database
-
-**Authentication**: Service account only
-
-**Request**:
-```json
-{
- "trigger": "deployment"
-}
-```
-
-**Response**:
-```json
-{
- "synced": {
- "specs": 5,
- "implementations": 15
- },
- "errors": []
-}
-```
-
-**Usage**: Called by GitHub Actions after deployment
-
----
-
-#### POST `/internal/specs/{spec_id}/deployed`
-
-**Purpose**: Mark spec as deployed and add to promotion queue
-
-**Authentication**: Service account only
-
-**Request**:
-```json
-{
- "quality_score": 92.0,
- "preview_url": "https://storage.googleapis.com/..."
-}
-```
-
-**Response**:
-```json
-{
- "status": "deployed",
- "added_to_promotion_queue": true
-}
-```
-
----
-
-### 7. Promotion Queue
-
-#### GET `/internal/promotion-queue`
-
-**Purpose**: Get next item from promotion queue
-
-**Authentication**: Service account (n8n)
-
-**Query Parameters**:
-- `limit`: Number of items (default: 1)
-- `platform`: Filter by platform (twitter, linkedin, etc.)
-
-**Response**:
-```json
-{
- "items": [
- {
- "id": "660e8400-e29b-41d4-a716-446655440000",
- "spec_id": "scatter-basic-001",
- "title": "Basic 2D Scatter Plot",
- "quality_score": 92.0,
- "preview_url": "https://storage.googleapis.com/...",
- "platform": "twitter",
- "priority": "high"
- }
- ],
- "daily_count": 1,
- "limit_reached": false
-}
-```
-
----
-
-#### POST `/internal/promotion-queue/{id}/mark-posted`
-
-**Purpose**: Mark promotion as posted
-
-**Authentication**: Service account (n8n)
-
-**Request**:
-```json
-{
- "platform": "twitter",
- "post_url": "https://twitter.com/pyplots/status/123456789"
-}
-```
-
-**Response**:
-```json
-{
- "status": "posted",
- "posted_at": "2025-01-18T15:00:00Z"
-}
-```
-
----
-
-## Error Responses
-
-### Standard Error Format
-
-```json
-{
- "error": {
- "code": "VALIDATION_ERROR",
- "message": "Invalid spec_id format",
- "details": {
- "field": "spec_id",
- "expected": "Format: {type}-{variant}-{number}"
- }
- }
-}
-```
-
-### Error Codes
-
-| Code | HTTP Status | Description |
-|------|-------------|-------------|
-| `VALIDATION_ERROR` | 400 | Invalid request parameters |
-| `NOT_FOUND` | 404 | Resource not found |
-| `UNAUTHORIZED` | 401 | Missing or invalid API key |
-| `RATE_LIMIT_EXCEEDED` | 429 | Too many requests |
-| `SERVER_ERROR` | 500 | Internal server error |
-| `DATA_TOO_LARGE` | 413 | Uploaded data exceeds 10 MB |
-| `GENERATION_FAILED` | 500 | Plot generation failed |
-
----
-
-## Rate Limiting
-
-### Public Endpoints
-
-- 100 requests per minute per IP
-- 1000 requests per hour per IP
-
-### Authenticated Endpoints
-
-- 1000 requests per minute per API key
-- 10000 requests per hour per API key
-
-### Headers
-
-Response includes rate limit headers:
-```http
-X-RateLimit-Limit: 100
-X-RateLimit-Remaining: 95
-X-RateLimit-Reset: 1705680000
-```
-
----
-
-## CORS Configuration
-
-### Allowed Origins
-
-```python
-# Development
-CORS_ORIGINS = [
- "http://localhost:3000",
- "http://127.0.0.1:3000"
-]
-
-# Production
-CORS_ORIGINS = [
- "https://pyplots.ai",
- "https://www.pyplots.ai"
-]
-```
-
-### Allowed Methods
-
-```
-GET, POST, OPTIONS
-```
-
----
-
-## Caching
-
-### Response Caching
-
-Public endpoints cached with appropriate headers:
-
-```http
-Cache-Control: public, max-age=3600
-ETag: "abc123"
-```
-
-### Cache Invalidation
-
-- Specs: Invalidate on deployment
-- Implementations: Invalidate on update
-- Libraries: Invalidate on version change
-
----
-
-## Request/Response Examples
-
-### Browse Plots with Filtering
-
-```bash
-curl "https://api.pyplots.ai/specs?tags=correlation,finance&limit=5"
-```
-
-### Get Specific Spec
-
-```bash
-curl "https://api.pyplots.ai/specs/scatter-basic-001"
-```
-
-### Get Implementation Code
-
-```bash
-curl "https://api.pyplots.ai/specs/scatter-basic-001/implementations/matplotlib/default/code"
-```
-
-### Generate Plot with User Data
-
-```bash
-curl -X POST "https://api.pyplots.ai/plots/generate" \
- -H "Authorization: Bearer YOUR_API_KEY" \
- -H "Content-Type: application/json" \
- -d '{
- "spec_id": "scatter-basic-001",
- "library": "matplotlib",
- "data": {
- "x": [1, 2, 3, 4, 5],
- "y": [2, 4, 6, 8, 10]
- },
- "params": {
- "color": "blue",
- "title": "My Data"
- }
- }'
-```
-
----
-
-## Client SDKs
-
-### Python Client
-
-```python
-from pyplots import Client
-
-client = Client(api_key="YOUR_API_KEY")
-
-# Browse specs
-specs = client.specs.list(tags=["correlation"])
-
-# Get spec details
-spec = client.specs.get("scatter-basic-001")
-
-# Generate plot
-plot = client.plots.generate(
- spec_id="scatter-basic-001",
- library="matplotlib",
- data={"x": [1, 2, 3], "y": [2, 4, 6]}
-)
-
-# Download image
-plot.save("output.png")
-
-# Get code
-print(plot.code)
-```
-
-### JavaScript Client
-
-```javascript
-import { PyplotsClient } from '@pyplots/client';
-
-const client = new PyplotsClient({ apiKey: 'YOUR_API_KEY' });
-
-// Browse specs
-const specs = await client.specs.list({ tags: ['correlation'] });
-
-// Get spec details
-const spec = await client.specs.get('scatter-basic-001');
-
-// Generate plot
-const plot = await client.plots.generate({
- specId: 'scatter-basic-001',
- library: 'matplotlib',
- data: { x: [1, 2, 3], y: [2, 4, 6] }
-});
-
-// Get image URL
-console.log(plot.imageUrl);
-```
-
----
-
-## API Versioning
-
-### Current Version
-
-API version: `v1` (implicit in URLs)
-
-### Future Versioning
-
-When breaking changes are needed:
-- `/v2/specs` (new version)
-- `/specs` (alias to latest)
-- Old versions deprecated with 6-month notice
-
----
-
-## Health & Status
-
-### GET `/health`
-
-**Purpose**: Health check
-
-**Response**:
-```json
-{
- "status": "healthy",
- "version": "1.0.0",
- "database": "connected",
- "storage": "accessible"
-}
-```
-
-### GET `/status`
-
-**Purpose**: System status
-
-**Response**:
-```json
-{
- "api": "operational",
- "database": "operational",
- "storage": "operational",
- "stats": {
- "total_specs": 42,
- "total_implementations": 126,
- "active_libraries": 3
- }
-}
-```
-
----
-
-## Security
-
-### Input Validation
-
-- All inputs validated with Pydantic
-- SQL injection prevention (SQLAlchemy ORM)
-- File upload size limits (10 MB)
-- Allowed file types: CSV, Excel, JSON
-
-### Sandboxed Execution
-
-Plot generation runs in a sandboxed environment:
-- Import whitelist (pandas, numpy, matplotlib, etc.)
-- Time limit: 30 seconds
-- Memory limit: 512 MB
-- No file system access
-
-### Data Privacy
-
-- User data never stored permanently
-- Generated plots deleted after 24 hours
-- No tracking of data content
-- Anonymous session IDs only
-
----
-
-*For database schema, see [database.md](./database.md)*
-*For automation workflows, see `.github/workflows/`*
diff --git a/docs/concepts/ab-testing-rules.md b/docs/concepts/ab-testing-rules.md
deleted file mode 100644
index bd867ff901..0000000000
--- a/docs/concepts/ab-testing-rules.md
+++ /dev/null
@@ -1,798 +0,0 @@
-# 🧪 A/B Testing Rules: Comparison Strategies
-
-## Overview
-
-When you create a new version of generation or evaluation rules, you need to **scientifically prove** it's better than the current version before deploying it. This document explores different strategies for A/B testing rule versions.
-
-## The Core Challenge
-
-**Problem**: You have two rule versions and need to answer:
-- Is the new version objectively better?
-- For which metrics? (quality score, generation time, pass rate)
-- By how much? (statistical significance)
-- For all plot types or just some?
-
-**Requirements**:
-- Compare same specs with both rule versions
-- Objective metrics (not subjective)
-- Statistical validity (enough samples)
-- Visual comparison (side-by-side images)
-- Cost-conscious (minimize AI API calls)
-
----
-
-## Approach 1: Parallel Generation
-
-### Concept
-
-Generate plots **simultaneously** with both rule versions and compare results.
-
-```
-Spec: scatter-basic-001
- │
- ├─→ Generate with v1.0.0 → plot_v1.png + metrics_v1
- │
- └─→ Generate with v2.0.0 → plot_v2.png + metrics_v2
-
-Compare: metrics_v1 vs metrics_v2
-```
-
-### Workflow
-
-```mermaid
-graph LR
- A[Test Specs] --> B[Generate with v1.0.0]
- A --> C[Generate with v2.0.0]
- B --> D[Collect Metrics]
- C --> D
- D --> E[Statistical Comparison]
- E --> F[Report with Visuals]
-```
-
-### Implementation
-
-```bash
-# Command-line tool
-python automation/testing/ab_parallel.py \
- --baseline v1.0.0 \
- --candidate v2.0.0 \
- --specs scatter-basic-001,heatmap-corr-002,bar-grouped-004 \
- --runs 10 \
- --output comparison-report.html
-```
-
-### Pros
-
-✅ **Fair comparison**: Both versions tested under identical conditions
-✅ **No bias**: Same timestamp, same LLM state, same randomness
-✅ **Fast results**: Get answers quickly
-✅ **Easy to automate**: Can run in CI/CD
-
-### Cons
-
-❌ **Expensive**: Doubles AI API costs (generate everything twice)
-❌ **Requires both implementations**: Need v1 and v2 automation code
-❌ **Resource intensive**: Doubles compute and storage
-
-### Best For
-
-- Final validation before deploying new rules
-- Small test sets (5-10 specs)
-- Critical decisions (major version bumps)
-- When budget allows doubling AI costs
-
----
-
-## Approach 2: Historical Comparison
-
-### Concept
-
-Generate plots with **new** version only, compare metrics against **historical** results from the old version.
-
-```
-Spec: scatter-basic-001
- │
- └─→ Generate with v2.0.0 → plot_v2.png + metrics_v2
-
-Database: metrics_v1 (from past generations with v1.0.0)
-
-Compare: metrics_v2 vs historical metrics_v1
-```
-
-### Workflow
-
-```mermaid
-graph LR
- A[Test Specs] --> B[Generate with v2.0.0]
- B --> C[Collect Metrics]
- D[Database: v1.0.0 Results] --> E[Historical Metrics]
- C --> F[Compare vs History]
- E --> F
- F --> G[Report]
-```
-
-### Implementation
-
-```bash
-# Generate with new version
-python automation/testing/ab_historical.py \
- --candidate v2.0.0 \
- --baseline-from-db v1.0.0 \
- --specs scatter-basic-001,heatmap-corr-002 \
- --output comparison-report.html
-```
-
-### Database Query
-
-```sql
--- Get historical metrics for v1.0.0
-SELECT
- spec_id,
- AVG(quality_score) as avg_score,
- AVG(generation_time_seconds) as avg_time,
- COUNT(*) as sample_size
-FROM implementations
-WHERE generation_ruleset_version = 'v1.0.0'
- AND spec_id IN ('scatter-basic-001', 'heatmap-corr-002')
-GROUP BY spec_id;
-```
-
-### Pros
-
-✅ **Cost-effective**: Only generate with new version
-✅ **Fast**: No need to regenerate with old version
-✅ **Scalable**: Can compare against large historical dataset
-✅ **Continuous**: Always comparing against production baseline
-
-### Cons
-
-❌ **Timing bias**: Old results were produced at a different time, possibly with a different LLM version
-❌ **Context drift**: Libraries may have updated between v1 and v2
-❌ **Sample variance**: Historical data may be noisy
-❌ **No visual comparison**: Can't show side-by-side images (old images may not exist)
-
-### Best For
-
-- Quick preliminary checks
-- Large-scale comparisons (100+ specs)
-- Continuous monitoring
-- When budget is tight
-- Minor version updates (low risk)
-
----
-
-## Approach 3: Staged Rollout
-
-### Concept
-
-Deploy new version to a **small percentage** of plots first, monitor performance, gradually increase.
-
-```
-Day 1: 10% of new plots use v2.0.0, 90% use v1.0.0
- Monitor metrics for 24 hours
-
-Day 2: If good → 25% use v2.0.0
- If bad → Rollback to 0%
-
-Day 3: 50% → 75% → 100%
-```
-
-### Workflow
-
-```mermaid
-graph TD
- A[Deploy v2.0.0] --> B[10% Traffic]
- B --> C{Metrics OK?}
- C -->|Yes| D[Increase to 25%]
- C -->|No| E[Rollback to 0%]
- D --> F{Metrics OK?}
- F -->|Yes| G[Increase to 50%]
- F -->|No| E
- G --> H[Eventually 100%]
-```
-
-### Implementation
-
-```python
-# automation/rollout/canary.py
-import hashlib
-
-
-class CanaryRollout:
-    def select_rule_version(self, spec_id: str) -> str:
-        """
-        Returns which rule version to use for this generation
-
-        Uses deterministic hashing to ensure:
-        - Same spec always gets same version (during rollout)
-        - Percentage split is approximately accurate
-        """
-        # get_current_rollout_percentage() is a conceptual helper, not defined here
-        rollout_percentage = get_current_rollout_percentage()
-
-        # Hash spec_id to get a stable bucket in [0, 100)
-        hash_value = int(hashlib.md5(spec_id.encode()).hexdigest(), 16)
-        bucket = hash_value % 100
-
-        if bucket < rollout_percentage:
-            return "v2.0.0"  # New version
-        else:
-            return "v1.0.0"  # Current stable version
-```
-
-### Monitoring
-
-```bash
-# Real-time monitoring dashboard
-python automation/rollout/monitor.py \
- --new-version v2.0.0 \
- --baseline v1.0.0 \
- --metrics quality_score,generation_time,pass_rate \
- --window 24h \
- --auto-rollback-threshold -5%
-```
-
-### Pros
-
-✅ **Safe**: Limits blast radius if new version has issues
-✅ **Real production data**: Testing with actual usage patterns
-✅ **Gradual**: Can abort anytime
-✅ **Continuous feedback**: Real-time metrics
-✅ **Cost-effective**: Not duplicating work
-
-### Cons
-
-❌ **Slow**: Takes days to fully roll out
-❌ **Requires monitoring**: Someone needs to watch metrics
-❌ **Mixed state**: System has two versions running simultaneously
-❌ **Rollback complexity**: Need to invalidate cached results
-
-### Best For
-
-- Major version changes (high risk)
-- Production deployments
-- When you have time (not urgent)
-- When you have monitoring infrastructure
-- Large-scale systems with many users
-
----
-
-## Approach 4: Hybrid (Recommended)
-
-### Concept
-
-Combine multiple approaches for **balance of speed, cost, and confidence**.
-
-```
-Phase 1: Historical Comparison (Quick & Cheap)
- ↓ If promising
-Phase 2: Parallel Generation on Small Set (5 specs)
- ↓ If good
-Phase 3: Staged Rollout (10% → 50% → 100%)
-```
-
-### Workflow
-
-```mermaid
-graph TD
- A[New Rule v2.0.0] --> B[Phase 1: Historical]
- B --> C{Promising?}
- C -->|No| Z[Reject v2.0.0]
- C -->|Yes| D[Phase 2: Parallel 5 specs]
- D --> E{Quality OK?}
- E -->|No| Z
- E -->|Yes| F[Phase 3: Canary 10%]
- F --> G{Monitor 24h}
- G -->|Issues| H[Rollback]
- G -->|Good| I[Increase to 50%]
- I --> J[Monitor 24h]
- J -->|Good| K[Deploy 100%]
-```
-
-### Decision Tree
-
-```
-┌─────────────────────────────────────────────────┐
-│ Phase 1: Historical Comparison │
-│ Cost: Low | Speed: Fast | Confidence: Medium │
-│ │
-│ Question: Is new version likely better? │
-│ Criteria: avg(score_v2) > avg(score_v1) + 2% │
-└─────────────────────────────────────────────────┘
- │
- ├─ NO: STOP (reject v2.0.0)
- │
- └─ YES: Continue
- ↓
-┌─────────────────────────────────────────────────┐
-│ Phase 2: Parallel on Small Set │
-│ Cost: Medium | Speed: Medium | Confidence: High │
-│ │
-│ Question: Is quality consistently better? │
-│ Criteria: - No regressions on critical metrics │
-│ - Visual quality equal or better │
-│ - Statistical significance (p < 0.05) │
-└─────────────────────────────────────────────────┘
- │
- ├─ NO: STOP (need more refinement)
- │
- └─ YES: Deploy with canary
- ↓
-┌─────────────────────────────────────────────────┐
-│ Phase 3: Staged Rollout │
-│ Cost: Low | Speed: Slow | Confidence: Very High │
-│ │
-│ 10% → 24h monitor → 50% → 24h → 100% │
-│ │
-│ Auto-rollback if: - Quality drops > 5% │
-│ - Failure rate > 10% │
-│ - Generation time > 2x │
-└─────────────────────────────────────────────────┘
-```
-
-### Implementation
-
-```bash
-# Automated multi-phase testing
-python automation/testing/ab_hybrid.py \
- --baseline v1.0.0 \
- --candidate v2.0.0 \
- --test-specs standard_test_set.txt \
- --auto-progress \
- --output hybrid-test-report.html
-
-# Output:
-# ✓ Phase 1 (Historical): +3.2% quality improvement → PASS
-# ✓ Phase 2 (Parallel): 4/5 specs improved → PASS
-# → Triggering Phase 3 (Canary 10%)
-# → Will auto-increase after 24h if metrics stable
-```
-
-### Pros
-
-✅ **Balanced cost**: Expensive tests only if cheap tests pass
-✅ **Fast feedback**: Know quickly if worth pursuing
-✅ **High confidence**: Multiple validation layers
-✅ **Safe**: Gradual rollout limits risk
-✅ **Efficient**: Don't waste resources on bad versions
-
-### Cons
-
-❌ **Complex**: More moving parts
-❌ **Longer total time**: Three phases take longer than one
-❌ **Requires automation**: Manual process would be tedious
-
-### Best For
-
-- **Most scenarios** (recommended default)
-- Production systems
-- When you want confidence without excessive cost
-- Continuous improvement workflow
-
----
-
-## Metrics to Compare
-
-### 1. Quality Score
-```python
-{
- "metric": "quality_score",
- "v1_mean": 87.3,
- "v2_mean": 91.2,
- "improvement": "+3.9%",
- "p_value": 0.003, # Statistically significant
- "verdict": "BETTER"
-}
-```
-
-### 2. Pass Rate
-```python
-{
- "metric": "pass_rate",
- "v1": 0.87, # 87% passed
- "v2": 0.94, # 94% passed
- "improvement": "+7%",
- "verdict": "BETTER"
-}
-```
-
-### 3. Generation Time
-```python
-{
- "metric": "generation_time_seconds",
- "v1_p50": 12.3,
- "v2_p50": 15.1,
- "change": "+22.8%",
- "verdict": "WORSE (slower)"
-}
-```
-
-### 4. Attempt Distribution
-```python
-{
- "metric": "attempts_to_pass",
- "v1": {"1": 0.60, "2": 0.27, "3": 0.13}, # 60% pass on first try
- "v2": {"1": 0.75, "2": 0.20, "3": 0.05}, # 75% pass on first try
- "verdict": "BETTER (fewer retries)"
-}
-```
-
-### 5. LLM Agreement (Multi-LLM only)
-```python
-{
- "metric": "llm_agreement",
- "v1": 0.78, # 78% agreement between Claude/Gemini/GPT
- "v2": 0.89, # 89% agreement
- "verdict": "BETTER (more consistent criteria)"
-}
-```
-
----
-
-## Comparison Report Format
-
-### HTML Report Structure
-
-```html
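-<!DOCTYPE html>
-<html>
-<head>
-  <title>Rule A/B Test: v1.0.0 vs v2.0.0</title>
-</head>
-<body>
-  <h1>A/B Test Results</h1>
-
-  <section>
-    <h2>Summary</h2>
-    <table>
-      <tr>
-        <th>Metric</th>
-        <th>v1.0.0</th>
-        <th>v2.0.0</th>
-        <th>Change</th>
-        <th>Verdict</th>
-      </tr>
-      <tr>
-        <td>Quality Score</td>
-        <td>87.3</td>
-        <td>91.2</td>
-        <td>+3.9%</td>
-        <td>✓ BETTER</td>
-      </tr>
-    </table>
-  </section>
-
-  <section>
-    <h2>Per-Spec Results</h2>
-    <div>
-      <h3>scatter-basic-001</h3>
-      <div>
-        <h4>v1.0.0 (score: 88)</h4>
-      </div>
-      <div>
-        <h4>v2.0.0 (score: 93)</h4>
-      </div>
-      <p>
-        Improvement: Font size increased, grid more subtle,
-        colorblind-safe palette applied.
-      </p>
-    </div>
-  </section>
-
-  <section>
-    <h2>Statistical Significance</h2>
-    <p>T-test: p-value = 0.003 (p &lt; 0.05, significant)</p>
-    <p>Effect size: Cohen's d = 0.72 (medium effect)</p>
-    <p>Sample size: 10 specs × 5 runs = 50 samples per version</p>
-  </section>
-
-  <section>
-    <h2>Recommendation</h2>
-    <p>✓ DEPLOY v2.0.0</p>
-    <p>Rationale:</p>
-    <ul>
-      <li>Significant quality improvement (+3.9%, p=0.003)</li>
-      <li>Better pass rate (+7%)</li>
-      <li>No critical regressions</li>
-      <li>Visual quality consistently better</li>
-    </ul>
-    <p>Suggested rollout: Canary 10% → 50% → 100%</p>
-  </section>
-</body>
-</html>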
-```
-
----
-
-## Sample Sizes & Statistical Power
-
-### How Many Specs to Test?
-
-**Rule of Thumb**:
-- **Quick check**: 3-5 specs (low confidence, good enough for draft)
-- **Standard test**: 10-15 specs (medium confidence, good for minor versions)
-- **Rigorous test**: 20-30 specs (high confidence, required for major versions)
-
-### How Many Runs per Spec?
-
-```python
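-# Statistical power calculation (two-sample comparison, normal approximation)
-from math import ceil
-from statistics import NormalDist
-
-
-def required_sample_size(
-    expected_improvement: float = 0.05,  # 5% improvement
-    significance_level: float = 0.05,    # p < 0.05
-    power: float = 0.80,                 # 80% power
-    std_dev: float = 0.10,               # ASSUMPTION: spread of scores, same units as the improvement
-) -> int:
-    """
-    Returns: Number of samples needed per version
-    (divide by the number of specs to get runs per spec per version)
-
-    Example:
-    - To detect 5% improvement (assuming a standard deviation of 10%)
-    - With 95% confidence (p < 0.05)
-    - And 80% chance of detecting if it exists
-    → Need ~64 samples per version
-    → For 10 specs: 6-7 runs per spec per version
-    """
-    effect_size = expected_improvement / std_dev  # Cohen's d
-    z_alpha = NormalDist().inv_cdf(1 - significance_level / 2)  # two-sided test
-    z_power = NormalDist().inv_cdf(power)
-    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)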
-```
-
-**Practical Guide**:
-- **Budget unlimited**: 10 runs per spec per version
-- **Budget medium**: 5 runs per spec per version
-- **Budget tight**: 3 runs per spec per version
-- **Quick check**: 1 run per spec per version (not statistically valid, just a sanity check)
-
----
-
-## Cost Estimation
-
-### Parallel Generation (Approach 1)
-
-```python
-# Assumptions
-specs = 10
-runs = 5
-cost_per_generation = 0.10  # USD per generation (Claude API)
-
-# Cost calculation
-total_generations = specs * runs * 2  # ×2 for both versions
-total_cost = total_generations * cost_per_generation
-
-# = 10 × 5 × 2 × $0.10 = $10.00
-```
-
-### Historical Comparison (Approach 2)
-
-```python
-# Only generate with new version
-# (specs, runs, and cost_per_generation as defined for Approach 1)
-total_generations = specs * runs
-total_cost = total_generations * cost_per_generation
-
-# = 10 × 5 × $0.10 = $5.00 (50% cheaper)
-```
-
-### Hybrid Approach (Approach 4)
-
-```python
-# Phase 1: Historical (free, uses existing data)
-phase1_cost = 0.00
-
-# Phase 2: Parallel on 5 specs, 5 runs each, at $0.10 per generation
-phase2_cost = 5 * 5 * 2 * 0.10  # = $5.00
-
-# Phase 3: Canary (spreads over time, no extra cost)
-phase3_cost = 0.00
-
-total_cost = phase1_cost + phase2_cost + phase3_cost
-# Total: $5.00 (same as Approach 2, but higher confidence)
-```
-
----
-
-## Automation Scripts (Conceptual)
-
-### Quick Start
-
-```bash
-# Install dependencies
-pip install pyplots-testing
-
-# Run standard A/B test (hybrid approach)
-pyplots-ab-test \
- --baseline v1.0.0 \
- --candidate v2.0.0 \
- --output report.html
-
-# Output:
-# ✓ Phase 1: Historical check PASSED
-# ✓ Phase 2: Parallel test PASSED
-# → Starting Phase 3: Canary rollout
-```
-
-### Custom Test
-
-```python
-# automation/testing/custom_ab_test.py
-from pyplots.testing import ABTest
-
-# Configure test
-test = ABTest(
- baseline_version="v1.0.0",
- candidate_version="v2.0.0",
- approach="hybrid"
-)
-
-# Add test specs
-test.add_specs([
- "scatter-basic-001",
- "heatmap-corr-002",
- "bar-grouped-004"
-])
-
-# Configure metrics
-test.track_metrics([
- "quality_score",
- "pass_rate",
- "generation_time",
- "attempts_to_pass"
-])
-
-# Run test
-results = test.run()
-
-# Generate report
-test.generate_report(
- output="comparison-report.html",
- include_visuals=True
-)
-
-# Decision
-if results.recommend_deployment():
- print("✓ Deploy v2.0.0")
- test.trigger_canary_rollout()
-else:
- print("✗ Keep v1.0.0")
- print(f"Reason: {results.rejection_reason}")
-```
-
----
-
-## Decision Framework
-
-### Should I Deploy the New Version?
-
-```
-Deploy v2.0.0 if ALL of:
-├─ Quality score improved OR stayed same
-├─ Pass rate improved OR stayed within -2%
-├─ No critical regressions (must-have features still work)
-├─ Statistical significance (p < 0.05) OR large effect size
-└─ Visual inspection looks good (side-by-side comparison)
-
-DON'T deploy if ANY of:
-├─ Quality score dropped > 3%
-├─ Pass rate dropped > 5%
-├─ Critical features broken
-├─ Generation time increased > 50% (unless quality gain is huge)
-└─ Visual quality clearly worse
-```
-
-### Borderline Cases
-
-```
-If results are mixed (some metrics better, some worse):
-1. Weight metrics by importance:
- - Quality score: 40%
- - Pass rate: 30%
- - Visual quality: 20%
- - Generation time: 10%
-
-2. Calculate weighted score
-
-3. If weighted score > current + 5%:
- → Deploy
- Otherwise:
- → Refine and test again
-```
-
----
-
-## Future Enhancements
-
-### Automatic A/B Testing in CI/CD
-
-```yaml
-# .github/workflows/test-new-rules.yml
-on:
-  pull_request:
-    paths:
-      - 'rules/**'
-
-jobs:
-  ab-test:
-    runs-on: ubuntu-latest
-    steps:
-      - name: Detect rule changes
-        id: detect
-        run: |
-          # Extract version numbers
-          OLD_VERSION=$(...)
-          NEW_VERSION=$(...)
-          # Shell variables do not persist across steps; expose them as step outputs
-          echo "old_version=$OLD_VERSION" >> "$GITHUB_OUTPUT"
-          echo "new_version=$NEW_VERSION" >> "$GITHUB_OUTPUT"
-
-      - name: Run A/B test
-        run: |
-          pyplots-ab-test \
-            --baseline ${{ steps.detect.outputs.old_version }} \
-            --candidate ${{ steps.detect.outputs.new_version }} \
-            --auto
-
-      - name: Post report to PR
-        uses: actions/github-script@v6
-        with:
-          script: |
-            // Comment with HTML report
-```
-
-### Machine Learning for Rule Optimization
-
-```python
-# Future: Learn which rules produce best results
-from pyplots.ml import RuleOptimizer
-
-optimizer = RuleOptimizer()
-
-# Learn from historical data
-optimizer.train(
- generations=all_generations_from_database,
- target_metric="quality_score"
-)
-
-# Suggest rule improvements
-suggestions = optimizer.suggest_improvements(
- current_version="v2.0.0"
-)
-
-# Output:
-# Suggestion 1: Increase grid alpha to 0.35 (predicted +2% quality)
-# Suggestion 2: Add minimum 11pt font size (predicted +1.5% quality)
-```
-
----
-
-## Summary
-
-### Quick Reference Table
-
-| Approach | Cost | Speed | Confidence | Best For |
-|----------|------|-------|------------|----------|
-| **Parallel** | High | Fast | High | Critical decisions, final validation |
-| **Historical** | Low | Very Fast | Medium | Quick checks, large scale |
-| **Staged** | Low | Slow | Very High | Major changes, production |
-| **Hybrid** | Medium | Medium | Very High | **Most scenarios (recommended)** |
-
-### Recommendation
-
-**For most use cases, use the Hybrid approach**:
-1. Quick historical check (5 min, $0)
-2. If promising → Parallel test on 5 specs (1 hour, $5)
-3. If good → Canary rollout 10% → 50% → 100% (2-3 days, $0 extra)
-
-This balances cost, speed, and confidence while minimizing risk.
-
----
-
-## Related Documentation
-
-- [Rule Versioning System](../architecture/rule-versioning.md)
-- [Claude Skill for Plot Generation](./claude-skill-plot-generation.md)
-- [Automation Workflows](../architecture/automation-workflows.md)
-
----
-
-*"Test scientifically. Deploy confidently."*
diff --git a/docs/concepts/claude-skill-plot-generation.md b/docs/concepts/claude-skill-plot-generation.md
deleted file mode 100644
index 32763a74da..0000000000
--- a/docs/concepts/claude-skill-plot-generation.md
+++ /dev/null
@@ -1,966 +0,0 @@
-# 🎨 Claude Skill: Plot Generation
-
-## Overview
-
-A **Claude Skill** is a specialized, reusable capability that can be invoked by Claude Code or other AI systems. This document proposes a comprehensive skill for automated plot generation that:
-
-- Reads versioned rule files (Markdown)
-- Generates implementation code from specs
-- Performs self-review and optimization
-- Handles multi-attempt feedback loops
-- Integrates with the pyplots rule versioning system
-
-## Why a Claude Skill?
-
-### Problems with Ad-Hoc Prompting
-
-❌ **Inconsistent**: Every generation uses slightly different prompts
-❌ **Not reusable**: Have to explain the full process each time
-❌ **Hard to improve**: Prompt changes lost in chat history
-❌ **No versioning**: Can't track what prompts generated which plots
-❌ **Manual orchestration**: Human has to manage the feedback loop
-
-### Benefits of a Skill
-
-✅ **Consistent**: Same process every time
-✅ **Reusable**: Call the skill, get a plot
-✅ **Versionable**: Skill linked to rule versions
-✅ **Automated**: Handles feedback loops internally
-✅ **Testable**: Can A/B test different skill versions
-✅ **Scalable**: Easy to invoke from automation (GitHub Actions, n8n)
-
----
-
-## Skill Architecture
-
-### High-Level Flow
-
-```
-Input: Spec Markdown + Target Library + Rule Version
- ↓
-┌──────────────────────────────────────────┐
-│ Claude Skill: Plot Generation v1.0.0 │
-│ │
-│ 1. Load Rules (from rules/{version}/) │
-│ 2. Generate Code │
-│ 3. Self-Review │
-│ 4. Optimize if needed (max 3 attempts) │
-│ 5. Return Result │
-└──────────────────────────────────────────┘
- ↓
-Output: Python Code + Metadata + Feedback
-```
-
-### Skill Interface
-
-```python
-# Conceptual API
-from claude_skills import PlotGenerationSkill
-
-skill = PlotGenerationSkill(
- rule_version="v1.0.0", # Which rules to use
- max_attempts=3 # Maximum optimization loops
-)
-
-result = skill.generate(
- spec_markdown="specs/scatter-basic-001.md",
- library="matplotlib",
- variant="default"
-)
-
-# result.success → True/False
-# result.code → Generated Python code
-# result.quality_score → Self-review score
-# result.attempt_count → How many tries it took
-# result.feedback → Improvement suggestions
-```
-
----
-
-## Skill Inputs
-
-### Required Inputs
-
-```python
-{
- "spec_markdown": "# scatter-basic-001: Basic 2D Scatter Plot\n\n...",
- "library": "matplotlib", # or "seaborn", "plotly", etc.
- "variant": "default", # or "ggplot_style", "dark_mode", etc.
-}
-```
-
-### Optional Inputs
-
-```python
-{
- "rule_version": "v1.0.0", # Default: latest active version
- "max_attempts": 3, # Default: 3
- "strict_mode": False, # If True, fail if any criterion not met
- "custom_criteria": [], # Additional quality checks
- "python_version": "3.12", # Target Python version
- "style_constraints": { # Additional styling rules
- "color_palette": "colorblind_safe",
- "figure_size": (12, 8)
- }
-}
-```
-
----
-
-## Skill Outputs
-
-### Success Case
-
-```python
-{
- "success": True,
- "code": "import matplotlib.pyplot as plt\nimport pandas as pd\n\n...",
- "quality_score": 92,
- "attempt_count": 2,
- "criteria_met": [
- "axes_labeled",
- "grid_visible",
- "colorblind_safe"
- ],
- "criteria_failed": [],
- "feedback": {
- "attempt_1": {
- "score": 78,
- "issues": ["X-axis labels overlapping", "Grid too prominent"],
- "improvements": "Rotate labels, reduce grid alpha"
- },
- "attempt_2": {
- "score": 92,
- "issues": [],
- "improvements": "All criteria met, code optimized"
- }
- },
- "metadata": {
- "rule_version": "v1.0.0",
- "generation_time_seconds": 15.3,
- "library": "matplotlib",
- "variant": "default"
- }
-}
-```
-
-### Failure Case
-
-```python
-{
- "success": False,
- "code": None,
- "quality_score": 71, # Below threshold after 3 attempts
- "attempt_count": 3,
- "criteria_met": ["axes_labeled"],
- "criteria_failed": ["grid_visible", "colorblind_safe"],
- "feedback": {
- "attempt_1": {...},
- "attempt_2": {...},
- "attempt_3": {
- "score": 71,
- "issues": [
- "Colorblind safety check still failing",
- "Unable to find suitable palette that works with data"
- ],
- "recommendations": [
- "Consider using different visualization type",
- "May need manual refinement"
- ]
- }
- },
- "error": "Failed to meet quality threshold after 3 attempts"
-}
-```
-
----
-
-## Internal Workflow
-
-### Phase 1: Load Rules
-
-```python
-def load_rules(version: str) -> Rules:
- """
- Load generation rules from rules/generation/{version}/
-
- Returns:
- - code_generation_rules: How to generate code
- - quality_criteria: What makes a good plot
- - self_review_checklist: How to self-evaluate
- """
- base_path = f"rules/generation/{version}/"
-
- rules = Rules(
- generation=load_markdown(base_path + "code-generation-rules.md"),
- quality=load_markdown(base_path + "quality-criteria.md"),
- self_review=load_markdown(base_path + "self-review-checklist.md"),
- metadata=load_yaml(base_path + "metadata.yaml")
- )
-
- return rules
-```
-
-### Phase 2: Generate Initial Code
-
-```python
-def generate_initial_code(
- spec: str,
- library: str,
- rules: Rules
-) -> str:
- """
- Generate first version of code based on spec and rules
-
- Process:
- 1. Parse spec to extract requirements
- 2. Follow generation rules for code structure
- 3. Apply library-specific patterns
- 4. Generate complete, executable code
- """
- prompt = f"""
-You are generating a plot implementation.
-
-# Spec
-{spec}
-
-# Target Library
-{library}
-
-# Generation Rules
-{rules.generation}
-
-# Task
-Generate complete Python code that:
-1. Implements the spec requirements
-2. Follows all generation rules
-3. Is ready to execute
-
-Return only the Python code, no explanations.
-"""
-
- code = call_claude(prompt)
- return code
-```
-
-### Phase 3: Self-Review
-
-```python
-def self_review(
- code: str,
- spec: str,
- rules: Rules
-) -> SelfReviewResult:
- """
- Evaluate generated code against quality criteria
-
- Returns:
- - score: 0-100
- - issues: List of problems found
- - suggestions: How to improve
- """
- # Execute code to generate plot image
- image_bytes = execute_and_render(code)
-
- prompt = f"""
-You are reviewing a generated plot implementation.
-
-# Spec
-{spec}
-
-# Generated Code
-```python
-{code}
-```
-
-# Quality Criteria (from rules)
-{rules.quality}
-
-# Self-Review Checklist
-{rules.self_review}
-
-# Task
-1. Execute the code mentally (or review the logic)
-2. Check against each quality criterion
-3. Provide a score (0-100) and detailed feedback
-
-Return JSON:
-{{
- "score": 0-100,
- "criteria_met": ["id1", "id2"],
- "criteria_failed": ["id3"],
- "issues": ["Issue 1", "Issue 2"],
- "suggestions": ["Suggestion 1", "Suggestion 2"]
-}}
-"""
-
- result = call_claude_with_image(prompt, image_bytes)
- return parse_json(result)
-```
-
-### Phase 4: Optimization Loop
-
-```python
-def optimize_code(
- code: str,
- review_result: SelfReviewResult,
- rules: Rules
-) -> str:
- """
- Improve code based on self-review feedback
-
- Process:
- 1. Identify specific issues
- 2. Generate targeted fixes
- 3. Apply fixes to code
- 4. Return improved version
- """
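- # Build the bullet lists first: backslashes inside f-string expressions
- # are a SyntaxError before Python 3.12
- issues_text = "\n".join(f"- {issue}" for issue in review_result.issues)
- suggestions_text = "\n".join(f"- {sug}" for sug in review_result.suggestions)
-
- prompt = f"""
-You are optimizing plot code based on review feedback.
-
-# Current Code
-```python
-{code}
-```
-
-# Review Feedback
-Score: {review_result.score}/100
-
-Issues:
-{issues_text}
-
-Suggestions:
-{suggestions_text}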
-
-# Quality Criteria (still need to meet)
-{format_failed_criteria(review_result.criteria_failed, rules.quality)}
-
-# Task
-Generate improved code that addresses all issues.
-Focus specifically on the failed criteria.
-
-Return only the improved Python code, no explanations.
-"""
-
- improved_code = call_claude(prompt)
- return improved_code
-```
-
-### Phase 5: Multi-Attempt Loop
-
-```python
-def generate_with_feedback_loop(
-    spec: str,
-    library: str,
-    rules: Rules,
-    max_attempts: int = 3,
-    pass_threshold: int = 90
-) -> GenerationResult:
-    """
-    Main generation loop with self-correction
-
-    Returns after:
-    - Score >= threshold (success)
-    - max_attempts reached (failure)
-    """
-    feedback_history = []
-
-    # Attempt 1: Initial generation
-    code = generate_initial_code(spec, library, rules)
-    review = self_review(code, spec, rules)
-    feedback_history.append(review)
-
-    attempt = 1
-
-    # Attempts 2-3: Optimization loop
-    while review.score < pass_threshold and attempt < max_attempts:
-        attempt += 1
-
-        code = optimize_code(code, review, rules)
-        review = self_review(code, spec, rules)
-        feedback_history.append(review)
-
-    # Final result
-    success = review.score >= pass_threshold
-
-    return GenerationResult(
-        success=success,
-        code=code if success else None,
-        quality_score=review.score,
-        attempt_count=attempt,
-        criteria_met=review.criteria_met,
-        criteria_failed=review.criteria_failed,
-        feedback=feedback_history
-    )
-```
-
----
-
-## Skill Definition (Claude Code Format)
-
-```yaml
-# skills/plot-generation/skill.yaml
-name: plot-generation
-version: 1.0.0
-description: Generate plot implementations from specifications with automated quality feedback
-
-inputs:
-  spec_markdown:
-    type: string
-    required: true
-    description: Plot specification in Markdown format
-
-  library:
-    type: string
-    required: true
-    enum: [matplotlib, seaborn, plotly, bokeh, altair]
-
-  variant:
-    type: string
-    required: false
-    default: "default"
-
-  rule_version:
-    type: string
-    required: false
-    default: "latest"
-    description: Which rule version to use (e.g., "v1.0.0")
-
-  max_attempts:
-    type: integer
-    required: false
-    default: 3
-    min: 1
-    max: 5
-
-capabilities:
-  - read_files: true     # Read spec and rule files
-  - execute_code: true   # Execute generated code to render plots
-  - vision: true         # Analyze generated plot images
-  - iterative: true      # Multi-attempt optimization loop
-
-outputs:
-  success:
-    type: boolean
-    description: Whether generation succeeded
-
-  code:
-    type: string
-    description: Generated Python code (null if failed)
-
-  quality_score:
-    type: integer
-    description: Final quality score (0-100)
-
-  attempt_count:
-    type: integer
-    description: Number of attempts needed
-
-  feedback:
-    type: object
-    description: Detailed feedback from all attempts
-
-workflow:
-  - step: load_rules
-    action: Read rule files from rules/generation/{rule_version}/
-
-  - step: generate
-    action: Create initial code based on spec and rules
-
-  - step: review
-    action: Self-evaluate code against quality criteria
-    loop:
-      max_iterations: ${max_attempts}
-      continue_if: quality_score < 90
-      next_step: optimize
-
-  - step: optimize
-    action: Improve code based on feedback
-    next_step: review
-
-  - step: finalize
-    action: Return result with code and metadata
-```
-
----
-
-## Invocation Examples
-
-### Example 1: Basic Invocation
-
-```bash
-# From command line (hypothetical)
-claude-skill plot-generation \
- --spec specs/scatter-basic-001.md \
- --library matplotlib \
- --variant default \
- --output generated-plot.py
-```
-
-### Example 2: From Python
-
-```python
-# core/generators/claude_generator.py
-from pathlib import Path
-
-from claude_skills import invoke_skill
-
-result = invoke_skill(
-    skill="plot-generation",
-    inputs={
-        "spec_markdown": Path("specs/scatter-basic-001.md").read_text(),
-        "library": "matplotlib",
-        "variant": "default",
-        "rule_version": "v1.0.0"
-    }
-)
-
-if result.success:
-    # Save generated code
-    Path("plots/matplotlib/scatter/scatter-basic-001/default.py").write_text(result.code)
-
-    # Record metadata
-    save_metadata(
-        spec_id="scatter-basic-001",
-        quality_score=result.quality_score,
-        attempt_count=result.attempt_count,
-        rule_version="v1.0.0"
-    )
-else:
-    # Log failure
-    log_failure(
-        spec_id="scatter-basic-001",
-        reason=result.error,
-        feedback=result.feedback
-    )
-```
-
-### Example 3: From GitHub Actions
-
-```yaml
-# .github/workflows/generate-plot.yml
-- name: Generate plot implementation
-  id: generate
-  run: |
-    claude-skill plot-generation \
-      --spec ${{ env.SPEC_FILE }} \
-      --library ${{ matrix.library }} \
-      --rule-version v1.0.0 \
-      --output generated.py \
-      --json-output result.json
-
-- name: Check if successful
-  run: |
-    SUCCESS=$(jq -r '.success' result.json)
-    SCORE=$(jq -r '.quality_score' result.json)
-
-    if [ "$SUCCESS" = "true" ]; then
-      echo "✓ Generation successful (score: $SCORE)"
-    else
-      echo "✗ Generation failed"
-      exit 1
-    fi
-```
-
-### Example 4: A/B Testing with Different Rule Versions
-
-```python
-# automation/testing/ab_with_skills.py
-from claude_skills import invoke_skill
-
-def compare_rule_versions(spec_id: str, versions: list[str]):
-    """
-    Generate plot with multiple rule versions and compare
-    """
-    results = {}
-
-    for version in versions:
-        result = invoke_skill(
-            skill="plot-generation",
-            inputs={
-                "spec_markdown": load_spec(spec_id),
-                "library": "matplotlib",
-                "rule_version": version
-            }
-        )
-
-        results[version] = {
-            "success": result.success,
-            "quality_score": result.quality_score,
-            "attempt_count": result.attempt_count,
-            "code": result.code
-        }
-
-    # Compare results
-    return generate_comparison_report(results)
-
-# Usage
-report = compare_rule_versions(
-    spec_id="scatter-basic-001",
-    versions=["v1.0.0", "v2.0.0"]
-)
-```
-
----
-
-## Advanced Features
-
-### Feature 1: Custom Quality Criteria
-
-```python
-# Add project-specific criteria
-result = invoke_skill(
- skill="plot-generation",
- inputs={
- "spec_markdown": spec,
- "library": "matplotlib",
- "custom_criteria": [
- {
- "id": "brand_colors",
- "requirement": "Use company brand colors only",
- "colors": ["#FF6B6B", "#4ECDC4", "#45B7D1"],
- "weight": 1.0
- },
- {
- "id": "max_figure_width",
- "requirement": "Figure width must not exceed 10 inches",
- "max_width": 10,
- "weight": 0.5
- }
- ]
- }
-)
-```
-
-### Feature 2: Style Templates
-
-```python
-# Use predefined style templates
-result = invoke_skill(
- skill="plot-generation",
- inputs={
- "spec_markdown": spec,
- "library": "matplotlib",
- "style_template": "academic_paper", # or "presentation", "web", etc.
- "style_overrides": {
- "font_family": "Arial",
- "font_size": 12
- }
- }
-)
-```
-
-### Feature 3: Multi-Library Generation
-
-```python
-# Generate for all suitable libraries in one call
-result = invoke_skill(
- skill="plot-generation",
- inputs={
- "spec_markdown": spec,
- "library": "all", # Special value: generate for all suitable libraries
- "rule_version": "v1.0.0"
- }
-)
-
-# Result contains implementations for multiple libraries
-# result.implementations = {
-# "matplotlib": {...},
-# "seaborn": {...},
-# "plotly": {...}
-# }
-```
-
-### Feature 4: Incremental Refinement
-
-```python
-# Start with a draft, refine iteratively
-draft_result = invoke_skill(
- skill="plot-generation",
- inputs={
- "spec_markdown": spec,
- "library": "matplotlib",
- "strict_mode": False, # Allow lower quality for draft
- "max_attempts": 1 # Quick draft
- }
-)
-
-# Review and refine
-final_result = invoke_skill(
- skill="plot-generation",
- inputs={
- "spec_markdown": spec,
- "library": "matplotlib",
- "initial_code": draft_result.code, # Start from draft
- "feedback": "Improve colorblind safety and font sizes",
- "strict_mode": True,
- "max_attempts": 3
- }
-)
-```
-
----
-
-## Integration with Rule Versioning
-
-### Linking Skills to Rules
-
-```yaml
-# skills/plot-generation/versions.yaml
-skill_versions:
-  - version: "1.0.0"
-    compatible_rule_versions:
-      generation: ["v1.0.0", "v1.1.0"]
-      evaluation: ["v1.0.0"]
-    status: "active"
-
-  - version: "1.1.0"
-    compatible_rule_versions:
-      generation: ["v2.0.0", "v2.1.0"]
-      evaluation: ["v2.0.0"]
-    status: "active"
-```
-
-### Automatic Rule Selection
-
-```python
-# Skill automatically selects appropriate rules
-result = invoke_skill(
- skill="plot-generation",
- skill_version="1.1.0", # Skill version
- inputs={
- "spec_markdown": spec,
- "library": "matplotlib",
- # rule_version not specified → use latest compatible
- }
-)
-
-# Skill uses:
-# - Skill logic version 1.1.0
-# - Latest compatible generation rules (v2.1.0)
-# - Latest compatible evaluation rules (v2.0.0)
-```
-
----
-
-## Performance Optimization
-
-### Caching
-
-```python
-# Cache generated code to avoid regeneration
-result = invoke_skill(
- skill="plot-generation",
- inputs={
- "spec_markdown": spec,
- "library": "matplotlib",
- "rule_version": "v1.0.0"
- },
- cache_key=f"{spec_hash}:{library}:v1.0.0"
-)
-
-# If cache hit: return cached result (instant)
-# If cache miss: generate and cache (15-30 seconds)
-```
-
-### Parallel Generation
-
-```python
-# Generate for multiple libraries in parallel
-from concurrent.futures import ThreadPoolExecutor
-
-libraries = ["matplotlib", "seaborn", "plotly"]
-
-with ThreadPoolExecutor(max_workers=3) as executor:
- futures = [
- executor.submit(
- invoke_skill,
- skill="plot-generation",
- inputs={"spec_markdown": spec, "library": lib}
- )
- for lib in libraries
- ]
-
- results = {lib: future.result() for lib, future in zip(libraries, futures)}
-```
-
----
-
-## Error Handling
-
-### Graceful Degradation
-
-```python
-try:
-    result = invoke_skill(
-        skill="plot-generation",
-        inputs={...},
-        timeout_seconds=60  # Don't wait forever
-    )
-
-    if result.success:
-        # Use generated code
-        save_code(result.code)
-    else:
-        # Fall back to template or manual generation
-        log_failure(result.feedback)
-        use_fallback_template()
-
-except TimeoutError:
-    # Skill took too long
-    log_error("Generation timeout")
-    use_fallback_template()
-
-except SkillError as e:
-    # Skill crashed or invalid input
-    log_error(f"Skill error: {e}")
-    use_fallback_template()
-```
-
-### Retry Logic
-
-```python
-# Retry with exponential backoff
-import time
-
-
-def generate_with_retry(spec, library, max_retries=3):
-    result = None
-    for attempt in range(max_retries):
-        try:
-            result = invoke_skill(...)
-
-            if result.success:
-                return result
-            elif result.quality_score > 75:
-                # Close enough, acceptable
-                return result
-            else:
-                # Try again
-                continue
-
-        except Exception:
-            if attempt < max_retries - 1:
-                time.sleep(2 ** attempt)  # 1s, then 2s
-                continue
-            else:
-                raise
-
-    # All retries used without an acceptable result: return the last one
-    return result
-```
-
----
-
-## Monitoring & Telemetry
-
-### Metrics to Track
-
-```python
-{
- "skill_invocations": {
- "total": 1234,
- "successful": 1180,
- "failed": 54,
- "success_rate": 0.956
- },
-
- "performance": {
- "avg_generation_time_seconds": 18.3,
- "p50": 15.2,
- "p95": 35.7,
- "p99": 48.1
- },
-
- "quality": {
- "avg_quality_score": 89.2,
- "avg_attempts_to_pass": 1.7,
- "first_attempt_success_rate": 0.68
- },
-
- "by_rule_version": {
- "v1.0.0": {
- "invocations": 523,
- "avg_quality_score": 87.1
- },
- "v2.0.0": {
- "invocations": 711,
- "avg_quality_score": 91.3
- }
- }
-}
-```
-
----
-
-## Future Enhancements
-
-### 1. Learning from Feedback
-
-```python
-# Skill learns which strategies work best
-skill.train(
- successful_generations=database.get_successful_generations(),
- failed_generations=database.get_failed_generations()
-)
-
-# Improves:
-# - Which libraries work best for which plot types
-# - Common pitfalls to avoid
-# - Optimization strategies
-```
-
-### 2. Multi-Modal Input
-
-```python
-# Generate plot from image + description
-result = invoke_skill(
- skill="plot-generation",
- inputs={
- "reference_image": "path/to/example.png", # What they want
- "description": "Like this but for time series data",
- "library": "matplotlib"
- }
-)
-```
-
-### 3. Interactive Refinement
-
-```python
-# User provides feedback, skill refines
-result1 = invoke_skill(...) # First version
-
-user_feedback = "The legend is too large and covers data"
-
-result2 = invoke_skill(
- inputs={
- "initial_code": result1.code,
- "user_feedback": user_feedback,
- "refine_only": ["legend"] # Only change legend
- }
-)
-```
-
----
-
-## Summary
-
-### Skill Benefits
-
-✅ **Consistent quality** through versioned rules
-✅ **Automated feedback loops** reduce manual work
-✅ **Testable** via A/B testing of rule versions
-✅ **Scalable** from CLI to full automation
-✅ **Auditable** - know exactly what generated each plot
-
-### Next Steps
-
-1. **Define initial ruleset** (v1.0.0-draft)
-2. **Prototype skill logic** (Python script)
-3. **Test with 5-10 specs** manually
-4. **Refine based on results**
-5. **Formalize as Claude Skill** (if system supports)
-6. **Integrate with automation** (GitHub Actions, n8n)
-
----
-
-## Related Documentation
-
-- [Rule Versioning System](../architecture/rule-versioning.md)
-- [A/B Testing Rules](./ab-testing-rules.md)
-- [Automation Workflows](../architecture/automation-workflows.md)
-
----
-
-*"A skill is a reusable unit of AI capability. Make it good, make it versioned, make it testable."*
diff --git a/docs/vision.md b/docs/concepts/vision.md
similarity index 100%
rename from docs/vision.md
rename to docs/concepts/vision.md
diff --git a/docs/contributing.md b/docs/contributing.md
new file mode 100644
index 0000000000..7531fcb781
--- /dev/null
+++ b/docs/contributing.md
@@ -0,0 +1,81 @@
+# Contributing to pyplots
+
+## Overview
+
+pyplots is a specification-driven platform where **AI generates all plot implementations**. As a contributor, your main focus is on **specifications** (what to visualize) rather than code (how to implement).
+
+---
+
+## How to Propose a New Plot Type
+
+1. **Create a GitHub Issue** with a descriptive title (e.g., "Radar Chart with Multiple Series")
+ - Do NOT include spec-id in the title
+2. **Add the `spec-request` label**
+3. **Wait for automation**:
+ - `spec-create.yml` analyzes your request
+ - Assigns a unique spec-id
+ - Creates a PR with `specification.md` and `specification.yaml`
+4. **Review the generated spec** (PR comments)
+5. **Maintainer adds `approved` label** to the Issue (not the PR)
+6. **Spec merges to main** with `spec-ready` label
+
+---
+
+## How to Improve an Existing Spec
+
+1. **Create a GitHub Issue** referencing the spec to update
+2. **Add the `spec-update` label**
+3. **Wait for `spec-update.yml`** to create a PR with changes
+4. **Maintainer reviews and adds `approved` label**
+
+---
+
+## How to Trigger Implementation Generation
+
+After a spec has the `spec-ready` label:
+
+**Single Library:**
+- Add `generate:{library}` label to the issue (e.g., `generate:matplotlib`)
+
+**All Libraries:**
+```bash
+gh workflow run bulk-generate.yml -f specification_id=<spec-id> -f library=all
+```
+
+---
+
+## What NOT to Do
+
+| Don't | Why |
+|-------|-----|
+| Manually create `plots/` directories | Let `spec-create.yml` handle it |
+| Write `specification.md` files directly | Let AI generate from your Issue |
+| Include `[spec-id]` in issue titles | Spec-id is auto-assigned |
+| Add `approved` label to PRs | Add it to Issues instead |
+| Run `gh pr merge` on implementation PRs | Let `impl-merge.yml` handle it |
+| Create `metadata/*.yaml` manually | Created automatically on merge |
+
+---
+
+## Why This Workflow?
+
+Manual intervention causes:
+- Missing quality scores in metadata
+- Missing preview images in GCS
+- Issues staying open when complete
+- Broken database sync
+
+**Trust the automation.** It handles: code generation, quality review, repair attempts, image promotion, and database sync.
+
+---
+
+## Labels Reference
+
+See [workflows/overview.md](./workflows/overview.md) for the complete label system.
+
+---
+
+## Questions?
+
+- Check existing [Issues](https://github.com/MarkusNeusinger/pyplots/issues) for similar requests
+- Review the [workflows overview](./workflows/overview.md) for automation details
diff --git a/docs/development.md b/docs/development.md
index 232beff2b8..b2ec96af7c 100644
--- a/docs/development.md
+++ b/docs/development.md
@@ -1,436 +1,209 @@
-# 🛠️ Development Guide
+# Development Guide
-## Getting Started
-
-### Prerequisites
-
-**Required**:
-- Python 3.10+ (3.12 recommended)
-- [uv](https://github.com/astral-sh/uv) package manager
-- Git
-- PostgreSQL 15+ (for local development)
-
-**Optional**:
-- Docker (for containerized database)
-- Node.js 20+ (for frontend development)
+Guide for setting up a local development environment.
---
-## Local Setup
-
-### 1. Clone Repository
-
-```bash
-git clone https://github.com/your-username/pyplots.git
-cd pyplots
-```
+## Prerequisites
-### 2. Install Dependencies
+- **Python 3.10+**
+- **Node.js 18+** and yarn
+- **PostgreSQL** (or access to Cloud SQL)
+- **uv** - Fast Python package manager
```bash
-# Install uv if not already installed
+# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
-
-# Sync dependencies
-uv sync --all-extras
```
-This installs:
-- Core dependencies
-- Development tools (pytest, ruff, mypy)
-- All plotting libraries (matplotlib, seaborn, plotly)
-
-### 3. Database Setup
+---
-**Option A: Local PostgreSQL**
+## Backend Setup
```bash
-# Create database
-createdb pyplots
+# Clone and install
+git clone https://github.com/MarkusNeusinger/pyplots.git
+cd pyplots
+uv sync --all-extras
-# Set environment variables
+# Database configuration
cp .env.example .env
-# Edit .env and set DATABASE_URL
-```
-
-**Option B: Docker**
-
-```bash
-docker run -d \
- --name pyplots-postgres \
- -e POSTGRES_DB=pyplots \
- -e POSTGRES_USER=pyplots \
- -e POSTGRES_PASSWORD=dev_password \
- -p 5432:5432 \
- postgres:15
-```
+# Edit .env with your DATABASE_URL:
+# DATABASE_URL=postgresql+asyncpg://user:pass@host:5432/pyplots
-### 4. Run Migrations
-
-```bash
+# Run migrations
uv run alembic upgrade head
-```
-### 5. Start Backend
-
-```bash
-uv run uvicorn api.main:app --reload --port 8000
+# Start API server
+uv run uvicorn api.main:app --reload
+# → http://localhost:8000/docs
```
-API available at: `http://localhost:8000`
-Docs available at: `http://localhost:8000/docs`
+---
-### 6. Start Frontend (Optional)
+## Frontend Setup
```bash
cd app
-npm install
-npm run dev
+yarn install
+yarn dev
+# → http://localhost:3000
```
-Frontend available at: `http://localhost:3000`
+For production build:
+```bash
+yarn build
+```
---
-## Environment Variables
-
-Create `.env` file in project root (see `.env.example`):
+## Running Tests
```bash
-# Database (Cloud SQL via public IP for local development)
-DATABASE_URL=postgresql+asyncpg://user:password@CLOUD_SQL_PUBLIC_IP:5432/pyplots
+# All tests
+uv run pytest
-# Google Cloud Storage
-GCS_BUCKET=pyplots-images
-GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
+# Only unit tests (fast, no DB needed)
+uv run pytest tests/unit
-# Environment
-ENVIRONMENT=development
+# Only integration tests (SQLite in-memory)
+uv run pytest tests/integration
-# API Keys (optional, for AI generation workflows)
-# ANTHROPIC_API_KEY=sk-ant-...
-```
+# Only E2E tests (requires DATABASE_URL)
+uv run pytest tests/e2e
-**Production**: In Cloud Run, `DATABASE_URL` is injected from Secret Manager and uses a Unix socket connection to Cloud SQL.
+# With coverage report
+uv run pytest --cov=. --cov-report=html
+```
-**Never commit `.env`!** (Already in `.gitignore`)
+**Coverage target**: 90%+
---
-## Development Workflow
-
-### Running Tests
-
-```bash
-# Run all tests
-uv run pytest
-
-# Run with coverage
-uv run pytest --cov=. --cov-report=html
-
-# Run specific test file
-uv run pytest tests/unit/api/test_routers.py
-
-# Run specific test
-uv run pytest tests/unit/api/test_routers.py::test_get_specs
-```
+## Code Quality
-### Code Formatting
+Both linting and formatting must pass for CI.
```bash
-# Check formatting
+# Check linting
uv run ruff check .
-# Auto-fix issues
+# Auto-fix linting issues
uv run ruff check . --fix
-# Format code
-uv run ruff format .
-```
-
-### Type Checking (Optional)
-
-```bash
-# Install mypy first
-uv sync --extra typecheck
+# Check formatting
+uv run ruff format --check .
-# Then run type checking
-uv run mypy .
+# Auto-format
+uv run ruff format .
```
-**Note**: Type checking is optional. Ruff already catches most issues.
-
-### Pre-commit Hook (Recommended)
-
+**Always run before committing:**
```bash
-# Install pre-commit
-uv pip install pre-commit
-
-# Install git hooks
-pre-commit install
-
-# Now formatting runs automatically on git commit
+uv run ruff check . && uv run ruff format .
```
---
-## Code Standards
-
-See [CLAUDE.md](../CLAUDE.md) for detailed code standards including:
-- Python style guide (PEP 8, Ruff)
-- Type hints requirements
-- Docstring format (Google style)
-- Import ordering
-
----
-
-## Testing
-
-**Coverage Target**: 90%+
-
-**Test Structure**: Mirror source structure (`plots/.../default.py` → `tests/unit/plots/.../test_*.py`)
-
-**Test Naming**: `test_{what_it_does}`
-
-**Fixtures**: Use pytest fixtures in `tests/conftest.py` for reusable test data
-
-See [CLAUDE.md](../CLAUDE.md) for testing standards.
-
----
-
-## Writing Plot Implementations
-
-See [CLAUDE.md](../CLAUDE.md) for:
-- Implementation file template
-- Best practices (validation, defaults, error handling)
-- Anti-patterns to avoid
-
----
-
-## Contributing
-
-### Proposing New Plots
-
-**Option 1: GitHub Issue (Recommended)**
-
-1. Create issue using spec template
-2. Fill in description, applications, data requirements
-3. Add label `plot-idea`
-4. Wait for review and approval
-5. AI generates implementations automatically
+## Database
-**Option 2: Pull Request (Advanced)**
+### Local PostgreSQL
-1. Create spec directory: `plots/{spec-id}/` with spec.md
-2. Implement for at least one library
-3. Add tests
-4. Create PR with previews
-5. Wait for quality check and review
-
-### Contribution Guidelines
-
-**Before Submitting**:
-- [ ] Code passes all tests (`pytest`)
-- [ ] Code is formatted (`ruff format`)
-- [ ] Type hints are present (`mypy`)
-- [ ] Coverage is >90% for new code
-- [ ] Docstrings are complete
-- [ ] Preview image looks good
-
-**PR Description Template**:
-
-```markdown
-## Description
-
-Implements scatter-basic-001 for matplotlib
-
-## Checklist
-
-- [x] Spec file created/updated
-- [x] Implementation code written
-- [x] Tests added (coverage: 95%)
-- [x] Preview generated
-- [ ] Quality check passed (waiting for CI)
-
-## Preview
-
-
-
-## Related Issue
-
-Closes #123
+Set `DATABASE_URL` in `.env`:
```
-
----
-
-## Project Structure
-
-See [CLAUDE.md](../CLAUDE.md) for:
-- Directory structure
-- Implementation file naming (`plots/{spec-id}/implementations/{library}.py`)
-- Test file naming (`tests/unit/plots/test_{spec_id}.py`)
-
----
-
-## Common Tasks
-
-### Add a New Library
-1. Update database (add to `libraries` table)
-2. Create directory structure (`mkdir -p plots/{library}/scatter`)
-3. Implement existing specs
-4. Add tests
-
-### Update an Existing Implementation
-1. Create GitHub issue referencing original
-2. Update implementation file in `plots/{spec-id}/implementations/{library}.py`
-3. Run tests: `uv run pytest tests/unit/plots/test_{spec_id}.py`
-4. Create PR → Quality check runs automatically
-
----
-
-## Debugging Tips
-
-### Database Connection Issues
-
-```bash
-# Test connection
-psql -U pyplots -d pyplots -h localhost
-
-# Check migrations
-uv run alembic current
-uv run alembic history
+DATABASE_URL=postgresql+asyncpg://user:pass@localhost:5432/pyplots
```
-### Import Errors
+### Cloud SQL (development)
-```bash
-# Verify package installation
-uv pip list
-
-# Reinstall
-uv sync --reinstall
+For Cloud SQL access, your IP must be in the authorized networks. Set:
```
-
-### Plot Generation Errors
-
-```python
-# Run implementation standalone
-python plots/scatter-basic/implementations/matplotlib.py
-
-# Add debug prints
-print(f"Data shape: {data.shape}")
-print(f"Columns: {data.columns.tolist()}")
+DATABASE_URL=postgresql+asyncpg://user:pass@CLOUD_SQL_PUBLIC_IP:5432/pyplots
```
-### Test Failures
+### Migrations
```bash
-# Verbose output
-pytest -v
+# Apply all migrations
+uv run alembic upgrade head
-# Show print statements
-pytest -s
+# Create new migration
+uv run alembic revision --autogenerate -m "description"
-# Drop into debugger on failure
-pytest --pdb
+# Check current version
+uv run alembic current
```
---
-## FAQ
-
-### Q: How do I add a completely new plot type?
-
-**A**: Create GitHub issue with spec → AI generates code → Review and merge
-
-### Q: What if I want to use a different plotting style?
-
-**A**: Create style variant (e.g., `ggplot_style.py`, `dark_style.py`)
-
-### Q: How do I test plot generation locally?
-
-**A**: Run implementation file directly: `python plots/scatter-basic/implementations/matplotlib.py`
-
-### Q: Do I need to implement for all libraries?
-
-**A**: No! Start with one library. Others can be added later.
+## Environment Variables
-### Q: How do I handle Python version differences?
+Copy `.env.example` and configure:
-**A**: Only create version-specific files if absolutely necessary (e.g., syntax changes). Prefer single `default.py` that works across 3.10-3.13.
+| Variable | Required | Description |
+|----------|----------|-------------|
+| `DATABASE_URL` | Yes | PostgreSQL connection string |
+| `GCS_BUCKET` | No | GCS bucket for images (default: pyplots-images) |
+| `GOOGLE_APPLICATION_CREDENTIALS` | No | Path to service account JSON |
+| `ENVIRONMENT` | No | `development` or `production` |
---
-## Working with Rules
-
-The project includes versioned rules for AI code generation and quality evaluation.
-
-**Location**: `rules/` directory
-
-**Key Files**:
-- `rules/versions.yaml` - Version configuration
-- `rules/generation/v*/` - Code generation rules (Markdown)
-- `rules/README.md` - Rule system documentation
+## Project Structure
-**Rule States**: draft → active → deprecated → archived
+```
+pyplots/
+├── api/ # FastAPI backend
+├── app/ # React frontend
+├── core/ # Shared business logic
+├── plots/ # Plot specifications and implementations
+├── prompts/ # AI agent prompts
+├── tests/ # Test suite
+│ ├── unit/ # Fast, mocked tests
+│ ├── integration/ # SQLite in-memory
+│ └── e2e/ # Real PostgreSQL
+└── docs/ # Documentation
+```
-**See Also**:
-- [A/B Testing Strategies](./concepts/ab-testing-rules.md)
-- [Rules README](../rules/README.md)
+See [Repository Structure](reference/repository.md) for details.
---
-## Deployment
-
-pyplots runs on **Google Cloud Platform** (europe-west4 region):
-
-| Service | Component | Purpose |
-|---------|-----------|---------|
-| **Cloud Run** | `pyplots-backend` | FastAPI API (serverless, auto-scaling) |
-| **Cloud Run** | `pyplots-frontend` | React SPA via nginx |
-| **Cloud SQL** | PostgreSQL 18 | Database (Unix socket in production) |
-| **Cloud Storage** | `pyplots-images` | Preview images (GCS bucket) |
-| **Secret Manager** | `DATABASE_URL` | Secure credential storage |
-| **Cloud Build** | Triggers | Auto-deploy on push to main |
-
-### Automatic Deployment (Recommended)
-
-Push to `main` triggers Cloud Build automatically:
-- `api/**`, `core/**`, `pyproject.toml` changes → Backend redeploy
-- `app/**` changes → Frontend redeploy
-
-### Manual Deployment
+## Useful Commands
```bash
-# Backend
-gcloud builds submit --config=api/cloudbuild.yaml --project=YOUR_PROJECT_ID
+# Run single test file
+uv run pytest tests/unit/api/test_routers.py
-# Frontend
-gcloud builds submit --config=app/cloudbuild.yaml --project=YOUR_PROJECT_ID
-```
+# Run single test
+uv run pytest tests/unit/api/test_routers.py::test_get_specs -v
-### Key Files
+# Debug test failures
+uv run pytest --pdb
-- `api/cloudbuild.yaml` - Backend build config (Cloud SQL + Secrets)
-- `api/Dockerfile` - Python 3.13 + uv + uvicorn
-- `app/cloudbuild.yaml` - Frontend build config
-- `app/Dockerfile` - Multi-stage: Node build → nginx serve
+# Check database connection
+uv run python -c "from core.database import is_db_configured; print(is_db_configured())"
+```
---
-## Resources
+## Troubleshooting
-**Documentation**:
-- [FastAPI Docs](https://fastapi.tiangolo.com/)
-- [SQLAlchemy Docs](https://docs.sqlalchemy.org/)
-- [Pytest Docs](https://docs.pytest.org/)
-- [Matplotlib Docs](https://matplotlib.org/stable/contents.html)
+### Import errors
+```bash
+uv sync --reinstall
+```
-**Tools**:
-- [uv Package Manager](https://github.com/astral-sh/uv)
-- [Ruff Linter/Formatter](https://github.com/astral-sh/ruff)
-- [Alembic Migrations](https://alembic.sqlalchemy.org/)
+### Database connection issues
+```bash
+# Test connection
+psql -U pyplots -d pyplots -h localhost
----
+# Check migrations
+uv run alembic current
+```
-*For architecture details, see [architecture/](./architecture/)*
+### Test failures
+- Unit/integration tests should work without DATABASE_URL
+- E2E tests are skipped if DATABASE_URL is not set
+- Run `uv run pytest tests/unit -v` to isolate issues
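The skip rule for E2E tests can be pictured with a small helper. This is a minimal sketch, not the project's actual test configuration: the function name `e2e_skip_reason` is hypothetical, and the only fact taken from the text is that E2E tests are skipped when `DATABASE_URL` is unset.

```python
import os


def e2e_skip_reason(env=None):
    """Return a skip reason when E2E tests cannot run, or None when they can.

    Hypothetical helper: the real suite may implement this check differently.
    """
    env = os.environ if env is None else env
    if not env.get("DATABASE_URL"):
        return "DATABASE_URL is not set"
    return None
```

A value like this could feed `pytest.mark.skipif(e2e_skip_reason() is not None, reason="DATABASE_URL is not set")` in a `conftest.py`, which keeps unit and integration tests independent of any database.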
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 0000000000..0cfe1a23cf
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,85 @@
+# Documentation
+
+Welcome to the pyplots documentation. Start here to find what you're looking for.
+
+---
+
+## Quick Links
+
+| I want to... | Go to |
+|--------------|-------|
+| Understand the project | [Vision](concepts/vision.md) |
+| Contribute plot ideas | [Contributing](contributing.md) |
+| Set up local development | [Development Guide](development.md) |
+| See how automation works | [Workflows](workflows/overview.md) |
+| Look up API endpoints | [API Reference](reference/api.md) |
+| Understand the database | [Database Schema](reference/database.md) |
+| Explore repository structure | [Repository Structure](reference/repository.md) |
+
+---
+
+## Documentation Structure
+
+```
+docs/
+├── index.md # You are here
+├── contributing.md # How to contribute
+├── development.md # Local development setup
+├── concepts/ # Philosophy and design
+│ └── vision.md # Product vision and mission
+├── workflows/ # Process documentation
+│ └── overview.md # GitHub Actions automation
+├── reference/ # Technical details
+│ ├── api.md # REST API endpoints
+│ ├── database.md # PostgreSQL schema
+│ ├── repository.md # Directory structure
+│ ├── tagging-system.md # Tag taxonomy reference
+│ ├── plausible.md # Analytics integration
+│ └── seo.md # SEO configuration
+└── plot-types-catalog.md # Future plot ideas
+```
+
+---
+
+## Concepts
+
+High-level understanding of why things work the way they do.
+
+- **[Vision](concepts/vision.md)** - Product mission, the problem we solve, and how we're different
+
+---
+
+## Workflows
+
+How the automation pipeline works.
+
+- **[Overview](workflows/overview.md)** - Specification and implementation pipelines, label system
+
+---
+
+## Reference
+
+Technical details for development and integration.
+
+- **[API](reference/api.md)** - REST endpoints, request/response formats
+- **[Database](reference/database.md)** - PostgreSQL schema and models
+- **[Repository](reference/repository.md)** - Directory structure and file organization
+- **[Tagging System](reference/tagging-system.md)** - Tag taxonomy (used by spec-create workflow)
+- **[Plausible](reference/plausible.md)** - Analytics integration
+- **[SEO](reference/seo.md)** - Search engine optimization setup
+
+---
+
+## Contributing
+
+- **[Contributing Guide](contributing.md)** - How to propose plot ideas and improve specs
+- **[Development Guide](development.md)** - Local setup, testing, code quality
+
+---
+
+## Other Resources
+
+- **[README](../README.md)** - Project overview and quick start
+- **[CLAUDE.md](../CLAUDE.md)** - AI assistant instructions (for Claude Code)
+- **[copilot-instructions.md](../.github/copilot-instructions.md)** - AI assistant instructions (for GitHub Copilot)
+- **[prompts/](../prompts/)** - AI agent prompts for code generation
diff --git a/docs/plot-types-catalog.md b/docs/plot-types-catalog.md
index 1c0ce542df..f001561b3d 100644
--- a/docs/plot-types-catalog.md
+++ b/docs/plot-types-catalog.md
@@ -820,7 +820,7 @@ A comprehensive catalog of plot types for the pyplots platform. Each plot is imp
**Description:** A quiver plot displays vector fields using arrows positioned at grid points. Each arrow represents a vector at that location, with direction indicating the vector's angle and length proportional to its magnitude.
### streamline-basic 📋
-**Description:** Strömungslinien eines Vektorfelds als glatte Kurven.
+**Description:** Streamlines of a vector field as smooth curves.
### stem-basic ✅
**Description:** A stem plot displays data points as markers connected to a baseline by vertical lines (stems).
@@ -871,7 +871,7 @@ A comprehensive catalog of plot types for the pyplots platform. Each plot is imp
## 30. Printable & Fun
-Druckbare Vorlagen und spielerische Visualisierungen.
+Printable templates and playful visualizations.
### Puzzles & Games
diff --git a/docs/reference/api.md b/docs/reference/api.md
new file mode 100644
index 0000000000..a1bb7d1062
--- /dev/null
+++ b/docs/reference/api.md
@@ -0,0 +1,416 @@
+# 🔌 API Reference
+
+## Overview
+
+The pyplots API is a **FastAPI-based REST API** serving plot data to the frontend.
+
+**Base URL**: `https://api.pyplots.ai`
+
+**Key Principle**: The database is derived from the repository via `sync-postgres.yml`; the API reads from PostgreSQL.
+
+---
+
+## Core Endpoints
+
+### Specs
+
+#### GET `/specs`
+
+**Purpose**: List all specs with at least one implementation
+
+**Response**:
+```json
+[
+ {
+ "id": "scatter-basic",
+ "title": "Basic Scatter Plot",
+ "description": "A fundamental scatter plot showing...",
+ "tags": {
+ "plot_type": ["scatter"],
+ "domain": ["statistics"],
+ "features": ["basic", "2d"],
+ "data_type": ["numeric"]
+ },
+ "library_count": 9
+ }
+]
+```
+
+---
+
+#### GET `/specs/{spec_id}`
+
+**Purpose**: Get detailed spec with all implementations
+
+**Response**:
+```json
+{
+ "id": "scatter-basic",
+ "title": "Basic Scatter Plot",
+ "description": "A fundamental scatter plot...",
+ "applications": ["Show correlation", "Compare distributions"],
+ "data": ["x: numeric values", "y: numeric values"],
+ "notes": ["Use alpha for overlapping points"],
+ "tags": {
+ "plot_type": ["scatter"],
+ "domain": ["statistics"],
+ "features": ["basic"],
+ "data_type": ["numeric"]
+ },
+ "issue": 42,
+ "suggested": "CoolContributor",
+ "created": "2025-01-10T08:00:00Z",
+ "updated": "2025-01-15T10:30:00Z",
+ "implementations": [
+ {
+ "library_id": "matplotlib",
+ "library_name": "Matplotlib",
+ "preview_url": "https://storage.googleapis.com/pyplots-images/plots/scatter-basic/matplotlib/plot.png",
+ "preview_thumb": "https://storage.googleapis.com/pyplots-images/plots/scatter-basic/matplotlib/plot_thumb.png",
+ "preview_html": null,
+ "quality_score": 92.0,
+ "code": "import matplotlib.pyplot as plt...",
+ "generated_at": "2025-01-15T10:30:00Z",
+ "generated_by": "claude-opus-4-5-20251101",
+ "python_version": "3.13",
+ "library_version": "3.10.0",
+ "review_strengths": ["Clean code structure"],
+ "review_weaknesses": ["Grid could be more subtle"],
+ "review_image_description": "The plot shows...",
+ "review_criteria_checklist": {...},
+ "review_verdict": "APPROVED"
+ }
+ ]
+}
+```
+
+---
+
+#### GET `/specs/{spec_id}/images`
+
+**Purpose**: Get preview images for a spec across all libraries
+
+**Response**:
+```json
+{
+ "spec_id": "scatter-basic",
+ "images": [
+ {
+ "library": "matplotlib",
+ "url": "https://storage.googleapis.com/.../plot.png",
+ "thumb": "https://storage.googleapis.com/.../plot_thumb.png",
+ "html": null
+ }
+ ]
+}
+```
+
+---
+
+### Libraries
+
+#### GET `/libraries`
+
+**Purpose**: List supported plotting libraries
+
+**Response**:
+```json
+{
+ "libraries": [
+ {
+ "id": "matplotlib",
+ "name": "Matplotlib",
+ "version": "3.10.0",
+ "documentation_url": "https://matplotlib.org",
+ "description": "The classic standard..."
+ }
+ ]
+}
+```
+
+---
+
+#### GET `/libraries/{library_id}/images`
+
+**Purpose**: Get all plot images for a library across all specs
+
+**Response**:
+```json
+{
+ "library": "matplotlib",
+ "images": [
+ {
+ "spec_id": "scatter-basic",
+ "library": "matplotlib",
+ "url": "https://storage.googleapis.com/.../plot.png",
+ "thumb": "https://storage.googleapis.com/.../plot_thumb.png",
+ "html": null,
+ "code": "import matplotlib.pyplot as plt..."
+ }
+ ]
+}
+```
+
+---
+
+### Plots Filter
+
+#### GET `/plots/filter`
+
+**Purpose**: Filter plots with faceted counts for all filter categories
+
+**Query Parameters** (combinable):
+- `lib` - Library filter (matplotlib, seaborn, etc.)
+- `spec` - Spec ID filter
+- `plot` - Plot type tag filter
+- `data` - Data type tag filter
+- `dom` - Domain tag filter
+- `feat` - Features tag filter
+
+**Filter Logic**:
+- Comma-separated values: OR (`lib=matplotlib,seaborn`)
+- Multiple params same name: AND (`lib=matplotlib&lib=seaborn`)
+- Different categories: AND (`lib=matplotlib&plot=scatter`)
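The three rules above can be sketched in plain Python. This is an illustration of the described semantics only, not the API's implementation: `parse_filters` and `matches` are hypothetical names, and a plot is modelled as a mapping from filter category to the set of values it carries.

```python
from urllib.parse import parse_qs


def parse_filters(query_string):
    """Parse a query string into {category: [set_of_alternatives, ...]}.

    Each repeated parameter becomes its own group; comma-separated values
    inside one parameter become alternatives within that group.
    """
    groups = {}
    for name, values in parse_qs(query_string).items():
        groups[name] = [set(value.split(",")) for value in values]
    return groups


def matches(plot, groups):
    """Return True if the plot satisfies every filter group.

    Groups are combined with AND (across categories and across repeated
    parameters); alternatives inside one group are combined with OR.
    """
    for name, alternatives in groups.items():
        carried = plot.get(name, set())
        if not all(carried & alternative for alternative in alternatives):
            return False
    return True
```

For example, a seaborn scatter plot modelled as `{"lib": {"seaborn"}, "plot": {"scatter"}}` matches `lib=matplotlib,seaborn&plot=scatter` but not `lib=matplotlib&lib=seaborn`, since a single plot cannot belong to both libraries at once.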
+
+**Response**:
+```json
+{
+ "total": 42,
+ "images": [
+ {
+ "spec_id": "scatter-basic",
+ "library": "matplotlib",
+ "quality": 92,
+ "url": "https://storage.googleapis.com/.../plot.png",
+ "thumb": "https://storage.googleapis.com/.../plot_thumb.png",
+ "html": null
+ }
+ ],
+ "counts": {
+ "lib": {"matplotlib": 5, "seaborn": 3},
+ "spec": {"scatter-basic": 2},
+ "plot": {"scatter": 10},
+ "data": {"numeric": 15},
+ "dom": {"statistics": 8},
+ "feat": {"basic": 12}
+ },
+ "globalCounts": {...},
+ "orCounts": [...]
+}
+```
+
+---
+
+### Stats
+
+#### GET `/stats`
+
+**Purpose**: Platform statistics
+
+**Response**:
+```json
+{
+ "specs": 42,
+ "plots": 378,
+ "libraries": 9
+}
+```
+
+---
+
+### Download
+
+#### GET `/download/{spec_id}/{library}`
+
+**Purpose**: Download plot image (proxy to avoid CORS)
+
+**Response**: PNG image file with `Content-Disposition: attachment`
+
+---
+
+### Health
+
+#### GET `/`
+
+**Purpose**: Root endpoint
+
+**Response**:
+```json
+{
+ "message": "Welcome to pyplots API",
+ "version": "0.2.0",
+ "docs": "/docs",
+ "health": "/health"
+}
+```
+
+---
+
+#### GET `/health`
+
+**Purpose**: Health check for Cloud Run
+
+**Response**:
+```json
+{
+ "status": "healthy",
+ "service": "pyplots-api",
+ "version": "0.2.0"
+}
+```
+
+---
+
+## SEO Endpoints
+
+### GET `/sitemap.xml`
+
+**Purpose**: Dynamic XML sitemap for search engines
+
+Includes: root, catalog, all specs with implementations, all implementation pages.
+
+---
+
+### GET `/seo-proxy/`
+
+**Purpose**: Bot-optimized home page with og:tags
+
+Used by nginx to serve correct meta tags to social media bots.
+
+---
+
+### GET `/seo-proxy/catalog`
+
+**Purpose**: Bot-optimized catalog page
+
+---
+
+### GET `/seo-proxy/{spec_id}`
+
+**Purpose**: Bot-optimized spec overview page with collage og:image
+
+---
+
+### GET `/seo-proxy/{spec_id}/{library}`
+
+**Purpose**: Bot-optimized implementation page with branded og:image
+
+---
+
+## OG Image Endpoints
+
+All endpoints are under `/og/` prefix.
+
+### GET `/og/home.png`
+
+**Purpose**: OG image for home page (with tracking)
+
+---
+
+### GET `/og/catalog.png`
+
+**Purpose**: OG image for catalog page
+
+---
+
+### GET `/og/{spec_id}.png`
+
+**Purpose**: Collage OG image for spec overview (2x3 grid of top implementations)
+
+---
+
+### GET `/og/{spec_id}/{library}.png`
+
+**Purpose**: Branded OG image for implementation (1200x630 with pyplots.ai header)
+
+---
+
+## Proxy Endpoints
+
+### GET `/proxy/html`
+
+**Purpose**: Proxy HTML from GCS with size reporting script injection
+
+**Query Parameters**:
+- `url` - GCS URL (must be from `pyplots-images` bucket)
+- `origin` - Target origin for postMessage (optional)
+
+Used to load interactive plots (plotly, bokeh, altair) in iframes with dynamic sizing.
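The bucket restriction on `url` could be checked along these lines. This is a sketch under stated assumptions: the function name `is_allowed_gcs_url` is hypothetical, and the accepted URL shape (`https://storage.googleapis.com/<bucket>/<object>`) is inferred from the preview URLs shown elsewhere in this reference; the real endpoint may validate differently.

```python
from urllib.parse import urlparse

ALLOWED_BUCKET = "pyplots-images"  # bucket name taken from the parameter description
GCS_HOST = "storage.googleapis.com"  # assumed from the example preview URLs


def is_allowed_gcs_url(url):
    """Return True only for HTTPS object URLs inside the allowed bucket."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.hostname != GCS_HOST:
        return False
    # The path is expected to look like /<bucket>/<object path>.
    parts = parsed.path.split("/", 2)
    if len(parts) != 3 or parts[1] != ALLOWED_BUCKET or not parts[2]:
        return False
    # Reject parent-directory segments so the path cannot leave the bucket.
    return ".." not in parts[2].split("/")
```

Comparing the parsed hostname exactly, rather than testing whether the URL string starts with or contains the bucket name, avoids accepting look-alike hosts such as `storage.googleapis.com.example.org`.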
+
+---
+
+## Error Responses
+
+### Standard Error Format
+
+```json
+{
+ "detail": "Spec not found"
+}
+```
+
+### HTTP Status Codes
+
+| Status | Description |
+|--------|-------------|
+| 200 | Success |
+| 400 | Bad request (invalid parameters) |
+| 404 | Resource not found |
+| 502 | External service error (GCS) |
+| 503 | Database not available |
+
+---
+
+## Caching
+
+### In-Memory Cache
+
+The API uses in-memory caching with a per-resource TTL:
+- Stats: 5 min
+- Specs list: 2 min
+- Individual specs: 2 min
+- Filter results: 30 sec
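A TTL cache of this kind can be sketched in a few lines. This is an illustrative stand-in, not the API's actual cache: the class name `TTLCache` and its methods are hypothetical, and only the TTL values come from the list above.

```python
import time


class TTLCache:
    """Minimal in-memory cache whose entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[key]
            return default
        return value

    def set(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)
```

With this shape, stats would use `TTLCache(300)`, the specs list and individual specs `TTLCache(120)`, and filter results `TTLCache(30)`. A miss simply falls through to the database query, whose result is stored with `set`.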
+
+### HTTP Cache Headers
+
+```http
+Cache-Control: public, max-age=120, stale-while-revalidate=600
+```
+
+Applied with a per-endpoint `max-age` (the header above shows the 2-minute value):
+- `/libraries` - 5 min
+- `/stats` - 5 min
+- `/specs` - 2 min
+- `/specs/{spec_id}` - 2 min
+- `/plots/filter` - 30 sec
+
+---
+
+## CORS Configuration
+
+**Allowed Origins**:
+- `https://pyplots.ai`
+- `http://localhost:*` (development)
+
+**Allowed Methods**: All
+
+---
+
+## GZip Compression
+
+Responses > 500 bytes are compressed with GZip.
+
+Example: a `/plots/filter` response shrinks from 301 KB to about 40 KB when compressed.
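The size threshold can be illustrated with the standard library. This is a sketch of the rule only: `maybe_compress` is a hypothetical helper, and in a FastAPI application the same behavior would more likely come from a GZip middleware configured with a minimum size than from hand-written code.

```python
import gzip

MINIMUM_SIZE = 500  # bytes; threshold stated above


def maybe_compress(body, accept_encoding="gzip"):
    """Return (payload, extra_headers), compressing only large bodies.

    Bodies of 500 bytes or fewer, or requests from clients that do not
    accept gzip, are returned unchanged.
    """
    if len(body) <= MINIMUM_SIZE or "gzip" not in accept_encoding.lower():
        return body, {}
    return gzip.compress(body), {"Content-Encoding": "gzip"}
```

Skipping small bodies is a common choice because the gzip header and trailer add fixed overhead, so very short payloads can grow rather than shrink.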
+
+---
+
+## OpenAPI Documentation
+
+Interactive API documentation available at:
+- **Swagger UI**: `https://api.pyplots.ai/docs`
+- **ReDoc**: `https://api.pyplots.ai/redoc`
+- **OpenAPI JSON**: `https://api.pyplots.ai/openapi.json`
+
+---
+
+*For database schema, see [database.md](./database.md)*
diff --git a/docs/architecture/database.md b/docs/reference/database.md
similarity index 71%
rename from docs/architecture/database.md
rename to docs/reference/database.md
index 465cfc63a5..5f6ddb30c8 100644
--- a/docs/architecture/database.md
+++ b/docs/reference/database.md
@@ -2,9 +2,9 @@
## Overview
-pyplots uses **PostgreSQL** (Cloud SQL) to store metadata about plots, specs, and implementations. The database stores **references and metadata only** - not code or images.
+pyplots uses **PostgreSQL** (Cloud SQL) as the primary data store for the website. The database contains **all data needed to serve the frontend** - specs, implementations (including full code), and metadata.
-**Key Principle**: Lightweight metadata store, not a code repository.
+**Key Principle**: The repository is the source of truth; the database is derived from it via `sync-postgres.yml`.
---
@@ -12,33 +12,45 @@ pyplots uses **PostgreSQL** (Cloud SQL) to store metadata about plots, specs, an
| Database | Status | Use Case | When to Consider |
|----------|--------|----------|------------------|
-| **PostgreSQL** | ✅ **Current** | All data: specs, implementations, tags, quality scores, promotion queue | Start here - handles everything |
+| **PostgreSQL** | ✅ **Current** | All data: specs, implementations, tags, quality scores | Start here - handles everything |
| **Google Cloud Storage** | ✅ **Current** | Preview images, user-generated plots | Already implemented |
-| **GitHub** | ✅ **Current** | Code, specs, quality reports (as Issue comments) | Already implemented |
-| **Firestore** | 📋 **Future Optimization** | Multi-dimensional tag queries (5-level hierarchy) | IF tag search becomes performance bottleneck with >10,000 specs |
+| **GitHub** | ✅ **Current** | Code, specs, workflow state (via labels) | Already implemented |
-**Current Approach**: All data in PostgreSQL + GCS + GitHub. This is sufficient for MVP and beyond.
-
-**Future Optimization**: See [Firestore for Advanced Tagging](#future-optimization-firestore-for-advanced-tagging) section at the end of this document.
+**Current Approach**: All data in PostgreSQL + GCS + GitHub.
---
-## What's Stored vs. What's Not
+## What's Stored Where
+
+### ✅ Stored in Database (PostgreSQL)
+
+**Specs:**
+- Full spec content (title, description, applications, data, notes)
+- Tags (JSONB with plot_type, domain, features, data_type)
+- Metadata (created, updated, issue, suggested)
+
+**Implementations:**
+- Full Python source code (`impls.code`)
+- GCS URLs for preview images
+- Quality scores and review feedback
+- Generation metadata (model, workflow run, versions)
+
+**Other:**
+- Library information (name, version, docs URL)
+
+### ✅ Stored in GCS (Google Cloud Storage)
-### ✅ Stored in Database
+- Preview images (PNG, thumbnails)
+- Interactive HTML plots (plotly, bokeh, altair, etc.)
-- Spec metadata (title, description, tags)
-- Implementation metadata (library, variant, quality score)
-- GCS URLs (preview images)
-- Promotion queue (social media posts)
-- Library information
-- Usage analytics (optional)
+### ✅ Stored in GitHub
-### ❌ NOT Stored in Database
+- Source of truth for all code and specs (`plots/` directory)
+- Quality reports (as Issue comments)
+- Workflow state (via labels on Issues/PRs)
+
+### ❌ NOT Stored Anywhere Permanently
-- Plot code (stored in GitHub repository)
-- Preview images (stored in Google Cloud Storage)
-- Quality reports (stored in GitHub Issues as comments)
- User uploaded data (processed in-memory only)
---
@@ -90,7 +102,8 @@ CREATE TABLE libraries
id VARCHAR PRIMARY KEY, -- "matplotlib", "seaborn", "plotly"
name VARCHAR NOT NULL, -- "Matplotlib"
version VARCHAR, -- "3.9.0"
- documentation_url VARCHAR -- "https://matplotlib.org"
+ documentation_url VARCHAR, -- "https://matplotlib.org"
+ description TEXT -- Short library description
);
-- Library-specific implementations
@@ -127,6 +140,11 @@ CREATE TABLE impls
review_strengths VARCHAR[], -- What's good about this implementation
review_weaknesses VARCHAR[], -- What needs improvement
+ -- Extended review data (issue #2845)
+ review_image_description TEXT, -- AI's visual description of the plot
+ review_criteria_checklist JSONB, -- Detailed scoring breakdown
+ review_verdict VARCHAR(20), -- "APPROVED" or "REJECTED"
+
updated_at TIMESTAMP DEFAULT NOW(),
UNIQUE (spec_id, library_id)
@@ -137,7 +155,7 @@ CREATE INDEX idx_impls_spec ON impls (spec_id);
CREATE INDEX idx_impls_library ON impls (library_id);
```
-**Note**: The `tags` and `promotion_queue` tables are planned but not yet implemented.
+**Note**: Tags are stored as JSONB in the `specs` table (not a separate table).
---
@@ -275,7 +293,7 @@ Use **Alembic** for schema migrations:
```bash
# Create new migration
-alembic revision -m "add promotion queue table"
+alembic revision -m "add new column"
# Apply migrations
alembic upgrade head
@@ -454,97 +472,6 @@ await session.execute(f"SELECT * FROM specs WHERE id = '{spec_id}'")
---
-## Future Optimization: Firestore for Advanced Tagging
-
-**Status**: 📋 **Planned** (not currently implemented)
-
-**Current State**: Tags are stored as JSONB in the `specs.tags` column with structured categories (plot_type, domain, features, audience, data_type). This is sufficient for MVP and early growth.
-
-**Future Consideration**: As the platform scales beyond 10,000+ specs with complex multi-dimensional search requirements, consider adding Firestore for advanced tag functionality.
-
----
-
-### Why Firestore Could Help (Future)
-
-**Problem it solves**:
-- Multi-dimensional tag queries (5-level hierarchy: Library → Plot Type → Data Type → Domain → Features)
-- Filtering across multiple dimensions simultaneously (e.g., "matplotlib + timeseries + finance + beginner")
-- Real-time search index updates
-- Automatic scaling for high-volume tag searches
-
-**When to implement**:
-- PostgreSQL tag queries become slow (>500ms for common searches)
-- Need for complex tag hierarchy beyond simple array
-- User feedback requests advanced filtering
-- Catalog grows beyond 10,000 specs
-
----
-
-### Proposed Architecture (When Implemented)
-
-**Data Split**:
-- **PostgreSQL**: Spec metadata, implementation records, quality scores, promotion queue (no change)
-- **Firestore**: Multi-dimensional tags, search keywords, similarity clusters
-
-**Example Document Structure**:
-```javascript
-{
- "plot_id": "matplotlib-scatter-basic-001-default",
- "spec_id": "scatter-basic-001",
- "tags": {
- "library": "matplotlib",
- "plot_type": "scatter",
- "data_type": "tabular",
- "domain": "data-science",
- "features": {"complexity": "beginner", "interactivity": "static"}
- },
- "search_keywords": ["scatter", "matplotlib", "basic", "2d"],
- "confidence_scores": {"overall": 0.89}
-}
-```
-
-**Query Example**:
-```javascript
-// Find all beginner matplotlib plots for data-science
-db.collection('plot_tags')
- .where('tags.library', '==', 'matplotlib')
- .where('tags.domain', '==', 'data-science')
- .where('tags.features.complexity', '==', 'beginner')
- .get();
-```
-
----
-
-### Implementation Checklist (When Ready)
-
-- [ ] Confirm PostgreSQL performance is actually bottleneck
-- [ ] Design detailed Firestore schema (based on actual usage patterns)
-- [ ] Create composite indices for common query combinations
-- [ ] Implement sync mechanism (PostgreSQL → Firestore)
-- [ ] Add consistency checks (daily verification)
-- [ ] Monitor costs (estimated <$1/month for 10K docs)
-- [ ] Migrate existing tags from PostgreSQL to Firestore
-- [ ] Update API to query Firestore for tag searches
-- [ ] Keep PostgreSQL tags as backup/audit trail
-
----
-
-### Cost Estimate (For Future Reference)
-
-**Storage**: 10,000 documents × 3 KB = ~30 MB → <$0.50/month
-**Reads**: 1M reads/month → ~$0.36/month
-**Writes**: 100K writes/month → ~$0.18/month
-**Total**: <$1/month
-
----
-
-**See Also**:
-- **Tag Taxonomy**: `docs/concepts/tagging-system.md`
-- **Tagging Rules**: `rules/generation/v1.0.0-draft/tagging-rules.md`
-- **Auto-Tagging Workflow**: `docs/workflow.md` (Flow 4.5)
-
----
-
## Monitoring
### Key Metrics
diff --git a/docs/architecture/plausible.md b/docs/reference/plausible.md
similarity index 100%
rename from docs/architecture/plausible.md
rename to docs/reference/plausible.md
diff --git a/docs/architecture/repository.md b/docs/reference/repository.md
similarity index 86%
rename from docs/architecture/repository.md
rename to docs/reference/repository.md
index 5137fd5510..60a264d967 100644
--- a/docs/architecture/repository.md
+++ b/docs/reference/repository.md
@@ -18,7 +18,7 @@ plots/{specification-id}/
└── ...
```
-**Key Principle**: The repository contains **only production code and final specs**. Quality reports and workflow state are managed in GitHub Issues. Preview images are stored in GCS.
+**Key Principle**: The repository is the **source of truth** for all code, specs, and quality data. Preview images are stored in GCS. Database is derived via sync.
**Key Benefit**: Per-library metadata files eliminate merge conflicts when multiple implementations are generated in parallel.
@@ -64,23 +64,32 @@ pyplots/
├── prompts/ # AI agent prompts
│ ├── plot-generator.md # Base rules for code generation
│ ├── quality-criteria.md # Quality evaluation criteria
-│ ├── quality-evaluator.md # Multi-LLM evaluation prompt
-│ ├── auto-tagger.md # Automatic tagging
+│ ├── quality-evaluator.md # AI quality evaluation prompt
│ ├── spec-validator.md # Validates plot requests
│ ├── spec-id-generator.md # Assigns spec IDs
-│ └── library/ # Library-specific rules
-│ ├── matplotlib.md
-│ ├── seaborn.md
+│ ├── default-style-guide.md # Default visual style rules
+│ ├── library/ # Library-specific rules (9 files)
+│ │ ├── matplotlib.md
+│ │ ├── seaborn.md
+│ │ └── ...
+│ ├── templates/ # Templates for new specs
+│ │ ├── specification.md
+│ │ └── specification.yaml
+│ └── workflow-prompts/ # Workflow-specific prompts
│ └── ...
│
├── core/ # Shared business logic
│ ├── __init__.py
│ ├── config.py # Configuration (.env-based)
+│ ├── constants.py # Library metadata, constants
+│ ├── images.py # Image processing utilities
+│ ├── utils.py # General utilities
│ ├── database/ # Database layer
│ │ ├── __init__.py
│ │ ├── connection.py # Async connection management
│ │ ├── models.py # SQLAlchemy ORM models
-│ │ └── repositories.py # Repository pattern
+│ │ ├── repositories.py # Repository pattern
+│ │ └── types.py # Custom SQLAlchemy types
│ └── generators/ # Reusable code generators
│ └── plot_generator.py # Plot code generation utilities
│
@@ -105,11 +114,13 @@ pyplots/
│ └── workflow_cli.py # CLI for workflows
│
├── tests/ # Test suite
-│ └── unit/
-│ ├── api/
-│ ├── core/
-│ ├── prompts/
-│ └── workflows/
+│ ├── conftest.py # Shared fixtures
+│ ├── unit/ # Fast, mocked tests
+│ │ ├── api/
+│ │ ├── core/
+│ │ └── ...
+│ ├── integration/ # SQLite in-memory tests
+│ └── e2e/ # Real PostgreSQL tests
│
├── .github/
│ └── workflows/ # GitHub Actions CI/CD
@@ -136,10 +147,11 @@ pyplots/
│ └── upgrade_specs*.py # Spec upgrade utilities
│
├── docs/ # Documentation
-│ ├── architecture/
-│ ├── workflow.md
-│ ├── specs-guide.md
-│ └── development.md
+│ ├── concepts/
+│ ├── reference/
+│ ├── workflows/
+│ ├── contributing.md
+│ └── index.md
│
├── pyproject.toml # Python project config (uv)
├── uv.lock # Dependency lock file
@@ -171,12 +183,12 @@ plots/{specification-id}/
```
**Characteristics**:
-- ✅ Self-contained (spec + metadata + code together)
+- ✅ Self-contained (spec + metadata + code + quality reports together)
- ✅ Easy to navigate (one folder = one plot type)
- ✅ Synced to PostgreSQL via `sync-postgres.yml`
- ✅ No merge conflicts (per-library metadata files)
+- ✅ Quality reports in `metadata/{library}.yaml` (review section)
- ❌ NO preview images (stored in GCS)
-- ❌ NO quality reports (stored in GitHub Issues)
**Example**: `plots/scatter-basic/` contains everything for the basic scatter plot.
@@ -391,14 +403,17 @@ plt.savefig('plot.png', dpi=300)
**Purpose**: AI agent prompts for code generation and quality evaluation
**Subdirectories**:
-- `templates/` - Templates for new specs (`spec.md`, `metadata.yaml`)
+- `library/` - Library-specific rules (9 files: matplotlib, seaborn, plotly, etc.)
+- `templates/` - Templates for new specs (`specification.md`, `specification.yaml`)
+- `workflow-prompts/` - Workflow-specific prompt templates
**Files**:
- `plot-generator.md` - Base rules for all implementations
- `quality-criteria.md` - Definition of quality
-- `quality-evaluator.md` - Multi-LLM evaluation
-- `auto-tagger.md` - Automatic tagging
-- `library/*.md` - Library-specific rules (9 files)
+- `quality-evaluator.md` - AI quality evaluation
+- `spec-validator.md` - Validates plot requests
+- `spec-id-generator.md` - Assigns spec IDs
+- `default-style-guide.md` - Default visual style rules
---
@@ -508,15 +523,13 @@ Always named by library: `{library}.py`
- **Where**: Google Cloud Storage (`gs://pyplots-images/plots/...`)
- **Why**: Binary files bloat git history
-### ❌ Quality Reports
-- **Where**: GitHub Issues (as bot comments)
-- **Why**: Keeps repo clean, increases transparency
-
### ❌ Secrets
- **Where**: Environment variables, Cloud Secret Manager
- **Why**: Security
- **Note**: `.env.example` shows required variables without values
+**Note**: Quality reports ARE stored in the repository in `metadata/{library}.yaml` (the `review:` section with strengths, weaknesses, criteria_checklist, verdict).
+
---
## Database Sync
@@ -529,9 +542,10 @@ The `sync-postgres.yml` workflow syncs `plots/` to PostgreSQL on push to main:
- Implementation code (full Python source)
- Implementation metadata (quality score, generation info from metadata/*.yaml)
- Preview URLs from per-library metadata files
+- Quality review data (strengths, weaknesses, criteria_checklist, verdict)
**Source of Truth**: The `plots/` directory is authoritative. Database is derived.
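
To make the "repository is authoritative, database is derived" idea concrete, here is a sketch that walks a `plots/` tree and gathers implementation code per spec. It assumes the `plots/{specification-id}/implementations/{library}.py` layout described in this document. The function `collect_plots` and the record shape are hypothetical and do not reflect the actual sync script, which also reads specification and metadata files and writes to PostgreSQL.

```python
from pathlib import Path
from typing import Dict, List


def collect_plots(plots_dir: Path) -> List[Dict[str, object]]:
    """Build one record per spec directory from files on disk (no database access)."""
    records: List[Dict[str, object]] = []
    for spec_dir in sorted(p for p in plots_dir.iterdir() if p.is_dir()):
        implementations: Dict[str, str] = {}
        impl_dir = spec_dir / "implementations"
        if impl_dir.is_dir():
            for source_file in sorted(impl_dir.glob("*.py")):
                # File stem is the library name, e.g. "matplotlib"
                implementations[source_file.stem] = source_file.read_text(encoding="utf-8")
        records.append({"spec_id": spec_dir.name, "implementations": implementations})
    return records
```

Because every record is rebuilt from files, re-running the collection after any merge yields the same result, which is what allows the database to be treated as disposable.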
---
-*For implementation details, see [specs-guide.md](../specs-guide.md) and [development.md](../development.md)*
+*For contribution guidelines, see [contributing.md](../contributing.md)*
diff --git a/docs/architecture/seo.md b/docs/reference/seo.md
similarity index 100%
rename from docs/architecture/seo.md
rename to docs/reference/seo.md
diff --git a/docs/concepts/tagging-system.md b/docs/reference/tagging-system.md
similarity index 100%
rename from docs/concepts/tagging-system.md
rename to docs/reference/tagging-system.md
diff --git a/docs/specs-guide.md b/docs/specs-guide.md
deleted file mode 100644
index 36f0d0b77e..0000000000
--- a/docs/specs-guide.md
+++ /dev/null
@@ -1,121 +0,0 @@
-# Plot Specification Guide
-
-## Overview
-
-Plot specifications are **library-agnostic descriptions** of what a plot should show. They live in `plots/{spec-id}/spec.md`.
-
-**Key Principle**: A spec describes **WHAT** to visualize, not **HOW** to implement it.
-
----
-
-## File Location
-
-Each spec lives in its own directory:
-```
-plots/{spec-id}/
-├── spec.md ← Specification file
-├── metadata.yaml ← Tags, generation info
-└── implementations/ ← Library code
-```
-
----
-
-## Spec Format
-
-```markdown
-# {spec-id}: {Title}
-
-## Description
-
-{2-4 sentences: What does this plot show? When should you use it?}
-
-## Applications
-
-- {Realistic scenario 1 with domain context}
-- {Realistic scenario 2}
-- {Realistic scenario 3}
-
-## Data
-
-- `{column}` ({type}) - {what it represents}
-- `{column}` ({type}) - {what it represents}
-- Size: {recommended data size}
-- Example: {dataset reference or description}
-
-## Notes
-
-- {Optional implementation hints or special requirements}
-```
-
----
-
-## Sections
-
-### Title
-Format: `# {spec-id}: {Human-Readable Title}`
-
-Example: `# scatter-basic: Basic Scatter Plot`
-
-### Description
-2-4 sentences (prose text) explaining:
-- What the plot visualizes
-- When to use it
-- What makes it useful
-
-### Applications
-3-4 realistic scenarios with domain context (finance, science, marketing, etc.)
-
-### Data
-Simple list format:
-- Required columns with types (numeric, categorical, datetime)
-- Recommended data size
-- Example dataset reference
-
-### Notes (Optional)
-Implementation hints, visual preferences, or special requirements.
-
----
-
-## Spec ID Naming
-
-Format: `{type}-{variant}` or `{type}-{variant}-{modifier}`
-
-Examples:
-- `scatter-basic` - Simple scatter plot
-- `scatter-color-groups` - Scatter with color-coded groups
-- `bar-grouped-horizontal` - Horizontal grouped bars
-
-Rules:
-- Lowercase only
-- Hyphens as separators
-- Descriptive names (no numbers needed)
-
----
-
-## Workflow
-
-1. **User creates GitHub Issue** with plot idea
-2. **Bot assigns spec ID** and validates request
-3. **Maintainer adds `approved` label**
-4. **AI generates spec file** in `plots/{spec-id}/spec.md`
-5. **AI generates implementations** for all 9 libraries
-6. **Quality check** runs automatically
-7. **Auto-merge** if quality passes
-
----
-
-## Writing Good Specs
-
-### DO
-- Be specific about data requirements
-- Use realistic applications with domain context
-- Keep description concise (2-4 sentences)
-
-### DON'T
-- Include library-specific details
-- Add quality criteria (handled by central prompts)
-- Over-specify styling (AI decides based on style guide)
-
----
-
-*See `prompts/templates/spec.md` for the full template.*
diff --git a/docs/workflow.md b/docs/workflow.md
deleted file mode 100644
index 3e6b9543ef..0000000000
--- a/docs/workflow.md
+++ /dev/null
@@ -1,562 +0,0 @@
-# 🔄 pyplots Automation Workflow
-
-## Overview
-
-pyplots is a **community-driven, AI-powered platform** that automatically discovers, generates, tests, and maintains Python plotting examples. This document describes the high-level automation architecture that makes this possible.
-
-### Philosophy
-
-- **Start Simple, Scale Intelligently**: Begin with basics (Twitter, matplotlib), expand based on learnings
-- **Cost-Conscious Design**: Leverage existing subscriptions and smart resource allocation
-- **Quality Over Quantity**: Multi-LLM validation ensures only excellent examples go live
-- **Community-Driven**: Ideas from the data science community, curated by AI, approved by humans
-- **Always Current**: Event-based maintenance keeps examples updated with latest libraries and LLMs
-
-### Key Principles
-
-1. **Images in GCS, Code in GitHub**: Plot PNGs stored in Google Cloud Storage, source code version-controlled
-2. **Multi-Version Support**: All plots tested across Python 3.11+ (3.11, 3.12, 3.13, 3.13 primary)
-3. **Hybrid Automation**: AI handles routine tasks, humans approve critical decisions
-4. **Standard Datasets**: Use well-known datasets (pandas iris, seaborn tips, kaggle) for realistic previews
-5. **Event-Based Optimization**: Update plots when LLM/library versions change, not on fixed schedules
-
----
-
-## System Architecture
-
-```mermaid
-graph TB
- subgraph "Discovery & Input"
- SM[Social Media<br/>Twitter, Reddit, GitHub, ArXiv]
- GI[GitHub Issues<br/>Community Ideas]
- end
-
- subgraph "Orchestration Layer"
- N8N[n8n Cloud<br/>Workflow Engine]
- end
-
- subgraph "AI Processing"
- CCM[Claude Code Max<br/>Primary AI]
- VTX[Vertex AI<br/>Multi-LLM Critical Decisions]
- end
-
- subgraph "Testing & Generation"
- GHA[GitHub Actions<br/>Multi-Version Tests + Preview Gen]
- DS[Standard Datasets<br/>pandas, seaborn, kaggle]
- end
-
- subgraph "Storage & Deployment"
- GH[GitHub Repository<br/>Code Storage]
- GCS[Google Cloud Storage<br/>Image Hosting]
- SQL[Cloud SQL<br/>Metadata]
- CR[Cloud Run<br/>Web Platform]
- end
-
- SM --> N8N
- GI --> N8N
- N8N --> CCM
- CCM --> GHA
- GHA --> DS
- GHA --> GCS
- GHA --> GH
- CCM --> VTX
- VTX --> SQL
- GCS --> CR
- SQL --> CR
- GH --> CR
-```
-
-### Component Responsibilities
-
-| Component | Purpose | Usage Notes |
-|-----------|---------|-------------|
-| **GitHub Actions** | Code generation, testing, preview gen, quality checks, deployment | See `.github/workflows/` for implementation |
-| **n8n Cloud Pro** | Social media monitoring, posting, issue triage, maintenance scheduling | External service integration |
-| **Claude Code Max** | Code generation, routine evaluation, post content | Primary AI workload |
-| **Vertex AI (Multi-LLM)** | Critical quality decisions | Multi-LLM consensus for complex plots |
-| **Google Cloud Storage** | PNG hosting with lifecycle management | Preview images + generated plots |
-| **Cloud SQL (PostgreSQL)** | Metadata, tags, quality scores, promotion queue | All structured data |
-| **X (Twitter) API** | Social media posting | Max 2 posts/day |
-
-**Workflow files**: See `.github/workflows/` for all automation implementations (ci-*, bot-*, gen-*, util-*).
-
----
-
-## Core Automation Flows
-
-### Flow 1: Discovery & Ideation
-n8n monitors social media daily → AI extracts plot ideas → Creates GitHub issues with draft specs → Human reviews and approves
-
-### Flow 2: Specification Creation (with Approval Gate)
-
-User adds `spec-request` label to issue → **`spec-create.yml`** runs:
-
-1. Creates branch: `specification/{specification-id}`
-2. Claude generates: `plots/{specification-id}/specification.md` + `specification.yaml`
-3. Creates PR: `specification/{specification-id}` → `main`
-4. Posts analysis comment, waits for approval
-
-```
-Issue + [spec-request] label
- ↓
-spec-create.yml
- ├─ Creates branch: specification/scatter-basic
- ├─ Creates: plots/scatter-basic/specification.md
- ├─ Creates: plots/scatter-basic/specification.yaml
- └─ Creates PR → main (waits for approval)
- ↓
-Maintainer adds [approved] label
- ↓
-spec-create.yml (merge job)
- ├─ Merges PR to main
- ├─ Adds [spec-ready] label
- └─ sync-postgres.yml triggers
-```
-
-**Specification is now in main, ready for implementations.**
-
-### Flow 3: Implementation Generation
-
-Implementations run **independently** - each library gets its own workflow:
-
-**Triggers:**
-- `generate:{library}` label on issue (e.g., `generate:matplotlib`)
-- `workflow_dispatch` for manual triggering
-- `bulk-generate.yml` for batch operations
-
-**Process per library:**
-1. **`impl-generate.yml`** creates branch: `implementation/{specification-id}/{library}`
-2. Claude generates code, tests, uploads preview to GCS staging
-3. Creates PR: `implementation/{specification-id}/{library}` → `main`
-4. Triggers `impl-review.yml`
-
-```
-Issue + [generate:matplotlib] label OR workflow_dispatch
- ↓
-impl-generate.yml
- ├─ Creates branch: implementation/scatter-basic/matplotlib
- ├─ Generates: plots/scatter-basic/implementations/matplotlib.py
- ├─ Uploads to GCS staging
- └─ Creates PR → main, triggers impl-review.yml
-```
-
-**Key benefit**: Each library runs independently - no single point of failure!
-
-### Flow 4: Multi-Version Testing
-PR created → `ci-plottest.yml` runs tests across Python 3.11+ → Reports results
-
-### Flow 5: AI Review
-PR created → **`impl-review.yml`** runs:
-
-1. Downloads plot images from GCS staging
-2. Claude evaluates: Spec ↔ Code ↔ Preview
-3. Posts review comment with score
-4. Adds labels: `quality:XX`, `ai-approved` OR `ai-rejected`
-
-```
-impl-review.yml
- ├─ Score ≥90 → [ai-approved] → triggers impl-merge.yml
- └─ Score <90 → [ai-rejected] → triggers impl-repair.yml
-```
-
-### Flow 6: Repair Loop (max 3 attempts)
-PR labeled `ai-rejected` → **`impl-repair.yml`** triggers:
-
-1. Reads AI feedback from PR comments
-2. Claude fixes the implementation
-3. Re-uploads to GCS staging
-4. Re-triggers `impl-review.yml`
-5. After 3 failures: `not-feasible` label
-
-**Note**: Each library repairs independently - matplotlib can be on attempt 3 while seaborn already merged!
-
-### Flow 7: Auto-Merge
-
-PR labeled `ai-approved` → **`impl-merge.yml`** triggers:
-
-1. Squash-merges PR to main
-2. Creates `metadata/{library}.yaml` with quality score and generation info
-3. Promotes GCS images: staging → production
-4. Updates issue labels: `impl:{library}:done`
-5. `sync-postgres.yml` triggers automatically
-
-```
-impl-merge.yml
- ├─ Squash merge PR → main
- ├─ Creates: plots/scatter-basic/metadata/matplotlib.yaml
- ├─ Promotes GCS: staging → production
- └─ sync-postgres.yml triggers (database updated)
-```
-
-### Flow 8: Deployment & Maintenance
-Merged to main → Deploy to Cloud Run → Publicly visible on website → Event-based maintenance (LLM/library updates) → A/B test improvements
-
-### Flow 9: Social Media Promotion
-Deployed plot → Added to promotion queue (prioritized by quality score) → n8n posts 2x/day at 10 AM & 3 PM CET → Claude generates content → Posts to X with preview image
-
----
-
-## Decoupled Architecture
-
-The new architecture separates specification and implementation processes:
-
-**Benefits:**
-- **No single point of failure** - Each library runs independently
-- **Specifications can land in main without implementations**
-- **Partial implementations OK** - 6/9 done = fine
-- **No merge conflicts** - Per-library metadata files
-- **Flexible triggers** - Labels for single, dispatch for bulk
-- **PostgreSQL synced on every merge to main**
-
-### Implementation Lifecycle
-
-```mermaid
-graph LR
- A[Issue + generate:matplotlib] --> B[impl-generate.yml]
- B --> C[PR created]
- C --> D[impl-review.yml]
- D -->|Score ≥90| E[ai-approved]
- D -->|Score <90| F[ai-rejected]
- F -->|Attempt <3| G[impl-repair.yml]
- G --> D
- F -->|Attempt =3| H[not-feasible]
- E --> I[impl-merge.yml]
- I --> J[merged to main]
- J --> K[impl:matplotlib:done]
-```
-
-### Label System
-
-**Specification Labels:**
-| Label | Meaning |
-|-------|---------|
-| `spec-request` | New specification request |
-| `spec-update` | Update existing specification |
-| `spec-ready` | Specification merged to main |
-
-**Implementation Labels:**
-| Label | Meaning |
-|-------|---------|
-| `generate:{library}` | Trigger generation (e.g., `generate:matplotlib`) |
-| `impl:{library}:pending` | Generation in progress |
-| `impl:{library}:done` | Implementation merged to main |
-| `impl:{library}:failed` | Max retries exhausted |
-
-**PR Labels:**
-| Label | Meaning |
-|-------|---------|
-| `ai-approved` | Passed review (score ≥90, or ≥50 after 3 attempts) |
-| `ai-rejected` | Failed review (score <90), triggers repair loop |
-| `ai-attempt-1/2/3` | Retry counter |
-| `quality-poor` | Score <50, needs fundamental fixes |
-| `quality:XX` | Quality score (e.g., `quality:92`) |
-
-**Quality Workflow:**
-- **≥ 90**: ai-approved, merged immediately
-- **< 90**: ai-rejected, repair loop (up to 3 attempts)
-- **After 3 attempts**: ≥ 50 → merge, < 50 → close PR and regenerate
-
-### Bulk Operations
-
-Use `bulk-generate.yml` for batch operations:
-
-```bash
-# All libraries for one spec:
-workflow_dispatch: specification_id=scatter-basic, library=all
-
-# One library across all specs:
-workflow_dispatch: specification_id=all, library=matplotlib
-```
-
-**Concurrency**: Max 3 parallel implementation workflows globally.
-
----
-
-## Flow Integration
-
-```mermaid
-graph TD
- A[Flow 1: Discovery] -->|GitHub Issue| B{Human Review}
- B -->|Add spec-request| C[Flow 2: spec-create.yml]
- B -->|Rejected| Z[End]
-
- C -->|Creates PR| C1[Specification PR]
- C1 -->|Maintainer adds approved| C2[Merge to main]
- C2 -->|spec-ready label| D[Ready for implementations]
-
- D -->|Add generate:lib label| E[Flow 3: impl-generate.yml]
- E -->|Creates PR| F[Implementation PR]
- F --> G{Flow 5: impl-review.yml}
-
- G -->|Score ≥90| H[ai-approved]
- G -->|Score <90| I[ai-rejected]
-
- I --> J{Attempts < 3?}
- J -->|Yes| K[Flow 6: impl-repair.yml]
- K --> G
- J -->|No| L[not-feasible]
-
- H --> M[Flow 7: impl-merge.yml]
- M -->|Merge to main| M1[Creates metadata/lib.yaml]
- M1 --> M2[sync-postgres.yml]
- M2 --> N[🌐 Publicly Visible]
-
- L --> L1[impl:lib:failed label]
-
- N --> P[Flow 9: Promotion Queue]
- P --> Q{Daily Limit?}
- Q -->|< 2 posts| R[Post to X]
- Q -->|Limit| S[Wait]
- R --> Z
- S -.->|Next day| Q
-
- style A fill:#e1f5ff
- style C fill:#ffe4b5
- style E fill:#fff4e1
- style G fill:#f0e1ff
- style M fill:#98FB98
- style N fill:#90EE90
- style L fill:#FF6B6B
- style P fill:#E6E6FA
- style R fill:#98FB98
-```
-
----
-
-## Decision Framework
-
-### AI Decides Automatically
-
-✅ **Similar plots** (high semantic similarity to existing specs)
-✅ **Routine quality checks** (standard visualizations)
-✅ **Tag generation** (categorization and clustering)
-✅ **Version compatibility** detection (which Python versions supported)
-✅ **Standard optimizations** (code formatting, best practices)
-
-### Human Approval Required
-
-⚠️ **New plot types** (low similarity to existing specs)
-⚠️ **Complex visualizations** (3D, animations, interactive)
-⚠️ **Multi-LLM disagreement** (no majority consensus)
-⚠️ **Breaking changes** (major spec modifications)
-
-### Approval Mechanism
-
-Via **GitHub Issue Labels**:
-- `approved` → Proceed to code generation
-- `rejected` → Close issue
-- `needs-revision` → Request changes from proposer
-
----
-
-## Resource Management
-
-### Leveraging Existing Subscriptions
-
-| Resource | Subscription | Usage | Monthly Cost |
-|----------|-------------|-------|--------------|
-| **GitHub Pro** | ✅ Active | Actions (testing + preview gen) | Included |
-| **n8n Cloud Pro** | ✅ Active | Workflow orchestration | Included (subscribed) |
-| **Claude Code Max** | ✅ Active | Primary AI workload | Included |
-| **Google Cloud** | Pay-as-you-go | GCS, Cloud SQL, Cloud Run | Variable |
-| **Vertex AI** | Pay-per-use | Multi-LLM critical decisions only | Minimal |
-
-### Cost Optimization Strategies
-
-1. **Smart AI Usage**:
- - Claude Code Max for routine work (already subscribed)
- - Vertex AI multi-LLM only for critical decisions
- - Avoid redundant evaluations
-
-2. **Efficient Storage**:
- - Path structure: `plots/{spec-id}/{library}/plot.png`
- - Thumbnails: `plot_thumb.png` (600px width) for gallery views
- - Images never in git repository
- - Only latest version stored (no version history)
-
-3. **Smart Scheduling**:
- - Event-based maintenance (not daily scheduled)
- - Batch processing when possible
- - GitHub Actions matrix for parallel testing
-
-4. **Data Efficiency**:
- - Standard datasets (no AI generation needed)
- - Small CSVs in repo acceptable
- - Reuse datasets across similar plots
-
----
-
-## Data & Testing Strategy
-
-### Sample Data for Previews
-
-**Critical Principle**: All plot code must be **100% standalone and deterministic**
-
-**Data Embedding Strategy**:
-
-1. **Small datasets** - Hardcoded dict/list directly in code (recommended)
-2. **Standard datasets** - Use `sns.load_dataset('iris')` or similar (always produces same data)
-3. **AI-generated data** - AI generates once with fixed seed, then hardcoded
-4. **Seeded random** - Use `np.random.seed(42)` for reproducibility
-
-**Why This Matters**:
-- Same code must produce same image every single time
-- Quality reviewers must see the exact image that will be deployed
-- Users must see the exact image shown in previews
-- No surprises, no randomness, complete reproducibility
-
-**Code Requirements**:
-- ✅ Self-contained (no external file loading)
-- ✅ Deterministic (same output every run)
-- ✅ Includes explanation text as docstring
-- ✅ Sample data embedded directly in code
-- ❌ No CSV file loading
-- ❌ No random data without fixed seed
-- ❌ No external API calls
-
-### Multi-Version Testing
-
-**Python Versions Supported**: 3.11+ (tested on 3.11, 3.12, 3.13, 3.13)
-
-**Primary Version**: Python 3.13 (required to pass, generates plot images)
-
-**Testing Infrastructure**: GitHub Actions matrix tests all Python versions in parallel. See `ci-plottest.yml`.
-
-**Test Triggers**:
-- On Pull Request creation
-- Before Quality Assurance flow
-- Not on every commit (saves resources)
-
-**Version Compatibility Documentation**:
-- Code optimized for Python 3.13 (newest)
-- Older versions (3.11-3.13) run as compatibility tests
-- Failures in older versions don't block the PR
-
-**Test Requirements**:
-- Python 3.13 tests must pass (primary)
-- Plot images only generated with Python 3.13
-- Older version failures logged but don't block merge
-
----
-
-## Phased Rollout
-
-### Phase 1: MVP (Current Focus)
-
-**Scope**:
-- 🎯 **Monitoring**: Twitter only
-- 📊 **Libraries**: All 8 supported (matplotlib, seaborn, plotly, bokeh, altair, plotnine, pygal, highcharts)
-- 🐍 **Python**: 3.13+ (primary), tested on 3.11-3.13
-- ✋ **Approval**: Manual for all new plots
-- ✅ **Quality**: Basic Claude evaluation
-- 📱 **Promotion**: X (Twitter) posting with 2/day limit
-
-**Supported Libraries**:
-| Library | Strength |
-|---------|----------|
-| matplotlib | The classic standard, maximum flexibility |
-| seaborn | Statistical visualizations, beautiful defaults |
-| plotly | Interactive web plots, dashboards, 3D |
-| bokeh | Interactive, streaming data, large datasets |
-| altair | Declarative/Vega-Lite, elegant exploration |
-| plotnine | ggplot2 syntax for R users |
-| pygal | Minimalistic SVG charts |
-| highcharts | Interactive web charts, stock charts |
-| lets-plot | ggplot2 grammar of graphics by JetBrains |
-
-**Goal**: Prove automation pipeline works end-to-end with all libraries
-
----
-
-### Phase 2: Expansion
-
-**Add**:
-- 🎯 **Monitoring**: + Reddit (r/dataisbeautiful, r/Python)
-- 🎯 **Monitoring**: + GitHub Trending/Discussions
-- 🤖 **Approval**: Hybrid (auto for similar, manual for new)
-- ✅ **Quality**: Multi-LLM for critical decisions
-- 📱 **Promotion**: + LinkedIn posts for professional audience
-
-**Goal**: Scale content production and improve automation
-
----
-
-### Phase 3: Full Automation
-
-**Add**:
-- 🎯 **Monitoring**: + ArXiv papers (academic visualizations)
-- 📊 **Libraries**: + specialized libraries as needed
-- 🤖 **Approval**: Intelligent auto-approval (high confidence)
-- 🔄 **Maintenance**: Proactive optimization suggestions
-- 🌐 **Community**: Public spec submissions via issues
-- 📱 **Promotion**: + Reddit posts (r/dataisbeautiful, r/Python), cross-platform coordination
-
-**Goal**: Comprehensive, self-maintaining plot library
-
----
-
-## Rule Versioning & Testing
-
-**NEW**: The system now includes versioned rules for code generation and quality evaluation.
-
-**Location**: `rules/` directory
-
-**Key Features**:
-- 📋 **Versioned Rules**: Generation rules and quality criteria stored as Markdown (vX.Y.Z)
-- 🧪 **A/B Testing**: Compare rule versions before deploying
-- 📊 **Audit Trail**: Know which rule version generated each plot
-- 🔄 **Rollback**: Instant rollback to previous rules if issues arise
-- 📈 **Scientific Improvement**: Prove new rules are better with data
-
-**Current Status** (Documentation Phase):
-- ✅ Rule templates created (rules/templates/)
-- ✅ Initial draft rules (rules/generation/v1.0.0-draft/)
-- ⏳ Automation not yet implemented
-- ⏳ A/B testing framework planned
-
-**Integration with Workflow**:
-- When automation is implemented, all code generation will use specific rule versions
-- Quality evaluation will reference versioned criteria
-- Rule improvements will be A/B tested before deployment
-
-**See Also**:
-- [A/B Testing Strategies](docs/concepts/ab-testing-rules.md)
-- [Claude Skill Concept](docs/concepts/claude-skill-plot-generation.md)
-
----
-
-## Summary
-
-This workflow ensures:
-
-✅ **Decoupled Architecture**:
- - Specification and implementation processes run independently
- - No single point of failure
- - Specifications can land in main without implementations
- - Partial implementations OK (6/9 done = fine)
- - Per-library metadata files (no merge conflicts!)
-
-✅ **Flexible Triggers**:
- - Labels (`generate:matplotlib`) for single implementations
- - `workflow_dispatch` for manual control
- - `bulk-generate.yml` for batch operations
- - Max 3 parallel implementations globally
-
-✅ **Multi-Layer Quality Control**:
- - AI review with vision (code + image evaluation)
- - Self-repair loop (max 3 attempts per library)
- - Quality scores tracked in metadata
- - Feedback-driven optimization on rejection
-
-✅ **PostgreSQL Synced on Every Merge**:
- - `sync-postgres.yml` triggers on push to main
- - Database always reflects repository state
-
-✅ **Only High-Quality Plots on Website**: Failed attempts never publicly visible
-✅ **Automated Marketing**: Queue-based social media promotion with smart rate limiting (max 2 posts/day)
-✅ **Cost-Conscious** design leveraging existing subscriptions
-✅ **Smart Storage** with GCS staging/production flow
-✅ **Deterministic & Reproducible**: Same code = same image every time
-✅ **Community-Driven** with AI curation and human oversight
-
-The system is designed to **scale from MVP to full automation** while maintaining the highest quality standards, controlling costs, and automatically promoting the best content to the community.
diff --git a/docs/workflows/overview.md b/docs/workflows/overview.md
new file mode 100644
index 0000000000..517c78b9ed
--- /dev/null
+++ b/docs/workflows/overview.md
@@ -0,0 +1,150 @@
+# Workflow Overview
+
+## How pyplots Automation Works
+
+pyplots uses GitHub Actions to automate the entire plot lifecycle, from specification creation through implementation generation and quality review to deployment.
+
+---
+
+## The Two Main Pipelines
+
+### 1. Specification Pipeline
+
+```
+Issue + [spec-request] label
+ |
+ v
+spec-create.yml
+ |-- Creates branch: specification/{spec-id}
+ |-- Generates: specification.md + specification.yaml
+ |-- Creates PR --> main
+ |-- Posts analysis comment
+ |
+ v (maintainer adds [approved] label to Issue)
+ |
+spec-create.yml (merge job)
+ |-- Merges PR to main
+ |-- Adds [spec-ready] label
+ |-- Triggers sync-postgres.yml
+```
+
+### 2. Implementation Pipeline
+
+```
+Issue + [generate:{library}] label OR workflow_dispatch
+ |
+ v
+impl-generate.yml
+ |-- Creates branch: implementation/{spec-id}/{library}
+ |-- AI generates code
+ |-- Creates metadata/{library}.yaml (initial)
+ |-- Tests execution
+ |-- Uploads preview to GCS staging
+ |-- Creates PR --> main
+ |
+ v
+impl-review.yml
+ |-- AI evaluates code + image
+ |-- Posts review comment with score
+ |-- Updates metadata/{library}.yaml (quality_score, review feedback)
+ |-- Adds [quality:XX] label
+ |
+ |-- Score >= 90 --> [ai-approved] --> impl-merge.yml
+ | |-- Squash merge
+ | |-- Promotes GCS: staging --> production
+ | |-- Triggers sync-postgres.yml
+ |
+ |-- Score < 90 --> [ai-rejected] --> impl-repair.yml (max 3 attempts)
+ |-- Reads AI feedback
+ |-- Fixes implementation
+ |-- Re-triggers impl-review.yml
+```
+
+---
+
+## Label System
+
+### Specification Labels (on Issues)
+
+| Label | Meaning | Set By |
+|-------|---------|--------|
+| `spec-request` | New specification request | User |
+| `spec-update` | Update existing specification | User |
+| `spec-ready` | Specification merged, ready for implementations | Workflow |
+
+### Implementation Labels (on Issues)
+
+| Label | Meaning | Set By |
+|-------|---------|--------|
+| `generate:{library}` | Trigger generation for library | User |
+| `impl:{library}:pending` | Generation in progress | Workflow |
+| `impl:{library}:done` | Implementation merged to main | Workflow |
+| `impl:{library}:failed` | Max retries exhausted | Workflow |
+
+### PR Labels (on Pull Requests)
+
+| Label | Meaning | Set By |
+|-------|---------|--------|
+| `ai-approved` | Quality check passed (score >= 90, or >= 50 after 3 attempts) | Workflow |
+| `ai-rejected` | Quality check failed, triggers repair | Workflow |
+| `ai-attempt-1/2/3` | Retry counter | Workflow |
+| `quality:XX` | Quality score (e.g., quality:92) | Workflow |
+| `quality-poor` | Score < 50, needs fundamental fixes | Workflow |
+
+### Approval Labels
+
+| Label | Meaning | Set By |
+|-------|---------|--------|
+| `approved` | Human approved specification | Maintainer |
+| `rejected` | Human rejected | Maintainer |
+
+---
+
+## Quality Workflow
+
+- **Score >= 90**: Immediately approved and merged
+- **Score < 90**: Repair loop (up to 3 attempts)
+- **After 3 attempts**:
+ - Score >= 50: Merge anyway (PR is labeled `ai-approved`)
+ - Score < 50: Close PR, mark as failed
+
+---
+
+## Key Principles
+
+1. **Decoupled**: Each library runs independently (no single point of failure)
+2. **Partial OK**: A spec may have only some libraries implemented (e.g., 6 of 9)
+3. **No merge conflicts**: Per-library metadata files
+4. **Auto-sync**: Database updated on every merge to main
+5. **GCS flow**: staging --> production only after merge
+
+---
+
+## Workflow Files
+
+Located in `.github/workflows/`:
+
+| Workflow | Purpose |
+|----------|---------|
+| `spec-create.yml` | Creates new specifications |
+| `spec-update.yml` | Updates existing specifications |
+| `impl-generate.yml` | Generates single implementation |
+| `impl-review.yml` | AI quality review |
+| `impl-repair.yml` | Fixes rejected implementations |
+| `impl-merge.yml` | Merges approved PRs |
+| `bulk-generate.yml` | Batch implementation generation |
+| `sync-postgres.yml` | Syncs plots/ to database |
+
+---
+
+## Bulk Operations
+
+```bash
+# All libraries for one spec:
+gh workflow run bulk-generate.yml -f specification_id=scatter-basic -f library=all
+
+# One library across all specs:
+gh workflow run bulk-generate.yml -f specification_id=all -f library=matplotlib
+```
+
+**Concurrency limit**: Max 3 parallel implementations globally.
diff --git a/plots/alluvial-basic/specification.yaml b/plots/alluvial-basic/specification.yaml
index 8432b133ed..2010e2bfde 100644
--- a/plots/alluvial-basic/specification.yaml
+++ b/plots/alluvial-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 1878
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- alluvial
diff --git a/plots/andrews-curves/specification.yaml b/plots/andrews-curves/specification.yaml
index 39da086d23..3dac679b76 100644
--- a/plots/andrews-curves/specification.yaml
+++ b/plots/andrews-curves/specification.yaml
@@ -11,7 +11,7 @@ issue: 2859
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/area-stacked/specification.yaml b/plots/area-stacked/specification.yaml
index f07236c4f5..ee4bef11dc 100644
--- a/plots/area-stacked/specification.yaml
+++ b/plots/area-stacked/specification.yaml
@@ -11,7 +11,7 @@ issue: 2022
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- area
diff --git a/plots/bar-3d/specification.yaml b/plots/bar-3d/specification.yaml
index 8cbc3038d0..ff1844d7b1 100644
--- a/plots/bar-3d/specification.yaml
+++ b/plots/bar-3d/specification.yaml
@@ -11,7 +11,7 @@ issue: 2857
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- bar
diff --git a/plots/bar-diverging/specification.yaml b/plots/bar-diverging/specification.yaml
index 3c7b074c93..f0174fcc2a 100644
--- a/plots/bar-diverging/specification.yaml
+++ b/plots/bar-diverging/specification.yaml
@@ -11,7 +11,7 @@ issue: 2009
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- bar
diff --git a/plots/bar-error/specification.yaml b/plots/bar-error/specification.yaml
index d226e7a7be..98a3175409 100644
--- a/plots/bar-error/specification.yaml
+++ b/plots/bar-error/specification.yaml
@@ -11,7 +11,7 @@ issue: 2376
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- bar
diff --git a/plots/bar-feature-importance/specification.yaml b/plots/bar-feature-importance/specification.yaml
index 93261be854..7a41fd9c83 100644
--- a/plots/bar-feature-importance/specification.yaml
+++ b/plots/bar-feature-importance/specification.yaml
@@ -11,7 +11,7 @@ issue: 2276
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- bar
diff --git a/plots/bar-grouped/specification.yaml b/plots/bar-grouped/specification.yaml
index 31e655999a..ef8eea729a 100644
--- a/plots/bar-grouped/specification.yaml
+++ b/plots/bar-grouped/specification.yaml
@@ -11,7 +11,7 @@ issue: 1822
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- bar
diff --git a/plots/bar-horizontal/specification.yaml b/plots/bar-horizontal/specification.yaml
index d9c0ad9a90..7abbffe0c5 100644
--- a/plots/bar-horizontal/specification.yaml
+++ b/plots/bar-horizontal/specification.yaml
@@ -11,7 +11,7 @@ issue: 1946
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- bar
diff --git a/plots/bar-permutation-importance/specification.yaml b/plots/bar-permutation-importance/specification.yaml
index 5ba10dc348..7de1b64081 100644
--- a/plots/bar-permutation-importance/specification.yaml
+++ b/plots/bar-permutation-importance/specification.yaml
@@ -11,7 +11,7 @@ issue: 2998
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- bar
diff --git a/plots/bar-stacked-percent/specification.yaml b/plots/bar-stacked-percent/specification.yaml
index b70fb9650c..24506f6a99 100644
--- a/plots/bar-stacked-percent/specification.yaml
+++ b/plots/bar-stacked-percent/specification.yaml
@@ -11,7 +11,7 @@ issue: 2008
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- bar
diff --git a/plots/bar-stacked/specification.yaml b/plots/bar-stacked/specification.yaml
index 0894439dd5..a9c6013ee3 100644
--- a/plots/bar-stacked/specification.yaml
+++ b/plots/bar-stacked/specification.yaml
@@ -11,7 +11,7 @@ issue: 1947
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- bar
diff --git a/plots/bland-altman-basic/specification.yaml b/plots/bland-altman-basic/specification.yaml
index 54e13cf88f..4f0b72b92a 100644
--- a/plots/bland-altman-basic/specification.yaml
+++ b/plots/bland-altman-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2032
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- bland-altman
diff --git a/plots/box-grouped/specification.yaml b/plots/box-grouped/specification.yaml
index 69bbefd815..1efde8c473 100644
--- a/plots/box-grouped/specification.yaml
+++ b/plots/box-grouped/specification.yaml
@@ -11,7 +11,7 @@ issue: 2017
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- box
diff --git a/plots/box-notched/specification.yaml b/plots/box-notched/specification.yaml
index c52a757a84..a69e48ce50 100644
--- a/plots/box-notched/specification.yaml
+++ b/plots/box-notched/specification.yaml
@@ -11,7 +11,7 @@ issue: 2019
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- box
diff --git a/plots/calibration-curve/specification.yaml b/plots/calibration-curve/specification.yaml
index bc6b0cb4d5..2494e72d0e 100644
--- a/plots/calibration-curve/specification.yaml
+++ b/plots/calibration-curve/specification.yaml
@@ -11,7 +11,7 @@ issue: 2331
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- calibration
diff --git a/plots/candlestick-volume/specification.yaml b/plots/candlestick-volume/specification.yaml
index d7a02a0bb5..d438372aad 100644
--- a/plots/candlestick-volume/specification.yaml
+++ b/plots/candlestick-volume/specification.yaml
@@ -11,7 +11,7 @@ issue: 3068
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- candlestick
diff --git a/plots/chernoff-basic/specification.yaml b/plots/chernoff-basic/specification.yaml
index 354b40b0f1..b956b165bd 100644
--- a/plots/chernoff-basic/specification.yaml
+++ b/plots/chernoff-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 3003
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- chernoff
diff --git a/plots/choropleth-basic/specification.yaml b/plots/choropleth-basic/specification.yaml
index 4af133ecd8..208fa1afbe 100644
--- a/plots/choropleth-basic/specification.yaml
+++ b/plots/choropleth-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 3069
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- choropleth
diff --git a/plots/circlepacking-basic/specification.yaml b/plots/circlepacking-basic/specification.yaml
index 8df1a1a1a7..c3903cf02c 100644
--- a/plots/circlepacking-basic/specification.yaml
+++ b/plots/circlepacking-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2498
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- circle-packing
diff --git a/plots/circos-basic/specification.yaml b/plots/circos-basic/specification.yaml
index 20371ed253..37e35ccdd5 100644
--- a/plots/circos-basic/specification.yaml
+++ b/plots/circos-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 3005
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- circos
diff --git a/plots/confusion-matrix/specification.yaml b/plots/confusion-matrix/specification.yaml
index fdaad95a8b..11d03de4e9 100644
--- a/plots/confusion-matrix/specification.yaml
+++ b/plots/confusion-matrix/specification.yaml
@@ -11,7 +11,7 @@ issue: 2272
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- heatmap
diff --git a/plots/contour-decision-boundary/specification.yaml b/plots/contour-decision-boundary/specification.yaml
index 7e6bfa7cb2..243c4de52a 100644
--- a/plots/contour-decision-boundary/specification.yaml
+++ b/plots/contour-decision-boundary/specification.yaml
@@ -11,7 +11,7 @@ issue: 2921
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- contour
diff --git a/plots/contour-filled/specification.yaml b/plots/contour-filled/specification.yaml
index c56a94c08f..9a1120637d 100644
--- a/plots/contour-filled/specification.yaml
+++ b/plots/contour-filled/specification.yaml
@@ -11,7 +11,7 @@ issue: 2500
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- contour
diff --git a/plots/count-basic/specification.yaml b/plots/count-basic/specification.yaml
index 7a932da34c..77081e1c5d 100644
--- a/plots/count-basic/specification.yaml
+++ b/plots/count-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2033
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- bar
diff --git a/plots/donut-nested/specification.yaml b/plots/donut-nested/specification.yaml
index 821decb846..8b8e2053b1 100644
--- a/plots/donut-nested/specification.yaml
+++ b/plots/donut-nested/specification.yaml
@@ -11,7 +11,7 @@ issue: 2015
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- donut
diff --git a/plots/elbow-curve/specification.yaml b/plots/elbow-curve/specification.yaml
index 7b98df8d41..1a26e43bc4 100644
--- a/plots/elbow-curve/specification.yaml
+++ b/plots/elbow-curve/specification.yaml
@@ -11,7 +11,7 @@ issue: 2333
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/errorbar-asymmetric/specification.yaml b/plots/errorbar-asymmetric/specification.yaml
index 6899cf39da..123c23e78b 100644
--- a/plots/errorbar-asymmetric/specification.yaml
+++ b/plots/errorbar-asymmetric/specification.yaml
@@ -11,7 +11,7 @@ issue: 2781
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- errorbar
diff --git a/plots/forest-basic/specification.yaml b/plots/forest-basic/specification.yaml
index 61028c60b5..a8e770d22a 100644
--- a/plots/forest-basic/specification.yaml
+++ b/plots/forest-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2378
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- forest
diff --git a/plots/gain-curve/specification.yaml b/plots/gain-curve/specification.yaml
index 9b7d1c0de0..3c245a049a 100644
--- a/plots/gain-curve/specification.yaml
+++ b/plots/gain-curve/specification.yaml
@@ -11,7 +11,7 @@ issue: 2440
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/gantt-basic/specification.yaml b/plots/gantt-basic/specification.yaml
index 64f4a2c86b..6b3e169e67 100644
--- a/plots/gantt-basic/specification.yaml
+++ b/plots/gantt-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2377
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- gantt
diff --git a/plots/heatmap-annotated/specification.yaml b/plots/heatmap-annotated/specification.yaml
index 86420f44d4..a5969451e8 100644
--- a/plots/heatmap-annotated/specification.yaml
+++ b/plots/heatmap-annotated/specification.yaml
@@ -11,7 +11,7 @@ issue: 1824
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- heatmap
diff --git a/plots/heatmap-clustered/specification.yaml b/plots/heatmap-clustered/specification.yaml
index fe59a092dc..ffe0d5dff2 100644
--- a/plots/heatmap-clustered/specification.yaml
+++ b/plots/heatmap-clustered/specification.yaml
@@ -11,7 +11,7 @@ issue: 2021
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- heatmap
diff --git a/plots/heatmap-correlation/specification.yaml b/plots/heatmap-correlation/specification.yaml
index f34eea4b54..2624442031 100644
--- a/plots/heatmap-correlation/specification.yaml
+++ b/plots/heatmap-correlation/specification.yaml
@@ -11,7 +11,7 @@ issue: 1948
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- heatmap
diff --git a/plots/histogram-2d/specification.yaml b/plots/histogram-2d/specification.yaml
index 98d4d49f52..122c0081f7 100644
--- a/plots/histogram-2d/specification.yaml
+++ b/plots/histogram-2d/specification.yaml
@@ -11,7 +11,7 @@ issue: 2012
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- histogram
diff --git a/plots/histogram-density/specification.yaml b/plots/histogram-density/specification.yaml
index 541849fca4..130805b58a 100644
--- a/plots/histogram-density/specification.yaml
+++ b/plots/histogram-density/specification.yaml
@@ -11,7 +11,7 @@ issue: 2442
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- histogram
diff --git a/plots/histogram-kde/specification.yaml b/plots/histogram-kde/specification.yaml
index bd8b10902a..5466c3431f 100644
--- a/plots/histogram-kde/specification.yaml
+++ b/plots/histogram-kde/specification.yaml
@@ -11,7 +11,7 @@ issue: 1823
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- histogram
diff --git a/plots/histogram-overlapping/specification.yaml b/plots/histogram-overlapping/specification.yaml
index 5cf51fa52a..fe010dc930 100644
--- a/plots/histogram-overlapping/specification.yaml
+++ b/plots/histogram-overlapping/specification.yaml
@@ -11,7 +11,7 @@ issue: 2010
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- histogram
diff --git a/plots/hive-basic/specification.yaml b/plots/hive-basic/specification.yaml
index 13e66fea2b..0374419016 100644
--- a/plots/hive-basic/specification.yaml
+++ b/plots/hive-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 1879
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- hive
diff --git a/plots/horizon-basic/specification.yaml b/plots/horizon-basic/specification.yaml
index 9c733faba5..980976bee1 100644
--- a/plots/horizon-basic/specification.yaml
+++ b/plots/horizon-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 1877
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- horizon
diff --git a/plots/learning-curve-basic/specification.yaml b/plots/learning-curve-basic/specification.yaml
index 367cc8f7bf..8a0c3cf6c3 100644
--- a/plots/learning-curve-basic/specification.yaml
+++ b/plots/learning-curve-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2275
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/lift-curve/specification.yaml b/plots/lift-curve/specification.yaml
index 83d97fe775..24c2c5ba9e 100644
--- a/plots/lift-curve/specification.yaml
+++ b/plots/lift-curve/specification.yaml
@@ -11,7 +11,7 @@ issue: 2379
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- lift
diff --git a/plots/line-annotated-events/specification.yaml b/plots/line-annotated-events/specification.yaml
index 40fa6346f4..09182ff94c 100644
--- a/plots/line-annotated-events/specification.yaml
+++ b/plots/line-annotated-events/specification.yaml
@@ -11,7 +11,7 @@ issue: 2997
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/line-confidence/specification.yaml b/plots/line-confidence/specification.yaml
index f24539483f..3a4def987b 100644
--- a/plots/line-confidence/specification.yaml
+++ b/plots/line-confidence/specification.yaml
@@ -11,7 +11,7 @@ issue: 2007
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/line-interactive/specification.yaml b/plots/line-interactive/specification.yaml
index dbc2ba5366..b96f6e1a8c 100644
--- a/plots/line-interactive/specification.yaml
+++ b/plots/line-interactive/specification.yaml
@@ -11,7 +11,7 @@ issue: 2787
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/line-loss-training/specification.yaml b/plots/line-loss-training/specification.yaml
index 879a360003..4350aa2803 100644
--- a/plots/line-loss-training/specification.yaml
+++ b/plots/line-loss-training/specification.yaml
@@ -11,7 +11,7 @@ issue: 2860
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/line-multi/specification.yaml b/plots/line-multi/specification.yaml
index c6430acb56..aaaf0b78f2 100644
--- a/plots/line-multi/specification.yaml
+++ b/plots/line-multi/specification.yaml
@@ -11,7 +11,7 @@ issue: 1825
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/line-realtime/specification.yaml b/plots/line-realtime/specification.yaml
index dd7459aabf..068a2eb7f4 100644
--- a/plots/line-realtime/specification.yaml
+++ b/plots/line-realtime/specification.yaml
@@ -11,7 +11,7 @@ issue: 3073
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/line-timeseries-rolling/specification.yaml b/plots/line-timeseries-rolling/specification.yaml
index 8bbb1da47a..e87fa74374 100644
--- a/plots/line-timeseries-rolling/specification.yaml
+++ b/plots/line-timeseries-rolling/specification.yaml
@@ -11,7 +11,7 @@ issue: 2786
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/line-timeseries/specification.yaml b/plots/line-timeseries/specification.yaml
index b169884ae0..f13f7c9e29 100644
--- a/plots/line-timeseries/specification.yaml
+++ b/plots/line-timeseries/specification.yaml
@@ -11,7 +11,7 @@ issue: 2006
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/manhattan-gwas/specification.yaml b/plots/manhattan-gwas/specification.yaml
index a9d70eccb0..6cb64b9844 100644
--- a/plots/manhattan-gwas/specification.yaml
+++ b/plots/manhattan-gwas/specification.yaml
@@ -11,7 +11,7 @@ issue: 2925
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- scatter
diff --git a/plots/network-directed/specification.yaml b/plots/network-directed/specification.yaml
index 54aeb72888..a1c216e6bb 100644
--- a/plots/network-directed/specification.yaml
+++ b/plots/network-directed/specification.yaml
@@ -11,7 +11,7 @@ issue: 2858
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- network
diff --git a/plots/parallel-categories-basic/specification.yaml b/plots/parallel-categories-basic/specification.yaml
index 4866541453..2083488dee 100644
--- a/plots/parallel-categories-basic/specification.yaml
+++ b/plots/parallel-categories-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2501
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- parallel-categories
diff --git a/plots/parliament-basic/specification.yaml b/plots/parliament-basic/specification.yaml
index a7f74a207a..1f9c5c0064 100644
--- a/plots/parliament-basic/specification.yaml
+++ b/plots/parliament-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2499
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- parliament
diff --git a/plots/pdp-basic/specification.yaml b/plots/pdp-basic/specification.yaml
index f4fc5fb529..4001404c3e 100644
--- a/plots/pdp-basic/specification.yaml
+++ b/plots/pdp-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2922
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/phase-diagram/specification.yaml b/plots/phase-diagram/specification.yaml
index 2c384e51c8..78a557f135 100644
--- a/plots/phase-diagram/specification.yaml
+++ b/plots/phase-diagram/specification.yaml
@@ -11,7 +11,7 @@ issue: 3004
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/pie-drilldown/specification.yaml b/plots/pie-drilldown/specification.yaml
index 83df54ab11..110f65ae41 100644
--- a/plots/pie-drilldown/specification.yaml
+++ b/plots/pie-drilldown/specification.yaml
@@ -11,7 +11,7 @@ issue: 3072
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- pie
diff --git a/plots/pie-exploded/specification.yaml b/plots/pie-exploded/specification.yaml
index b065326ac0..72eb34565f 100644
--- a/plots/pie-exploded/specification.yaml
+++ b/plots/pie-exploded/specification.yaml
@@ -11,7 +11,7 @@ issue: 2013
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- pie
diff --git a/plots/precision-recall/specification.yaml b/plots/precision-recall/specification.yaml
index da7b912afe..d89e161aaf 100644
--- a/plots/precision-recall/specification.yaml
+++ b/plots/precision-recall/specification.yaml
@@ -11,7 +11,7 @@ issue: 2274
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/radar-multi/specification.yaml b/plots/radar-multi/specification.yaml
index dfa7eb6852..fc918f9d0f 100644
--- a/plots/radar-multi/specification.yaml
+++ b/plots/radar-multi/specification.yaml
@@ -11,7 +11,7 @@ issue: 2026
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- radar
diff --git a/plots/raincloud-basic/specification.yaml b/plots/raincloud-basic/specification.yaml
index b3018aae4e..96687986db 100644
--- a/plots/raincloud-basic/specification.yaml
+++ b/plots/raincloud-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 1876
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- raincloud
diff --git a/plots/residual-basic/specification.yaml b/plots/residual-basic/specification.yaml
index de2c70d4f7..8b1035f263 100644
--- a/plots/residual-basic/specification.yaml
+++ b/plots/residual-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2030
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- residual
diff --git a/plots/residual-plot/specification.yaml b/plots/residual-plot/specification.yaml
index 49eadcfe1d..92ec2598d1 100644
--- a/plots/residual-plot/specification.yaml
+++ b/plots/residual-plot/specification.yaml
@@ -11,7 +11,7 @@ issue: 2332
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- scatter
diff --git a/plots/roc-curve/specification.yaml b/plots/roc-curve/specification.yaml
index e2c43141d5..8277df6b6b 100644
--- a/plots/roc-curve/specification.yaml
+++ b/plots/roc-curve/specification.yaml
@@ -11,7 +11,7 @@ issue: 2273
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/scatter-animated-controls/specification.yaml b/plots/scatter-animated-controls/specification.yaml
index 4f2c232599..197fc1c71b 100644
--- a/plots/scatter-animated-controls/specification.yaml
+++ b/plots/scatter-animated-controls/specification.yaml
@@ -11,7 +11,7 @@ issue: 3067
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- scatter
diff --git a/plots/scatter-annotated/specification.yaml b/plots/scatter-annotated/specification.yaml
index a12273f8f4..5434044c5c 100644
--- a/plots/scatter-annotated/specification.yaml
+++ b/plots/scatter-annotated/specification.yaml
@@ -11,7 +11,7 @@ issue: 2790
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- scatter
diff --git a/plots/scatter-color-mapped/specification.yaml b/plots/scatter-color-mapped/specification.yaml
index e3a1fd40d3..593cef7d8b 100644
--- a/plots/scatter-color-mapped/specification.yaml
+++ b/plots/scatter-color-mapped/specification.yaml
@@ -11,7 +11,7 @@ issue: 2004
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- scatter
diff --git a/plots/scatter-marginal/specification.yaml b/plots/scatter-marginal/specification.yaml
index 7c17c2bf94..a2fe7fab46 100644
--- a/plots/scatter-marginal/specification.yaml
+++ b/plots/scatter-marginal/specification.yaml
@@ -11,7 +11,7 @@ issue: 2005
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- scatter
diff --git a/plots/scatter-matrix/specification.yaml b/plots/scatter-matrix/specification.yaml
index 0579b84ed5..124e239419 100644
--- a/plots/scatter-matrix/specification.yaml
+++ b/plots/scatter-matrix/specification.yaml
@@ -11,7 +11,7 @@ issue: 2035
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- scatter
diff --git a/plots/scatter-regression-linear/specification.yaml b/plots/scatter-regression-linear/specification.yaml
index aef3800b08..72f68d0b6e 100644
--- a/plots/scatter-regression-linear/specification.yaml
+++ b/plots/scatter-regression-linear/specification.yaml
@@ -11,7 +11,7 @@ issue: 1821
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- scatter
diff --git a/plots/scatter-regression-lowess/specification.yaml b/plots/scatter-regression-lowess/specification.yaml
index f02736f4a2..67f692ce5e 100644
--- a/plots/scatter-regression-lowess/specification.yaml
+++ b/plots/scatter-regression-lowess/specification.yaml
@@ -11,7 +11,7 @@ issue: 2855
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- scatter
diff --git a/plots/scatter-regression-polynomial/specification.yaml b/plots/scatter-regression-polynomial/specification.yaml
index 6c97dd97a1..94d56ead4b 100644
--- a/plots/scatter-regression-polynomial/specification.yaml
+++ b/plots/scatter-regression-polynomial/specification.yaml
@@ -11,7 +11,7 @@ issue: 2028
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- scatter
diff --git a/plots/shap-summary/specification.yaml b/plots/shap-summary/specification.yaml
index d889f9c781..8bb7ecea12 100644
--- a/plots/shap-summary/specification.yaml
+++ b/plots/shap-summary/specification.yaml
@@ -11,7 +11,7 @@ issue: 2923
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- shap
diff --git a/plots/silhouette-basic/specification.yaml b/plots/silhouette-basic/specification.yaml
index 24095118a7..575e64cd00 100644
--- a/plots/silhouette-basic/specification.yaml
+++ b/plots/silhouette-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2334
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- silhouette
diff --git a/plots/slider-control-basic/specification.yaml b/plots/slider-control-basic/specification.yaml
index 1e3fbbd94b..b93f749036 100644
--- a/plots/slider-control-basic/specification.yaml
+++ b/plots/slider-control-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 3071
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- scatter
diff --git a/plots/spectrogram-basic/specification.yaml b/plots/spectrogram-basic/specification.yaml
index b898b1fee9..8a95c38ff5 100644
--- a/plots/spectrogram-basic/specification.yaml
+++ b/plots/spectrogram-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2927
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- spectrogram
diff --git a/plots/spectrum-basic/specification.yaml b/plots/spectrum-basic/specification.yaml
index ec2110beb1..0d6a1a16ac 100644
--- a/plots/spectrum-basic/specification.yaml
+++ b/plots/spectrum-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2926
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- spectrum
diff --git a/plots/streamline-basic/specification.yaml b/plots/streamline-basic/specification.yaml
index 2bbef536c4..6ea31ad617 100644
--- a/plots/streamline-basic/specification.yaml
+++ b/plots/streamline-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2861
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- streamline
diff --git a/plots/subplot-grid-custom/specification.yaml b/plots/subplot-grid-custom/specification.yaml
index 19a6c52952..97722a8ddc 100644
--- a/plots/subplot-grid-custom/specification.yaml
+++ b/plots/subplot-grid-custom/specification.yaml
@@ -11,7 +11,7 @@ issue: 2856
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- subplot
diff --git a/plots/subplot-grid/specification.yaml b/plots/subplot-grid/specification.yaml
index 6ce44fcc0a..d69678db56 100644
--- a/plots/subplot-grid/specification.yaml
+++ b/plots/subplot-grid/specification.yaml
@@ -11,7 +11,7 @@ issue: 2782
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- subplot
diff --git a/plots/subplot-mosaic/specification.yaml b/plots/subplot-mosaic/specification.yaml
index 9fa8523cb3..34d1965705 100644
--- a/plots/subplot-mosaic/specification.yaml
+++ b/plots/subplot-mosaic/specification.yaml
@@ -11,7 +11,7 @@ issue: 3002
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- subplot
diff --git a/plots/sudoku-basic/specification.yaml b/plots/sudoku-basic/specification.yaml
index 88260f4eb4..4f86d1463e 100644
--- a/plots/sudoku-basic/specification.yaml
+++ b/plots/sudoku-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 1311
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- grid
diff --git a/plots/survival-kaplan-meier/specification.yaml b/plots/survival-kaplan-meier/specification.yaml
index 9a197e02dc..3e0793f248 100644
--- a/plots/survival-kaplan-meier/specification.yaml
+++ b/plots/survival-kaplan-meier/specification.yaml
@@ -11,7 +11,7 @@ issue: 2441
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- survival
diff --git a/plots/timeline-basic/specification.yaml b/plots/timeline-basic/specification.yaml
index eed3c6f92b..b76b7f0db3 100644
--- a/plots/timeline-basic/specification.yaml
+++ b/plots/timeline-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2443
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- timeline
diff --git a/plots/timeseries-decomposition/specification.yaml b/plots/timeseries-decomposition/specification.yaml
index a79b51d176..5dd40c9b02 100644
--- a/plots/timeseries-decomposition/specification.yaml
+++ b/plots/timeseries-decomposition/specification.yaml
@@ -11,7 +11,7 @@ issue: 2992
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- line
diff --git a/plots/tree-phylogenetic/specification.yaml b/plots/tree-phylogenetic/specification.yaml
index 709fa2fa74..3b30afdcae 100644
--- a/plots/tree-phylogenetic/specification.yaml
+++ b/plots/tree-phylogenetic/specification.yaml
@@ -11,7 +11,7 @@ issue: 3070
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- tree
diff --git a/plots/venn-basic/specification.yaml b/plots/venn-basic/specification.yaml
index 52393d51dd..b45f0b3267 100644
--- a/plots/venn-basic/specification.yaml
+++ b/plots/venn-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2444
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- venn
diff --git a/plots/violin-split/specification.yaml b/plots/violin-split/specification.yaml
index b45810b60d..21a4e34a58 100644
--- a/plots/violin-split/specification.yaml
+++ b/plots/violin-split/specification.yaml
@@ -11,7 +11,7 @@ issue: 1949
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- violin
diff --git a/plots/volcano-basic/specification.yaml b/plots/volcano-basic/specification.yaml
index fc14b6ea48..133baa5b7a 100644
--- a/plots/volcano-basic/specification.yaml
+++ b/plots/volcano-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 2924
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- volcano
diff --git a/plots/windrose-basic/specification.yaml b/plots/windrose-basic/specification.yaml
index ccc27f94e2..c2e743710f 100644
--- a/plots/windrose-basic/specification.yaml
+++ b/plots/windrose-basic/specification.yaml
@@ -11,7 +11,7 @@ issue: 1880
suggested: MarkusNeusinger
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- windrose
diff --git a/prompts/README.md b/prompts/README.md
index f85be07129..6d27cc3941 100644
--- a/prompts/README.md
+++ b/prompts/README.md
@@ -12,10 +12,9 @@ Git history shows all changes (`git log -p prompts/plot-generator.md`).
| File | Agent | Task |
|------|-------|------|
| `plot-generator.md` | Plot Generator | Base rules for all plot implementations |
-| `library/*.md` | Plot Generator | Library-specific rules (8 files) |
+| `library/*.md` | Plot Generator | Library-specific rules (9 files) |
| `quality-criteria.md` | All | Definition of what "good code" means |
-| `quality-evaluator.md` | Quality Checker | Multi-LLM evaluation (Claude/Gemini/GPT) |
-| `auto-tagger.md` | Auto-Tagger | Automatic tagging across 5 dimensions |
+| `quality-evaluator.md` | Quality Checker | AI quality evaluation |
| `spec-validator.md` | Spec Validator | Validates plot request issues |
| `spec-id-generator.md` | Spec ID Generator | Assigns unique spec IDs |
| `workflow-prompts/*.md` | GitHub Actions | Workflow-specific prompts (see below) |
@@ -26,9 +25,9 @@ Located in `workflow-prompts/` - templates for GitHub Actions workflows:
| File | Workflow | Purpose |
|------|----------|---------|
-| `generate-implementation.md` | gen-library-impl.yml | Initial code generation |
-| `improve-from-feedback.md` | gen-update-plot.yml | Code improvement after rejection |
-| `ai-quality-review.md` | bot-ai-review.yml | Quality evaluation |
+| `generate-implementation.md` | impl-generate.yml | Initial code generation |
+| `improve-from-feedback.md` | impl-repair.yml | Code improvement after rejection |
+| `ai-quality-review.md` | impl-review.yml | Quality evaluation |
See `workflow-prompts/README.md` for variable reference and usage.
@@ -46,7 +45,7 @@ See `workflow-prompts/README.md` for variable reference and usage.
${PROMPT_LIB}
## Spec
- $(cat plots/${{ inputs.spec_id }}/spec.md)"
+ $(cat plots/${{ inputs.spec_id }}/specification.md)"
```
## Prompt Structure
diff --git a/prompts/templates/specification.yaml b/prompts/templates/specification.yaml
index 63213962e3..a08c0008c0 100644
--- a/prompts/templates/specification.yaml
+++ b/prompts/templates/specification.yaml
@@ -11,7 +11,7 @@ issue: null # GitHub issue number
suggested: null # GitHub username or 'pyplots' for seed plots
# Classification tags (applies to all library implementations)
-# See docs/concepts/tagging-system.md for detailed guidelines
+# See docs/reference/tagging-system.md for detailed guidelines
tags:
plot_type:
- {type} # Primary plot type (scatter, bar, line, heatmap, etc.)