jvgomg/semantic-kit

semantic-kit

Developer toolkit for understanding how websites are interpreted by search engines, AI crawlers, screen readers, and content extractors.

Packages

Package         Description
@webspecs/cli   CLI tool — Node.js ≥ 18, works with npm and npx
@webspecs/core  Core library for programmatic use
@webspecs/tui   Interactive terminal UI — requires Bun

Installation

# CLI (Node.js ≥ 18)
npm install -g @webspecs/cli

# Or use without installing
npx @webspecs/cli <command> [url]

# TUI (requires Bun)
bunx @webspecs/tui [url]

Usage

webspecs <command> [url] [options]

Commands

Commands are organized into two groups:

  • Lenses — Show how a specific consumer "sees" your page
  • Utilities — Task-oriented tools for analysis and validation

Lenses

Command              Description
ai <url|file>        Show how AI crawlers see your page
reader <url|file>    Show how browser reader modes see your page
google <url|file>    Show how Googlebot sees your page
social <url|file>    Show how social platforms see your page (link preview)
screen-reader <url>  Show how screen readers interpret your page

Analysis Utilities

Command                    Description
readability <url|file>     Raw Readability extraction with full metrics
readability:js <url>       Readability extraction after JavaScript rendering
readability:compare <url>  Compare static vs JS-rendered content extraction
schema <url|file>          View structured data (JSON-LD, OG, Twitter Cards)
schema:js <url>            View structured data after JavaScript rendering
schema:compare <url>       Compare static vs JS-rendered structured data
structure <url|file>       Show page structure (landmarks, headings, links)
structure:js <url>         Show structure after JavaScript rendering
structure:compare <url>    Compare static vs JS-rendered structure
a11y-tree <url>            Show accessibility tree (static HTML)
a11y-tree:js <url>         Show accessibility tree (rendered DOM)
a11y-tree:compare <url>    Compare static vs rendered accessibility tree

Validation Utilities

Command                     Description
validate:html <url|file>    Validate HTML markup against W3C standards
validate:schema <url|file>  Validate structured data against platform requirements
validate:a11y <url>         Validate accessibility against WCAG guidelines

Other Utilities

Command      Description
fetch <url>  Fetch and prettify HTML
tui          Launch interactive terminal UI

Documentation

Lenses

  • AI Crawlers — How AI tools see your content
  • Reader Mode — How browser reader modes see your page
  • Google — How Googlebot sees your page
  • Social — How social platforms see your page for link previews
  • Screen Reader — How screen readers interpret your page

Utilities

Validation

Other

Philosophy

This toolkit prioritizes:

  1. Observability over enforcement — Insight first, pass/fail second
  2. Documentation over implementation — Curating existing tools, not reinventing
  3. Breadth before depth — Coverage across perspectives before deep dives

See the full roadmap for details.

Lenses vs Utilities

The toolkit organizes commands into two conceptual groups:

Lenses

Lenses answer "How does X see my page?" for specific consumers:

Lens           Consumer                            What it shows
ai             ChatGPT, Claude, Perplexity         Markdown content via Readability
reader         Safari Reader, Pocket               Reader mode extraction
google         Googlebot                           Metadata, schema, structure
social         WhatsApp, Slack, Twitter, iMessage  Open Graph + Twitter Card previews
screen-reader  VoiceOver, NVDA, JAWS               Accessibility tree (JS-rendered)

Lenses are opinionated: each decides internally whether to use JavaScript rendering based on what the real consumer does.

Utilities

Utilities are task-oriented tools with explicit control over rendering mode:

Pattern             Static       JS-rendered     Comparison
Content extraction  readability  readability:js  readability:compare
Structured data     schema       schema:js       schema:compare
Page structure      structure    structure:js    structure:compare
Accessibility tree  a11y-tree    a11y-tree:js    a11y-tree:compare

  • Static (no :js): Works on URLs and local files. Fast.
  • JS-rendered (:js): Requires Playwright. URL only. Shows hydrated DOM.
  • Compare (:compare): Shows differences between static and rendered.
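
The naming pattern above is regular enough to capture in a few lines. The following is a minimal sketch, not part of the toolkit: `utilityCommand`, `Base`, `Mode`, and `isUrl` are hypothetical names introduced here for illustration. It derives the command name from a base utility and a rendering mode, and rejects local files where the command tables list only `<url>` (all `:js` and `:compare` variants, plus `a11y-tree`).

```typescript
// Hypothetical helper for illustration; not exported by any @webspecs package.
type Base = 'readability' | 'schema' | 'structure' | 'a11y-tree'
type Mode = 'static' | 'js' | 'compare'

// Treat anything starting with http:// or https:// as a URL.
function isUrl(target: string): boolean {
  return /^https?:\/\//i.test(target)
}

// Returns [commandName, target], e.g. ['structure:js', 'https://example.com'].
function utilityCommand(base: Base, mode: Mode, target: string): string[] {
  // Per the command tables, only static readability, schema, and structure
  // accept <url|file>; every other combination is listed as <url> only.
  const urlOnly = mode !== 'static' || base === 'a11y-tree'
  if (urlOnly && !isUrl(target)) {
    throw new Error(`${base} in ${mode} mode needs a URL, got "${target}"`)
  }
  const name = mode === 'static' ? base : `${base}:${mode}`
  return [name, target]
}
```

The returned array can be passed as arguments to the `webspecs` binary, for example `utilityCommand('schema', 'compare', 'https://example.com')` yields the arguments for `webspecs schema:compare https://example.com`.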

Programmatic API

All command result types are exported for programmatic usage:

import type {
  // A11y-tree commands
  A11yResult,
  A11yCompareResult,
  // Structure commands
  StructureResult,
  StructureJsResult,
  StructureCompareResult,
  // Readability commands
  ReadabilityCompareResult,
  AiResult,
  // Schema commands
  SchemaResult,
  // Validation commands
  ValidateHtmlResult,
  ValidateSchemaResult,
  ValidateA11yResult,
  // Lib types
  StructureAnalysis,
  StructureComparison,
  AriaNode,
  SnapshotDiff,
} from '@webspecs/core'

These types define the exact structure of --format json output for each command, enabling type-safe consumption of results in scripts and tools.
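
As a rough sketch of that workflow, a script could run the CLI with `--format json` and parse the result. The code below assumes the `webspecs` binary is on `PATH` and that the JSON document is written to stdout. `jsonArgs` and `runJson` are hypothetical names introduced here, not part of `@webspecs/core`. The type parameter is asserted rather than validated at runtime.

```typescript
import { execFile } from 'node:child_process'
import { promisify } from 'node:util'

const run = promisify(execFile)

// Builds the argument list for a JSON-formatted run of one command.
function jsonArgs(command: string, target: string): string[] {
  return [command, target, '--format', 'json']
}

// Runs the CLI and parses stdout as JSON.
// Assumption: the CLI prints only the JSON document to stdout.
// T is a compile-time assertion only; nothing checks the shape at runtime.
async function runJson<T>(command: string, target: string): Promise<T> {
  const { stdout } = await run('webspecs', jsonArgs(command, target))
  return JSON.parse(stdout) as T
}
```

With the package installed, a caller could add `import type { StructureResult } from '@webspecs/core'` and write `const result = await runJson<StructureResult>('structure', 'https://example.com')`. Large pages may exceed `execFile`'s default output buffer, in which case a `maxBuffer` option would be needed.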

Development

This project uses a monorepo structure with Bun workspaces and Turborepo for efficient task orchestration.

Setup

# Install dependencies
bun install

Development Workflow

# Run CLI commands during development (no build needed)
bun run dev:cli <command> [options]

# Run CLI with auto-rebuild on file changes
bun run watch:cli

# Run TUI during development
bun run dev:tui

# Run TUI with auto-rebuild on file changes
bun run watch:tui

Building

# Build all packages (with caching and parallel execution)
bun run build

# Type check all packages
bun run typecheck

# Clean build artifacts
bun run clean

Code Quality

# Lint all packages
bun run lint

# Format code with Prettier
bun run pretty

# Check code formatting
bun run pretty:check

Testing

# Run all tests (unit + integration)
bun run test

# Run unit tests only
bun run test:unit

# Run integration tests
bun run test:integration

# Run integration tests with watch mode
bun run test:integration:watch

Test Server

A local test server serves HTML fixtures for testing commands:

# Start the test server (localhost:4000)
bun run test-server

# Test commands against fixtures
bun run dev:cli ai http://localhost:4000/good/semantic-article.html
bun run dev:cli validate:a11y http://localhost:4000/bad/div-soup.html

Fixtures are organized by category (good/, bad/, edge-cases/, responses/) and support configurable response behaviors via .meta.json sidecar files.

See test-server/README.md for full documentation.

Monorepo Structure

The project is organized into packages:

  • @webspecs/core - Core analyzers, extractors, and validators
  • @webspecs/cli - Command-line interface (depends on core)
  • @webspecs/tui - Terminal UI (depends on core + cli)
  • @webspecs/integration-tests - Integration test suite
  • @webspecs/test-server - HTML fixture server
  • @webspecs/test-server-nextjs - Next.js streaming fixture

Turborepo handles task orchestration with intelligent caching and parallel execution.

License

MIT
