CrawlScope

See it. Fix it. Ship it.

Free, instant website auditing — SEO · Performance · Accessibility · Technical Health · Core Web Vitals

What is CrawlScope?

CrawlScope is a lightweight, opinionated website auditing tool. Enter a URL — get a prioritised, actionable audit report in under 30 seconds. No accounts, no database, no subscriptions, no saved history. Just scan and fix.

It runs a real Lighthouse audit inside a headless Chromium browser via Puppeteer, combines it with deep HTML analysis via Cheerio, and surfaces findings across 5 categories with severity-ranked fixes.

Features

50+ checks across SEO, Performance, Accessibility, Technical Health, and Content Structure
Real Lighthouse scores — Performance, SEO, Accessibility, Best Practices
Core Web Vitals — LCP, CLS, INP, TTFB, FCP with thresholds and recommendations
Priority Fixes — every issue includes why it matters, how to fix it, and expected impact
Desktop & mobile screenshots captured during the live scan
Page metadata — title, H1, description, word count, technologies detected
Export to Markdown, HTML, PDF (print), and JSON
Zero auth · Zero database · Zero tracking

Tech Stack

Frontend

Technology	Version	Purpose
Next.js	`^15.0.0`	App Router, SSR, API Routes
React	`^18.3.1`	UI framework
TypeScript	`^5.5.4`	Type safety throughout
Tailwind CSS	`^3.4.10`	Utility-first styling
Framer Motion	`^11.3.0`	Animations & transitions
Lucide React	`^0.460.0`	Icon library

Backend / Scanning

Technology	Version	Purpose
Lighthouse	`^12.2.0`	Performance, SEO, A11y, Best Practices scoring
Puppeteer	`^25.1.0`	Headless Chromium — screenshots + page load
Cheerio	`^1.0.0`	Fast HTML parsing for SEO & content checks

UI Primitives

Technology	Purpose
@radix-ui/react-tabs	Accessible tab components
@radix-ui/react-progress	Progress indicators
@radix-ui/react-tooltip	Tooltip primitives
clsx + tailwind-merge	Conditional class merging

Project Structure

crawlscope/
├── src/
│   ├── app/
│   │   ├── api/
│   │   │   └── scan/
│   │   │       └── route.ts          # POST /api/scan — runs Lighthouse + Puppeteer + Cheerio
│   │   ├── features/
│   │   │   └── page.tsx              # /features page
│   │   ├── docs/
│   │   │   └── page.tsx              # /docs page
│   │   ├── api-page/
│   │   │   └── page.tsx              # /api-page (API reference)
│   │   ├── privacy/
│   │   │   └── page.tsx              # /privacy page
│   │   ├── globals.css
│   │   ├── layout.tsx                # Root layout with Geist font
│   │   └── page.tsx                  # Landing page + scan orchestration
│   ├── components/
│   │   ├── shared/
│   │   │   ├── Nav.tsx               # Sticky nav with active link state
│   │   │   └── Footer.tsx            # Shared footer
│   │   ├── LandingPage.tsx           # Hero, URL input, feature cards, mock dashboard
│   │   ├── ScanLoader.tsx            # 12-step animated progress screen
│   │   ├── ResultsDashboard.tsx      # Full audit report with exports
│   │   ├── ScoreRing.tsx             # Animated SVG score ring
│   │   ├── VitalsSection.tsx         # Core Web Vitals cards
│   │   ├── AuditTabs.tsx             # SEO / Perf / A11y / Technical / Content tabs
│   │   ├── PriorityFixCard.tsx       # Expandable fix card with why/fix/impact
│   │   ├── CheckRow.tsx              # Pass / warn / fail check row
│   │   └── ScreenshotsSection.tsx    # Desktop / mobile screenshot toggle
│   ├── lib/
│   │   ├── scanner.ts                # Core scan engine (Puppeteer + Lighthouse + Cheerio)
│   │   └── utils.ts                  # cn(), scoreColor(), vitalStatusClasses(), etc.
│   └── types/
│       └── audit.ts                  # All TypeScript interfaces for AuditReport
├── public/
├── next.config.mjs
├── tailwind.config.ts
├── tsconfig.json
└── package.json

Getting Started

Prerequisites

Node.js 18.x or higher
npm / yarn / pnpm

Installation

# Clone the repo
git clone https://github.com/code2ahm/crawlscope.git
cd crawlscope

# Install dependencies
npm install

# Puppeteer downloads Chromium automatically on install (~170 MB)
# If it fails, run: npx puppeteer browsers install chrome

Development

npm run dev

Open http://localhost:3000.

Production Build

npm run build
npm run start

How a Scan Works

User submits URL
      │
      ▼
POST /api/scan
      │
      ├── 1. Puppeteer launches headless Chromium
      │         • Navigates to URL (networkidle2)
      │         • Captures desktop screenshot (1280px)
      │         • Captures mobile screenshot (390px)
      │         • Measures load time + page size
      │
      ├── 2. Lighthouse runs on the open Chrome port
      │         • Performance score
      │         • SEO score
      │         • Accessibility score
      │         • Best Practices score
      │         • Core Web Vitals (LCP, CLS, TBT→INP, TTFB, FCP)
      │         • 20+ individual audits (render-blocking, WebP, etc.)
      │
      ├── 3. Cheerio parses the raw HTML
      │         • Title, meta description, H1–H6 structure
      │         • Canonical, OG tags, JSON-LD, lang attribute
      │         • Image alt text coverage
      │         • Internal link count
      │         • Form label coverage
      │         • Technology fingerprinting
      │
      └── 4. Results assembled into AuditReport
                • Overall score (weighted average)
                • Priority fixes ranked by severity
                • 50+ pass/warn/fail checks across 5 categories
                • Returned as JSON to the client

API Reference

The scan endpoint is a standard Next.js Route Handler.

`POST /api/scan`

Request body

{
  "url": "https://example.com"
}

Response — success

{
  "success": true,
  "report": {
    "url": "https://example.com",
    "domain": "example.com",
    "scannedAt": "2025-06-07T10:30:00.000Z",
    "scanDuration": 18420,
    "overall": 84,
    "lighthouse": {
      "performance": 71,
      "seo": 92,
      "accessibility": 68,
      "bestPractices": 87
    },
    "vitals": { "lcp": {...}, "cls": {...}, "inp": {...}, "ttfb": {...}, "fcp": {...} },
    "stats": { "critical": 2, "warnings": 4, "passed": 31, "total": 37 },
    "priorityFixes": [...],
    "seoChecks": [...],
    "performanceChecks": [...],
    "accessibilityChecks": [...],
    "technicalChecks": [...],
    "contentChecks": [...],
    "screenshots": { "desktop": "data:image/jpeg;base64,...", "mobile": "data:image/jpeg;base64,..." },
    "meta": {
      "title": "Example Domain",
      "description": "...",
      "h1": "Example Domain",
      "wordCount": 312,
      "loadTime": 1840,
      "pageSize": 48,
      "statusCode": 200,
      "technologies": ["Next.js", "Google Analytics"]
    }
  }
}

Response — error

{
  "success": false,
  "error": "Invalid URL format"
}

Status codes

Code	Meaning
`200`	Scan completed successfully
`400`	Missing or invalid URL
`500`	Scan failed (timeout, unreachable host, etc.)

Note: Scans typically take 15–45 seconds depending on page complexity and server speed. The route handler timeout is set to 120 seconds (maxDuration = 120).

Export Formats

From the results dashboard, every report can be exported as:

Format	Contents
Markdown	Full report — scores table, all checks by category, priority fixes, vitals
HTML	Standalone styled page, no dependencies, renders in any browser
PDF	Opens print dialog in a new tab, formatted for A4 via `@page` CSS
JSON	Raw `AuditReport` object, pretty-printed — useful for integrations

Deployment

Vercel (recommended)

npm i -g vercel
vercel

Set the function timeout in vercel.json for the scan route:

{
  "functions": {
    "src/app/api/scan/route.ts": {
      "maxDuration": 120
    }
  }
}

Important: Puppeteer requires a Chromium binary. On Vercel, use @sparticuz/chromium with puppeteer-core instead of the full puppeteer package to stay within the serverless function size limit.

Self-hosted / VPS

Works out of the box with standard puppeteer as long as the server has:

Node.js 18+
Chrome dependencies: chromium-browser or equivalent system packages

# Ubuntu / Debian — install Chrome dependencies
sudo apt-get install -y \
  libgbm-dev libnss3 libatk-bridge2.0-0 libdrm2 \
  libxcomposite1 libxdamage1 libxfixes3 libxrandr2 \
  libgbm1 libasound2

npm run build && npm start

Docker

FROM node:18-slim

# Install Chromium dependencies
RUN apt-get update && apt-get install -y \
  chromium libgbm-dev libnss3 libatk-bridge2.0-0 \
  libdrm2 libxcomposite1 libxdamage1 libxfixes3 \
  libxrandr2 libasound2 --no-install-recommends \
  && rm -rf /var/lib/apt/lists/*

ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium

WORKDIR /app
COPY package*.json ./
RUN npm ci --production
COPY . .
RUN npm run build

EXPOSE 3000
CMD ["npm", "start"]

Environment Variables

No environment variables are required to run CrawlScope. All scanning happens server-side at request time.

Optional variables for future extension:

# .env.local (optional)
NEXT_PUBLIC_SITE_URL=https://crawlscope.vercel.app

Audit Categories

Category	Checks	Scoring weight
SEO	12 checks — title, meta desc, H1, canonical, JSON-LD, OG tags, lang, robots, internal links, HTTPS	30%
Performance	11 checks — render-blocking, WebP, compression, caching, minification, unused JS/CSS	30%
Accessibility	12 checks — alt text, contrast, ARIA, button names, form labels, skip nav	20%
Best Practices	Lighthouse native best-practices score	20%
Technical	9 checks — HTTPS, status code, viewport, deprecated HTML, vulnerable libraries	Info
Content	6 checks — title/desc length, heading order, word count, internal linking	Info

Overall score = Performance × 0.3 + SEO × 0.3 + Accessibility × 0.2 + BestPractices × 0.2

Contributing

Pull requests are welcome. For major changes, open an issue first.

# Fork and clone
git clone https://github.com/your-username/crawlscope.git
cd crawlscope

# Create a feature branch
git checkout -b feat/your-feature

# Make changes, then
npm run lint
npm run build

# Push and open a PR
git push origin feat/your-feature

Roadmap

Vercel-compatible deployment with @sparticuz/chromium
Batch URL scanning
Shareable report URLs (short-lived, no database)
CLI mode (npx crawlscope https://example.com)
GitHub Action for CI auditing
Score trend comparison (re-scan diff)

Built with Next.js · Lighthouse · Puppeteer · Tailwind CSS

crawlscope.vercel.app

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
public		public
src		src
.gitignore		.gitignore
next.config.mjs		next.config.mjs
package-lock.json		package-lock.json
package.json		package.json
postcss.config.mjs		postcss.config.mjs
project-file-tree.json		project-file-tree.json
readme.md		readme.md
tailwind.config.ts		tailwind.config.ts
tsconfig.json		tsconfig.json
vercel.json		vercel.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CrawlScope

What is CrawlScope?

Features

Tech Stack

Frontend

Backend / Scanning

UI Primitives

Project Structure

Getting Started

Prerequisites

Installation

Development

Production Build

How a Scan Works

API Reference

`POST /api/scan`

Export Formats

Deployment

Vercel (recommended)

Self-hosted / VPS

Docker

Environment Variables

Audit Categories

Contributing

Roadmap

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

CrawlScope

What is CrawlScope?

Features

Tech Stack

Frontend

Backend / Scanning

UI Primitives

Project Structure

Getting Started

Prerequisites

Installation

Development

Production Build

How a Scan Works

API Reference

POST /api/scan

Export Formats

Deployment

Vercel (recommended)

Self-hosted / VPS

Docker

Environment Variables

Audit Categories

Contributing

Roadmap

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`POST /api/scan`

Packages