@posty5/screenshoter

Crawl any website, discover all its pages, and take full-page screenshots, saved in a folder structure that mirrors the URL path. Built with Crawlee and Playwright.



🌟 What is @posty5/screenshoter?

@posty5/screenshoter is a TypeScript package that automates the process of capturing screenshots for every page on a website. It covers four tasks:

  1. Crawl – Discover all URLs on a website using Crawlee's PlaywrightCrawler or sitemap parsing
  2. Capture – Take a screenshot of each URL using Playwright and save it to disk
  3. Single Page – Capture a screenshot of a single URL without crawling
  4. Batch Capture – Capture screenshots of a list of URLs you provide

Screenshots are saved in a folder structure that mirrors the URL path:

https://posty5.com/en/social-media-publisher
  → screenshots/posty5.com/en/social-media-publisher/capture.webp

https://posty5.com/en/qr-code-generator
  → screenshots/posty5.com/en/qr-code-generator/capture.webp

https://posty5.com/
  → screenshots/posty5.com/capture.webp

Use Cases:

  • 📸 Visual regression testing
  • 🗂️ Website archiving and documentation
  • 🖼️ Generating preview images for SEO / social sharing
  • 📊 Auditing website pages at scale
  • 🔍 Collecting URLs from a website for analysis

📦 Installation

npm install @posty5/screenshoter

After installing, make sure Playwright's Chromium browser is available:

npx playwright install chromium

🚀 Quick Start

Programmatic API

import { captureWebsite } from "@posty5/screenshoter";

const result = await captureWebsite({
  url: "https://posty5.com",
  outputDir: "./screenshots",
  format: "webp",
  excludePatterns: ["/api/**", "/user/*"],
});

console.log(`Captured ${result.captured} of ${result.totalUrls} pages`);

CLI

# Take screenshots of all pages
npx screenshoter capture https://posty5.com -o ./screenshots

# Collect URLs only (save to file)
npx screenshoter collect-urls https://posty5.com --file urls.txt

📚 API Reference

captureWebsite(config)

Crawl a website, discover all pages, and take a screenshot of each one.

Returns: Promise<CaptureResult>

import { captureWebsite } from "@posty5/screenshoter";

const result = await captureWebsite({
  url: "https://posty5.com",
  outputDir: "./screenshots",
  format: "webp",
  viewport: { width: 1440, height: 900 },
  maxPages: 50,
  maxDepth: 3,
  excludePatterns: ["/api/**", "/user/*", "*.pdf"],
  concurrency: 3,
  waitAfterLoad: 1000,
  headless: true,
});

console.log(result);
// {
//   totalUrls: 42,
//   captured: 40,
//   failed: 2,
//   skipped: 0,
//   pages: [
//     { url: 'https://posty5.com/en', filePath: './screenshots/posty5.com/en/capture.webp', status: 'ok' },
//     { url: 'https://posty5.com/en/about', filePath: './screenshots/posty5.com/en/about/capture.webp', status: 'ok' },
//     ...
//   ]
// }
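The per-page entries make the result easy to post-process, for example to collect URLs that need a retry. A sketch assuming the shapes shown above (the "failed" and "skipped" status strings are inferred from the counters, not confirmed by the package):

```typescript
// Result shapes inferred from the example output above.
interface PageResult {
  url: string;
  filePath?: string;
  status: "ok" | "failed" | "skipped"; // non-"ok" values are assumptions
}

interface CaptureResult {
  totalUrls: number;
  captured: number;
  failed: number;
  skipped: number;
  pages: PageResult[];
}

// URLs that did not produce a screenshot, e.g. to feed back into capturePages().
function failedUrls(result: CaptureResult): string[] {
  return result.pages.filter((p) => p.status !== "ok").map((p) => p.url);
}
```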

collectUrls(config)

Crawl a website and save all discovered URLs to a text file, one URL per line.

Config type: CollectUrlsConfig – only URL collection options (no screenshot settings).

Returns: Promise<string[]>

import { collectUrls } from "@posty5/screenshoter";

const urls = await collectUrls({
  url: "https://posty5.com",
  outputFile: "./urls.txt",
  maxPages: 100,
  excludePatterns: ["/api/**"],
});

console.log(`Found ${urls.length} URLs`);
// urls.txt:
// https://posty5.com
// https://posty5.com/en
// https://posty5.com/en/social-media-publisher
// https://posty5.com/en/qr-code-generator
// ...

capturePage(config)

Capture a screenshot of a single page – no crawling involved.

Config type: CapturePageConfig – only screenshot options + url.

Returns: Promise<PageResult>

import { capturePage } from "@posty5/screenshoter";

const result = await capturePage({
  url: "https://posty5.com/en/about",
  outputDir: "./screenshots",
  format: "png",
  viewport: { width: 1440, height: 710 },
  waitAfterLoad: 1500,
});

console.log(result);
// {
//   url: 'https://posty5.com/en/about',
//   filePath: './screenshots/posty5.com/en/about/capture.png',
//   status: 'ok'
// }

capturePages(urls, config?)

Capture screenshots for a list of URLs. Reuses a single browser instance and supports concurrency.

Config type: CaptureConfig – only screenshot options (no url or crawl settings).

Returns: Promise<CaptureResult>

import { capturePages } from "@posty5/screenshoter";

const result = await capturePages(
  [
    "https://posty5.com",
    "https://posty5.com/en/about",
    "https://posty5.com/en/social-media-publisher",
  ],
  {
    outputDir: "./screenshots",
    format: "png",
    viewport: { width: 1440, height: 710 },
    concurrency: 2,
    waitAfterLoad: 1500,
  }
);

console.log(`Captured ${result.captured} of ${result.totalUrls} pages`);

scrollToEnd(page)

A beforeScreenshot utility that smoothly scrolls to the bottom of the page and waits for scroll-triggered animations to settle before the screenshot is taken. Useful for pages with fade-in or reveal animations driven by IntersectionObserver.

Returns: Promise<void>

import { capturePages, scrollToEnd } from "@posty5/screenshoter";

const result = await capturePages(urls, {
  outputDir: "./screenshots",
  format: "png",
  fullPage: true,
  beforeScreenshot: scrollToEnd,
});

βš™οΈ Configuration

The configuration is split into two concerns:

URL Collection Options (CollectUrlsConfig)

Used by collectUrls() and captureWebsite().

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `url` | `string` | (required) | Root URL to start crawling from |
| `strategy` | `'crawl' \| 'sitemap'` | `'crawl'` | URL discovery strategy |
| `sitemapUrl` | `string` | `<url>/sitemap.xml` | Custom sitemap URL (sitemap strategy only) |
| `maxPages` | `number` | `100` | Maximum number of pages to crawl |
| `maxDepth` | `number` | `5` | Maximum crawl depth from the root URL |
| `excludePatterns` | `string[]` | `[]` | Glob patterns for URL paths to skip |
| `includePatterns` | `string[]` | `[]` | Glob patterns; only matching URLs are captured |
| `sameDomainOnly` | `boolean` | `true` | Only capture URLs on the same domain |
| `headless` | `boolean` | `true` | Run browser in headless mode (crawl strategy) |
| `shouldCapture` | `(url: string) => boolean` | – | Programmatic filter; return `false` to skip a URL |

For collectUrls(), an additional option is available:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `outputFile` | `string` | `'urls.txt'` | File path to save the URL list |

Screenshot Options (CaptureConfig)

Used by capturePage(), capturePages(), and captureWebsite().

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `outputDir` | `string` | `'./screenshots'` | Directory to save screenshots |
| `format` | `'webp' \| 'png' \| 'jpeg'` | `'webp'` | Screenshot image format |
| `viewport` | `{ width, height }` | `{ width: 1440, height: 900 }` | Browser viewport size |
| `fullPage` | `boolean` | `false` | Capture the full scrollable page instead of just the viewport |
| `waitAfterLoad` | `number` | `1000` | Milliseconds to wait after page load before the screenshot |
| `concurrency` | `number` | `3` | Number of parallel browser pages |
| `headless` | `boolean` | `true` | Run browser in headless mode |
| `scrollToEnd` | `boolean` | `false` | Smoothly scroll to the bottom before the screenshot (triggers scroll animations); auto-enabled when `fullPage` is `true` |
| `beforeScreenshot` | `(page: Page) => Promise<void>` | – | Callback to run before each screenshot |

captureWebsite() accepts both sets of options combined (CaptureWebsiteConfig).
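As a sketch, a combined config using the sitemap strategy might look like this (option names come from the tables above; the values are illustrative):

```typescript
// Combined CaptureWebsiteConfig sketch: URL-collection options and
// screenshot options in one object. With strategy "sitemap", URLs are
// read from the site's sitemap instead of discovered by crawling links.
const config = {
  // URL collection options (CollectUrlsConfig)
  url: "https://posty5.com",
  strategy: "sitemap" as const,
  maxPages: 200,
  excludePatterns: ["/api/**"],
  // Screenshot options (CaptureConfig)
  outputDir: "./screenshots",
  format: "webp" as const,
  viewport: { width: 1440, height: 900 },
  fullPage: true,
};
```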


🚫 Filtering Dynamic Pages

Three mechanisms to control which pages get captured:

1. Exclude Patterns

Skip URLs matching glob patterns:

await captureWebsite({
  url: "https://example.com",
  excludePatterns: [
    "/api/**", // Skip all API routes
    "/user/*", // Skip user profile pages
    "/*/edit", // Skip edit pages
    "*.pdf", // Skip PDF links
    "/auth/**", // Skip auth pages
  ],
});
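As a rough illustration of the pattern semantics (`*` matches within a single path segment, while `**` matches across segments), here is a minimal glob-to-RegExp sketch. The package likely delegates to a full glob library, so edge cases may differ:

```typescript
// Minimal glob-to-RegExp sketch, for illustration only:
// "*" matches within one path segment, "**" matches any depth.
function globToRegExp(pattern: string): RegExp {
  const source = pattern
    .replace(/[.+^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
    .replace(/\*\*/g, "\u0000")           // protect "**" from the next step
    .replace(/\*/g, "[^/]*")              // "*"  -> within one segment
    .replace(/\u0000/g, ".*");            // "**" -> any depth
  return new RegExp(`^${source}$`);
}

globToRegExp("/api/**").test("/api/v1/users");    // true: "**" spans segments
globToRegExp("/user/*").test("/user/alice/edit"); // false: "*" stops at "/"
```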

2. Include Patterns (Whitelist)

When set, only matching URLs are captured:

await captureWebsite({
  url: "https://example.com",
  includePatterns: [
    "/en/**", // Only capture English pages
    "/products/*", // Only capture product pages
  ],
});

3. Custom Filter Callback

For complex logic:

await captureWebsite({
  url: "https://example.com",
  shouldCapture: (url) => {
    const parsed = new URL(url);
    // Skip URLs with query parameters
    if (parsed.search) return false;
    // Skip paths with more than 4 segments
    if (parsed.pathname.split("/").filter(Boolean).length > 4) return false;
    return true;
  },
});

🖥️ CLI Reference

capture – Take Screenshots

npx screenshoter capture <url> [options]
| Option | Description | Default |
| --- | --- | --- |
| `-o, --output <dir>` | Output directory | `./screenshots` |
| `-f, --format <format>` | Image format: `webp`, `png`, `jpeg` | `webp` |
| `--width <number>` | Viewport width | `1440` |
| `--height <number>` | Viewport height | `900` |
| `--max-pages <number>` | Maximum pages to crawl | `100` |
| `--max-depth <number>` | Maximum crawl depth | `5` |
| `--exclude <patterns...>` | Glob patterns to exclude | – |
| `--include <patterns...>` | Glob patterns to include | – |
| `--concurrency <number>` | Parallel browser pages | `3` |
| `--wait <number>` | Wait ms after page load | `1000` |
| `--no-headless` | Show browser window | – |
| `--urls-file <path>` | Use URLs from a file instead of crawling | – |

Examples:

# Basic usage
npx screenshoter capture https://posty5.com

# Custom output and format
npx screenshoter capture https://posty5.com -o ./my-screenshots -f png

# Exclude dynamic pages
npx screenshoter capture https://posty5.com --exclude "/api/**" "/user/*" "/auth/**"

# Only capture specific sections
npx screenshoter capture https://posty5.com --include "/en/**"

# Limit crawl depth and page count
npx screenshoter capture https://posty5.com --max-pages 20 --max-depth 2

# Use a pre-collected URLs file
npx screenshoter capture https://posty5.com --urls-file urls.txt

collect-urls – Collect URLs Only

npx screenshoter collect-urls <url> [options]
| Option | Description | Default |
| --- | --- | --- |
| `--file <path>` | Output file path | `urls.txt` |
| `--max-pages <number>` | Maximum pages to crawl | `100` |
| `--max-depth <number>` | Maximum crawl depth | `5` |
| `--exclude <patterns...>` | Glob patterns to exclude | – |
| `--include <patterns...>` | Glob patterns to include | – |
| `--no-headless` | Show browser window | – |

Examples:

# Collect all URLs
npx screenshoter collect-urls https://posty5.com

# Save to custom file with filters
npx screenshoter collect-urls https://posty5.com --file sitemap.txt --exclude "/api/**"

🔧 Advanced Usage

Scroll Animations (Fade-in on Scroll)

Some pages reveal content as you scroll (using IntersectionObserver or CSS fade-in classes). A viewport-only screenshot would capture those elements as invisible/empty.

Use scrollToEnd: true or the built-in scrollToEnd utility to scroll through the page first:

import { capturePages, scrollToEnd } from "@posty5/screenshoter";

// Option 1 – config flag (auto-enabled when fullPage: true)
const fullPageResult = await capturePages(urls, {
  fullPage: true, // scrollToEnd is automatically true
  waitAfterLoad: 1000,
});

// Option 2 – explicit flag
const scrollResult = await capturePages(urls, {
  scrollToEnd: true,
  waitAfterLoad: 1000,
});

// Option 3 – use the scrollToEnd utility as a beforeScreenshot hook
const hookedResult = await capturePages(urls, {
  fullPage: true,
  beforeScreenshot: scrollToEnd,
});

Before Screenshot Hook

Dismiss cookie banners, close popups, or interact with the page before taking the screenshot:

await captureWebsite({
  url: "https://example.com",
  beforeScreenshot: async (page) => {
    // Dismiss cookie consent
    const cookieBtn = page.locator('button:has-text("Accept")');
    if (await cookieBtn.isVisible()) {
      await cookieBtn.click();
      await page.waitForTimeout(500);
    }
  },
});

Three-Step Workflow: Collect → Edit → Capture

First collect URLs, manually edit the list, then capture only the URLs you want:

# Step 1: Collect all URLs
npx screenshoter collect-urls https://posty5.com --file urls.txt

# Step 2: Edit urls.txt and remove any pages you don't want

# Step 3: Capture only the remaining URLs
npx screenshoter capture https://posty5.com --urls-file urls.txt

Or programmatically with separated configs:

import { collectUrls, capturePages } from "@posty5/screenshoter";

// Step 1: Collect URLs (CollectUrlsConfig only)
const urls = await collectUrls({
  url: "https://posty5.com",
  strategy: "sitemap",
  maxPages: 500,
  excludePatterns: ["/api/**"],
  outputFile: "./urls.txt",
});

// Step 2: Filter programmatically
const staticPages = urls.filter((u) => !u.includes("/trends/"));

// Step 3: Capture with separate config (CaptureConfig only)
const result = await capturePages(staticPages, {
  outputDir: "./screenshots",
  format: "png",
  viewport: { width: 1440, height: 710 },
  concurrency: 3,
});

📂 Output Structure

The output folder mirrors the URL path structure:

screenshots/
└── posty5.com/
    ├── capture.webp                     ← https://posty5.com
    ├── en/
    │   ├── capture.webp                 ← https://posty5.com/en
    │   ├── social-media-publisher/
    │   │   └── capture.webp             ← https://posty5.com/en/social-media-publisher
    │   ├── qr-code-generator/
    │   │   └── capture.webp             ← https://posty5.com/en/qr-code-generator
    │   └── url-shortener/
    │       └── capture.webp             ← https://posty5.com/en/url-shortener
    └── ar/
        ├── capture.webp                 ← https://posty5.com/ar
        └── social-media-publisher/
            └── capture.webp             ← https://posty5.com/ar/social-media-publisher

💻 Requirements

  • Node.js: >= 18.0.0
  • TypeScript: Full type definitions included
  • Browser: Chromium (auto-installed via Playwright)

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes
  4. Commit your changes: git commit -m 'Add amazing feature'
  5. Push to the branch: git push origin feature/amazing-feature
  6. Submit a pull request

📄 License

MIT License - see LICENSE file for details.


Made with ❤️ by the Posty5 team
