Crawl any website, discover all its pages, and take full-page screenshots, saved in a folder structure that mirrors the URL path. Built with Crawlee and Playwright.
@posty5/screenshoter is a TypeScript package that automates capturing screenshots for every page on a website. It covers four workflows:
- Crawl: discover all URLs on a website using Crawlee's PlaywrightCrawler or sitemap parsing
- Capture: take a screenshot of each URL using Playwright and save it to disk
- Single Page: capture a screenshot of a single URL without crawling
- Batch Capture: capture screenshots of a list of URLs you provide
Screenshots are saved in a folder structure that mirrors the URL path:
```
https://posty5.com/en/social-media-publisher
→ screenshots/posty5.com/en/social-media-publisher/capture.webp

https://posty5.com/en/qr-code-generator
→ screenshots/posty5.com/en/qr-code-generator/capture.webp

https://posty5.com/
→ screenshots/posty5.com/capture.webp
```
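The mapping above can be sketched as a small helper. This is an illustration of the naming convention only, not the package's internal code; the function name and the handling of query strings are assumptions:

```typescript
import * as path from "node:path";

// Map a page URL to its screenshot path under outputDir,
// mirroring the host and path segments as folders.
function screenshotPath(pageUrl: string, outputDir: string, format: string): string {
  const { hostname, pathname } = new URL(pageUrl);
  // Drop empty segments so "https://posty5.com/" maps to the root folder.
  const segments = pathname.split("/").filter(Boolean);
  return path.join(outputDir, hostname, ...segments, `capture.${format}`);
}

screenshotPath("https://posty5.com/en/social-media-publisher", "screenshots", "webp");
// → "screenshots/posty5.com/en/social-media-publisher/capture.webp"
```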
Use Cases:
- Visual regression testing
- Website archiving and documentation
- Generating preview images for SEO / social sharing
- Auditing website pages at scale
- Collecting URLs from a website for analysis
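For the visual-regression use case, one naive approach is to hash each screenshot from two capture runs and report pages whose images changed. This is a sketch, not part of the package's API; real pipelines usually use a perceptual diff rather than byte hashes:

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

// Hash one screenshot file so two runs can be compared cheaply.
function fileDigest(filePath: string): string {
  return createHash("sha256").update(readFileSync(filePath)).digest("hex");
}

// Given url -> digest maps from a baseline run and a current run,
// return the URLs whose screenshots differ (or are new).
function changedPages(baseline: Map<string, string>, current: Map<string, string>): string[] {
  return [...current.entries()]
    .filter(([url, digest]) => baseline.get(url) !== digest)
    .map(([url]) => url);
}
```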
```bash
npm install @posty5/screenshoter
```

After installing, make sure Playwright's Chromium browser is available:

```bash
npx playwright install chromium
```

```typescript
import { captureWebsite } from "@posty5/screenshoter";

const result = await captureWebsite({
  url: "https://posty5.com",
  outputDir: "./screenshots",
  format: "webp",
  excludePatterns: ["/api/**", "/user/*"],
});

console.log(`Captured ${result.captured} of ${result.totalUrls} pages`);
```

```bash
# Take screenshots of all pages
npx screenshoter capture https://posty5.com -o ./screenshots

# Collect URLs only (save to file)
npx screenshoter collect-urls https://posty5.com --file urls.txt
```

Crawl a website, discover all pages, and take a screenshot of each one.
Returns: `Promise<CaptureResult>`
```typescript
import { captureWebsite } from "@posty5/screenshoter";

const result = await captureWebsite({
  url: "https://posty5.com",
  outputDir: "./screenshots",
  format: "webp",
  viewport: { width: 1440, height: 900 },
  maxPages: 50,
  maxDepth: 3,
  excludePatterns: ["/api/**", "/user/*", "*.pdf"],
  concurrency: 3,
  waitAfterLoad: 1000,
  headless: true,
});

console.log(result);
// {
//   totalUrls: 42,
//   captured: 40,
//   failed: 2,
//   skipped: 0,
//   pages: [
//     { url: 'https://posty5.com/en', filePath: './screenshots/posty5.com/en/capture.webp', status: 'ok' },
//     { url: 'https://posty5.com/en/about', filePath: './screenshots/posty5.com/en/about/capture.webp', status: 'ok' },
//     ...
//   ]
// }
```

Crawl a website and save all discovered URLs to a text file, one URL per line.
Config type: `CollectUrlsConfig` (URL collection options only, no screenshot settings).
Returns: `Promise<string[]>`
```typescript
import { collectUrls } from "@posty5/screenshoter";

const urls = await collectUrls({
  url: "https://posty5.com",
  outputFile: "./urls.txt",
  maxPages: 100,
  excludePatterns: ["/api/**"],
});

console.log(`Found ${urls.length} URLs`);
// urls.txt:
// https://posty5.com
// https://posty5.com/en
// https://posty5.com/en/social-media-publisher
// https://posty5.com/en/qr-code-generator
// ...
```

Capture a screenshot of a single page, with no crawling involved.
Config type: `CapturePageConfig` (screenshot options plus `url`).
Returns: `Promise<PageResult>`
```typescript
import { capturePage } from "@posty5/screenshoter";

const result = await capturePage({
  url: "https://posty5.com/en/about",
  outputDir: "./screenshots",
  format: "png",
  viewport: { width: 1440, height: 710 },
  waitAfterLoad: 1500,
});

console.log(result);
// {
//   url: 'https://posty5.com/en/about',
//   filePath: './screenshots/posty5.com/en/about/capture.png',
//   status: 'ok'
// }
```

Capture screenshots for a list of URLs. Reuses a single browser instance and supports concurrency.
Config type: `CaptureConfig` (screenshot options only, no `url` or crawl settings).
Returns: `Promise<CaptureResult>`
```typescript
import { capturePages } from "@posty5/screenshoter";

const result = await capturePages(
  ["https://posty5.com", "https://posty5.com/en/about", "https://posty5.com/en/social-media-publisher"],
  {
    outputDir: "./screenshots",
    format: "png",
    viewport: { width: 1440, height: 710 },
    concurrency: 2,
    waitAfterLoad: 1500,
  }
);

console.log(`Captured ${result.captured} of ${result.totalUrls} pages`);
```

A `beforeScreenshot` utility that smoothly scrolls to the bottom of the page and waits for scroll-triggered animations to settle before the screenshot is taken. Useful for pages with fade-in or reveal animations driven by IntersectionObserver.
Returns: `Promise<void>`
```typescript
import { capturePages, scrollToEnd } from "@posty5/screenshoter";

const result = await capturePages(urls, {
  outputDir: "./screenshots",
  format: "png",
  fullPage: true,
  beforeScreenshot: scrollToEnd,
});
```

The configuration is split into two concerns:
Used by `collectUrls()` and `captureWebsite()`.
| Option | Type | Default | Description |
|---|---|---|---|
| `url` | `string` | (required) | Root URL to start crawling from |
| `strategy` | `'crawl' \| 'sitemap'` | `'crawl'` | URL discovery strategy |
| `sitemapUrl` | `string` | `<url>/sitemap.xml` | Custom sitemap URL (sitemap strategy only) |
| `maxPages` | `number` | `100` | Maximum number of pages to crawl |
| `maxDepth` | `number` | `5` | Maximum crawl depth from the root URL |
| `excludePatterns` | `string[]` | `[]` | Glob patterns for URL paths to skip |
| `includePatterns` | `string[]` | `[]` | Glob patterns; only matching URLs are captured |
| `sameDomainOnly` | `boolean` | `true` | Only capture URLs on the same domain |
| `headless` | `boolean` | `true` | Run browser in headless mode (crawl strategy) |
| `shouldCapture` | `(url: string) => boolean` | – | Programmatic filter; return `false` to skip a URL |
For `collectUrls()`, an additional option is available:

| Option | Type | Default | Description |
|---|---|---|---|
| `outputFile` | `string` | `'urls.txt'` | File path to save the URL list |
Used by `capturePage()`, `capturePages()`, and `captureWebsite()`.

| Option | Type | Default | Description |
|---|---|---|---|
| `outputDir` | `string` | `'./screenshots'` | Directory to save screenshots |
| `format` | `'webp' \| 'png' \| 'jpeg'` | `'webp'` | Screenshot image format |
| `viewport` | `{ width, height }` | `{ width: 1440, height: 900 }` | Browser viewport size |
| `fullPage` | `boolean` | `false` | Capture the full scrollable page instead of just the viewport |
| `waitAfterLoad` | `number` | `1000` | Milliseconds to wait after page load before the screenshot |
| `concurrency` | `number` | `3` | Number of parallel browser pages |
| `headless` | `boolean` | `true` | Run browser in headless mode |
| `scrollToEnd` | `boolean` | `false` | Smoothly scroll to the bottom before the screenshot (triggers scroll animations); auto-enabled when `fullPage` is `true` |
| `beforeScreenshot` | `(page: Page) => Promise<void>` | – | Callback to run before each screenshot |
`captureWebsite()` accepts both sets of options combined (`CaptureWebsiteConfig`).
Three mechanisms to control which pages get captured:
Skip URLs matching glob patterns:
```typescript
await captureWebsite({
  url: "https://example.com",
  excludePatterns: [
    "/api/**",  // Skip all API routes
    "/user/*",  // Skip user profile pages
    "/*/edit",  // Skip edit pages
    "*.pdf",    // Skip PDF links
    "/auth/**", // Skip auth pages
  ],
});
```

When set, only matching URLs are captured:
```typescript
await captureWebsite({
  url: "https://example.com",
  includePatterns: [
    "/en/**",      // Only capture English pages
    "/products/*", // Only capture product pages
  ],
});
```

For complex logic:
```typescript
await captureWebsite({
  url: "https://example.com",
  shouldCapture: (url) => {
    const parsed = new URL(url);
    // Skip URLs with query parameters
    if (parsed.search) return false;
    // Skip paths with more than 4 segments
    if (parsed.pathname.split("/").filter(Boolean).length > 4) return false;
    return true;
  },
});
```

```bash
npx screenshoter capture <url> [options]
```

| Option | Description | Default |
|---|---|---|
| `-o, --output <dir>` | Output directory | `./screenshots` |
| `-f, --format <format>` | Image format: `webp`, `png`, `jpeg` | `webp` |
| `--width <number>` | Viewport width | `1440` |
| `--height <number>` | Viewport height | `900` |
| `--max-pages <number>` | Maximum pages to crawl | `100` |
| `--max-depth <number>` | Maximum crawl depth | `5` |
| `--exclude <patterns...>` | Glob patterns to exclude | – |
| `--include <patterns...>` | Glob patterns to include | – |
| `--concurrency <number>` | Parallel browser pages | `3` |
| `--wait <number>` | Wait ms after page load | `1000` |
| `--no-headless` | Show browser window | – |
| `--urls-file <path>` | Use URLs from a file instead of crawling | – |
Examples:
```bash
# Basic usage
npx screenshoter capture https://posty5.com

# Custom output and format
npx screenshoter capture https://posty5.com -o ./my-screenshots -f png

# Exclude dynamic pages
npx screenshoter capture https://posty5.com --exclude "/api/**" "/user/*" "/auth/**"

# Only capture specific sections
npx screenshoter capture https://posty5.com --include "/en/**"

# Limit crawl depth and page count
npx screenshoter capture https://posty5.com --max-pages 20 --max-depth 2

# Use a pre-collected URLs file
npx screenshoter capture https://posty5.com --urls-file urls.txt
```

```bash
npx screenshoter collect-urls <url> [options]
```

| Option | Description | Default |
|---|---|---|
| `--file <path>` | Output file path | `urls.txt` |
| `--max-pages <number>` | Maximum pages to crawl | `100` |
| `--max-depth <number>` | Maximum crawl depth | `5` |
| `--exclude <patterns...>` | Glob patterns to exclude | – |
| `--include <patterns...>` | Glob patterns to include | – |
| `--no-headless` | Show browser window | – |
Examples:
```bash
# Collect all URLs
npx screenshoter collect-urls https://posty5.com

# Save to custom file with filters
npx screenshoter collect-urls https://posty5.com --file sitemap.txt --exclude "/api/**"
```

Some pages reveal content as you scroll (using IntersectionObserver or CSS fade-in classes). A viewport-only screenshot would capture those elements while they are still invisible or empty.
Use `scrollToEnd: true` or the built-in `scrollToEnd` utility to scroll through the page first:
```typescript
import { capturePages, scrollToEnd } from "@posty5/screenshoter";

// Option 1: config flag (auto-enabled when fullPage: true)
const fullPageResult = await capturePages(urls, {
  fullPage: true, // scrollToEnd is automatically true
  waitAfterLoad: 1000,
});

// Option 2: explicit flag
const scrolledResult = await capturePages(urls, {
  scrollToEnd: true,
  waitAfterLoad: 1000,
});

// Option 3: use the scrollToEnd utility as a beforeScreenshot hook
const hookedResult = await capturePages(urls, {
  fullPage: true,
  beforeScreenshot: scrollToEnd,
});
```

Dismiss cookie banners, close popups, or interact with the page before taking the screenshot:
```typescript
await captureWebsite({
  url: "https://example.com",
  beforeScreenshot: async (page) => {
    // Dismiss cookie consent
    const cookieBtn = page.locator('button:has-text("Accept")');
    if (await cookieBtn.isVisible()) {
      await cookieBtn.click();
      await page.waitForTimeout(500);
    }
  },
});
```
});First collect URLs, manually edit the list, then capture only the URLs you want:
```bash
# Step 1: Collect all URLs
npx screenshoter collect-urls https://posty5.com --file urls.txt

# Step 2: Edit urls.txt and remove any pages you don't want

# Step 3: Capture only the remaining URLs
npx screenshoter capture https://posty5.com --urls-file urls.txt
```

Or programmatically with separated configs:
```typescript
import { collectUrls, capturePages } from "@posty5/screenshoter";

// Step 1: Collect URLs (CollectUrlsConfig only)
const urls = await collectUrls({
  url: "https://posty5.com",
  strategy: "sitemap",
  maxPages: 500,
  excludePatterns: ["/api/**"],
  outputFile: "./urls.txt",
});

// Step 2: Filter programmatically
const staticPages = urls.filter((u) => !u.includes("/trends/"));

// Step 3: Capture with separate config (CaptureConfig only)
const result = await capturePages(staticPages, {
  outputDir: "./screenshots",
  format: "png",
  viewport: { width: 1440, height: 710 },
  concurrency: 3,
});
```

The output folder mirrors the URL path structure:
```
screenshots/
├── posty5.com/
│   ├── capture.webp              ← https://posty5.com
│   ├── en/
│   │   ├── capture.webp          ← https://posty5.com/en
│   │   ├── social-media-publisher/
│   │   │   └── capture.webp      ← https://posty5.com/en/social-media-publisher
│   │   ├── qr-code-generator/
│   │   │   └── capture.webp      ← https://posty5.com/en/qr-code-generator
│   │   └── url-shortener/
│   │       └── capture.webp      ← https://posty5.com/en/url-shortener
│   └── ar/
│       ├── capture.webp          ← https://posty5.com/ar
│       └── social-media-publisher/
│           └── capture.webp      ← https://posty5.com/ar/social-media-publisher
```
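A folder laid out this way can be walked back into (URL, file) pairs, e.g. to feed a report or a diff tool. This is a sketch, not a package API; it assumes the default `capture.<format>` naming and https URLs:

```typescript
import { readdirSync, statSync } from "node:fs";
import * as path from "node:path";

// Walk a mirrored screenshots folder and rebuild the page URL for each
// capture file. Layout assumed: <outputDir>/<host>/<segments...>/capture.<ext>
function listCaptures(outputDir: string): Array<{ url: string; filePath: string }> {
  const results: Array<{ url: string; filePath: string }> = [];
  const walk = (dir: string) => {
    for (const name of readdirSync(dir)) {
      const full = path.join(dir, name);
      if (statSync(full).isDirectory()) {
        walk(full);
      } else if (name.startsWith("capture.")) {
        // Path relative to outputDir: first segment is the host, rest is the page path.
        const rel = path.relative(outputDir, path.dirname(full));
        const segments = rel.split(path.sep);
        results.push({ url: `https://${segments.join("/")}`, filePath: full });
      }
    }
  };
  walk(outputDir);
  return results;
}
```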
- Node.js: >= 18.0.0
- TypeScript: Full type definitions included
- Browser: Chromium (auto-installed via Playwright)
- Website: https://posty5.com
- Dashboard: https://studio.posty5.com
- GitHub: https://github.com/Posty5/Screenshoter
- Support: https://posty5.com/contact-us
- GitHub Issues: Report bugs or request features
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Make your changes
- Commit your changes: `git commit -m 'Add amazing feature'`
- Push to the branch: `git push origin feature/amazing-feature`
- Submit a pull request
MIT License - see LICENSE file for details.
Made with ❤️ by the Posty5 team