Skip to content
/ jawa Public

Visual scraper interface, exports to puppeteer script which you can run anywhere.

License

Notifications You must be signed in to change notification settings

capJavert/jawa

Repository files navigation

Jawa - Visual Scraper

npm Chrome Web Store

🇭🇷 Started in Croatia

DALL·E 2022-10-17 03 53 08 (2)

Visual scraper interface, exports to puppeteer script which you can run anywhere. You can try it out here https://jawa.sh

Jawa allows you to visually click elements of any website and then export selectors as a config that you can run in any node environment to scrape the content when needed.

This repo consists of the:

  • web app
  • cli
  • browser extension

Web app

Web app that provides embedded browser for visually selecting elements and creating the scraper config that you can download and run through the CLI or Cloud.

Cloud scraping (Beta)

It is now supported to run your scraper config in the cloud directly from web app. Cloud scrapers use the same Jawa CLI. Currently cloud scrapers have limited availability.

If you need more usage you can check out Jawa Pro.

CLI

Simple CLI to run configs created and exported from web app. You can run it like this:

npx jawa path/to/scraper/config/file.json

or npx jawa --help to see all the options.

jawa package now also exports scrape function so it can be used outside of CLI in your apps or services:

import { scrape } from 'jawa'
const { scrape } = require('jawa')

Browser extension

Browser extension that runs the embedded browser which powers the visual scraper interface.

It is available on:

  • Chrome Web Store
  • Chrome extensions also work on all Chromium based browsers like:
    • Opera
    • Microsoft Edge
    • Brave