README.md

Homepage: http://nrabinowitz.github.io/pjscrape/

Overview

pjscrape is a framework for anyone who's ever wanted a command-line tool for web scraping using JavaScript and jQuery. Built for PhantomJS, it lets you scrape pages in a fully rendered, JavaScript-enabled context from the command line, with no browser required.

Dependencies

Features

  • Client-side, JavaScript-based scraping environment with full access to jQuery functions
  • Easy, flexible syntax for setting up one or more scrapers
  • Recursive/crawl scraping
  • Delay scrape until a "ready" condition occurs
  • Load your own scripts on the page before scraping
  • Modular architecture for logging and writing/formatting scraped items
  • Client-side utilities for common tasks
  • Growing set of unit tests
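
As a rough illustration of the scraper syntax and the "ready" condition listed above, here is a minimal sketch of a config file. The option names (`url`, `ready`, `scraper`) and the `pjs.addSuite` call are assumptions recalled from the project documentation, and the URL and selectors are hypothetical; check the homepage for the authoritative API before relying on any of this.

```javascript
// my_config.js: a hypothetical pjscrape config file (ES5, since it runs in PhantomJS).
// Option names (url, ready, scraper) are assumptions based on the project
// documentation; verify them against the homepage before use.
var suite = {
    // Hypothetical page to scrape
    url: 'http://example.com/',
    // Delay scraping until the page contains at least one <h1>
    ready: function () {
        return $('h1').length > 0;
    },
    // Runs in the page context, where jQuery is available as $;
    // returns the text of every <h1> as a plain array
    scraper: function () {
        return $('h1').map(function () {
            return $(this).text();
        }).toArray();
    }
};

// pjs is assumed to be the global that pjscrape exposes to config files;
// the guard lets this file load without error outside of PhantomJS.
if (typeof pjs !== 'undefined') {
    pjs.addSuite(suite);
}
```

Assuming the usual invocation, a file like this would be passed to the pjscrape script on the command line, along the lines of `phantomjs /path/to/pjscrape.js my_config.js`. Keeping the suite in a plain object makes it easy to inspect or reuse before registering it.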

Please see http://nrabinowitz.github.io/pjscrape/ for usage, examples, and documentation.

Comments and questions welcomed at: nick (at) nickrabinowitz (dot) com.

About

A web-scraping framework written in JavaScript, using PhantomJS and jQuery
