
Note: this library is no longer maintained.

I wrote this library in 2010, when node.js was still in its infancy and the npm repository didn't have the amazing choice of libraries it does today.

Since it's now quite trivial to write your own scraper, I've decided to stop maintaining the library.

Here's an example using request, cheerio and async.

var request = require('request')
  , cheerio = require('cheerio')
  , async = require('async')
  , format = require('util').format;

var reddits = [ 'programming', 'javascript', 'node' ]
  , concurrency = 2;

async.eachLimit(reddits, concurrency, function (reddit, next) {
    var url = format('', reddit);
    request(url, function (err, response, body) {
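        // Pass request errors to async so the iteration stops cleanly.
        if (err) return next(err);
        var $ = cheerio.load(body);
        // Print the title and link of every post on the page.
        $('a.title').each(function () {
            console.log('%s (%s)', $(this).text(), $(this).attr('href'));
        });
        // Signal that this reddit is done so the next one can start.
        next();
    });
}, function (err) {
    if (err) throw err;
});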
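
To make the role of the concurrency argument concrete, here is a minimal sketch of what a limit-based iterator does, written in plain JavaScript with no dependencies. It is a simplified stand-in and not the async library's actual implementation. The names eachLimit (as defined here) and fakeFetch are hypothetical, and setTimeout stands in for the real HTTP request.

```javascript
// Hypothetical helper: a simplified stand-in for async.eachLimit,
// not the library's source. Runs `worker` over `items` with at most
// `limit` workers in flight, then calls `done(err)` exactly once.
function eachLimit(items, limit, worker, done) {
    var index = 0          // position of the next item to start
      , running = 0        // number of workers currently in flight
      , finished = false;  // set once `done` has been called

    if (items.length === 0) return done(null);

    function launch() {
        // Start workers until the limit is reached or the items run out.
        while (!finished && running < limit && index < items.length) {
            running++;
            worker(items[index++], onWorkerDone);
        }
    }

    function onWorkerDone(err) {
        if (finished) return;   // ignore callbacks arriving after an error
        running--;
        if (err) {
            finished = true;
            return done(err);
        }
        if (index >= items.length && running === 0) {
            finished = true;
            return done(null);
        }
        launch();               // a slot freed up, start the next item
    }

    launch();
}

// Usage: an assumed fake fetch that takes 10 ms per reddit.
var reddits = [ 'programming', 'javascript', 'node' ];

function fakeFetch(reddit, next) {
    setTimeout(function () {
        console.log('fetched %s', reddit);
        next();
    }, 10);
}

eachLimit(reddits, 2, fakeFetch, function (err) {
    if (err) throw err;
    console.log('all done');
});
```

With a limit of 2, the third reddit is only requested once one of the first two has finished, which keeps the scraper from hitting the site with every request at once.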

Happy scraping.
