Note: this library is no longer maintained.

I wrote node.io in 2010, when node.js was still in its infancy and the npm repository didn't have the amazing choice of libraries that it does today.

Since it's now quite trivial to write your own scraper, I've decided to stop maintaining the library.

Here's an example using request, cheerio and async.

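var request = require('request')
  , cheerio = require('cheerio')
  , async = require('async')
  , format = require('util').format;

var reddits = [ 'programming', 'javascript', 'node' ]
  , concurrency = 2;

// Fetch each subreddit page, at most `concurrency` requests at a time.
async.eachLimit(reddits, concurrency, function (reddit, next) {
    var url = format('http://reddit.com/r/%s', reddit);
    request(url, function (err, response, body) {
        // Pass errors to async instead of throwing inside the callback.
        if (err) return next(err);
        var $ = cheerio.load(body);
        // Print the text and href of every link with class "title".
        $('a.title').each(function () {
            console.log('%s (%s)', $(this).text(), $(this).attr('href'));
        });
        next();
    });
}, function (err) {
    // Runs once after all pages are done, or as soon as one fails.
    if (err) console.error(err);
});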
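
To make the concurrency limit in the example above less of a black box, here is a rough, dependency-free sketch of the idea behind `async.eachLimit`. It is not the async library's actual implementation; the function only borrows the name and callback shape for illustration, and `fakeFetch` is a hypothetical stand-in for a real HTTP request.

```javascript
// Illustrative sketch only: run `worker` over `items`, keeping at most
// `limit` calls in flight, then call `done` once (with the first error, if any).
function eachLimit(items, limit, worker, done) {
    var index = 0      // next item to start
      , running = 0    // workers currently in flight
      , finished = false;

    if (items.length === 0) return done(null);

    function launch() {
        // Start workers until the limit is reached or the items run out.
        while (!finished && running < limit && index < items.length) {
            running++;
            worker(items[index++], function (err) {
                running--;
                if (finished) return;
                if (err) { finished = true; return done(err); }
                if (index >= items.length && running === 0) {
                    finished = true;
                    return done(null);
                }
                launch(); // a slot freed up, so start the next item
            });
        }
    }

    launch();
}

// Hypothetical stand-in for an HTTP request: completes after a short delay.
function fakeFetch(name, next) {
    setTimeout(function () {
        console.log('fetched %s', name);
        next();
    }, 10);
}

eachLimit([ 'programming', 'javascript', 'node' ], 2, fakeFetch, function (err) {
    if (err) return console.error(err);
    console.log('all done');
});
```

With a limit of 2, the third item only starts once one of the first two has called `next`, which is the same throttling the scraper relies on to avoid firing every request at once.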

Happy scraping.
