Sample search engine with web crawler, built on Node.js + CouchDB + Limestone

This is a really simple search engine, built with:

  • node.js
  • Sphinx search server
  • CouchDB
  • Express web framework

Right now it targets Node.js v0.1.90 (process.mixin is the only thing that breaks on 0.1.91).


First, set the host to crawl in settings.js (a default host is already set there). You also need CouchDB installed, with at least one database; put the database name and host into settings.js. You also need the Sphinx search server installed.



node spider.js

The spider starts crawling pages and storing them in the database.

To index the crawled pages with Sphinx, add node /path/to/xmlpipe2.js to sphinx.conf as an xmlpipe2 source. To index the newly created source right away, run the indexer manually:

indexer <sourcename>
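The sphinx.conf entry might look like this — the source and index names are placeholders, while type and xmlpipe_command are standard Sphinx xmlpipe2 directives:

```conf
source nodesearch
{
    type            = xmlpipe2
    xmlpipe_command = node /path/to/xmlpipe2.js
}

index nodesearch
{
    source = nodesearch
    path   = /var/lib/sphinx/nodesearch
}
```

With that in place, `indexer nodesearch` builds the index from whatever xmlpipe2.js emits on stdout.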


To start the site, launch

node app.js

The site will be available on port 8000. Enter search terms in the form and submit it.