
Updated README and documentation

chriso committed Nov 17, 2010
1 parent 01eb509 commit 8f797fb657c6eff0e024d90cc16dfad26bcc026e
Showing with 24 additions and 21 deletions.
  1. +21 −12 README.md
  2. +3 −9 docs/README.md
README.md
@@ -1,27 +1,36 @@
-# [node.io](http://node.io/)
-
-A distributed data scraping and processing engine for [Node.js](http://nodejs.org/)
-
To install node.io, use [npm](http://github.com/isaacs/npm):
$ npm install node.io
For usage details, run
$ node.io --help
-
+
+## What is [node.io](http://node.io/)?
+
+node.io is a framework for scraping and processing data. A node.io job typically consists of a) taking some input, b) using or transforming it, and c) outputting something.
+
+node.io can simplify the process of:
+
+- Filtering / sanitizing a list
+- MapReduce
+- Loading a list of URLs and scraping and saving some data from each
+- Parsing log files
+- Transforming data from one format to another, e.g. from CSV to a database
+- Recursively loading all files in a directory and its subdirectories and executing a command on each file
+
## Why node.io?
- Create modular and extensible jobs for scraping and processing data
-- Seamlessly distribute work among child processes and other servers (soon)
-- Written in Node.js == FAST
-- Handles a variety of input / output
+- Written in Node.js and JavaScript - jobs are concise, asynchronous and FAST
+- Speed up execution by distributing work among child processes and other servers (soon)
+- Easily handle a variety of input / output situations
* Reading / writing lines to and from files
- * Reading all files in a directory (and recursing if specified)
- * To / from a database
+ * Reading all files in a directory (and optionally recursing)
+ * Reading / writing rows to and from a database
* STDIN / STDOUT
* Piping between other node.io jobs
- * Custom IO / any combination of the above
+ * Any combination of the above, or completely custom IO
- Includes a robust framework for scraping and selecting web data
- Support for a variety of proxies when making requests
- Includes a data validation and sanitization framework
@@ -31,7 +40,7 @@ For usage details, run
Initial documentation is [available here](https://github.com/chriso/node.io/tree/master/docs/).
-Better documentation will be available once I have time to write it. See [http://node.io/](http://node.io/) for updates.
+Better documentation will be available once I have time to write it.. See [http://node.io/](http://node.io/) for updates.
## Examples
docs/README.md
@@ -1,20 +1,14 @@
-node.io executes jobs in the following format.
-
-job.js
+A node.io job takes the following format
var Job = require('node.io').Job;
-
var options = {}, methods = {};
-
exports.job = new Job(options, methods);
-To run job.js from the command line, run the following command in the same directory:
+To run this job (e.g. saved as myjob.js) from the command line, run the following command in the same directory
$ node.io myjob
-A typical node.io job typically consists of a) taking some input, b) using or transforming it, and c) outputting something.
-
-A full list of available job methods and options is [available here](#). however jobs typically contain an input, run, and output method. If omitted, input and output default to STDIN and STDOUT.
+A full list of available job methods and options is [available here](#), however jobs typically contain an input, run, and output method. If omitted, input and output default to STDIN and STDOUT.
## Getting started
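
The input / run / output flow described in the docs change above can be sketched in plain JavaScript. This is only an illustration of the idea, not node.io's API: `runJob`, its `fallbackInput` parameter, and the return-value style of `run` are all hypothetical names and conventions made up for this example, and the sketch is synchronous for brevity.

```javascript
// Illustrative sketch only: runJob is a hypothetical helper, NOT part of node.io.
// It models a job as three methods: input supplies items, run transforms
// each item, and output receives the transformed results.
function runJob(methods, fallbackInput) {
    // When input or output is omitted, fall back to defaults. Here the
    // defaults are an in-memory array and console.log, standing in for the
    // STDIN / STDOUT defaults the documentation mentions.
    var input = methods.input || function () { return fallbackInput; };
    var run = methods.run || function (item) { return item; };
    var output = methods.output || function (results) {
        results.forEach(function (result) { console.log(result); });
    };

    var results = input().map(function (item) { return run(item); });
    output(results);
    return results;
}

// A job that trims and upper-cases each line, then collects the results.
var saved = [];
var methods = {
    input: function () { return [' alpha ', 'beta ']; },
    run: function (line) { return line.trim().toUpperCase(); },
    output: function (results) { saved = saved.concat(results); }
};

var results = runJob(methods, []);
```

Keeping the three steps as separate functions means any one of them can be swapped (for example, a different output target) without touching the others, which is the modularity the README's "Why node.io?" list points at.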
