# What is [node.io](http://node.io/)?

node.io is a data scraping and processing framework for [node.js](http://nodejs.org/).

node.io can simplify the process of:

- Filtering / sanitizing a list
- MapReduce
- Scraping data from the web using familiar CSS selectors / traversal methods
- Scraping web data through a proxy
- Parsing log files
- Transforming data from one format to another, e.g. from CSV to a database
- Recursively loading all files in a directory and its subdirectories and executing a command on each
- etc.

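As a rough illustration of the first task in the list above (filtering / sanitizing a list), here is a minimal sketch in plain JavaScript. It deliberately does not use the node.io API; `sanitizeList` is a hypothetical helper invented for this example, and the choice to lowercase entries and drop duplicates is an assumption about what "sanitizing" might mean for a given job.

```javascript
// Hypothetical helper, not part of node.io: trims and lowercases each entry,
// skips empty entries, and removes duplicates while preserving order.
function sanitizeList(lines) {
    const seen = new Set();
    const result = [];
    for (const line of lines) {
        const clean = String(line).trim().toLowerCase();
        if (clean === '' || seen.has(clean)) {
            continue; // drop blanks and repeats
        }
        seen.add(clean);
        result.push(clean);
    }
    return result;
}
```

A node.io job would presumably wrap logic like this in its own input / run / output steps; see the linked documentation for the actual job structure.
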
## Why node.io?

- Create modular and extensible jobs for scraping and processing data
- Jobs are written in JavaScript or CoffeeScript and run in Node.js - jobs are concise, asynchronous and FAST
- Speed up execution by distributing work among child processes and other servers (soon)
- Easily handle a variety of input / output situations
    * Reading / writing lines to and from files
    * Traversing files in a directory
    * Reading / writing rows to and from a database
    * STDIN / STDOUT / Custom streams
    * Piping between other node.io jobs
    * Any combination of the above, or your own IO
- Includes a robust framework for scraping and selecting web data
- Support for a variety of proxies when scraping web data
- Includes a data validation and sanitization framework
- Provides support for retries, timeouts, dynamically adding input, etc.

## Installation

To install node.io, use [npm](http://github.com/isaacs/npm):

    $ npm install node.io

For usage details, run

    $ node.io --help

## Documentation

To get started, see the [documentation](https://github.com/chriso/node.io/blob/master/docs/README.md), [API](https://github.com/chriso/node.io/blob/master/docs/api.md), and [examples](https://github.com/chriso/node.io/tree/master/examples/).

Better documentation will be available once I have time to write it.

## Roadmap

- Fix up the [http://node.io/](http://node.io/) site
- Automatically handle HTTP codes, e.g. redirect on 3xx or call fail() on 4xx/5xx
- Nested requests inherit referrer / cookies if to the same domain
- Add more DOM selector / traversal methods
- Test proxy callbacks and write proxy documentation
- Add distributed processing
- Installation without NPM (install.sh)
- Refactoring
- More tests / better test coverage

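The planned HTTP code handling in the list above could be sketched as a simple status-to-action mapping. This is only an illustration of the behaviour the roadmap item describes, not node.io's implementation; `actionForStatus` and the action names it returns are hypothetical.

```javascript
// Hypothetical helper, not node.io's API: maps an HTTP status code to the
// action described in the roadmap (redirect on 3xx, fail on 4xx/5xx).
function actionForStatus(statusCode) {
    if (statusCode >= 300 && statusCode < 400) {
        return 'redirect'; // follow the redirect target
    }
    if (statusCode >= 400 && statusCode < 600) {
        return 'fail'; // the job would call fail() here
    }
    return 'continue'; // treat everything else as a normal response
}
```
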
## Credits

node.io wouldn't be possible without

- [ry's](https://github.com/ry) [node.js](http://nodejs.org/)
- [tautologistics'](https://github.com/tautologistics) [node-htmlparser](https://github.com/tautologistics/node-htmlparser)
- [harryf's](https://github.com/harryf) [soupselect](https://github.com/harryf/node-soupselect)
- [kriszyp's](https://github.com/kriszyp) [multi-node](https://github.com/kriszyp/multi-node)

## License

(MIT License)

Copyright (c) 2010 Chris O'Hara <cohara87@gmail.com>

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.