To install node.io, use [npm](http://github.com/isaacs/npm):

    $ npm install node.io

For usage details, run:

    $ node.io --help

To get started, see the [documentation](https://github.com/chriso/node.io/blob/master/docs/README.md), [examples](https://github.com/chriso/node.io/tree/master/examples/), or [API](https://github.com/chriso/node.io/blob/master/docs/api.md).

## What is [node.io](http://node.io/)?

node.io is a data scraping and processing framework for [node.js](http://nodejs.org/).

A node.io job typically consists of a) taking some input, b) using or transforming it, and c) outputting something.

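The three stages described above can be sketched in plain JavaScript. This is only an illustration of the input / transform / output shape, not node.io's actual API: the `job` object, its `input`, `run` and `output` properties, and the `runJob` helper are all hypothetical names chosen for this example.

```javascript
// Hypothetical three-stage job in plain JavaScript. This is NOT node.io's API;
// it only illustrates the input -> transform -> output shape of a job.
var job = {
    // a) Take some input: here, a fixed list of raw lines
    input: function () {
        return ['  alice ', 'BOB', '', ' Carol'];
    },
    // b) Use or transform it: trim and lower-case each line,
    //    returning null to skip lines that end up empty
    run: function (line) {
        var cleaned = line.trim().toLowerCase();
        return cleaned.length > 0 ? cleaned : null;
    },
    // c) Output something: print one cleaned row per line
    output: function (rows) {
        rows.forEach(function (row) {
            console.log(row);
        });
    }
};

// Hypothetical runner: feeds every input item through run(), collects the
// non-null results, passes them to output(), and returns them.
function runJob(spec) {
    var results = [];
    spec.input().forEach(function (item) {
        var result = spec.run(item);
        if (result !== null) {
            results.push(result);
        }
    });
    spec.output(results);
    return results;
}

runJob(job); // prints "alice", "bob" and "carol" on separate lines
```

Keeping each stage as a separate function is what makes a job of this shape easy to recombine, for example by swapping the fixed list for lines read from a file.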

node.io can simplify the process of:

- Filtering / sanitizing a list
- MapReduce
- Loading a list of URLs and scraping some data from each
- Parsing log files
- Transforming data from one format to another, e.g. from CSV to a database
- Recursively loading all files in a directory and executing a command on each
- And more

## Why node.io?

- Create modular and extensible jobs for scraping and processing data
- Written in Node.js and JavaScript - jobs are concise, asynchronous and FAST
- Speed up execution by distributing work among child processes and other servers (soon)
- Easily handle a variety of input / output situations
    * Reading / writing lines to and from files
    * Reading all files in a directory (and optionally recursing)
    * Reading / writing rows to and from a database
    * STDIN / STDOUT
    * Piping between other node.io jobs
    * Any combination of the above, or your own IO
- Includes a robust framework for scraping and selecting web data
- Support for a variety of proxies when making requests
- Includes a data validation and sanitization framework
- Provides support for retries, timeouts, dynamically adding input, etc.

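To make the retry support mentioned in the list above more concrete, here is a minimal callback-style sketch of the general technique in plain JavaScript. It does not show how node.io implements retries: `withRetries` and `flakyTask` are hypothetical names, and timeouts are left out to keep the example short.

```javascript
// Hypothetical helper, not part of node.io: call `task` until it succeeds or
// `maxAttempts` is reached, then report the outcome to `done`.
// `task` receives a Node-style callback: callback(err, result).
function withRetries(task, maxAttempts, done) {
    var attempt = 0;

    function tryOnce() {
        attempt += 1;
        task(function (err, result) {
            if (!err) {
                return done(null, result, attempt);
            }
            if (attempt >= maxAttempts) {
                // Out of attempts: give up and pass the last error along
                return done(err, undefined, attempt);
            }
            tryOnce(); // Failed, but attempts remain: try again
        });
    }

    tryOnce();
}

// Usage: a task that fails twice before succeeding.
var calls = 0;
function flakyTask(callback) {
    calls += 1;
    if (calls < 3) {
        return callback(new Error('temporary failure'));
    }
    callback(null, 'ok');
}

withRetries(flakyTask, 5, function (err, result, attempts) {
    if (err) {
        console.log('failed: ' + err.message);
    } else {
        console.log(result + ' after ' + attempts + ' attempts'); // prints "ok after 3 attempts"
    }
});
```

A real scraper would usually also wait between attempts and bound each attempt with a timeout; both are omitted here.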

## Documentation

Initial documentation is [available here](https://github.com/chriso/node.io/tree/master/docs/).

Better documentation will be available once I have time to write it.

## Roadmap

- Fix up the [http://node.io/](http://node.io/) site
- Automatically handle HTTP status codes, e.g. redirect on 3xx or call `fail()` on 4xx/5xx
- Have nested requests inherit the referrer and cookies when made to the same domain
- Add more DOM selector / traversal methods
- Test proxy callbacks
- Add distributed processing
- Installation without npm (install.sh)
- Refactoring
- More tests / better test coverage

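As a rough illustration of the status-code item in the roadmap above, the decision could be expressed as a small function. This is a sketch of one possible approach, not the planned implementation: `actionForStatus` and the action names it returns are hypothetical. It also treats the whole 3xx range as a redirect, which is a simplification (304 Not Modified, for instance, is not one).

```javascript
// Hypothetical helper, not part of node.io: map an HTTP status code to the
// action a scraper might take. Assumes every 3xx code means "follow a
// redirect", which is a simplification.
function actionForStatus(statusCode) {
    if (statusCode >= 300 && statusCode < 400) {
        return 'redirect'; // 3xx: follow the redirect
    }
    if (statusCode >= 400 && statusCode < 600) {
        return 'fail';     // 4xx / 5xx: treat the request as failed
    }
    return 'continue';     // anything else: process the response as usual
}

console.log(actionForStatus(200)); // prints "continue"
console.log(actionForStatus(301)); // prints "redirect"
console.log(actionForStatus(404)); // prints "fail"
console.log(actionForStatus(503)); // prints "fail"
```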

## Credits

node.io uses the following libraries:

- [ry's](https://github.com/ry) [node.js](http://nodejs.org/)
- [tautologistics'](https://github.com/tautologistics) [node-htmlparser](https://github.com/tautologistics/node-htmlparser)
- [harryf's](https://github.com/harryf) [soupselect](https://github.com/harryf/node-soupselect)
- [kriszyp's](https://github.com/kriszyp) [multi-node](https://github.com/kriszyp/multi-node)

## License

[MIT License](https://github.com/chriso/node.io/raw/master/LICENSE)