To install node.io, use [npm](http://github.com/isaacs/npm):

    $ npm install node.io

For usage details, run:

    $ node.io --help

## What is [node.io](http://node.io/)?

node.io is a framework for scraping and processing data. A node.io job typically consists of a) taking some input, b) using or transforming it, and c) outputting something.

node.io can simplify the process of:

- Filtering / sanitizing a list
- MapReduce
- Loading a list of URLs, then scraping and saving some data from each
- Parsing log files
- Transforming data from one format to another, e.g. from CSV to a database
- Recursively loading all files in a directory and its subdirectories and executing a command on each file

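The input / transform / output shape described above can be sketched in plain Node.js. This is only an illustration of the idea, not the node.io API: the function names `input`, `run`, `output` and `runJob`, the sample data, and the email pattern are all assumptions made for this example.

```javascript
// Hypothetical sketch in plain Node.js; none of these names come from node.io.
// A "job" here is just three steps: input, transform, output.

// a) Input: a list of raw lines (in practice these might come from a file or STDIN).
function input() {
    return ['  Alice@Example.com ', 'not-an-email', 'bob@example.com', ''];
}

// b) Transform: trim and lower-case each line, then keep only email-like values.
function run(lines) {
    return lines
        .map(function (line) { return line.trim().toLowerCase(); })
        .filter(function (line) { return /^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(line); });
}

// c) Output: join the surviving values into a newline-separated string.
function output(results) {
    return results.join('\n');
}

// Wire the three steps together.
function runJob() {
    return output(run(input()));
}

console.log(runJob());
```

Keeping the three steps as separate functions means the input or output step can be swapped (for example, a file for STDIN) without touching the transform.
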
## Why node.io?

- Create modular and extensible jobs for scraping and processing data
- Written in Node.js and JavaScript - jobs are concise, asynchronous and FAST
- Speed up execution by distributing work among child processes and other servers (soon)
- Easily handle a variety of input / output situations
    * Reading / writing lines to and from files
    * Reading all files in a directory (and optionally recursing)
    * Reading / writing rows to and from a database
    * STDIN / STDOUT
    * Piping between other node.io jobs
    * Any combination of the above, or completely custom IO
- Includes a robust framework for scraping and selecting web data
- Support for a variety of proxies when making requests
- Includes a data validation and sanitization framework
- Provides support for retries, timeouts, dynamically adding input, etc.

## Documentation

Initial documentation is [available here](https://github.com/chriso/node.io/tree/master/docs/).

Better documentation will be available once I have time to write it. See [http://node.io/](http://node.io/) for updates.

## Examples

See [./examples](https://github.com/chriso/node.io/tree/master/examples/)

## Roadmap

- Automatically handle HTTP codes, e.g. redirect on 3xx or call fail() on 4xx/5xx
- Nested requests inherit referrer / cookies when made to the same domain
- Add more DOM selector / traversal methods
- Test proxy callbacks
- Add distributed processing
- Installation without NPM (install.sh)
- Refactoring

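As a rough illustration of the first roadmap item, the status-code decision could look something like the sketch below. The helper name `classifyStatus` and its return values are hypothetical and not part of node.io; the sketch only maps a status code to an action and does not perform the redirect or call `fail()` itself.

```javascript
// Hypothetical helper; the name and return values are illustrative only.
// Maps an HTTP status code to the action the roadmap item describes.
function classifyStatus(statusCode) {
    // 3xx: treated as a redirect here (a real implementation would also
    // need a Location header and would special-case codes such as 304).
    if (statusCode >= 300 && statusCode < 400) {
        return 'redirect';
    }
    // 4xx / 5xx: client or server error, so the job should fail.
    if (statusCode >= 400 && statusCode < 600) {
        return 'fail';
    }
    // Everything else is passed through as a normal response.
    return 'ok';
}

console.log(classifyStatus(301), classifyStatus(404), classifyStatus(200));
```
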
## Credits

node.io uses the following awesome libraries:

- [tautologistics'](https://github.com/tautologistics) [node-htmlparser](https://github.com/tautologistics/node-htmlparser)
- [harryf's](https://github.com/harryf) [soupselect](https://github.com/harryf/node-soupselect)
- [kriszyp's](https://github.com/kriszyp) [multi-node](https://github.com/kriszyp/multi-node)

## License

[MIT License](https://github.com/chriso/node.io/raw/master/LICENSE)