Permalink
Switch branches/tags
Nothing to show
Find file
Fetching contributors…
Cannot retrieve contributors at this time
624 lines (433 sloc) 24.3 KB
<html>
<head>
<title>Mastering Node</title>
<style>
body {
font: 14px/1.4 "Lucida Grande", "Helvetica Neue", Arial, sans-serif;
padding: 50px 180px;
}
h1 {
padding-left: 5px;
border-bottom: 3px solid #eee;
}
pre {
margin: 15px 0;
padding: 15px;
border: 1px solid #eee;
}
a {
color: #00aaff;
}
</style>
</head>
<body><div class='mp'>
<h1>Mastering Node</h1>
<p><a href="http://nodejs.org/">Node</a> is an exciting new platform developed by <em>Ryan Dahl</em>, allowing JavaScript developers to create extremely high performance servers by leveraging <a href="http://code.google.com/p/v8/">Google's V8</a> JavaScript engine, and asynchronous I/O. In <em>Mastering Node</em> we will discover how to write high concurrency web servers, utilizing the CommonJS module system, node's core libraries, third party modules, high level web development and more.</p>
</div>
<div class='mp'>
<h1>Installing Node</h1>
<p>In this chapter we will be looking at the installation and compilation of node. Although there are several ways we may install node, we will be looking at <a href="http://github.com/mxcl/homebrew">homebrew</a>, <a href="http://github.com/visionmedia/ndistro">nDistro</a>, and the most flexible method, of course - compiling from source.</p>
<h3 id="Homebrew">Homebrew</h3>
<p>Homebrew is a package management system for <em>OSX</em> written in Ruby, is extremely well adopted, and easy to use. To install node via the <code>brew</code> executable simply run:</p>
<pre><code>$ brew install node.js
</code></pre>
<h2 id="nDistro">nDistro</h2>
<p><a href="http://github.com/visionmedia/ndistro">nDistro</a> is a distribution toolkit for node, which allows creation and installation of node distros within seconds. An <em>nDistro</em> is simply a dotfile named <em>.ndistro</em> which defines
module and node binary version dependencies. In the example
below we specify the node binary version <em>0.1.102</em>, as well as
several 3rd party modules.</p>
<pre><code>node 0.1.102
module senchalabs connect
module visionmedia express 1.0.0beta2
module visionmedia connect-form
module visionmedia connect-redis
module visionmedia jade
module visionmedia ejs
</code></pre>
<p>Any machine that can run a shell script can install distributions, and keeps dependencies defined to a single directory structure, making it easy to maintain an deploy. nDistro uses <a href="http://github.com/visionmedia/nodes">pre-compiled node binaries</a> making them extremely fast to install, and module tarballs which are fetched from <a href="http://github.com">GitHub</a> via <em>wget</em> or <em>curl</em> (auto detected).</p>
<p>To get started we first need to install nDistro itself, below we <em>cd</em> to our bin directory of choice, <em>curl</em> the shell script, and pipe the response to <em>sh</em> which will install nDistro to the current directory:</p>
<pre><code>$ cd /usr/local/bin &amp;&amp; curl http://github.com/visionmedia/ndistro/raw/master/install | sh
</code></pre>
<p>Next we can place the contents of our example in <em>./.ndistro</em>, and execute <em>ndistro</em> with no arguments, prompting the program to load the config, and start installing:</p>
<pre><code>$ ndistro
</code></pre>
<p>Installation of the example took less than 17 seconds on my machine, and outputs the following <em>stdout</em> indicating success. Not bad for an entire stack!</p>
<pre><code>... installing node-0.1.102-i386
... installing connect
... installing express 1.0.0beta2
... installing bin/express
... installing connect-form
... installing connect-redis
... installing jade
... installing bin/jade
... installing ejs
... installation complete
</code></pre>
<h2 id="Building-From-Source">Building From Source</h2>
<p>To build and install node from source, we first need to obtain the code. The first method of doing so is
via <code>git</code>, if you have git installed you can execute:</p>
<pre><code>$ git clone http://github.com/ry/node.git &amp;&amp; cd node
</code></pre>
<p>For those without <em>git</em>, or who prefer not to use it, we can also download the source via <em>curl</em>, <em>wget</em>, or similar:</p>
<pre><code>$ curl -# http://nodejs.org/dist/node-v0.1.99.tar.gz &gt; node.tar.gz
$ tar -zxf node.tar.gz
</code></pre>
<p>Now that we have the source on our machine, we can run <code>./configure</code> which discovers which libraries are available for node to utilize such as <em>OpenSSL</em> for transport security support, C and C++ compilers, etc. <code>make</code> which builds node, and finally <code>make install</code> which will install node.</p>
<pre><code>$ ./configure &amp;&amp; make &amp;&amp; make install
</code></pre>
</div>
<div class='mp'>
<h1>Globals</h1>
<p> As we have learnt, node's module system discourages the use of globals; however node provides a few important globals for use to utilize. The first and most important is the <code>process</code> global, which exposes process manipulation such as signalling, exiting, the process id (pid), and more. Other globals, such as the <code>console</code> object, are provided to those used to writing JavaScript for web browsers.</p>
<h2 id="console">console</h2>
<p>The <code>console</code> object contains several methods which are used to output information to <em>stdout</em> or <em>stderr</em>. Let's take a look at what each method does:</p>
<h3 id="console-log-">console.log()</h3>
<p>The most frequently used console method is <code>console.log()</code>, which simply writes to <em>stdout</em> and appends a line feed (<code>\n</code>). Currently aliased as <code>console.info()</code>.</p>
<pre><code>console.log('wahoo');
// =&gt; wahoo
console.log({ foo: 'bar' });
// =&gt; [object Object]
</code></pre>
<h3 id="console-error-">console.error()</h3>
<p>Identical to <code>console.log()</code>, however writes to <em>stderr</em>. Aliased as <code>console.warn()</code> as well.</p>
<pre><code>console.error('database connection failed');
</code></pre>
<h3 id="console-dir-">console.dir()</h3>
<p>Utilizes the <em>sys</em> module's <code>inspect()</code> method to pretty-print the object to
<em>stdout</em>.</p>
<pre><code>console.dir({ foo: 'bar' });
// =&gt; { foo: 'bar' }
</code></pre>
<h3 id="console-assert-">console.assert()</h3>
<p>Asserts that the given expression is truthy, or throws an exception.</p>
<pre><code>console.assert(connected, 'Database connection failed');
</code></pre>
<h2 id="process">process</h2>
<p>The <code>process</code> object is plastered with goodies. First we will take a look
at some properties that provide information about the node process itself:</p>
<h3 id="process-version">process.version</h3>
<p>The node version string, for example "v0.1.103".</p>
<h3 id="process-installPrefix">process.installPrefix</h3>
<p>The installation prefix. In my case "<em>/usr/local</em>", as node's binary was installed to "<em>/usr/local/bin/node</em>".</p>
<h3 id="process-execPath">process.execPath</h3>
<p>The path to the executable itself "<em>/usr/local/bin/node</em>".</p>
<h3 id="process-platform">process.platform</h3>
<p>The platform you are running on. For example, "darwin".</p>
<h3 id="process-pid">process.pid</h3>
<p>The process id.</p>
<h3 id="process-cwd-">process.cwd()</h3>
<p>Returns the current working directory. For example:</p>
<pre><code>cd ~ &amp;&amp; node
node&gt; process.cwd()
"/Users/tj"
</code></pre>
<h3 id="process-chdir-">process.chdir()</h3>
<p>Changes the current working directory to the path passed.</p>
<pre><code>process.chdir('/foo');
</code></pre>
<h3 id="process-getuid-">process.getuid()</h3>
<p>Returns the numerical user id of the running process.</p>
<h3 id="process-setuid-">process.setuid()</h3>
<p>Sets the effective user id for the running process. This method accepts both a numerical id, as well as a string. For example both <code>process.setuid(501)</code>, and <code>process.setuid('tj')</code> are valid.</p>
<h3 id="process-getgid-">process.getgid()</h3>
<p>Returns the numerical group id of the running process.</p>
<h3 id="process-setgid-">process.setgid()</h3>
<p>Similar to <code>process.setuid()</code> however operates on the group, also accepting a numerical value or string representation. For example, <code>process.setgid(20)</code> or <code>process.setgid('www')</code>.</p>
<h3 id="process-env">process.env</h3>
<p>An object containing the user's environment variables. For example:</p>
<pre><code>{ PATH: '/Users/tj/.gem/ruby/1.8/bin:/Users/tj/.nvm/current/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin'
, PWD: '/Users/tj/ebooks/masteringnode'
, EDITOR: 'mate'
, LANG: 'en_CA.UTF-8'
, SHLVL: '1'
, HOME: '/Users/tj'
, LOGNAME: 'tj'
, DISPLAY: '/tmp/launch-YCkT03/org.x:0'
, _: '/usr/local/bin/node'
, OLDPWD: '/Users/tj'
}
</code></pre>
<h3 id="process-argv">process.argv</h3>
<p>When executing a file with the <code>node</code> executable <code>process.argv</code> provides access to the argument vector, the first value being the node executable, second being the filename, and remaining values being the arguments passed.</p>
<p>For example, our source file <em>./src/process/misc.js</em> can be executed by running:</p>
<pre><code>$ node src/process/misc.js foo bar baz
</code></pre>
<p>in which we call <code>console.dir(process.argv)</code>, outputting the following:</p>
<pre><code>[ 'node'
, '/Users/tj/EBooks/masteringnode/src/process/misc.js'
, 'foo'
, 'bar'
, 'baz'
]
</code></pre>
<h3 id="process-exit-">process.exit()</h3>
<p>The <code>process.exit()</code> method is synonymous with the C function <code>exit()</code>, in which an exit code > 0 is passed to indicate failure, or 0 is passed to indicate success. When invoked, the <em>exit</em> event is emitted, allowing a short time for arbitrary processing to occur before <code>process.reallyExit()</code> is called with the given status code.</p>
<h3 id="process-on-">process.on()</h3>
<p>The process itself is an <code>EventEmitter</code>, allowing you to do things like listen for uncaught exceptions via the <em>uncaughtException</em> event:</p>
<pre><code>process.on('uncaughtException', function(err){
console.log('got an error: %s', err.message);
process.exit(1);
});
setTimeout(function(){
throw new Error('fail');
}, 100);
</code></pre>
<h3 id="process-kill-">process.kill()</h3>
<p><code>process.kill()</code> method sends the signal passed to the given <em>pid</em>, defaulting to <strong>SIGINT</strong>. In the example below, we send the <strong>SIGTERM</strong> signal to the same node process to illustrate signal trapping, after which we output "terminating" and exit. Note that the second timeout of 1000 milliseconds is never reached.</p>
<pre><code>process.on('SIGTERM', function(){
console.log('terminating');
process.exit(1);
});
setTimeout(function(){
console.log('sending SIGTERM to process %d', process.pid);
process.kill(process.pid, 'SIGTERM');
}, 500);
setTimeout(function(){
console.log('never called');
}, 1000);
</code></pre>
<h3 id="errno">errno</h3>
<p>The <code>process</code> object is host of the error numbers, which reference what you would find in C-land. For example, <code>process.EPERM</code> represents a permission based error, while <code>process.ENOENT</code> represents a missing file or directory. Typically these are used within bindings to bridge the gap between C++ and JavaScript, but they're useful for handling exceptions as well:</p>
<pre><code>if (err.errno === process.ENOENT) {
// Display a 404 "Not Found" page
} else {
// Display a 500 "Internal Server Error" page
}
</code></pre>
</div>
<div class='mp'>
<h1>Events</h1>
<p> The concept of an "event" is crucial to node, and is used heavily throughout core and 3rd-party modules. Node's core module <em>events</em> supplies us with a single constructor, <em>EventEmitter</em>.</p>
<h2 id="Emitting-Events">Emitting Events</h2>
<p>Typically an object inherits from <em>EventEmitter</em>, however our small example below illustrates the API. First we create an <code>emitter</code>, after which we can define any number of callbacks using the <code>emitter.on()</code> method, which accepts the <em>name</em> of the event and arbitrary objects passed as data. When <code>emitter.emit()</code> is called, we are only required to pass the event <em>name</em>, followed by any number of arguments (in this case the <code>first</code> and <code>last</code> name strings).</p>
<pre><code>var EventEmitter = require('events').EventEmitter;
var emitter = new EventEmitter;
emitter.on('name', function(first, last){
console.log(first + ', ' + last);
});
emitter.emit('name', 'tj', 'holowaychuk');
emitter.emit('name', 'simon', 'holowaychuk');
</code></pre>
<h2 id="Inheriting-From-EventEmitter">Inheriting From EventEmitter</h2>
<p>A more practical and common use of <code>EventEmitter</code> is to inherit from it. This means we can leave <code>EventEmitter</code>'s prototype untouched while utilizing its API for our own means of world domination!</p>
<p>To do so, we begin by defining the <code>Dog</code> constructor, which of course will bark from time to time (also known as an <em>event</em>).</p>
<pre><code>var EventEmitter = require('events').EventEmitter;
function Dog(name) {
this.name = name;
}
</code></pre>
<p>Here we inherit from <code>EventEmitter</code> so we can use the methods it provides, such as <code>EventEmitter#on()</code> and <code>EventEmitter#emit()</code>. If the <code>__proto__</code> property is throwing you off, don't worry, we'll be coming back to this later.</p>
<pre><code>Dog.prototype.__proto__ = EventEmitter.prototype;
</code></pre>
<p>Now that we have our <code>Dog</code> set up, we can create... Simon! When Simon barks, we can let <em>stdout</em> know by calling <code>console.log()</code> within the callback. The callback itself is called in the context of the object (aka <code>this</code>).</p>
<pre><code>var simon = new Dog('simon');
simon.on('bark', function(){
console.log(this.name + ' barked');
});
</code></pre>
<p>Bark twice per second:</p>
<pre><code>setInterval(function(){
simon.emit('bark');
}, 500);
</code></pre>
<h2 id="Removing-Event-Listeners">Removing Event Listeners</h2>
<p>As we have seen, event listeners are simply functions which are called when we <code>emit()</code> an event. We can remove these listeners by calling the <code>removeListener(type, callback)</code> method, although this isn't seen often. In the example below we emit the <em>message</em> "foo bar" every <code>300</code> milliseconds, which has a callback of <code>console.log()</code>. After 1000 milliseconds, we call <code>removeListener()</code> with the same arguments that we passed to <code>on()</code> originally. We could also have used <code>removeAllListeners(type)</code>, which removes all listeners registered to the given <em>type</em>.</p>
<pre><code>var EventEmitter = require('events').EventEmitter;
var emitter = new EventEmitter;
emitter.on('message', console.log);
setInterval(function(){
emitter.emit('message', 'foo bar');
}, 300);
setTimeout(function(){
emitter.removeListener('message', console.log);
}, 1000);
</code></pre>
</div>
<div class='mp'>
<h1>Buffers</h1>
<p> To handle binary data, node provides us with the global <code>Buffer</code> object. <code>Buffer</code> instances represent memory allocated independently of V8's heap. There are several ways to construct a <code>Buffer</code> instance, and many ways you can manipulate its data.</p>
<p>The simplest way to construct a <code>Buffer</code> from a string is to simply pass a string as the first argument. As you can see in the log output, we now have a buffer object containing 5 bytes of data represented in hexadecimal.</p>
<pre><code>var hello = new Buffer('Hello');
console.log(hello);
// =&gt; &lt;Buffer 48 65 6c 6c 6f>
console.log(hello.toString());
// =&gt; "Hello"
</code></pre>
<p>By default, the encoding is "utf8", but this can be overridden by passing a string as the second argument. For example, the ellipsis below will be printed to stdout as the "&amp;" character when in "ascii" encoding.</p>
<pre><code>var buf = new Buffer('…');
console.log(buf.toString());
// =&gt;
var buf = new Buffer('…', 'ascii');
console.log(buf.toString());
// =&gt; &amp;
</code></pre>
<p>An alternative (but in this case functionality equivalent) method is to pass an array of integers representing the octet stream.</p>
<pre><code>var hello = new Buffer([0x48, 0x65, 0x6c, 0x6c, 0x6f]);
</code></pre>
<p>Buffers can also be created with an integer representing the number of bytes allocated, after which we can call the <code>write()</code> method, providing an optional offset and encoding. Below, we provide an offset of 2 bytes to our second call to <code>write()</code> (buffering "Hel") and then write another two bytes with an offset of 3 (completing "Hello").</p>
<pre><code>var buf = new Buffer(5);
buf.write('He');
buf.write('l', 2);
buf.write('lo', 3);
console.log(buf.toString());
// =&gt; "Hello"
</code></pre>
<p>The <code>.length</code> property of a buffer instance contains the byte length of the stream, as opposed to native strings, which simply return the number of characters. For example, the ellipsis character '…' consists of three bytes, so the buffer will respond with the byte length (3), and not the character length (1).</p>
<pre><code>var ellipsis = new Buffer('…', 'utf8');
console.log('… string length: %d', '…'.length);
// =&gt; … string length: 1
console.log('… byte length: %d', ellipsis.length);
// =&gt; … byte length: 3
console.log(ellipsis);
// =&gt; &lt;Buffer e2 80 a6>
</code></pre>
<p>To determine the byte length of a native string, pass it to the <code>Buffer.byteLength()</code> method.</p>
<p>The API is written in such a way that it is String-like. For example, we can work with "slices" of a <code>Buffer</code> by passing offsets to the <code>slice()</code> method:</p>
<pre><code>var chunk = buf.slice(4, 9);
console.log(chunk.toString());
// =&gt; "some"
</code></pre>
<p>Alternatively, when expecting a string, we can pass offsets to <code>Buffer#toString()</code>:</p>
<pre><code>var buf = new Buffer('just some data');
console.log(buf.toString('ascii', 4, 9));
// =&gt; "some"
</code></pre>
</div>
<div class='mp'>
<h1>Streams</h1>
<p> Streams are an important concept in node. The stream API is a unified way to handle stream-like data. For example, data can be streamed to a file, streamed to a socket to respond to an HTTP request, or streamed from a read-only source such as <em>stdin</em>. For now, we'll concentrate on the API, leaving stream specifics to later chapters.</p>
<h2 id="Readable-Streams">Readable Streams</h2>
<p> Readable streams such as an HTTP request inherit from <code>EventEmitter</code> in order to expose incoming data through events. The first of these events is the <em>data</em> event, which is an arbitrary chunk of data passed to the event handler as a <code>Buffer</code> instance.</p>
<pre><code>req.on('data', function(buf){
// Do something with the Buffer
});
</code></pre>
<p>As we know, we can call <code>toString()</code> on a buffer to return a string representation of the binary data. Likewise, we can call <code>setEncoding()</code> on a stream, after which the <em>data</em> event will emit strings.</p>
<pre><code>req.setEncoding('utf8');
req.on('data', function(str){
// Do something with the String
});
</code></pre>
<p>Another important event is <em>end</em>, which represents the ending of <em>data</em> events. For example, here's an HTTP echo server, which simply "pumps" the request body data through to the response. So if we POST "hello world", our response will be "hello world".</p>
<pre><code>var http = require('http');
http.createServer(function(req, res){
res.writeHead(200);
req.on('data', function(data){
res.write(data);
});
req.on('end', function(){
res.end();
});
}).listen(3000);
</code></pre>
<p>The <em>sys</em> module actually has a function designed specifically for this "pumping" action, aptly named <code>sys.pump()</code>. It accepts a read stream as the first argument, and write stream as the second.</p>
<pre><code>var http = require('http'),
sys = require('sys');
http.createServer(function(req, res){
res.writeHead(200);
sys.pump(req, res);
}).listen(3000);
</code></pre>
</div>
<div class='mp'>
<h1>File System</h1>
<p> To work with the filesystem, node provides the "fs" module. The commands emulate the POSIX operations, and most methods work synchronously or asynchronously. We will look at how to use both, then establish which is the better option.</p>
<h2 id="Working-with-the-filesystem">Working with the filesystem</h2>
<p> Lets start with a basic example of working with the filesystem. This example creates a directory, creates a file inside it, then writes the contents of the file to console:</p>
<pre><code>var fs = require('fs');
fs.mkdir('./helloDir',0777, function (err) {
if (err) throw err;
fs.writeFile('./helloDir/message.txt', 'Hello Node', function (err) {
if (err) throw err;
console.log('file created with contents:');
fs.readFile('./helloDir/message.txt','UTF-8' ,function (err, data) {
if (err) throw err;
console.log(data);
});
});
});
</code></pre>
<p> As evident in the example above, each callback is placed in the previous callback &mdash; these are referred to as chainable callbacks. This pattern should be followed when using asynchronous methods, as there's no guarantee that the operations will be completed in the order they're created. This could lead to unpredictable behavior.</p>
<p> The example can be rewritten to use a synchronous approach:</p>
<pre><code>fs.mkdirSync('./helloDirSync',0777);
fs.writeFileSync('./helloDirSync/message.txt', 'Hello Node');
var data = fs.readFileSync('./helloDirSync/message.txt','UTF-8');
console.log('file created with contents:');
console.log(data);
</code></pre>
<p> It is better to use the asynchronous approach on servers with a high load, as the synchronous methods will cause the whole process to halt and wait for the operation to complete. This will block any incoming connections or other events.</p>
<h2 id="File-information">File information</h2>
<p> The fs.Stats object contains information about a particular file or directory. This can be used to determine what type of object we're working with. In this example, we're getting all the file objects in a directory and displaying whether they're a file or a
directory object.</p>
<pre><code>var fs = require('fs');
fs.readdir('/etc/', function (err, files) {
if (err) throw err;
files.forEach( function (file) {
fs.stat('/etc/' + file, function (err, stats) {
if (err) throw err;
if (stats.isFile()) {
console.log("%s is file", file);
}
else if (stats.isDirectory ()) {
console.log("%s is a directory", file);
}
console.log('stats: %s',JSON.stringify(stats));
});
});
});
</code></pre>
<h2 id="Watching-files">Watching files</h2>
<p> The fs.watchfile method monitors a file and fires an event whenever the file is changed.</p>
<pre><code>var fs = require('fs');
fs.watchFile('./testFile.txt', function (curr, prev) {
console.log('the current mtime is: ' + curr.mtime);
console.log('the previous mtime was: ' + prev.mtime);
});
fs.writeFile('./testFile.txt', "changed", function (err) {
if (err) throw err;
console.log("file write complete");
});
</code></pre>
<p> A file can also be unwatched using the fs.unwatchFile method call. This should be used once a file no longer needs to be monitored.</p>
<h2 id="Nodejs-Docs-for-further-reading">Nodejs Docs for further reading</h2>
<p> The node API <a href="http://nodejs.org/api.html#file-system-106">docs</a> are very detailed and list all the possible filesystem commands
available when working with Nodejs.</p>
</div>
<div class='mp'>
<h1>TCP</h1>
<p> ...</p>
<h2 id="TCP-Servers">TCP Servers</h2>
<p> ...</p>
<h2 id="TCP-Clients">TCP Clients</h2>
<p> ...</p>
</div>
<div class='mp'>
<h1>HTTP</h1>
<p> ...</p>
<h2 id="HTTP-Servers">HTTP Servers</h2>
<p> ...</p>
<h2 id="HTTP-Clients">HTTP Clients</h2>
<p> ...</p>
</div>
<div class='mp'>
<h1>Connect</h1>
<p>Connect is a ...</p>
</div>
<div class='mp'>
<h1>Express</h1>
<p>Express is a ...</p>
</div>
<div class='mp'>
<h1>Testing</h1>
<p> ...</p>
<h2 id="Expresso">Expresso</h2>
<p> ...</p>
<h2 id="Vows">Vows</h2>
<p> ...</p>
</div>
<div class='mp'>
<h1>Deployment</h1>
<p> ...</p>
</div>
</body>
</html>