# Threads A GoGo for Node.js

A native module for Node.js that provides an asynchronous, evented and/or continuation passing style API for moving blocking/longish CPU-bound tasks out of Node's event loop to JavaScript threads that run in parallel in the background and that use all the available CPU cores automatically; all from within a single Node process.

## Installing the module

With [npm](http://npmjs.org/):

    npm install threads_a_gogo

From source:

    git clone http://github.com/xk/node-threads-a-gogo.git
    cd node-threads-a-gogo
    node-waf configure install

To include the module in your project:

    var threads_a_gogo= require('threads_a_gogo');

**You need a node with a v8 >= 3.2.4 to run this module. Any node >= 0.5.1 comes with a v8 >= 3.2.4.**

The module **runs fine, though, in any node >= 0.2.0** as long as you build it with a v8 >= 3.2.4. To do that, replace /node/deps/v8 with a newer version of v8 and recompile node. To get any version of node, go to http://nodejs.org/dist/; for v8, go to http://github.com/v8/v8, click on "branch", select the proper tag (>= 3.2.4), and download the .zip.

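
To check which v8 a given node binary was built with, something like the sketch below may help. It reads `process.versions.v8`, which Node exposes; the `versionAtLeast` helper is a hypothetical name introduced here for illustration and is not part of `threads_a_gogo`.

```javascript
// Hypothetical helper (not part of threads_a_gogo):
// true when the dotted version `actual` is >= `required`.
function versionAtLeast (actual, required) {
  var a = actual.split('.');
  var r = required.split('.');
  for (var i = 0; i < r.length; i++) {
    // parseInt ignores any trailing suffix in a component.
    var x = parseInt(a[i], 10) || 0;
    var y = parseInt(r[i], 10) || 0;
    if (x !== y) return x > y;
  }
  return true;
}

// process.versions.v8 is the version of the v8 bundled with the running node.
var v8ok = versionAtLeast(process.versions.v8, '3.2.4');
console.log('v8 ' + process.versions.v8 + (v8ok ? ' is' : ' is NOT') + ' recent enough');
```

Only as many components as the required version has are compared, so extra components or suffixes in newer v8 version strings are ignored.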
## (not so) Quick Intro

After the initialization phase of a Node program, whose purpose is to set up listeners and callbacks to be executed in response to events, the next phase, the proper execution of the program, is orchestrated by the event loop, whose duty is to [juggle events, listeners and callbacks quickly and without any hiccups or interruptions that would ruin its performance](http://youtube.com/v/D0uA_NOb0PE?autoplay=1).

Both the event loop and said listeners and callbacks run sequentially in a single thread of execution, Node's main thread. If any of them ever blocks, nothing else will happen for the duration of the block: no more events will be handled, and no more callbacks, listeners, timeouts or nextTick()ed functions will have the chance to run and do their job, because they won't be called by the blocked event loop. The program will turn sluggish at best, or appear to be frozen and dead at worst.

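
The effect of one blocking callback on everything queued behind it can be sketched with a toy model. This is not Node's real scheduler: it only assumes that callbacks run one after another on a single thread, and the task names and costs in milliseconds are made up for illustration.

```javascript
// Toy model (not Node's real scheduler): callbacks run to completion,
// one after another, on a single thread.
// Each task has a name and a CPU cost in ms; the result maps each
// task name to the time at which it finally gets to start.
function startTimes (tasks) {
  var clock = 0;
  var result = {};
  tasks.forEach(function (task) {
    result[task.name] = clock;
    clock += task.cost;
  });
  return result;
}

var smooth = startTimes([
  { name: 'timeout', cost: 1 },
  { name: 'request', cost: 1 },
  { name: 'nextTick', cost: 1 }
]);

var blocked = startTimes([
  { name: 'fibo', cost: 500 },   // one long CPU-bound callback at the front
  { name: 'timeout', cost: 1 },
  { name: 'request', cost: 1 },
  { name: 'nextTick', cost: 1 }
]);

console.log(smooth.nextTick);   // 2
console.log(blocked.nextTick);  // 502
```

In the second run every task waits the full 500 ms before it can start, which is the sluggishness described above.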
**A.-** Here's a program that makes Node's event loop spin freely and as fast as possible: it simply prints a dot to the console in each turn:

    cat examples/quickIntro_loop.js

``` javascript
(function spinForever () {
  process.stdout.write(".");
  process.nextTick(spinForever);
})();
```

**B.-** Here's another program that adds to the one above a fibonacci(35) call in each turn, a CPU-bound task that takes quite a while to complete and that blocks the event loop, making it spin slowly and clumsily. The point is simply to show that you can't put a job like that in the event loop, because Node will stop performing properly when its event loop can't spin fast and freely due to a callback/listener/nextTick()ed function that's blocking.

    cat examples/quickIntro_blocking.js

``` javascript
function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

(function fiboLoop () {
  process.stdout.write(fibo(35).toString());
  process.nextTick(fiboLoop);
})();

(function spinForever () {
  process.stdout.write(".");
  process.nextTick(spinForever);
})();
```

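
To get a feel for why `fibo(35)` takes "quite a while", the sketch below counts how many times the same recursion is entered for small inputs. The `calls` counter and the `countCalls` helper are additions made here for illustration; they are not in the examples directory.

```javascript
var calls = 0;

// Same recursion as the fibo() in the examples, plus a call counter.
function fibo (n) {
  calls++;
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

// Returns how many times fibo() is entered while computing fibo(n).
function countCalls (n) {
  calls = 0;
  fibo(n);
  return calls;
}

console.log(countCalls(10)); // 177
console.log(countCalls(20)); // 21891
```

The count is always `2 * fibo(n) - 1`, so it grows exponentially with `n`: `fibo(35)` needs roughly 30 million calls, all of which run on the event loop's thread in example B.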
**C.-** The program below uses `threads_a_gogo` to run the fibonacci(35) calls in a background thread, so Node's event loop isn't blocked at all and can spin freely again at full speed:

    cat examples/quickIntro_oneThread.js

``` javascript
function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

function cb (err, data) {
  process.stdout.write(data);
  this.eval('fibo(35)', cb);
}

var thread= require('threads_a_gogo').create();

thread.eval(fibo).eval('fibo(35)', cb);

(function spinForever () {
  process.stdout.write(".");
  process.nextTick(spinForever);
})();
```

**D.-** This example is almost identical to the one above, except that it creates 5 threads instead of one, each running a fibonacci(35) in parallel with the others and with Node's event loop, which keeps spinning happily at full speed in its own thread:

    cat examples/quickIntro_fiveThreads.js

``` javascript
function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

function cb (err, data) {
  process.stdout.write(" ["+ this.id+ "]"+ data);
  this.eval('fibo(35)', cb);
}

var threads_a_gogo= require('threads_a_gogo');

threads_a_gogo.create().eval(fibo).eval('fibo(35)', cb);
threads_a_gogo.create().eval(fibo).eval('fibo(35)', cb);
threads_a_gogo.create().eval(fibo).eval('fibo(35)', cb);
threads_a_gogo.create().eval(fibo).eval('fibo(35)', cb);
threads_a_gogo.create().eval(fibo).eval('fibo(35)', cb);

(function spinForever () {
  process.stdout.write(".");
  process.nextTick(spinForever);
})();
```

**E.-** The next one asks `threads_a_gogo` to create a pool of 10 background threads, instead of creating them manually one by one:

    cat examples/multiThread.js

``` javascript
function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

var numThreads= 10;
var threadPool= require('threads_a_gogo').createPool(numThreads).all.eval(fibo);

threadPool.all.eval('fibo(35)', function cb (err, data) {
  process.stdout.write(" ["+ this.id+ "]"+ data);
  this.eval('fibo(35)', cb);
});

(function spinForever () {
  process.stdout.write(".");
  process.nextTick(spinForever);
})();
```

**F.-** This is a demo of the `threads_a_gogo` eventEmitter API, using one thread:

    cat examples/quickIntro_oneThreadEvented.js

``` javascript
var thread= require('threads_a_gogo').create();
thread.load(__dirname + '/quickIntro_evented_childThreadCode.js');

/*
  This is the code that's .load()ed into the child/background thread:

  function fibo (n) {
    return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
  }

  thread.on('giveMeTheFibo', function onGiveMeTheFibo (data) {
    this.emit('theFiboIs', fibo(+data)); //Emits 'theFiboIs' in the parent/main thread.
  });

*/

//Emit 'giveMeTheFibo' in the child/background thread.
thread.emit('giveMeTheFibo', 35);

//Listener for the 'theFiboIs' events emitted by the child/background thread.
thread.on('theFiboIs', function cb (data) {
  process.stdout.write(data);
  this.emit('giveMeTheFibo', 35);
});

(function spinForever () {
  process.stdout.write(".");
  process.nextTick(spinForever);
})();
```

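
The child code in example F reads its argument as `+data` because, according to the API notes further down, all arguments of `.emit()` are .toString()ed. The sketch below imitates that conversion in plain JavaScript, without the module. The `asReceived` helper is hypothetical, and using JSON for structured data is one possible convention, not something the module does for you.

```javascript
// Hypothetical stand-in for what a listener receives
// when every emitted argument has been .toString()ed.
function asReceived (value) {
  return String(value);
}

console.log(asReceived(35) + 1);     // "351": a string, not a number
console.log(+asReceived(35) + 1);    // 36: unary plus restores the number
console.log(asReceived({ n: 35 }));  // "[object Object]": the structure is lost

// Serializing explicitly before sending keeps the structure.
var wire = asReceived(JSON.stringify({ n: 35, tags: ['a', 'b'] }));
var job = JSON.parse(wire);
console.log(job.n + 1);              // 36
```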
**G.-** This is a demo of the `threads_a_gogo` eventEmitter API, using a pool of threads:

    cat examples/quickIntro_multiThreadEvented.js

``` javascript
var numThreads= 10;
var threadPool= require('threads_a_gogo').createPool(numThreads);
threadPool.load(__dirname + '/quickIntro_evented_childThreadCode.js');

/*
  This is the code that's .load()ed into the child/background threads:

  function fibo (n) {
    return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
  }

  thread.on('giveMeTheFibo', function onGiveMeTheFibo (data) {
    this.emit('theFiboIs', fibo(+data)); //Emits 'theFiboIs' in the parent/main thread.
  });

*/

//Emit 'giveMeTheFibo' in all the child/background threads.
threadPool.all.emit('giveMeTheFibo', 35);

//Listener for the 'theFiboIs' events emitted by the child/background threads.
threadPool.on('theFiboIs', function cb (data) {
  process.stdout.write(" ["+ this.id+ "]"+ data);
  this.emit('giveMeTheFibo', 35);
});

(function spinForever () {
  process.stdout.write(".");
  process.nextTick(spinForever);
})();
```

## More examples

The `examples` directory contains a few more examples:

* [ex01_basic](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex01_basic.md): Running a simple function in a thread.
* [ex02_events](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex02_events.md): Sending events from a worker thread.
* [ex03_ping_pong](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex03_ping_pong.md): Sending events both ways between the main thread and a worker thread.
* [ex04_main](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex04_main.md): Loading the worker code from a file.
* [ex05_pool](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex05_pool.md): Using the thread pool.
* [ex06_jason](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex06_jason.md): Passing complex objects to threads.

## API

### Module API
``` javascript
var threads_a_gogo= require('threads_a_gogo');
```
##### .create()
`threads_a_gogo.create( /* no arguments */ )` -> thread object
##### .createPool( numThreads )
`threads_a_gogo.createPool( numberOfThreads )` -> threadPool object

***
### Thread API
``` javascript
var thread= threads_a_gogo.create();
```
##### .id
`thread.id` -> a sequential thread serial number
##### .load( absolutePath [, cb] )
`thread.load( absolutePath [, cb] )` -> reads the file at `absolutePath` and calls `thread.eval(fileContents, cb)`.
##### .eval( program [, cb])
`thread.eval( program [, cb])` -> converts `program` with `.toString()`, eval()s the result in the thread's global context, and (if `cb` is provided) passes the completion value to `cb(err, completionValue)`.
##### .on( eventType, listener )
`thread.on( eventType, listener )` -> registers the listener `listener(data)` for any events of `eventType` that the thread `thread` may emit.
##### .once( eventType, listener )
`thread.once( eventType, listener )` -> like `thread.on()`, but the listener will only be called once.
##### .removeAllListeners( [eventType] )
`thread.removeAllListeners( [eventType] )` -> deletes all listeners for all eventTypes. If `eventType` is provided, deletes only the listeners for the event type `eventType`.
##### .emit( eventType, eventData [, eventData ... ] )
`thread.emit( eventType, eventData [, eventData ... ] )` -> emits an event of `eventType` with `eventData` inside the thread `thread`. All its arguments are .toString()ed.
##### .destroy( /* no arguments */ )
`thread.destroy( /* no arguments */ )` -> destroys the thread.

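
The description of `thread.eval()` (source text evaluated in a separate global context, completion value handed back) can be pictured with Node's built-in `vm` module. This is only an analogy: it is not how `threads_a_gogo` is implemented, and unlike a real thread the code below runs synchronously on the main thread.

```javascript
var vm = require('vm');

function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

// A fresh global context standing in for a thread's global scope (analogy only).
var context = vm.createContext({});

// Evaluating the function's source text defines fibo inside that context...
var first = vm.runInContext(fibo.toString(), context);

// ...so a later program can call it. Its completion value is what
// a callback like cb(err, completionValue) would receive.
var second = vm.runInContext('fibo(10)', context);

console.log(first);               // undefined: a function declaration has no completion value
console.log(second);              // 89
console.log(typeof context.fibo); // "function"
```

This mirrors the `thread.eval(fibo).eval('fibo(35)', cb)` pattern in the examples: the first eval only defines the function, the second one produces the value.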
***
### Thread pool API
``` javascript
threadPool= threads_a_gogo.createPool( numberOfThreads );
```
##### .load( absolutePath [, cb] )
`threadPool.load( absolutePath [, cb] )` -> `thread.load( absolutePath [, cb] )` in all the pool's threads.
##### .any.eval( program, cb )
`threadPool.any.eval( program, cb )` -> like `thread.eval()`, but in any of the pool's threads.
##### .any.emit( eventType, eventData [, eventData ... ] )
`threadPool.any.emit( eventType, eventData [, eventData ... ] )` -> like `thread.emit()` but in any of the pool's threads.
##### .all.eval( program, cb )
`threadPool.all.eval( program, cb )` -> like `thread.eval()`, but in all the pool's threads.
##### .all.emit( eventType, eventData [, eventData ... ] )
`threadPool.all.emit( eventType, eventData [, eventData ... ] )` -> like `thread.emit()` but in all the pool's threads.
##### .on( eventType, listener )
`threadPool.on( eventType, listener )` -> like `thread.on()`, registers listeners for events from any of the threads in the pool.
##### .totalThreads()
`threadPool.totalThreads()` -> returns the number of threads in this pool, as supplied in `.createPool( number )`.
##### .idleThreads()
`threadPool.idleThreads()` -> returns the number of threads in this pool that are currently idle (sleeping).
##### .pendingJobs()
`threadPool.pendingJobs()` -> returns the number of jobs pending.
##### .destroy( [ rudely ] )
`threadPool.destroy( [ rudely ] )` -> waits until `pendingJobs()` is zero and then destroys the pool. If `rudely` is truthy, it doesn't wait for `pendingJobs()` to reach zero.

***
### Global thread API

Inside every thread .create()d by threads_a_gogo, there's a global `thread` object with these properties:
##### .id
`thread.id` -> the serial number of this thread
##### .on( eventType, listener )
`thread.on( eventType, listener )` -> just like `thread.on()` above.
##### .once( eventType, listener )
`thread.once( eventType, listener )` -> just like `thread.once()` above.
##### .emit( eventType, eventData [, eventData ... ] )
`thread.emit( eventType, eventData [, eventData ... ] )` -> just like `thread.emit()` above.
##### .removeAllListeners( [eventType] )
`thread.removeAllListeners( [eventType] )` -> just like `thread.removeAllListeners()` above.
##### .nextTick( function )
`thread.nextTick( function )` -> like `process.nextTick()`, but twice as fast.

***
### Global puts

Inside every thread .create()d by threads_a_gogo, there's a global `puts`:
##### puts(arg1 [, arg2 ...])
`puts(arg1 [, arg2 ...])` -> .toString()s and prints its arguments to stdout.

## Rationale

[Node.js](http://nodejs.org) is the most [awesome, cute and super-sexy](http://javascriptology.com/threads_a_gogo/sexy.jpg) piece of free, open source software.

Its event loop can spin as fast and smooth as a turbo, and roughly speaking, **the faster it spins, the more power it delivers**. That's why [@ryah](http://twitter.com/ryah) took great care to ensure that no -possibly slow- I/O operations could ever block it: a pool of background threads (thanks to [Marc Lehmann's libeio library](http://software.schmorp.de/pkg/libeio.html)) handles any blocking I/O calls in the background, in parallel.

In Node it's verboten to write a server like this:

``` javascript
http.createServer(function (req,res) {
  res.end( fs.readFileSync(path) );
}).listen(port);
```
Because synchronous I/O calls **block the turbo**, and without proper boost, Node.js begins to stutter and behaves clumsily. To avoid it there's the asynchronous version, `fs.readFile()`, in continuation passing style, that takes a callback:

``` javascript
fs.readFile(path, function cb (err, data) { /* ... */ });
```

It's cool, we love it (*), and there are hundreds of ad hoc built-in functions like this in Node to help us deal with almost any variety of possibly slow, blocking I/O.

### But what about longish, CPU-bound tasks?

How do you avoid blocking the event loop when the task at hand isn't I/O-bound and lasts more than a few fractions of a millisecond?

``` javascript
http.createServer(function cb (req,res) {
  res.end( fibonacci(40) );
}).listen(port);
```

You simply can't, because there's no way... well, there wasn't before `threads_a_gogo`.

### What is Threads A GoGo for Node.js

`threads_a_gogo` provides the asynchronous API for CPU-bound tasks that's missing in Node.js, both in continuation passing style (callbacks) and in event emitter style (event listeners).

The same API Node uses to delegate a longish I/O task to a background (libeio) thread:

`asyncIOTask(what, cb);`

`threads_a_gogo` uses to delegate a longish CPU task to a background (JavaScript) thread:

`thread.eval(program, cb);`

So with `threads_a_gogo` you can write:

``` javascript
http.createServer(function (req,res) {
  thread.eval('fibonacci(40)', function cb (err, data) {
    res.end(data);
  });
}).listen(port);
```

And it won't block the event loop because the `fibonacci(40)` will run in parallel in a separate background thread.


### Why Threads

Threads (kernel threads) are very interesting creatures. They provide:

1.- Parallelism: All the threads run in parallel. On a single-core processor, the CPU is switched rapidly back and forth among the threads, providing the illusion that the threads are running in parallel, albeit on a slower CPU than the real one. With 10 compute-bound threads in a process, the threads would appear to be running in parallel, each one on a CPU with 1/10th the speed of the real CPU. On a multi-core processor, threads are truly running in parallel, and get time-sliced when the number of threads exceeds the number of cores. So with 12 compute-bound threads on a quad-core processor, each thread will appear to run at 1/3rd of the nominal core speed.

2.- Fairness: No thread is more important than another; cores and CPU slices are fairly distributed among threads by the OS scheduler.

3.- Threads fully exploit all the available CPU resources in your system. On a loaded system running many tasks in many threads, the more cores there are, the faster the threads will complete. Automatically.

4.- The threads of a process share exactly the same address space, that of the process they belong to. Every thread can access every memory address within the process' address space. This is a very appropriate setup when the threads are actually part of the same job and are actively and closely cooperating with each other. Passing a reference to a chunk of data via a pointer is many orders of magnitude faster than transferring a copy of the data via IPC.


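
The arithmetic in point 1 can be written down as a small sketch. It assumes an idealized setup: equally demanding compute-bound threads, a perfectly fair scheduler, and nothing else running. The `apparentSpeed` function is a hypothetical name used only here.

```javascript
var os = require('os');

// Idealized model: equal, compute-bound threads and a fair scheduler.
// Returns the fraction of one core's nominal speed each thread appears to get.
function apparentSpeed (threads, cores) {
  return Math.min(1, cores / threads);
}

console.log(apparentSpeed(10, 1));  // 0.1 (10 threads on a single core)
console.log(apparentSpeed(12, 4));  // 0.3333333333333333 (12 threads on a quad-core)
console.log(apparentSpeed(3, 4));   // 1 (fewer threads than cores: no time-slicing)

// The same estimate for 12 threads on this machine's actual core count.
console.log(apparentSpeed(12, os.cpus().length));
```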
### Why not multiple processes.

The "can't block the event loop" problem is inherent to Node's evented model. No matter how many Node processes you have running as a [Node-cluster](http://blog.nodejs.org/2011/10/04/an-easy-way-to-build-scalable-network-programs/), it won't solve its issues with CPU-bound tasks.

Launch a cluster of N Nodes running the example B (`quickIntro_blocking.js`) above, and all you'll get is N -instead of one- Nodes with their event loops blocked and showing a sluggish performance.