schleyfox / peach

Parallel Each for JRuby

This URL has Read+Write access

schleyfox (author)
Tue May 06 14:42:07 -0700 2008
commit  0083989fc951b3fbccd6cc50c19199089ea1a64d
tree    b4b9f1cb2fe355eb6495bb622614f1704eba72ba
parent  6f1704f7bd664032b53db5035d32efef1002dd7b
peach /
name age message
file LICENSE Loading commit data...
file README
directory bn/
directory lib/
file peach.gemspec
directory web/
README
Parallel Each  (for ruby with threads) 

It is pretty common to have iterations over Arrays that can be safely run in parallel. With multicore chips becoming 
pretty common, single threaded processing is about as cool as Pog. Unfortunately, standard Ruby hates real threads 
pretty hardcore at the present time; however, for some ruby projects alternate VMs like  JRuby do give multicores some 
lovin'. Peach exists to make this power simple to use with minimal code changes. 

Functions like map, each, and delete_if are often used in a functional, side-effect free style. If the operation in the 
block is computationally intense, performance can often be gained by multithreading the process. That's where Peach 
comes in. In the simplest case, you are one letter away from harnessing the power of parallelism and unlocking the 
secret of a guilt-free tan. At this stage, the goggles are purely optional. 

Using Peach

Suppose you are going about your day job hacking away at code for the WOPR when you stumble upon the code: 

cities.each {|city| thermonuclear_war(city)}
        
Clearly, the only winning move is to declare war in parallel. With Peach, the new code is: 
require 'peach'

cities.peach {|city| thermonuclear_war(city)}
        
 Requiring peach.rb monkey patches Array into submission. Currently Peach provides peach, pmap, and pdelete_if. Each of 
 these functions takes an optional argument n, which represents the desired number of worker threads with the default 
 being one thread per Array element. For cheaper operations on a large number of elements, you probably want to set n to 
 something reasonably low. 

(0...10000).to_a.pmap(4) {|x| process(x)}
        
 Constructing the threads and adding on a few layers of indirection does add a bit of overhead to the iteration 
 especially on MRI. Keep this in mind and remember to benchmark when unsure.