We'll start with the function we defined before, which counts how many stars remain after 100 dynamical times:

In [None]:
def premain(startn):
    """Run a plummer model for 10 dynamical times and return the number of stars remaining."""
    from subprocess import Popen, PIPE
    from starlab import parse_output, extract_particle_dynamics
    
    print "running %d particles" % startn
    cmds = []

    cmds.append(["makeking", "-n", "%d"%startn, "-w", "5", "-i",  "-u"])
    cmds.append(["makemass", "-f", "2", "-l", "0.1,", "-u", "20"])
    cmds.append(["makesecondary", "-f", "0.1", "-l", "0.25"])
    cmds.append(["makebinary", "-l", "1", "-u", "10"])
    cmds.append(["scale", "-m", "1", "-e", "-0.25", "-q", "0.5"]) 
    cmds.append(["kira", "-t", "100", "-d", "1", "-D", "2", "-f", "0.3", "-n", "10", "-q", "0.5", "-G", "2", "-B"])

    procs = []
    for index, cmd in enumerate(cmds):
        print index, cmd
        if index > 0:
            procs.append(Popen(cmd, stdout=PIPE, stderr=PIPE, stdin=procs[index-1].stdout))
        else:
            procs.append(Popen(cmd, stdout=PIPE, stderr=PIPE))
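
    # Only the last stage's output is needed: communicate() waits for that
    # process to finish and returns its (stdout, stderr) pair.
    result = procs[-1].communicate()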
    slist = parse_output(result[0])
    return len(extract_particle_dynamics(slist[-1]))
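
The loop in `premain` builds the equivalent of a shell pipeline: each process reads the previous one's standard output. As a minimal sketch of just that chaining pattern, here is the same idea with two stand-in stages. The helper `run_pipeline` and both stages are hypothetical (they are not Starlab tools), and unlike `premain` this version leaves `stderr` uncaptured.

```python
import sys
from subprocess import Popen, PIPE

def run_pipeline(cmds):
    """Chain commands so each reads the previous one's stdout; return the final stdout."""
    procs = []
    for cmd in cmds:
        # The first stage has no upstream process, so its stdin is left alone.
        stdin = procs[-1].stdout if procs else None
        procs.append(Popen(cmd, stdin=stdin, stdout=PIPE))
        if len(procs) > 1:
            # Close our copy of the upstream pipe so only the child holds it.
            procs[-2].stdout.close()
    out, _ = procs[-1].communicate()
    for p in procs[:-1]:
        p.wait()
    return out

# Stand-in stages (hypothetical): emit three lines, then upper-case them.
stage1 = [sys.executable, "-c", "print('a'); print('b'); print('c')"]
stage2 = [sys.executable, "-c",
          "import sys; sys.stdout.write(sys.stdin.read().upper())"]
output = run_pipeline([stage1, stage2])
```

Here `output` holds whatever the last stage wrote, which is the role `result[0]` plays in `premain`.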

We want to run this in parallel. To do so, we need to spin up a set of compute engines. Go to the IPython dashboard (where you see the collection of notebooks) and click on the Clusters tab. Find the row with your name, type 50 in the box, and hit the Start button. As things are currently set up, the number you type doesn't matter: all of the engines assigned to you will start regardless. To find out how many you've actually got, we need a way to interact with the cluster, called a "Client". The client keeps a list of all of the compute engines; the length of that list is the number of engines we have access to.

The client needs to know which cluster profile to use. You'll supply your last name as a `profile` keyword when you set up the Client.

In [None]:
from IPython.parallel import Client
rc = Client(profile='Bragg')
len(rc.ids)

If we want to actually run jobs on the cluster, we construct what's called a view. We can either use a "direct view", which sends specific tasks to specific engines, or we can use a "load balanced view", which takes care of assignment for us. At least initially, we'll go with this latter option. 

In [None]:
lbv = rc.load_balanced_view()
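
To see why letting the scheduler pick the engine can help, here is a toy model that compares two ways of handing out tasks of unequal length. It is only an illustration, not how `IPython.parallel` is implemented: the function names, the round-robin rule used for the fixed assignment, and the run times are all made up.

```python
import heapq

def static_makespan(durations, n_engines):
    """Fixed assignment decided up front: task i goes to engine i % n_engines."""
    loads = [0.0] * n_engines
    for i, d in enumerate(durations):
        loads[i % n_engines] += d
    return max(loads)  # time at which the busiest engine finishes

def balanced_makespan(durations, n_engines):
    """Dynamic assignment: each task goes to whichever engine frees up first."""
    free_at = [0.0] * n_engines
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)      # earliest available engine
        heapq.heappush(free_at, start + d)  # it is busy until start + d
    return max(free_at)

durations = [8, 1, 1, 1, 8, 1, 1, 1]  # made-up run times
static = static_makespan(durations, 2)      # 18.0: both long tasks land on engine 0
balanced = balanced_makespan(durations, 2)  # 11.0: the work is split evenly
```

When every task takes about the same time the two strategies finish together; the gap opens up only when run times vary.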

As an example of how this all works, let's run a set of 500 simulations, each with 100 stars. We first make a list that contains the number of stars (100) 500 times. Then we `map` the function we defined above onto that list, calling it once for each item. We could just as easily build a list with a variety of numbers in it, but since we want to look at variations in results given the same input, we don't want to do that just yet. We don't have to wait for results before we move on to other things (i.e., we can let it run without tying up our notebook), but for this short example we'll keep an eye on progress with the `wait_interactive()` method.

Note that if you submit a large number of jobs (more than 1000 or so), `map_async()` takes a while to queue all of them, but they start running as soon as the first one is queued.

In [None]:
thepoints = [100] * 500
results = lbv.map_async(premain, thepoints)
results.wait_interactive()
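
If you don't have a cluster handy, the same submit-now, collect-later pattern can be tried with the standard library. This sketch uses `multiprocessing.pool.ThreadPool`, which is not `IPython.parallel` but also offers a `map_async()` that returns immediately. `fake_run` is a hypothetical stand-in for `premain`, and a polling loop takes the place of `wait_interactive()`.

```python
import time
from multiprocessing.pool import ThreadPool

def fake_run(startn):
    """Hypothetical stand-in for premain: pause briefly, then return a number."""
    time.sleep(0.01)
    return startn - 1

pool = ThreadPool(4)
pending = pool.map_async(fake_run, [100] * 20)  # returns right away

# The notebook is free to do other work here; we simply poll until done.
while not pending.ready():
    pending.wait(0.05)

values = pending.get()  # results come back in the order of the input list
pool.close()
pool.join()
```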

If we want to see the results, we need to fire up some plots.

In [None]:
%pylab inline
import numpy as np

We'll put the results into a `numpy` array (which is a little easier to deal with than a list) and then plot a histogram.

In [None]:
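res = np.array(results.get())  # get() returns the list of return values, one per run
plt.hist(res, res.max() - res.min())  # second argument is the number of bins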
plt.xlabel("Number of stars remaining")
plt.ylabel("Number of runs")
plt.title(r"Stars remaining after 100 dynamical times, $N_0 = 100$")

The results look about as close to Gaussian as you're going to get with only 500 points.
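
One rough way to put a number on "close to Gaussian" is to compare the fraction of runs within one standard deviation of the mean against the roughly 68% expected for a normal distribution. The helper below is a sketch of that check; `gaussian_summary` is a name introduced here, and the `demo` values are made up, not simulation output.

```python
import math

def gaussian_summary(counts):
    """Return the mean, sample standard deviation, and fraction within one sigma."""
    n = len(counts)
    mean = sum(counts) / float(n)
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    sigma = math.sqrt(var)
    inside = sum(1 for c in counts if abs(c - mean) <= sigma)
    return mean, sigma, inside / float(n)

demo = [94, 95, 95, 96, 96, 96, 97, 97, 98]  # made-up values for illustration
mean, sigma, frac = gaussian_summary(demo)
```

In the notebook you would call `gaussian_summary(res)` on the array of results instead of `demo`.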

That's basic parallel use. When you're done with your computations, it's a good idea to stop the compute engines: go back to the cluster tab on the dashboard and hit the appropriate stop button.

Just for kicks, let's see how it looks with 5000 points.

In [None]:
thepoints = [100] * 5000
results = lbv.map_async(premain, thepoints)
results.wait_interactive()

In [None]:
res = np.array(results.get())  # get() returns the list of return values, one per run
#plt.figure(figsize=(10, 10)) # make the figure a little bigger so we can see better
plt.hist(res, res.max() - res.min()) # arguments are the results array and the number of bins to use for the histogram
plt.xlabel("Number of stars remaining")
plt.ylabel("Number of runs")
plt.title(r"Stars remaining after 100 dynamical times, $N_0 = 100$")