Skip to content
cpettitt edited this page Sep 25, 2012 · 39 revisions

ParSeq is a framework that makes it easier to write and maintain fast, scalable applications in Java.

Some of the key benefits of ParSeq include:

  • Parallelization of asynchronous operations (such as IO)
  • Serialized execution for non-blocking computation
  • Code reuse via task composition
  • Simple error propagation and recovery
  • Execution tracing and visualization

Key Concepts

In ParSeq, we have a few basic concepts:

  • Promise: like a Java Future, a Promise allows the user to get the result of an asynchronous computation. However, a Promise allows the user to wait for the result asynchronously instead of requiring a blocking get call.
  • Task: a Task is a deferred execution that can be executed by an Engine. All Tasks are also Promises, so you can wait for the result of a Task asynchronously. Tasks can be sequenced using seq and par (see below).
  • par: composes a group of Tasks that can be executed in parallel.
  • seq: composes an ordered list of Tasks that will be executed sequentially.
  • Engine: a pool of workers that executes Tasks.

A Simple Example

In this example we show how to fetch several pages in parallel and how to combine them once they've all been retrieved.

First we can retrieve a single page using an asynchronous HTTP client as follows:

final Task<String> google = httpClient.fetch("http://www.google.com");
engine.run(google);
google.await();
System.out.println("Google Page: " + google.get());

This will print:

Google Page: <!doctype html><html>...

In this code snippet we don't really get any benefit from ParSeq. Essentially we create a task that can be run asynchronously, but then we block for completion using google.await(). In this case, the code is more complicated than issuing a simple synchronous call. We improve this slightly, by making it asynchronous:

final Task<String> google = httpClient.fetch("http://www.google.com");
google.addListener(new PromiseListener<String>() {
  @Override
  public void onResolved(final Promise<String> promise)
  {
    System.out.println("Google Page: " + promise.get());
  }
});
engine.run(google);

This snippet is fully asynchronous. The thread that calls engine.run(google) will return immediately while the fetch of the Google home page may still be in progress. Once the Google home page has been retrieved the listener prints it's contents to the console.

Instead of using a PromiseListener, we could have used another Task, as shown here:

final Task<String> google = httpClient.fetch("http://www.google.com");
final Task<Void> printResult = Tasks.action("printResult", new Runnable() {
  @Override
  public void run()
  {
    System.out.println("Google Page: " + google.get());
  }
});
final Task<Void> plan = Tasks.seq(google, printResult);
engine.run(plan);

In this snippet we've created a second task that prints out the result of getting the Google home page. We use Tasks.seq to tell the engine that it should first get the Google home page and then print the results. In this particular case a PromiseListener is a little easier to use. In general it is better to use a Task where possible, because they can be easily composed in ParSeq (e.g. using Tasks.seq and Tasks.par).

Now, let's expand the example so that we can fetch a few more pages in parallel.

final Task<String> google = httpClient.fetch("http://www.google.com");
final Task<String> yahoo = httpClient.fetch("http://www.yahoo.com");
final Task<String> bing = httpClient.fetch("http://www.bing.com");

final Task<Void> combineValues = Tasks.action("printResult", new Runnable() {
  @Override
  public void run()
  {
    // Com
  }
});

final Task<Void> plan = Tasks.seq(Tasks.par(google, yahoo, bing),
                                  printResult);
engine.run(plan);

This example is again fully asynchronous. The home pages for Google, Yahoo, and Bing are all fetched in parallel while the original thread has returned to the calling code. We used Tasks.par to tell the engine to parallelize these HTTP requests. Once all of the responses have been retrieved, the printResult task is invoked.

We can also do transforms on the data we retrieved. Here's a very simple transform that sums the length of the 3 pages that were fetched:

final Task<String> google = httpClient.fetch("http://www.google.com");
final Task<String> yahoo = httpClient.fetch("http://www.yahoo.com");
final Task<String> bing = httpClient.fetch("http://www.bing.com");

final Task<Integer> sumLengths = Tasks.callable("sumLengths", new Callable<Integer>() {
  @Override
  public Integer call() throws Exception
  {
    return google.get().length() + yahoo.get().length() + bing.get().length();
  }
});

final Task<Integer> plan = Tasks.seq(Tasks.par(google, yahoo, bing),
                                     sumLengths);

The plan task can be given to an engine for execution and it's result value will be set to the sum of the length of the 3 fetched pages.

For many more examples, please see the example module in the source code.

Resources

Clone this wiki locally