edsko edited this page May 17, 2012 · 22 revisions

This repository contains work towards a new implementation of a distributed computing interface in which processes communicate with one another through explicit message passing rather than shared memory. Its modules are intended to support distributed computing using the model described in Towards Cloud Haskell, where Haskell computations run across nodes that share data only through message passing.

We start with a high-level description of how this library works before going into specifics. There is also a recent presentation on Cloud Haskell and this reimplementation.

Infrastructure

One goal of this project is to separate the transport layer from the process layer, so that the transport backend is entirely independent: it is envisaged that this interface might later be used by models other than the Cloud Haskell paradigm, and that applications built using Cloud Haskell might be easily configured to work with different backend transports.

The following diagram shows dependencies between the various modules that are envisaged, where arrows represent explicit module dependencies.

+------------------------------------------------------------+
|                        Application                         |
+------------------------------------------------------------+
             |                               |
             V                               V
+-------------------------+   +------------------------------+
|      Cloud Haskell      |<--|    Cloud Haskell Backend     |
+-------------------------+   +------------------------------+
             |           ______/             |
             V           V                   V
+-------------------------+   +------------------------------+
|   Transport Interface   |<--|   Transport Implementation   |
+-------------------------+   +------------------------------+
                                             |
                                             V
                              +------------------------------+
                              | Haskell/C Transport Library  |
                              +------------------------------+

In this diagram, the various nodes roughly correspond to specific modules:

Cloud Haskell                : Control.Distributed.Process
Cloud Haskell Backend        : Control.Distributed.Process.TCP
Transport Interface          : Network.Transport
Transport Implementation     : Network.Transport.TCP

An application is built using the primitives of the Cloud Haskell layer. These come from the Control.Distributed.Process module, which provides abstractions such as nodes and processes.

The application also depends on a Cloud Haskell Backend, which provides functions for initialising the transport layer using whatever topology is appropriate to the application. Here this is provided by Control.Distributed.Process.TCP, but it could be exchanged for a backend built on another transport protocol.

Both Cloud Haskell and the Cloud Haskell Backend make use of the Transport Interface, provided by the Network.Transport module. This module also serves as the interface for Network.Transport.TCP, which implements it for one specific transport. Such an implementation may, for example, be based on an external library written in Haskell or C.
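To make the split between Transport Interface and Transport Implementation more concrete, here is a minimal sketch. It is not the actual Network.Transport API: the names Address, Message, EndPoint, Transport, and createInMemoryTransport are all hypothetical, and messages are plain strings passed through in-memory channels. The point is only that the interface is a record of functions, and an implementation is anything that can build such a record.

```haskell
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, modifyMVar, newMVar, readMVar)

-- Hypothetical, simplified interface (NOT the real Network.Transport API).
type Address = Int
type Message = String

data EndPoint = EndPoint
  { address :: Address                        -- where others can reach us
  , receive :: IO Message                     -- block until a message arrives
  , sendTo  :: Address -> Message -> IO Bool  -- False if the address is unknown
  }

newtype Transport = Transport { newEndPoint :: IO EndPoint }

-- Hypothetical implementation: endpoints live in one process and
-- exchange messages through channels kept in a shared registry.
type Registry = MVar [(Address, Chan Message)]

createInMemoryTransport :: IO Transport
createInMemoryTransport = do
  registry <- newMVar []
  pure (Transport (mkEndPoint registry))

mkEndPoint :: Registry -> IO EndPoint
mkEndPoint registry = do
  inbox <- newChan
  -- Allocate the next free address and register the inbox atomically.
  addr <- modifyMVar registry $ \eps ->
    let a = length eps in pure ((a, inbox) : eps, a)
  pure (EndPoint addr (readChan inbox) (deliver registry))

deliver :: Registry -> Address -> Message -> IO Bool
deliver registry dest msg = do
  eps <- readMVar registry
  case lookup dest eps of
    Nothing    -> pure False
    Just inbox -> writeChan inbox msg >> pure True

main :: IO ()
main = do
  transport <- createInMemoryTransport
  a <- newEndPoint transport
  b <- newEndPoint transport
  delivered <- sendTo a (address b) "hello"
  msg <- receive b
  putStrLn (show delivered ++ " " ++ msg)
```

Code in the layer above only ever sees Transport and EndPoint, so in this sketch a TCP-based constructor could stand in for createInMemoryTransport without that code changing.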

Transports

Abstracting over the transport layer allows message passing over different protocols, including TCP/IP, UDP, MPI, CCI, ZeroMQ, SSH, MVars, Unix pipes, and more. Each of these transports would provide its own implementation of the Network.Transport interface and a means of creating new connections for use within Control.Distributed.Process. This separation means that transports might also be used for purposes other than Cloud Haskell.
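As a rough illustration of why this abstraction is useful, the sketch below runs the same code over two different toy transports. Everything here is an assumption made for the example: Connection, Transport, chanTransport, mvarTransport, and echoAll are hypothetical names, and each connection is a simple loopback in which whatever is sent can be received again. The two transports differ in behaviour (one buffers without bound, the other holds a single message), yet echoAll is written once against the interface.

```haskell
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical loopback connection: messages sent can be received again.
data Connection = Connection
  { send    :: String -> IO ()
  , receive :: IO String
  }

-- Hypothetical interface: a transport is anything that can create connections.
newtype Transport = Transport { newConnection :: IO Connection }

-- Backed by an unbounded channel: sends never block.
chanTransport :: Transport
chanTransport = Transport $ do
  ch <- newChan
  pure (Connection (writeChan ch) (readChan ch))

-- Backed by a single MVar slot: a second send blocks until the
-- first message has been received.
mvarTransport :: Transport
mvarTransport = Transport $ do
  slot <- newEmptyMVar
  pure (Connection (putMVar slot) (takeMVar slot))

-- Transport-agnostic code: it only uses the interface, sending and
-- then receiving one message at a time so that both transports work.
echoAll :: Transport -> [String] -> IO [String]
echoAll transport msgs = do
  conn <- newConnection transport
  mapM (\m -> send conn m >> receive conn) msgs

main :: IO ()
main = do
  viaChan <- echoAll chanTransport ["ping", "pong"]
  viaMVar <- echoAll mvarTransport ["ping", "pong"]
  print (viaChan == viaMVar)
```

Choosing a transport is then a matter of which value is passed in, which is the kind of configurability the separation aims for.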

Repositories, Packages, and Modules

The modules described above are found in the distributed-process repository and are split into two packages: distributed-process and network-transport. These packages provide the basic interfaces and a few standard implementations:

distributed-process:
  Control.Distributed.Process
  Control.Distributed.Process.TCP
  Control.Distributed.Process.UDP
  Control.Distributed.Process.MVar

network-transport:
  Network.Transport
  Network.Transport.TCP
  Network.Transport.UDP
  Network.Transport.MVar

Additional packages would include more exotic transports:

distributed-process-mpi:
  Control.Distributed.Process.MPI
  Network.Transport.MPI

distributed-process-cci:
  Control.Distributed.Process.CCI
  Network.Transport.CCI

More Information