Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.Sign up
Git branch: multicore
This project aims to add support for executing multiple threads of OCaml code in parallel within the same process. There are two sides to this project:
- Creating a multicore OCaml runtime.
- Creating programming models and libraries to support parallel programming in OCaml.
The basic idea of this runtime is to allow multiple OCaml runtimes (called contexts) to exist in the same process, and then to allow these contexts to share their major heap.
See OCaml Multicore runtime for a more detailed discussion.
Parallel programming models
We would like to implement a number of parallel programming models using the multicore runtime. The following, in order of complexity, would be useful.
Threads library needs to be extended with functions for
creating and manipulating contexts. It's synchronisation primitives also
need to be modified to protect from threads in other contexts.
ParMap library provides parallel map functionality using multiple
processes. This interface could probably reimplemented more efficiently
using the multicore runtime.
Xavier Clerc has created a parallel OCaml library for use in the OCaml-Java project. It provides most basic multi-threaded constructs, and would be useful to implement using the multicore runtime. This would allow us to compare OCaml's multicore runtime with the JVM.
Monadic task library
Monadic concurrency libraries like
Async provide a good model
for concurrent programming in OCaml. Creating a similar library that can
execute multiple tasks in parallel would be an effective model for parallel
Multiple tasks would be scheduled using work-stealing (with context-local dequeues accessed via proxies).
This library could also be extended with features such as concurrent revisions.
Actors with no shared state
Ensuring no shared state
Programming using actors that share no (mutable) state is an effective parallel programming model that can safely be scaled up to distributed systems. However ensuring that actors do not share mutable state is difficult in OCaml. OCaml does not track side-effects and abstraction allows you to easily hide mutable state inside a module.
One possible solution is to enforce a lack of shared state is not to allow different nodes to share any modules. Thus two nodes would effectively be separate programs that happened to share an OCaml runtime. Note that they could still share the code for modules, just not the actual module record.
To create these nodes we could use a mechanism similar to pack. It would construct an isolated node module and only expose message passing interfaces for communicating with it. This translation is similar to that of the big functor patch.
Note that nodes created in this way would not require the read barrier, so we can effectively wrap a single-threaded OCaml program into a node and safely include it within a multi-threaded OCaml program.
Message passing using copying channels is also difficult to support in OCaml. Copying is not in general a type-safe operation. This means that we require modules to expose an indication that their contents can be safely copied. Thus we have an interface:
val new_channel: 'a copyable -> 'a copy_channel
copyable is essentially a type-class. Creating values of
would only be supported through a type-conv style interface:
type t = Foo | Bar of s with copyable
The copyable value also allows us to only actually copy mutable types. Only the mutable parts of a type need to be copied and this could be expressed in the copyable value. With some clever use of cross-module inlining this should provide very efficient channels for passing immutable values.
Note that it would not be possible to make types containing closures (including objects) copyable, because a closure may capture some mutable state. However, channels would be copyable, so a function could be replaced with a channel that accepted input and a return channel, called the function on its own node, and then sent the result down the return channel.