Multicore OCaml #10831

Merged
merged 4,103 commits into from
Jan 10, 2022

Contributor

@kayceesrk kayceesrk commented Dec 21, 2021

This PR adds support for shared-memory parallelism through domains and direct-style concurrency through effect handlers (without syntactic support). It aims to preserve backwards compatibility in terms of language features, the C API, and the performance of single-threaded code.

For users

If you want to learn more about Multicore OCaml, please have a look at the multicore wiki for papers, talks, tutorials and blog posts.

If you are interested in using Multicore OCaml, we suggest having a look at the following work-in-progress libraries:

  1. domainslib -- a library for nested task parallelism
  2. eio -- asynchronous io in direct-style

Here is a snapshot of the multicore scalability results on parallel benchmarks from sandmark on a two-processor AMD EPYC 7551 server with 64 cores in total:

The number in parentheses next to each benchmark name is the time (in seconds) taken by the sequential baseline of that benchmark.

Review process

This PR follows asynchronous code reviews by the OCaml core developers and a week-long synchronous code review session. A summary of the discussions is available in the November multicore monthly.

As mentioned in the November multicore monthly, this PR constitutes the minimum viable product (MVP) for Multicore OCaml. The "pre-MVP" tasks have been completed, and the "post-MVP for 5.00" tasks are yet to be done. The aim is that these tasks, and the ones that follow, can be completed on the ocaml/ocaml GitHub repo through the usual OCaml PR review process rather than on the Multicore OCaml repo.

Given that the PR is quite large, if you spot major breakages or functionality gaps, we suggest that you open separate issues on the ocaml/ocaml GitHub repo. This will help keep the discussion threads readable. Feel free to comment on minor issues (typos, formatting edits, etc.) directly in this PR, and we shall be happy to fix those.

What's in the box

The only supported backend is amd64 on Linux and macOS; arm64 is in the works. The following features are not implemented:

  • statmemprof
  • compilation with frame pointers
  • ocamldebug
  • AFL support
  • flambda

Acknowledgements

Multicore OCaml has been a long-running project. We would like to thank all those who have helped find issues, debugged issues, reviewed code and contributed code along the way.

-- The Multicore OCaml team

abbysmal and others added 30 commits December 9, 2021 10:46
…o/skip-unsupported-tests

Skip unsupported and incompatible tests
…threads_wg3_comments

Systhreads WG3 comments
…ment_on_bt_install

Comment on caml_domain_spawn also calling in install_backup_thread
…nitize-exports

Rename / hide some global variables
…gnals_coalesce

Signals changes from sync review and WG3
@silene

silene commented Jan 11, 2022

Function caml_process_pending_actions_exn seems to have disappeared. Is it on purpose?

@gadmm
Contributor

gadmm commented Jan 11, 2022

@silene This is ocaml-multicore/ocaml-multicore#791. A lot of things making caml_process_pending_actions_exn possible are currently missing, my review above tracks what is missing.

@sadiqj
Contributor

sadiqj commented Jan 11, 2022

> A lot of things making caml_process_pending_actions_exn possible are currently missing, my review above tracks what is missing.

I think it's worth us pulling this in to a separate issue on ocaml/ocaml where we can work out exactly what needs to happen. There are some parts (like delaying finalisers and returning exceptions rather than raising) which actually require a fair bit of work to make happen.

@gadmm
Contributor

gadmm commented Jan 11, 2022

I agree, just rebasing correctly alone would have been very hard for someone not familiar with the code, and working on this part in trunk is the best way to be able to help you.

runtime/memory.c Outdated
    } else {
      /* See Note [MM] above */
      atomic_thread_fence(memory_order_acquire);
      ret = atomic_exchange(Op_atomic_val(ref), v);
Contributor

@gadmm gadmm Jan 11, 2022

Here's a question about the implementation of the memory model again. For context, currently, Atomic.store and Atomic.exchange result directly in a call to caml_atomic_exchange.
Here if I follow naively the paper I expect instead something like:

    ret = atomic_exchange(Op_atomic_val(ref), v);
    atomic_thread_fence(memory_order_release);

On armv8 (clang) the original code results in:
dmb (ish)ld; L: ldaxr w2, [x0]; stlxr w3, w1, [x0]; cbnz w3, L (notice the missing dmb st)
whereas the above one results in:
L: ldaxr w2, [x0]; stlxr w3, w1, [x0]; cbnz w3, L; dmb (ish)
which is closer to the paper (it does not seem to be possible to emit the dmb st barrier with standard C atomics; the closest is to use a full barrier).
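To make the two orderings easier to compare side by side, here is a minimal, self-contained C11 sketch. It is not the runtime's code: `cell_t` and both function names are hypothetical stand-ins, and a plain `long` replaces the OCaml `value` reached through `Op_atomic_val`. Both variants return the same result in a single-threaded run; they differ only in where the fence sits, which is what changes the barriers a compiler may emit on weakly ordered targets such as armv8.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for an OCaml atomic cell. The real runtime works
   on `value` fields accessed through Op_atomic_val. */
typedef _Atomic long cell_t;

/* Ordering quoted from the diff: acquire fence first, then a sequentially
   consistent exchange (atomic_exchange defaults to memory_order_seq_cst). */
long exchange_fence_before(cell_t *ref, long v)
{
  atomic_thread_fence(memory_order_acquire);
  return atomic_exchange(ref, v);
}

/* Ordering suggested in the comment: sequentially consistent exchange
   first, then a release fence. */
long exchange_fence_after(cell_t *ref, long v)
{
  long ret = atomic_exchange(ref, v);
  atomic_thread_fence(memory_order_release);
  return ret;
}
```

Compiling both functions with optimisation for an armv8 target and inspecting the assembly (for example with `-O2 -S`) is one way to reproduce the comparison above; the exact instructions depend on the compiler and its version.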

Of course at the moment armv8 is not supported (but I assume that the armv8 prototype is currently implemented with these functions). However this runs deeper:

  • The implementation currently seems correct on x86 when these functions are called from OCaml. However one could imagine that when they are called from C code, a compiler might try to inline them and be too smart. How do you ensure that these functions implement the OCaml memory model on x86 when called from C? (Edit: the answer is that currently caml_atomic_exchange is not (yet) accessible from C code; only caml_modify, which seems to implement a stronger operation than the OCaml memory model. Maybe this question will be relevant later.)
  • Maybe the memory model has evolved between the first prototype and the paper, and the documentation at the top of the file was not updated. In that case, should the comment be updated (more than I initially thought)?

Naively I would think that for the OCaml-C FFI you are eventually going to implement a conservative approximation of the OCaml memory model in C, whereas for native OCaml code you will eventually emit the exact assembly code you need (plus a C call to the write barrier). Sorry if there is a simple answer that I am missing!

Edit: the questions above will be relevant eventually, but for now (x86-only) it looks correct, even if misleading.

sabine added a commit to sabine/ocaml that referenced this pull request Feb 1, 2022
… rebase

Mentioned in ocaml#10831 review comment on runtime/io.c:552.
@gadmm gadmm mentioned this pull request Feb 1, 2022
xavierleroy pushed a commit that referenced this pull request Feb 1, 2022
…se (#10975)

Mentioned in #10831 review comment on runtime/io.c:552.

Co-authored-by: sabine <sabine@tarides.com>
moyodiallo added a commit to moyodiallo/ocaml-multicore-ci that referenced this pull request Mar 1, 2022
ocaml-multicore/multicore was already merged into ocaml/ocaml
(ocaml/ocaml#10831) and is available
with 5.00.0+trunk.

Now it makes sense to remove the ocaml-multicore repo
and use version 5.00 of OCaml directly.
@avsm avsm mentioned this pull request Jun 18, 2023
9 tasks