Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Eio.Condition lock-free #397

Merged
merged 1 commit into from Jan 3, 2023

Conversation

talex5
Copy link
Collaborator

@talex5 talex5 commented Dec 29, 2022

This makes Eio.Condition safe to use from a signal handler (see #374).

It's also faster. Before (using locks):

$ dune exec -- ./bench/bench_condition.exe
use_domains,  ms/iter, promoted/iter
      false,    3.93,        0.0000
       true,    9.62,    25513.410

After:

$ dune exec -- ./bench/bench_condition.exe
use_domains,  ms/iter, promoted/iter
      false,    1.12,        0.0000
       true,    2.71,    13066.2100

This is based on the paper A formally-verified framework for fair synchronization in kotlin coroutines.

The key part is cells.ml (based on Appendix C in the paper), which implements an infinite list of cells as a linked-list of segments, each containing an array of cells. When all cells in a segment are cancelled, the segment is removed from the list.

broadcast.ml then uses this to build a queue of waiters that can all be woken at once. Eio.Condition uses this to suspend and resume fibers. Broadcast isn't described in the paper; I had to add a resume_all operation to allow it. They do have a "Barrier", but that requires a separate counter and seems to have problems with cancellation ("Thus, similarly to the implementation in Java, we do not support cancellation."). I'm not sure why they didn't add a resume_all themselves.

lib_eio/tests/broadcast.md shows an example of how to use it.

lib_eio/tests/dscheck contains some tests that try to explore all possible interleavings of atomic operations. However, the official version runs out of memory and crashes after about 30s, while the optimised version has a known bug that may cause it to miss bugs.

It should be fairly easy to use this code to make the other synchronisation primitives lock-free (#382). The paper suggests creating a single abstraction to handle all cases, but I suspect it would be easier to create several simpler ones.

I want to read over the paper again and check it against my code, but it seems to be working well at the moment.

/cc @kayceesrk @bartoszmodelski @lyrm

@talex5 talex5 force-pushed the lock-free-condition branch 5 times, most recently from d351097 to 105608f Compare January 3, 2023 10:28
This makes Eio.Condition safe to use from a signal handler (or GC
finalizer).

The cells.ml abstraction should be useful for several other
synchronisation primitives (see https://arxiv.org/pdf/2111.12682.pdf).
@talex5 talex5 merged commit 62568f7 into ocaml-multicore:main Jan 3, 2023
@talex5 talex5 deleted the lock-free-condition branch January 3, 2023 11:29
talex5 added a commit to talex5/opam-repository that referenced this pull request Feb 1, 2023
CHANGES:

New features:

- Add `Eio.Net.run_server` (@bikallem @talex5 ocaml-multicore/eio#408).
  Runs an accept loop in one or more domains, with cancellation and graceful shutdown,
  and an optional maximum number of concurrent connections.

- Add `Buf_read.BE` and `LE` parsers (@Cjen1 ocaml-multicore/eio#399).
  Parse numbers in various binary formats.

- Add `Eio.Buf_read.uint8` (@talex5 ocaml-multicore/eio#418).

Performance:

- Make `Eio.Condition` lock-free (@talex5 ocaml-multicore/eio#397 ocaml-multicore/eio#381).
  In addition to being faster, this allows using conditions in signal handlers.

- Make `Eio.Semaphore` lock-free (@talex5 @polytypic ocaml-multicore/eio#398).

- Make `Eio.Stream` lock-free when the capacity is zero (@talex5 ocaml-multicore/eio#413 ocaml-multicore/eio#411).

- Make `Eio.Promise` lock-free (@talex5 ocaml-multicore/eio#401).

Bug fixes:

- eio_linux: call `Uring.submit` as needed (@talex5 @bikallem ocaml-multicore/eio#428).
  Previously, we could fail to submit a job promptly because the SQE queue was full.

- Fix luv signals (@haesbaert ocaml-multicore/eio#412).
  `libuv` automatically retries polling if it gets `EINTR`, without giving OCaml signal handlers a chance to run.

- eio_luv: fix some resource leaks (@talex5 @patricoferris ocaml-multicore/eio#421).

- eio_luv: fix "unavailable signal" error on Windows (@talex5 ocaml-multicore/eio#420, reported by @nojb).

- Fix `Buf_write.BE.uint48` and `LE.uint48` (@adatario ocaml-multicore/eio#418).

Documentation:

- Add example programs (@talex5 ocaml-multicore/eio#389).

- Update network examples to use `run_server` (@talex5 ocaml-multicore/eio#417).

- Add a warning to the tutorial about `Fiber.first` (@talex5 ocaml-multicore/eio#394).

- Clarify the epoch used for `Eio.Time.now` (@bikallem ocaml-multicore/eio#395).

- Describe `secure_random` as an infinite source (@patricoferris ocaml-multicore/eio#426).

- Update README for OCaml 5 release (@talex5 ocaml-multicore/eio#384 ocaml-multicore/eio#391 ocaml-multicore/eio#393).

Other changes:

- Delay setting `SIGPIPE` handler until the `run` function is called (@talex5 ocaml-multicore/eio#420).

- Remove debug-level logging (@talex5 ocaml-multicore/eio#403).

- eio-luv: improve `process.md` test (@smondet ocaml-multicore/eio#414).

- Update to Dune 3 (@talex5 ocaml-multicore/eio#410).

- Remove test dependency on Astring (@talex5 ocaml-multicore/eio#402 ocaml-multicore/eio#404).

- Simplify cancellation logic (@talex5 ocaml-multicore/eio#396).

- time: `Mtime.Spand.to_s` has been deprecated in mtime 2.0.0 (@bikallem ocaml-multicore/eio#385).
talex5 added a commit to talex5/opam-repository that referenced this pull request Feb 1, 2023
CHANGES:

New features:

- Add `Eio.Net.run_server` (@bikallem @talex5 ocaml-multicore/eio#408).
  Runs an accept loop in one or more domains, with cancellation and graceful shutdown,
  and an optional maximum number of concurrent connections.

- Add `Buf_read.BE` and `LE` parsers (@Cjen1 ocaml-multicore/eio#399).
  Parse numbers in various binary formats.

- Add `Eio.Buf_read.uint8` (@talex5 ocaml-multicore/eio#418).

Performance:

- Make `Eio.Condition` lock-free (@talex5 ocaml-multicore/eio#397 ocaml-multicore/eio#381).
  In addition to being faster, this allows using conditions in signal handlers.

- Make `Eio.Semaphore` lock-free (@talex5 @polytypic ocaml-multicore/eio#398).

- Make `Eio.Stream` lock-free when the capacity is zero (@talex5 ocaml-multicore/eio#413 ocaml-multicore/eio#411).

- Make `Eio.Promise` lock-free (@talex5 ocaml-multicore/eio#401).

Bug fixes:

- eio_linux: call `Uring.submit` as needed (@talex5 @bikallem ocaml-multicore/eio#428).
  Previously, we could fail to submit a job promptly because the SQE queue was full.

- Fix luv signals (@haesbaert ocaml-multicore/eio#412).
  `libuv` automatically retries polling if it gets `EINTR`, without giving OCaml signal handlers a chance to run.

- eio_luv: fix some resource leaks (@talex5 @patricoferris ocaml-multicore/eio#421).

- eio_luv: fix "unavailable signal" error on Windows (@talex5 ocaml-multicore/eio#420, reported by @nojb).

- Fix `Buf_write.BE.uint48` and `LE.uint48` (@adatario ocaml-multicore/eio#418).

Documentation:

- Add example programs (@talex5 ocaml-multicore/eio#389).

- Update network examples to use `run_server` (@talex5 ocaml-multicore/eio#417).

- Add a warning to the tutorial about `Fiber.first` (@talex5 ocaml-multicore/eio#394).

- Clarify the epoch used for `Eio.Time.now` (@bikallem ocaml-multicore/eio#395).

- Describe `secure_random` as an infinite source (@patricoferris ocaml-multicore/eio#426).

- Update README for OCaml 5 release (@talex5 ocaml-multicore/eio#384 ocaml-multicore/eio#391 ocaml-multicore/eio#393).

Other changes:

- Delay setting `SIGPIPE` handler until the `run` function is called (@talex5 ocaml-multicore/eio#420).

- Remove debug-level logging (@talex5 ocaml-multicore/eio#403).

- eio-luv: improve `process.md` test (@smondet ocaml-multicore/eio#414).

- Update to Dune 3 (@talex5 ocaml-multicore/eio#410).

- Remove test dependency on Astring (@talex5 ocaml-multicore/eio#402 ocaml-multicore/eio#404).

- Simplify cancellation logic (@talex5 ocaml-multicore/eio#396).

- time: `Mtime.Spand.to_s` has been deprecated in mtime 2.0.0 (@bikallem ocaml-multicore/eio#385).
talex5 added a commit to talex5/opam-repository that referenced this pull request Feb 1, 2023
CHANGES:

New features:

- Add `Eio.Net.run_server` (@bikallem @talex5 ocaml-multicore/eio#408).
  Runs an accept loop in one or more domains, with cancellation and graceful shutdown,
  and an optional maximum number of concurrent connections.

- Add `Buf_read.BE` and `LE` parsers (@Cjen1 ocaml-multicore/eio#399).
  Parse numbers in various binary formats.

- Add `Eio.Buf_read.uint8` (@talex5 ocaml-multicore/eio#418).

Performance:

- Make `Eio.Condition` lock-free (@talex5 ocaml-multicore/eio#397 ocaml-multicore/eio#381).
  In addition to being faster, this allows using conditions in signal handlers.

- Make `Eio.Semaphore` lock-free (@talex5 @polytypic ocaml-multicore/eio#398).

- Make `Eio.Stream` lock-free when the capacity is zero (@talex5 ocaml-multicore/eio#413 ocaml-multicore/eio#411).

- Make `Eio.Promise` lock-free (@talex5 ocaml-multicore/eio#401).

Bug fixes:

- eio_linux: call `Uring.submit` as needed (@talex5 @bikallem ocaml-multicore/eio#428).
  Previously, we could fail to submit a job promptly because the SQE queue was full.

- Fix luv signals (@haesbaert ocaml-multicore/eio#412).
  `libuv` automatically retries polling if it gets `EINTR`, without giving OCaml signal handlers a chance to run.

- eio_luv: fix some resource leaks (@talex5 @patricoferris ocaml-multicore/eio#421).

- eio_luv: fix "unavailable signal" error on Windows (@talex5 ocaml-multicore/eio#420, reported by @nojb).

- Fix `Buf_write.BE.uint48` and `LE.uint48` (@adatario ocaml-multicore/eio#418).

Documentation:

- Add example programs (@talex5 ocaml-multicore/eio#389).

- Update network examples to use `run_server` (@talex5 ocaml-multicore/eio#417).

- Add a warning to the tutorial about `Fiber.first` (@talex5 ocaml-multicore/eio#394).

- Clarify the epoch used for `Eio.Time.now` (@bikallem ocaml-multicore/eio#395).

- Describe `secure_random` as an infinite source (@patricoferris ocaml-multicore/eio#426).

- Update README for OCaml 5 release (@talex5 ocaml-multicore/eio#384 ocaml-multicore/eio#391 ocaml-multicore/eio#393).

Other changes:

- Delay setting `SIGPIPE` handler until the `run` function is called (@talex5 ocaml-multicore/eio#420).

- Remove debug-level logging (@talex5 ocaml-multicore/eio#403).

- eio-luv: improve `process.md` test (@smondet ocaml-multicore/eio#414).

- Update to Dune 3 (@talex5 ocaml-multicore/eio#410).

- Remove test dependency on Astring (@talex5 ocaml-multicore/eio#402 ocaml-multicore/eio#404).

- Simplify cancellation logic (@talex5 ocaml-multicore/eio#396).

- time: `Mtime.Spand.to_s` has been deprecated in mtime 2.0.0 (@bikallem ocaml-multicore/eio#385).
talex5 added a commit to talex5/opam-repository that referenced this pull request Feb 1, 2023
CHANGES:

New features:

- Add `Eio.Net.run_server` (@bikallem @talex5 ocaml-multicore/eio#408).
  Runs an accept loop in one or more domains, with cancellation and graceful shutdown,
  and an optional maximum number of concurrent connections.

- Add `Buf_read.BE` and `LE` parsers (@Cjen1 ocaml-multicore/eio#399).
  Parse numbers in various binary formats.

- Add `Eio.Buf_read.uint8` (@talex5 ocaml-multicore/eio#418).

Performance:

- Make `Eio.Condition` lock-free (@talex5 ocaml-multicore/eio#397 ocaml-multicore/eio#381).
  In addition to being faster, this allows using conditions in signal handlers.

- Make `Eio.Semaphore` lock-free (@talex5 @polytypic ocaml-multicore/eio#398).

- Make `Eio.Stream` lock-free when the capacity is zero (@talex5 ocaml-multicore/eio#413 ocaml-multicore/eio#411).

- Make `Eio.Promise` lock-free (@talex5 ocaml-multicore/eio#401).

Bug fixes:

- eio_linux: call `Uring.submit` as needed (@talex5 @bikallem ocaml-multicore/eio#428).
  Previously, we could fail to submit a job promptly because the SQE queue was full.

- Fix luv signals (@haesbaert ocaml-multicore/eio#412).
  `libuv` automatically retries polling if it gets `EINTR`, without giving OCaml signal handlers a chance to run.

- eio_luv: fix some resource leaks (@talex5 @patricoferris ocaml-multicore/eio#421).

- eio_luv: fix "unavailable signal" error on Windows (@talex5 ocaml-multicore/eio#420, reported by @nojb).

- Fix `Buf_write.BE.uint48` and `LE.uint48` (@adatario ocaml-multicore/eio#418).

Documentation:

- Add example programs (@talex5 ocaml-multicore/eio#389).

- Update network examples to use `run_server` (@talex5 ocaml-multicore/eio#417).

- Add a warning to the tutorial about `Fiber.first` (@talex5 ocaml-multicore/eio#394).

- Clarify the epoch used for `Eio.Time.now` (@bikallem ocaml-multicore/eio#395).

- Describe `secure_random` as an infinite source (@patricoferris ocaml-multicore/eio#426).

- Update README for OCaml 5 release (@talex5 ocaml-multicore/eio#384 ocaml-multicore/eio#391 ocaml-multicore/eio#393).

Other changes:

- Delay setting `SIGPIPE` handler until the `run` function is called (@talex5 ocaml-multicore/eio#420).

- Remove debug-level logging (@talex5 ocaml-multicore/eio#403).

- eio-luv: improve `process.md` test (@smondet ocaml-multicore/eio#414).

- Update to Dune 3 (@talex5 ocaml-multicore/eio#410).

- Remove test dependency on Astring (@talex5 ocaml-multicore/eio#402 ocaml-multicore/eio#404).

- Simplify cancellation logic (@talex5 ocaml-multicore/eio#396).

- time: `Mtime.Spand.to_s` has been deprecated in mtime 2.0.0 (@bikallem ocaml-multicore/eio#385).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant