Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Eio_linux.wakeup signal-safe #381

Merged
merged 1 commit into from Dec 8, 2022
Merged

Conversation

talex5
Copy link
Collaborator

@talex5 talex5 commented Dec 5, 2022

This will eventually allow Waiters to be used from signal handlers and GC finalizers (once that's lock-free too).

Previously, wakeup couldn't be used from a signal handler because it took a lock (to ensure that the eventfd wasn't closed while it wrote to it). Now, we avoid ever closing eventfds and simply keep free ones in a pool.

/cc @haesbaert

@haesbaert haesbaert mentioned this pull request Dec 5, 2022
@haesbaert
Copy link
Contributor

haesbaert commented Dec 5, 2022

This is nice, although the forever leak of fds is kidna evil, especially when we have a single process running tests with multiple runs. Should top at 64 IIRC, since that's maximum number of domains/eventfd pairs.

I can fix the leak with barriers in the future, it's a pretty simple solution, I just have to write it.

@talex5
Copy link
Collaborator Author

talex5 commented Dec 6, 2022

I can fix the leak with barriers in the future, it's a pretty simple solution, I just have to write it.

That sounds interesting. How does it work?

@haesbaert
Copy link
Contributor

haesbaert commented Dec 6, 2022

I can fix the leak with barriers in the future, it's a pretty simple solution, I just have to write it.

That sounds interesting. How does it work?

Reinstating the current problem so we're on the same page

The current problem is t.eventfd might be closed in the middle of wakeup or worse, eventfd might now be something completely different (as in allocated by another Domain).

How barriers could help

Let there be two domains, A and B. A is trying to wakeup B while B is exiting.
t.eventfd would become something akin to a weak pointer Atomic.t Unix.filedescriptor option

At B_2 we remove the external visibility of eventfd, but we can't really close/destroy it since another domain might be in between A_2 and A_3, meaning it got the old value (Some) and will try writing to it, if the write happens before the remote close, we're safe.

Issuing a barrier call makes sure any remote Domain switched fibers at least once, meaning they can't be in the middle of a wakeup, which means any subsequent Atomic.get will see the updated None value, which means I'm now safe to close/destroy the descriptor.

In EIO, a window or epoch could be everyone switched fibers once, so Eio.Domain.barrier() would wait for the next epoch.

(* Domain A trying to wakeup domain B *)
A_0: let wakeup b =
A_1:   match Atomic.get b.eventfd with
A_2:  | Some fd ->
A_3:     Unix.single_write fd wake_buffer 0 8
A_4:  | None -> () (* Target domain must be exiting *)

(* Domain B in the middle of exit *)
B_0: let exit () =
B_1:   let myfd = self.eventfd in (* copy *)
B_2:   Atomic.set self.eventfd None; (* unlink *)
B_3:   Eio.Domain.barrier (); (* publish *)
B_4:   Unix.close myfd (* destroy *)

(* The most naive barrier implementation would be a zero sized
   Stream.t where you just put "nothing" on it, since buffer is zero,
   the reader has to acknowledge, which means it swi tched Fibers *)
let barrier () =
  List.iter (fun t -> Eio.Stream.put t.barrier_box ()) all_domains

Background

This is mostly what RCU does, OpenBSD calls it SMR (Safe Memory Reclamation), DPDK calls it something else, but it's quite popular in C in order to properly know when you can free(3) something. I've implemented this in our OpenBSD fork at work and rewrote most of pf(4) and pfsync(4) with it.

How the barrier is implemented varies, it just needs to guarantee "wait till next epoch". The classic implementation on non preemptive kernels was pinning your thread to every different cpu and just continuing. Barriers need not to be synchronous as well, it's common to be able to "schedule a callback for the next window" so you could say "call Unix.close () in the next epoch".

https://pdos.csail.mit.edu/6.828/2008/readings/rcu.pdf
https://www.osadl.org/fileadmin/dam/presentations/RTLWS11/paulmck-rcu.pdf

@talex5
Copy link
Collaborator Author

talex5 commented Dec 6, 2022

That would be useful, but I don't see how it could be implemented. It's not just all_domains: signals and GC finalizers can also run in other systhreads and non-Eio domains.

@haesbaert
Copy link
Contributor

haesbaert commented Dec 6, 2022

That would be useful, but I don't see how it could be implemented. It's not just all_domains: signals and GC finalizers can also run in other systhreads and non-Eio domains.

We would have to make concessions like "you're only supposed to have Eio Domains", or X is only safe within an Eio context and unsafe if accessed outside.

It would work out for this eventfd case though (I think).

@talex5
Copy link
Collaborator Author

talex5 commented Dec 6, 2022

We want to use wakeup to e.g. let a Domainslib domain send a message to an Eio one.

@haesbaert
Copy link
Contributor

We want to use wakeup to e.g. let a Domainslib domain send a message to an Eio one.

that's a bummer

end = struct
external eio_eventfd : int -> Unix.file_descr = "caml_eio_eventfd"

let free = Lf_queue.create ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change this to something else, free for me is the opposite of alloc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What name do you suggest? It's the list of free FDs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

free_list, fd_list, fd_pool, free_pool, but don't let this block you, there's no point in me bikeshedding it over this, if free is of your preference please go ahead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'm going to keep it as it is. free_list can be misread as a command just like free can; fd_list doesn't say why the FDs are there, which seems the most important thing; and the module already has pool in the name. Also, it's only in scope for 6 lines.

@haesbaert
Copy link
Contributor

So regarding this PR, I don't like the name free but other than that I think you should commit.
The benefit of not locking for every wakeup outweights the fd leak imho.

This will eventually allow Waiters to be used from signal handlers and
GC finalizers (once that's lock-free too).

Previously, `wakeup` couldn't be used from a signal handler because it
took a lock (to ensure that the eventfd wasn't closed while it wrote to
it). Now, we avoid ever closing eventfds and simply keep free ones in a
pool.
@talex5 talex5 merged commit 90de1d4 into ocaml-multicore:main Dec 8, 2022
@talex5 talex5 deleted the wakeup branch December 8, 2022 09:52
talex5 added a commit to talex5/opam-repository that referenced this pull request Feb 1, 2023
CHANGES:

New features:

- Add `Eio.Net.run_server` (@bikallem @talex5 ocaml-multicore/eio#408).
  Runs an accept loop in one or more domains, with cancellation and graceful shutdown,
  and an optional maximum number of concurrent connections.

- Add `Buf_read.BE` and `LE` parsers (@Cjen1 ocaml-multicore/eio#399).
  Parse numbers in various binary formats.

- Add `Eio.Buf_read.uint8` (@talex5 ocaml-multicore/eio#418).

Performance:

- Make `Eio.Condition` lock-free (@talex5 ocaml-multicore/eio#397 ocaml-multicore/eio#381).
  In addition to being faster, this allows using conditions in signal handlers.

- Make `Eio.Semaphore` lock-free (@talex5 @polytypic ocaml-multicore/eio#398).

- Make `Eio.Stream` lock-free when the capacity is zero (@talex5 ocaml-multicore/eio#413 ocaml-multicore/eio#411).

- Make `Eio.Promise` lock-free (@talex5 ocaml-multicore/eio#401).

Bug fixes:

- eio_linux: call `Uring.submit` as needed (@talex5 @bikallem ocaml-multicore/eio#428).
  Previously, we could fail to submit a job promptly because the SQE queue was full.

- Fix luv signals (@haesbaert ocaml-multicore/eio#412).
  `libuv` automatically retries polling if it gets `EINTR`, without giving OCaml signal handlers a chance to run.

- eio_luv: fix some resource leaks (@talex5 @patricoferris ocaml-multicore/eio#421).

- eio_luv: fix "unavailable signal" error on Windows (@talex5 ocaml-multicore/eio#420, reported by @nojb).

- Fix `Buf_write.BE.uint48` and `LE.uint48` (@adatario ocaml-multicore/eio#418).

Documentation:

- Add example programs (@talex5 ocaml-multicore/eio#389).

- Update network examples to use `run_server` (@talex5 ocaml-multicore/eio#417).

- Add a warning to the tutorial about `Fiber.first` (@talex5 ocaml-multicore/eio#394).

- Clarify the epoch used for `Eio.Time.now` (@bikallem ocaml-multicore/eio#395).

- Describe `secure_random` as an infinite source (@patricoferris ocaml-multicore/eio#426).

- Update README for OCaml 5 release (@talex5 ocaml-multicore/eio#384 ocaml-multicore/eio#391 ocaml-multicore/eio#393).

Other changes:

- Delay setting `SIGPIPE` handler until the `run` function is called (@talex5 ocaml-multicore/eio#420).

- Remove debug-level logging (@talex5 ocaml-multicore/eio#403).

- eio-luv: improve `process.md` test (@smondet ocaml-multicore/eio#414).

- Update to Dune 3 (@talex5 ocaml-multicore/eio#410).

- Remove test dependency on Astring (@talex5 ocaml-multicore/eio#402 ocaml-multicore/eio#404).

- Simplify cancellation logic (@talex5 ocaml-multicore/eio#396).

- time: `Mtime.Spand.to_s` has been deprecated in mtime 2.0.0 (@bikallem ocaml-multicore/eio#385).
talex5 added a commit to talex5/opam-repository that referenced this pull request Feb 1, 2023
CHANGES:

New features:

- Add `Eio.Net.run_server` (@bikallem @talex5 ocaml-multicore/eio#408).
  Runs an accept loop in one or more domains, with cancellation and graceful shutdown,
  and an optional maximum number of concurrent connections.

- Add `Buf_read.BE` and `LE` parsers (@Cjen1 ocaml-multicore/eio#399).
  Parse numbers in various binary formats.

- Add `Eio.Buf_read.uint8` (@talex5 ocaml-multicore/eio#418).

Performance:

- Make `Eio.Condition` lock-free (@talex5 ocaml-multicore/eio#397 ocaml-multicore/eio#381).
  In addition to being faster, this allows using conditions in signal handlers.

- Make `Eio.Semaphore` lock-free (@talex5 @polytypic ocaml-multicore/eio#398).

- Make `Eio.Stream` lock-free when the capacity is zero (@talex5 ocaml-multicore/eio#413 ocaml-multicore/eio#411).

- Make `Eio.Promise` lock-free (@talex5 ocaml-multicore/eio#401).

Bug fixes:

- eio_linux: call `Uring.submit` as needed (@talex5 @bikallem ocaml-multicore/eio#428).
  Previously, we could fail to submit a job promptly because the SQE queue was full.

- Fix luv signals (@haesbaert ocaml-multicore/eio#412).
  `libuv` automatically retries polling if it gets `EINTR`, without giving OCaml signal handlers a chance to run.

- eio_luv: fix some resource leaks (@talex5 @patricoferris ocaml-multicore/eio#421).

- eio_luv: fix "unavailable signal" error on Windows (@talex5 ocaml-multicore/eio#420, reported by @nojb).

- Fix `Buf_write.BE.uint48` and `LE.uint48` (@adatario ocaml-multicore/eio#418).

Documentation:

- Add example programs (@talex5 ocaml-multicore/eio#389).

- Update network examples to use `run_server` (@talex5 ocaml-multicore/eio#417).

- Add a warning to the tutorial about `Fiber.first` (@talex5 ocaml-multicore/eio#394).

- Clarify the epoch used for `Eio.Time.now` (@bikallem ocaml-multicore/eio#395).

- Describe `secure_random` as an infinite source (@patricoferris ocaml-multicore/eio#426).

- Update README for OCaml 5 release (@talex5 ocaml-multicore/eio#384 ocaml-multicore/eio#391 ocaml-multicore/eio#393).

Other changes:

- Delay setting `SIGPIPE` handler until the `run` function is called (@talex5 ocaml-multicore/eio#420).

- Remove debug-level logging (@talex5 ocaml-multicore/eio#403).

- eio-luv: improve `process.md` test (@smondet ocaml-multicore/eio#414).

- Update to Dune 3 (@talex5 ocaml-multicore/eio#410).

- Remove test dependency on Astring (@talex5 ocaml-multicore/eio#402 ocaml-multicore/eio#404).

- Simplify cancellation logic (@talex5 ocaml-multicore/eio#396).

- time: `Mtime.Spand.to_s` has been deprecated in mtime 2.0.0 (@bikallem ocaml-multicore/eio#385).
talex5 added a commit to talex5/opam-repository that referenced this pull request Feb 1, 2023
CHANGES:

New features:

- Add `Eio.Net.run_server` (@bikallem @talex5 ocaml-multicore/eio#408).
  Runs an accept loop in one or more domains, with cancellation and graceful shutdown,
  and an optional maximum number of concurrent connections.

- Add `Buf_read.BE` and `LE` parsers (@Cjen1 ocaml-multicore/eio#399).
  Parse numbers in various binary formats.

- Add `Eio.Buf_read.uint8` (@talex5 ocaml-multicore/eio#418).

Performance:

- Make `Eio.Condition` lock-free (@talex5 ocaml-multicore/eio#397 ocaml-multicore/eio#381).
  In addition to being faster, this allows using conditions in signal handlers.

- Make `Eio.Semaphore` lock-free (@talex5 @polytypic ocaml-multicore/eio#398).

- Make `Eio.Stream` lock-free when the capacity is zero (@talex5 ocaml-multicore/eio#413 ocaml-multicore/eio#411).

- Make `Eio.Promise` lock-free (@talex5 ocaml-multicore/eio#401).

Bug fixes:

- eio_linux: call `Uring.submit` as needed (@talex5 @bikallem ocaml-multicore/eio#428).
  Previously, we could fail to submit a job promptly because the SQE queue was full.

- Fix luv signals (@haesbaert ocaml-multicore/eio#412).
  `libuv` automatically retries polling if it gets `EINTR`, without giving OCaml signal handlers a chance to run.

- eio_luv: fix some resource leaks (@talex5 @patricoferris ocaml-multicore/eio#421).

- eio_luv: fix "unavailable signal" error on Windows (@talex5 ocaml-multicore/eio#420, reported by @nojb).

- Fix `Buf_write.BE.uint48` and `LE.uint48` (@adatario ocaml-multicore/eio#418).

Documentation:

- Add example programs (@talex5 ocaml-multicore/eio#389).

- Update network examples to use `run_server` (@talex5 ocaml-multicore/eio#417).

- Add a warning to the tutorial about `Fiber.first` (@talex5 ocaml-multicore/eio#394).

- Clarify the epoch used for `Eio.Time.now` (@bikallem ocaml-multicore/eio#395).

- Describe `secure_random` as an infinite source (@patricoferris ocaml-multicore/eio#426).

- Update README for OCaml 5 release (@talex5 ocaml-multicore/eio#384 ocaml-multicore/eio#391 ocaml-multicore/eio#393).

Other changes:

- Delay setting `SIGPIPE` handler until the `run` function is called (@talex5 ocaml-multicore/eio#420).

- Remove debug-level logging (@talex5 ocaml-multicore/eio#403).

- eio-luv: improve `process.md` test (@smondet ocaml-multicore/eio#414).

- Update to Dune 3 (@talex5 ocaml-multicore/eio#410).

- Remove test dependency on Astring (@talex5 ocaml-multicore/eio#402 ocaml-multicore/eio#404).

- Simplify cancellation logic (@talex5 ocaml-multicore/eio#396).

- time: `Mtime.Spand.to_s` has been deprecated in mtime 2.0.0 (@bikallem ocaml-multicore/eio#385).
talex5 added a commit to talex5/opam-repository that referenced this pull request Feb 1, 2023
CHANGES:

New features:

- Add `Eio.Net.run_server` (@bikallem @talex5 ocaml-multicore/eio#408).
  Runs an accept loop in one or more domains, with cancellation and graceful shutdown,
  and an optional maximum number of concurrent connections.

- Add `Buf_read.BE` and `LE` parsers (@Cjen1 ocaml-multicore/eio#399).
  Parse numbers in various binary formats.

- Add `Eio.Buf_read.uint8` (@talex5 ocaml-multicore/eio#418).

Performance:

- Make `Eio.Condition` lock-free (@talex5 ocaml-multicore/eio#397 ocaml-multicore/eio#381).
  In addition to being faster, this allows using conditions in signal handlers.

- Make `Eio.Semaphore` lock-free (@talex5 @polytypic ocaml-multicore/eio#398).

- Make `Eio.Stream` lock-free when the capacity is zero (@talex5 ocaml-multicore/eio#413 ocaml-multicore/eio#411).

- Make `Eio.Promise` lock-free (@talex5 ocaml-multicore/eio#401).

Bug fixes:

- eio_linux: call `Uring.submit` as needed (@talex5 @bikallem ocaml-multicore/eio#428).
  Previously, we could fail to submit a job promptly because the SQE queue was full.

- Fix luv signals (@haesbaert ocaml-multicore/eio#412).
  `libuv` automatically retries polling if it gets `EINTR`, without giving OCaml signal handlers a chance to run.

- eio_luv: fix some resource leaks (@talex5 @patricoferris ocaml-multicore/eio#421).

- eio_luv: fix "unavailable signal" error on Windows (@talex5 ocaml-multicore/eio#420, reported by @nojb).

- Fix `Buf_write.BE.uint48` and `LE.uint48` (@adatario ocaml-multicore/eio#418).

Documentation:

- Add example programs (@talex5 ocaml-multicore/eio#389).

- Update network examples to use `run_server` (@talex5 ocaml-multicore/eio#417).

- Add a warning to the tutorial about `Fiber.first` (@talex5 ocaml-multicore/eio#394).

- Clarify the epoch used for `Eio.Time.now` (@bikallem ocaml-multicore/eio#395).

- Describe `secure_random` as an infinite source (@patricoferris ocaml-multicore/eio#426).

- Update README for OCaml 5 release (@talex5 ocaml-multicore/eio#384 ocaml-multicore/eio#391 ocaml-multicore/eio#393).

Other changes:

- Delay setting `SIGPIPE` handler until the `run` function is called (@talex5 ocaml-multicore/eio#420).

- Remove debug-level logging (@talex5 ocaml-multicore/eio#403).

- eio-luv: improve `process.md` test (@smondet ocaml-multicore/eio#414).

- Update to Dune 3 (@talex5 ocaml-multicore/eio#410).

- Remove test dependency on Astring (@talex5 ocaml-multicore/eio#402 ocaml-multicore/eio#404).

- Simplify cancellation logic (@talex5 ocaml-multicore/eio#396).

- time: `Mtime.Spand.to_s` has been deprecated in mtime 2.0.0 (@bikallem ocaml-multicore/eio#385).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants