Process.fork has dangerous semantics #6421

forksaber · 2018-07-21T02:34:27Z

`forkall`(Current)

Process.fork currently has forkall semantics with respect to fibers. What this means is that all the fibers which were alive during the Process.fork call remain alive in the forked process.

Advantages

no deadlocks possible due to locked mutexes in contrast to when only a single fiber survives the fork

Disadvantages

Since fd/sockets are shared across forks, fibers which were doing writes will do double writes which will cause:

broken http responses
duplicated data in files
corrupted db queries

Similarly, fd/sockets reads may get divided across forks. So, both processes will see

partial http requests
partial db responses
partial file data

An example of double writes:

spawn do
  5.times do |i|
    sleep rand
    puts "line #{i}"
  end 
end
process = Process.fork do
  sleep 10
end

process.wait

Expected output:

line 0
line 1
line 2
line 3
line 4

Actual output:

line 0
line 0
line 1
line 2
line 1
line 3
line 4
line 2
line 3
line 4

IMO, these are really severe disadvantages so these semantics ought to change.

Alternatives

1. `fork` like semantics (linux/unix)

In this scenario, only the fiber invoking Process.fork lives on in the forked process. Rest of the fibers are either collected(majority of them) and a few core runtime fibers like SignalHandler loop are recreated.

Advantages

no unintentional socket/fd corruption
easy to reason about
similar to how threads would behave once thread-support lands

Disadvantages

Deadlocks: Any attempt to lock a mutex which was locked in non forking Fiber will result in a deadlock. This is manageable when you own the mutexes but if mutexes belong to stdlib/shard then deadlocks are more or less imminent. pthread_atfork like api for fibers may help here though i am not sure how successful that will be.

2. Disable fork without exec (go/erlang)

This approach is in line with what go and erlang takes. They find fork too difficult to implement correctly in a multi-threaded/multi-fiber/multi-actor environment that they don't even bother implementing it.

Disadvantages

won't be able to daemonize the process using the double fork technique but can definitely use a C binary which exec()s to a crystal binary.

Conclusion

Personally, I would prefer that fork without exec be removed from the stdlib so there's no risk of deadlocks or data corruption. If that's unacceptable, even implementing fork semantics is still a step in the right direction.

PS: I've been implementing a supervisor clone in crystal which gave me an opportunity to look deeper into Crystal's internals.

#6416 is also caused by forkall behaviour. I'll provide a root cause analysis in that thread.

The text was updated successfully, but these errors were encountered:

asterite · 2018-07-21T03:30:46Z

Yes, I'd like that too. We have fork because we had it before we went with non-blocking IO and fibers like Go. Now that did that, it's probably time to remove fork (which also doesn't exist in Windows...)

asterite · 2018-07-21T03:37:00Z

Another reason: the experiments on trying to bring multithread support to Crystal were difficult to achieve, and getting fork to work was one of the reasons (not the reason, but one of them).

98-f355-f1 · 2018-07-21T04:37:57Z

I used your example and ran it... got a similar input... I have scanned over the source code to see what's going on.
I do kind of see what you are suggesting, and how it's possibly happening. I agree with the fact that the implementation of fork needs to be improved, but I don't think that we have to throw it away either.

I tweaked your example and ran a comparable in Ruby for expected output. I am fearful of runaway processes, so I will always use a wait.

`
chan = Channel(Nil).new
spawn do
10.times do |i|
sleep rand
puts "line #{i}"
end
chan.send(nil)
end

t = Process.fork do
sleep 1
end

chan.receive
t.wait
`

I didn't quite the same output:

line 0 line 0 line 1 line 1 line 2 line 3 line 4 line 5 line 6 line 7 line 8 line 9
Do you think that the differences my be OS related?

These are going to be fun bugs to eradicate... :P

@asterite I think that the lack of windows operability shouldn't mean that we need to discard the whole thing. Now that Microsoft is going into the deepend on interoperability with Linux... these functions will be able to be accessed by MS users. There is always spawn or other ways to do it also for windows users.

forksaber · 2018-07-21T05:18:51Z

Ah, the output was from 5.times and not 10.times. Edited my original post to reflect that. Also added process.wait.
@98-f355-f1 You are only getting line 0 twice because your forked process is only sleeping for 1 sec. Increase it to see similar output.

I think the only way to improve fork while keeping it would be implement forkone semantics but then we'll have to worry about mutexes and deadlocks. I am not sure if that will be considered an improvement.
So removing it is the safest choice.

see
https://thorstenball.com/blog/2014/10/13/why-threads-cant-fork/
https://news.ycombinator.com/item?id=8449164

NIFR91 · 2018-07-21T07:23:35Z

I think fork should be kept if possible, maybe as suggested by enforcing fork and exec pattern.

jhass · 2018-07-21T07:46:13Z

Random idea: can we allow it only before any user fibers, or generally any fibers not explicitly marked as safe for fork, are created?

ysbaddaden · 2018-07-21T09:41:39Z

~~I'm leaning towards both positions (keep or remove) but I still lean a bit on the keep it side.~~

Yes, fork is dangerous, it can lead to disasters if invoked at whatever time; you should never have to use it; except for some rare edge cases where it makes sense on a POSIX system in a controlled environment (e.g. zero downtime binary reloads) without requiring C.

Just like daemonizing, I believe we should have a proper fork handling (that fixes a few bolts, like reseeding rand), mark it as unsafe and document it as very dangerous. Maybe remove the global ::fork and keep only Process.fork. I think it's better than having users just call LibC.fork manually (never do that).

~~I think it's possible to keep fork with a multithreaded event loop. The GC must signal all threads to stop before it can start collecting memory, fork must do the same (unless it's a fork/exec).~~

kostya · 2018-07-21T09:46:59Z

fork is need for example here: https://github.com/kostya/curl-downloader, i not find any efficient ways except fork (i tried scheduler mapper to events, threads, curl multi). just write some danger comment for all who wants to use fork.

98-f355-f1 · 2018-07-21T15:27:20Z

@forksaber you wrote above "PS: I've been implementing a supervisor clone in crystal which gave me an opportunity to look deeper into Crystal's internals." Could you explain how you did this?

@forksaber on the second note, I got similar type erratic behavior also, but I am impatient so I wasn't going to wait 14 min for the sleep LOL. I tried it at different times, but curiously when I rewrote with the wait, it sort of straightened up the behavior.

I guess that it seems that most folks want to keep the fork, perhaps remove the exec, and put in strong warning, use at your own risk, explaining some possible side effects.

RX14 · 2018-07-21T22:28:28Z

I agree with the "keep fork as-is (but fix as many bugs as we can) but mark it as super dangerous" idea. We should provide a "sanitized" safe fork that works before fibers have been spawned too.

forksaber · 2018-07-23T11:27:15Z

@98-f355-f1 You can check out my supervisor implementation here https://github.com/forksaber/supervisor
Almost feature complete but not much in the way of documentation and tests.

straight-shoota · 2019-11-26T13:22:43Z

I think we should remove plain Process.fork from the stdlib.

fork is not supported on win32 at all. That makes it a very platform-specific feature which we would like to avoid in stdlib.
On other platforms, there are difficulties with fibers and it's entirely unsupported with multithreading enabled. AFAIK this can't work reliably ever. In a forked process only one thread continues which makes it error-prone. So with MT moving forward, future use cases look very slim.

This answer on SO explains the issues with fork in Golang, which should apply exactly to Crystal as well: https://stackoverflow.com/a/28371586/7401689

As a replacement, fork could be made available in a platform-specific shard which can be added as a dependency for the few use cases where you would really want to use bare fork behaviour.

fork immediately followed by exec would obviously still be required as implementation for Process.exec on POSIX platforms.

j8r · 2019-11-26T14:07:36Z

fork can be kept under a platform-specific module. This modularization is common in other languages.

jan-zajic · 2019-11-27T11:46:40Z

I think we can keep platform depended apis along with their logical modules but mark them as so. If someone uses them he should know about consequences (i.e non-portability). See my proposal about this concept - #8524.

ysbaddaden · 2019-11-28T10:09:28Z

I finally changed my mind on fork. Keeping support would require a lot of work and lots of bugs to fix, especially with MT (we already had a bunch with a single thread) but... for what purposes?

Daemonizing a process? There are many tools specialized to spawn, monitor and log processes starting from init systems themselves nowadays. They'll do a better job, and lets users use whatever they prefer, rather than each application re-doing everything (that probably won't be used because users will use external tools).
Sharing a server socket? This is a legit use-case (and performance easily beat MT), but we can create the socket with SO_REUSEPORT and just start multiple processes.
Take advantage of multiple cores, then communicating in some ways (e.g. through sockets)? That's what MT is for! Otherwise we can use an IPC protocol (e.g. ZMQ, AMQP server) and start multiple processes, or plain UNIX sockets.

Let's not bother.

RX14 · 2019-12-04T22:52:26Z

The only concevable use of fork would be to insert safe privilege-dropping commands like a chroot or setrlimit between the fork and the exec. This would be useful for a container scheduler for example.

I would support an unsafe non-documented API to make these happen if someone was interested in the capability, but otherwise i'm fully in support of pulling fork.

straight-shoota · 2019-12-04T23:53:49Z

I would support an unsafe non-documented API to make these happen if someone was interested in the capability, but otherwise i'm fully in support of pulling fork.

That would certainly be better placed in a documented API of a dedicated shard.

RX14 · 2019-12-06T23:15:10Z

@straight-shoota well Process.exec needs to be part of the stdlib, and it can't be obvious or documented how to use a post-fork exec callback since it's very unsafe. I don't see how this could be a shard.

straight-shoota mentioned this issue Nov 26, 2019

Windows Process API #8518

Closed

straight-shoota mentioned this issue Dec 2, 2019

Platform dependent public API #8524

Draft

straight-shoota mentioned this issue Apr 19, 2020

Deprecate top-level fork #9136

Merged

HertzDevil added status:discussion topic:stdlib:system labels Aug 4, 2021

straight-shoota mentioned this issue Jan 10, 2023

Internalize Process.fork #12934

Merged

straight-shoota closed this as completed in #12934 Jan 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Process.fork has dangerous semantics #6421

Process.fork has dangerous semantics #6421

forksaber commented Jul 21, 2018 •

edited

asterite commented Jul 21, 2018

asterite commented Jul 21, 2018

98-f355-f1 commented Jul 21, 2018 •

edited

forksaber commented Jul 21, 2018

NIFR91 commented Jul 21, 2018

jhass commented Jul 21, 2018

ysbaddaden commented Jul 21, 2018 •

edited

kostya commented Jul 21, 2018 •

edited

98-f355-f1 commented Jul 21, 2018

RX14 commented Jul 21, 2018

forksaber commented Jul 23, 2018

straight-shoota commented Nov 26, 2019

j8r commented Nov 26, 2019

jan-zajic commented Nov 27, 2019

ysbaddaden commented Nov 28, 2019 •

edited

RX14 commented Dec 4, 2019

straight-shoota commented Dec 4, 2019

RX14 commented Dec 6, 2019 •

edited

Process.fork has dangerous semantics #6421

Process.fork has dangerous semantics #6421

Comments

forksaber commented Jul 21, 2018 • edited

forkall(Current)

Advantages

Disadvantages

Alternatives

1. fork like semantics (linux/unix)

Advantages

Disadvantages

2. Disable fork without exec (go/erlang)

Disadvantages

Conclusion

asterite commented Jul 21, 2018

asterite commented Jul 21, 2018

98-f355-f1 commented Jul 21, 2018 • edited

forksaber commented Jul 21, 2018

NIFR91 commented Jul 21, 2018

jhass commented Jul 21, 2018

ysbaddaden commented Jul 21, 2018 • edited

kostya commented Jul 21, 2018 • edited

98-f355-f1 commented Jul 21, 2018

RX14 commented Jul 21, 2018

forksaber commented Jul 23, 2018

straight-shoota commented Nov 26, 2019

j8r commented Nov 26, 2019

jan-zajic commented Nov 27, 2019

ysbaddaden commented Nov 28, 2019 • edited

RX14 commented Dec 4, 2019

straight-shoota commented Dec 4, 2019

RX14 commented Dec 6, 2019 • edited

forksaber commented Jul 21, 2018 •

edited

`forkall`(Current)

1. `fork` like semantics (linux/unix)

98-f355-f1 commented Jul 21, 2018 •

edited

ysbaddaden commented Jul 21, 2018 •

edited

kostya commented Jul 21, 2018 •

edited

ysbaddaden commented Nov 28, 2019 •

edited

RX14 commented Dec 6, 2019 •

edited