Documentation for Unix.[in/out]_channel_of_descr is misleading and leads to unwanted closing of file descriptor #9786
Comments
Cc @gadmm, who discovered first the bug in the |
What about the following: if there is already a channel using the given file descriptor Determining whether a file descriptor is already used by a channel can be done today by scanning the list of all active channels, or if that is not efficient enough, by using a more efficient data structure (did I say "skiplist"?). |
I am not sure I understand why this is closer to an ownership model, i.e. what would the high-level documentation be in terms of ownership? I think the specification would be complicated, while the current behaviour is the simplest one, where the function assumes ownership of the file descriptor (I mean at least the corrected version that correctly closes the file descriptor on error at https://github.com/ocaml/ocaml/pull/8997/files#diff-ab941d7100a652a3a5b09eec8270173cR72-R104). I like the proposal in the first bullet point (clarify the documentation) for that reason. Moreover, one often learns that things that used to work accidentally in single-threaded code (e.g. closing a file descriptor twice) become dangerous in multi-threaded code. With clear documentation of ownership, which is along the lines of what C++/Rust programmers would expect, the working programmer should not make the error. (A helper function for creating in and out channels for sockets could be included, if only to draw the programmer's attention to the issue.) |
I see two other solutions:
|
That's a neat idea! Probably the simplest way to implement the documented behavior. |
I hear “skiplist” but there is also an upcoming multithreading issue. It is not yet clear to me how the global linked list of channels will evolve with multicore (it seems that the multicore version of This is a costly price for a feature that is actually not of much use. The few situations where you could rely on this part of the documented behaviour are also those where you could rewrite them not to. In particular, the documentation of posix Alternatively, Damien proposes optionally to raise an error. I assume checking for In the spirit of @jhjourdan's first bullet point I propose to simply remove (In addition, it would make sense to have a flag to indicate that a channel does not own its file descriptor, set upon creation, or a function to close a channel without closing the file descriptor. Intuitively, you rarely want a channel on stdin/out/err to close its file descriptor. This would be a clean way to express the idiom described by Jacques-Henri without having to call @jhjourdan also proposed to play with the implementation of |
Before we start speculating on performance issues in multicore, I wanted to fact-check some points of the discussion.
This has not been the case since 2002: 0738514 Closing an already-closed channel is a no-op. Closing a channel whose file descriptor has already been closed is a In particular, the following code has been broken for more than 18 years:
let sock = Unix.socket [...] Unix.SOCK_STREAM 0 in
let inchan = Unix.in_channel_of_descr sock in
let outchan = Unix.out_channel_of_descr sock in
...
close_out outchan; close_in inchan
The So, I'm pretty sure our users actually write
let sock = Unix.socket [...] Unix.SOCK_STREAM 0 in
let inchan = Unix.in_channel_of_descr sock in
let outchan = Unix.out_channel_of_descr sock in
...
close_out outchan (* or: close_in inchan *)
I agree this should be documented better. We could say that if several channels are opened on the same file descriptor, only one should be closed (preferably the output channel so that it gets flushed correctly) and the other should not be used further. We could also recommend to avoid this situation by appropriate use of There is still a potential for error if the other channel (
If we really want to protect against this, @damiendoligez 's proposal (close the other channels sharing the same FD) sounds good to me. It has the side-effect of un-breaking the code above that has been broken for more than 18 years, but that's not the main motivation. The naive implementation is a bit expensive, but we can improve that with better data structures. My suggestion of duplicating the FD so that each channel has its own FD could break example number 2 above. After So, what do we do?
|
Indeed, one should not confuse Since it is fact-checking time, I went for some sightseeing. By order of commonness:
I only looked for usage of I understand how the solution 2. can appear more intuitive than 3. This mimics a semantics where the channel borrows the file descriptor. Given the involvement of The other goal you mention (B) is to have a clean exception on use-after-free (for 3., but also desirable for 2.).
The first one magically fixes existing code (4. and 5. above, provided they are really buggy), while the second one gets to a situation with clearer ownership semantics (although still a bit fuzzy). Finally, let us appreciate how ownership is far from being an obvious notion, even for seemingly simple issues such as double-free. |
Thank you for the survey of usages in the wild. One correction: you're unfair to Coming back to the survey: the documentation for I'm not sure we need more than that, but I'll think about it some more. |
Yes of course, it is fine to leak for short-lived processes. Only for the example files it is worrying. The missing functionality programmers cannot write themselves is probably captured by a function
val consume_X : X_channel -> file_descr
that does the effect of
let close_out_without_closing_descr c = ignore (consume_out c) (* for the flowtype code linked above *)
let close_two (i,o) =
  let fd1, fd2 = consume_in i, consume_out o in
  Unix.close fd1 ;
  if fd2 <> fd1 then Unix.close fd2
|
Explain more precisely what to do and what not to do to close channels and their underlying descriptor. Fixes: ocaml#9786
A common pattern for using sockets in OCaml is to transform them into two channels: one for input and one for output:
However, at some point, inchan and outchan will be closed, and this will close the underlying file descriptor. The documentation of Unix.[in/out]_channel_of_descr says:
In other words, it implicitly says that it is fine to call both close_in and close_out on inchan and outchan to close the socket: according to the documentation, closing the second channel will simply not close the underlying file descriptor, since it has already been closed.
However, things actually happen differently in the current implementation: close_in and close_out both always call the close system call, and simply ignore the return value. This is problematic, since the POSIX file descriptor may very well be reused between the closing of the two channels, resulting in the closing of a possibly completely unrelated file descriptor.
This is exemplified with the following OCaml program:
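A minimal sketch of a program that makes the reuse observable, assuming a POSIX system (new descriptors take the lowest free number, and descriptors can be compared with polymorphic equality) and linking against the unix library. The names outcome, is_open, close_quietly and second_close_outcome are hypothetical and only serve the illustration.

```ocaml
(* Sketch only: assumes a POSIX system and the unix library.
   All names defined here are illustrative, not part of any API. *)

type outcome =
  | No_reuse          (* the descriptor number was not reused *)
  | Closed_unrelated  (* the second close closed an unrelated descriptor *)
  | Survived          (* the unrelated descriptor is still open *)

(* A descriptor is considered open if fstat does not fail with EBADF. *)
let is_open fd =
  match Unix.fstat fd with
  | _ -> true
  | exception Unix.Unix_error (Unix.EBADF, _, _) -> false

let close_quietly fd = try Unix.close fd with Unix.Unix_error _ -> ()

let second_close_outcome () =
  let sock, peer = Unix.socketpair Unix.PF_UNIX Unix.SOCK_STREAM 0 in
  let inchan = Unix.in_channel_of_descr sock in
  let outchan = Unix.out_channel_of_descr sock in
  (* First close: flushes, then closes the descriptor behind [sock]. *)
  close_out outchan;
  (* Unrelated descriptors opened in between; one of them usually gets
     the number that [sock] had. *)
  let r, w = Unix.pipe () in
  let reused = List.filter (fun fd -> fd = sock) [r; w] in
  (* Second close: calls close on the stored descriptor number again.
     If that number is not open any more, Sys_error is raised. *)
  (try close_in inchan with Sys_error _ -> ());
  let result =
    match reused with
    | [] -> No_reuse
    | fd :: _ -> if is_open fd then Survived else Closed_unrelated
  in
  List.iter close_quietly [r; w; peer];
  result

let () =
  match second_close_outcome () with
  | No_reuse -> print_endline "descriptor number not reused"
  | Closed_unrelated -> print_endline "close_in closed an unrelated pipe end"
  | Survived -> print_endline "unrelated pipe end still open"
```

In a multi-threaded program the pipe call would be replaced by whatever another thread happens to open between the two closes, which is why the problem is hard to notice in practice.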
This pattern is rather common. It is used in several tutorials for using the OCaml socket API. For example, see https://caml.inria.fr/pub/docs/oreilly-book/html/book-ora187.html, https://rosettacode.org/wiki/Sockets, https://ocaml.github.io/ocamlunix/sockets.html, ... Moreover, this bug is already present in the ocamldebug implementation (not through the Unix interface but through the internal C API; the idea is the same).
The reason nobody discovered the bug in real code is probably that the input and output channels are most often closed at the same time. The impression of soundness is, however, misleading: between the two closing calls, many things can happen, including a context switch to another thread, which may very well open, e.g., another socket.
I can see two fixes to this bug:
- Unix.in_channel_of_descr and Unix.out_channel_of_descr on a given file descriptor. There is a simple workaround: simply call Unix.dup on the file descriptor. However, the various books, tutorials, etc. will remain there, and the pattern will very probably still be used in the wild for a long time.
- Unix.file_descr type, and to make it a record of two fields: the first one contains the actual POSIX file descriptor number, while the second is a mutable flag indicating whether the file descriptor has been closed. But this is a massive change in the Unix implementation, since pretty much all its functions use the file_descr type.
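The Unix.dup workaround mentioned in the first item could look like the following minimal sketch, assuming a POSIX system and the unix library. The helper channels_of_socket and the round_trip driver are hypothetical names introduced for illustration; they are not part of the Unix module.

```ocaml
(* Sketch only: assumes a POSIX system and the unix library.
   [channels_of_socket] is a hypothetical helper, not part of Unix. *)

let channels_of_socket sock =
  (* The input channel owns [sock]; the output channel owns a duplicate,
     so each channel closes a descriptor that nobody else will close. *)
  let inchan = Unix.in_channel_of_descr sock in
  let outchan = Unix.out_channel_of_descr (Unix.dup sock) in
  (inchan, outchan)

(* Send one line through a socket pair and read it back on the other end. *)
let round_trip msg =
  let a, b = Unix.socketpair Unix.PF_UNIX Unix.SOCK_STREAM 0 in
  let a_in, a_out = channels_of_socket a in
  let b_in, b_out = channels_of_socket b in
  output_string a_out (msg ^ "\n");
  flush a_out;
  let received = input_line b_in in
  (* Closing both channels of a pair is fine here: they hold two
     distinct descriptors. *)
  close_out a_out; close_in a_in;
  close_out b_out; close_in b_in;
  received

let () = print_endline (round_trip "ping")
```

With this approach both channels must be closed to release the socket, and the peer only sees end-of-file once both descriptors are closed, since they refer to the same underlying socket.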