Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A seek on a closed channel can allow invalid inputs #11878

Closed
shym opened this issue Jan 9, 2023 · 3 comments
Closed

A seek on a closed channel can allow invalid inputs #11878

shym opened this issue Jan 9, 2023 · 3 comments

Comments

@shym
Copy link
Contributor

shym commented Jan 9, 2023

While running some qcheck-lin tests of In_channel, I ran into a problem which boils down to this test:

let _ =
  let f = open_in Sys.argv.(0) in
  seek_in f 1 ;
  close_in_noerr f ;
  seek_in f 0 ;
  Printf.printf "Read: %d\n%!" (input_byte f) ;
  Printf.printf "Read: %d\n%!" (input_byte f)

yielding:

Read: 0
Fatal error: exception Sys_error("Bad file descriptor")
Raised by primitive operation at Seekextra in file "seekextra.ml", line 7, characters 31-45

Replacing the first seek_in by seek_in f 10 would allow inputting 10 bytes before getting an exception.
In this example, the extra byte(s) we input were never really read from the file, they are what’s currently in the buffer (which is not initialized, I think, which would explain that random testing it could find such a code giving non-deterministic results).

As I understand caml_ml_close_channel, we want to avoid testing explicitly whether the channel is indeed open in every function and rather rely on failing syscalls to detect it. So I checked that simply resetting the offset when closing the file:

@@ -681,6 +681,7 @@ CAMLprim value caml_ml_close_channel(value vchannel)
      immediate caml_flush_partial or caml_refill, thus raising a Sys_error
      exception */
   channel->curr = channel->max = channel->end;
+  channel->offset = 0;
 
   /* If already closed, we are done */
   if (channel->fd != -1) {

is enough to get an exception on the first input after closing. It does still allow for the seek_in on the closed channel, so that’s maybe not the best way to go.

@xavierleroy
Copy link
Contributor

Interesting example, thanks! I'm surprised that "the extra byte(s) we input were never really read from the file". Seeking can move within data already read in the buffer, but should not expose parts of the buffer that have not been read. Maybe this is the root of the problem, independently of whether the channel is closed between the two seek_in.

@shym
Copy link
Contributor Author

shym commented Jan 10, 2023

Here is, step by step, my understanding of what happens in that example.
First we open the file with caml_open_descriptor_in, offset is set to 0, curr and max point to the beginning of the buffer.

ocaml/runtime/io.c

Lines 171 to 174 in d70eae0

channel->offset = lseek(fd, 0, SEEK_CUR);
caml_leave_blocking_section();
channel->curr = channel->max = channel->buff;
channel->end = channel->buff + IO_BUFFER_SIZE;

Then we seek to position 1 with caml_seek_in, offset is 1, curr and max are assigned again at the beginning of the buffer.

ocaml/runtime/io.c

Lines 437 to 443 in d70eae0

if (lseek(channel->fd, dest, SEEK_SET) != dest) {
caml_leave_blocking_section();
caml_sys_error(NO_ARG);
}
caml_leave_blocking_section();
channel->offset = dest;
channel->curr = channel->max = channel->buff;

Then we close the channel with caml_ml_close_channel, curr and max are set to the end of the buffer but, critically, offset is not modified (so keep at 1).

ocaml/runtime/io.c

Lines 680 to 683 in d70eae0

/* Ensure that every read or write on the channel will cause an
immediate caml_flush_partial or caml_refill, thus raising a Sys_error
exception */
channel->curr = channel->max = channel->end;

Then we seek back with caml_seek_in, but this time the condition is true, the function assumes that we are seeking inside data in the buffer (note though that no read ever happened), so curr points now to the last byte of the buffer.

ocaml/runtime/io.c

Lines 431 to 434 in d70eae0

if (dest >= channel->offset - (channel->max - channel->buff)
&& dest <= channel->offset
&& (channel->flags & CHANNEL_TEXT_MODE) == 0) {
channel->curr = channel->max - (channel->offset - dest);

Then the next input will read that byte, because it is deemed valid.

@xavierleroy
Copy link
Contributor

Thanks for the step-by-step. It's now clear in my mind that the problem is with close_in.

shym added a commit to shym/ocaml that referenced this issue Jan 26, 2023
shym added a commit to shym/ocaml that referenced this issue Jan 26, 2023
If offset keeps its value when the channel is closed, it is possible to
`seek_in` backward which will only decrease `channel->curr` and therefore
mark some buffer bytes as valid: further calls to `input_byte` would
succeed on the closed channel

Fix issue ocaml#11878
shym added a commit to shym/ocaml that referenced this issue Jan 27, 2023
shym added a commit to shym/ocaml that referenced this issue Jan 27, 2023
If offset keeps its value when the channel is closed, it is possible to
`seek_in` backward which will only decrease `channel->curr` and therefore
mark some buffer bytes as valid: further calls to `input_byte` would
succeed on the closed channel

Fix issue ocaml#11878
shym added a commit to shym/ocaml that referenced this issue Jan 27, 2023
shym added a commit to shym/ocaml that referenced this issue Jan 27, 2023
If offset keeps its value when the channel is closed, it is possible to
`seek_in` backward which will only decrease `channel->curr` and therefore
mark some buffer bytes as valid: further calls to `input_byte` would
succeed on the closed channel

Fix issue ocaml#11878
gasche added a commit to gasche/ocaml that referenced this issue Jan 29, 2024
This reverts a technical decision made in ocaml#11878, ocaml#11965, for the
purpose of simplifying the implementation of [close_in] and the
specification of [seek_in], which now consistently fails on closed
channels.
edwintorok pushed a commit to edwintorok/ocaml that referenced this issue Feb 10, 2024
… channel (ocaml#11965)

* Test calls to input_byte after close_in

Add a test demonstrating issue ocaml#11878

* Reset offset when closing a channel

If offset keeps its value when the channel is closed, it is possible to
`seek_in` backward which will only decrease `channel->curr` and therefore
mark some buffer bytes as valid: further calls to `input_byte` would
succeed on the closed channel

Fixes: ocaml#11878
(cherry picked from commit a606e92)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants