New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read has very surprising behaviour (does not match spec) #119

Open
talex5 opened this Issue Dec 5, 2018 · 3 comments

Comments

Projects
None yet
3 participants
@talex5
Copy link
Contributor

talex5 commented Dec 5, 2018

A VChan endpoint claims to implement Mirage_flow_lwt.S. This says (https://github.com/mirage/mirage-flow/blob/master/src/mirage_flow.mli#L62):

  val read: flow -> (buffer or_eof, error) result io
  (** [read flow] blocks until some data is available and returns a
      fresh buffer containing it.

However, reading the code it seems that it does not return a fresh buffer. Instead, it returns a region of the underlying ring (which is under the control of the remote VM).

ocaml-vchan/lib/endpoint.ml

Lines 327 to 360 in 5fcd4e3

(* Read a chunk in a blocking fashion. Note this returns a
reference to the data in the ring. *)
let rec _read_one vch event =
(* wait until at least 1 byte is available or the connection has closed *)
let avail = fast_get_data_ready vch 1 in
let state = state vch in
if avail = 0 && state = Connected
then E.recv vch.evtchn event >>= fun event -> _read_one vch event
else
if avail = 0 && state <> Connected
then Lwt.return `Eof
else
let real_idx = Int32.(logand (rd_cons vch) (of_int (rd_ring_size vch) - 1l) |> to_int) in
let bytes_before_wraparound = rd_ring_size vch - real_idx in
let buf =
if bytes_before_wraparound = 0 then begin
(* all bytes are in a contiguous block starting at 0 *)
Cstruct.sub vch.read 0 avail
end else begin
(* we'll only consume the bytes before wraparound on this iteration *)
Cstruct.sub vch.read real_idx (min avail bytes_before_wraparound)
end in
Lwt.return (`Ok buf)
let read vch =
(* signal the remote that we've consumed the last block of data it sent us *)
set_rd_cons vch Int32.(of_int vch.ack_up_to);
send_notify vch Read;
(* get the fresh data *)
_read_one vch E.initial >>= function
| `Ok buf ->
(* we'll signal the remote we've consumed this data on the next iteration *)
vch.ack_up_to <- vch.ack_up_to + (Cstruct.len buf);
Lwt.return @@ Ok (`Data buf)

This means that:

  • The data in the returned buffer can change at any time if the remote VM is malicious. Users need to protect against this (e.g. by never reading the same byte more than once).
  • The data is only valid until the next read, which seems to be when the library acks the read.
  • Reading data is not sufficient to create more space in the ring buffer. In particular, the sender may be blocked waiting for space even after the receiver has read all of the data.

I think this needs documenting, at least. Perhaps consider renaming the current read to read_unsafe, and providing a copying alternative? (the C implementation copies)

@djs55

This comment has been minimized.

Copy link
Member

djs55 commented Dec 5, 2018

@avsm

This comment has been minimized.

Copy link
Member

avsm commented Dec 12, 2018

I'd be in favour of removing the non-copying version entirely, and implementing the safe version. It seems very hard to use the current one correctly.

@talex5

This comment has been minimized.

Copy link
Contributor

talex5 commented Dec 14, 2018

mirage-qubes actually uses this safely to fill a buffer of known size, which is a common thing to want to do, so I suggest that if we remove this read function, we provide a (safe) read_exactly function to fill a buffer of known size completely, without extra copies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment