Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

irmin-pack: Fix data race in control file reads #2100

Merged
merged 3 commits into from
Oct 4, 2022
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions CHANGES.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,11 @@
- Move unix specific details for `Pack_key` and `Pack_value` from `irmin-pack`
to `irmin-pack.unix` (#2084, @metanivek)

### Fixed

- **irmin-pack**
- Fix data race in RO instances when reading control file (#2100, @Ngoguey42)

## 3.4.1 (2022-09-07)

### Added
Expand Down
5 changes: 2 additions & 3 deletions src/irmin-pack/unix/control_file.ml
Original file line number Diff line number Diff line change
Expand Up @@ -69,9 +69,8 @@ module Make (Io : Io.S) = struct

let read io =
let open Result_syntax in
let* len = Io.read_size io in
let len = Int63.to_int len in
let* string = Io.read_to_string io ~off:Int63.zero ~len in
Comment on lines -72 to -74
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the RW instance updates the file between the read_size and the read_to_string we would have a problem

let* string = Io.read_all_to_string io in
assert (String.length string <= Io.page_size);
Data.of_bin_string string

let read io =
Expand Down
18 changes: 17 additions & 1 deletion src/irmin-pack/unix/io.ml
Original file line number Diff line number Diff line change
Expand Up @@ -211,6 +211,23 @@ module Unix = struct
| Errors.Closed -> Error `Closed
| Unix.Unix_error (e, s1, s2) -> Error (`Io_misc (e, s1, s2))

let page_size = 4096

let read_all_to_string t =
let buf = Buffer.create 0 in
let len = page_size in
let bytes = Bytes.create len in
let rec aux ~off =
let nread = Util.really_read t.fd off len bytes in
Copy link
Member

@samoht samoht Oct 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You still have a data race here if the RW instance updates the file between two short reads, don't you?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two thoughts:

  1. I don't think a loop is needed here since Util.really_read also loops until len is read. But --
  2. Regardless of location of the loop, it does seem to me that a race can happen if you need multiple reads. What is the best (or even a good) way to protect against this? The simplest I can think of is to check last updated timestamp before each read and start over if it changes, but maybe there is some other trick?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is how my patch functions:

  • I've added a read_all_to_string function that just reads a file. It doesn't guarantee that the full content of the file is read atomically. It only guarantees that pages are read atomically (i.e. a page never spans on two iterations of the loop)
  • I've changed the read function in the control file to rely on this new read_all_to_string function. Since control files are always contained within a page, that read is atomic. The assertion ensures so.

Following your reviews I've changed my code in the following manner:

Copy link
Member

@samoht samoht Oct 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since control files are always contained within a page, that read is atomic

If you assume an underlyingext2 filesystem, that's probably mostly true (the kernel can still decide to interrupt file-system reads). If you run this under NFS (and other network-based filesystems), that's unfortunately not true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case we will need to add a checksum in the control file (cc @art-w). At least this PR fixes the issue for some file systems.

Copy link
Member

@samoht samoht Oct 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea it's great to improve the current situation - it's also nice to be clear about the limitations.

Are the RO instance re-reading the full files on every "sync"?

The only way to get atomic writes in POSIX is via rename. I don't have enough context to know if this could work in this case but we are doing this in various other places of irmin, for instance in ocaml-git when we update references: https://github.com/mirage/ocaml-git/blob/master/src/git-unix/git_unix.ml#L529-L547

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the RO instance re-reading the full files on every "sync"?

Yes.

The only way to get atomic writes in POSIX is via rename.

I see, thanks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you update the comments in the code to reflect that reads might not be atomic - this can be misleading for critical pieces of code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #2107

if nread > 0 then (
Buffer.add_subbytes buf bytes 0 nread;
aux ~off:Int63.(add off (of_int nread)))
in
try
aux ~off:Int63.zero;
Ok (Buffer.contents buf)
Copy link
Contributor

@icristescu icristescu Oct 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe, just to be very clear, add a comment about really_read and the unix_buffer_size (the function only loops for reading buffers with length greater than the page size, hence atomicity guaranteed for one page).

with Unix.Unix_error (e, s1, s2) -> Error (`Io_misc (e, s1, s2))

let read_size t =
match t.closed with
| true -> Error `Closed
Expand All @@ -220,7 +237,6 @@ module Unix = struct

let readonly t = t.readonly
let path t = t.path
let page_size = 4096

let move_file ~src ~dst =
try
Expand Down
3 changes: 3 additions & 0 deletions src/irmin-pack/unix/io_intf.ml
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,9 @@ module type S = sig
t -> off:int63 -> len:int -> (string, [> read_error ]) result
(** [read_to_string t ~off ~len] are the [len] bytes of [t] at [off]. *)

val read_all_to_string : t -> (string, [> read_error ]) result
(** [read_to_string t] is the contents full contents of the file. *)

val read_size : t -> (int63, [> read_error ]) result
(** [read_size t] is the number of bytes of the file handled by [t].

Expand Down