irmin-pack: Strengthen crash and SWMR consistencies #1690

Closed
wants to merge 1 commit

Conversation

Ngoguey42 (Contributor)

Should it be mentioned in the changelog? Under bugfix?

@Ngoguey42 Ngoguey42 force-pushed the more-swmr-consistency branch 4 times, most recently from 0f52db9 to a653481 Compare December 29, 2021 15:37
@@ -17,7 +17,7 @@
 module type S = sig
   include Irmin.Atomic_write.S

-  val flush : t -> unit
+  val sync : t -> unit
Member

I think it's better to keep it calling flush for write instances.

Ngoguey42 (Contributor Author), Dec 29, 2021

I've removed flush and added sync.

Flush was a no-op because set_entry always flushes at the end. https://github.com/mirage/irmin/blob/main/src/irmin-pack/atomic_write.ml#L82

Sync was automatically called before each unsafe_find, which is unsafe because it may pick up branches that point to a non-flushed index.
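
A minimal sketch of these two points, with hypothetical simplified names rather than the actual atomic_write.ml code: set already flushes on its own, and an implicit sync before every read can surface branches whose commits have not reached the index yet.

let set t k v =
  unsafe_set t k v;
  (* set_entry already ends with a flush, so a separate [flush t] was a no-op *)
  IO.flush t.block

let find t k =
  (* the implicit [sync t] that used to run here could load a branch written
     by the RW instance before the corresponding index flush, i.e. a branch
     pointing to a commit the RO instance cannot resolve yet *)
  unsafe_find t k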

Contributor

Agree with flush, good catch.

For the sync, do you mean that the index/pack store is maybe not flushed when the branch store is? I'm wondering if the problem isn't that set does not force the pack file to flush (which would in turn force the index).
set calls W.notify. I'm not sure how watches are used, but don't we need to guarantee that both the branch tag and the commit reach the disk? (cc @zshipko).

Contributor

Right now irmin-pack watches don't listen for changes on disk (which is something I'd like to improve at some point). Watchers are notified directly when the process in which the watch was created performs a set/remove operation.
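
A minimal sketch, with hypothetical simplified names, of the behaviour described above (W.notify is the call mentioned earlier in this thread): watchers are invoked in-process from the write path, they do not observe the files on disk.

let set t branch hash =
  unsafe_set t branch hash;              (* write + flush the branch file *)
  W.notify t.watches branch (Some hash)  (* direct, in-process notification *)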

Contributor

In general, we use some words in unusual ways, and I would say that sync is one such word. It tends to be linked in a programmer's mind to fsync. However, in our codebase, sync tends to be a special function used by RO instances to detect changes in the underlying files and somehow force a reload of the in-memory structures. So the flow of information is roughly the opposite of a normal fsync. In any case, I think we need to clearly document what "sync" is and what it is supposed to do. If it is for RO instances, we should make this clear. Then we can decide whether the implementation actually does what it is supposed to do.
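
To make the distinction concrete, a minimal documentation sketch (not the actual irmin-pack interface) of the RO-oriented meaning of sync described above:

module type Read_only_sync = sig
  type t

  (* [sync t] is for read-only instances: it detects changes made to the
     underlying files by the read-write instance and reloads the in-memory
     structures accordingly. It does not push any data to disk; that is the
     role of [flush] on the read-write instance. *)
  val sync : t -> unit
end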

samoht (Member), Mar 18, 2022

Should we use push/pull instead of flush/sync? or flush/fetch?

Ngoguey42 (Contributor Author)

I find push, pull and fetch too strongly associated with interactions with a remote server.

flush/reload?

[clear] the order should be: branch, index, pack and dict.

In case an rw instance concurrently flushes, the [sync] the order
should be: branch, index, pack and dict.
Member

I don't understand why sync has to be done in reverse order. Why is this different from clear and flush?

Member

Actually, that's a very good point, read-only sync should be done in reverse order and I believe we are not doing this properly at the moment.

Ngoguey42 (Contributor Author), Dec 29, 2021

It is flush that is done in the opposite order to clear and sync.

I can attempt another explanation. Consider this dependency graph:

(top)
branch
  |
  v
index
  |
  v
pack
  |
  v
dict
(bot)

crash during flush

flush top to bot (not this PR): RO gets dangling dict references

RW: flush branches/index/pack  <crash>    
RO:                                     sync all

flush bot to top (this PR): RO gets no dangling dict references

RW: flush dict  <crash>   
RO:                      sync all

concurrent sync (RW hangs) during flush

flush top to bot (not this PR): RO gets dangling dict references

RW: flush branches/index/pack            flush dict
RO:                            sync all

flush bot to top (this PR): RO gets no dangling dict references

RW: flush dict            flush index/pack/branches
RO:             sync all

concurrent sync (RO hangs) during flush

sync bot to top (not this PR): RO gets dangling dict references.

RW:            flush all
RO: sync dict             sync index/pack/branches

sync top to bot (this PR): RO gets no dangling dict references.

RW:                           flush all
RO: sync branches/index/pack             sync dict                    

crash during clear

clear bot to top (not this PR): RO gets dangling dict references

RW: clear dict  <crash>
RO:                      sync all

clear top to bot (this PR): RO gets no dangling dict references

RW: clear branches/index/pack  <crash>
RO:                                     sync all

concurrent sync (RW hangs) during clear

clear bot to top (not this PR): RO gets dangling dict references

RW: clear dict            clear pack/index/branches
RO:             sync all

clear top to bot (this PR): RO gets no dangling dict references

RW: clear branches/index/pack            clear dict
RO:                            sync all

concurrent sync (RO hangs) during clear ⚠️ ⚠️

sync bot to top (not this PR): RO gets no dangling dict references ⚠️ ⚠️

RW:            clear all
RO: sync dict             sync index/pack/branches

sync top to bot (this PR): RO gets dangling dict references ⚠️ ⚠️

RW:                           clear all
RO: sync branches/index/pack             sync dict
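
To summarise the argument above as code, a minimal sketch with hypothetical Branch/Index/Pack/Dict modules (not the actual irmin-pack store): flush goes from the bottom of the dependency graph up, while sync and clear go from the top down, so that a crash or a concurrent RO sync never yields a dangling dict reference, except for the clear-vs-sync race flagged with the warnings above.

let flush_all t =
  (* bottom to top: referenced data reaches disk before the data referencing it *)
  Dict.flush t.dict;
  Pack.flush t.pack;
  Index.flush t.index;
  Branch.flush t.branch

let sync_all t =
  (* top to bottom: the referencing level is never newer than the level it points to *)
  Branch.sync t.branch;
  Index.sync t.index;
  Pack.sync t.pack;
  Dict.sync t.dict

let clear_all t =
  (* top to bottom: a crash mid-clear never leaves a branch pointing to cleared data *)
  Branch.clear t.branch;
  Index.clear t.index;
  Pack.clear t.pack;
  Dict.clear t.dict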

Ngoguey42 (Contributor Author)

I now believe that sync can't be consistent with both clear and flush.

samoht (Member), Dec 29, 2021

It is flush that is done in the opposite order to clear and sync

this doesn't sound right: clear and flush are write operations, they should be done in the same order. Sync is a read operation and should be done in the reverse order to be consistent.

Ngoguey42 (Contributor Author), Dec 30, 2021

clear isn't a regular write operation; it is the only operation that removes data from the store. If writing and reading should have opposite orders, so should adding and removing data.

Changing clear to use the same order as flush wouldn't change the fact that sync can't be consistent with both clear and flush at the same time. Clearing from top to bottom has the advantage of being crash consistent.

Contributor

Thanks @Ngoguey42 for this analysis, I agree that sync cannot be consistent with both clear and flush.

But we don't support clear anymore; the only place it is still used is when opening an RW instance with fresh:true while an RO instance is running (which is anyway not very well supported, see #955). So I propose we move the comments about clear into the pack_store (where it is used) and not deal with the clear/sync issues for now. Wdyt?

src/irmin-pack/ext.ml (outdated; resolved)
@Ngoguey42 Ngoguey42 added this to the 3.0.0 milestone Jan 5, 2022
@Ngoguey42 Ngoguey42 mentioned this pull request Jan 5, 2022
let former_offset = IO.offset t.pack.block in
let offset = IO.force_offset t.pack.block in
if offset > former_offset then (
  Dict.sync t.pack.dict;
  Index.sync t.pack.index)
Contributor

this was to prevent an unnecessary IO for index - which I think needs to read the offset from disk to determine if it needs to sync. But if we don't update the pack, there is no need to do so for the index, no?

Comment on lines 521 to 525
(* It is expected that a flush was performed before [close]. Otherwise
data is lost. (ext takes care about it).

This function does not flush because index is expected to already be
closed. *)
Contributor

Suggested change
- (* It is expected that a flush was performed before [close]. Otherwise
-    data is lost. (ext takes care about it).
-
-    This function does not flush because index is expected to already be
-    closed. *)
+ (* This function does not flush because: 1/ index is expected to already be closed and 2/ the pack file was flushed upstream by [ext]. *)

@icristescu icristescu added this to In progress in Irmin - Roadmap Jan 12, 2022
@icristescu icristescu removed this from the 3.0.0 milestone Jan 20, 2022
@@ -93,9 +93,9 @@ module Make (IO : IO.S) : S = struct
    t

  let close t =
    (* If called from ext, flush was already called *)
Contributor

I would say: it is much better if code correctness does not depend on how it is used by others; after all, we don't know how others will use the code in the future, and verifying that it is used in a particular way can be a very expensive operation requiring traversing the entire codebase. So I would prefer correctness arguments that can be made "locally" if possible.

In this particular case, my expectations surrounding close are: all associated caches are expunged, the underlying file descriptor is closed, etc. Now, this implementation of close doesn't do that. It only does something if open_instances becomes 0. It doesn't even cause a flush of the IO. So this is already an extremely hard function to reason about. Again, I think writing some clear documentation about what "close" is supposed to do, in dict_intf.ml, will show up all sorts of problems.

Aside: I really don't like the caching of IO instances in the code, for various reasons that I can maybe explain sometime.
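
As a concrete form of this suggestion, a documentation sketch for dict_intf.ml of the kind of local contract described above; this is what one might want close to guarantee, not what the current implementation does:

val close : t -> unit
(* [close t] releases the resources held by [t] once the last open instance
   is released: associated in-memory caches are dropped and the underlying
   file descriptor is closed. [close] does not flush; callers must flush
   before closing or accept data loss. Using [t] after [close] is a
   programming error. *)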

- the dict references nothing.

In case of crash (or an ro instance that concurrently syncs), the
[clear] the order should be: branch, index, pack and dict.
Contributor

What is "the [clear] the order"? Do you mean: in order to avoid problems when the system crashes, the order should be: ...? And then, shouldn't we flush the dict first, then the pack, etc.?

Contributor

BTW I really appreciate that you tried to clarify things with this documentation. I think we really need more documentation, especially in areas where there is "tricky" code.

In case of crash (or an ro instance that concurrently syncs), the
[clear] the order should be: branch, index, pack and dict.

In case an rw instance concurrently flushes, the [sync] the order
Contributor

What does "the [sync] the order" mean? I don't understand this part. As above, if A references B, then doesn't B need to flush before A, to avoid any "dangling" references from A?

In case of new data in the 4 components that reference one another
and in case of crash (or an ro instance that concurrently syncs),
the [flush] order should be: dict, pack, index and branch.

Contributor

I agree that this is the right order, given what you said earlier, if we want to avoid dangling references. But don't we also need to make an argument about async flushes etc? Maybe each component is periodically writing data out anyway (which then asynchronously flushes to disk), and don't these need to be co-ordinated too, not just what happens on sync/flush?

tomjridge (Contributor)

I added a few comments, but I would say that I don't really have a good feel for what is going on. There are 4 components involved: branch, index, pack and dict. It is not clear to me what guarantees each component gives as far as syncing data to disk is concerned: if each can write data out to disk asynchronously, then things are trickier than if they only write out on sync/flush etc. Looking at the simplest component, dict, the code uses the interface IO.S, and when I tried recently to document this interface, it seemed to me that there were quite a few buggy corner cases. So it is hard to be confident about code that uses an IO. And so on.

@@ -127,6 +127,8 @@ module Maker (Config : Conf.S) = struct
     let branch_t t = t.branch

     let batch t f =
+      let readonly = Conf.readonly t.config in
+      if readonly then failwith "Can't batch an RO pack store";
Contributor

At various points, some particular exception is thrown "RO_not_allowed" or similar. Perhaps we should standardise on that? In general, if a function is intended for RW only, why not add a dummy argument batch ~rw_only:() t f to mark that the interface should not be called in RO instances? Similarly ~ro_only could mark those interfaces (sync?) that are for RO instances only.
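
A minimal sketch of both conventions suggested above, assuming a shared RO_not_allowed exception along the lines of the one already raised elsewhere in the codebase; the labelled unit arguments are purely illustrative:

exception RO_not_allowed

(* RW-only entry point: the labelled unit argument documents at every call
   site that this must not be used on an RO instance, and the check raises
   the shared exception instead of an ad-hoc [failwith]. *)
let batch ~rw_only:() t f =
  if Conf.readonly t.config then raise RO_not_allowed;
  f t

(* A [~ro_only:()] argument could mark RO-only entry points such as [sync]
   in the same way. *)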

samoht (Member) commented Mar 7, 2022

@tomjridge the way the components interact is described in https://tarides.com/blog/2020-09-01-introducing-irmin-pack - do you have a more precise question in mind? Happy to answer those of course.

Regarding:

there were quite a few buggy corner cases

Could you point out which corner cases? It would be useful to open issues about those so that we don't forget, and even better to write small reproduction tests so that we can make sure any fix is definitive.

More generally, it would be helpful to have a more systematic way to address those issues, for instance using something like https://github.com/CraigFe/ocaml-fiu

tomjridge (Contributor)

Regarding:

there were quite a few buggy corner cases

Could you point out which corner cases? It would be useful to open issues about those so that we don't forget, and even better to write small reproduction tests so that we can make sure any fix is definitive.

More generally, it would be helpful to have a more systematic way to address those issues, for instance using something like https://github.com/CraigFe/ocaml-fiu

I was referring to the IO_intf documentation PR #1758 , particularly the comments on line 92 and following where there seems to be at least the possibility of a bug, and probably more than one (at least, I also hit another problem with this code when trying to read the full data in the file - if the file size is reported as n bytes, you can actually only read n-16 bytes, which causes issues with a simple copy loop where you expect to read n bytes). And if we wrote down the actual invariants this code was supposed to satisfy, I think we would find some more. I think we are still processing this PR - ie we are tracking the issue and not just dropping it on the floor.

samoht (Member) commented Mar 8, 2022

particularly the comments on line 92 and following where there seems to be at least the possibility of a bug,

Line 92 is truncate, which is used by clear, which is not used by Tezos (and so far only used by the tests), so it's not critical, it seems?

tomjridge (Contributor)

particularly the comments on line 92 and following where there seems to be at least the possibility of a bug,

Line 92 is truncate, which is used by clear, which is not used by Tezos (and so far only used by the tests), so it's not critical, it seems?

Yes precisely that! So Craig and I were not overly worried even if there was a bug there. And I think clear/truncate is maybe going away anyway so perhaps the whole thing sorts itself out. At least, it seemed sensible not to jump immediately to bug fighting mode.

@icristescu icristescu mentioned this pull request Mar 17, 2022
Ngoguey42 (Contributor Author)

Superseded by #1865
