Skip to content
This repository has been archived by the owner on Sep 3, 2021. It is now read-only.

decide what if anything to do about the difference in memstore/idbstore apis #62

Closed
phritz opened this issue Jul 23, 2020 · 10 comments
Closed

Comments

@phritz
Copy link
Contributor

phritz commented Jul 23, 2020

This issue attempts to summarize this meandering discussion from slack re mem/idbstore and locking: https://rocicorp.slack.com/archives/CMQQ9EDML/p1595449965291900

How we got to be talking about this:

  1. we discovered by accident when writing unit tests for dag::has_chunk that memstore wasn't properly isolated: https://github.com/rocicorp/repc/pull/39/files, https://rocicorp.slack.com/archives/CMQQ9EDML/p1594671486124700?thread_ts=1594671475.124600&cid=CMQQ9EDML
  2. i fixed the isolation bug in memstore and wrote a set of tests that verify that and other important behaviors: use a rwlock to protect the kv store #55.
  3. i decided to generalize the above test to run against the Store trait, instead of the memstore impl, so that we could run it against both memstore and idbstore (and any other store we might have). in doing so i discovered that while both memstore and idbstore have mostly the same tx guarantees, they have very different interfaces. specifically, idb enables you to create and send requests to txs before they start executing, whereas memstore wont let you instantiate a tx until it is safe to do so (RAII).
  4. i decided that we want memstore and idbstore to have the same behavior so i added a rwlock to idbstore that's used the same way as memstore uses its. this makes idbstore work like memstore api-wise. the pr is test idbstore against Store trait tests #58 and you can read more of the technical details in this comment https://github.com/rocicorp/repc/pull/58/files#diff-e83d091a2db9fc52230899a70d97e3a4R33
  5. i started talking to nate about this and realized we hadn't really talked it through and that we should not make this decision lightly.
  6. i opened this issue. here we are. welcome!

I suspect we can all agree that memstore and idbstore should, if possible, work the same way for sanity's sake and so our tests are most useful. I think the question is whether to make idbstore work like memstore as in pr 58 above, or to go the other way and make memstore work like idbstore. We are still discovering aspects of this we haven't thought of so please chime in with anything you see. Having said that, here's what I currently see:

Pros for making idbstore use the rwlock:

  • this is a natural model (nate chose it without thinking, as would i have), and we can fit it pretty much over anything
  • it's easy to reason about, easy to observe, and easy to test

Cons for making idbstore use the rwlock:

  • potentially less concurrency if idb can safely run read txs while running write txs Edit: @whiten found that no browsers do this any more
  • idbstore only works like memstore in the context of a single page; it loses these semantics and relies on the underlying idb semantics when viewed across windows
  • unless we add some more code to idbstore we lose any fairness guarantees that idb has (bc we use a rwlock which is not fair). this is an addressable issue: we can queue tx creation requests and serve them according to our own policy.
  • there may be a good reason we are not seeing for the idbstore interface to work like it does. i'm not seeing it given that it's async either way... but maybe we lose something important?

Proposal: this is not the most important thing right now, or even an important thing right now, so let's make idbstore work like memstore by merging PR 58 above so we have a nice simple mental model. we can file issues to track the potential negatives and reconsider in the future.

@whiten did i miss anything?

cc @arv @aboodman

@aboodman
Copy link
Contributor

idbstore only works like memstore in the context of a single page; it loses these semantics and relies on the underlying idb semantics when viewed across windows

EEP. This is the one that gives me pause. Landing #58 doesn't seem like it completely brings idbstore into alignment with memstore in that case.

@aboodman
Copy link
Contributor

I'm not clearly understanding the difference between memstore and idb impls so it's hard to comment. However:

  • I do agree we want them to be the same
  • I do agree we want to share tests
  • I share the preference for simpler, easier to understand, as long as it's not too slow for our constraints (it doesn't seem to be)
  • The cross-tab thing is a big concern for me as it could lead to edge case correctness issues which IMO are even worse than non-edge-case correctness issues. I don't want to add complexity to add edge cases. There are ways to get a mutex cross-tab though which we could consider: https://medium.engineering/wait-dont-touch-that-a211832adc3a
  • Or we could just accept that idb is what we have and make memstore like it

In order to have an opinion on which of these paths to take I need to understand the diff between the two APIs better. Does somebody understand it clearly yet or do we need a project to understand that?

@phritz
Copy link
Contributor Author

phritz commented Jul 23, 2020

diff between the two APIs

https://github.com/rocicorp/repc/pull/58/files#diff-e83d091a2db9fc52230899a70d97e3a4R42 explains the main thing. The other q is whether browsers actually utilize this loophole https://github.com/rocicorp/repc/pull/58/files#diff-e83d091a2db9fc52230899a70d97e3a4R47, if so we potentially suffer some hit to concurrency. what else are you after?

EEP. This is the one that gives me pause.

Note that the inability to enforce mutual exclusion at the idbstore level does not affect correctness. Per the long comment linked to, idb is enforcing the constraints under the hood, just not visibly to us. Adding the rwlock to idbstore makes it work like memstore wrt a single page. It doesn't work like memstore across page w/o a weblocky thing, but that does not affect correctness.

@whiten
Copy link
Contributor

whiten commented Jul 24, 2020

w3c/IndexedDB#253 clearly suggests that today, all browsers enforce that readwrite transactions cannot be started while readonly ones are still running, and this is suggested to be added to the spec. As such, I think @phritz's suggestion to enforce that around our IdbStore is appropriate, and support that being enforced with a RwLock. Thanks for documenting all of this detail @phritz.

@phritz
Copy link
Contributor Author

phritz commented Jul 24, 2020

@aboodman i would like a thumbs up for you on this. Nate thinks it's a fine idea.

Let's not overthink this. In essence all we are talking about is a change in the way the api works, simplified pseudo code:

// How memstore works now.
let tx1 = memstore.write();
let tx2 = memstore.write();  // <-- write() will not return until tx1 commits or rolls back.

// vs how idbstore works now.
let tx1 = idbstore.write();
let tx2 = idbstore.write();  // <-- This call to write DOES NOT BLOCK. 
                             //     It returns a tx that hasn't been started.
                             //     You can queue up queries against it:
tx2.doSomeStuff();  // <-- Queues it up, but doesn't start executing until tx1 commits/rolls back.

The proposal is to make the idbstore work like the memstore, that is, have it asynchronously wait until there are no exclusive txs outstanding before returning a new tx.

@aboodman
Copy link
Contributor

Right but it won't do that across tabs right? Is the main reason to do this so that we can unit test? I guess, I get it but I'm having trouble wrapping my head around what the result of the discrepancy across tabs is.

@aboodman
Copy link
Contributor

In any case: 👍 to not block.

@phritz
Copy link
Contributor Author

phritz commented Jul 24, 2020

Right but it won't do that across tabs right?

correct

Is the main reason to do this so that we can unit test?

the reasons are 1) so we have a clear mental model that is consistent across both idbstore and memstore (no mental gymnastics to keep two ways things work in your head) and 2) so that when we write tests with memstore they are meaningful (because idbstore works the same way).

I guess, I get it but I'm having trouble wrapping my head around what the result of the discrepancy across tabs is.

if a mutually exclusive tx is open in another tab then in this tab a call to open a read or write tx will return immediately making you think the tx is running when in fact it hasn't started yet.

@aboodman
Copy link
Contributor

aboodman commented Jul 25, 2020 via email

@phritz
Copy link
Contributor Author

phritz commented Jul 30, 2020

implemented by #58

@phritz phritz closed this as completed Jul 30, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants