Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

keep tabs in sync via broadcast channel #56

Merged
merged 1 commit into from
Nov 21, 2022
Merged

Conversation

tantaman
Copy link
Collaborator

tabs.mov

@tantaman
Copy link
Collaborator Author

tantaman commented Nov 21, 2022

@sgbeal (and @rhashimoto) -- you might be interested in this.

The video in the PR description above is using wa-sqlite in two tabs pointed to the same DB. A broadcast channel tells the tab to re-query when and update is made in another tab. Things work seamlessly.

The video below is using the official SQLite wasm build.

official.mov

The official build fails pretty much immediately due to a lock still being held by the write in the other tab.

@tantaman tantaman merged commit f52190f into main Nov 21, 2022
@tantaman tantaman deleted the broadcast-channel branch November 21, 2022 22:41
@sgbeal
Copy link

sgbeal commented Nov 22, 2022

We've improved that a great deal in the past 24-ish hours but i honestly don't currently have any clue how we can improve it more. i'll study Ron's code again to see what he's done in terms of obtaining sync handles as time allows.

@sgbeal
Copy link

sgbeal commented Nov 23, 2022

Follow-up: i've looked back into wa-sqlite and see that one of the reasons that API can offer better concurrency is that, for example, if a read request comes in but the VFS has no lock for that file, it's reading that file without having a lock:

https://github.com/rhashimoto/wa-sqlite/blob/master/src/examples/OriginPrivateFileSystemVFS.js

  xRead(fileId, pData, iOffset) {
...
      if (fileEntry.accessHandle) {
        nBytesRead = fileEntry.accessHandle.read(pData.value, { at: iOffset });
      } else {
        // Not using an access handle is slower but allows multiple readers.
        const file = await fileEntry.fileHandle.getFile()
        const blob = file.slice(iOffset, iOffset + pData.value.byteLength);
        const buffer = await blob.arrayBuffer();
        pData.value.set(new Int8Array(buffer));
        nBytesRead = Math.min(pData.value.byteLength, blob.size);
      }
...
  }

That is not, IMO, a solid solution. It is possible for a concurrent writer to modify that data while it's being read that way. The MDN docs say:

If the file on disk changes or is removed after this method [FileSystemFileHandle.getFile()] is called, the returned File object will likely be no longer readable.

The operative word being "likely," i.e. implicitly unknown/undefined. That is definitely not an approach i'm comfortable taking.

Looking closely at the file-related APIs, File.size returns the size of a file, just like OPFS's FileSystemSyncAccessHandle.getSize()1, but the latter requires a write lock. The fact that there's a different API for that ostensibly same method strongly implies that there's a low-level functional difference between how File and FileSystemSyncAccessHandle model a file's size. There's nothing in the API which prohibits a File object from being modeled as an in-memory blob (File derives from Blob, which implies that that's the case, but no docs i've found come out and say it is or isn't), as opposed to an on-storage object, the implication of which is that the approach of the xRead method above may well read the whole file into memory just to slice out a single db page. Even if that's not the case (which we don't know because the JS API docs are not detailed enough to tell us), switching between two different models of the filesystem, as wa-sqlite is doing, feels fragile to me.

i've looked into Web Locks as an alternative locking approach but they appear to buy us nothing and add yet another layer of locking. Since we require an OPFS lock (i.e. sync access handle) to do any writing or size-fetching, having a Web Lock to also block access on seems like it would add complexity and a performance hit for no functional benefit. i'm open to being convinced otherwise, though.

The OPFS folks are currently looking into a locking API more akin to Posix locks. If and when they introduce that, we'll use it to improve the concurrency situation, but i'm not currently convinced that following wa-sqlite's lead on bypassing OPFS's API for read operations is a viable solution for us.

Footnotes

  1. https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/modules/file_system_access/file_system_sync_access_handle.idl

@rhashimoto
Copy link

Follw-up: i've looked back into wa-sqlite and see that one of the reasons that API can offer better concurrency is that, for example, if a read request comes in but the VFS has no lock for that file, it's reading that file without having a lock: [...]

Wow. So much disinformation. And let's just assume wa-sqlite is "not IMO a solid solution" and "fragile" upon finding something that doesn't make sense to you, instead of asking how that actually works or alerting me directly to what would be a major bug.

i'm open to being convinced otherwise, though.

I really wonder, TBH.

@tantaman
Copy link
Collaborator Author

Let's try to stay productive.

asking how that actually works or alerting me directly to what would be a major bug.

What's your take so far? Is it a bug or something that works and requires further explanation?

If the weblock is held while reading (thus preventing a write) then it sounds like this wouldn't be problematic.

@tantaman
Copy link
Collaborator Author

Also @sgbeal I appreciate the thorough answer with references.

My current focus is on my extension to SQLite but, once that wraps up, I'll have some time to help dig into this specific issue.

@sgbeal
Copy link

sgbeal commented Nov 23, 2022

@rhashimoto my sincere apologies if my post was interpreted as an attack or FUD. i am always open to being taught the error of my ways. Can you please, for example, explain how the above xRead() is legal without a sync access handle in a multi-tab/multi-worker environment? My reading of both the code and the MDN docs is that it is that a writer may modify that file while the read is going on, leading to, insofar as i can determine, undefined results. If that's not the case, i'd love to understand why so that i can consider an equivalent solution for our project's xRead().

@sgbeal
Copy link

sgbeal commented Nov 23, 2022

And for what it's worth, @rhashimoto, you are undeniably my senior when it comes to the sqlite3 VFS and cutting-edge JS. i'm not here to try to teach you anything. On the contrary, i study your code to learn.

@rhashimoto
Copy link

@sgbeal The words are nice but they don't match the behavior. This is not the only case where you use a hot take based on a shallow (at best) investigation to denigrate another project. Consider this on the SQLite website:

James Long's aptly-named absurd-sql demonstrates persistent browser-side sqlite3 by storing the databases in IndexedDB storage. We experimented with that approach but it's... well, absurd 😉. It's too slow for anything beyond trivial databases and it's easy to get a database in an inconsistent state by visiting the page from two tabs.

I think you admitted that your experiment wasn't anything close to the absurd-sql approach. However, apparently the official SQLite position remains that the absurd-sql approach is too slow for real databases (!) and vulnerable to multi-tab corruption? Granted, as it turns out there are reports of absurd-sql corruption with multiple tabs, but (1) I think this is coincidental to your findings (as you don't recall implementing any explicit locking), and (2) a bug that can be fixed doesn't invalidate the entire approach. Choosing not to use IndexedDB is fine; tagging absurd-sql on this "evidence" in order to justify that decision is not.

As an apology for the rant in your space, @tantaman, I will answer the question in a separate message so you can edit/hide/delete this one as you see fit.

@sgbeal
Copy link

sgbeal commented Nov 24, 2022

@rhashimoto you may or may not care to hear that i've toned down the absurd-sql attribution you referred to. My unashamed personal dislike of IndexedDB does not reflect an "official stance" of the sqlite project and i'll make a point of avoiding strongly-worded/contentious personal takes in the documentation. i look forward to being corrected on the "unlocked xRead()" point.

@rhashimoto
Copy link

rhashimoto commented Nov 24, 2022

First, let's get a few things straight. I'm not an expert in Javascript or SQLite. I think it's quite likely that I've written more SQLite VFSs for the browser (8+ and counting) than anyone else, but a big part of the reason for that is some of the early ones were so terrible. The current ones may be still be terrible in all new and different ways. So while I do have experience earned through bugs and mistakes, no one should be dissuaded from going a different way. In fact, enabling people to easily try different things is the primary reason wa-sqlite exists; it wasn't intended to support production apps as-is, other than my own.

Second, @tantaman are you really using the wa-sqlite OriginPrivateFileSystemVFS in your app? If you are, I probably wouldn't - I consider IDBBatchAtomicVFS to be easier to integrate, higher performance, and more mature, basically everything that matters to an app writer. If you aren't, then we need to keep in mind that it's not a real apples-to-apples comparison with the Official SQLite WASM (OSW from now on) for purposes of analyzing the OPFS implementation. Right now it seems like you're saying that wa-sqlite is working fine with OPFS and OSW is not; I want to know whether that is actually the case before I make any ridiculous claims.

@sgbeal
Copy link

sgbeal commented Nov 24, 2022

If you aren't, then we need to keep in mind that it's not a real apples-to-apples comparison with the Official SQLite WASM (OSW from now on) for purposes of analyzing the OPFS implementation.

Absolutely. The OPFS-specific details are of course similar but the synchronous-interface-behind-an-async-impl bit is apples and oranges.

Right now it seems like you're saying that wa-sqlite is working fine with OPFS and OSW is not; I want to know whether that is actually the case before I make any ridiculous claims.

At the time this post was started, the claim that OSW was utterly failing in terms of OPFS concurrency was absolutely correct! Even getting two connections going was unreliable. We've since created a multi-worker test app which has allowed us to make some headway in improving that:

https://wasm-testing.sqlite.org/tests/opfs/concurrency/

(no guarantees on the stability of that URL and it will only work with an OPFS-capable browser)

As it currently stands, our extremely basic tests are reliably hosting up to 3 handles to the same db (one per worker), each one performing a tiny bit of work every 750-1000ms, but they get flaky (==locking-triggered errors) quickly as more workers are added. By making one particular concurrency-over-speed tweak we can about double that, but that's a measure of last resort because it cuts the runtime speed of high-I/O benchmarks by about 4x. Our standard benchmarking app is a wasmified build of sqlite's own "speedtest1" app:

https://wasm-testing.sqlite.org/speedtest1-worker.html?vfs=opfs&size=25

(same link disclaimer as above)

How that compares to wa-sqlite, i cannot say and have no basis for speculation. It would surprise me if wa-sqlite's "concurrency cap" is not significantly higher.

i'm still studying your Web Locks impl to try to get my head around it well enough to figure out if a similar approach will help us (as it demonstrably has wa-sqlite), but my brain still struggles with Promise-heavy APIs for whatever reason.

@rhashimoto
Copy link

rhashimoto commented Nov 26, 2022

I looked through the source tree and I don't see any references to OriginPrivateFileSystemVFS, so I'm going to assume it's IDBBatchAtomicVFS running in the video (I also see @tantaman making quasi-legal arguments to justify circumventing the spirit of the GPL so maybe we should discuss that sometime). Both OriginPrivateFileSystemVFS and IDBBatchAtomicVFS use Web Locks so that part is the same, but we can't rule out that OriginPrivateFileSystemVFS might not behave as well, e.g. there might be some issues with the browser implementation of OPFS which would show up with both libraries. As I mentioned before, there's no good reason for app writers to use OriginPrivateFileSystemVFS, so I don't think it's been exercised well (or maybe even at all) outside of my own limited tests. It's something to keep in mind.

If the weblock is held while reading (thus preventing a write) then it sounds like this wouldn't be problematic.

Yes, this is the essential idea. Unless someone explicitly disables locking, SQLite will issue xLock/xUnlock calls as needed to protect the database against inconsistent reads and writes.

if a read request comes in but the VFS has no lock for that file, it's reading that file without having a lock: [...]

That is not, IMO, a solid solution. It is possible for a concurrent writer to modify that data while it's being read that way.

This is not the case. SQLite allows two kinds of files to be opened by multiple connections at the same time, main database files and main journal files, and they both are protected by locking the main database file. So before we start talking about OPFS at all, SQLite will not call xRead, xWrite, xFileSize, xTruncate, etc. unless it is safe to do so, i.e. the file is used by a single-connection (typically a uniquely named temporary file) or within the scope of a lock, a shared lock for reading or an exclusive lock for writing.

i've looked into Web Locks as an alternative locking approach but they appear to buy us nothing and add yet another layer of locking. Since we require an OPFS lock (i.e. sync access handle) to do any writing or size-fetching, having a Web Lock to also block access on seems like it would add complexity and a performance hit for no functional benefit.

It is true that Web Locks does add another layer of locking on top of OPFS. It is not true that there is no functional or performance benefit. Web Locks is a superior API for locking because it queues your locking request and waits for the lock to become available. In contrast, OPFS either gives you the lock or throws an exception (within the returned Promise). This makes it a lot more difficult to maintain performance and avoid starvation.

Let's look at what happens with OSW when we try to start a transaction when another connection has the lock. Here's the handle acquisition code, which doubles as the locking code:

https://github.com/sqlite/sqlite/blob/902ea8392536b729c4c4553613dae9f65a4c1bf3/ext/wasm/api/sqlite3-opfs-async-proxy.js#L253-L274

At t=0 we call OPFS await createSyncAccessHandle() (line 260). When another connection has an access handle, this throws and we sleep until a retry signal or timeout on line 272. Perhaps I'm missing it, but I don't see anything that actually issues retry - if I'm not wrong about that, we'll try again at t=600, t=1500, and t=2700 until we succeed or give up. If the other connection happens to release its handle right before we wake up then that's great but that would be really lucky. It is much more likely that the other connection will release the access handle at some time other than right at the end of our sleep.

What if the other connection releases the handle in the middle of our sleep, say at t=300 for purposes of discussion? Two things can happen. The first is that some other connection, perhaps the same connection that just released the handle (fairly likely), jumps in and grabs the handle again, which is unfair if we had made our request first. The second is we get the handle when we try again at t=600, which means the database has been sitting idle for 300 ms while we were waiting to use it. Neither of these things are good - one contributes to starvation and the other one saps performance.

I think that you could make this work if that retry is in fact used; I just didn't see it being set or reset anywhere. If it is being used you will get spurious wake-ups if you have more than two-way contention (my understanding is Atomics.notify wakes up everyone), not horrible but not preferable. Actually, I'm not certain that it isn't always 0, in which case it would be spinning instead of sleeping, also not great. Perhaps @sgbeal can clarify.

If you use Web Locks to implement xLock/xUnlock then the request waits on the lock - there is no idle time, no spinning, no spurious wake-ups, and no starvation. Whatever connection has the lock has the right to xRead, xWrite, or whatever (depending on the type of lock). Getting an access handle should always succeed, except for actual errors, at least in the default journal_mode (we'll talk about other modes in a bit) so the cost of that extra layer of locking should be negligible (because acquiring a lock that is available is the path that implementations optimize) and you're not using exceptions to handle normal events. In addition, it allows richer locking semantics than access handles provide by themselves - you can implement the full SQLite locking model, including shared read locks. It's also a lot easier to work with and reason about Web Locks than OPFS locks, especially because you can dump the state of all your locks (with LockManager.query) when debugging concurrency.

On to the specifics of OriginPrivateFileSystemVFS. It implements xLock/xUnlock using Web Locks (via a helper class, WebLocksExclusive) as described. There is a drop-in replacement helper class, WebLocksShared, which allows shared read locks. In addition, xLock acquires an OPFS access handle when it gets an exclusive lock and xUnlock releases the handle when it loses the lock (bug here, should be !== VFS.SQLITE_LOCK_EXCLUSIVE instead). This manages the writable access handle for main databases. Reads of main databases are done via the File interface if no access handle exists (i.e. if the database is not locked for write), which is slower but allows multiple readers (if WebLocksShared is used).

switching between two different models of the filesystem, as wa-sqlite is doing, feels fragile to me.

This is not a problem. The locking guarantees that only one model is being used at a time, and if you look carefully a new File object is created when needed. The use of File does reduces performance on a single connection. That was an implementation decision because as a proof of concept experiment, enabling concurrent read transactions was more interesting to me than optimizing for single connection speed. Ideally a production VFS could have the best of both worlds by switching the behavior based on the locking_mode pragma (pragma settings are delivered via xFileControl).

For files that are not a main database, the OPFS access handle is acquired in xOpen and released in xClose. There is a bug in OriginPrivateFileSystemVFS here: this works for main database journals with the default journal_mode, DELETE, but not with TRUNCATE or PERSIST (which may be desirable for better performance). When the journal_mode is TRUNCATE or PERSIST, multiple connections can open the journal file so this will result in an SQL_IOERR for all but the first connection to open it. How should this bug be fixed?

OSW handles this by releasing the handles after the file system is idle for some period of time. This works, but I don't like it. When you switch to a waiting connection, the database will be idle for the timeout duration so that saps performance. And how do you tune it? My 8-year-old laptop can execute a transaction in ~20 ms - who's to say it's reasonable to idle for 150 ms? It's not obvious to me that there is a reasonable one-size-fits-all timeout. It might depend on CPU speed, disk speed, browser, the queries, etc. I think a better solution is out there.

My preferred fix would be to acquire the handle for a main database journal lazily as OSW already does, and to release the handle in xUnlock on the database file or xClose on the journal file. @sgbeal considers this be an ugly hack. I don't because the journal naming scheme is documented and the xOpen flags identify the type of file (so you can't be fooled by a database file named "foo-journal", for example). Perhaps someone could ask a SQLite architect for an opinion. It might also be possible to get the association between database and journal names in the C VFS and pass that to Javascript; I believe both names are available in the database pager.

Another alternative fix would be to use BroadcastChannel to signal a VFS to release a journal access handle. BroadcastChannel is another of those APIs that are new but supported everywhere OPFS is. A connection that needs the access handle can request other connections with the journal file open to release the handle if they have it. I would guard the access handle possession with a Web Lock, not so much because it needs to be locked but because it's an efficient way to wait on the access handle availability.

Either of these fixes should keep the database from being idle before acquiring a journal access handle in PERSIST or TRUNCATE mode. I expect the overhead to be small, much less than the timeout, and as a bonus with a single connection the journal access handle would never need to be reacquired.

@tantaman
Copy link
Collaborator Author

tantaman commented Nov 26, 2022

I also see @tantaman making quasi-legal arguments to justify circumventing the spirit of the GPL

Yeah, I agree it is against the spirit but maybe or maybe not the letter. In either case, I'm not the best person to weigh in on it. I can remove that text and indicate that that package is GPL (already done here -- https://github.com/vlcn-io/cr-sqlite/blob/main/js/wasm-esm/wa-crsqlite/README.md). Also, given strut does make direct use of wa-sqlite (and not one of the other db connectors) I can re-license that project to GPL.


Web Locks is a superior API for locking because it queues your locking request and waits for the lock to become available

Given web locks is async would using them would require sqlite to be rebuilt with asyncify?

Another alternative fix would be to use BroadcastChannel to signal a VFS to release a journal access handle. BroadcastChannel is another of those APIs that are new but supported everywhere OPFS is

Interesting idea. I'll have to study the VFS code a bit more.

I don't because the journal naming scheme is documented and the xOpen flags identify the type of file

FWIW, I believe fly.io makes extensive use of knowing the names sqlite uses for WAL files (not journal files but similar in spirit) for their litestream project.

@rhashimoto
Copy link

rhashimoto commented Nov 26, 2022

Given web locks is async would using them would require sqlite to be rebuilt with asyncify?

No. OSW already uses asynchronous APIs, including OPFS because only a small subset of its methods are synchronous. OSW (like absurd-sql) keeps the WASM synchronous by using SharedArrayBuffer to bridge that WASM Worker to another Worker where asynchronous calls (like OPFS or Web Locks or anything really) can be made.

The OPFS spec has recently been changed to make more of the OPFS API methods synchronous - it used to be just read and write but soon will include file size, truncate, etc. However, it looks like opening an OPFS access handle will remain asynchronous so the need for either SharedArrayBuffer or Asyncify is not going away in the near future. WASM Promise integration seems to be the most, um, promising way to be rid of both.

@sgbeal
Copy link

sgbeal commented Nov 26, 2022

Wow, that's a lot to digest. A few notes before i have to get back to packing:

  • Web Locks: we (OSW) didn't bother to look into Web Locks initially because they're not portable yet. It wasn't until a week or two ago that an anonymous forum poster pointed out that they're everywhere OPFS is that Web Locks began to make sense. The current wait/retry approach is most definitely not ideal - it was a quick and easy solution which fit within the overall infrastructure and is not a solution we're emotionally tied to.

  • The sqlite3_vfs locking behavior you describe is very helpful info and i'll re-do the xRead() to do the "slow unlocked read" when called without an explicit xLock(). If that doesn't absolutely tank performance (and i don't think that it will), that would be a big improvement right off the bat. IIRC most of the lock-related failures in my testing are either in xRead or xOpen.

There is a bug in OriginPrivateFileSystemVFS here: this works for main database journals with the default journal_mode, DELETE, but not with TRUNCATE or PERSIST (which may be desirable for better performance). When the journal_mode is TRUNCATE or PERSIST, multiple connections can open the journal file so this will result in an SQL_IOERR for all but the first connection to open it. How should this bug be fixed?

Aha! That explains OWS's immediate contention when attempting that same trick. i don't have an answer to that final question but just FYI: our experimentation shows TRUNCATE to be the fastest mode on OPFS, so we try to default OPFS connections to that mode (but that really depends on how the client opens the db). In our I/O-heavy benchmarking (the speedtest1 app), the speed difference between TRUNCATE and DELETE is visible to the human eye, without bothering to time it. PERSIST mode is new to me so i can't say anything about that.

OSW handles this by releasing the handles after the file system is idle for some period of time. This works, but I don't like it.

FWIW, i don't either (and i'm the one who implemented it) ;). It's a stopgap measure until we can find something better. To that end...

We now, in a branch, have a new OPFS-specific URI flag which will tell it to release "implicit locks" at the end of the operation which acquires them. (That will be merged into trunk sometime today or tomorrow.) This approximately doubles our "reliable" concurrency levels in very basic tests but has a huge performance hit on I/O-heavy tasks (taking as much as 400% longer to run with zero contention). That flag eliminates the idle-time closing of sync handles but the performance hit of reacquiring the sync handle for every op which uses it sucks. (Sidebar: we had very specific performance requirements going into this, imposed by an Influential Third Party, and we shamelessly optimized to hit those. Resolving contention was a secondary concern which currently has priority.)

My preferred fix would be to acquire the handle for a main database journal in xOpen on the journal file if the main database is locked and in xLock on the database file, and to release the handle in xUnlock on the database file or xClose on the journal file. @sgbeal considers this be an ugly hack. I don't because the journal naming scheme is documented and the xOpen flags identify the type of file (so you can't be fooled by a database file named "foo-journal", for example).

i do consider it ugly but admittedly only on the grounds of aesthetics. i have no compelling technical argument against it and wouldn't lose too much sleep over implementing something like that. Right now the OWS data structure doesn't have any cross-file-handle mapping, e.g. main-db-to-journal, but such a thing could be added without much hassle. i will definitely put that down on the list of things to experiment with.

Another alternative fix would be to use BroadcastChannel ...

That's a new term for me, so i'll need to research that.

THANK YOU for your exhaustive response. This will give me much to mull over and experiment with for the near-term future.

@tantaman tantaman mentioned this pull request Apr 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants