Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interactions with the BFCache #17

Open
a-sully opened this issue Mar 9, 2022 · 23 comments
Open

Interactions with the BFCache #17

a-sully opened this issue Mar 9, 2022 · 23 comments

Comments

@a-sully
Copy link
Collaborator

a-sully commented Mar 9, 2022

Migrated from WICG/file-system-access#319 (see #2)

@ArthurSonzogni

You may want to reference this from the BFCache meta bug:
whatwg/html#5880
+CC @rakina

I was wondering what kind of interaction this feature will have with the Back-forward cache. I see the execution context can obtain a lock for a file. Then the document might be (?) able to enter the BackForwardCache without releasing it. Maybe this could cause some problems?
For instance there are other features like WebLock which prevents the BFCache from being used. Maybe the same should happen here:
https://source.chromium.org/chromium/chromium/src/+/main:content/browser/renderer_host/back_forward_cache_impl.cc;l=265;drc=e5616aeff413a17175a96056b3bf1fbc9ca0ade7;bpv=1;bpt=1
Maybe you did it already. I see kWebFilesystem. Not sure if this is related.

More generally, you might want to clarify if there are any BFCache specific behavior about this feature.

@rakina

On the kWebFilesystem thing, I think @hajimehoshi and @fergald did add WebFileSystem support for BFCache, but I'm not sure about the details - maybe the can comment on what Chrome is doing and if there's anything to be added on the spec side.

@fergald

My understanding is that WebFileSystem had nothing that prevented bfcaching (there are no connections to bring up/down or locks). I see that it's unblocking is still controlled by a flag but we should clean that up and leave it unconditionally unblocked.

I don't think there's any spec change (unless we want to state positively that it is compatible with bfcache) but we should probably add a WPT that checks if a page using WFS is cached. That would make it clear which browsers support it.

@mkruisselbrink

On the chrome side currently using the file system access API with local files on the file system will cause the page to be prevented from entering bfcache, although I think that is primarily since we haven't reased about interaction with usage tracking etc for our current permissions implementation (where lifetime of permissions is dependent on the presence of tabs with pages loaded).

For the newly proposed access handle API, which include file locking, I do suspect we'll need explicit spec-level integration with bfcache. We don't want pages in the bfcache to be able to keep files locked after all.

@rakina
Copy link
Member

rakina commented Apr 27, 2022

Thanks! Wondering if there's any resolution needed here. BTW, there's a guide on how to handle non-fully active documents (which includes bfcached documents) in spec.

@a-sully
Copy link
Collaborator Author

a-sully commented Apr 27, 2022

TLDR: this isn't currently a concern, but will become one soon

The primary concern here is that file handles which have acquired locks (these locks are internal to the File System Access API) will enter BFCache without releasing said locks.

Currently:

  1. The only meaningfully enforced locking in the API is that creating an Access Handle grabs an exclusive lock on the file,
  2. Access Handles are only available within dedicated workers, and
  3. BFCache is not supported in dedicated workers

Once any of the above conditions change, we will need to revisit this.

As for when these will change...

  1. See Possible options in FileSystemFileHandleCreateAccessHandleOptions #19 (and also Explicitly define locking mechanisms #18)
  2. Will likely change if/when we support Async Access Handles
  3. Tracked in Proposal: beforePutToBFcache and afterRestoreFromBFcache events for DedicatedWorkerGlobalScope html#7216. See the latest design doc

Please let me know if any of this does not match your understanding!

@rakina
Copy link
Member

rakina commented Apr 28, 2022

Thanks, actually we have started experimenting with DedicatedWorker support for BFCache ~1 month ago, so we probably should work on resolving this (cc @fergald).
Is it reasonable to drop the lock on navigation away from the page? (and maybe restore them on BFCache restore?). Do we need the events proposed at whatwg/html#7216 for that to make sense (so that the page can prepare for the lock to be dropped somehow?)

@annevk
Copy link
Member

annevk commented Apr 28, 2022

Ideally this behaves the same as Web Locks, I'd imagine?

@saschanaz could you comment on what the story is there?

@saschanaz
Copy link
Member

saschanaz commented Apr 28, 2022

Ideally this behaves the same as Web Locks, I'd imagine?

@saschanaz could you comment on what the story is there?

Gecko prevents BFCache when at least one lock is held or a pending request exists, so that closing the page can immediately release the lock. AFAIK Chrome prevents BFCache when at least one lock has been requested.

@a-sully
Copy link
Collaborator Author

a-sully commented Apr 28, 2022

Preventing BFCache would certainly work here (this is what the API currently does in Chromium), though I think it's worth exploring other options...

Is it reasonable to drop the lock on navigation away from the page? (and maybe restore them on BFCache restore?). Do we need the events proposed at whatwg/html#7216 for that to make sense (so that the page can prepare for the lock to be dropped somehow?)

We should definitely drop the lock in some way when navigating away from the page. However, re-acquisition of the lock upon restore can't be guaranteed, since the file may have been locked by another tab in the meantime.

This becomes tricky since we're not allowed to construct a sync Access Handle without first acquiring a lock. Is there precedent for destroying all lock-holding objects when the page enters the BFCache? This might be enough for this API since these locks are less generic than for Web Locks? It's pretty straightforward for a site to re-acquire an Access Handle, for example.

That being said... the API does acquire locks in other cases:

  • A shared lock is held while a file handle has an open writable stream (via createWritable())
  • An exclusive lock is held when move()ing a file or directory (not yet shipped)
  • A shared lock (for now - arguably we should make this exclusive) is held when remove()ing a file or directory (not yet shipped)

If destroying a lock-holding object is feasible, we can destroy the writable streams, just like we would the Access Handles.

Apologies for my lack of familiarity with the BFCache, but I'm not sure what would happen in the latter two cases. move() in particular can be a long-running operation if we're recursively moving a directory. What happens if the page goes into the BFCache while the operation is in progress?

@rakina
Copy link
Member

rakina commented May 2, 2022

I strongly advise against disabling BFCache for new APIs. For older/shipped APIs like Web Locks we're pretty much stuck because websites already use them before BFCache is a thing (at least in Chrome), while we can shape the behavior we want with this new API (not sure what stage this is in?)

We should definitely drop the lock in some way when navigating away from the page. However, re-acquisition of the lock upon restore can't be guaranteed, since the file may have been locked by another tab in the meantime.

Yes this sounds like the way to go, and don't re-acquire the lock automatically on restore. I think we should just advise developers to re-acquire the lock manually on BFCache restore. This might be tricky if we expect them to do that on DedicatedWorker... If you think it's necessary, we might need to ship whatwg/html#7216.

Apologies for my lack of familiarity with the BFCache, but I'm not sure what would happen in the latter two cases.

Will those still be a problem if we always drop the lock on navigation? If we can guarantee that a bfcached page will never hold a lock, it doesn't sound like the bfcached page would be affected by anything outside of the page, and vice versa?

@smaug----
Copy link

3. BFCache is not supported in dedicated workers

That is a Chromium implementation detail. Some other browser engines do support workers in bfcache.

I think we should just advise developers to re-acquire the lock manually on BFCache restore.

That sounds rather bad model. The whole point of bfcache is that devs don't really need to care about it.

@jesup
Copy link
Contributor

jesup commented May 23, 2022

There's one way we could allow pages using OPFS into the bfcache:
If a page using OPFS is in the bfcache, and another page tries to use the same files in a way that conflicts with the bfcached page (*), the page is evicted from the BFCache.

  • If the BFCached page has a SyncAccessHandle to a file, then getting a SyncAccessHandle or a WritableFileStream to that file causes eviction
  • If the BFCached page has a WritableFileStream to a file, then getting a SyncAccessHandle to the file causes eviction
  • If the BFCached page has a FileHandle/DirectoryHandle/SyncAccessHandle/WritableFileStream to a file/dir and that file/dir is deleted or moved, it causes eviction

See also web-platform-tests/wpt#31082 (hat tip to smaug)

@fergald
Copy link

fergald commented May 24, 2022

I think we should just advise developers to re-acquire the lock manually on BFCache restore.

That sounds rather bad model. The whole point of bfcache is that devs don't really need to care about it.

At the top of the issue, it says the locks are internal. The spec makes no mention of locks at all.

@jesup's comment sounds like one reasonable implementation and is a common pattern for BFCache but this all seems to be about implementation but how does it map to the spec?

What should happen if 2 pages call fileHandle.createWritable({ keepExistingData: true })` on the same file in different tabs? I haven't read the full spec but I can't find that addressed in it. I guess the locks discussed above mean that Chrome would make the second opener wait for the first opener to finish.

Is that something we want to spec? That would then give a pretty clear signal that you must evict a page that is in BFCache with a writeable file open.

If we're not willing to spec that writeable opens are exclusive across tabs then it's unclear to me that holding a filehandle should prevent BFCaching.

@jesup
Copy link
Contributor

jesup commented May 24, 2022

What should happen if 2 pages call fileHandle.createWritable({ keepExistingData: true })` on the same file in different tabs?

They take read locks, and so they both could have them; both createWritable's would succeed. (They would block any attempt to createSyncAccessHandle, at least until the writables are close()ed.)

If a tab with a writable is in the cache, and another tab calls createWritable for that file, that would succeed in my plan, and you would not have to evict the BFCached page. If the tab called createSyncAccessHandle, you would have to evict, since you can't have a SyncAccessHandle with any other SyncAccessHandle or WritableFileStream to the same file.

@fergald
Copy link

fergald commented May 25, 2022

Thanks for the clarification. Sound good.

@a-sully
Copy link
Collaborator Author

a-sully commented May 27, 2022

Overall SGTM if we can figure out the edge cases. There's an open issue to better spec locking mechanisms (#18). We're planning to expand the use of locking soon (in the upcoming move() and remove() methods, as well as other possible write modes) so the language should probably be a bit more generic:

  • If the BFCached page has an exclusive lock to a handle, then acquiring any lock to that handle causes eviction
  • If the BFCached page has a shared lock to a handle, then acquiring an exclusive lock to that handle file causes eviction

That being said, there are some edge cases here I'm not sure if I have enough knowledge to understand the full implications of. For example...

The read() and write() methods are synchronous, meaning they block the DedicatedWorker while they run. I created a demo site to test what happens when we navigate away from the page in the middle of a write: https://access-handle-navigation-test.glitch.me/. It appears that the write operation completes successfully (at least on my machine) and does not prevent navigation. I'm not sure exactly what the lifetime of a DedicatedWorker is relative to the site it serves, but it seems that even if we evict the page, the handle (and therefore lock? perhaps depending on the implementation?) may still be around? I'm also not sure whether a beforePutToBFcache event (as proposed in whatwg/html#7216) would actually fire in this case?...

I assume navigating away from a page during long-running async operations (like recursively moving a directory) might have similar problems, even if they're issued from a window? Are there other APIs we can look towards as precedent here?

@jesup
Copy link
Contributor

jesup commented May 31, 2022

@a-sully -- please open a separate issue for speccing the interaction between navigation and aync operations

@a-sully
Copy link
Collaborator Author

a-sully commented Jun 7, 2022

The interaction between navigation and sync operations seems like something that's not specific to this API? Presumably other APIs which run code synchronously from workers encounter the same issues?

@a-sully
Copy link
Collaborator Author

a-sully commented Jun 7, 2022

I agree this is something that should be specified, but I'm not sure if this is the right repo for it. @annevk may know?

@annevk
Copy link
Member

annevk commented Jun 8, 2022

@rakina and @fergald can advice on this I think, but in general you want to define whether unloading a document ends up impacting its salvageable state. To do this you would define unloading document cleanup steps. Generally these are defined outside HTML at the moment, except for some APIs. At some point it might make sense to centralize them if we run into ordering issues.

@jesup
Copy link
Contributor

jesup commented Jun 8, 2022

So this case is a little different than unloading with WebSockets, etc. We (probably) want to maintain the file locks and filehandles while cached (salvagable), but if that would impact any other page's actions, we'd evict it (close all the actors/filehandles/drop locks).
How I would phrase it in a spec would be something along the lines of "A document using this API can be salvagable, but must become not salvagable if it's existence would be visible to any active document". I.e. if a page's existence in a cache would cause any observable change in state to an active page, it must be evicted. Note that this eviction does not happen at the time of unload, so the unloading cleanup steps wouldn't be the place to clean things out.
You could implement it like that (keep locks/etc alive in bfcache); you could also drop them all, and on an attempt to salvage, throw the document away if all the state can't be re-acquired at that time. That might slightly increase the likelihood of a page being retrieved from the bfcache (but only slightly), and it might also be slightly more work (but this may be arguable; both solutions involve some coding).

@annevk
Copy link
Member

annevk commented Jun 8, 2022

Ah right, I don't think we have a good hook for "this action ends up destroying a bfcache entry, if any". @domenic might also be able to comment as to whether that's something that will be introduced as part of the ongoing bfcache work.

@jesup
Copy link
Contributor

jesup commented Jun 10, 2022

BTW, I think the idea of dropping and re-acquiring can lead to subtle deviations from what could happen if it hadn't gone into the cache, so I would avoid that, and keep the locks (and drop the page on any attempt that would conflict with the locks)

@rakina
Copy link
Member

rakina commented Jun 16, 2022

Just FYI, BFCache people are talking about adding a way to spec eviction cases like in here easily at whatwg/html#7253 (comment), so hopefully the spec changes here won't be blocked by that soon enough.

On navigation & synchronous operations, I guess most APIs only affect things within its own document, while this API have external side effects. It might be good to look at how APIs that have external effects (IndexedDB? not sure if they have sync operations) behave.

@jesup
Copy link
Contributor

jesup commented Mar 25, 2023

For unloading document cleanup steps:
"A document using this API must become not salvagable if it's existence would be visible to any active document. This would happen if this document has a FileSystemSyncAccessHandle to an object, and another document were to attempt to access that object using this API." (and maybe list the other ways it can happen.)
@annevk any wordsmithing ideas?

@rakina
Copy link
Member

rakina commented Mar 28, 2023

For unloading document cleanup steps:
"A document using this API must become not salvagable if it's existence would be visible to any active document. This would happen if this document has a FileSystemSyncAccessHandle to an object, and another document were to attempt to access that object using this API." (and maybe list the other ways it can happen.)

I think you mentioned in #17 (comment) that you want this to not prevent the page from being BFCached, and instead only "evict" a BFcached/non-fully active document when it can impact other pages, which I think is the right way to do this. To spec the eviction, I think you can follow whatwg/html#8972. See also the guides at https://w3ctag.github.io/bfcache-guide/.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

7 participants