Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve spec compliance of Blob Url store: resolve entry as part of URL parsing #25226

Open
gterzian opened this issue Dec 10, 2019 · 7 comments
Open

Comments

@gterzian
Copy link
Member

@gterzian gterzian commented Dec 10, 2019

We've got a couple of wpt failures that are the result of a "race condition" of sorts between revoking and resolving a url.

[url-with-fetch.any.worker.html]
  [Revoke blob URL after creating Request, will fetch]
    expected: FAIL


[url-with-fetch.any.html]
  [Revoke blob URL after creating Request, will fetch]
    expected: FAIL

[url-in-tags-revoke.window.html]
  expected: TIMEOUT
  [Fetching a blob URL immediately before revoking it works in an iframe.]
    expected: FAIL

  [Fetching a blob URL immediately before revoking it works in an iframe navigation.]
    expected: FAIL

  [Opening a blob URL in a new window immediately before revoking it works.]
    expected: TIMEOUT

  [Opening a blob URL in a noopener about:blank window immediately before revoking it works.]
    expected: FAIL

  [Opening a blob URL in a new window by clicking an <a> tag works immediately before revoking the URL.]
    expected: FAIL

And there a few new intermittents:

  ▶ Unexpected subtest result in /FileAPI/url/url-with-fetch.any.worker.html:
  │ FAIL [expected PASS] Revoke blob URL after calling fetch, fetch should succeed
  └   → promise_test: Unhandled rejection with value: object "TypeError: Network error occurred"

  ▶ Unexpected subtest result in /FileAPI/url/url-in-tags-revoke.window.html:
  │ TIMEOUT [expected PASS] Fetching a blob URL immediately before revoking it works in <script> tags.
  └   → Test timed out

What happens in all these cases is that the resolving of the blob url happens "too late", by the time the entry has already been revoked.

"Resolving" the blob url is currently done inside scheme_fetch whereas the URL spec says this should be done as part of URL parsing, at https://url.spec.whatwg.org/#ref-for-concept-url-blob-entry.

So that's why those WPT tests expect the fetch to succeed if for example the request is created before revoking the blob url, since per the spec by then the blob url should already have been resolved, even though fetch hasn't even started yet.

In the other cases, the intermittency is due to the fact that a fetch is started on a thread-pool, while the revoking is done immediately when the FileManagerThreadMsg::RevokeBlobURL is handled by the filemanager.

So in a given task on a script event-loop, with the following operations:

  1. Start a fetch with blob url,
  2. Revoke blob url.

While the two IPC messages to the net components are sent and received in order, sometimes that fetch, running on the fetch thread pool, resolves the blob url before the subsequent revoke message is handled by the "main thread" of the net component where the filemanager is running, and sometimes it doesn't.

I think this requires changing the resolve/revoke model of the FileManagerStore, where resolving means script acquiring some sort of token from net(perhaps via a sync IPC call as in RevokeObjectURL), which would later be used in a scheme fetch and give it access to the underlying FileImpl even after the URL would have been revoked. With revoking meaning you can't acquire anymore tokens. And you would only remove the file impl from the store if they are no more outstanding tokens.

Spec for the general concept of Blob URL store: https://w3c.github.io/FileAPI/#url-model

@pshaughn
Copy link
Member

@pshaughn pshaughn commented Feb 5, 2020

I just noticed this issue now after having thought about this on a parallel track; #21517 and #13767 are both at least somewhat duplicates. My thoughts in #21517 are similar to the above, but more influenced by the way the spec treats the "parsed url" as an object in itself and has that object responsible for the blob reference.

@gterzian
Copy link
Member Author

@gterzian gterzian commented Feb 6, 2020

Yes I think you're right that the spec assumes that once you've parsed the url, you own the blob. In practice this "owning" will have to be implemented through some async cross-process workflow, for example as sketched above where you'd acquire a token representing ownership of the blob.

Currently the cross-process workflow is broken in the sense that the blob url can be revoked even though it has been parsed or rather "is being parsed", and the parsing started before the revoking, both from the same script-thread, so the revocation should not be able to affect the parsing that started before it.

While the two messages, one for parsing, one for revoking, are received in order, the actual parsing/resolving is done on a background-thread, which means it's end-up racing with the handling of the subsequent "revoke" message, which is handled fully on the "main-thread" of the file manager.

So basically, we're loosing the ordering of operations that normally would be guaranteed by the channel, because the handling of first "resolve" is forked-off to another thread, without somehow first marking the blob-url as "owned" by the workflow(so then the second message is able to revoke it in parallel to the resolve workflow, depending on the progress already made in parallel)

@pshaughn
Copy link
Member

@pshaughn pshaughn commented Feb 6, 2020

Questions to think about: Do we need the actual Blob object on the other side of the handoff, or just the bytes from inside it and the MIME type? Would holding onto the latter with a reference count be easier than the former?

@gterzian
Copy link
Member Author

@gterzian gterzian commented Feb 10, 2020

Yes I think we only need the bytes, not the actual blob object. The object to keep alive is the FileStoreEntry.

So there is already some reference counting going on, in response to FileManagerThreadMsg::RevokeBlobURL and FileManagerThreadMsg::ActivateBlobURL.

And the problem is that the file store entry can be removed after the blob url has been resolved, in the code, whereas the spec expect the blob to remain available for the fetch for which it has resolved(but it should be revoked for other fetches from the point of revoking on).

Resolving of the blob url happens in the spec when the url is parsed, which is early, for example already when the request is created. https://url.spec.whatwg.org/#ref-for-concept-url-blob-entry

Our current implementation only does the resolving of the url in Fetch, at

if let Err(err) = context.filemanager.fetch_file(

So the tests that create a request(resulting in parsing the blob url), and then revoke the blob url, still expect the request to be successful(because the blob url should already have been resolved). So we definitely fail those, because we only do the resolving much later in fetch, and then the blob url has already been revoked.

Then there are some intermittent tests, where the revoking is done not right after creating the request but after already having started it. There they can pass because the file manager will handle both operations in order, however the fetch is started on a separate thread, so the second message can be handled by the filemanager(and the entry removed), before the fetch in the other thread has had a chance to resolve the blob(which should actually have been done much before).

So the current reference counting works for the lifetime of the blob url(which should indeed be revoked), but it doesn't work for the lifetime of a fetch that is using the blob that is the result of resolving the blob url(we only need the bytes indeed). We could just increase the count when the spec says to resolve the blob, maybe, but then we would also have to decrease either when the fetch is finished or aborted(or never started and the request garbage collected without having been used?).

So basically the lifetime of the FileStoreEntry needs to be adjusted to better match the fact that the spec expects the blob to be available to a fetch once the blob url has been "resolved" for it, which actually happens very early.

But I don't think we can just increment the current count to mark it as resolved, because in the current setup that would be like creating another blob url for the same blob. So perhaps we need a separate ref count, to take into account blob urls that are resolved, to keep the associated bytes alive even if the url itself is revoked, but only for the duration of the actual fetch which did the resolving...

One could also grab the bytes before the fetch is actually started in another thread, and just pass it to the fetch(maybe that's what you meant?), however that would only solve those intermittent tests, not the ones that expect actually creating a request in fetch to also resolve the blob url(and I think to determine which bytes to grab, and whether to grab them at all, would require first running some fetch logic, so it might be somewhat complicated)

@gterzian gterzian mentioned this issue Feb 11, 2020
0 of 2 tasks complete
@gterzian
Copy link
Member Author

@gterzian gterzian commented Feb 11, 2020

Note that I think it might make sense to first do #25723

Since tha could fix the intermittent test part of this one. It could also make it simpler to then "resolve the blob url when the url is parsed", since the FileManager would not be multithreaded like it is now.

@gterzian
Copy link
Member Author

@gterzian gterzian commented Feb 12, 2020

I'm trying to address this partially at #25724

@gterzian gterzian mentioned this issue Feb 13, 2020
0 of 5 tasks complete
@gterzian
Copy link
Member Author

@gterzian gterzian commented Feb 13, 2020

Ok I've made a first step at #25740 (not #25724, which is now only dealing with the threadpool used by the resource manager).

I see that there are a bunch of tests in /FileAPI/url/ which will require acquiring this token much earlier, for example when a request is first created, and when a navigation starts. But that PR is a step in that direction, I've added a note at: d08f888#diff-fb86627f1b3931ce6f2b706cd6603957R602

(the navigation will be interesting, because it goes to fetch somewhat "later" after having gone through the constellation first, so will obviously be beaten to it by the "revoke" message which goes straight to the resource manager. So that's not as simple as with a normal fetch where both messages go to the resource manager directly but the fetch was spawned on an additional thread. Also, with navigation, it becomes a question of where to store the token while we go through the entire navigation flow, probably just on whatever messages are sent around for that).

There are also a bunch of tests dealing with iframe and auxiliary browsing contexts...

bors-servo added a commit that referenced this issue Feb 25, 2020
…hearth

Per fetch file token for blob url

<!-- Please describe your changes on the following line: -->

Depends on #25724

First step of #25226

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: -->
- [ ] `./mach build -d` does not report any errors
- [ ] `./mach test-tidy` does not report any errors
- [ ] These changes fix #___ (GitHub issue number if applicable)

<!-- Either: -->
- [ ] There are tests for these changes OR
- [ ] These changes do not require tests because ___

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants
You can’t perform that action at this time.