Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement parallel CSS parsing #20721

Open
jdm opened this issue Apr 30, 2018 · 13 comments · May be fixed by #22478
Open

Implement parallel CSS parsing #20721

jdm opened this issue Apr 30, 2018 · 13 comments · May be fixed by #22478

Comments

@jdm
Copy link
Member

@jdm jdm commented Apr 30, 2018

Firefox did this in https://bugzilla.mozilla.org/show_bug.cgi?id=1346988 and https://bugzilla.mozilla.org/show_bug.cgi?id=1455115; we should look into doing the same for Servo.

@mandreyel
Copy link
Contributor

@mandreyel mandreyel commented Nov 12, 2018

I'm interested in trying my hand at this, but I may be biting off more than I can handle. Would anyone be available for potential mentoring?

@emilio
Copy link
Member

@emilio emilio commented Nov 13, 2018

I can try if you don't mind a bit of lag in potentially replying to questions in Servo issues (I get a lot of GitHub Servo and don't check so often, as it can be seen).

But you can always catch me on irc (emilio there as well) or mail me directly if you're blocked on something :)

@mandreyel
Copy link
Contributor

@mandreyel mandreyel commented Nov 14, 2018

@emilio Awesome, thank you! I'll contact you in a bit!

@mandreyel
Copy link
Contributor

@mandreyel mandreyel commented Nov 28, 2018

@emilio Here's what I have so far. If the parsing of a stylesheet were moved off the script thread, we'd need a way to tell its document to fetch imported sheets on the event-loop. I can see that Gecko dispatches the import sheet load to the main thread, but it seems to me that this introduces a lot of complexity in Servo. Please correct me if I'm wrong and this is not in fact difficult to accomplish.
However, based on your observation that imports are only allowed at the start of a sheet, I think we could split it into an initial import and a main section, parse the "import section" on the script thread (and initiate the fetches for them), and parse the rest of the sheet on a thread pool. We could then make use of the existing network listener/task source infrastructure to get the parsed sheet back to its document. What do you think? Does this sound reasonable?

I hope I'll have time to work on this on the weekend. @highfive assign me please.

@highfive highfive added the C-assigned label Nov 28, 2018
@highfive
Copy link

@highfive highfive commented Nov 28, 2018

Hey @mandreyel! Thanks for your interest in working on this issue. It's now assigned to you!

@jdm
Copy link
Member Author

@jdm jdm commented Nov 28, 2018

In order to initiate an action on a script thread from another thread, all that is necessary is:

Then the new variant needs to contain all of the data necessary to perform the action - that might be a PipelineId, which can be used to obtain a document, and a list of stylesheet URLs that need to be fetched.

@mandreyel
Copy link
Contributor

@mandreyel mandreyel commented Nov 28, 2018

@jdm Ohhhh, that makes my life so much simpler! Thanks 😄

@emilio
Copy link
Member

@emilio emilio commented Nov 28, 2018

However, based on your observation that imports are only allowed at the start of a sheet, I think we could split it into an initial import and a main section, parse the "import section" on the script thread (and initiate the fetches for them), and parse the rest of the sheet on a thread pool. We could then make use of the existing network listener/task source infrastructure to get the parsed sheet back to its document. What do you think? Does this sound reasonable?

I don't think this is needed. This is needed in Gecko because import rules need to create C++'s StyleSheet objects, which are cycle-collected main-thread-only objects, and because we can't trigger loads off-main-thread. Servo may or may not be able to do it.

What's the particularly complex part about it? I'd prefer if both Gecko and Servo were consistent on the approach, I don't particularly mind changing Gecko but that means carrying the offset in the source where the rest of the rules should start, and having a partial rule-list and so on which is not amazing.

@mandreyel
Copy link
Contributor

@mandreyel mandreyel commented Nov 29, 2018

@emilio Apologies, I should have edited my post to say that the second paragraph is no longer relevant since jdm's suggestion revealed the first idea is in fact pretty straight forward to implement.

@mandreyel
Copy link
Contributor

@mandreyel mandreyel commented Dec 16, 2018

Apologies, was swamped with shipping code at work, but I cooked up something on the weekend... It's mostly done, I think, but unfortunately import sheets are applied out-of-order. This is the specific test case that's failing:


  ▶ Unexpected subtest result in /_mozilla/mozilla/out-of-order-stylesheet-loads-and-imports.html:
  │ FAIL [expected PASS] out-of-order stylesheet loads for the same element happen correctly, even with imports
  │   → assert_equals: expected "rgb(0, 128, 0)" but got "rgba(0, 0, 0, 0)"

Does Servo have some mechanisms in place to ensure correct stylesheet load order (since import sheet loads are already partly async)?

@emilio
Copy link
Member

@emilio emilio commented Dec 16, 2018

So, fwiw that test is racy (shouldn't be using setTimeout to wait for network events). But yes, the request_generation_id is what should ensure that only the last stylesheet gets applied to the page.

Looks like with your changes we're not applying any of both? (if the first stylesheet was applying it'd show red instead of black transparent).

In any case, you should probably open a PR already so you can run the tests on automation instead of your machine, even if it's still a WIP :)

@mandreyel
Copy link
Contributor

@mandreyel mandreyel commented Dec 16, 2018

@emilio Thanks for the info.

I think the problem may have been that I verified the request_generation_id before spawning the parser thread, and that may have introduced some race conditions.
I moved the request_generation_id check to where I apply the sheet on the script thread after parsing, and reran that specific test case a few times just now and it seems to pass now (though as you say then, this might not be a full guarantee).

Ok, I'll open a PR in a moment :)

@mandreyel mandreyel linked a pull request that will close this issue Dec 16, 2018
4 of 5 tasks complete
bors-servo added a commit that referenced this issue Dec 17, 2018
[WIP]  Implement parallel CSS parsing

<!-- Please describe your changes on the following line: -->
Opening for discussion.

The basic idea is to spawn a parser thread on a global thread-pool (like in Gecko) with an `AsyncStylesheetLoader` (instead of the previous `StylesheetLoader`), used when encountering an `@import` rule to send a task to script thread from the parser thread, which simply instantiates an async fetch on the corresponding document. After the parsing is done, another task is sent to script thread to do the post-parsing steps, like setting the parsed stylesheet on its link element (if applicable), firing the load event, etc.

A problem was that verifying the request generation id before parsing (still on the script thread) introduced a race condition, so I moved that to the post-parse task run on the script thread. This (seems to have) solved the race condition, but it differs in semantics from the sync parsing in that it's now no longer checked before doing the parsing, but after, which incurs an unnecessary parse overhead if the sheet is not applicable (I think).

Also, there seems to be an occasional time-out in one of the CSS tests, and the code needs some cleaning up.

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: -->
- [X] `./mach build -d` does not report any errors
- [X] `./mach test-tidy` does not report any errors
- [X] These changes fix #20721 (GitHub issue number if applicable)

<!-- Either: -->
- [ ] There are tests for these changes OR
- [X] These changes do not require tests because existing CSS tests should cover them.

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/22478)
<!-- Reviewable:end -->
@emilio
Copy link
Member

@emilio emilio commented Dec 17, 2018

Oh I had not read that comment. Yeah the request_generation_id check needs to be on script and after parsing. Glad that fixed it :)

bors-servo added a commit that referenced this issue Dec 17, 2018
[WIP]  Implement parallel CSS parsing

<!-- Please describe your changes on the following line: -->
Opening for discussion.

The basic idea is to spawn a parser thread on a global thread-pool (like in Gecko) with an `ThreadSafeStylesheetLoader` (instead of the previous `StylesheetLoader`), used when encountering an `@import` rule to send a task to script thread from the parser thread, which simply instantiates an async fetch on the corresponding document. After the parsing is done, another task is sent to script thread to do the post-parsing steps, like setting the parsed stylesheet on its link element (if applicable), firing the load event, etc.

A problem was that verifying the request generation id before parsing (still on the script thread) introduced a race condition, so I moved that to the post-parse task run on the script thread. This (seems to have) solved the race condition, but it differs in semantics from the sync parsing in that it's now no longer checked before doing the parsing, but after, which incurs an unnecessary parse overhead if the sheet is not applicable (I think).

Also, there seems to be an occasional time-out in one of the CSS tests, and the code needs some cleaning up.

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: -->
- [X] `./mach build -d` does not report any errors
- [X] `./mach test-tidy` does not report any errors
- [X] These changes fix #20721 (GitHub issue number if applicable)

<!-- Either: -->
- [ ] There are tests for these changes OR
- [X] These changes do not require tests because existing CSS tests should cover them.

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/22478)
<!-- Reviewable:end -->
bors-servo added a commit that referenced this issue Dec 19, 2018
[WIP]  Implement parallel CSS parsing

<!-- Please describe your changes on the following line: -->
Opening for discussion.

The basic idea is to spawn a parser thread on a global thread-pool (like in Gecko) with an `ThreadSafeStylesheetLoader` (instead of the previous `StylesheetLoader`), used when encountering an `@import` rule to send a task to script thread from the parser thread, which simply instantiates an async fetch on the corresponding document. After the parsing is done, another task is sent to script thread to do the post-parsing steps, like setting the parsed stylesheet on its link element (if applicable), firing the load event, etc.

A problem was that verifying the request generation id before parsing (still on the script thread) introduced a race condition, so I moved that to the post-parse task run on the script thread. This (seems to have) solved the race condition, but it differs in semantics from the sync parsing in that it's now no longer checked before doing the parsing, but after, which incurs an unnecessary parse overhead if the sheet is not applicable (I think).

Also, there seems to be an occasional time-out in one of the CSS tests, and the code needs some cleaning up.

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: -->
- [X] `./mach build -d` does not report any errors
- [X] `./mach test-tidy` does not report any errors
- [X] These changes fix #20721 (GitHub issue number if applicable)

<!-- Either: -->
- [ ] There are tests for these changes OR
- [X] These changes do not require tests because existing CSS tests should cover them.

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/22478)
<!-- Reviewable:end -->
bors-servo added a commit that referenced this issue Dec 24, 2018
[WIP]  Implement parallel CSS parsing

<!-- Please describe your changes on the following line: -->
Opening for discussion.

The basic idea is to spawn a parser thread on a global thread-pool (like in Gecko) with an `ThreadSafeStylesheetLoader` (instead of the previous `StylesheetLoader`), used when encountering an `@import` rule to send a task to script thread from the parser thread, which simply instantiates an async fetch on the corresponding document. After the parsing is done, another task is sent to script thread to do the post-parsing steps, like setting the parsed stylesheet on its link element (if applicable), firing the load event, etc.

A problem was that verifying the request generation id before parsing (still on the script thread) introduced a race condition, so I moved that to the post-parse task run on the script thread. This (seems to have) solved the race condition, but it differs in semantics from the sync parsing in that it's now no longer checked before doing the parsing, but after, which incurs an unnecessary parse overhead if the sheet is not applicable (I think).

Also, there seems to be an occasional time-out in one of the CSS tests, and the code needs some cleaning up.

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: -->
- [X] `./mach build -d` does not report any errors
- [X] `./mach test-tidy` does not report any errors
- [X] These changes fix #20721 (GitHub issue number if applicable)

<!-- Either: -->
- [ ] There are tests for these changes OR
- [X] These changes do not require tests because existing CSS tests should cover them.

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/22478)
<!-- Reviewable:end -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

4 participants
You can’t perform that action at this time.