Speed-up subset test suite #3089
Currently
Tentatively assigning to @garretrieger; but @khaledhosny feel free to try since you're on a roll! Thank you.
The subset tests also use FontTools to compare the fonts (saving to TTX XML first, then diffing it); we can probably speed this up by saving the expected output in XML format instead of generating it each time. This has the downside of having to update the expected results each time FontTools changes its output.
Yeah, I keep forgetting what the subset test suite does. It has expected binaries from FontTools saved? Then converts those and the hb-subset output to TTX to compare? So if we save XML instead, that speeds things up a bit. And honestly I like it better, as it's readable. The update issue is fine; we'll deal with it.
I think we're near enough to a point to be able to generate a TTX-like dump of a font using HarfBuzz itself... That would help a lot.
If anything, making the test-runner a bit more verbose would be great. Right now each test takes multiple seconds with no further details.
Yes.
Speed-up subset tests by saving a TTX dump of the expected output instead of generating it each time the tests are run. Cuts down meson test --suite=subset on my system from:
real 0m38.977s
user 1m12.024s
sys 0m10.547s
to:
real 0m22.291s
user 0m44.548s
sys 0m9.221s
Part of #3089
Thanks, Khaled. That definitely helped. Next: the subset tests are a matrix, the cartesian product of fonts x profiles x texts. For example:
harfbuzz/test/subset/data/tests/cff-japanese.tests (lines 1 to 19 at 5d283aa)
That's one font, 8 profiles, and 5 texts, for a total of 40 subset tests. I think we should sparse this up. We don't need to test every combination.
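The matrix described above, and one possible way to sparse it, can be sketched as follows. The font, profile, and text names are made-up stand-ins for the real test data:

```python
from itertools import product

# Illustrative stand-ins for the real test inputs.
fonts = ["cff-japanese.otf"]
profiles = [f"profile-{i}" for i in range(8)]
texts = [f"text-{i}" for i in range(5)]

# The current suite: every combination, 1 x 8 x 5 = 40 tests.
full = list(product(fonts, profiles, texts))

# One cheap sparsening: pair profiles and texts round-robin, so every
# profile and every text is still exercised at least once per font,
# but in 8 tests instead of 40.
n = max(len(profiles), len(texts))
sparse = [(f, profiles[i % len(profiles)], texts[i % len(texts)])
          for f in fonts for i in range(n)]
```

Round-robin pairing keeps full coverage of each axis individually while dropping the quadratic blow-up of crossing them.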
time meson test --suite=subset down from:
real 0m22.822s
user 0m44.561s
sys 0m9.255s
to:
real 0m19.418s
user 0m38.171s
sys 0m3.587s
Does not seem to help much, but it is something.
Part of #3089
I'll leave this for someone who is actually familiar with how subset and its test suite work.
These two tests are currently the biggest offenders; all other tests are under 2 seconds, and most under 1 second:
Breaking them up so they can be parallelized probably helps, even if we keep testing all the profiles.
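Running the split-up cases in parallel could look roughly like this; `run_one` is a hypothetical stand-in for invoking hb-subset on one case and diffing the result:

```python
from concurrent.futures import ThreadPoolExecutor

def run_matrix(cases, run_one, jobs=4):
    """Run independent subset test cases concurrently.  Each case is
    self-contained, so parallelism is limited only by core count (and
    by how evenly the slow fonts are spread across the chunks)."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # map() preserves input order, so results line up with cases.
        return list(pool.map(run_one, cases))
```

In practice meson already parallelizes at the test-target level, which is why splitting one big test into several smaller targets helps.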
Splitting the basics test into 3 tests, one for each font, does not seem to help either; most of the time is spent working on the 3rd font (NanumMyeongjo-Regular) anyway.
Correction: most of the time is spent on Comfortaa-Regular-new.ttf.
Most of the time (15 out of 17 seconds) is spent in FontTools dumping the subset font.
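A per-phase breakdown like the one above can be measured with simple timers around each step; `subset`, `dump_ttx`, and `diff` here are stand-ins for the real steps, not the suite's actual functions:

```python
import time

def timed_phases(subset, dump_ttx, diff, case):
    """Time each phase of one subset test to locate the bottleneck."""
    timings = {}
    t0 = time.perf_counter()
    out = subset(case)
    timings["subset"] = time.perf_counter() - t0

    t0 = time.perf_counter()
    ttx = dump_ttx(out)            # the reported hot spot: the TTX dump
    timings["dump"] = time.perf_counter() - t0

    t0 = time.perf_counter()
    ok = diff(ttx)
    timings["diff"] = time.perf_counter() - t0
    return ok, timings
```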
Thanks Khaled!
Also:
Although it slows down the entire suite, which is expected, since only long-running processes benefit from PyPy.
What if we tried some shortcuts, e.g. check the file checksum first and skip the TTX dump if it matches?
I believe the binary output between FontTools and HarfBuzz is rarely going to be exactly the same, so I'm not sure that will save much.
But we can save the HarfBuzz output as the expected one (we don't use the FontTools subsetter when running the tests, only when adding the tests for the first time). I.e. manually verify that FontTools and HarfBuzz produce equivalent output, store the HarfBuzz result in the repository, and test against it moving forward.
Good idea; yeah, that should work.
This will require reverting the commits that saved the XML files in the repository, but with this change:
We can even make
That's BRILLIANT! Do it, please.
Yep. Maybe add a Maybe also add a
If the two binaries differ, that should already fail. Our output should be deterministic, and we'll change the expected files every time we make a change that would change them. When things fail, we can ttx both, and if they compare equal, fail by suggesting the expected file be regenerated...
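The fast path described above, byte-compare first and fall back to a TTX-style diff only on mismatch, can be sketched as follows; `normalize` is a stand-in for dumping a font to a canonical textual form:

```python
def compare_subset(actual: bytes, expected: bytes, normalize):
    """Fast path: byte-identical output passes immediately.  Slow path:
    dump both fonts to a normalized (TTX-like) form; if those match,
    the stored expected binary is merely stale and should be
    regenerated rather than treated as a real failure."""
    if actual == expected:
        return "pass"
    if normalize(actual) == normalize(expected):
        return "regenerate-expected"   # equivalent fonts, stale binary
    return "fail"
```

Since the output is deterministic, the byte comparison succeeds on almost every run and the expensive dump is only paid when something actually changed.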
We need to document the version of FontTools used to generate the current expectations; I'm seeing diffs like this:
Ah right, there was a change in fonttools recently to omit cmap12 if possible. We haven't yet brought that over to hb-subset. For generating the test cases we've been using fontTools with that change (fonttools/fonttools#2146) patched out.
I'll take a look and see if I can make that change in hb now that some of the cmap stuff has been cleaned up.
Thanks, I reverted that change locally. We can update the subsetter and test results later.
Next is to see why the fuzzer test takes 13 seconds by itself (almost 1.5 times the rest of the tests).
Interestingly, the dist-check CI job used to take 13 minutes on average; now it takes 7 minutes. I wasn't expecting a change in CI times.
I suggest adding --batch to hb-subset, like hb-shape has. That should immensely help.