Cache varstore #3605

Merged
merged 9 commits into from
May 23, 2022

Conversation

behdad
Member

@behdad behdad commented May 20, 2022

Fixes issue originally raised in #2878 (comment)

Hi Jonathan,

Can you test this in Firefox and see if it helps with variable font performance?

Thanks

@behdad
Member Author

behdad commented May 21, 2022

I see 4% speedup in shaping of a simple variable font, and up to 32% speedup in shaping of RobotoFlex, which is a heavy variable font. Even for shaping of single words, 19% speedup. Non-variable font shaping performance is unaffected. These numbers are with hb font-funcs.

behdad:hb 130 (main*)$ ninja -Cbuild && build/perf/benchmark-shape perf/texts/en-thelittleprince.txt RobotoFlex.ttf perf/texts/en-words.txt RobotoFlex.ttf --benchmark_out=before
---------------------------------------------------------------------------------------------
Benchmark                                                   Time             CPU   Iterations
---------------------------------------------------------------------------------------------
BM_Shape/en-thelittleprince.txt/RobotoFlex.ttf           7.55 ms         7.54 ms           94
BM_Shape/en-thelittleprince.txt/RobotoFlex.ttf/var       22.9 ms         22.8 ms           31
BM_Shape/en-words.txt/RobotoFlex.ttf                     11.1 ms         11.1 ms           63
BM_Shape/en-words.txt/RobotoFlex.ttf/var                 28.3 ms         28.2 ms           25

behdad:hb 0 (cache-varstore)$ ninja -Cbuild && build/perf/benchmark-shape perf/texts/en-thelittleprince.txt RobotoFlex.ttf perf/texts/en-words.txt RobotoFlex.ttf --benchmark_out=after
---------------------------------------------------------------------------------------------
Benchmark                                                   Time             CPU   Iterations
---------------------------------------------------------------------------------------------
BM_Shape/en-thelittleprince.txt/RobotoFlex.ttf           7.56 ms         7.54 ms           92
BM_Shape/en-thelittleprince.txt/RobotoFlex.ttf/var       15.4 ms         15.4 ms           45
BM_Shape/en-words.txt/RobotoFlex.ttf                     11.6 ms         11.6 ms           61
BM_Shape/en-words.txt/RobotoFlex.ttf/var                 22.8 ms         22.8 ms           30

behdad:hb 2 (cache-varstore)$ subprojects/benchmark-1.6.0/tools/compare.py benchmarks before after
Comparing before to after
Benchmark                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------------------------------------------
BM_Shape/en-thelittleprince.txt/RobotoFlex.ttf                    +0.0009         +0.0007             8             8             8             8
BM_Shape/en-thelittleprince.txt/RobotoFlex.ttf/var                -0.3248         -0.3252            23            15            23            15
BM_Shape/en-words.txt/RobotoFlex.ttf                              +0.0439         +0.0437            11            12            11            12
BM_Shape/en-words.txt/RobotoFlex.ttf/var                          -0.1932         -0.1935            28            23            28            23
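
A note on reading the compare.py output: as far as I know, its Time and CPU columns are relative changes, (new - old) / |old|, so a negative value means the new build is faster. A minimal sketch of that arithmetic follows; the helper name `relative_change` is made up for illustration and is not part of HarfBuzz or Google Benchmark.

```cpp
#include <cmath>

// Hypothetical helper, not part of HarfBuzz or Google Benchmark.
// Relative change between two timings; negative means "after" is faster.
// `before` must be nonzero.
static double relative_change (double before, double after)
{
  return (after - before) / std::fabs (before);
}
```

With the rounded timings printed above, `relative_change (22.9, 15.4)` comes out near -0.33 and `relative_change (28.3, 22.8)` near -0.19. The tool's -0.3248 and -0.1932 presumably come from the unrounded measurements.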

Just looking at advance-width speedup, for simple variable font, 30% speedup, for RobotoFlex, 50%.

behdad:hb 130 (main)$ build/perf/benchmark-font --benchmark_filter=advance RobotoFlex-VariableFont_GRAD,XTRA,YOPQ,YTAS,YTDE,YTFI,YTLC,YTUC,opsz,slnt,wdth,wght.ttf --benchmark_out=before
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                                                                        Time             CPU   Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_Font/glyph_h_advances/RobotoFlex-VariableFont_GRAD,XTRA,YOPQ,YTAS,YTDE,YTFI,YTLC,YTUC,opsz,slnt,wdth,wght.ttf/hb           1.86 us         1.85 us       380778
BM_Font/glyph_h_advances/RobotoFlex-VariableFont_GRAD,XTRA,YOPQ,YTAS,YTDE,YTFI,YTLC,YTUC,opsz,slnt,wdth,wght.ttf/ft           14.5 us         14.5 us        48358
BM_Font/glyph_h_advances/RobotoFlex-VariableFont_GRAD,XTRA,YOPQ,YTAS,YTDE,YTFI,YTLC,YTUC,opsz,slnt,wdth,wght.ttf/var/hb        152 us          152 us         4607
BM_Font/glyph_h_advances/RobotoFlex-VariableFont_GRAD,XTRA,YOPQ,YTAS,YTDE,YTFI,YTLC,YTUC,opsz,slnt,wdth,wght.ttf/var/ft        126 us          126 us         5552

behdad:hb 134 (main)$ build/perf/benchmark-font --benchmark_filter=advance RobotoFlex-VariableFont_GRAD,XTRA,YOPQ,YTAS,YTDE,YTFI,YTLC,YTUC,opsz,slnt,wdth,wght.ttf --benchmark_out=after
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                                                                        Time             CPU   Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_Font/glyph_h_advances/RobotoFlex-VariableFont_GRAD,XTRA,YOPQ,YTAS,YTDE,YTFI,YTLC,YTUC,opsz,slnt,wdth,wght.ttf/hb           1.87 us         1.87 us       374716
BM_Font/glyph_h_advances/RobotoFlex-VariableFont_GRAD,XTRA,YOPQ,YTAS,YTDE,YTFI,YTLC,YTUC,opsz,slnt,wdth,wght.ttf/ft           14.6 us         14.5 us        48117
BM_Font/glyph_h_advances/RobotoFlex-VariableFont_GRAD,XTRA,YOPQ,YTAS,YTDE,YTFI,YTLC,YTUC,opsz,slnt,wdth,wght.ttf/var/hb       76.3 us         76.2 us         9143
BM_Font/glyph_h_advances/RobotoFlex-VariableFont_GRAD,XTRA,YOPQ,YTAS,YTDE,YTFI,YTLC,YTUC,opsz,slnt,wdth,wght.ttf/var/ft        128 us          128 us         5521

behdad:hb 0 (main)$ subprojects/benchmark-1.6.0/tools/compare.py benchmarks before after
Comparing before to after
Benchmark                                                                                                                                 Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_Font/glyph_h_advances/RobotoFlex-VariableFont_GRAD,XTRA,YOPQ,YTAS,YTDE,YTFI,YTLC,YTUC,opsz,slnt,wdth,wght.ttf/hb                    +0.0099         +0.0102             2             2             2             2
BM_Font/glyph_h_advances/RobotoFlex-VariableFont_GRAD,XTRA,YOPQ,YTAS,YTDE,YTFI,YTLC,YTUC,opsz,slnt,wdth,wght.ttf/ft                    +0.0052         +0.0054            14            15            14            15
BM_Font/glyph_h_advances/RobotoFlex-VariableFont_GRAD,XTRA,YOPQ,YTAS,YTDE,YTFI,YTLC,YTUC,opsz,slnt,wdth,wght.ttf/var/hb                -0.4988         -0.4989           152            76           152            76
BM_Font/glyph_h_advances/RobotoFlex-VariableFont_GRAD,XTRA,YOPQ,YTAS,YTDE,YTFI,YTLC,YTUC,opsz,slnt,wdth,wght.ttf/var/ft                +0.0157         +0.0159           126           128           126           128

@drott
Collaborator

drott commented May 23, 2022

Thanks for doing this optimisation.

@behdad behdad merged commit d473397 into main May 23, 2022
@behdad behdad deleted the cache-varstore branch May 23, 2022 18:24
Collaborator

@drott drott left a comment

🤦‍♂️ GitHub ate my homework: I meant to send this review with minor comments, but it was stuck in a pending state and I never sent it. LGTM with some remarks. Anyway, thanks again!

@@ -310,7 +311,7 @@ struct hmtxvmtx
 unsigned int default_advance;

-private:
+public:
Collaborator

Why is visibility changing here?

Member Author

var_table is now accessed from hb-ot-font to access the VarStore to create the cache.

{
  if (unlikely (region_index >= regionCount))
    return 0.;

  float *cached = nullptr;
Collaborator

Nit: cached and cache as variable names in the same scope here read somewhat confusingly. Maybe cached_value?

Member Author

Good point. Will do.
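
For readers skimming this thread, here is a rough sketch of the kind of per-region scalar cache under discussion. It is not the code from this PR: the type and member names are hypothetical, and it assumes that valid region scalars lie in [0, 1], so an out-of-range value can mark a slot as "not computed yet".

```cpp
#include <cstddef>
#include <vector>

// Illustrative only: hypothetical names, not HarfBuzz's API.
// Assumption: valid region scalars lie in [0, 1], so 2.f can act as a
// "not computed yet" marker.
constexpr float kInvalidScalar = 2.f;

// Caches one scalar per variation region for a fixed set of coordinates.
// The cache must be thrown away whenever the coordinates change.
struct RegionScalarCache
{
  explicit RegionScalarCache (std::size_t region_count)
    : values (region_count, kInvalidScalar) {}

  // Returns the scalar for region_index, computing it on first use.
  // `compute` stands in for the expensive per-region evaluation.
  template <typename Compute>
  float get (std::size_t region_index, Compute compute)
  {
    if (region_index >= values.size ())
      return 0.f; // out-of-range region contributes nothing

    float *cached_value = &values[region_index];
    if (*cached_value != kInvalidScalar)
      return *cached_value; // cache hit

    return *cached_value = compute (region_index); // fill on miss
  }

  std::vector<float> values;
};
```

The point of the design is that many glyphs reference the same few regions, so the per-region work is done once per set of coordinates rather than once per lookup.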

@@ -335,7 +340,7 @@ struct HVARVVAR
 bool has_side_bearing_deltas () const { return lsbMap && rsbMap; }

-protected:
+public:
Collaborator

Similar question here, why does the visibility need to change - I didn't understand that.

Member Author

Same. Need to reach out to the varStore. Can add accessors. But these are safe to access. In those cases we just make them public.
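
For comparison, the accessor alternative mentioned here might look roughly like the sketch below. The types are simplified stand-ins with made-up names, not the actual HarfBuzz tables.

```cpp
// Illustrative only: simplified stand-ins, not HarfBuzz's actual types.
struct VarStore
{
  unsigned region_count = 0;
};

struct VarTable
{
  // Keeps the member non-public and exposes a read-only view of it,
  // instead of widening the member's visibility.
  const VarStore &get_var_store () const { return var_store; }

  protected:
  VarStore var_store;
};
```

Making the member public is less code; the accessor keeps callers from modifying it. Which is preferable is a judgment call for the maintainers.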

behdad added a commit that referenced this pull request May 24, 2022
@jfkthame
Collaborator

jfkthame commented Jun 8, 2022

Hi @behdad - sorry to be so late responding to this, I've been out for a while (first PTO and then Covid)... trying to catch up on things. I tried a Firefox build with this applied, and can confirm that it appears to help. For a huge reflow (a 40MB text block) using the Apple system font, which is variable (but not very complex), it reduced the time spent under VariationDevice during GPOS pair-pos processing from around 500ms to 300ms. Obviously the specific gains will depend on the characteristics of the font being used, but in any case it's a nice win - thanks!

@behdad
Member Author

behdad commented Jun 8, 2022

Thanks @jfkthame


3 participants