Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cff2] Significantly slower than freetype #3559

Closed
behdad opened this issue Apr 29, 2022 · 1 comment · Fixed by #3574
Closed

[cff2] Significantly slower than freetype #3559

behdad opened this issue Apr 29, 2022 · 1 comment · Fixed by #3574

Comments

@behdad
Copy link
Member

behdad commented Apr 29, 2022

Our perf suite shows that our interpretation performance for variable CFF2 fonts is about 50% slower than freetype, even for the default master. I took a look, and I believe it's because of this: when interpretting blend operator, instead of just computing the blended value and storing that on the stack, we allocate a vector and store blend arguments in a vector, only to compute it later and free the vector. That amounts to a malloc/free per coordinate x/y for a variable font!!!

I'm not sure why this code was written that way, but that's completely unnecessary and unfortunate. The code involved is hb-cff2-interp-cs.hh::set_blends().

@behdad
Copy link
Member Author

behdad commented May 10, 2022

This proof of concept patch speeds up CFF2 evaluation by around 8%:

diff --git a/src/hb-cff-interp-common.hh b/src/hb-cff-interp-common.hh
index 2a9e60ffd..1f924d8b8 100644
--- a/src/hb-cff-interp-common.hh
+++ b/src/hb-cff-interp-common.hh
@@ -430,8 +430,8 @@ struct cff_stack_t
   unsigned int get_count () const { return count; }
   bool is_empty () const          { return !count; }
 
-  hb_array_t<const ELEM> get_subarray (unsigned int start) const
-  { return hb_array_t<const ELEM> (elements).sub_array (start); }
+  hb_array_t<const ELEM> sub_array (unsigned start, unsigned length) const
+  { return hb_array_t<const ELEM> (elements).sub_array (start, length); }
 
   private:
   bool error;
diff --git a/src/hb-cff2-interp-cs.hh b/src/hb-cff2-interp-cs.hh
index 60a29e1ca..767f23d0c 100644
--- a/src/hb-cff2-interp-cs.hh
+++ b/src/hb-cff2-interp-cs.hh
@@ -151,6 +151,20 @@ struct cff2_cs_interp_env_t : cs_interp_env_t<blend_arg_t, CFF2Subrs>
   void	 set_ivs (unsigned int ivs_) { ivs = ivs_; }
   bool	 seen_vsindex () const { return seen_vsindex_; }
 
+  double blend_deltas (hb_array_t<const blend_arg_t> deltas)
+  {
+    double v = 0;
+    if (do_blend)
+    {
+      if (likely (scalars.length == deltas.length))
+      {
+	for (unsigned int i = 0; i < scalars.length; i++)
+	  v += (double) scalars[i] * deltas[i].to_real ();
+      }
+    }
+    return v;
+  }
+
   protected:
   void blend_arg (blend_arg_t &arg)
   {
@@ -234,8 +248,10 @@ struct cff2_cs_opset_t : cs_opset_t<blend_arg_t, OPSET, cff2_cs_interp_env_t, PA
     }
     for (unsigned int i = 0; i < n; i++)
     {
-      const hb_array_t<const blend_arg_t>	blends = env.argStack.get_subarray (start + n + (i * k));
-      env.argStack[start + i].set_blends (n, i, k, blends);
+      //const hb_array_t<const blend_arg_t>	blends = env.argStack.subarray (start + n + (i * k));
+      //xxxxxxxxxxxxxxxxx
+      env.argStack[start + i].set_real (env.argStack[start + i].to_real () +
+					env.blend_deltas (env.argStack.sub_array (start + n + (i * k), k)));
     }
 
     /* pop off blend values leaving default values now adorned with blend values */

We might need to duplicate this type, as it seems like the subsetter relies on the old behavior.

behdad added a commit that referenced this issue May 10, 2022
Do the blending immediately.

Fixes #3559

Benchmark on AdobeVFPrototype shows 35% speedup. Now we're faster
than FreeType:

Benchmark                                                           Time             CPU      Time Old      Time New       CPU Old       CPU New
------------------------------------------------------------------------------------------------------------------------------------------------
BM_Font/glyph_extents/AdobeVFPrototype.otf/hb                    -0.3792         -0.3792          1584           983          1581           982
BM_Font/glyph_extents/AdobeVFPrototype.otf/ft                    +0.0228         +0.0224          1220          1248          1218          1245
BM_Font/glyph_extents/AdobeVFPrototype.otf/var/hb                -0.3513         -0.3518          1616          1048          1613          1046
BM_Font/glyph_extents/AdobeVFPrototype.otf/var/ft                +0.0172         +0.0169          1232          1254          1230          1251
behdad added a commit that referenced this issue May 10, 2022
Do the blending immediately.

Fixes #3559

Benchmark on AdobeVFPrototype shows 35% speedup. Now we're faster
than FreeType:

Benchmark                                                           Time             CPU      Time Old      Time New       CPU Old       CPU New
------------------------------------------------------------------------------------------------------------------------------------------------
BM_Font/glyph_extents/AdobeVFPrototype.otf/hb                    -0.3792         -0.3792          1584           983          1581           982
BM_Font/glyph_extents/AdobeVFPrototype.otf/ft                    +0.0228         +0.0224          1220          1248          1218          1245
BM_Font/glyph_extents/AdobeVFPrototype.otf/var/hb                -0.3513         -0.3518          1616          1048          1613          1046
BM_Font/glyph_extents/AdobeVFPrototype.otf/var/ft                +0.0172         +0.0169          1232          1254          1230          1251
behdad added a commit that referenced this issue May 10, 2022
Do the blending immediately.

Fixes #3559

Benchmark on AdobeVFPrototype shows 35% speedup. Now we're faster
than FreeType:

Benchmark                                                           Time             CPU      Time Old      Time New       CPU Old       CPU New
------------------------------------------------------------------------------------------------------------------------------------------------
BM_Font/glyph_extents/AdobeVFPrototype.otf/hb                    -0.3792         -0.3792          1584           983          1581           982
BM_Font/glyph_extents/AdobeVFPrototype.otf/ft                    +0.0228         +0.0224          1220          1248          1218          1245
BM_Font/glyph_extents/AdobeVFPrototype.otf/var/hb                -0.3513         -0.3518          1616          1048          1613          1046
BM_Font/glyph_extents/AdobeVFPrototype.otf/var/ft                +0.0172         +0.0169          1232          1254          1230          1251
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant