Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split the functionality in src/core/fonts.js into multiple files, and use standard classes #13327

Merged
merged 18 commits into from May 5, 2021

Conversation

Snuffleupagus
Copy link
Collaborator

Obviously the Font-class is still very large, given particularly how TrueType fonts are handled, however this patch-series at least improves things by moving a number of functions/classes into their own files.
As a follow-up it might make sense to try and re-factor/extract the TrueType parsing into its own file, since all of this code is quite old, however that's probably best left for another time.

For e.g. gulp mozcentral, the built pdf.worker.js files decreases from 1 620 332 to 1 617 466 bytes with this patch-series.

These changes were made automatically, using `gulp lint --fix`.
These changes were made automatically, using `gulp lint --fix`.
…d into their own file

 - `FontFlags`, is used in both `src/core/fonts.js` and `src/core/evaluator.js`.
 - `getFontType`, same as the above.
 - `MacStandardGlyphOrdering`, is a fairly large data-structure and `src/core/fonts.js` is already a *very* large file.
 - `recoverGlyphName`, a dependency of `type1FontGlyphMapping`; please see below.
 - `SEAC_ANALYSIS_ENABLED`, is used by both `Type1Font`, `CFFFont`, and unit-tests; please see below.
 - `type1FontGlyphMapping`, is used by both `Type1Font` and `CFFFont` which a later patch will move to their own files.
These changes were made *mostly* automatically, using `gulp lint --fix`, with the following manual changes:

```diff
diff --git a/src/core/fonts_utils.js b/src/core/fonts_utils.js
index f88ce4a8c..c4b3f3808 100644
--- a/src/core/fonts_utils.js
+++ b/src/core/fonts_utils.js
@@ -167,8 +167,8 @@ function type1FontGlyphMapping(properties, builtInEncoding,
glyphNames) {
   }

   // Lastly, merge in the differences.
-  let differences = properties.differences,
-    glyphsUnicodeMap;
+  const differences = properties.differences;
+  let glyphsUnicodeMap;
   if (differences) {
     for (charCode in differences) {
       const glyphName = differences[charCode];
```
These changes were made automatically, using `gulp lint --fix`.
These changes were made *mostly* automatically, using `gulp lint --fix`, with the following manual changes:

```diff
diff --git a/src/core/type1_font.js b/src/core/type1_font.js
index 50a3e49e6..55a2005fb 100644
--- a/src/core/type1_font.js
+++ b/src/core/type1_font.js
@@ -38,10 +38,9 @@ const Type1Font = (function Type1FontClosure() {
     const scanLength = streamBytesLength - signatureLength;

     let i = startIndex,
-      j,
       found = false;
     while (i < scanLength) {
-      j = 0;
+      let j = 0;
       while (j < signatureLength && streamBytes[i + j] === signature[j]) {
         j++;
       }
@@ -248,14 +247,14 @@ const Type1Font = (function Type1FontClosure() {
         return charCodeToGlyphId;
       }

-      let glyphNames = [".notdef"],
-        glyphId;
+      const glyphNames = [".notdef"];
+      let builtInEncoding, glyphId;
       for (glyphId = 0; glyphId < charstrings.length; glyphId++) {
         glyphNames.push(charstrings[glyphId].glyphName);
       }
       const encoding = properties.builtInEncoding;
       if (encoding) {
-        var builtInEncoding = Object.create(null);
+        builtInEncoding = Object.create(null);
         for (const charCode in encoding) {
           glyphId = glyphNames.indexOf(encoding[charCode]);
           if (glyphId >= 0
```
These changes were made automatically, using `gulp lint --fix`.
Given the large size of this patch, the manual fixes are done separately in the next commit.
Obviously the `Font`-class is still *very* large, given particularly how TrueType fonts are handled, however this patch-series at least improves things by moving a number of functions/classes into their own files.
As a follow-up it might make sense to try and re-factor/extract the TrueType parsing into its own file, since all of this code is quite old, however that's probably best left for another time.

For e.g. `gulp mozcentral`, the *built* `pdf.worker.js` files decreases from `1 620 332` to `1 617 466` bytes with this patch-series.
It shouldn't be possible for the `getBytes`-call to throw a `MissingDataException`, since all resources are loaded *before* e.g. font-parsing ever starts; see https://github.com/mozilla/pdf.js/blob/f0817015bd0b5b3dd498427df85c564e9eb08603/src/core/object_loader.js#L111-L126

Furthermore, even if we'd *somehow* re-throw a `MissingDataException` here that still won't help considering where the `Type1Font`-instance is created. Note how in the `Font`-constructor we simply catch any errors and fallback to a standard font, which means that a `MissingDataException` would just lead to rendering errors anyway; see https://github.com/mozilla/pdf.js/blob/f0817015bd0b5b3dd498427df85c564e9eb08603/src/core/fonts.js#L648-L691

All-in-all, it's not possible for a `MissingDataException` to be thrown in `getHeaderBlock` and this code-path can thus be removed.
@Snuffleupagus
Copy link
Collaborator Author

/botio test

@pdfjsbot
Copy link

pdfjsbot commented May 3, 2021

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/918188c203e4f89/output.txt

@pdfjsbot
Copy link

pdfjsbot commented May 3, 2021

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://3.101.106.178:8877/415f6543f5109a3/output.txt

@pdfjsbot
Copy link

pdfjsbot commented May 3, 2021

From: Bot.io (Linux m4)


Failed

Full output at http://54.67.70.0:8877/918188c203e4f89/output.txt

Total script time: 25.57 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED

Image differences available at: http://54.67.70.0:8877/918188c203e4f89/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

pdfjsbot commented May 3, 2021

From: Bot.io (Windows)


Failed

Full output at http://3.101.106.178:8877/415f6543f5109a3/output.txt

Total script time: 29.69 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED

Image differences available at: http://3.101.106.178:8877/415f6543f5109a3/reftest-analyzer.html#web=eq.log

@Snuffleupagus Snuffleupagus marked this pull request as ready for review May 4, 2021 10:32
@Snuffleupagus
Copy link
Collaborator Author

/botio-linux preview

@pdfjsbot
Copy link

pdfjsbot commented May 4, 2021

From: Bot.io (Linux m4)


Received

Command cmd_preview from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/59e13da8d8eb3fd/output.txt

@pdfjsbot
Copy link

pdfjsbot commented May 4, 2021

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/59e13da8d8eb3fd/output.txt

Total script time: 4.43 mins

Published

@timvandermeij timvandermeij merged commit afb8c4f into mozilla:master May 5, 2021
@timvandermeij
Copy link
Contributor

Nice work!

@Snuffleupagus Snuffleupagus deleted the split-fonts branch May 5, 2021 20:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants