Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression (probably) parsing METADATA.pb #2200

Closed
graphicore opened this issue Nov 14, 2018 · 13 comments
Closed

Regression (probably) parsing METADATA.pb #2200

graphicore opened this issue Nov 14, 2018 · 13 comments
Assignees
Labels
Milestone

Comments

@graphicore
Copy link
Collaborator

(with fontbakery 0.6.1)

Alfa Slab One (METADATA.pb from 2016!!!)

E.g. http://35.225.170.228/report/2763e5df-29e1-4b80-9d68-bfc53400a4e9

(source is the same as in https://github.com/google/fonts/tree/fb839dd24542f8f1b0678af6f74335553ee8e61a/ofl/alfaslabone)

Example:

>> com.google.fonts/check/081
   METADATA.pb: Fontfamily is listed on Google Fonts API?
 * ERROR: The condition <FontBakeryCondition:family_metadata> had an error: ParseError: 3:1 : Couldn't parse string: 'utf-8' codec can't decode byte 0xe9 in position 6: unexpected end of data
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/checkrunner.py", line 377, in _evaluate_condition
             return None, condition(**args)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/callable.py", line 71, in __call__
             return self._func(*args, **kwds)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/specifications/googlefonts.py", line 408, in family_metadata
             return get_FamilyProto_Message(pb_file)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/utils.py", line 140, in get_FamilyProto_Message
             text_format.Merge(text_data, message)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 536, in Merge
             descriptor_pool=descriptor_pool)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 590, in MergeLines
             return parser.MergeLines(lines, message)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 623, in MergeLines
             self._ParseOrMerge(lines, message)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 638, in _ParseOrMerge
             self._MergeField(tokenizer, message)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 763, in _MergeField
             merger(tokenizer, message, field)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 888, in _MergeScalarField
             value = tokenizer.ConsumeString()
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 1255, in ConsumeString
             raise self._StringParseError(e)
         

   Result: ERROR

also:

>> com.google.fonts/check/118 with (('font[0]', '../fonts/ofl/alfaslabone/AlfaSlabOne-Regular.ttf'),)
   Glyphs are similiar to Google Fonts version?
 * ERROR: The condition <FontBakeryCondition:api_gfonts_ttFont> had an error: FailedConditionError: The condition <FontBakeryCondition:remote_styles> had an error: FailedConditionError: The condition <FontBakeryCondition:family_metadata> had an error: ParseError: 3:1 : Couldn't parse string: 'utf-8' codec can't decode byte 0xe9 in position 6: unexpected end of data
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/checkrunner.py", line 370, in _evaluate_condition
             args = self._get_args(condition, iterargs, path)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/checkrunner.py", line 509, in _get_args
             args[name] = self._get(name, iterargs, path)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/checkrunner.py", line 483, in _get
             raise error
         

   Result: ERROR

But, there's also a failed attempt parsing METADATA.pb due to unknown fields defined in the file. Thus we should have the chck and condition does_METADATA_parse:

(I don't know where these tested files came from)

http://35.225.170.228/report/d8819464-9208-46cd-9151-903dfd0f4b93

ERROR com.google.fonts/check/092 Checks METADATA.pb font.name field matches family name declared on the name table.

    ERROR
    The condition <FontBakeryCondition:font_metadata> had an error: FailedConditionError: The condition <FontBakeryCondition:family_metadata> had an error: ParseError: 4:1 : Message type "google.fonts.FamilyProto" has no field named "visibility".
    traceback

      File "/usr/local/lib/python3.6/dist-packages/fontbakery/checkrunner.py", line 370, in _evaluate_condition
        args = self._get_args(condition, iterargs, path)
      File "/usr/local/lib/python3.6/dist-packages/fontbakery/checkrunner.py", line 509, in _get_args
        args[name] = self._get(name, iterargs, path)
      File "/usr/local/lib/python3.6/dist-packages/fontbakery/checkrunner.py", line 483, in _get
        raise error
@graphicore
Copy link
Collaborator Author

Another example, probably related:

http://35.225.170.228/report/0bdc5b01-cead-4ee1-82c6-91fda8e08fac

This is Lato from current google/fonts@master. We get many of these:

 com.google.fonts/check/081
   METADATA.pb: Fontfamily is listed on Google Fonts API?
 * ERROR: The condition <FontBakeryCondition:family_metadata> had an error: IndexError: list index out of range
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/checkrunner.py", line 377, in _evaluate_condition
             return None, condition(**args)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/callable.py", line 71, in __call__
             return self._func(*args, **kwds)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/specifications/googlefonts.py", line 408, in family_metadata
             return get_FamilyProto_Message(pb_file)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/utils.py", line 140, in get_FamilyProto_Message
             text_format.Merge(text_data, message)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 536, in Merge
             descriptor_pool=descriptor_pool)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 590, in MergeLines
             return parser.MergeLines(lines, message)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 623, in MergeLines
             self._ParseOrMerge(lines, message)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 638, in _ParseOrMerge
             self._MergeField(tokenizer, message)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 763, in _MergeField
             merger(tokenizer, message, field)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 888, in _MergeScalarField
             value = tokenizer.ConsumeString()
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 1251, in ConsumeString
             the_bytes = self.ConsumeByteString()
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 1266, in ConsumeByteString
             the_list = [self._ConsumeSingleByteString()]
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 1291, in _ConsumeSingleByteString
             result = text_encoding.CUnescape(text[1:-1])
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_encoding.py", line 103, in CUnescape
             result = ''.join(_cescape_highbit_to_str[ord(c)] for c in result)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_encoding.py", line 103, in <genexpr>
             result = ''.join(_cescape_highbit_to_str[ord(c)] for c in result)
         

   Result: ERROR

@graphicore
Copy link
Collaborator Author

One more similar:

fontbakery check-googlefonts -l ERROR ../fonts/ofl/inknutantiqua/* (google/fonts@master)

or http://35.225.170.228/report/0bdc5b01-cead-4ee1-82c6-91fda8e08fac

Example:

>> com.google.fonts/check/110 with (('font[1]', '../fonts/ofl/inknutantiqua/InknutAntiqua-Bold.ttf'),)
   METADATA.pb: Check font name is the same as family name.
 * ERROR: The condition <FontBakeryCondition:family_metadata> had an error: ParseError: 3:1 : Couldn't parse string: 'utf-8' codec can't decode byte 0xf8 in position 14: invalid start byte
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/checkrunner.py", line 377, in _evaluate_condition
             return None, condition(**args)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/callable.py", line 71, in __call__
             return self._func(*args, **kwds)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/specifications/googlefonts.py", line 408, in family_metadata
             return get_FamilyProto_Message(pb_file)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/Lib/fontbakery/utils.py", line 140, in get_FamilyProto_Message
             text_format.Merge(text_data, message)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 536, in Merge
             descriptor_pool=descriptor_pool)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 590, in MergeLines
             return parser.MergeLines(lines, message)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 623, in MergeLines
             self._ParseOrMerge(lines, message)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 638, in _ParseOrMerge
             self._MergeField(tokenizer, message)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 763, in _MergeField
             merger(tokenizer, message, field)
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 888, in _MergeScalarField
             value = tokenizer.ConsumeString()
           File "/home/commander/Projekte/fontdev/googlefonts/fontbakery/venv/lib/python3.6/site-packages/google/protobuf/text_format.py", line 1255, in ConsumeString
             raise self._StringParseError(e)
         

   Result: ERROR

@felipesanches
Copy link
Collaborator

crash is caused by the designer field:

screenshot at 2018-12-02 20 13 29

@davelab6
Copy link
Contributor

davelab6 commented Dec 2, 2018

I am astonished to see this with py3 code.

@felipesanches
Copy link
Collaborator

felipesanches commented Dec 3, 2018

I think this is not a bug in our code.

I investigated it a bit further. The offending character in that string is clearly the 'é'.

I started by using the interactive python debugger and traced the execution of the protobuf code from my local virtualenv:

screenshot at 2018-12-03 01 07 46

Then I tried the same bytes in the python shell and reproduced the crash:

screenshot at 2018-12-03 01 05 17

I also noticed that when I encode the "JM Solé" string with the utf-8 encoding, I get a different sequence of bytes: \xc3\xa9 versus \xe9. Well... \xc3\xa9 is UTF-8 while \xe9 is 8bit ASCII.
So I presumed that the mistake was that the author of METADATA.pb saved it encoded as ASCII.

So, I used hexdump to confirm that METADATA.pb has the ASCII 'é' (\xe9), but was shocked to see it actually has \xc3\xa9:

screenshot at 2018-12-03 01 13 04

I am not sure yet why, but the protobuf python package somehow managed to read UTF-8 from the METADATA.pb file and then ended up with 8-bit ASCII and then crashed while trying to interpret it as UTF-8...

@felipesanches
Copy link
Collaborator

I continued investigating and found this:

screenshot at 2018-12-03 01 37 49

It seems that the 8-bit ASCII string is being created at text_encoding.CUnescape.

So I googled. And I think this gives us the answer:
screenshot at 2018-12-03 01 37 07

I looked up the sources of my locally installed protobuf and it is indeed encoding the string into "ascii" and it does not yet include the patch from that pull request (protocolbuffers/protobuf#4804).

@felipesanches
Copy link
Collaborator

I patched my local protobuf and now the fontbakery check/081 PASSes withut a crash.

screenshot at 2018-12-03 01 43 57

The upstream bugfix is from 2018 July 20. There was a release on July (protobuf 3.6.1) and it is available on PyPI. Our setup.py does not specify a version, so I would expect that a pip install would get us the latest version, and I can confirm that I do indeed have 3.6.1 installed:

screenshot at 2018-12-03 01 48 57

So, maybe that pull request was later reverted? I'll try to figure out.

@felipesanches
Copy link
Collaborator

yeah... it was reverted: protocolbuffers/protobuf#4804 (comment)

And the problem is currently being tracked at protocolbuffers/protobuf#4721 (and they're already talking about Google Fonts repo METADATA.pb files in there :-D)

@felipesanches
Copy link
Collaborator

We will have to wait for the upcoming protobuf 3.7.0 release.

screenshot at 2018-12-03 01 59 06

@felipesanches felipesanches changed the title Regression (probably) parsing METADATA.pb and add check and condition "does_METADATA_parse" Regression (probably) parsing METADATA.pb Dec 4, 2018
@thundernixon
Copy link
Contributor

thundernixon commented Dec 18, 2018

Was going to file a separate issue, but realized it would be a duplicate. So, I'll do a +1 here so that it might show up in other searches a little better...

METADATA.pb fails to parse when there is a non-ASCII character (e.g. in a designer's name)

Observed behaviour

Signika is designed by Anna Giedryś.

Unfortunately, if Anna's proper name is placed in METADATA.pb, it triggers a bunch of FontBakery fails. For example this metadata:

name: "Signika"
designer: "Anna Giedryś"

...causes this:

image

💔 A bunch of FontBakery METADATA fails
💔 ERROR: Check METADATA.pb parse correctly.
💔 ERROR: Font designer field in METADATA.pb must not be 'unknown'.
  • com.google.fonts/check/007
  • 💔 ERROR The condition FontBakeryCondition:family_metadata had an error: IndexError: list index out of range
💔 ERROR: METADATA.pb: Fontfamily is listed on Google Fonts API?
  • com.google.fonts/check/081
  • 💔 ERROR The condition FontBakeryCondition:family_metadata had an error: IndexError: list index out of range
💔 ERROR: METADATA.pb: check if fonts field only has unique "full_name" values.
  • com.google.fonts/check/083
  • 💔 ERROR The condition FontBakeryCondition:family_metadata had an error: IndexError: list index out of range
💔 ERROR: METADATA.pb: check if fonts field only contains unique style:weight pairs.
  • com.google.fonts/check/084
  • 💔 ERROR The condition FontBakeryCondition:family_metadata had an error: IndexError: list index out of range
💔 ERROR: METADATA.pb license is "APACHE2", "UFL" or "OFL"?
  • com.google.fonts/check/085
  • 💔 ERROR The condition FontBakeryCondition:family_metadata had an error: IndexError: list index out of range
💔 ERROR: METADATA.pb should contain at least "menu" and "latin" subsets.
  • com.google.fonts/check/086
  • 💔 ERROR The condition FontBakeryCondition:family_metadata had an error: IndexError: list index out of range
💔 ERROR: METADATA.pb subsets should be alphabetically ordered.
  • com.google.fonts/check/087
  • 💔 ERROR The condition FontBakeryCondition:family_metadata had an error: IndexError: list index out of range
💔 ERROR: METADATA.pb: Copyright notice is the same in all fonts?
  • com.google.fonts/check/088
  • 💔 ERROR The condition FontBakeryCondition:family_metadata had an error: IndexError: list index out of range
💔 ERROR: Check that METADATA.pb family values are all the same.
  • com.google.fonts/check/089
  • 💔 ERROR The condition FontBakeryCondition:family_metadata had an error: IndexError: list index out of range
💔 ERROR: METADATA.pb: According Google Fonts standards, families should have a Regular style.
  • com.google.fonts/check/090
  • 💔 ERROR The condition FontBakeryCondition:family_metadata had an error: IndexError: list index out of range
💔 ERROR: METADATA.pb: Regular should be 400.
  • com.google.fonts/check/091
  • 💔 ERROR The condition FontBakeryCondition:family_metadata had an error: IndexError: list index out of range

Whereas this metadata:

name: "Signika"
designer: "Anna Giedrys"

...yields this result:

image

Expected behaviour

It is incorrect to spell Anna's name with no /sacute, so it looks like we'll have to wait for an update.

Hoping for a new protobuff release soon. 🙏

@felipesanches
Copy link
Collaborator

Protobuf 3.7.0 was released. We will use it on the next fontbakery release. I have just updated requirements.txt

felipesanches added a commit to felipesanches/fontbakery that referenced this issue Mar 12, 2019
Special note to protobuf version 3.7.0 which includes a fix
for a bug that affected fontbakery (issue fonttools#2200)
felipesanches added a commit that referenced this issue Mar 12, 2019
Special note to protobuf version 3.7.0 which includes a fix
for a bug that affected fontbakery (issue #2200)
@felipesanches
Copy link
Collaborator

I have just re-run fontbakery on Signika and the parsing errors are gone. Cheers!

@thundernixon
Copy link
Contributor

Nice; thanks!

felipesanches added a commit to felipesanches/fontbakery that referenced this issue Mar 23, 2019
a bug that affected parsing some METADATA.pb files correctly.
(issue fonttools#2200)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants