Fonts stored in base85 can contain trigraphs. #839

Closed
tss3000 opened this issue Sep 22, 2016 · 4 comments

tss3000 commented Sep 22, 2016

When using

binary_to_compressed_c --base85 Cousine-Regular.ttf

the result ends up containing trigraphs like ??(, which will be replaced with other characters on some build setups.
It might be worth adjusting the base85 algorithm, or changing it to something like https://github.com/maksverver/base85

Edit:

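// Maps a base85 digit to a printable character. '\\', '?', ':' and '%' are remapped
// to the otherwise unused codes 120..123, so the output cannot form escape sequences,
// trigraphs or digraphs.
char Encode85Byte(unsigned int x)
{
    x = (x % 85) + 35; // plain alphabet covers codes 35..119
    switch (x)
    {
    case '\\': return (char)(35 + 85);
    case '?':  return (char)(35 + 86);
    case ':':  return (char)(35 + 87);
    case '%':  return (char)(35 + 88);
    default:   return (char)x;
    }
}

// Inverse of Encode85Byte: maps an encoded character back to its digit in 0..84.
unsigned int Decode85Byte(char c)
{
    switch (c)
    {
    case 35 + 85: return '\\' - 35;
    case 35 + 86: return '?' - 35;
    case 35 + 87: return ':' - 35;
    case 35 + 88: return '%' - 35;
    default:      return (unsigned int)(c - 35);
    }
}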
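
A quick way to sanity-check the remapping proposed above is to round-trip every digit. The sketch below copies the two functions so that it compiles on its own; `CheckProposedAlphabet` is a hypothetical helper name introduced only for this illustration and does not appear in the thread or the library.

```cpp
#include <cassert>

// Copy of the mapping proposed above, reproduced so this sketch is self-contained.
static char Encode85Byte(unsigned int x)
{
    x = (x % 85) + 35;
    switch (x)
    {
    case '\\': return (char)(35 + 85);
    case '?':  return (char)(35 + 86);
    case ':':  return (char)(35 + 87);
    case '%':  return (char)(35 + 88);
    default:   return (char)x;
    }
}

static unsigned int Decode85Byte(char c)
{
    switch (c)
    {
    case 35 + 85: return '\\' - 35;
    case 35 + 86: return '?' - 35;
    case 35 + 87: return ':' - 35;
    case 35 + 88: return '%' - 35;
    default:      return (unsigned int)(c - 35);
    }
}

// Hypothetical helper: returns true when every digit 0..84 decodes back to itself
// and never encodes to one of the four characters the proposal tries to avoid.
static bool CheckProposedAlphabet()
{
    for (unsigned int digit = 0; digit < 85; digit++)
    {
        const char c = Encode85Byte(digit);
        if (c == '\\' || c == '?' || c == ':' || c == '%')
            return false;
        if (Decode85Byte(c) != digit)
            return false;
    }
    return true;
}
```

Calling `assert(CheckProposedAlphabet());` once at startup would be enough to catch a mismatch between the encoder and the decoder.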

ocornut added a commit that referenced this issue Sep 23, 2016
ocornut closed this as completed Sep 23, 2016

Owner

ocornut commented Sep 23, 2016

Nice catch. Who ever remembers about trigraphs nowadays? I'm surprised they affect string literals, to be honest. Wikipedia is unclear:

The only places in a C file where two question marks in a row may be used are in multi-character constants, string literals, and comments

Then immediately following

To safely place two consecutive question marks within a string literal, the programmer can use string concatenation "...?""?..." or an escape sequence "...?\?...".
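
As a small illustration of the two spellings quoted above, the sketch below checks that both produce the same string. The names `kConcatenated`, `kEscaped` and `SafeSpellingsAgree` are made up for this example. It relies on trigraph replacement happening before adjacent literals are joined, which is how the C and C++ translation phases are ordered.

```cpp
#include <cstring>

// Neither literal has two adjacent question marks in the source text,
// yet each denotes the five characters  a ? ? ( b.
static const char kConcatenated[] = "a?" "?(b";  // adjacent literals are joined by the compiler
static const char kEscaped[]      = "a?\?(b";    // \? is the standard escape for a question mark

// Hypothetical helper for this sketch: true when both spellings yield the same string.
bool SafeSpellingsAgree()
{
    return std::strcmp(kConcatenated, kEscaped) == 0
        && std::strlen(kEscaped) == 5
        && kEscaped[1] == '?'
        && kEscaped[2] == '?';
}
```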

Changing the encoding isn't desirable as it would break already encoded data. Instead, I now emit an escape character whenever two ? characters appear in a row, as suggested by the Wikipedia entry:
https://en.wikipedia.org/wiki/Digraphs_and_trigraphs#C.2B.2B
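
For readers who want to apply the same idea in their own generator, here is a minimal sketch of the escaping step. It is not the code from the referenced commit: `EscapeTrigraphs` is a hypothetical helper, and it assumes the encoded text contains no other characters that need escaping inside a string literal.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not taken from the commit): returns the text to place between
// the quotes of a C/C++ string literal so that no two question marks are adjacent
// in the generated source. Assumes `encoded` holds no quotes or backslashes.
std::string EscapeTrigraphs(const std::string& encoded)
{
    std::string out;
    out.reserve(encoded.size());
    char prev = 0;
    for (std::size_t i = 0; i < encoded.size(); i++)
    {
        const char c = encoded[i];
        if (c == '?' && prev == '?')
            out += '\\';  // "\?" still denotes a question mark, so the string value is unchanged
        out += c;
        prev = c;
    }
    return out;
}
```

Because the backslash is only added in the generated source, the decoder sees exactly the same characters as before, so data encoded earlier stays valid.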

Does that work for you? I didn't test with an old compiler.

Do you have problems with digraphs as well? (<: %> etc.)


MrSapps commented Sep 23, 2016

I think C++11 and newer remove or disable them by default.

Owner

ocornut commented Sep 23, 2016

Yes.

Author

tss3000 commented Sep 23, 2016

I can confirm that the submitted changes resolve the issue. Surprisingly, I stumbled onto it when using the latest emscripten-fastcomp-clang (https://github.com/kripken/emscripten-fastcomp-clang). I don't plan on using either trigraphs or digraphs in any of my projects, so I'll be a user of "-Werror=trigraphs" from now on.

Thanks for the help and for this amazing library.
