Social card cyrillic encoding bug #3099

Closed
5 tasks done
Zilborg opened this issue Oct 11, 2021 · 7 comments
Labels
bug Issue reports a bug resolved Issue is resolved, yet unreleased if open

Comments

@Zilborg

Zilborg commented Oct 11, 2021

Contribution guidelines

I've found a bug and checked that ...

  • ... the problem doesn't occur with the mkdocs or readthedocs themes
  • ... the problem persists when all overrides are removed, i.e. custom_dir, extra_javascript and extra_css
  • ... the documentation does not mention anything about my problem
  • ... there are no open or closed issues that are related to my problem

Description

UTF-8 encoding is missing (imho).
I think the trouble is the missing encoding in ImageFont.truetype(self.font.get(700), 36, encoding="utf-8"); however, I couldn't fix this.

Your source link.

Expected behaviour

Correct social card =)

Actual behaviour

Screenshot 2021-10-11 at 10 37 09

The Russian title and description are rendered as squares.

Steps to reproduce

  1. Write title = 'Привет мир!' ("Hello world!" in Russian) and/or description = 'Привет мир!'
  2. Build the site
  3. Check the card at https://cards-dev.twitter.com/validator

Package versions

Built with a GitHub Action.
I have a fork of Insiders. The last build was based on your commit 96e9a28f989abc498bc6f5ced4d0d061908c95ff (commit link).

Configuration

# Configuration
theme:
  # Default values, taken from mkdocs_theme.yml
  language: en

# Plugins
plugins:
  - social

System information

I think it doesn't matter.

@squidfunk squidfunk added the bug Issue reports a bug label Oct 11, 2021
@squidfunk
Owner

squidfunk commented Oct 11, 2021

Thanks for reporting. I think the problem lies in the font that's being downloaded. Google Fonts may need to be told to download the set including the Cyrillic charset. I'll investigate asap.

@squidfunk
Owner

The problem is not what you were suspecting, but that Google Fonts provides multiple fonts for different Unicode ranges. For Roboto, the following ranges are available:

/* cyrillic-ext */
@font-face {
  font-family: 'Roboto';
  font-style: normal;
  font-weight: 400;
  src: url(https://fonts.gstatic.com/s/roboto/v29/KFOmCnqEu92Fr1Mu72xKKTU1Kvnz.woff2) format('woff2');
  unicode-range: U+0460-052F, U+1C80-1C88, U+20B4, U+2DE0-2DFF, U+A640-A69F, U+FE2E-FE2F;
}
/* cyrillic */
@font-face {
  font-family: 'Roboto';
  font-style: normal;
  font-weight: 400;
  src: url(https://fonts.gstatic.com/s/roboto/v29/KFOmCnqEu92Fr1Mu5mxKKTU1Kvnz.woff2) format('woff2');
  unicode-range: U+0400-045F, U+0490-0491, U+04B0-04B1, U+2116;
}
/* greek-ext */
@font-face {
  font-family: 'Roboto';
  font-style: normal;
  font-weight: 400;
  src: url(https://fonts.gstatic.com/s/roboto/v29/KFOmCnqEu92Fr1Mu7mxKKTU1Kvnz.woff2) format('woff2');
  unicode-range: U+1F00-1FFF;
}
/* greek */
@font-face {
  font-family: 'Roboto';
  font-style: normal;
  font-weight: 400;
  src: url(https://fonts.gstatic.com/s/roboto/v29/KFOmCnqEu92Fr1Mu4WxKKTU1Kvnz.woff2) format('woff2');
  unicode-range: U+0370-03FF;
}
/* vietnamese */
@font-face {
  font-family: 'Roboto';
  font-style: normal;
  font-weight: 400;
  src: url(https://fonts.gstatic.com/s/roboto/v29/KFOmCnqEu92Fr1Mu7WxKKTU1Kvnz.woff2) format('woff2');
  unicode-range: U+0102-0103, U+0110-0111, U+0128-0129, U+0168-0169, U+01A0-01A1, U+01AF-01B0, U+1EA0-1EF9, U+20AB;
}
/* latin-ext */
@font-face {
  font-family: 'Roboto';
  font-style: normal;
  font-weight: 400;
  src: url(https://fonts.gstatic.com/s/roboto/v29/KFOmCnqEu92Fr1Mu7GxKKTU1Kvnz.woff2) format('woff2');
  unicode-range: U+0100-024F, U+0259, U+1E00-1EFF, U+2020, U+20A0-20AB, U+20AD-20CF, U+2113, U+2C60-2C7F, U+A720-A7FF;
}
/* latin */
@font-face {
  font-family: 'Roboto';
  font-style: normal;
  font-weight: 400;
  src: url(https://fonts.gstatic.com/s/roboto/v29/KFOmCnqEu92Fr1Mu4mxKKTU1Kg.woff2) format('woff2');
  unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC, U+02C6, U+02DA, U+02DC, U+2000-206F, U+2074, U+20AC, U+2122, U+2191, U+2193, U+2212, U+2215, U+FEFF, U+FFFD;
}

We could, in theory, pick another font, as we're currently just picking latin, which results in the problem you're seeing. We can't just pick cyrillic, because then we couldn't render ASCII characters, as those are not part of that font. I'll think it over and see how we can work towards a solution, but it's definitely a tricky one 😉
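The subset problem described above can be checked with a few lines of Python. This is only an illustrative sketch, not code from the plugin: the two unicode-range values are copied from the `latin` and `cyrillic` rules listed above, while the helper names `parse_unicode_range` and `covers` are made up for this example.

```python
# Sketch: test whether a single Google Fonts subset can render a given title.
# The range strings are copied from the @font-face rules quoted above;
# the function names are hypothetical helpers, not part of any library.

SUBSETS = {
    "cyrillic": "U+0400-045F, U+0490-0491, U+04B0-04B1, U+2116",
    "latin": (
        "U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC, U+02C6, U+02DA, "
        "U+02DC, U+2000-206F, U+2074, U+20AC, U+2122, U+2191, U+2193, "
        "U+2212, U+2215, U+FEFF, U+FFFD"
    ),
}


def parse_unicode_range(value):
    """Turn a CSS unicode-range value into a list of (low, high) code points."""
    ranges = []
    for part in value.split(","):
        part = part.strip()[2:]  # drop the leading "U+"
        start, _, end = part.partition("-")
        low = int(start, 16)
        high = int(end, 16) if end else low  # single code point, e.g. U+2116
        ranges.append((low, high))
    return ranges


def covers(ranges, text):
    """Return True if every character of text falls inside one of the ranges."""
    return all(
        any(low <= ord(ch) <= high for low, high in ranges) for ch in text
    )


if __name__ == "__main__":
    title = "Привет мир!"
    for name, value in SUBSETS.items():
        print(name, covers(parse_unicode_range(value), title))
```

Neither subset covers the whole title: `latin` lacks the Cyrillic letters, and `cyrillic` lacks the space and the exclamation mark, which is why a font file containing the full character set is needed.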

@squidfunk
Owner

Okay, so it's definitely possible:

creating-your-site

What we have to do is download the complete font families so they include all characters, rather than parsing the subset URLs from the CSS. Unfortunately, I can't find any documentation on how to download fonts programmatically, but they all seem to be hosted here:

https://github.com/google/fonts

@Zilborg
Author

Zilborg commented Oct 11, 2021

Now it's working.
image

I've added a script to the GitHub Action, but it's a kludge:

     ....
      - name: Install Python dependencies
        run: |
          pip install git+https://${GH_TOKEN}@github.com/Zilborg/mkdocs-material-insiders.git
      - name: Download & unzip fonts
        run: |
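          # Quote the URL so the shell does not interpret "?" as a glob pattern
          curl -o fonts.zip -L "https://fonts.google.com/download?family=Roboto"
          # -p avoids an error if .cache already exists
          mkdir -p .cache
          unzip -p fonts.zip Roboto-Regular.ttf > .cache/Roboto.400.ttf
          unzip -p fonts.zip Roboto-Bold.ttf > .cache/Roboto.700.ttf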
      - name: Deploy documentation
      ....

@squidfunk
Owner

Perfect, I'll adjust the plugin so that it uses this logic. Thanks for investigating!

@squidfunk
Owner

squidfunk commented Oct 17, 2021

Fixed in 351dd162c. Thanks again, your efforts were a huge help!

The social plugin will now download the fonts from https://fonts.google.com/download and extract them to the .cache folder, which ensures that the entire supported character set is available. All fonts are kept, not only the regular and bold weights. This will later allow specifying alternative font weights for different parts of the social card if desired. In the future, the social card will have more configuration options, and this is an important step in that direction 😊
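For readers who want to reproduce the download-and-extract step outside the plugin, here is a rough sketch. It is not the code from the fix. It assumes the `https://fonts.google.com/download?family=...` endpoint used in the workaround above returns a zip archive, and the function names `download_family` and `extract_fonts` are hypothetical.

```python
# Sketch: fetch a font family archive and keep every .ttf file in .cache.
# Assumption: the download endpoint from the workaround above serves a zip.
import io
import os
import zipfile
from urllib.parse import urlencode
from urllib.request import urlopen


def download_family(family):
    """Fetch the archive for a font family as bytes (needs network access)."""
    url = "https://fonts.google.com/download?" + urlencode({"family": family})
    with urlopen(url) as response:
        return response.read()


def extract_fonts(archive, cache_dir=".cache"):
    """Write all .ttf members of the archive bytes into cache_dir."""
    os.makedirs(cache_dir, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".ttf"):
                continue  # skip licenses, readmes and directories
            # Flatten any folder structure inside the archive
            target = os.path.join(cache_dir, os.path.basename(name))
            with open(target, "wb") as f:
                f.write(zf.read(name))
            extracted.append(target)
    return sorted(extracted)


if __name__ == "__main__":
    for path in extract_fonts(download_family("Roboto")):
        print(path)
```

Keeping every `.ttf` file, rather than only the regular and bold weights, mirrors the behaviour described above and leaves room for choosing other weights later.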

Furthermore, if a font doesn't include a bold variant, the social plugin will now fall back to the regular weight.
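The fallback can be pictured roughly as follows. This is a minimal sketch under assumptions, not the plugin's implementation: it assumes the cached files follow the `Family-Style.ttf` naming seen in the archive above (for example `Roboto-Bold.ttf`), and `resolve_font` is a hypothetical helper.

```python
# Sketch: prefer the bold file of a family, fall back to regular if missing.
# Assumption: files in the cache are named "<Family>-<Style>.ttf".
import os


def resolve_font(cache_dir, family, style="Bold", fallback="Regular"):
    """Return the path of the requested style, or of the fallback style."""
    preferred = os.path.join(cache_dir, f"{family}-{style}.ttf")
    if os.path.isfile(preferred):
        return preferred
    # The requested style is not available for this family
    return os.path.join(cache_dir, f"{family}-{fallback}.ttf")
```

The returned path could then be passed to `ImageFont.truetype` in place of a fixed bold file.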

@squidfunk squidfunk added the resolved Issue is resolved, yet unreleased if open label Oct 17, 2021
@squidfunk
Owner

Released as part of 7.3.4+insiders-3.1.4
