[Bug]: Font rendering bug for Devanagari text #23082

Open
kach opened this issue May 20, 2022 · 7 comments


@kach

kach commented May 20, 2022

Bug summary

When rendered as part of a graph's axis label or title, Devanagari text renders incorrectly: the "matras" get affixed to the wrong letters (see the attached image, and note the difference between the string literal in the code and the rendered text in the title of the graph).

Code for reproduction

from matplotlib import pyplot as plt
import matplotlib
matplotlib.rc('font', family='Devanagari Sangam MN')
matplotlib.rc('font', size=18)

plt.title('किसान')

Actual outcome

image

Expected outcome

The rendered word should look the same as in the string literal in the code.

Additional information

No response

Operating system

macOS

Matplotlib Version

3.4.1

Matplotlib Backend

module://ipykernel.pylab.backend_inline

Python version

Python 3.9.12

Jupyter version

6.3.0

Installation

pip

@tacaswell
Member

I'm going to start by apologizing that I know nothing more about Devanagari rendering than this issue and a very quick skim of its Wikipedia entry.

It looks like the matra is a vowel diacritic that combines with the letter to its left, whereas the other vowels appear below or to the right (ref https://en.wikipedia.org/wiki/Devanagari#Vowel_diacritics). Do these render correctly, or are all vowels broken?
It also looks like there are ligatures between some consonants. Do those render correctly?

I am not sure off the top of my head whether we or FreeType are doing the detailed text layout here, but it is clear that the diacritic is not being handled correctly.

A couple of questions:

  • Does it work with other fonts? We have seen issues where fonts have errors in their metadata about glyph sizes, which results in incorrect layouts.
  • Does it work if you use https://github.com/matplotlib/mplcairo ? Under the hood it uses a different text layout engine that handles RTL correctly, so it will likely handle this case correctly too.

@tacaswell
Member

One more detail: we are rendering the text in the order it appears in the string:

In [1]: list('किसान')
Out[1]: ['क', 'ि', 'स', 'ा', 'न']
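
To make the listing above easier to read, here is a small sketch that prints each code point with its Unicode name and general category, using only the standard library. The helper name `describe` is made up for this illustration. It shows that the string stores the letter KA first and the vowel sign I second, which is the logical order a renderer has to rearrange when drawing.

```python
import unicodedata


def describe(text):
    """Return (code point, general category, Unicode name) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
        for ch in text
    ]


if __name__ == "__main__":
    # Same word as in the report; the matra follows its consonant in memory.
    for row in describe("किसान"):
        print(*row)
```

The second entry is a spacing combining mark rather than a letter, so drawing the characters one after another in string order cannot place it correctly.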

@kach
Author

kach commented May 20, 2022

Hi Thomas,

No apology necessary, thank you for looking into this issue. :)

Your understanding is correct. All other vowels are fine. 'क' + 'ि' is "special" in the sense that the 'ि' needs to be rendered to the left of the 'क'.

image

  • The problem persists when I use a different (Devanagari-compatible) font.
  • I don't know how to use mplcairo, but you should be able to run the experiment using the code snippet I shared above. On Macs that font is built-in.

The consonant-consonant ligatures "work" in the sense that they are not wrong, but they are not ideal. The Devanagari script has special ligatures for each pair of consonants, but there is also a generic way to write them by adding the "halanth" diacritic ([्]) to the first consonant. Matplotlib seems to just use the generic form everywhere. See below (compare to the respective string literals in Python):

image
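
For reference, both forms come from the same stored text: a conjunct is encoded as consonant, halanth (U+094D), consonant, and the font and shaping engine decide whether to draw a dedicated ligature or the generic form. The sketch below, with the hypothetical helper name `find_conjuncts`, lists the spans of a label that are candidates for ligature substitution. It assumes the plain consonant range U+0915 to U+0939 and ignores nukta and joiner characters.

```python
import re

# Consonant followed by one or more (halanth + consonant) pairs.
# Assumption: only the basic Devanagari consonants U+0915..U+0939 are considered.
CONJUNCT = re.compile("[\u0915-\u0939](?:\u094d[\u0915-\u0939])+")


def find_conjuncts(text):
    """Return the consonant clusters in `text` that are joined by a halanth."""
    return CONJUNCT.findall(text)


if __name__ == "__main__":
    print(find_conjuncts("मूर्ति"))
```

A renderer without OpenType shaping draws each of these code points separately, which matches the generic halanth form seen in the screenshots.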

@tacaswell
Member

Looking at the last example, it looks like the shaping on 'ि' is such that it goes over the whole ligature?

Unfortunately, I do not think this is going to be an easy fix. It is possible that we are clipping something to be positive that should be negative in order to move the diacritic to the left, but I suspect that our text system is not currently up to this.

@kach
Author

kach commented May 20, 2022

Hi Thomas,

You're right, the 'ि' should loop around the entire consonant cluster, though there are exceptions. For example, in मूर्ति the little arc on top that makes it र्त should be "outside" the matra.
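
As a rough illustration of the reordering described here, the sketch below moves each 'ि' in front of the consonant cluster it follows. It is not how Matplotlib or any real shaping engine works, and the names `visual_order` and `is_consonant` are invented for the example. It assumes plain consonants in U+0915 to U+0939 and does not handle the exception mentioned above, where the arc in र्त stays outside the matra.

```python
VIRAMA = "\u094d"        # halanth
VOWEL_SIGN_I = "\u093f"  # the matra that is drawn to the left


def is_consonant(ch):
    """True for the basic Devanagari consonants KA..HA (an assumption)."""
    return "\u0915" <= ch <= "\u0939"


def visual_order(text):
    """Return the characters of `text` with each 'ि' moved before its cluster."""
    out = []
    for ch in text:
        if ch != VOWEL_SIGN_I:
            out.append(ch)
            continue
        # Walk back over a cluster of the form C (VIRAMA C)*.
        i = len(out)
        if i > 0 and is_consonant(out[i - 1]):
            i -= 1
            while i >= 2 and out[i - 1] == VIRAMA and is_consonant(out[i - 2]):
                i -= 2
        out.insert(i, ch)
    return out


if __name__ == "__main__":
    print(visual_order("किसान"))
```

Real shaping also selects ligature glyphs and positions marks, so this only explains why drawing in string order puts the matra on the wrong side.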

I do need to produce graphs with (correct) Devanagari labels on them. Do you have recommendations on how to proceed?

@anntzer
Contributor

anntzer commented May 20, 2022

Unless I am mistaken, this is basically a duplicate of #8765.

@tacaswell
Member

The fastest thing is probably to use mplcairo, like so:

import matplotlib
matplotlib.use('module://mplcairo.qt')
matplotlib.rc('font', family='Noto Sans Devanagari')  # I already had Noto installed
import matplotlib.pyplot as plt
plt.title('किसान')

I am not sure how hard it would be to get mplcairo to play nice with inline, but if you need to save the output you are good to go!

If you definitely need images in a notebook, I think the way to go is module://mplcairo.base and then rely on the fact that the repr of a Figure is a PNG of it.
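
A minimal sketch of that notebook route, under several assumptions: mplcairo is installed, the 'Noto Sans Devanagari' font is available, and mplcairo was built with the shaping support it needs for complex scripts (as far as I understand, that depends on the optional Raqm library). The function name `render_title_png` is made up for this example. Instead of relying on the Figure repr, it renders to PNG bytes explicitly.

```python
import io


def render_title_png(text, font_family="Noto Sans Devanagari",
                     backend="module://mplcairo.base"):
    """Render an empty figure titled `text` and return it as PNG bytes."""
    # Imported here so the backend is selected before pyplot is loaded.
    import matplotlib
    matplotlib.use(backend)
    import matplotlib.pyplot as plt

    with matplotlib.rc_context({"font.family": font_family, "font.size": 18}):
        fig, ax = plt.subplots()
        ax.set_title(text)
        buf = io.BytesIO()
        fig.savefig(buf, format="png")
        plt.close(fig)
    return buf.getvalue()
```

In a notebook, the returned bytes can be shown with `IPython.display.Image(data=render_title_png('किसान'))`, or written to a file for use elsewhere.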
