Vietnamese names are mis-rendered #475

Closed
interDist opened this issue Jul 11, 2018 · 4 comments
Comments

@interDist

The rendering of full Vietnamese names of places and streets (names containing all the Vietnamese diacritics, name:nonlatin) is faulty. Usually only a couple of vowels are displayed. This seems to be unrelated to the font used and more a problem with the data itself (the issue manifests itself on every map demonstrated on the openmaptiles.org website).

For example, curl 'https://maps.tilehosting.com/data/v3/11/1625/900.pbf?key=alS7XjesrAd6uvek9nRE' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36' -H 'Origin: https://openmaptiles.org' --compressed returns a tile for the capital Hanoi, but the PBF contains name:nonlatin �ộ��. @ChrisLoer confirmed this is not a problem with the rendering engine (mapbox-gl-js), in mapbox/mapbox-gl-js#6939.

Demonstration

image

Expected Behavior

Any Vietnamese name is rendered correctly. For example, the two biggest cities should be rendered HÀ NỘI and THÀNH PHỐ HỒ CHÍ MINH.

Actual Behavior

Most Vietnamese names are rendered inadequately. For example, for the two biggest cities the map shows “Ộ” and “Ố Ồ”...

@klokan
Member

klokan commented Jul 11, 2018

Verified. It looks like a bug in the data processing.

This should be fixed by improving the osml10n_is_latin function, something in this direction:
openmaptiles/import-sql@503f0b7

Help wanted - if anybody from the community is willing to contribute with a Pull Request - we would be very glad to review and accept it...

klokan added this to the v3.9 milestone Jul 11, 2018
@ChrisLoer

Using Hanoi as a specific example, the base name is "Hà Nội" followed by two control characters (U+001A and U+0007, as encoded in the link below):

https://apps.timwhitlock.info/unicode/inspect?s=H%C3%A0+N%E1%BB%99i%1A%07#block-U0080

I'm not sure why the is_latin check appears to be false, because all of those characters appear to be covered in the check code. Are those ranges actually specified correctly? (E.g. I'm not sure how that substring function works: is it working with UTF-16? Using "ascii" for the variable name there is confusing.)
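To make the question about ranges concrete, here is a minimal Python sketch (not the PL/pgSQL from import-sql) that decodes the string from the inspector link above and tests each code point against a set of Latin blocks. The `LATIN_RANGES` table and the function names are assumptions made for illustration; the ranges actually used by the is_latin check may differ.

```python
from urllib.parse import unquote_plus

# Assumed Latin blocks for this sketch (not copied from import-sql).
# Note that the first range also covers ASCII control characters,
# digits and punctuation, so they count as "Latin" here.
LATIN_RANGES = [
    (0x0000, 0x024F),  # Basic Latin through Latin Extended-B
    (0x1E00, 0x1EFF),  # Latin Extended Additional (contains U+1ED9, the o in "Nội")
]


def is_latin_codepoint(cp):
    """Return True if the code point falls inside one of the assumed ranges."""
    return any(lo <= cp <= hi for lo, hi in LATIN_RANGES)


def is_latin(text):
    """Return True if every character of text is inside the assumed ranges."""
    return all(is_latin_codepoint(ord(ch)) for ch in text)


# Query string taken verbatim from the inspector URL in this comment.
base_name = unquote_plus("H%C3%A0+N%E1%BB%99i%1A%07")

for ch in base_name:
    print(f"U+{ord(ch):04X} latin={is_latin_codepoint(ord(ch))}")

print(is_latin(base_name))  # True
```

Python iterates over code points rather than UTF-16 units, so under these assumed ranges every character of the base name, including the two trailing control characters, passes the check.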

Aside from fixing the is_latin check, probably the remove_latin code should use the same character range logic for identifying Latin, instead of the "unaccent -> a-Z" approach which misses characters like ộ.
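The difference between the two approaches can be sketched in Python. This is only an illustration under stated assumptions, not the import-sql code: `limited_unaccent` stands in for an accent-stripping table that does not know characters above U+024F, and the ranges in `is_latin_char` are assumed as well.

```python
import unicodedata


def limited_unaccent(text):
    """Strip accents, but only for characters up to U+024F (assumed table limit)."""
    out = []
    for ch in unicodedata.normalize("NFC", text):
        if ord(ch) <= 0x024F:
            decomposed = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
        else:
            out.append(ch)  # e.g. U+1ED9 is left untouched
    return "".join(out)


def naive_remove_latin(text):
    """Hypothetical 'unaccent, then drop A-Z/a-z' variant."""
    kept = [ch for ch in limited_unaccent(text) if not ("A" <= ch <= "Z" or "a" <= ch <= "z")]
    return " ".join("".join(kept).split())


def is_latin_char(ch):
    """Assumed Latin ranges, including Latin Extended Additional."""
    cp = ord(ch)
    return cp <= 0x024F or 0x1E00 <= cp <= 0x1EFF


def range_based_remove_latin(text):
    """Hypothetical variant that drops every character inside the assumed ranges."""
    kept = [ch for ch in unicodedata.normalize("NFC", text) if not is_latin_char(ch)]
    return "".join(kept)


print(naive_remove_latin("Hà Nội"))                 # ộ
print(naive_remove_latin("Thành phố Hồ Chí Minh"))  # ố ồ
print(repr(range_based_remove_latin("Hà Nội")))     # ''
```

Under these assumptions the naive variant leaves exactly the vowels reported in the issue, while the range-based variant leaves nothing behind for a purely Latin-script name.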

@jirik
Collaborator

jirik commented Nov 11, 2018

I am testing the approach introduced by openmaptiles/import-sql@503f0b7

There is a consequence: some new diacritics will be encoded into name:latin, e.g. the special "O" letters in the name Hanoi. As these letters render well with common fonts, I consider this to be correct behavior.

Preview with name:latin:

image

Similar case, Lankaran city:
https://openmaptiles.github.io/osm-bright-gl-style/#6.19/39.844/49.659
image

@jirik
Collaborator

jirik commented Nov 12, 2018

Related import-sql PR: openmaptiles/import-sql#7
