Remove out-of-script exemplars #50

Merged
merged 6 commits into from
Nov 14, 2022

simoncozens
Contributor

This PR fixes a number of issues with the exemplar glyphs. It fixes #24.

  • The exemplars for ku_Cyrl and tk_Arab were in the Latin script, not Cyrillic or Arabic. These were removed.
  • The exemplars for shi_Arab were actually in the Tifinagh script. These were moved to shi_Tfng.
  • bugi_Latin schwa characters were actually Cyrillic, not Latin; these were replaced with Latin schwa characters.
  • The exemplars for ccp_Beng were in Chakma, not Bengali. These were removed.

This also includes a test to ensure that each language's exemplar glyphs belong to the script declared for that language.
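
For illustration only, a check of this kind could be sketched as below. This is not the test added in this PR: the function name `out_of_script_chars` and the `SCRIPT_NAME_PREFIXES` table are hypothetical, and the script of a character is approximated from its Unicode character name using only the standard library, which is a heuristic rather than the real Unicode Script property.

```python
import unicodedata

# Hypothetical table: ISO 15924 script code -> prefix used in Unicode
# character names. Only the scripts mentioned in this PR are listed.
SCRIPT_NAME_PREFIXES = {
    "Latn": "LATIN",
    "Cyrl": "CYRILLIC",
    "Arab": "ARABIC",
    "Tfng": "TIFINAGH",
    "Beng": "BENGALI",
}


def out_of_script_chars(exemplars, script):
    """Return the letters in `exemplars` whose Unicode name does not
    start with the name prefix expected for `script`.

    Assumption: the character-name prefix stands in for the Script
    property. Non-letters (marks, digits, punctuation, spaces) are skipped.
    """
    prefix = SCRIPT_NAME_PREFIXES[script] + " "
    offenders = []
    for ch in exemplars:
        if not ch.isalpha():
            continue  # only letters are checked in this sketch
        if not unicodedata.name(ch, "").startswith(prefix):
            offenders.append(ch)
    return offenders


# U+04D9 is CYRILLIC SMALL LETTER SCHWA; U+0259 is LATIN SMALL LETTER SCHWA.
assert out_of_script_chars("\u04d9", "Latn") == ["\u04d9"]
assert out_of_script_chars("\u0259", "Latn") == []
```

A real test would iterate over every language definition and fail when the returned list is non-empty; a library exposing the actual Script property would be more reliable than name prefixes.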

@moyogo
Contributor

moyogo commented Nov 14, 2022

@simoncozens Should languages have a repeated field auxiliary_script?

Otherwise LGTM.

@simoncozens
Contributor Author

Possibly. There is an interesting case in some Cyrillic-script languages, which use glyphs from the Latin range alongside Cyrillic glyphs. But I'm not sure that changing the textproto schema is within our power.

@simoncozens simoncozens merged commit 96f07b8 into main Nov 14, 2022
@moyogo moyogo deleted the remove-out-of-script-exmplars branch April 29, 2023 13:24
Development

Successfully merging this pull request may close these issues.

tk_Arab and ku_Cyrl are actually Latin