Font for "Makasar" is missing. #1

Closed
SalviaSage opened this issue Dec 8, 2019 · 10 comments

@SalviaSage

I want to bring to your attention that the font for "Makasar"
(Unicode block U+11EE0..U+11EFF) is missing.

https://en.wikipedia.org/wiki/Makasar_(Unicode_block)

Please upload this font here if possible.

Thank you.
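
For reference, the assigned code points in the Makasar block can be listed with Python's standard `unicodedata` module. This is only a minimal sketch: it assumes a Python build whose Unicode database is version 11.0 or later (Python 3.7+), and the helper name `makasar_characters` is illustrative. A font covering the script would need a glyph for each code point it returns.

```python
import unicodedata

# Makasar block boundaries as given in this issue (U+11EE0..U+11EFF).
MAKASAR_START = 0x11EE0
MAKASAR_END = 0x11EFF


def makasar_characters():
    """Return (code point, name) pairs for assigned characters in the block.

    Unassigned code points have no name, so unicodedata.name() returns the
    empty-string default and they are skipped.
    """
    assigned = []
    for cp in range(MAKASAR_START, MAKASAR_END + 1):
        name = unicodedata.name(chr(cp), "")
        if name:
            assigned.append((cp, name))
    return assigned


if __name__ == "__main__":
    for cp, name in makasar_characters():
        print(f"U+{cp:04X}  {name}")
```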

@cambagulung

cambagulung commented Jul 20, 2020

I am Makassarese, and I would be very happy to help.

@r12a

r12a commented Jul 21, 2020

fwiw, might be worth asking Anshuman Pandey, who has a graphite font at https://github.com/pandey/graphite-fonts in case it may help by providing a starting point for development of a noto font, if he's willing.

@verdy-p

verdy-p commented Oct 30, 2020

Note that this is not the only script encoded in Unicode that still has no Noto font, or indeed any other free/open font. At best there may be some proprietary fonts, or non-Unicode fonts hacked over ASCII or another script, most often with very partial coverage (such as no support for the necessary diacritics, ligatures and contextual forms).

As of 4 February 2022, most scripts encoded up to Unicode 14.0 (the latest version, published on 14 September 2021) are supported by some Noto font. The exception is one complex script from Unicode 13.0, which has a traditional vertical layout with variable heights for long clusters of up to 8 base characters, and so does not normally fit in a single square composition area like Han sinograms or Hangul clusters. The following encoded script still has no Noto font (and no other font available under an open licence). See also the ISO 15924 Notice of Changes (last updated on 3 December 2021) on the ISO 15924/RA site hosted by Unicode.

| ISO 15924 code | English script name | Nom d'écriture en français | Unicode alias | Unicode age | ISO 15924 since | Noto fonts |
|---|---|---|---|---|---|---|
| Kits 288 | Khitan small script | petite écriture khitan | Khitan_Small_Script | 13.0 | 2015-07-15 | |

Note that some fonts for Khitan Small Script exist at https://babelstone.co.uk/Fonts/KhitanSmall.html
However, they do not specify any licence for now, so they can only be used for experimental personal purposes and are legally available only from the author's site. This may be on purpose: they are probably considered experimental, and the author may not want them used to create incorrectly encoded texts while the requirements for this script are not all implemented. The same is true for his experimental font for Khitan Large Script (only Khitan Seal is released), and for Naxi Dongba, Shuishu, and some variants/extensions of Tangut (currently still mapped to the PUA, as there is no Unicode assignment for now).

That author has, however, released many other fonts under an open APL or OFL licence for scripts that are fully supported in Unicode (see https://babelstone.co.uk/Fonts/index.html). These fonts do not aim to harmonize with other Noto fonts: many reproduce the original designs of the scripts with their own metrics, and some others feature specific decorative or innovative styles. They are therefore probably suitable only for monolingual documents, not for arbitrary texts or a common application UI. In particular, I don't know whether the proposed "Horizontal" variant for Khitan Small Script is suitable for such uses, i.e. whether it still represents texts in a readable fallback form without losing too much information. (Such a fallback is possible, for example, with legacy Hangul jamos on low-resolution consoles with narrow monospaced fonts: instead of normal syllabic clusters arranged in a square layout, the jamos are shown in sequence, on the assumption that Korean readers will still recognize the syllable boundaries, or will add extra punctuation such as apostrophes to mark them explicitly and unambiguously.)


However, the following scripts, encoded since Unicode 11.0 (as of the opening of this issue in December 2019), are now supported by some Noto fonts. These may still need additional development or fixes, so they are in beta, and some are not yet hinted or not available in other styles (Serif/Sans/UI, Bold, Italic).

Yezidi, the only living script added in Unicode 13.0, is well documented and its font is in development. Its usage is shifting, however: the community is spread across different countries where its members are now either endangered minorities or refugees, forced to use other languages and scripts, with poor or no support for education in their culture. Many are shifting to the Arabic script, as education is mostly provided by Islamic schools and mosques. What was done to enable Hanifi Rohingya or Dives Akuru should be done for Yezidi, for the same reason of preserving endangered cultures (IMHO it is more important and more urgent to support them than to support Elymaic or Cypro-Minoan, which are defunct scripts). Vithkuqi is the only script added in Unicode 14.0 that is currently in active development.

In February 2022, three additional scripts, encoded in Unicode 11.0, 13.0 and 14.0 respectively, had a first version of a Noto font released (with work still pending to fix some issues):

  • Makasar, a historic script almost completely replaced by Bugis/Lontara (along with Latin), but still important for historical and cultural reasons; it is found in a few places and old books and could easily be revived; it was never printed, only handwritten;
  • Dives Akuru, the simplest script to support; although it fell out of use, this happened more recently than for the two others, and it has source books of cultural importance for the southern Maldives and for the history of today's Maldivian language; and
  • Tangsa, which is used to write the modern language of the same name, spoken in India and Myanmar.

| ISO 15924 code | English script name | Nom d'écriture en français | Unicode alias | Unicode age | ISO 15924 since | Noto fonts |
|---|---|---|---|---|---|---|
| Dogr 328 | Dogra | dogra | Dogra | 11.0 | 2016-12-05 | Noto Serif Dogra |
| Gong 312 | Gunjala Gondi | gunjala gondî | Gunjala_Gondi | 11.0 | 2016-12-05 | Noto Sans Gunjala Gondi |
| Maka 366 | Makasar | makassar | Makasar | 11.0 | 2016-12-05 | Noto Serif Makasar |
| Medf 265 | Medefaidrin (Oberi Okaime, Oberi Ɔkaimɛ) | médéfaïdrine | Medefaidrin | 11.0 | 2016-12-05 | Noto Sans Medefaidrin |
| Rohg 167 | Hanifi Rohingya | hanifi rohingya | Hanifi_Rohingya | 11.0 | 2017-11-21 | Noto Sans Hanifi Rohingya |
| Sogd 141 | Sogdian | sogdien | Sogdian | 11.0 | 2017-11-21 | Noto Sans Sogdian |
| Sogo 142 | Old Sogdian | ancien sogdien | Old_Sogdian | 11.0 | 2017-11-21 | Noto Sans Old Sogdian |
| Hmnp 451 | Nyiakeng Puachue Hmong | nyiakeng puachue hmong | Nyiakeng_Puachue_Hmong | 12.0 | 2017-07-26 | Noto Serif Hmong Nyiakeng |
| Wcho 283 | Wancho | wantcho | Wancho | 12.0 | 2017-07-26 | Noto Sans Wancho |
| Elym 128 | Elymaic | élymaïque | Elymaic | 12.0 | 2018-08-26 | Noto Sans Elymaic |
| Nand 311 | Nandinagari | nandinâgarî | Nandinagari | 12.0 | 2018-08-26 | Noto Sans Nandinagari, Noto Serif Nandinagari |
| Chrs 109 | Chorasmian | chorasmien | Chorasmian | 13.0 | 2019-08-19 | Noto Sans Chorasmian |
| Diak 342 | Dives Akuru | dives akuru | Dives_Akuru | 13.0 | 2019-08-19 | Noto Serif Dives Akuru |
| Yezi 192 | Yezidi | yézidi | Yezidi | 13.0 | 2019-08-19 | Noto Serif Yezidi |
| Cpmn 402 | Cypro-Minoan | syllabaire chypro-minoen | Cypro_Minoan | 14.0 | 2017-07-26 | Noto Sans Cypro Minoan |
| Toto 294 | Toto | toto | Toto | 14.0 | 2020-04-16 | Noto Serif Toto |
| Ougr 143 | Old Uyghur | ancien ouïgour | Old_Uyghur | 14.0 | 2021-01-25 | Noto Serif Old Uyghur |
| Tnsa 275 | Tangsa | tangsa | Tangsa | 14.0 | 2021-02-17 | Noto Sans Tangsa |
| Vith 228 | Vithkuqi | vithkuqi | Vithkuqi | 14.0 | 2021-02-17 | Noto Sans Vithkuqi, Noto Serif Vithkuqi |

The following scripts are also encoded in ISO 15924 but not yet in Unicode, so for now it is impossible to define Noto fonts for them, except experimentally with PUA mappings (possibly useful for creating Unicode encoding proposals and working documents in PDF form, or test websites).

Some of these scripts are planned to be encoded very soon, maybe in Unicode 15.0. That is much earlier than expected, since their addition to ISO 15924 is also very recent; it was made at the same time as the announcement that these scripts were planned for faster encoding and support (ahead of other scripts that have been waiting for many years). This is because they have wide and well-known modern usage, but were forgotten for too long in the encoding process (possibly there were political issues blocking them in ISO 15924). See "Proposed New Characters: The Pipeline" on the Unicode site:

| ISO 15924 code | English script name | Nom d'écriture en français | Unicode alias | Unicode age | ISO 15924 since | Noto fonts |
|---|---|---|---|---|---|---|
| Kawi 368 | Kawi | kawi | | | 2021-12-03 | |
| Nagm 295 | Nag Mundari | nag mundari | | | 2021-12-03 | |

Developing beta versions of open fonts for the scripts above (still with PUA mappings until these scripts are encoded) is possible if one wants to accelerate the first release of a suitable Noto font. The proposals are already solid enough to support a significant part of these scripts and to create test documents that will help finalize the Unicode encoding and review its pre-release version before it is fixed forever. Such documents won't be easily interchangeable, except by using webfonts for HTML documents, by embedding the fonts in the documents, or by providing rendering snapshots that will be useful for these final discussions.

Other scripts, most of them in the Unicode roadmap for possible later encoding in the SMP and two others in the roadmap for possible later encoding in the SIP, include:

| ISO 15924 code | English script name | Nom d'écriture en français | Unicode alias | Unicode age | ISO 15924 since | Noto fonts |
|---|---|---|---|---|---|---|
| Afak 439 | Afaka | afaka | | | 2010-12-21 | |
| Blis 550 | Blissymbols | symboles Bliss | | | 2004-05-01 | |
| Cirt 291 | Cirth | cirth | | | 2004-05-01 | |
| Egyd 070 | Egyptian demotic | démotique égyptien | | | 2004-05-01 | |
| Inds 610 | Indus (Harappan) | indus | | | 2004-05-01 | |
| Jurc 510 | Jurchen | jurchen | | | 2010-12-21 | |
| Kitl 505 | Khitan large script | grande écriture khitan | | | 2015-07-15 | |
| Kpel 436 | Kpelle | kpèllé | | | 2010-03-26 | |
| Leke 364 | Leke | léké | | | 2015-07-07 | |
| Loma 437 | Loma | loma | | | 2010-03-26 | |
| Maya 090 | Mayan hieroglyphs | hiéroglyphes mayas | | | 2004-05-01 | |
| Moon 218 | Moon (Moon code, Moon script, Moon type) | écriture Moon | | | 2006-12-11 | |
| Nkdb 085 | Naxi Dongba (na²¹ɕi³³ to³³ba²¹, Nakhi Tomba) | naxi dongba | | | 2017-07-26 | |
| Nkgb 420 | Naxi Geba (na²¹ɕi³³ gʌ²¹ba²¹, 'Na-'Khi ²Ggŏ-¹baw, Nakhi Geba) | naxi geba, nakhi geba | | | 2017-07-26 | |
| Pcun 015 | Proto-Cuneiform | proto-cunéiforme | | | 2021-01-25 | |
| Pelm 016 | Proto-Elamite | proto-élamite | | | 2021-01-25 | |
| Phlv 133 | Book Pahlavi | pehlevi des livres | | | 2007-07-15 | |
| Psin 103 | Proto-Sinaitic | proto-sinaïtique | | | 2021-01-25 | |
| Ranj 303 | Ranjana | ranjana | | | 2021-01-25 | |
| Shui 530 | Shuishu | shuishu | | | 2017-07-26 | |
| Sunu 274 | Sunuwar | sunuwar | | | 2021-12-03 | |
| Teng 290 | Tengwar | tengwar | | | 2004-05-01 | |
| Visp 280 | Visible Speech | parole visible | | | 2004-05-01 | |
| Wole 480 | Woleai | woléaï | | | 2010-12-21 | |

The following scripts are also roadmapped for possible later encoding in Unicode, but still have no ISO 15924 script code assigned:

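| ISO 15924 code | English script name | Nom d'écriture en français | Unicode alias | Unicode age | ISO 15924 since | Noto fonts |
|---|---|---|---|---|---|---|
| ? ? | (Bagam) | ? | ? | | | |
| ? ? | (Balti-B) | ? | ? | | | |
| ? ? | (Brusha) | ? | ? | | | |
| ? ? | (Eebee Hmong) | ? | ? | | | |
| ? ? | (Eskayan) | ? | ? | | | |
| ? ? | (Kerinci) | ? | ? | | | |
| ? ? | (Khema (Gurung)) | ? | ? | | | |
| ? ? | (Khitan ideographs) | ? | ? | | | |
| ? ? | (Khotanese) | ? | ? | | | |
| ? ? | (Lampung) | ? | ? | | | |
| ? ? | (Landa) | ? | ? | | | |
| ? ? | (Mandombe) | ? | ? | | | |
| ? ? | (Oracle Bone Script) | ? | ? | | | |
| ? ? | (Pallava) | ? | ? | | | |
| ? ? | (Pau Cin Hau Syllabary) | ? | ? | | | |
| ? ? | (Pitman Shorthand) | ? | ? | | | |
| ? ? | (Pungchen) | ? | ? | | | |
| ? ? | (Pyu) | ? | ? | | | |
| ? ? | (Sirmauri) | ? | ? | | | |
| ? ? | (Small Seal Script) | ? | ? | | | |
| ? ? | (Tani) | ? | ? | | | |
| ? ? | (Tikamuli) | ? | ? | | | |
| ? ? | (Kirat Rai) | ? | ? | | | |
| ? ? | (Tocharian) | ? | ? | | | |
| ? ? | (Tolong Siki (Kurukh)) | ? | ? | | | |
| ? ? | (Tulu-Tigalari) | ? | ? | | | |
| ? ? | (Vatteluttu) | ? | ? | | | |
| ? ? | (Zou) | ? | ? | | | |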

Other proposals include ranges of characters for musical notations, mapped as part of the ISO 15924 "Common" script even though they are mostly used in contexts where (traditional) Han ideographs are used. These include the Unicode encoding proposals for Chinese Flute Musical Notation, Chinese Lute (or Pipa) Musical Notation, and Jianzi Musical Notation.

The following scripts are also tentatively roadmapped for later encoding in Unicode (and likewise have no ISO 15924 script code assigned), but lack a formal proposal with enough information to be solid:

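| ISO 15924 code | English script name | Nom d'écriture en français | Unicode alias | Unicode age | ISO 15924 since | Noto fonts |
|---|---|---|---|---|---|---|
| ? ? | ¿Beria? | ? | ? | | | |
| ? ? | ¿Chalukya (Box-Headed)? | ? | ? | | | |
| ? ? | ¿Chola? | ? | ? | | | |
| ? ? | ¿Khe Prih (Gurung)? | ? | ? | | | |
| ? ? | ¿Linear Elamite? | ? | ? | | | |
| ? ? | ¿Lontara bilang-bilang? | ? | ? | | | |
| ? ? | ¿Marchung? | ? | ? | | | |
| ? ? | ¿Pungchung? | ? | ? | | | |
| ? ? | ¿Rongorongo? | ? | ? | | | |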

Finally, other scripts that were added to ISO 15924 have either been rejected as unsuitable for encoding in Unicode (meaning that they can only be supported in the PUA, managed by private agreements such as ConScript), or did not provide enough information even for a pre-allocation in the Unicode roadmap:

| ISO 15924 code | English script name | Nom d'écriture en français | Unicode alias | Unicode age | ISO 15924 since | Noto fonts |
|---|---|---|---|---|---|---|
| Piqd 293 | Klingon (KLI pIqaD) | klingon (pIqaD du KLI) | | (rejected) | 2015-12-16 | |
| Sara 292 | Sarati | sarati | | (?) | 2004-05-29 | |

Among the scripts above, Egyptian Demotic is probably the most wanted; it has a very large body of documents. It is one of the scripts whose ISO 15924 code dates back to 2004 and which are still not encoded in Unicode (and there is still no pending encoding proposal coherent enough to also be accepted by ISO). This is strange, given the large number of Egyptologists around the world and the many documents archived in famous public libraries (the largest collections notably in France and the US, but with many also in Israel, the UK, and Russia).

For now it is partly supported by fonts that hack other encodings: sometimes ASCII, maybe Coptic with a few custom mappings. Egyptian Demotic also requires a lot of specific ligatures, and probably many "variant" composition sequences, but linguists are already using a productive subset without variants or rare ligatures. The problem is that it should be RTL by default, whereas ASCII and Coptic are LTR. Another possibility would be a mapping to the existing Unicode encoding of Meroitic Cursive.

Using the PUA for unencoded scripts will often not render correctly, as PUA characters are LTR by default and there is unfortunately no RTL subset of the PUA. See the responses [1] and [2] to "RTL PUA?" by Michael Everson on the Unicode mailing list, 2011-08-19.
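
The default directionality of PUA code points can be inspected with Python's standard `unicodedata` module. The following is a minimal sketch, not a rendering test: it only reads the Bidi_Class property from the Unicode Character Database bundled with Python, and the `SAMPLES` table and `bidi_class` helper are names chosen for illustration. Meroitic Cursive is included as an example of an encoded right-to-left script for comparison.

```python
import unicodedata

# Illustrative sample code points: two private-use characters and one
# character from an encoded right-to-left script.
SAMPLES = {
    "BMP private use (U+E000)": 0xE000,
    "Plane 15 private use (U+F0000)": 0xF0000,
    "Meroitic Cursive letter A (U+109A0)": 0x109A0,
}


def bidi_class(cp):
    """Return the Bidi_Class abbreviation ('L', 'R', 'AL', ...) for a code point."""
    return unicodedata.bidirectional(chr(cp))


if __name__ == "__main__":
    for label, cp in SAMPLES.items():
        print(f"{label}: {bidi_class(cp)}")
```

A font that places an RTL script such as Egyptian Demotic in the PUA therefore inherits the strong left-to-right class, and text using it would need explicit directional overrides or markup to display in the intended order.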

Two other living scripts, Shuishu and Naxi Dongba, are the only two pictographic scripts still in use today (in China), but they are highly endangered. (Given their very specific nature, how could they survive for so long without evolving into some ideographic form like Han, i.e. with mixed logographic/semantic and syllabic features and some simplifications?) These are not "complex" scripts in terms of composition (they are, however, moderately large), but we lack sources for the pictograms, which could otherwise simply be digitized and mapped directly. Unicode does not seem to be working on them, as most Unicode work in China is done for Han by the Ideographic Rapporteur Group, which is visibly not interested.

There are probably other interested groups in China that could easily prepare a list of pictograms, as long as these remain living scripts there, or that could help archive samples in Chinese libraries and elsewhere to preserve them. Maybe there are people now living outside China who have kept samples (books, or other pieces of art such as ceramics, jewelry, tools, clothes, photographs) that would be useful for their urgently needed encoding. These could be collected on community photo-sharing sites (with open licences) or as PDF facsimiles, and could help graphic designers redraw them (e.g. as SVG shapes) and then collect the glyphs into a beta font useful for making a script encoding proposal for Unicode/ISO/IEC 10646. See:
https://www.endangeredalphabets.net/alphabets/shuishu/
https://www.endangeredalphabets.net/alphabets/naxi/

Naxi Geba is also living and endangered, but it is a syllabary, not pictograms like Naxi Dongba. Its phonetic values vary by reader, which may result in differences of interpretation and delay a normative encoding. It is also used to write the Naxi language; it was created to complement or replace Naxi Dongba, which was too defective for modern use and for sufficient coverage of the language. It is still used by a minority, with no really active support from the local Chinese government or educational institutions in Yunnan; it is only kept in archives for cultural reasons.

Ranjana (or Lanydza or Lantsa) is an Indic script used in Nepal and Tibet, with a very rich history and a very large body of documents. It was endangered in Nepal by the forced adoption of Devanagari in the 1950s-1980s. But fonts are already available for it, it is displayed everywhere, and it is actively used and now supported again by the government in the Kathmandu Valley to write the Newa language, so it should be easy to encode (at least the core part, even if there are additions later). See:
https://www.endangeredalphabets.net/alphabets/ranjana/
https://www.facebook.com/ranjanascript/

@adtbayuperdana

I have made an experimental Makassar font which can be downloaded here: https://aksaradinusantara.com/fonta/font/Salapa%20Jangang?key=aa166f01886b8b5271fd984ac79a2f5a

it needs some refinement but may be useful as a starting point

@r12a

r12a commented Nov 12, 2020

@adtbayuperdana great to have this. Thank you. Looking at the licence i wasn't 100% sure whether i could make a webfont from this, and serve it with my Makasar character app. Could you confirm?

Also, i noticed that the digit glyphs applied to the ASCII range look like arabic digits. Arabic digit code points don't map to glyphs in the font. This may be a bug, given the following passage in the block description of the unicode standard:

Digits. The available Makasar manuscript sources show two distinct sets of digits. The first set strongly resembles European digits and can be represented with U+0030..U+0039. The second set strongly resembles Arabic-Indic digits, and can be represented with U+0660..U+0669. Therefore, script-specific digits for Makasar are not separately encoded.

Digits are frequently used, and both sets occur concurrently in the sources. The Arabic-Indic digits are restricted to Arabic-language environments—particularly for expressing dates of the Hijri era. The European digits are used for general purposes, but occur within Arabic-language contexts for writing non-Hijri dates, specifically those of the Gregorian calendar.

Fwiw, here's what i see (ascii code points on the left, arabic on the right):
Screenshot 2020-11-12 at 14 21 19
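
The passage quoted above implies that a Makasar font should map both U+0030..U+0039 and U+0660..U+0669, each to its own digit shapes. A minimal sketch of a classifier for the two sets follows; the names `digit_set` and `digit_value` and the return labels are hypothetical, and the code says nothing about which glyphs a given font actually contains.

```python
import unicodedata

EUROPEAN_DIGITS = range(0x0030, 0x003A)      # U+0030..U+0039
ARABIC_INDIC_DIGITS = range(0x0660, 0x066A)  # U+0660..U+0669


def digit_set(ch):
    """Label a single character as 'european', 'arabic-indic', or None."""
    cp = ord(ch)
    if cp in EUROPEAN_DIGITS:
        return "european"
    if cp in ARABIC_INDIC_DIGITS:
        return "arabic-indic"
    return None


def digit_value(ch):
    """Return the numeric value 0-9 for a digit from either set, else None."""
    if digit_set(ch) is None:
        return None
    return unicodedata.digit(ch, None)
```

Both sets carry the same numeric values, so text processing treats them alike; only the glyph shapes differ, which is why the font has to cover each range separately.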

@adtbayuperdana

adtbayuperdana commented Nov 12, 2020

@r12a the regularization of Salapa is kinda experimental since (to my knowledge) a printed version of the script has never been developed, so i have nothing i can reference to. At times I still tweak a lot of its aspect to look somewhat more authentic. For styles with clearer reference, I have tried to make three other fonts. They are not released anywhere yet, but here's a sample of them:

tes jangang2.pdf

perhaps they are better if you want to make a webfont that is more representative of Makassar?

right, the digits should be corrected! although, the eastern arabic digits i saw in makassar/buginese manuscripts are not very far off from current iterations of those digits. However, western arabic digits in those manuscripts look noticeably different from "normal" ones. Whether a makassar font should just follow "normal" shapes or have makassar/buginese shapes for the western arabic digits is a thing that i am unsure of.

@sridatta1

sridatta1 commented Jun 24, 2022

With Noto Serif Makasar font being available, can this issue be closed?
I can't find the font in fonts.google.com though. Maybe the latest fonts are not yet up on the Google fonts site.
https://github.com/notofonts/noto-fonts/tree/main/hinted/ttf/NotoSerifMakasar

@simoncozens
Contributor

I'm keeping it open for the moment because for some reason Makasar was missed out when I moved everything to the new build system. Once it's moved across we can get it on Google Fonts and close this issue.

@simoncozens simoncozens transferred this issue from notofonts/noto-fonts Jun 26, 2022
@verdy-p

verdy-p commented Dec 22, 2022

There's now the Nag Mundari script [Nagm/295] without a font. It has been encoded since Unicode 15.0 (September 2022). See
http://blog.unicode.org/2022/09/announcing-unicode-standard-version-150.html

@simoncozens
Contributor

We’re working on a number of new scripts for 2023, Nag Mundari being one of them.


8 participants