Dotted circle between HALANT and CONS_FINAL_MOD #478

Closed
dscorbett opened this issue Apr 27, 2017 · 6 comments

@dscorbett
Collaborator

When a CONS_FINAL_MOD follows a HALANT, HarfBuzz inserts a dotted circle between them; for example, ⟨த்³⟩. This is in accord with the USE specification, but the specification is wrong, at least for the superscript and subscript digits used in Tamil.
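For readers who want to reproduce the input, here is a minimal sketch that spells out the code points of the example string ⟨த்³⟩. It uses only Python's standard `unicodedata` module; the helper name `describe` is made up for illustration and is not part of HarfBuzz.

```python
import unicodedata

# The example from the report: Tamil TA, Tamil virama, superscript three.
sample = "\u0BA4\u0BCD\u00B3"

def describe(text):
    """Return (code point, Unicode name) pairs for each character (hypothetical helper)."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for code_point, name in describe(sample):
    print(code_point, name)
```

The middle character is the virama (the HALANT in the title) and the last one is the superscript digit that the shaper treats as CONS_FINAL_MOD according to the report.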

@behdad
Member

behdad commented Apr 27, 2017

Can you reference the Unicode text on this?

@dscorbett
Collaborator Author

What little Unicode says about this is in chapter 12:

The Tamil script has fewer consonants than the other Indic scripts. When representing the “missing” consonants in transcriptions of languages such as Sanskrit or Saurashtra, superscript European digits are often used, so ப² = pha, ப³ = ba, and ப⁴ = bha. The characters U+00B2, U+00B3, and U+2074 can be used to preserve this distinction in plain text.

It does not describe how digits interact with virama or vowel signs, but such combinations are attested.
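As a small illustration of the quoted passage, the three transcriptions can be built from TAMIL LETTER PA plus the superscript digits it names. This is only a sketch: the dictionary `TRANSCRIPTIONS` and the helper `digit_name` are hypothetical names chosen for the example.

```python
import unicodedata

PA = "\u0BAA"  # TAMIL LETTER PA

# Transcriptions listed in the Unicode text: base letter plus superscript digit.
TRANSCRIPTIONS = {
    "pha": PA + "\u00B2",  # superscript two
    "ba": PA + "\u00B3",   # superscript three
    "bha": PA + "\u2074",  # superscript four
}

def digit_name(text):
    """Return the Unicode name of the last character (hypothetical helper)."""
    return unicodedata.name(text[-1])

for reading, text in TRANSCRIPTIONS.items():
    print(reading, text, digit_name(text))
```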

@behdad
Member

behdad commented Apr 27, 2017

Also note that Tamil does NOT go through USE shaper.

@behdad
Member

behdad commented Apr 28, 2017

Here's Andrew Glass's comment re this issue (via private mail):

For the latest issue, the bug report is correct (aside from the fact that Tamil doesn't get handled by USE in either Harfbuzz or Uniscribe). That is to say, items having ISC = Syllable_modifier should be allowed to combine with Virama, since a virama terminated cluster is still a viable orthographic syllable (whether or not it is actually pronounceable). So the change to make here is to modify the virama terminated cluster definition to allow FM after H:

[< R | CS >] < B | GB > [VS] (CMAbv)* (CMBlw)* (< H B | SUB > [VS] (CMAbv)* (CMBlw)*)* H [FM]

That makes me wonder if we should allow multiple FM. I've not heard of the need, but in the abstract this could occur at least with the Tibetan syllable modifiers. So if I were going to update the spec and the code, I would add that flexibility to both the standard and virama-terminated clusters . . . (FM)*
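To make the effect of the proposed change concrete, here is a rough sketch that models the virama-terminated cluster expression quoted above as a regular expression over class names. It is not HarfBuzz's Ragel grammar or the USE reference implementation. It assumes that the earlier definition was the same expression without the trailing `[FM]`, and all Python names (`tok`, `HEAD`, `is_single_cluster`, and so on) are invented for the example.

```python
import re

def tok(name):
    """Regex fragment matching one class token followed by a space."""
    return rf"(?:{name} )"

# [VS] (CMAbv)* (CMBlw)*
TAIL = rf"{tok('VS')}?{tok('CMAbv')}*{tok('CMBlw')}*"

# [< R | CS >] < B | GB > TAIL (< H B | SUB > TAIL)* H
HEAD = (
    rf"(?:{tok('R')}|{tok('CS')})?"
    rf"(?:{tok('B')}|{tok('GB')}){TAIL}"
    rf"(?:(?:{tok('H')}{tok('B')}|{tok('SUB')}){TAIL})*"
    rf"{tok('H')}"
)

WITHOUT_FM = re.compile(HEAD)                   # assumed earlier definition
OPTIONAL_FM = re.compile(HEAD + tok("FM") + "?")  # ... H [FM]
REPEATED_FM = re.compile(HEAD + tok("FM") + "*")  # ... H (FM)*

def is_single_cluster(pattern, classes):
    """True if the whole class sequence forms one virama-terminated cluster."""
    text = "".join(name + " " for name in classes)
    return pattern.fullmatch(text) is not None

print(is_single_cluster(WITHOUT_FM, ["B", "H", "FM"]))
print(is_single_cluster(OPTIONAL_FM, ["B", "H", "FM"]))
print(is_single_cluster(REPEATED_FM, ["B", "H", "FM", "FM"]))
```

Under these assumptions, the sequence base, halant, final modifier fails to match as one cluster without the trailing `FM`, which is the situation where a dotted circle gets inserted. It matches once `[FM]` is allowed, and repeated modifiers match only with `(FM)*`.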

@dscorbett
Collaborator Author

One reason to allow multiple FMs is that U+1A7F TAI THAM COMBINING CRYPTOGRAMMIC DOT may appear many times on one base.

@Richard57

Except that U+1A7F is actually a Nukta - to which it might be corrected this week.

<RA HAAM, MAI SAM> ought to be another example - but (a) I can't find any instances and (b) their categories may also be corrected this week (UTC #151).

behdad closed this as completed in 3cc84f4 on Jul 14, 2017
tallytalwar added a commit to tangrams/harfbuzz-icu-freetype that referenced this issue Aug 18, 2017
7917792f 1.4.8
5dc30451 Two fixes to avar mapping
dc2c418e [check-defs/symbols.sh] Drop empty-symbol lines
6f38845d [hb-shape] Rename --show-messages to --trace
e6035055 [hb-shape] Improve shaping-debug output
65f64d14 Unbreak arm-none-eabi build again (#514)
fc15e60e 1.4.7
c1432bce [arabic] Adjust feature order again
9813be3d [coretext] Allow to disable kern (#508)
9dd29c68 [use] Allow up to two medial-below letters
216b003c [use] Fix shaping of U+AA29 CHAM VOWEL SIGN AA
f1cd7ca8 [indic] Add github URL
3cc84f45 [indic] Fix harfbuzz/harfbuzz#478
e359a4b8 [indic] Disable automatic ZWNJ handling for Indic features
cdf1fd06 [indic] Add infrastructure to disable ZWNJ-skipping in context-matching
3a73e0d5 Shaping tests for Tibetan vowels (#446)
4e21ec54 Fix for reordering of Tibetan vowel u (#443)
ad52e044 Win32/NMake builds: Support builds from GIT (#498)
3b0e47ca Fix arm-none-eabi build (fixes #451) (#496)
76c4873e Support branch prediction helpers on clang compiles (#491)
7dba3063 Handle allocation failure in hb-language code
92e2c4ba Avoid using strdup inside library. (#488)
06cfe3f7 Do not skip TAG characters in glyph substitution (#487)
18172216 Minor
15273698 [cmake] Add framework build support (#484)
bf50ddaf [cmake] minor (#482)
141b33de 1.4.6
74b99ef2 Fix graphite2 rtl conversion (#475)
69664131 [cmake] Final touches (#473)
aacca375 Fix clang -Wcomma warnings (#471) (#472)
4d7c5206 [cmake] Remove HB_DISABLE_DEPRECATED as it seems needed for pango build (#470)
5ecf96e3 Use absolute paths of ragel generated headers (#467)
c42869eb Small doc fix: `make check` runs the tests (#469)
75931427 [cmake] Fix try compile link issues (#466)
cb021e14 [cmake] typo (#465)
a41d5ea4 [cmake] Add atomic ops availability detection (#464)
85685882 [cmake] Remove NO_MT flag (#462)
c04c1fe8 Blacklist GDEF table in additional Tahoma versions. (#459)
adfd4ae1 [cmake] Improve third party libraries support (#461)
3a8bc572 [cmake] Add utils build support (#460)
bc1244e2 NMake Makefiles: Fix ICU builds
a4471d0c Move list of ragel sources to Makefile.sources as well
d2acaf6d Split ragel generated files lists and remove hardcoded rl files lists (#453)
7d64c0ef Add CMake build support (#444)
740fdbcd avoid UBSan warning in get_stage_lookups (#450)
8d256841 Current fonttools (3.9.1) generate subset-file called font.subset.ttf instead of older font.ttf.subset
c2a9de15 Updated samples: record-it.sh is now record-test.sh
f2e6c7ce [tools] Make hb-unicode-code work with Python 3
edcf6344 Blacklist more versions of Padauk
e693ba77 [ci] Fix msys2 build on AppVeyor
91570a1e Just always use strtod here
539571c1 src/hb-common.cc: Fix build on older Visual Studio
b90fb83e Visual Studio builds: Fix Introspection when UCDN enabled
f0aa167e Update Visual Studio builds for UCDN usage
60e2586f 1.4.5
47e7a180 Revert "Fix Context lookup application when moving back after a glyph delete"
3c080a7a Fix buffer serialize of empty buffer
8e42c3cb 1.4.4
9ac9af72 Add TODO item
5aec2fb8 Remove TODO item that is not going to happen
b9b005f3 Fix Context lookup application when moving back after a glyph delete
a1150144 Add few tests found by libFuzzer and oss-fuzz
85630996 Fix buffer-overrun with Bengali reph positioning code
6685d281 1.4.3
a657f23c Blacklist another instance of Padauk (#419)
70202983 [ci] Disable vcpkg freetype installation and fix Appveyor CI (#422)
44f7d6ec Guard against underflow when adjusting length (#421)
45766b67 [indic] Add support for Grantha marks that may be used in Tamil to th… (#401)
d4bb52b9 Unbreak hb-coretext build
c8dfed8e Merge pull request #357 from khaledhosny/graphite-scale
7c47474f Set LC_ALL instead of LANG when creating harfbuzz.def
ffde3c9f hb-font: Fix a potentially undefined use of memcmp() (#413)
09594df1 Update ax_pthread.m4 to latest upstream version
a6ced90e test: Fix some memory leaks in test-font.c (#409)
925ceacf util: Add missing field initialisers in constructor (#410)
73c6dcbb Silence Coverity warning
466b3e58 Shuffle things around a bit
fc8189b6 Minor
d3d36918 Add dirty-state tracking to hb_face_t
2171f48b Add dirty-state tracking to hb_font_t
95808bad Add new API hb_font_set_face()
4ec19319 Add Win10 Anniversary Update version of Tahoma to GDEF blacklist. (#412)
1dd630a7 Minor
e888f642 Route Adlam through Arabic shaper
72c75487 Add Win7 version of himalaya.ttf to the GDEF table blacklist. (#407)
22af28a3 [var] Implement MVAR table
67a19116 [var] Whitespace
b435c7c4 [graphite] Stop creating unused gr_face
1b00a3b0 [graphite] Fix shaping with varying font sizes

git-subtree-dir: harfbuzz
git-subtree-split: e195981ad844986a475356bde9e937d422fb4209

4 participants