Consolidation of Mapping Change Suggestions #37

Open
kenlunde opened this issue Apr 11, 2017 · 16 comments

@kenlunde

kenlunde commented Apr 11, 2017

This issue is meant for tracking and submitting suggestions for mapping changes, meaning that a character might be better mapped to a different but existing glyph. Note that mapping changes, especially for ideographs, will trigger changes to GSUB features, such as the language-specific lookups of the 'locl' GSUB feature. Because tools are used to build the language-specific lookups of the 'locl' GSUB feature by using the CMap resources, such suggestions cannot be accepted as pull requests, and should instead be posted here. Issues that were submitted before this consolidation issue was opened are referenced by issue number.
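To illustrate why a mapping change ripples into the 'locl' lookups, here is a minimal sketch, assuming the CMap resources can be reduced to plain code-point-to-glyph-name tables. The tables, the glyph name `uni5225-TW`, and the function `derive_locl_pairs` are hypothetical stand-ins, not the actual build tools or data.

```python
# Hypothetical, heavily simplified stand-ins for two CMap resources:
# each maps a code point to a glyph name.
JP_CMAP = {0x5225: "uni5225-JP", 0x732A: "uni732A-JP"}
TW_CMAP = {0x5225: "uni5225-TW", 0x732A: "uni732A-JP"}  # "uni5225-TW" is illustrative


def derive_locl_pairs(default_cmap, locale_cmap):
    """Return {default_glyph: locale_glyph} for code points whose glyphs differ."""
    pairs = {}
    for code_point, default_glyph in sorted(default_cmap.items()):
        locale_glyph = locale_cmap.get(code_point)
        if locale_glyph is not None and locale_glyph != default_glyph:
            pairs[default_glyph] = locale_glyph
    return pairs


# Before the change, the JP font needs a substitution for the TW locale.
before = derive_locl_pairs(JP_CMAP, TW_CMAP)

# Remapping U+5225 to the JP glyph in the TW table removes that substitution.
TW_CMAP[0x5225] = "uni5225-JP"
after = derive_locl_pairs(JP_CMAP, TW_CMAP)

print(before)
print(after)
```

Because the substitution pairs are derived rather than hand-edited, changing one table entry changes the generated lookup, which is why a pull request that touches only one file would not be sufficient.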

The following changes were made in Version 1.001:

Post Version 1.001 Mapping Changes:

  • Map U+732A 猪 to uni732A-JP in the CN CMap resource per Issue #38 (Consolidation of Glyph Sharing Suggestions).
  • Map U+5009 倉 to uni5009-JP in the TW CMap resource per Issue #53 (Mapping Difference between Source Han Sans and Source Han Serif).
  • Map U+2EDE ⻞ to u2967F-CN in the CN CMap resource per Issue #55 (Design difference between Source Han Sans SC and Source Han Serif SC).
  • Map U+7300 猀 to uni7300-CN in the JP and KR CMap resources per Issue #59 (Design difference between Source Han Sans K and Source Han Serif K).
  • Map U+526A 剪, U+5881 墁, U+688F 梏, U+6ADD 櫝, U+6C4B 汋, U+7006 瀆, U+70B7 炷, U+7258 牘, U+72A2 犢, U+72B3 犳, and U+7431 琱 to uni526A-CN, uni5881-CN, uni688F-CN, uni6ADD-CN, uni6C4B-CN, uni7006-CN, uni70B7-CN, uni7258-CN, uni72A2-CN, uni72B3-CN, and uni7431-CN, respectively, in the KR CMap resource per Issue #59 (Design difference between Source Han Sans K and Source Han Serif K).
  • Map U+501C 倜, U+5192 冒, U+52C7 勇, U+553E 唾, U+5DFD 巽, U+641C 搜, U+73F9 珹, U+7A20 稠, U+7C3F 簿, U+8983 覃, and U+8D16 贖 to uni501C-CN, uni5192-CN, uni52C7-CN, uni553E-CN, uni5DFD-CN, uni641C-CN, uni73F9-CN, uni7A20-CN, uni7C3F-CN, uni8983-CN, and uni8D16-CN, respectively, in the KR CMap resource per Issue #60 (Design difference between Source Han Sans K and Source Han Serif K 2).
  • Map U+4E7C 乼, U+5125 儥, U+58B0 墰, U+60C6 惆, U+6D2C 洬, U+6E54 湔, U+83C2 菂, U+83DF 菟, U+86C0 蛀, U+8729 蜩, U+8CD9 賙, U+90DC 郜, U+99B0 馰, U+9C4F 鱏, U+9D69 鵩, and U+9EF7 黷 to uni4E7C-CN, uni5125-CN, uni58B0-CN, uni60C6-CN, uni6D2C-CN, uni6E54-CN, uni83C2-CN, uni83DF-CN, uni86C0-CN, uni8729-CN, uni8CD9-CN, uni90DC-CN, uni99B0-CN, uni9C4F-CN, uni9D69-CN, and uni9EF7-CN, respectively, in the KR CMap resource per Issue #61 (Design difference between Source Han Sans K and Source Han Serif K 3).
  • Map U+3B6D 㭭 and U+5225 別 to uni3B6D-JP and uni5225-JP, respectively, in the TW CMap resource.
  • Map U+5A66 婦 and U+7199 熙 to uni5A66uE0101-JP and uni7199-JP, respectively, in the KR CMap resource.
  • Map U+2F2C ⼬ to uniFA3C-JP in the JP (and, by extension, KR) CMap resource, and to uni5C6E-CN in the CN (and, by extension, TW) CMap resource.
  • Map U+284DC 𨓜 to uni9038-JP in the JP (and, by extension, all) CMap resources.
  • Map U+8056 聖 and U+83BD 莽 to uni8056-TW and uni83BD-JP, respectively, in the KR CMap resource.
  • Investigate the U+F92C 郎 issue that affects the 'locl' GSUB feature.
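The entries above all follow the same pattern (code point, target glyph, affected CMap resources), so they can be recorded as data. The following is a rough sketch under that assumption; `MappingChange` and `apply_changes` are hypothetical names, and the region tables are empty placeholders rather than real CMap content. Only three entries from the list are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingChange:
    code_point: int   # Unicode scalar value
    glyph: str        # existing glyph name to map to
    regions: tuple    # CMap resources affected, e.g. ("JP", "KR")


# Three of the changes listed above, transcribed as records.
CHANGES = [
    MappingChange(0x732A, "uni732A-JP", ("CN",)),
    MappingChange(0x5009, "uni5009-JP", ("TW",)),
    MappingChange(0x7300, "uni7300-CN", ("JP", "KR")),
]


def apply_changes(cmaps, changes):
    """Return copies of the per-region tables with the changes applied."""
    updated = {region: dict(table) for region, table in cmaps.items()}
    for change in changes:
        for region in change.regions:
            updated[region][change.code_point] = change.glyph
    return updated


# Placeholder tables; real CMap resources hold tens of thousands of entries.
cmaps = {region: {} for region in ("JP", "KR", "CN", "TW")}
updated = apply_changes(cmaps, CHANGES)

for region, table in updated.items():
    for code_point, glyph in sorted(table.items()):
        print(f"{region}: U+{code_point:04X} -> {glyph}")
```

Keeping the originals untouched and returning copies makes it easy to compare the tables before and after a batch of suggestions.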
@hfhchan

hfhchan commented Apr 16, 2017

image
Though they are outside the scope of the TW subset, the JP glyphs for U+77D2 and U+61DC could be used for the TW locale, similar to Issues #26 and #32.

@kenlunde

@hfhchan: Because this is easy to do, it shall be done.

@tamcy

tamcy commented Apr 24, 2017

u 8019_803b_tw

Incorrect mapping for U+8019 耙 and U+803B 耻 in TW. Both should look the same as the JP/KR glyph.

@kenlunde

kenlunde commented Apr 24, 2017

@tamcy: Noted. Thanks! (Also, while U+803B is outside the scope of the TW coverage, the fix is easy and shall be done.)

@tamcy

tamcy commented Apr 26, 2017

It seems that U+9EFD 黽 incorrectly serves the CN glyph (CID 48026) in the JP font when the language is set to "zh-tw". The language-specific version, SHSerif-TW, uses the correct glyph, which is CID 48025. Looks like an issue similar to #43?

@kenlunde

@tamcy: Yes, it is similar to #43, and the solution is to map U+2FCC ⿌ to uni9EFD-JP (CID+48025) in the TW CMap resource.
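As a side note on why the Kangxi radical needs its own mapping: U+2FCC is a separate code point whose compatibility decomposition is U+9EFD, so it has its own entry in each CMap resource. The sketch below checks that relationship with Python's standard `unicodedata` module; the helper name `unified_equivalent` is made up for illustration, and U+2F2C from the list at the top of this issue is included for comparison.

```python
import unicodedata


def unified_equivalent(ch):
    """Return the NFKC form, which folds a Kangxi radical to its unified ideograph."""
    return unicodedata.normalize("NFKC", ch)


# U+2FCC (discussed here) and U+2F2C (from the list in the opening comment).
for radical in ("\u2FCC", "\u2F2C"):
    unified = unified_equivalent(radical)
    print(f"U+{ord(radical):04X} -> U+{ord(unified):04X}")
```

Fonts do not apply NFKC on their own, so unless the radical's code point is mapped deliberately, it can end up pointing at a different regional glyph than its unified counterpart.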

@hfhchan

hfhchan commented May 2, 2017

image

The TW mapping for U+5225 should use the JP glyph. In the Code Charts, the bottom-left component is customarily joined upwards, and the overall shape of 11202 looks too wide for customary TW/HK use.

U+8382 and U+3B6D could borrow the JP glyphs for TW as well, since their bottom-left protrusion is considerably closer to convention.

@kenlunde

kenlunde commented May 2, 2017

While these mapping changes are relatively easy to implement, you missed the deadline for the dot release, given the extraordinarily large number of moving parts that are involved. These will need to wait for a subsequent release.

@kenlunde

kenlunde commented May 2, 2017

@hfhchan: I am willing to map U+3B6D 㭭 and U+5225 別 to uni3B6D-JP and uni5225-JP, respectively, in the TW CMap resource, but not U+8382 莂, mainly due to the inappropriate radical shape, which is far more striking.

@hfhchan

hfhchan commented May 2, 2017

@kenlunde, aren't the JP glyph and the CN glyph for U+8382 equally inappropriate for TW? Changing the fallback from the CN glyph to the JP glyph would not introduce more inappropriateness, as far as I can tell...

@kenlunde

kenlunde commented May 2, 2017

@hfhchan: Exactly, which is why I suggest leaving the mapping as-is unless a significant number of people also support the change.

@tamcy

tamcy commented May 4, 2017

Incorrect lookup data for U+90CE 郎 in JP and KR OTCs. When tagged in TW or CN, the font should substitute CID 41693 with CID 41694, but CID 41717 is incorrectly served as indicated in the jp2cn and jp2tw tables.
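A check of this kind can be sketched as a comparison between the expected substitution and what a lookup table actually contains. The dictionaries below only encode the CIDs quoted in this comment; their layout and the function `find_mismatches` are assumptions made for illustration, not the real jp2cn or jp2tw data format.

```python
# Expected substitution when text is tagged as CN or TW, per the report above.
EXPECTED = {41693: 41694}

# What the jp2cn table is reported to contain (hypothetical dict layout).
JP2CN_AS_REPORTED = {41693: 41717}


def find_mismatches(lookup, expected):
    """Return {source_cid: (actual, wanted)} for entries that disagree."""
    mismatches = {}
    for source, wanted in expected.items():
        actual = lookup.get(source)
        if actual != wanted:
            mismatches[source] = (actual, wanted)
    return mismatches


mismatches = find_mismatches(JP2CN_AS_REPORTED, EXPECTED)
for source, (actual, wanted) in mismatches.items():
    print(f"CID {source}: table gives {actual}, expected {wanted}")
```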

shserif-u90ce

@kenlunde

kenlunde commented May 4, 2017

@tamcy: I torpedoed my earlier reply, because the situation is quite complex, and involves KS X 1001 and GB 18030.

U+F92C 郎 (its canonical equivalent is U+90CE 郎) is a GB 18030 character, and its representative glyph is the same as uni90DE-CN, meaning with the extra stroke, so its code point maps to that CN glyph. U+F92C 郎 was a KS X 1001 character. Its K source was moved to U+FA2E 郞 with U+90DE 郞 being its canonical equivalent.

Because this will be addressed after the Version 1.001 update, I will give this some more thought. The work-around, especially if you're using the OTCs, is to explicitly select the CN font in the OTC.
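The canonical equivalences mentioned above can be confirmed with Python's standard `unicodedata` module. This is only a small sketch: the helper name `canonical_equivalent` is made up, and the result for U+FA2E assumes a Python build whose Unicode data is version 6.1 or later, since that code point was added after the original compatibility block.

```python
import unicodedata


def canonical_equivalent(ch):
    """Return the NFC form; CJK compatibility ideographs become unified ideographs."""
    return unicodedata.normalize("NFC", ch)


for compat in ("\uF92C", "\uFA2E"):
    unified = canonical_equivalent(compat)
    print(f"U+{ord(compat):04X} -> U+{ord(unified):04X}")
```

One consequence is that any text that has passed through normalization no longer contains U+F92C at all, so the glyph chosen for it only affects unnormalized text.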

@hfhchan

hfhchan commented May 15, 2017

Consider removing the JP glyph for U+85F2:

image

@kenlunde

The JP glyph for U+85F2 藲, uni85F2-JP, is already a candidate for removal.

@Jstamz

Jstamz commented Sep 8, 2021

Glyph001
I found 12 KR glyphs in Version 1.001 that differ noticeably from those in common Korean fonts.
(함초롬바탕/HCR Batang: the default font of the Hangul word processor)
(HY 신명조, HY 견명조: common serif fonts bundled with MS Office/Hangul Office)
(나눔고딕: a popular free sans serif font)
(한양해서: a regular script font widely used in Korea)

I suggest remapping 8 of the 12 glyphs above: 咎(u+548e), 嗤(u+55e4), 憊(u+618a), 潤(u+6f64), 竇(u+7ac7), 續(u+7e8c), 讀(u+8b80), 隙(u+9699). (Reason: those glyphs are minor variants in Korea.)

(1) I think the mapping for this character points to an incorrect glyph. (I have not seen any other font render 潤(⿰⺡閏) as ⿰⺡⿵⾨壬. Comparing 潤[⿰⺡⿵⾨壬] to 閏[⿵⾨王] in the font makes the difference easy to see.)
潤(u+6f64): KR -> JP (*I think the KR glyph for U+6F64 is incorrect in both Source Han Sans and Source Han Serif)

(2) The KR glyphs for these characters are rarely used in Korea; the JP glyphs are more widely used and are closer to the traditional* glyph.
(*These characters are phono-semantic compounds; the common phonetic component is 𧶠, not 賣.)
竇(u+7ac7), 續(u+7e8c), 讀(u+8b80): KR -> JP

(3) These KR glyphs are not rare, but are minor variants.
咎(u+548e): KR -> JP
嗤(u+55e4): JP -> CN
憊(u+618a): KR -> JP
隙(u+9699): KR -> JP
