My curated issues on Source Han Serif v2.000 #121

Open
tamcy opened this issue Oct 29, 2021 · 14 comments
Comments

@tamcy

tamcy commented Oct 29, 2021

Hi,

This is my curated list of issues found in Source Han Serif v2.000. They are mostly related to the newly added HK variant. Issues thought to be related to the build process migration as mentioned by the devs are skipped. Thank you!

💡: New glyph needed.
❶: Fixed in v2.001.

A. Issue with the masters

# Region Codepoint Current glyph name Remarks
❶ A1 HK U+270AE 𧂮 u270AE-HK Contour mismatch:
u270AE-HK

B. Glyph issues

# Region(s) Codepoint Current glyph name Remarks
❶ B1 HK U+3FFA 㿺 uni3FFA-HK 共's last stroke incorrect. Actually the CN glyph could be used, no need for a new glyph.
uni3FFA-HK
B2 TW,HK U+4E31 丱 uni4E31-TW This is just the bottom of 𢇇. Currently the design is a replication of the reference glyph, thus its two connecting strokes do not conform to Source Han Serif's chosen stroke type. Suggest to use characters like 關, 聯 as references.
uni4E31-TW
❶ B3💡 CN,TW,HK U+51F8 凸 uni51F8-JP Due to the following stroke form difference the JP glyph can't be shared with CN/TW/HK.
uni51F8-JP
❶ B4💡 CN,TW,HK U+51F9 凹 uni51F9-JP Ditto.
uni51F9-JP
❶ B5💡 CN,TW,HK U+534D 卍 uni534D-JP Ditto.
uni534D-JP
B6💡 HK U+63B0 掰 uni63B0-CN The 手 component on the left is different.
uni63B0-CN
❶ B7 HK U+74CA 瓊 uni74CA-JP The 攵 component isn't the same. Should either remap to uni74CAuE0101-JP or uni74CA-TW. Not sure if this is an isolated issue, still explicitly mention it here since it can be easily overlooked.
uni74CA-JP
❶ B8 HK U+7631 瘱 uni7631-TW The last stroke of 大 is different. Suggest to remap to uni7631-CN.
uni7631-TW
B9💡 TW,HK U+7661 癡 uni7661-CN The 匕 component is different, can't use CN form.
uni7661-CN
❶ B10💡 CN,TW,HK U+774F 睏 uni774F-JP The last stroke of 木 in 困 is different, can't use JP form.
uni774F-JP
B11💡 HK U+79B8 禸 uni79B8-TW This component is actually the bottom part of 禺/禽; the middle stroke could be vertical or slanted. The HK version adopts the slanted stroke. Do note that this 厶-like form is still three strokes, not two, thus the JP (not CN) form should be followed. A new glyph would be needed. While at it, I want to add that this 冂-like form in TC actually looks like the design in uni5D4EuE0101-JP or uni6A8EuE0101-JP. I am not asking to change all of the related glyphs here (we can say the triangle-like decoration of the vertical line for this serif form is kind of buried into the horizontal line due to the limited amount of space), but for this particular codepoint, which shows the 禸 component alone, you may consider preserving this detail. (Edit 2021/11/2: U+2F71 should also remap to this new glyph.)
uni79B8-TW
B12 (see L8 below) TW,HK U+8985 覅 uni8985-TW The first stroke of 女 is different.
uni8985-TW
❶ B13💡 HK U+8D81 趁 uni8D81-CN The stroke form of 人 is different, can't use CN form.
uni8D81-CN
❶ B14 TW,HK U+9156 酖 uni9156-CN Design different for 酉. Can remap to uni9156uE0101-JP. BTW, uni9156-JP and uni9156-CN look the same to me.
uni9156-CN
❶ B15 TW,HK U+9165 酥 uni9165-JP Ditto. Can remap to uni9165uE0101-JP.
uni9165-JP
❶ B16💡 HK U+9D0C 鴌 uni9D0C-JP The topmost component (天) is different from JP (夭).
uni9D0C-JP
❶ B17💡 CN,TW,HK U+9F0F 鼏 uni9F0F-JP The component 鼎 can't be shared with JP.
uni9F0F-JP
❶ B18 TW,HK U+9F71 齱 uni9F71-TW The Heavy master of 耳 isn't correct. The last stroke shouldn't protrude.
output
B19 HK U+200EE 𠃮 u200EE-HK Check the last stroke of the 水 component.
u200EE-HK
B20 HK U+20E78 𠹸 u20E78-HK Check the last stroke of the 衣 component.
u20E78-HK
B21 HK U+2A1DF 𪇟 u2A1DF-HK Check the last stroke of the second 人 component in 谷.
u2A1DF-HK

Note: For U+51F8 凸, U+51F9 凹, U+534D 卍 and U+9F0F 鼏, a correct version is found in the v1.x version of the font. Still, I strongly suggest starting over from the JP version for the first 3 glyphs because of the better design.
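
For anyone cross-checking the rows above, here is a minimal sketch of how the glyph names quoted in this issue could be split into codepoint, variation selector and region. The pattern is an assumption inferred only from the names that appear in this thread (`uni3FFA-HK`, `u270AE-HK`, `uni74CAuE0101-JP`, `uni7BB8-JP90-JP`); it is not taken from the project's build files, and `parse_glyph_name` is a hypothetical helper.

```python
import re

# Assumed pattern, inferred from names quoted in this issue only:
#   uniXXXX-REGION        (BMP codepoint)
#   uXXXXX-REGION         (supplementary-plane codepoint)
#   uniXXXXuE01xx-REGION  (codepoint plus a variation selector)
#   uniXXXX-TAG-REGION    (extra tag such as JP90 before the region)
GLYPH_NAME = re.compile(
    r"^(?:uni(?P<bmp>[0-9A-F]{4})|u(?P<sup>[0-9A-F]{5,6}))"
    r"(?:u(?P<vs>E01[0-9A-F]{2}))?"
    r"(?:-(?P<tag>[A-Z0-9]+))?"
    r"-(?P<region>CN|TW|HK|JP|KR)$"
)


def parse_glyph_name(name):
    """Split a glyph name into its parts (hypothetical helper)."""
    m = GLYPH_NAME.match(name)
    if m is None:
        raise ValueError(f"unrecognized glyph name: {name!r}")
    codepoint = int(m.group("bmp") or m.group("sup"), 16)
    selector = int(m.group("vs"), 16) if m.group("vs") else None
    return {
        "codepoint": codepoint,
        "char": chr(codepoint),
        "selector": selector,
        "tag": m.group("tag"),
        "region": m.group("region"),
    }


if __name__ == "__main__":
    for name in ("uni3FFA-HK", "u270AE-HK", "uni74CAuE0101-JP", "uni7BB8-JP90-JP"):
        print(name, parse_glyph_name(name))
```

Comparing the parsed region with the region column of a row shows at a glance whether a locale is borrowing another locale's glyph.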

C. Stroke form issues with the component 聽

# Region(s) Codepoint Current glyph name Remarks
❶ C1💡 HK U+5EF3 廳 uni5EF3-TW The component 耳 is the same as JP but different from TW, while the component 𡈼 is the same as TW but different from JP. Both are different from CN.
uni5EF3-TW
❶ C2💡 HK U+807D 聽 uni807D-JP Ditto.

D. Stroke form issues with the component 貫

# Region(s) Codepoint Current glyph name Remarks
❶ D1💡 CN,TW,HK U+645C 摜 uni645C-JP Glyph can't be shared due to stroke form difference.
uni645C-JP
❶ D2💡 CN,TW,HK U+8CAB 貫 uni8CAB-JP Ditto.
❶ D3 HK U+2037F 𠍿 u2037F-HK Check the 毌 component.
u2037F-HK
  • For U+645C 摜, the correct form could be found in v1, but the mapping for TW was wrong.
  • For U+8CAB 貫, the correct form could be found in v1.

E. Stroke form issues with the component 臣

  • The stroke form of 臣 is different for JP/KR and CN/TW/HK. Any TW/HK glyph that shares the glyph with JP/KR can be considered a bug.
  • And, I consider this an isolated bug, because for many cases a new glyph would be needed.
  • On the other hand, there are glyphs that are mapped to CN, which are intrinsically incorrect because the TW/HK standard form uses "⿱𠂉一" while the CN form uses "⿱𠂉丶". I am also reporting them here because they can't be fixed by simply changing the mapping to JP.

# Region(s) Codepoint Current glyph name Remarks
❶ E1💡 TW,HK U+3A5C 㩜 uni3A5C-JP
❶ E2💡 TW,HK U+5C37 尷 uni5C37-CN
❶ E3💡 TW,HK U+61E2 懢 uni61E2-JP
E4💡 TW,HK U+64E5 擥 uni64E5-JP
❶ E5💡 TW,HK U+652C 攬 uni652C-CN
❶ E6💡 TW,HK U+6ABB 檻 uni6ABB-CN
❶ E7💡 TW,HK U+6B16 欖 uni6B16-JP
❶ E8💡 TW,HK U+6FEB 濫 uni6FEB-CN
❶ E9💡 TW,HK U+76EC 盬 uni76EC-JP
E10💡 HK U+787B 硻 uni787B-JP TW has its dedicated glyph.
❶ E11💡 TW,HK U+7925 礥 uni7925-JP
❶ E12💡 TW,HK U+81E9 臩 uni81E9-JP
❶ E13💡 TW,HK U+89A7 覧 uni89A7-JP
❶ E14💡 TW,HK U+8CE2 賢 uni8CE2-JP
❶ E15💡 TW,HK U+8D12 贒 uni8D12-JP
❶ E16💡 TW,HK U+8F5E 轞 uni8F5E-JP
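
The two kinds of 臣 problem described before the table (TW/HK sharing a JP glyph versus TW/HK mapped to a CN glyph) can be separated mechanically from the glyph-name suffix. The sketch below just copies the rows of table E; `group_by_source_region` is a hypothetical helper, and it assumes the part after the last hyphen is the region the glyph was designed for.

```python
# Rows copied from table E: (item, codepoint, current glyph name).
E_ROWS = [
    ("E1", 0x3A5C, "uni3A5C-JP"), ("E2", 0x5C37, "uni5C37-CN"),
    ("E3", 0x61E2, "uni61E2-JP"), ("E4", 0x64E5, "uni64E5-JP"),
    ("E5", 0x652C, "uni652C-CN"), ("E6", 0x6ABB, "uni6ABB-CN"),
    ("E7", 0x6B16, "uni6B16-JP"), ("E8", 0x6FEB, "uni6FEB-CN"),
    ("E9", 0x76EC, "uni76EC-JP"), ("E10", 0x787B, "uni787B-JP"),
    ("E11", 0x7925, "uni7925-JP"), ("E12", 0x81E9, "uni81E9-JP"),
    ("E13", 0x89A7, "uni89A7-JP"), ("E14", 0x8CE2, "uni8CE2-JP"),
    ("E15", 0x8D12, "uni8D12-JP"), ("E16", 0x8F5E, "uni8F5E-JP"),
]


def group_by_source_region(rows):
    """Group item labels by the region suffix of the current glyph name."""
    groups = {}
    for item, _codepoint, glyph in rows:
        # Assumption: the text after the last hyphen is the source region.
        region = glyph.rsplit("-", 1)[1]
        groups.setdefault(region, []).append(item)
    return groups


if __name__ == "__main__":
    for region, items in group_by_source_region(E_ROWS).items():
        print(region, items)
```

Rows in the JP group are the glyph-sharing case; rows in the CN group are the "⿱𠂉一" versus "⿱𠂉丶" case.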

F. Stroke form issues with the component 㡀

This can be easily overlooked, so here they are:

# Region(s) Codepoint Current glyph name Remarks
❶ F1💡 TW,HK U+5F46 彆 uni5F46-CN
❶ F2💡 TW,HK U+87DE 蟞 uni87DE-CN
❶ F3💡 HK U+8E69 蹩 uni8E69-CN
❶ F4💡 TW,HK U+9128 鄨 uni9128-CN
❶ F5💡 TW,HK U+9C49 鱉 uni9C49-CN
❶ F6💡 TW,HK U+9DE9 鷩 uni9DE9-CN
❶ F7 HK U+9F08 鼈 uni9F08-HK Can also map TW glyph to HK

G. Stroke form issues with the component 今

The form of 今 adopted by HK/TW is different from JP/KR/CN. This is unfortunate; I like the JP/KR form more, which should also be fine at least for HK (the HKSCS document is sometimes inconsistent). But anyway.

# Region(s) Codepoint Current glyph name Remarks
G1 HK U+3597 㖗 uni3597-HK
uni3597-HK
G2 HK U+6407 搇 uni6407-HK Ditto.
uni6407-HK
G3 HK U+6657 晗 uni6657-JP
G4 HK U+6C75 汵 uni6C75-HK
G5 HK U+6D5B 浛 uni6D5B-HK If unmodified, the form is essentially the same as uni6D5B-JP.

H. Stroke form issues with the component ⺮

Similarly, Source Han Serif adopts a different design of the ⺮ component for TW/HK. Not a big deal, just for better consistency.

# Region(s) Codepoint Current glyph name Remarks
❶ H1 💡 HK U+7BB8 箸 uni7BB8-JP90-JP
H2 HK U+21828 𡠨 u21828-HK
H3 HK U+25D20 𥴠 u25D20-HK

(No section I in this post)

J. Glyph position issues

This is related to the glyph positioning discrepancy between HKSCS and the current form. HKSCS mostly prefers centering the glyph in the em box. Depending on whether the mapped region requires the current glyph position, new glyphs might be necessary.

# Region(s) Codepoint Current glyph name Remarks
J1 HK U+4EBB 亻 uni4EBB-CN
J2 HK U+5202 刂 uni5202-CN
J3 HK U+5FC4 忄 uni5FC4-JP
J4 HK U+624C 扌 uni624C-CN
J5 HK U+6C35 氵 uni6C35-JP
J6 HK U+72AD 犭 uni72AD-CN
J7 HK U+793B 礻 uni793B-CN
J8 HK U+7E9F 纟 uni7E9F-CN
J9 HK U+7F52 罒 uni7F52-CN
J10 HK U+8BA0 讠 uni8BA0-CN
J11 HK U+961D 阝 uni961D-CN
J12 HK U+248E9 𤣩 uni2EA9-JP
J13 HK U+2626A 𦉪 u2626A-HK
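
To make "centering the glyph in the em box" concrete, here is a rough sketch of the horizontal shift it implies. The numbers are made up for illustration, the 1000-unit advance is an assumption rather than a value read from the font, and `centering_shift` is a hypothetical helper.

```python
def centering_shift(x_min, x_max, advance=1000):
    """Horizontal shift that centers an outline's bounding box in its advance.

    x_min and x_max are the left and right edges of the bounding box in font
    units. The default advance of 1000 units is an assumption.
    """
    bbox_center = (x_min + x_max) / 2
    return advance / 2 - bbox_center


if __name__ == "__main__":
    # Hypothetical left-side component occupying x = 100..400.
    shift = centering_shift(100, 400)
    print("shift by", shift, "units; new bbox:", 100 + shift, "to", 400 + shift)
```

A glyph whose bounding box is already centered returns a shift of 0, so the same check can be used to list which of the J rows are off-center and by how much.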

K. Others

# Region(s) Codepoint Current glyph name Remarks
❶ K1 HK uni819A-HK The contour of 月 is too wide.
uni819A-HK
@tamcy changed the title from "Curated issues on Source Han Serif v2.000" to "My curated issues on Source Han Serif v2.000" on Oct 29, 2021
@tamcy
Author

tamcy commented Oct 29, 2021

Turns out that I missed several items in my last post. My apologies. These should have belonged to section B.

L. Addendum of glyph issues

# Region(s) Codepoint Current glyph name Remarks
❶ L1💡 HK U+3DDB 㷛 uni3DDB-CN Stroke type difference.
❶ L2 HK U+453D 䔽 uni453D-HK Stroke type difference.
❶ L3 HK U+50E7 僧 uni50E7-HK Stroke type difference.
L4 (on hold)💡 HK U+5DCD 巍 uni5DCD-TW Stroke type difference.
❶ L5💡 HK,TW U+5DD6 巖 uni5DD6-JP Stroke form difference, 山 component can't share with JP. See the character above for reference.
❶ L6💡 HK U+7171 煱 uni7171-JP Stroke type difference.
L7 (on hold)💡 HK U+8636 蘶 uni8636-TW Stroke type difference.
L8 (on hold) HK,TW U+8985 覅 uni8985-TW Stroke type difference. Can remap to uni8985-CN. BTW, the TW glyph should also map to CN. (Update Nov 16: I forgot why I wrote this, but obviously uni8985-CN cannot be used for TW and HK. uni8985-TW should be redesigned so that the middle of 女 is a horizontal stroke.)

@pan-asian-wok

pan-asian-wok commented Oct 30, 2021

Hi @tamcy! I hope you don't mind me adding an addendum to your post. I don't want to flood the Issues section with more reports, and my finding is similar to what you have discovered with other characters.

Stroke Form Issues with the Component 將

將 Stroke Component Issue

The stroke form of 將 is different for Japanese/Korean, Taiwanese, and Hong Kong versions. Because 將 is printed differently in the Taiwanese versions, it looked like all the glyphs with that component were redone to match. However, in the Hong Kong version, it looks like some characters with 將 were not redone if no other component differed from the Japanese and/or Korean version. In other words, the stroke form is not consistent between all characters with the 將 component in the Hong Kong version. Please refer to the picture above for examples.

@tamcy
Author

tamcy commented Oct 30, 2021

@pan-asian-wok No problem. I am aware of the codepoints you mentioned; in fact, there are 7 codepoints that exhibit the incorrect glyph form. They are U+5C07 將, U+6F3F 漿, U+69F3 槳, U+5D88 嶈, U+588F 墏, U+87BF 螿 and U+8E61 蹡. Given that

  • This is obviously a component level issue. All codepoints that have no HK dedicated glyphs are mapped incorrectly. There's only one exception: U+93D8 鏘.
  • These 7 codepoints should have been mapped to CN, but turned out being mapped to JP.
  • Since the codepoints can be remapped to CN, no new glyph is needed. Which means the glyph difference is probably not overlooked.

As I have said, this issue is about isolated glyph issues, so those thought to be related to the build process migration will not be mentioned. Which is why I didn't bother reporting them.
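
For reference, the remap described above could be written down as a small table. This is only a sketch: the seven codepoints come from this comment, but the glyph names are hypothetical ones built from the naming convention seen in this thread and have not been checked against the actual glyph set.

```python
# The seven codepoints listed above, reported as mapped to JP for HK.
AFFECTED = [0x5C07, 0x6F3F, 0x69F3, 0x5D88, 0x588F, 0x87BF, 0x8E61]


def glyph_name(codepoint, region):
    """Build a glyph name in the style used in this thread (assumed)."""
    if codepoint <= 0xFFFF:
        return "uni%04X-%s" % (codepoint, region)
    return "u%05X-%s" % (codepoint, region)


# character -> (assumed current HK mapping, proposed HK mapping)
REMAP = {
    chr(cp): (glyph_name(cp, "JP"), glyph_name(cp, "CN")) for cp in AFFECTED
}

if __name__ == "__main__":
    for char, (current, proposed) in REMAP.items():
        print(char, current, "->", proposed)
```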

(Update 2022/1/28: These codepoints are now consolidated in #155 with other issues found in v2.001)

@NightFurySL2001

NightFurySL2001 commented Oct 30, 2021

Notes:
B16 (U+9D0C 鴌): According to Unicode 14.0, there is no JP version, and the other regions use 天 rather than 夭. The JP glyph can probably be edited and renamed to either CN or HK and then mapped to all regions. (HK is more believable since it is in HKSCS)
No new glyph might be needed since currently ⿱夭鳥 is unencoded and there exist no IVDs for 鴌.

E (臣) series: Those that do not have a CN glyph can use a mapping too (E9-E12, E14-15).

J (centering): For simplified components (e.g. 纟讠), is it reasonable to follow HKSCS? Do other simplified components (e.g. 钅饣) need such adjustments too? Will Source Han Sans need these adjustments too?

L6 (U+7171 煱): Can be applied for TW too.

L8 (U+8985 覅): Repeated with B12. Also, since 女 is different in CN and TW/HK but TW/HK both use a horizontal stroke, just adjusting the current TW/HK glyph would be good enough, i.e. no new glyph needed.

(Edit: wrong code given, please check again)

@NightFurySL2001

Actually, I just noticed that the style of the G (今) series is not mentioned anywhere in the Taiwan MOE style (at least in documents available online); it only exists in fonts, similar to the style of 亠 and 宀 touching the horizontal stroke.

@punchcutter
Member

@tamcy For B6 is the TW glyph completely unacceptable because of the top of 分? At least for now we could map it to TW. Sans is the same with HK mapped to CN.

@tamcy
Author

tamcy commented Nov 16, 2021

@punchcutter Standards aside, either one is fine given this isn't a frequently used character. I'd suggest leaving it unchanged if it is going to be an interim solution, because the top-right stroke of TW's 分 is written as 乁, which isn't quite "native" to the HK version. But for CN's 掰, the left component can be seen in 拜, and its center component can be seen in 分, 扮 etc.

@tamcy
Author

tamcy commented Nov 21, 2021

I hope it isn't too late. I would like to suggest putting the fixes of L4 (U+5DCD 巍) and L7 (U+8636 蘶) on hold first, because there are actually more codepoints/glyphs affected, to the extent that I think it might be better to have it categorized as an "acceptable unifiable difference".

@punchcutter
Member

@tamcy Not too late. I reformatted the issue lists here to add links to each point. That makes it easier to reference and since I'm also updating the release notes for the next version it's nice to link directly to a specific point like L4.

@tamcy
Author

tamcy commented Dec 10, 2021

# Region(s) Codepoint Current glyph name Remarks
❶ B22 HK U+245C8 𤗈 u245C8-HK The left component is a 片, not 爿:
245C8

@punchcutter
Member

@tamcy Thanks for checking and closing some issues. And also for putting the marker on items addressed in this issue. That's very helpful.

@tamcy
Author

tamcy commented Jan 27, 2022

@punchcutter Don't mention it. Glad it helps!

@NightFurySL2001

NightFurySL2001 commented Jun 5, 2022

@tamcy just to check, you have marked B6 as done (❶ B6), but there is no mention of it in the README.pdf or updates in the font file. (@punchcutter please note that B6 is not fixed yet.)

@tamcy
Author

tamcy commented Jul 11, 2022

@NightFurySL2001 Thank you, I have removed the marker.


4 participants