Justifying Korean text #95

Closed
r12a opened this issue Mar 16, 2016 · 2 comments
Labels
agreed-to-close-during-mtg i18n group has discussed and resolved to close, typically in telecon close? The related issue was closed by the Group but open here i:justification Text alignment & justification klreq s:css-text https://drafts.csswg.org/css-text/ script-kore Korean script tracker i18n is following a discussion, but doesn't require resolution. type-info-request

Comments

@r12a
Contributor

r12a commented Mar 16, 2016

http://www.w3.org/Mail/flatten/index?subject=Justifying+Korean+text&list=public-i18n-cjk

Raised by: Richard Ishida
Opened on: 2014-07-10

About: http://dev.w3.org/csswg/css-text/#text-justify-property
Raised by: Koji Ishii

Hello/안녕하세요

Could someone please help us discuss what’s right for justifying Korean text? This is a rather long e-mail; sorry for not being able to keep it short.

Here’s some background. Last year, the CSS WG discussed the text-justify property[1] and made a few resolutions. The full resolutions are here[2], but in summary:

  1. Make justification behavior follow the content language[3] as automatically as possible, and remove as many behavior-specific values as possible.
  2. With that, the “inter-ideograph” value (to expand between ideographic characters) was removed, but the “inter-word” value (not to expand between ideographic characters) is still in.

In this context, I’m having difficulty coming up with what’s good for Korean text.

In my understanding, there are 3 types of Korean documents:

  1. Ideographic only, ancient documents (may sometimes contain some Hangul characters).
  2. Mostly Hangul, with a few to some ideographic characters per paragraph or page.
  3. All Hangul, no ideographic characters.
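To make the three types concrete, here is a minimal Python sketch that sorts a piece of text into type 1, 2, or 3 by counting Hangul and ideographic characters. The function names and the "more ideographs than Hangul" cut-off between types 1 and 2 are assumptions made for illustration; they are not part of the original e-mail or of any specification.

```python
def is_hangul(ch: str) -> bool:
    """True for Hangul syllables, Hangul Jamo, and Hangul Compatibility Jamo."""
    cp = ord(ch)
    return (0xAC00 <= cp <= 0xD7A3
            or 0x1100 <= cp <= 0x11FF
            or 0x3130 <= cp <= 0x318F)


def is_ideograph(ch: str) -> bool:
    """True for CJK Unified Ideographs, Extension A, and Compatibility Ideographs."""
    cp = ord(ch)
    return (0x4E00 <= cp <= 0x9FFF
            or 0x3400 <= cp <= 0x4DBF
            or 0xF900 <= cp <= 0xFAFF)


def classify_korean_text(text: str) -> int:
    """Return 1, 2, or 3 following the document types listed above.

    Assumption (hypothetical threshold): text with more ideographs than
    Hangul counts as type 1; any other mix counts as type 2.
    """
    hangul = sum(1 for ch in text if is_hangul(ch))
    han = sum(1 for ch in text if is_ideograph(ch))
    if hangul + han == 0:
        raise ValueError("no Hangul or ideographic characters found")
    if han == 0:
        return 3  # all Hangul
    if han > hangul:
        return 1  # ideographic, possibly with some Hangul
    return 2      # mostly Hangul with some ideographs
```

Running such a classifier over a sample of pages would be one rough way to estimate the ratios asked about in Q2 and Q3.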

Q1. Is this understanding correct, or am I missing any other types?

I do not have a good sense of how common each type of document is, so here’s the next question.

Q2. Can you give us the ratio of each type of document on the web? I mean ratios such as “0:40:60”. Any statistics would be great, but your own estimate of the ratio is also helpful; if 10 people respond with their own ratios, that’s a sort of statistics, I suppose.
Q3. Is the ratio for papers/books/e-books different from the ratio for web documents? How about TV/movie captions, signage, or anywhere else where the web platform is used?

Next, let’s think about the case where the author sets lang=“ko” on the document (and text-align:justify, of course). This case is easier because we can focus on what’s right for Korean. In this case, in my understanding, you want to expand only at spaces, correct? No existing browser expands between Hangul characters, and I suppose this is the correct behavior. However, Chrome/Safari expand between ideographic characters; I’m guessing this is not the expected behavior for type #2 documents and you want to fix this.

Q4. Is the assumption above correct?
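The behaviors being compared can be sketched as a function that lists where extra space could be added on a justified line. This is only an illustration under simplifying assumptions: `expansion_gaps` and its two flags are hypothetical names, a gap between two characters counts only when both neighbors belong to an enabled script, and real CSS justification involves many more rules.

```python
from typing import List


def is_hangul(ch: str) -> bool:
    cp = ord(ch)
    return (0xAC00 <= cp <= 0xD7A3
            or 0x1100 <= cp <= 0x11FF
            or 0x3130 <= cp <= 0x318F)


def is_ideograph(ch: str) -> bool:
    cp = ord(ch)
    return (0x4E00 <= cp <= 0x9FFF
            or 0x3400 <= cp <= 0x4DBF
            or 0xF900 <= cp <= 0xFAFF)


def expansion_gaps(text: str,
                   expand_ideographs: bool = False,
                   expand_hangul: bool = False) -> List[int]:
    """Return each index i where space may be added after text[i].

    Word spaces are always opportunities. With both flags off this models
    "expand only at spaces" (the inter-word behavior described above).
    """
    def expandable(ch: str) -> bool:
        return ((expand_ideographs and is_ideograph(ch))
                or (expand_hangul and is_hangul(ch)))

    gaps = []
    for i in range(len(text) - 1):
        if text[i] == " " or (expandable(text[i]) and expandable(text[i + 1])):
            gaps.append(i)
    return gaps


line = "大韓民國은 민주"
print(expansion_gaps(line))                          # spaces only
print(expansion_gaps(line, expand_ideographs=True))  # also between ideographs
print(expansion_gaps(line, expand_ideographs=True, expand_hangul=True))
```

With only `expand_ideographs` set, the ideographic run at the start of the line is stretched while the Hangul that follows it is not, which is the uneven result described above for type #2 documents.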

The challenge in this case is that you will not be able to justify type #1 documents, because text-justify does not have a value to expand between ideographic characters. If you want to solve this, you have the following options:

  1. Mark such documents as lang=“zh” (Chinese). I’m not sure how right or wrong this is to you; are ancient documents considered Chinese, or are they ancient Korean? I’m guessing this is wrong, but I just wanted to ask. I’m sorry if this is a bad, impolite question; I hope you understand that I’m just trying to list all technically possible options here.
  2. Propose that the CSS WG revive the “inter-ideograph” value, so that you can mark documents as lang=“ko” and optionally expand between ideographic characters.
  3. Make “expand between ideographic and Hangul characters” the default, and always use “inter-word” for type #2/#3 documents. This gives you a choice, but at a cost: you have to mark all type #2/#3 documents as “inter-word”. I’m guessing the cost is not worth the value here?
  4. Such documents are rare, and justifying them is rarer still, possibly never done, so we don’t need to fix this specific case (please consider Q2/Q3 above).

Q5. Which option looks right to you, or anything else?

Next, a harder one: when the language is not specified. I suspect a large number of existing documents do not have lang, so this might affect backward compatibility more than Q5 does. I have to say that, in this case, there’s no single right solution because all existing browsers behave differently; we need to come up with some compromise, a good-enough behavior.

In this case, Chinese and Japanese documents want to expand between ideographic characters, while Korean type #2 documents do not, so there’s a conflict. I don’t know how to properly resolve this conflict. I’m guessing we should favor Chinese and Japanese documents because they use justification more often, and ideographic characters are not the primary script in Korea, but this is my personal opinion. Others might think differently, and answers to Q2/Q3 may also affect this.

Q6. What do you think about this?

Next. Let’s assume we took Chinese and Japanese (expand between ideographic characters) in Q6. In this case:

Q7. Do you want a) to expand between Hangul because Hangul and ideographic should behave the same way for type #2 documents, or b) not to expand between Hangul because doing so helps type #3 documents, even if it’s strange for type #2 documents?

Note that no browser today expands between Hangul, even when it expands between ideographic characters. I have no idea how strange this behavior is to you, especially when thinking about type #2 documents. In case you’re interested in seeing the results of my investigation of existing browser behaviors, here they are[4]. It’s primarily my own memo, quite terse and maybe hard to understand, though.

Lastly, this is not a question, but if you create justified Korean HTML documents today, I recommend that you add 1) lang=“ko” and 2) text-justify:inter-word. It’s hard to predict what the future will bring, but from what I can tell at this moment, this is considered the best practice to protect your documents in the future.
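As an illustration of that recommendation, the following Python sketch produces a paragraph carrying both the lang attribute and the text-justify declaration. The helper name `justified_korean_paragraph` is hypothetical, and using an inline style attribute is just one possible way to apply the CSS; a stylesheet rule would work equally well.

```python
from html import escape


def justified_korean_paragraph(text: str) -> str:
    """Wrap plain text in a <p> that declares Korean and inter-word justification."""
    style = "text-align: justify; text-justify: inter-word;"
    # escape() keeps characters such as < and & in the text from being read as markup.
    return f'<p lang="ko" style="{style}">{escape(text)}</p>'


print(justified_korean_paragraph("안녕하세요"))
```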

Even if you can answer only some of the questions, it’s still helpful. Thank you for reading this long e-mail; I look forward to hearing from you.

[1] http://dev.w3.org/csswg/css-text/#text-justify-property

[2] http://lists.w3.org/Archives/Public/www-style/2013Feb/0474.html

[3] http://dev.w3.org/csswg/css-text/#content-language

[4] http://1drv.ms/1r3iYme

/koji

@r12a r12a added pending Issue not yet sent to WG, or raised by tracker tool & needing labels. s:css-text https://drafts.csswg.org/css-text/ tracker i18n is following a discussion, but doesn't require resolution. and removed pending Issue not yet sent to WG, or raised by tracker tool & needing labels. labels Mar 16, 2016
@r12a r12a modified the milestone: track Mar 17, 2016
@r12a r12a added hangul i:justification Text alignment & justification type-info-request labels Jul 21, 2017
@r12a r12a added the klreq label Sep 6, 2018
@r12a r12a removed the hangul label Feb 25, 2019
@xfq
Member

xfq commented May 6, 2020

I think this should be a klreq issue instead of a css-text issue (although the results of the issue may have an effect on css-text)?

@r12a
Contributor Author

r12a commented Dec 4, 2020

I agree that this is more of a question for klreq (which didn't have a repo at that time). I pointed to the thread from w3c/klreq#12, and propose that we close this issue.

@r12a r12a added close? The related issue was closed by the Group but open here agreed-to-close-during-mtg i18n group has discussed and resolved to close, typically in telecon labels Dec 4, 2020
@r12a r12a closed this as completed Dec 10, 2020
@r12a r12a added the script-kore Korean script label Jul 11, 2024