Do we need a specialized version of text-spacing-trim: trim-all for brackets vs commas? #377

Open
frivoal opened this issue Oct 2, 2023 · 3 comments
Labels
question Questions about how Japanese works. These issues should be tracked in i18n-activity tracker.

Comments

@frivoal
Contributor

frivoal commented Oct 2, 2023

As requested in w3c/csswg-drafts#4246 (comment), and resolved in w3c/csswg-drafts#8482, the CSS-WG has added the trim-all value to the text-spacing-trim property. This allows authors to set fullwidth punctuation using half-width glyphs independently of context, enabling them to handle the situations described in fig 65, fig 66, or fig 67 of JLREQ.

(Note that this comes in addition to the preexisting values, which trim fullwidth punctuation contextually, dealing with 3.1.4, 3.1.5, or 3.1.9.)

The CSS-WG suspects that some of the use cases (such as fig 67) are semantic rather than stylistic, and that for those we would probably also need something in Unicode to handle them regardless of CSS, preserving the separation of style and content.

Within the cases that are stylistic, authors can now achieve what they want, but depending on exactly which punctuation marks they want to apply it to and in what context, they may need to inject markup into their document to set up classes on the right subset, so that text-spacing-trim: trim-all can be applied through the right selector.
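As a minimal sketch of the markup injection described above: a preprocessing step could wrap only the chosen punctuation in a `span`, and a stylesheet rule could then apply `text-spacing-trim: trim-all` to that class. The function name `wrap_for_trim`, the class name `trim`, and the default bracket set are assumptions made for illustration; none of them come from CSS or JLREQ.

```python
import html

# Hypothetical class name, chosen for this sketch only.
TRIM_CLASS = "trim"

# The stylesheet rule that would accompany the generated markup.
CSS_RULE = f".{TRIM_CLASS} {{ text-spacing-trim: trim-all; }}"


def wrap_for_trim(text: str, targets: str = "「」『』()") -> str:
    """Escape text as HTML and wrap each character found in `targets` in a span."""
    out = []
    for ch in text:
        escaped = html.escape(ch)
        if ch in targets:
            out.append(f'<span class="{TRIM_CLASS}">{escaped}</span>')
        else:
            out.append(escaped)
    return "".join(out)


if __name__ == "__main__":
    print(CSS_RULE)
    print(wrap_for_trim("彼は「はい」、と言った"))
```

Choosing which characters go into `targets` is exactly the subjective part: an author who wants only brackets trimmed passes brackets, and one who wants only commas passes `、` and `,`.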

Is that good enough?

Do we instead need more specialized values (maybe trim-brackets or trim-comma) that trim regardless of context, like trim-all, but only apply to certain subsets of fullwidth punctuation? If there are well established patterns, having such values could make life easier for authors, but adding many values to deal with various subjective preferences seems inappropriate.

See w3c/csswg-drafts#8482 (comment) for the minutes of the meeting where this was discussed.

@frivoal frivoal added the question Questions about how Japanese works. These issues should be tracked in i18n-activity tracker. label Oct 2, 2023
@frivoal frivoal changed the title from "Do we need specialized version of text-spacing-trim: trim-all for brackets vs commas?" to "Do we need a specialized version of text-spacing-trim: trim-all for brackets vs commas?" Oct 2, 2023
@KobayashiToshi
Contributor

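First, a general impression: are you really going to take on such a difficult-looking problem? I feel this issue could just as well be left alone. In other words, it is not a significant problem.

Reason 1: Writing numbers with a separator every three digits is gradually becoming less common, except in tables and the like. “二万三〇〇〇人” is now more common than “二三、〇〇〇人”. Tables in horizontal writing use Arabic numerals, so the problem does not arise there.

Reason 2: People who set the separator comma (読点) in three-digit grouping at half width (二分, half an em) are, if anything, in the minority (I am in that minority). However, InDesign makes this easy to do, so the practice is not declining.

Reason 3: In fact, if you adopt the practice of setting the three-digit separator comma in kanji numerals at half width, the following cases must be handled the same way. (Of course, there is a policy of tightening these and a policy of not tightening them; those who tighten are in the minority, and I am in that minority.)
 二、三の問題 (two or three problems) 五、六個 (five or six items)

Now, the half-width comma for three-digit grouping in kanji numerals comes down to a single yes-or-no decision. If you decide to use half width, every such comma is set that way; if not, none are. So this can be processed mechanically to a certain extent. But “二、三の問題” and “五、六個” are different. Under a tightening policy, the comma is tightened when the phrase is read “ni-san no mondai” (two or three problems), but when it can be read as the problem of “ni” and “san” (two and three), it is not tightened even under that policy. It truly depends on context.

So although the grouping case can be processed mechanically, setting every comma inside kanji numerals at half width is not possible.

Note that if the three-digit separator comma in kanji numerals is set at half width, then when Arabic numerals are set upright one character at a time in vertical writing, as in “2、300”, that comma must also be half width. In other words, the target is not only kanji numerals but Arabic numerals as well.

So, as you say, it comes down to providing a comma whose advance width is half an em and letting the author or editor insert it where needed.

Another solution would be to add a “minus half-em” (マイナス二分) character to the character set. There are cases where you want to tighten by half an em at just one spot, so a code point for such a “minus half-em” character would be useful. Shaken's SAPCOL had a command for this.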

@kidayasuo
Contributor

Below is a translation of the comment from bin-sensei, produced by ChatGPT with a few instructions, such as translating "二分" as "half-width".


Here's a translation of the provided text. I've tried to capture the nuances and meaning as accurately as possible:


First, my general impression is, are you tackling such a challenging issue? I feel like this problem might be one that could be left as is. In other words, it's not that significant.

Reason 1: The notation of numbers with a separator every three digits is gradually becoming less common, excluding table layouts. Most people use notations like "二万三〇〇〇人" (23,000 people) instead of "二三、〇〇〇人". In horizontal table layouts using Arabic numerals, this issue doesn't arise.

Reason 2: Those who adopt the method of separating every three digits with half-width punctuation are, if anything, in the minority (I am in that minority). However, this can be easily handled in InDesign, so it hasn't decreased.

Reason 3: In reality, if one adopts the method of separating every three digits with half-width punctuation in kanji numerals, the following cases should be treated similarly. (Of course, there are policies for both packing and not packing, but those who adopt the packing policy are in the minority, and I'm one of them)

Examples: "二、三の問題" (two or three problems), "五、六個" (five or six pieces).

So, the method of separating every three digits in kanji numerals can be decided simply by doing it or not. If you decide to separate, then everything follows that rule. If not, then nothing does. Therefore, to some extent, it can be mechanically processed. However, terms like "二、三の問題" or "五、六個" are not so straightforward. When the phrase is read as "ni-san no mondai" (two or three problems), the comma is packed, but when it can be read as the problem of "two" and "three" separately, it isn't packed even under a packing policy. It's truly context-dependent.

Therefore, although it can be mechanically processed, it's not possible to treat all punctuation within kanji numerals as half-width.

Additionally, when punctuating every three digits in kanji numerals as half-width, when Arabic numbers are vertically arranged character by character as "2、300", this punctuation must also be half-width. Meaning, not only kanji numerals but Arabic numerals are also targeted.
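The mechanical part of this, the three-digit grouping comma, could be located with a pattern along the following lines. This is a sketch under stated assumptions, not an established algorithm: the digit set, the function name `mark_group_commas`, and the class name `half-comma` are all hypothetical, and the rule used (an ideographic comma preceded by a digit and followed by exactly three digits) is only a heuristic.

```python
import re

# Assumed digit set: ASCII digits, fullwidth digits, and kanji digits.
DIGIT = "0-90-9〇一二三四五六七八九"

# An ideographic comma preceded by a digit and followed by exactly three digits.
GROUP_COMMA = re.compile(
    rf"(?<=[{DIGIT}])、(?=[{DIGIT}]{{3}}(?![{DIGIT}]))"
)


def mark_group_commas(text: str, class_name: str = "half-comma") -> str:
    """Wrap digit-grouping commas in a span so a stylesheet can trim them."""
    return GROUP_COMMA.sub(f'<span class="{class_name}">、</span>', text)


if __name__ == "__main__":
    for sample in ("二三、〇〇〇人", "2、300", "二、三の問題"):
        print(mark_group_commas(sample))
```

"二、三の問題" is left untouched here only because a single digit follows the comma. Whether that comma should be packed still depends on how the phrase is read, which no pattern of this kind can decide.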

As you've said, if half-width punctuation is prepared, then authors or editors can insert it as needed.

Another solution could be to add a "minus half-width" character to the character code. There are times when you'd want to pack just this half-width, so having this "minus half-width" character code would be convenient. The SAPCOL of Shaken had this command.


Please note that some technical or domain-specific terms related to typography or layout design might not have direct English equivalents, so I translated in a way to convey the intended meaning. Let me know if further context or refinements are needed!

@macnmm
Contributor

macnmm commented Oct 14, 2023

I also wonder whether partial compression scenarios are neglected by the all-or-nothing approach, but in web layout the odds of needing partial, prioritized compression of the punctuation glyphs are high.
