Maiyamok wraps alone to the start of a line #45

Open
r12a opened this issue Jan 27, 2021 · 10 comments

Comments

@r12a
Contributor

r12a commented Jan 27, 2021

ๆ [U+0E46 THAI CHARACTER MAIYAMOK] should not wrap to the beginning of a line, even if surrounded by spaces. A space before the maiyamok is mandated by the writing style guide of the Royal Institute, which is considered to be the official Thai language style guide and is used by government officials.

Specs:
This level of detail is not described in the CSS specs. However, it may be useful to add a note to the spec to say that occasionally characters like this need special handling, especially when following a space.

Tests & results:
interactive test, Maiyamok doesn't wrap to a new line alone, even if separated from the previous word by a space
Gecko, Blink, and WebKit all fail the test.

Priority:
Marking as advanced, since the undesirable behaviour isn't a major problem.

@r12a r12a added i:line_breaking Line breaking & hyphenation gap p:advanced doc:thai labels Jan 27, 2021
@srakrn

srakrn commented Jan 29, 2021

Some may ask why this should behave identically to typing Maiyamok without a preceding space. In other words, some might argue that authors could avoid the problem simply by not typing a space in front of Maiyamok.

The writing style guide of the Royal Institute, considered the official Thai language style guide (and therefore in use by government officials), mandates the space in front of Maiyamok, even though a line must not break at the space between the phrase and Maiyamok.

@r12a
Contributor Author

r12a commented Jan 29, 2021

The first comment in this issue contains text that will automatically appear in one or more gap-analysis documents as a subsection with the same title as this issue. Any edits made to that comment will be immediately available in the document. Proposals for changes or discussion of the content can be made in comments below this point.

Relevant gap analysis documents include:
Thai

@r12a
Contributor Author

r12a commented Jan 29, 2021

Thanks, @srakrn. I paraphrased your comment in the text.

By the way, please let me know if there are other issues in Thai that we need to track. This Thai document is still an initial draft, and I have a plan to go through the Lao gap-analysis shortly to catch any related issues. I'll also be checking the currently open issues in this repo, too. But if you want to raise other items, please create a new issue for each.

@srakrn

srakrn commented Jan 29, 2021

@r12a: Thank you. As far as I know, this is the only concern.

Maiyamok seems to be the only exception where a space should be inserted in front of it. Other trailing punctuation such as Paiyalnoi (ไปยาลน้อย) should not have a space in front (example sentence: "นายกฯ เป็นประธานในพิธีเปิดอาคาร"), and therefore is not a problem.

@bact
Contributor

bact commented Aug 20, 2021

  • ๆ, together with the space and the character before it (<character><space><maiyamok>), should be treated as connected units that form a larger "atomic" unit from a line-breaking point of view

  • the <space> there is similar to a non-breaking space in this sense: it acts like glue that keeps <character> and <maiyamok> together. (we could also think of it as a combined character, like ำ Sara Am U+0E33)

  • however, when considering spacing for justification, the <space> in question should be treated as "another character". if small spacing is added between characters, the same amount is added at the <space>. the <space> here is not whitespace (which could be considered an opportunity for wider spacing, compared to non-whitespace characters).
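
A minimal Python sketch of the "atomic unit" idea above, assuming break opportunities arrive from some upstream segmenter as a list of string offsets. The name `filter_breaks` and the offset-list representation are hypothetical and do not correspond to any browser's internals; the sketch only drops opportunities that would leave ๆ, with or without its preceding space, at the start of a line. It does not address the justification point.

```python
MAIYAMOK = "\u0E46"  # THAI CHARACTER MAIYAMOK


def filter_breaks(text: str, breaks: list[int]) -> list[int]:
    """Drop break offsets that would put maiyamok at the start of a line.

    `breaks` holds candidate offsets, meaning a new line may start at
    text[offset]. How they are produced is outside this sketch.
    """
    kept = []
    for offset in breaks:
        # Ignore ordinary spaces at the candidate position, so that both
        # "word|<space>ๆ" and "word<space>|ๆ" are rejected.
        rest = text[offset:].lstrip(" ")
        if rest.startswith(MAIYAMOK):
            continue  # <character><space><maiyamok> stays together
        kept.append(offset)
    return kept
```

For example, with the text "ต่าง ๆ นานา" and candidate offsets 5 (before ๆ) and 7 (before น), only offset 7 survives.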

@r12a
Contributor Author

r12a commented Nov 23, 2021

It seems to me that we have a number of possible ways to go here, and i'd like feedback on which is best.

  1. suggest that browsers recognise <space>ๆ (ie. U+0020) as a special combination which should not be split, and allow content authors to keep using an ordinary space before ๆ.
  2. advise content authors to use a non-breaking space before ๆ. This will prevent the inappropriate wrapping, however there are wrinkles, which i'll outline below. (Of course, many people will continue to use an ordinary space, but people who really care about the presentation can use this.)

If the content author uses U+00A0 (NO-BREAK SPACE), the gap between ๆ and the preceding word will grow during justification. Here's an illustration:
[Screenshot: justified Thai text with U+00A0 NO-BREAK SPACE before ๆ]

There is, however, another no-break space – U+202F NARROW NO-BREAK SPACE. This doesn't appear to expand during justification, but it is narrower than an ordinary space. I'm not sure whether the narrowness is a problem or a good thing. Here's the same text using NNBSP - all the parameters are exactly the same, i just swapped in the different space character.
[Screenshot: the same text with U+202F NARROW NO-BREAK SPACE before ๆ]

Here are a couple of tests. They result in the same behaviour on Chrome, Firefox, and Safari.

  1. A NO-BREAK SPACE (U+00A0) will stretch when justification is applied.
  2. A NARROW NO-BREAK SPACE (U+202F) will NOT stretch when justification is applied.
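
For authors who go with option 2, a preprocessing step along these lines could apply the substitution automatically. This is only a sketch: the helper name `glue_maiyamok` is hypothetical, the default of U+202F is an assumption based on the justification behaviour described above, and collapsing a run of several spaces into one glue character is a design choice rather than a requirement.

```python
import re

MAIYAMOK = "\u0E46"  # THAI CHARACTER MAIYAMOK
NBSP = "\u00A0"      # NO-BREAK SPACE (stretches under justification)
NNBSP = "\u202F"     # NARROW NO-BREAK SPACE (does not stretch, but is narrower)

# One or more ordinary spaces (U+0020) that are immediately followed by maiyamok.
_SPACE_BEFORE_MAIYAMOK = re.compile(" +(?=" + MAIYAMOK + ")")


def glue_maiyamok(text: str, glue: str = NNBSP) -> str:
    """Replace ordinary spaces before maiyamok with a single no-break space."""
    return _SPACE_BEFORE_MAIYAMOK.sub(glue, text)
```

Calling `glue_maiyamok(text, glue=NBSP)` selects the full-width no-break space instead. The space after ๆ is left untouched.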

@r12a
Contributor Author

r12a commented May 16, 2022

It occurs to me that a third option is to not require a space before the maiyamok at all (in fact, to require no space), but to build the space into the character glyph in the font. Apparently, this approach is used for some Mongolian punctuation. Of course, current fonts and legacy documents don't do that, so this solution would probably need to run alongside one of the previous solutions, and be based on what the font does. Whether that introduces semantic differences, i doubt. However, it would probably prove to be a problem for fallback fonts on the Web, since the fonts in your fallback list may behave differently, producing non-ideal results.

@eric-muller

In addition to U+00A0 and U+202F, we have all the fixed spaces at the beginning of the 2000 block.

The situation is similar for ; : ? ! « » in French typography (and earlier English typography). There, the addition of space by a font is a bit dicey. For example, » is also used in list-like settings to indicate duplication of the previous item, and no space should be introduced before it. In general, a font knows essentially nothing about the context, so the behaviors need to be triggered by the layout engine (e.g. by explicit application of OT features).

@r12a
Contributor Author

r12a commented May 16, 2022

> In addition to U+00A0 and U+202F, we have all the fixed spaces at the beginning of the 2000 block.

Except that they're not no-break spaces, so they don't keep the maiyamok on the same line as the preceding word.

> The situation is similar for ; : ? ! « » in French typography (and earlier English typography).

Yep. And there again, NNBSP seems to be popular these days.

@eric-muller

eric-muller commented May 16, 2022

My bad on the fixed spaces. U+2007 FIGURE SPACE is GL, so is a candidate, but it's the only one.

I would argue that we are missing a non-breaking, non-justifying space, same width as an unjustified space. That's what the house rules of the Imprimerie nationale call for before a : and inside « ». And that's also what one would want before a Maiyamok, if I read this comment and the following correctly.
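
To make the candidates discussed in this thread concrete, here is a small Python sketch that lists which of the space characters mentioned are non-breaking. Python's `unicodedata` module does not expose the Line_Break property, so the classes are copied by hand from the Unicode `LineBreak.txt` data file; treat the table as an assumption to verify against the current Unicode version. The function name is hypothetical.

```python
import unicodedata

# Line_Break classes, hand-copied from LineBreak.txt (an assumption to re-check):
# SP = space, GL = non-breaking "glue", BA = break after.
LINE_BREAK = {0x0020: "SP", 0x00A0: "GL", 0x202F: "GL"}
LINE_BREAK.update({cp: "BA" for cp in range(0x2000, 0x200B)})  # U+2000..U+200A
LINE_BREAK[0x2007] = "GL"  # FIGURE SPACE is the exception in that range


def non_breaking_spaces() -> list[tuple[str, str]]:
    """Return (code point, name) for every listed space whose class is GL."""
    return [
        (f"U+{cp:04X}", unicodedata.name(chr(cp)))
        for cp in sorted(LINE_BREAK)
        if LINE_BREAK[cp] == "GL"
    ]
```

Under these assumptions the result contains U+00A0, U+2007, and U+202F. None of the three is the non-breaking, non-justifying, normal-width space described above.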

Projects
Status: Under discussion

4 participants