Variable sized spaces in Thai #46
The width of small space is equal to the width of Thai character Ko Kai “ก” U+0E01.
|
I got interested in the text spacing in Thai text when I received my graduation remark produced by the Office of His Majesty's Principal Private Secretary, which looks like this. There are 3 kinds of spaces found in the document:
In my web publication (in which I intend to have 2 different spacings), right now I do this But in practice, most of the time they don’t. Most Thai content on the web that I read uses a single space. I think Thai readers are used to it by now. Even the Office of the Royal Society’s web page about the rules of spacing uses a single space for both small and large spaces, so the two are indistinguishable from each other. I never see |
See #19 (comment) (Maiyamok in justified paragraph) |
I can confirm, @dtinth, that I don't care. Most of the people I know don't seem to care either. I can also confirm what @bact said. My mum used to write lots of documents, and it appears that she used one space for "small" spaces and two for "large" ones. To my knowledge this is much the same as how people typeset documents in English using two spaces after a sentence-ending full stop. |
So i guess that the questions i have are, if people don't use the wide space much these days, is it because: a. they'd like to, but just can't figure out how to do it, and there's no em-space key on their keyboard Should i raise an issue in the gap analysis document (presumably not as a high priority) about the need to support different space widths, or should i not? |
This is solely my two cents: yes, better raise the issue to gap analysis documents on low priority.
|
By the way, some languages that use the Khmer script, such as Krung and Tampuan, separate words with narrow spaces such as U+2006 SIX-PER-EM SPACE, and also separate phrases with a wider space such as U+2003 EM SPACE. (Khmer script used for Cambodian language generally uses a normal SPACE for phrase separation, and no space around words.) |
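For reference, the space characters named in this thread can be inspected with Python's standard `unicodedata` module. This is only a minimal sketch: the `SPACES` selection and the `describe` helper are my own, chosen for illustration.

```python
import unicodedata

# Space characters mentioned in this thread (an illustrative selection).
SPACES = ["\u0020", "\u2003", "\u2006", "\u3000"]

def describe(ch: str) -> str:
    """Return 'U+XXXX NAME (general category)' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"

for ch in SPACES:
    print(describe(ch))
```

All four share the general category `Zs` (space separator), so the category alone does not distinguish fixed-width spaces from the ordinary word space.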
Created some text about this at #49 which appears in the gap-analysis document at https://www.w3.org/TR/2021/WD-thai-gap-20210218/#punctuation_etc |
How common is it for Thai type designers to include an em space in their fonts?
Part of the problem is availability in fonts, part is availability in keyboard layouts, and part is general knowledge of the practice.
|
Some Thai designers have mentioned to me before that a wordspace designed for Latin will be too small for Thai and one designed for Thai will be too big for Latin. (And we can sometimes see a distinction in its width between fonts made by Thai-native designers and fonts made by Latin-native designers.) It makes some kind of sense because the purpose and frequency are different. So it raises the question of whether the wordspace is technically the best character to use in Thai. Of course it wouldn't be practical to suggest any kind of change, just an observation. I don't think I've ever specifically included an em-space in the Thai/Lao/Khmer fonts I've made, but can easily do so in future projects. |
Ben, is the problem that they are trying to use U+0020 for everything,
rather than exploring the possibilities of all the other spaces?
Even for English there are times we need other spaces for typesetting.
|
I think most Thai people are not aware that there are other kinds of space. I
do not remember any teacher telling me how to use spacing
other than a single space in Thai. I learned that there are more spaces than U+0020
when I studied HTML, and only learned from this discussion that there are
smaller-space and larger-space usages in the Thai context.
|
@andjc I'm afraid I don't have insight about the why, but as you mentioned above, if fonts and keyboards don't have a way to make the space smaller or larger, and if there's no general awareness of the practice, it seems unlikely people would be prompted to try other spaces. |
Thoughts off the top of my head: I agree that it's probable that most people won't worry about space width, but i imagine that some people who want to do careful typography, and perhaps some authors of ebooks, etc. may want to avail themselves of the opportunity to do this. Since there seems to be no clear indication of how it is to be done, i think it's helpful to establish the principle and discuss how it could work. If the EM SPACE works, then those people could probably find a way to use it even if it's not on a keyboard (though it would be better if it were, of course). But a gap in the font could be more problematic. To improve the visibility for this thread, here's a note i included in the gap-analysis document about support for EM SPACE in Thai fonts: this test shows that no Thai fonts on Mac OS X or Windows 10 have a glyph for EM SPACE, with the exception of Arial Unicode MS and Tahoma. (For best results you need to download Adobe NotDef, and view the page on both Mac and Windows OS.) Also, at #49 we started asking the following questions:
|
I find myself wondering whether a better alternative might be to add a special white-space property/value for use with Thai/Khmer/etc that stipulates that multiple U+0020 SPACE characters should not be collapsed. That would presumably remove the issues around wrapping, justification, font support and keyboards. It's pretty easy and accessible for users to type a double space if you want to increase the gap between sentences. Just a thought. Note that the key issue we face here is the tendency of HTML and other markup to reduce white-space before rendering. In plain text, double-spacing may well occur already (see @bact's comment about typewriters above). Btw, fwiw, I just added EMSP to the Thai character app (https://r12a.github.io/pickers/thai/), to facilitate experimentation. |
My gut reaction is that the double space is a remnant from the days of
typewriters, and more typographically appropriate solutions should be
encouraged.
|
There already is a value that does not collapse consecutive spaces but allows wrapping: EM SPACE is currently considered a fixed-width space. Afaik it's expected to be exactly one em wide, and is therefore not adjusted by justification. There's also U+3000 IDEOGRAPHIC SPACE which is typically 1em wide and is allowed to be adjusted by justification, but it's not exactly double U+0020. It does make me wonder if Unicode needs a dedicated sentence-ending space codepoint... But if Khmer is using fixed-width spaces for word spaces and sentence-ending spaces already, we've got a problem if we can't justify them. :/ I guess an important question there becomes whether such spaces are supposed to be adjusted equally or proportionally. Wrt white space at the end of the line, css-text-3 currently allows all (invisible) space characters to hang at the end of a line for all values of |
|
Which I guess is to say, if fixed-width spaces are now supposed to be treated as variable-width spaces because that's how Unicode wants them to be used these days, someone should file an issue against css-text-3 about that; and if we need a sentence-ending space in Unicode someone should file an issue against Unicode. |
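To make the "equally or proportionally" question concrete, here is a small arithmetic sketch. The widths (0.25em for a word space, 1em for a sentence space) and the function names are assumptions for illustration only; this is not how any browser is known to implement justification.

```python
def justify_equal(widths, extra):
    """Give every space the same share of the extra width."""
    share = extra / len(widths)
    return [w + share for w in widths]

def justify_proportional(widths, extra):
    """Give every space a share of the extra width proportional to its own width."""
    total = sum(widths)
    return [w + extra * w / total for w in widths]

# Hypothetical line: two word spaces (0.25em) and one sentence space (1em),
# with 0.75em of extra width to distribute.
spaces = [0.25, 0.25, 1.0]
print(justify_equal(spaces, 0.75))
print(justify_proportional(spaces, 0.75))
```

With equal adjustment the sentence space goes from 4 times the word space to 2.5 times (1.25em vs 0.5em), so the distinction shrinks as the line is stretched. Proportional adjustment keeps the 4:1 ratio (1.5em vs 0.375em).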
Just to be clear: what i noticed is not stretched when |
|
I think the issue applies to Latin text as well. People who learned typing on typewriters often use two spaces after a sentence-ending period, and one space elsewhere. This practice is no longer terribly fashionable, and people who continue to do it get push-back, and are told that since we don't live in a monospaced world anymore, that's not the right way to achieve it… but there isn't really a right way to achieve it. As @fantasai said earlier, I suspect the solution is to either change (in css-text?) the definition of (some of?) the various fixed-width spaces, so that they can grow due to justification, or to introduce a new Unicode codepoint for a larger-than-U+0020-but-stretchable space. I suspect the former is more likely to be practical. |
I mentioned this briefly at https://www.w3.org/International/sealreq/thai/#space_widths |
In another issue @bact referred to https://www.si.mahidol.ac.th/th/division/soqd/admin/knowledges_files/373_18_1.pdf I'm mentioning it here because the initial part of that page says that 2 spaces are needed between sentences. |
The two-space practice from the typewriter era, for Thai, mimics handwriting, where the writer creates the spacing simply by moving their hand a bit further after the last character of the previous sentence. This two-space practice works well in a word processor but is quite ineffective in HTML, where multiple spaces are treated as just one space. Without an explicit end-of-sentence symbol, it is also difficult for a web browser to render a space differently according to its semantics (a space between sentences vs. a space between words). |
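The collapsing problem can be illustrated with a rough sketch. It assumes a simplified model of HTML's default white-space handling, in which runs of U+0020, tabs and line breaks become a single U+0020 (the real CSS rules have more cases). The function names are hypothetical, and converting a typed double space to U+2003 EM SPACE is only one workaround floated in this thread, not an established practice.

```python
import re

EM_SPACE = "\u2003"

def collapse_like_html(text: str) -> str:
    """Simplified model of default collapsing: runs of space/tab/LF/CR become one U+0020."""
    return re.sub(r"[ \t\n\r]+", " ", text)

def double_space_to_em(text: str) -> str:
    """Replace each run of two or more U+0020 with a single EM SPACE."""
    return re.sub(r" {2,}", EM_SPACE, text)

# Two spaces mark the sentence boundary, one space marks the smaller break.
source = "ประโยคแรก  ประโยคที่สอง คำ"

# The typed distinction is lost once the text is collapsed.
print(repr(collapse_like_html(source)))
# EM SPACE is not one of the collapsible characters, so the distinction survives.
print(repr(collapse_like_html(double_space_to_em(source))))
```

Whether the surviving EM SPACE then renders, wraps and justifies acceptably depends on the font and justification issues raised earlier in the thread.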
@srakrn pointed to this article, which describes two different sizes of space in Thai: large spaces between sentences, and small spaces in other places (eg. for separating sub clauses).
Given that web browsers reduce multiple spaces in the source text to a single space for display, how do content authors normally achieve this size distinction?