[css-ruby] Handling apostrophes in pinyin #5997

Open
frivoal opened this issue Feb 15, 2021 · 10 comments
Labels
css-ruby-2 i18n-clreq Chinese language enablement i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response.

Comments

@frivoal
Collaborator

frivoal commented Feb 15, 2021

In Chinese, pinyin syllables can be separated by spaces, but when they are part of a single compound word, they're more typically just juxtaposed with no separator. There is an exception: for disambiguation, certain combinations that would otherwise read as a single syllable are separated by an apostrophe. E.g. "dong" and "xi", when next to each other, are just "dongxi", but "xi" and "an" are "xi'an", not "xian" (which is a single syllable with a different pronunciation).
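As a rough illustration of the rule described above, here is a minimal Python sketch. It assumes the usual Hanyu Pinyin convention that an apostrophe goes before any non-initial syllable of a word that starts with a, o, or e (with or without a tone mark). The names `join_pinyin` and `base_letter` are made up for this example and are not part of any spec or library.

```python
import unicodedata

# Letters that trigger an apostrophe when they start a non-initial syllable.
APOSTROPHE_INITIALS = {"a", "o", "e"}


def base_letter(ch: str) -> str:
    """Return the letter without tone marks, lowercased (e.g. 'ā' -> 'a')."""
    return unicodedata.normalize("NFD", ch)[0].lower()


def join_pinyin(syllables: list[str]) -> str:
    """Join the syllables of ONE word, inserting apostrophes where needed."""
    out = ""
    for syllable in syllables:
        if not syllable:
            continue  # ignore empty entries rather than failing
        # Only non-initial syllables can need an apostrophe.
        if out and base_letter(syllable[0]) in APOSTROPHE_INITIALS:
            out += "'"
        out += syllable
    return out


print(join_pinyin(["dong", "xi"]))  # dongxi
print(join_pinyin(["xi", "an"]))    # xi'an
```

This only covers the text side of the problem. It says nothing about when the apostrophe should be shown or hidden depending on layout.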

We currently don't have anything in css-ruby that would let us automatically inject these apostrophes when needed. In a way, this is very language-specific, and maybe we cannot solve it fully automatically. But it also depends on layout: whether or not the ruby annotations of adjacent syllables have enough space between them for visual separation. If we could find something generic enough, it would be nice to be able to handle such cases, even if it needed some amount of preprocessor / markup support.

Possibly, if compound words are marked up as single ruby segments, the apostrophes could go in the markup so that there would be no need for the layout engine to guess where they go, and so that if the annotation is rendered inline, it is correct. In that case, what we'd need in CSS is a way to make them disappear in the right circumstances.

@frivoal frivoal added the css-ruby-1 Current Work label Feb 15, 2021
@xfq xfq added i18n-clreq Chinese language enablement i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. labels Feb 18, 2021
@xfq
Member

xfq commented Feb 18, 2021

I understand the "xian" / "xi'an" example, but I don't quite understand why/how we automatically inject these apostrophes. Could you provide some example code?

@tabatkins
Member

I think Florian means that if "xi" and "an" have ruby annotations wide enough to give them some visual separation, they wouldn't need an apostrophe; if they had no ruby, or the ruby was small enough not to cause them to separate, they would need the apostrophe.
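To make that condition concrete, here is a small Python sketch of the decision as described. It is only a model of the idea, not anything a layout engine does today: `needs_apostrophe`, the `gap_em` measurement, and the `min_gap_em` threshold of 0.25em are all hypothetical, and it assumes both syllables belong to the same word.

```python
import unicodedata

# Initial letters that make a syllable boundary ambiguous in pinyin.
AMBIGUOUS_INITIALS = {"a", "o", "e"}


def needs_apostrophe(next_syllable: str, gap_em: float,
                     min_gap_em: float = 0.25) -> bool:
    """Return True if an apostrophe should be shown before next_syllable.

    gap_em is the (hypothetical) visual gap, in em, between the previous
    annotation and this one; min_gap_em is an assumed threshold below
    which the two annotations read as one run of letters.
    """
    if not next_syllable:
        return False
    # Strip any tone mark from the first letter before testing it.
    first = unicodedata.normalize("NFD", next_syllable[0])[0].lower()
    ambiguous = first in AMBIGUOUS_INITIALS
    return ambiguous and gap_em < min_gap_em


print(needs_apostrophe("an", gap_em=0.0))  # True: "xi" + "an" abut
print(needs_apostrophe("an", gap_em=0.5))  # False: already separated
print(needs_apostrophe("xi", gap_em=0.0))  # False: "dongxi" is unambiguous
```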

@frivoal
Collaborator Author

frivoal commented Feb 18, 2021

I don't have a solution for how. I am not even sure that we can find one. But if we can, it might be worth trying, because it is a problem. Here are some more details:

Say you have this markup:

<ruby><rb>西<rb>安<rt>xi<rt>an<rb>的<rt>de<rb>东<rb>西<rt>dong<rt>xi</ruby>

With default styling, it will look like this:

[Image: Screen Shot 2021-02-19 at 8 38 55]

That's fine.

However, let's say we apply ruby-merge: merge to group the annotations per word, as afforded by this markup:

[Image: Screen Shot 2021-02-19 at 8 43 23]

The "dongxi" over 东西 is fine, but the "xian" over 西安 is not. What we would want instead is something like this:

[Image: Screen Shot 2021-02-19 at 8 45 07]

Here's another example. This is less realistic, but could happen too. Let's say we increase the font-size of the annotations:

[Image: Screen Shot 2021-02-19 at 8 40 34]

That's not good. What we'd actually want is more something like:

[Image: Screen Shot 2021-02-19 at 8 41 48]

Again, I don't think I know for sure how to solve it, but if someone can think of something, that would be good.

@acli

acli commented Mar 1, 2021

I don’t think this is specific to pinyin. If Japanese could be annotated with romaji (let’s ignore how unlikely this is, for the sake of argument) we’d need this too.

I’ve seen Jyutping ruby annotations in the wild. This problem doesn’t affect Jyutping since it uses superscripted numerals for tones, but the fact that I’ve seen Jyutping annotations suggests we can’t discount the possibility that we’ll eventually need this for something other than pinyin.

@Jeffxz

Jeffxz commented Mar 1, 2021

LOL, this is such a nice example. "Xian de dong xi" sounds like "something salty" (or "something fresh", depending on how you read it) in Chinese, which is very different from "Xi'an de dong xi" (meaning "something from Xi'an"). Maybe the "xiandedongxi" example is a different problem, but just look at "Xi'an".
Meaning aside, I think that if we write the ruby this way, <ruby><rb>西安</rb><rt>xi'an</rt></ruby><ruby>的</ruby><ruby><rb>东西</rb><rt>dongxi</rt></ruby> (https://codepen.io/jeff_xu/pen/mdOxjqw), it should be fine for Chinese. The only problem is that the result is not easy to read. I don't find any layout spec for Chinese specifically (see here https://www.w3.org/TR/clreq/) but it looks similar as group ruby in Japanese requirement here (https://www.w3.org/TR/jlreq/#positioning_of_groupruby_with_respect_to_base_characters).
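For what it's worth, markup of this shape is easy to generate from word-segmented text. The Python sketch below is only an illustration under assumptions: `group_ruby` is a made-up helper, and it expects the caller to supply each word with its annotation already joined (apostrophes included).

```python
from html import escape


def group_ruby(words):
    """Build one <ruby> element per word.

    words is a list of (base_text, annotation) pairs; annotation is None
    for words that carry no annotation.
    """
    parts = []
    for base, annotation in words:
        # quote=False keeps the apostrophe in "xi'an" as a literal character.
        base_html = escape(base, quote=False)
        if annotation is None:
            parts.append(f"<ruby>{base_html}</ruby>")
        else:
            rt_html = escape(annotation, quote=False)
            parts.append(f"<ruby><rb>{base_html}</rb><rt>{rt_html}</rt></ruby>")
    return "".join(parts)


print(group_ruby([("西安", "xi'an"), ("的", None), ("东西", "dongxi")]))
```

Because the apostrophe lives in the annotation text itself, it survives when the annotation is rendered inline or copied out, which is the property the markup approach is after.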

I really like this example and consideration. But I kind of feel we might need definition about how to display pinyin with apostrophes in better way instead of considering inject apostrophes.

@xfq
Member

xfq commented Mar 1, 2021

@frivoal I see. Thank you for your explanation!

@xfq
Member

xfq commented Mar 1, 2021

> I don't find any layout spec for Chinese specifically (see here https://www.w3.org/TR/clreq/) but it looks similar as group ruby in Japanese requirement here (https://www.w3.org/TR/jlreq/#positioning_of_groupruby_with_respect_to_base_characters).

@Jeffxz For Chinese, group-ruby is documented in § 3.3.4.3 Words as the Basic Units for Annotating Pronunciation, but it doesn't mention pinyin with apostrophes. We're tracking it in w3c/clreq#351

> But I kind of feel we might need definition about how to display pinyin with apostrophes in better way instead of considering inject apostrophes.

Ideally, it is best to solve both problems, i.e., not only make the inserted apostrophes display better, but also automatically inject these apostrophes when needed.

@heycam
Contributor

heycam commented Mar 1, 2021

Regardless of whether ruby-merge is separate or merge, it's possible for ruby annotation boxes to end up abutting, causing ambiguities, in places where it'd be wrong to introduce an apostrophe because that implies the syllables are part of one compound word. So I don't think automatic apostrophe introduction without any help from the author is a workable solution.

Is it acceptable for syllables to abut even when there is no ambiguity? It's not super readable. Is / should there be a way to require a minimum spacing between adjacent ruby annotation boxes? (Not sure if margin-inline is sufficient.)

@xfq
Member

xfq commented Mar 1, 2021

> Is it acceptable for syllables to abut even when there is no ambiguity? It's not super readable. Is / should there be a way to require a minimum spacing between adjacent ruby annotation boxes? (Not sure if margin-inline is sufficient.)

There were some related discussions in #3498. I think even if there is no ambiguity, it is easier to read if the syllables do not abut, so I think there should be a way to require a minimum spacing between adjacent ruby annotation boxes.

@frivoal frivoal added css-ruby-2 and removed css-ruby-1 Current Work labels Mar 6, 2021
@xfq
Member

xfq commented Jan 18, 2024

Similar problems might occur if you use romaji for Japanese. For example, "sinai" is しない[1], but "sin'ai" is しんあい[2], so when a vowel follows the 'n' sound, an apostrophe needs to be added after the 'n' sound.

Footnotes:

[1] Kanji: 竹刀,市内
[2] Kanji: 親愛
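The romaji case can be sketched the same way. The Python snippet below is a minimal illustration, assuming the input is already split into morae written in plain ASCII romaji, with the moraic ん given as a standalone "n". The name `join_romaji` is hypothetical. Treating "y" like a vowel follows the usual romanization convention and goes slightly beyond what the comment above states.

```python
# Letters that, at the start of the next mora, make a preceding moraic "n"
# ambiguous. Including "y" is an assumption beyond the comment above.
VOWELS_AND_Y = set("aiueoy")


def join_romaji(morae):
    """Join romaji morae, adding an apostrophe after a moraic 'n' if needed."""
    out = ""
    for i, mora in enumerate(morae):
        out += mora
        nxt = morae[i + 1] if i + 1 < len(morae) else ""
        # "n" + "a" would otherwise be read as the single mora "na".
        if mora == "n" and nxt and nxt[0] in VOWELS_AND_Y:
            out += "'"
    return out


print(join_romaji(["si", "na", "i"]))      # sinai  (しない)
print(join_romaji(["si", "n", "a", "i"]))  # sin'ai (しんあい)
```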

6 participants