Remove some checkme in pinyin sections #263

Merged
merged 4 commits into from
Apr 10, 2020

Conversation

Member

@xfq xfq commented Feb 22, 2020

index.html Outdated
<p its-locale-filter-list="zh-hant" lang="zh-hant">正文与标音双方皆分词连写。相邻基文之间有约1/2em的空格隔开,基文内部字距通常正常。</p>
</li>
<li id="id151">
<p its-locale-filter-list="en" lang="en">Many word-based annotations indicate the logic of the whole sentence, rather than merely the pronunciation: these phonetic annotations have sentence case, as well as punctuation marks which follow the previous annotations.</p>
<p its-locale-filter-list="zh-hans" lang="zh-hans" class="checkme">许多分词连写标音用例体现出对整个句子标音的逻辑,而非简单对词语标音:注文有句首大写、专名首字母大写。有标点,标点跟随左侧(前方)注文。</p>
<p its-locale-filter-list="en" lang="en">Many word-based annotations indicate the logic of the whole sentence, rather than merely the pronunciation: these phonetic annotations have capitalized sentences and capitalized proper names. Punctuations follow the previous annotation text.</p>
Contributor

I'm not sure what "Punctuations follow the previous annotation text." means. Note, by the way, that it should read "Punctuation follows...".

Member Author

@xfq xfq Feb 24, 2020

It means the punctuation immediately follows the ruby text.

For example, it should be:

[image: 1]

instead of something like the following:

[image: 2]

Contributor

That's much clearer now. I strongly recommend including the top image as a figure, and saying something like "Punctuation may also be included in these annotations, but is kept with a preceding annotation, as shown in Fig XX, and doesn't appear over the punctuation in the base text."

Member Author

I just pushed a commit to clarify this point.

Contributor

r12a commented Feb 24, 2020

I would have suggested using the first line of w3c/type-samples#94 instead as the example, since it includes both a comma and a question mark (and even a colon too, if you extend to the beginning of the line). However, the examples in that picture extend the annotations to cover the punctuation of the base text too, which I think clreq says shouldn't happen...

Member Author

xfq commented Feb 24, 2020

> However, the examples in that picture extend the annotations to cover the punctuation of the base text too, which I think clreq says shouldn't happen...

Perhaps I'm missing something, but I don't think it is in contradiction with clreq...

Contributor

r12a commented Feb 24, 2020

See 3.3.4.2 Characters as the Basic Units for Annotating Pronunciation
https://w3c.github.io/clreq/#h-characters_as_basic_units_for_annotating_pronunciation

> The base text is a single Han character. Only Han characters are annotated: European numerals or punctuation marks are excluded.

Contributor

r12a commented Feb 24, 2020

And note how in your original example, although the comma is included in an annotation, the base comma is not annotated – the annotation that includes the comma is centred over the word 方面.

I don't really know what the right answer is – perhaps there are no stringent rules. I'm just looking at the examples and the text from clreq, and noticing a divergence.

Member Author

xfq commented Feb 25, 2020

> See 3.3.4.2 Characters as the Basic Units for Annotating Pronunciation
> https://w3c.github.io/clreq/#h-characters_as_basic_units_for_annotating_pronunciation
>
> The base text is a single Han character. Only Han characters are annotated: European numerals or punctuation marks are excluded.

Hmm, the section title is Characters as the Basic Units for Annotating Pronunciation, but the example in w3c/type-samples#94 should use the rules in 3.3.4.3 Words as the Basic Units for Annotating Pronunciation.

Contributor

r12a commented Feb 25, 2020

I think it would be good to clarify in clreq whether there are any rules about this.

Member Author

xfq commented Feb 25, 2020

> I think it would be good to clarify in clreq whether there are any rules about this.

Sorry, I don't quite understand what needs to be clarified in clreq. Currently, 3.3.4.2 (the "character as a unit" section) says that punctuation is excluded, and 3.3.4.3 (the "word as a unit" section) says that punctuation can be included.

Contributor

r12a commented Feb 25, 2020

The question for me is how one handles punctuation in annotations when using the word-based approach (i.e. using group-ruby for words). There appear to be three alternatives in the samples we have seen in this thread:

[1] Add punctuation to the annotations by attaching an annotation over the punctuation in the base text (only), like this:
[image: 75141037-14a83680-572b-11ea-9a83-a1cf19419842]

[2] Don't add an annotation to the punctuation in the base line: add the punctuation instead to the previous annotation, like this:
[image: 75141016-0a863800-572b-11ea-9d36-1cfddfc432d4]

[3] Group the base text punctuation with the preceding Han character(s) and annotate both/all as a unit, like this (look particularly at the question mark):
[image: Screenshot 2020-02-25 at 11 56 47]

Are all of these approaches ok? Is it just down to the author as to which is chosen?

Contributor

r12a commented Feb 25, 2020

https://w3c.github.io/clreq/#h-words_as_basic_units_for_annotating_pronunciation seems to suggest that option 2 is the correct approach.

@xfq xfq changed the title Remove some checkme in pinyin sections [WIP] Remove some checkme in pinyin sections Mar 6, 2020
Contributor

r12a commented Mar 6, 2020

The difference between [2] and [3] can be seen if you think about how you would mark up these two alternatives.

[2] would have the following markup: <ruby><rb>方面</rb><rt>fāngmiàn,</rt></ruby>, with the base comma outside the markup.

In [3], however, the base ? appears below the annotation shéi?, so the markup must be <ruby><rb>谁?</rb><rt>shéi?</rt></ruby>, with the base question mark inside the markup.

I'm not making any judgements here about which is correct – just noting the difference, and asking whether both approaches are common.
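
To make the comparison concrete, the three alternatives from this thread could be marked up roughly as follows. This is only a sketch: the markup for [1] is an assumption inferred from its description (the thread gives markup for [2] and [3] only), the use of full-width base punctuation is also an assumption, and the surrounding page skeleton is there just so the fragment can be opened in a browser.

```html
<!DOCTYPE html>
<html lang="zh-Hans">
<head>
  <meta charset="utf-8">
  <title>Punctuation in word-based pinyin annotations (sketch)</title>
</head>
<body>
  <!-- [1] Assumed markup: the base comma gets its own annotation,
       separate from the annotation on the word. -->
  <p><ruby><rb>方面</rb><rt>fāngmiàn</rt></ruby><ruby><rb>,</rb><rt>,</rt></ruby></p>

  <!-- [2] The annotation carries the comma, but the base comma
       stays outside the ruby markup and is not annotated. -->
  <p><ruby><rb>方面</rb><rt>fāngmiàn,</rt></ruby>,</p>

  <!-- [3] The base question mark is grouped with the Han character,
       and both are annotated as a single unit. -->
  <p><ruby><rb>谁?</rb><rt>shéi?</rt></ruby></p>
</body>
</html>
```

In [2] the annotation is laid out over 方面 only, whereas in [3] it spans the base question mark as well, which is the visible difference between the two samples.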

Member Author

xfq commented Apr 7, 2020

I see. Option 2 is the correct approach indeed.

@xfq xfq changed the title [WIP] Remove some checkme in pinyin sections Remove some checkme in pinyin sections Apr 7, 2020
@xfq xfq merged commit cb2c6a7 into gh-pages Apr 10, 2020
@xfq xfq deleted the xfq/pinyin-checkme branch April 10, 2020 02:12