Low line or underline? #115

Open
r12a opened this issue Nov 22, 2016 · 11 comments
Comments

@r12a
Contributor

r12a commented Nov 22, 2016

Indicator Punctuation Marks > Fullwidth low line
https://www.w3.org/TR/clreq/#indication_punctuation_marks

U+FF3F FULLWIDTH LOW LINE [_] is positioned underneath proper nouns such as a person's name, the name of a place, etc.

This doesn't seem right to me, not least because U+FF3F is not a combining character. I think what we are actually talking about is underline, isn't it? (in which case this subsection about fullwidth low line should be moved to a new location)

My understanding is that the u tag in HTML is designated as the way to indicate such lines below text, again making me think that this is actually a type of underlining.

The note that follows indicates that a 'wavy low line' is sometimes used. Does that look like a combining character (for which i couldn't find anything in Unicode on a quick search), or like a wavy line (in which case it's a type of underline that CSS could specify)? (It would help to have a picture of it in clreq.)

@r12a
Contributor Author

r12a commented Nov 22, 2016

[adding an excerpt from an email i received from Bobby (before raising this issue) that adds some useful background]

In CLREQ, 3.1.1.2 Indicator Punctuation Marks - 9 Fullwidth low line describes how underlining is used as punctuation in Chinese.

  • for horizontal writing: underline
  • for vertical writing: leftline

But the most important thing: if possible, the lines should be kept distinct, with a gap between them, because a place name and a person's name will sometimes appear together.

e.g. <u>安平</u><u>鄭成功</u>

鄭成功 is a person's name, and 安平 is where he was born. Without a gap, people may read this as either 安平 鄭成功 or 安平鄭 成功.

That's the usage in Chinese.
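
As a rough illustration of the point above, here is a minimal Python sketch that wraps each proper noun in its own `<u>` element, so that adjacent names stay separate units in the markup. The helper name `mark_proper_nouns` and the `(text, is_proper_noun)` input shape are assumptions made for this example, not something taken from CLREQ or from Bobby's email. Separate elements only keep the boundary in the markup; whether a visible gap appears between the two lines is still up to the renderer and styling.

```python
from html import escape


def mark_proper_nouns(segments):
    """Build an HTML fragment from (text, is_proper_noun) pairs.

    Hypothetical helper: each proper noun gets its own <u> element so that
    neighbouring names are not merged into one underlined run.
    """
    parts = []
    for text, is_proper in segments:
        safe = escape(text)  # escape &, <, > so the text is safe inside HTML
        parts.append(f"<u>{safe}</u>" if is_proper else safe)
    return "".join(parts)


# The place name and the person's name are marked up separately.
fragment = mark_proper_nouns([("安平", True), ("鄭成功", True)])
print(fragment)  # <u>安平</u><u>鄭成功</u>
```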

@moyogo

moyogo commented Nov 22, 2016

The CJK Compatibility Forms block (U+FE30..U+FE4F) has ﹏ U+FE4F WAVY LOW LINE and others.

@c933103

c933103 commented Nov 24, 2016 via email

@ethantw
Member

ethantw commented Nov 26, 2016

@r12a

Hello! 😄 Sorry for taking this long to reply, had been busy.

This doesn't seem right to me, not least because U+FF3F is not a combining character. I think what we are actually talking about is underline, isn't it? (in which case this subsection about fullwidth low line should be moved to a new location)

Choosing a character over text decoration here is intentional. Since this usage is classified as punctuation, it is reasonable to describe proper name marks and title marks as characters [1]. Since they are both punctuation marks, font designers and typographers, in my opinion, should have a say in how they look and where they are placed (by designing these characters in typefaces). It is not just a task for browser/reader rendering.

We can see the similar concept in both JLReq and CLReq where emphasis marks are stated as either or . (By using text-emphasis property of CSS, it is easy to attach text with non-combining characters.)

The project is a set of layout requirements. The above-mentioned subsection of the document does not, in any way, suggest or imply that we implement such a feature with the combining character technology of Unicode or the <u> element of HTML. It simply indicates the most ideal solution, in which proper name marks and title marks are both characters, just like commas, periods and emphasis marks, plus the gap that @bobbytung mentioned.

The text decoration solution/fallback seems perfect to me too. We can use the <u> element to annotate proper names in HTML documents along with text-emphasis or text-decoration to style the element. Either way is fine. I won’t resist using text decorations when it comes to reality. 😝

[1]: http://language.moe.gov.tw/001/Upload/FILES/SITE_CONTENT/M0001/HAU/h13.htm

@bobbytung
Contributor

@ethantw About "book title marks" and "proper noun marks": I think we should not assign specific Unicode characters, because that would restrict how they can be implemented.

But we can use Unicode characters as a reference to show the presentation form.

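書名號 (book title marks)
甲式呈現樣貌為波浪底線U+FE4F WAVY LOW LINE [﹏] (Type A takes the form of a wavy underline, U+FE4F WAVY LOW LINE [﹏])
U+FE4F WAVY LOW LINE [﹏], as the presentation form of book title marks, is positioned at the foot end of the annotated text.

專名號 (proper noun marks)
專名號呈現樣貌為U+FF3F FULLWIDTH LOW LINE [_] (the proper noun mark takes the form of U+FF3F FULLWIDTH LOW LINE [_])
U+FF3F FULLWIDTH LOW LINE [_], as the presentation form of proper noun marks, is positioned underneath proper nouns such as a person's name, the name of a place, etc.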

@r12a
Contributor Author

r12a commented Jul 23, 2018

The project is a set of layout requirements. The above-mentioned subsection of the document does not, in any way, suggest or imply that we implement such a feature with the combining character technology of Unicode or the <u> element of HTML. It simply indicates the most ideal solution, in which proper name marks and title marks are both characters, just like commas, periods and emphasis marks, plus the gap that @bobbytung mentioned.

I agree that the document should provide requirements rather than technical implementations or solutions, and that's actually why i'm concerned about this. I'm not yet convinced that this section is correct or pitched at the right level.

The current wording implies quite strongly that use of a character is the normal way to achieve one type of book title mark. "U+FE4F WAVY LOW LINE [﹏] is positioned at the foot end of the annotated text." That is implying, for me, a proposed solution or implementation detail. Same applies in the following section, Proper noun marks, and maybe elsewhere. There is no mention that this is normally achieved using text decoration styling.

Also, to classify this effect as punctuation just because those characters or their alternatives are classed as a type of punctuation doesn't necessarily follow, for me. Actually, I think we should have a section on text decoration that describes wavy lines as an alternative means of indicating a book title, and point to that from the punctuation section, which describes the use of angle brackets.

The punctuation section is a little unusual in that sometimes it is useful to indicate usage patterns by referring to characters used, such as use of double ellipsis characters, but i don't think that's the case here, because:

  1. although the characters cited suggest the shape to be used, it's not clear to me that they are used to achieve the effect cited. Have you seen these characters in use?

  2. it's true that both characters have the Unicode property of "Punctuation, connector" but they are also both compatibility characters, which map to U+005F LOW LINE

  3. the Unicode standard says of both characters "They were intended, in the Chinese standard, for the representation of various types of overlining or underlining, for emphasis of text when laid out horizontally. Except for round-trip mapping with legacy character encodings, the use of these characters is to be discouraged; use of styles is the preferred way to handle such effects in modern text rendering." v10, pp288-289 [my bold emphasis]
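
The properties mentioned in the list above can be checked against whatever version of the Unicode Character Database ships with a given Python build. The following is only a sketch using the standard `unicodedata` module; the `describe` helper and the `CHARS` table are names made up for this example.

```python
import unicodedata

# The two code points discussed in this thread.
CHARS = {
    "FULLWIDTH LOW LINE": "\uFF3F",
    "WAVY LOW LINE": "\uFE4F",
}


def describe(ch):
    """Collect the Unicode properties relevant to the points above (illustrative helper)."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),            # "Pc" is Punctuation, connector
        "combining_class": unicodedata.combining(ch),    # 0 means not a combining mark
        "decomposition": unicodedata.decomposition(ch),  # compatibility mapping, if any
        "nfkc": unicodedata.normalize("NFKC", ch),       # result of compatibility normalization
    }


for label, ch in CHARS.items():
    print(label, describe(ch))
```

Both characters report category `Pc`, a combining class of 0, and a compatibility decomposition to U+005F, so NFKC normalization turns either of them into a plain `_`.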

So i agree with your point about this document describing requirements, rather than solutions. I also think that it's ok to describe typical implementations of punctuation by referring to characters. But i don't think those things apply to use of wavy or not wavy underlines. And i think we should cross-reference to a section about text decoration.

--

Btw, the proofread changes you made recently are useful, i think, where the wording changes make the subsections describe functions rather than a particular Unicode character, eg. 'Fullwidth colon and fullwidth semicolon' -> 'Periods, commas and secondary commas', or 'interpunct' (as a feature) rather than 'middle dot'. I have long thought we should be moving in that direction. I think the section titles should describe some semantic function that one wants to achieve, and then explain how that is typically done. There are still some subsections, however, that are focused more on taking a Unicode code point and showing how it is used, for example 'Solidus' (which might be better titled as 'Poetry separators' and 'Character separators'), or 'Parentheses' (rather than 'Clarifications & asides'), etc.

@ronaldtse

This is my first comment in this topic, so I'm not sure whether an issue last commented on 4 years ago is the right place.

From what I understand, the book title mark is not supported in Unicode. The current draft contains this text:

Book title mark type A U+FE4F WAVY LOW LINE [﹏] has a wavy line appearance and is positioned at the foot end of the annotated text.

In "positioned at the foot end of the annotated text", what is the "annotated text"? Does it mean that this is accomplished through ruby?

In the earlier draft, this text is present:

U+FE4F WAVY LOW LINE [﹏] is positioned beneath the corresponding characters. When two works are listed next to each other, the wavy lines for each should be clearly separated.

The context that was lost, i.e. "positioned beneath the corresponding characters", is an important part to understand how the mark works, because it is exactly how the book title mark is meant to be used (and appear). Without this context, in the current draft, I cannot quite make sense of "positioned at the foot end of the annotated text".

With regard to whether the book title mark should exist in Unicode, my answer would be an enthusiastic yes -- a purely stylistic requirement does not work for normal text.

For example, on Wikipedia, the wavy line is implemented as a style encoded in HTML:

<span style="text-decoration: underline; text-decoration-style: wavy;">離騒</span>

This is a purely stylistic encoding that is bound to HTML, and not reproducible in Unicode text.

@ronaldtse

I think we should not assign specific Unicode characters, because that would restrict how they can be implemented.

I'm slightly baffled -- without Unicode characters wrapping the relevant text (e.g. start name, end name), how is it possible to implement the proper name mark or book title mark in plain Unicode text? These marks have a defined start and a defined end.

Or perhaps I am missing some obvious way of implementing these marks...?

@r12a
Contributor Author

r12a commented Nov 22, 2022

There are many things that can't be expressed in plain text. For example, English language emphasis can't be expressed either in plain text, but can be expressed using <em> or <strong> in HTML. Perhaps the problem is relying on the use of plain text, rather than text mixed with semantic markers?

@ronaldtse

ronaldtse commented Nov 22, 2022

That's true, yet Unicode does also have the "combining low line".
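
To make the "combining low line" remark concrete, here is a minimal Python sketch that appends U+0332 COMBINING LOW LINE after each character of a name. The helper name `underline_plain` is invented for this example. Whether consecutive marks join into a continuous line under CJK characters depends on the font and renderer, so this should be read as an experiment rather than a recommended encoding.

```python
import unicodedata

COMBINING_LOW_LINE = "\u0332"


def underline_plain(text):
    """Append U+0332 after every character (illustrative helper, not from the thread)."""
    return "".join(ch + COMBINING_LOW_LINE for ch in text)


marked = underline_plain("鄭成功")
print(marked)

# Unlike U+FF3F and U+FE4F, this character really is a combining mark.
print(unicodedata.name(COMBINING_LOW_LINE))       # COMBINING LOW LINE
print(unicodedata.category(COMBINING_LOW_LINE))   # Mn (Mark, nonspacing)
print(unicodedata.combining(COMBINING_LOW_LINE))  # 220 (attaches below the base)
```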

  1. In any case, in the current draft, adding an example of showing the wavy line in a presentational manner, e.g.
<span style="text-decoration: underline; text-decoration-style: wavy;">離騒</span>

     would be very helpful.

  2. If this document is not meant to specify the Unicode characters for the wavy line, perhaps the text referring to U+FE4F WAVY LOW LINE should be removed to avoid confusion. The current wording reads like it is possible to put the wavy line below the text using normal Unicode features, which is not the case.

Would a PR be welcome? Thanks!

@xfq
Member

xfq commented Nov 24, 2022

Does it mean that this is accomplished through ruby?

No. As you mentioned, in HTML and CSS, this is implemented using text-decoration: underline; text-decoration-style: wavy;.

The context that was lost, i.e. "positioned beneath the corresponding characters", is an important part to understand how the mark works, because it is exactly how the book title mark is meant to be used (and appear). Without this context, in the current draft, I cannot quite make sense of "positioned at the foot end of the annotated text".

I agree that the text here could be improved, but if you look at https://w3c.github.io/clreq/#term.foot-side , "foot side" means "positioned beneath the corresponding characters". The corresponding CSS terminology would be block-end.

This is a purely stylistic encoding that is bound to HTML, and not reproducible in Unicode text.

As mentioned by @ethantw in #115 (comment), the Unicode code points are not listed here to reproduce proper noun marks and book title marks in plain text, but to use glyphs of Unicode code points to illustrate what these punctuation marks / text decorations should look like. That said, I agree that it can be confusing, and the task force will discuss how to improve the text.

In any case, in the current draft, adding an example of showing the wavy line in a presentational manner, e.g.

We could add a figure, but we usually don't add HTML code examples, as this document tries to be technology agnostic and not to describe particular technological solutions.
