[com.google.fonts/check/soft_hyphen] improve rationale #4095

Open
RosaWagner opened this issue Mar 16, 2023 · 18 comments

Comments

@RosaWagner
Contributor

RosaWagner commented Mar 16, 2023

Observed behaviour

⚠ WARN: Does the font contain a soft hyphen? (com.google.fonts/check/soft_hyphen)
⚠ WARN This font has a 'Soft Hyphen' character. [code: softhyphen]

Expected behaviour

The check doesn't say what should be done with the soft hyphen. I would propose:

"The soft hyphen is sometimes designed empty with no width (such as a control character), sometimes the same as the traditional hyphen, sometimes double encoded with the hyphen. That being said, it is recommended to not include it in the font at all, because discretionary hyphenation should be handled at the level of the shaping engine, not the font. Also, even if present, the software would not display that character. More discussion here https://typedrawers.com/discussion/2046/special-dash-things-softhyphen-horizontalbar and here #3486"

cc @vv-monsalve @twardoch thumbs up if you agree

@RosaWagner RosaWagner added Check improvement proposal Profile: Universal Checks that evaluate adherence to the best practices shared among the type design community Message / Rationale improvements Suggestions on improvements to check-result messages or rationale text. labels Mar 16, 2023
@RosaWagner RosaWagner self-assigned this Mar 16, 2023
@felipesanches felipesanches added this to the 0.8.12 milestone Mar 16, 2023
felipesanches added a commit to felipesanches/fontbakery that referenced this issue Mar 16, 2023
felipesanches added a commit that referenced this issue Mar 16, 2023
@glenda-tn

Also, is it your intent to always report it as a WARN if it simply exists? Seems to me that the report should be:

  1. INFO: if u00AD is present and is at zero width with no outlines, as expected
  2. WARN: if u00AD is present but there is an outline or it is double encoded to u002D (hyphen)
  3. FAIL: if not present
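
The three cases above could be sketched roughly as follows. This is only an illustration, not FontBakery's code: `classify_soft_hyphen` and its parameters are hypothetical names, and the caller is assumed to have already looked up the glyph's advance width and outline.

```python
from enum import Enum


class Status(Enum):
    INFO = "INFO"
    WARN = "WARN"
    FAIL = "FAIL"


def classify_soft_hyphen(present, advance_width=0, has_outline=False,
                         shares_hyphen_glyph=False):
    """Map the three cases listed above to a result level.

    All arguments are hypothetical inputs that a real check would
    derive from the font's cmap, metrics and outline data.
    """
    if not present:
        # Case 3: U+00AD is not in the font at all.
        return Status.FAIL
    if has_outline or shares_hyphen_glyph:
        # Case 2: visible glyph, or double encoded with U+002D.
        return Status.WARN
    if advance_width == 0:
        # Case 1: zero width, no outlines, as expected.
        return Status.INFO
    # Assumption: an empty glyph with non-zero width is not covered by
    # the list above; it is treated as WARN in this sketch.
    return Status.WARN
```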

@felipesanches felipesanches reopened this Mar 16, 2023
@felipesanches
Collaborator

@RosaWagner and others, what do you think about what @glenda-tn just said?

@RosaWagner
Contributor Author

@felipesanches maybe there was some confusion on my part between the rationale and the message; I would have liked the message to say what to do, since rationales are not displayed in some reports (such as in the GF repo):

⚠ WARN This font has a 'Soft Hyphen' character. [code: softhyphen] Please consider removing it from the font.

@glenda-tn, we don't want soft hyphen to be present. If the soft hyphen is not present, then it is a PASS.

@felipesanches
Collaborator

@glenda-tn, are you OK with the approach that Rosalie proposed in her last message?

@glenda-tn

oooh! I didn't realize Google actually wants the char to be removed.
At TN, we are following the Unicode definition as I had posted in #4046 and repeat here:

Unlike U+2010 HYPHEN, which always has a visible rendition, the character U+00AD SOFT HYPHEN (SHY) is an invisible format character that merely indicates a preferred intraword line break position. If the line is broken at that point, then whatever mechanism is appropriate for intraword line breaks should be invoked, just as if the line break had been triggered by another hyphenation mechanism, such as a dictionary lookup. Depending on the language and the word, that may produce different visible results…
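
To make the quoted behaviour concrete, here is a minimal sketch of how a layout engine might treat SHY: it is only a break opportunity, and a visible hyphen is produced by the engine when (and only when) the line is actually broken there. `break_word` is a hypothetical helper, character counts stand in for real glyph widths, and the plain "-" is an assumption, since Unicode notes the visible result depends on the language.

```python
SHY = "\u00ad"  # U+00AD SOFT HYPHEN


def break_word(word, room):
    """Break `word` at the last soft hyphen that fits in `room` columns.

    Returns (head, tail). `head` ends with a visible hyphen inserted by
    this function, not taken from any soft hyphen glyph. If no break
    fits, `head` is empty and `tail` is the untouched word.
    """
    parts = word.split(SHY)
    # Try the longest prefix first, then shorter ones.
    for i in range(len(parts) - 1, 0, -1):
        head = "".join(parts[:i]) + "-"
        if len(head) <= room:
            return head, SHY.join(parts[i:])
    return "", word


def render_unbroken(word):
    """An unbroken word shows nothing at the soft hyphen positions."""
    return word.replace(SHY, "")
```

In this model the font's U+00AD glyph is never consulted, which matches what most implementations were observed to do in this thread.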

At Type Network, a year ago we actually had to add uni00AD to a typeface for a client because:

Due to problems with hyphenation in printing (Office 365) documents a developer found out that the [TypeNetwork fonts] do not support soft hyphen characters (unicode: \u00AD)

I apologize, I had misunderstood your intent with the check. Now I get that you don't actually want it in the font. That said, then your rationale is fine and clarifying. At TN, we will continue to require it, but at 0 width, with no outlines.

Thanks for your attention to this.

@felipesanches
Collaborator

@glenda-tn, in this particular case, we're discussing a check that is currently in the universal profile, so I am trying to reach consensus on what is a good policy for everybody, instead of a Google Fonts-specific requirement.

It would then be great if we can figure out what criteria would better satisfy the needs of the majority of the font development community as a whole. Those who have diverging requirements can still have their vendor specific checks for that. But the goal here is to decide on what to do on the universal profile. What do you think?

@glenda-tn

@felipesanches To keep the check Universal, I would think that following the Unicode spec is the correct thing to do and use the INFO, WARN, FAIL results that I posted above. The original rationale was fine though I might cross out the last sentence.

This font has a 'Soft Hyphen' character (codepoint 0x00AD) which is supposed to be zero-width and invisible, and is used to mark a hyphenation possibility within a word in the absence of or overriding dictionary hyphenation. It is mostly an obsolete mechanism now, and the character is only included in fonts for legacy codepage coverage. [code: softhyphen]

To have more clout, I might add the quote from Unicode page to the above. (quote copy is in my previous comment)

@felipesanches
Collaborator

felipesanches commented Mar 17, 2023

Another thing to keep in mind is the fact that the checks in the universal profile cover things that go beyond what the OpenType spec dictates.

So, we have checks in the opentype profile for things that the spec requires or recommends, and then the universal profile is for additional aspects that are not covered by the spec but that are generally established as good practices in the type design community.

@twardoch
Collaborator

I once made a font which had a glyph associated with the soft hyphen Unicode. The glyph looked like hyphen but had a negative sidebearing so that the hyphen would stick out a bit at the hyphenation point.

I observed that most implementations ignored that glyph and displayed the actual hyphen glyph at the end of the line, but at least one app* did display the glyph associated with the soft hyphen Unicode — which is what I actually wanted. But I did not test what would happen if the font did not include a glyph there. But my conclusion was that it was safer if the soft hyphen Unicode is supported by the font cmap — be it as a separate glyph or as an additional mapping to the hyphen glyph.

*) Unfortunately I no longer remember which app it was.

@RosaWagner
Contributor Author

@twardoch when we chatted about it 18 months ago you recommended to remove the glyph completely ^^

@felipesanches taking all comments into account, I would go back to the previous rationale with the proper quote from Unicode provided by @glenda-tn. I am still bothered by the WARN message though, as it doesn't give any clear recommendation.

I just tested the soft hyphen in Office 365 + InDesign + native apps on both Windows and Mac, with fonts that have the soft hyphen and fonts that don't, and it seems to work fine in all cases. Printing the documents directly from the app also worked, as did printing from a PDF. We also did these tests when implementing that check the first time.

@glenda-tn do you happen to know which version of Office 365, the OS and kind of printer your client was using?

For now to improve the check, I would propose:

  • PASS if soft hyphen is absent
  • PASS if soft hyphen is present with 0-width and no outline
  • FAIL if soft hyphen is present with a contour, with a suggestion to either remove it for recent environments or keep it but make it invisible for backward compatibility.

@felipesanches
Collaborator

Great! It sounds like we're reaching consensus on this for the universal profile then!

I'll incorporate these proposed changes. Thanks!

@twardoch
Collaborator

@twardoch when we chatted about it 18 months ago you recommended to remove the glyph completely ^^

I see :)

@glenda-tn

@RosaWagner I do not know anything further about the client usage... this was over a year ago and we simply made it a point that going forward, we would require the code point.

I agree with @twardoch's earlier comment upthread in that the code point, 00AD, should be present in fonts. There are likely many existing documents that use it. While some apps/browsers may ignore it, there are probably some out there that don't and so there is risk of the .notdef showing up if the code point is missing from a font.
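
The .notdef risk can be illustrated with a small sketch. It assumes a `cmap` dictionary mapping code points to glyph names (the same shape that fontTools' `getBestCmap()` returns); `glyphs_for_text` and the `drops_soft_hyphen` flag are hypothetical and only model two kinds of renderer, not any specific app.

```python
SOFT_HYPHEN = 0x00AD


def glyphs_for_text(text, cmap, drops_soft_hyphen=False):
    """Return the glyph names a naive renderer would pick for `text`.

    cmap: dict of code point -> glyph name.
    drops_soft_hyphen: models renderers that discard U+00AD before
    glyph lookup; renderers that do not will fall back to .notdef
    when the code point is missing from the font.
    """
    names = []
    for ch in text:
        cp = ord(ch)
        if drops_soft_hyphen and cp == SOFT_HYPHEN:
            continue
        names.append(cmap.get(cp, ".notdef"))
    return names


# A font without U+00AD, and the same font with an empty uni00AD glyph.
without_shy = {0x61: "a", 0x62: "b"}
with_shy = {0x61: "a", 0x62: "b", SOFT_HYPHEN: "uni00AD"}

print(glyphs_for_text("a\u00adb", without_shy))
print(glyphs_for_text("a\u00adb", with_shy))
```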

Another point to note is that the softhyphen is used in other languages and connected scripts, like Arabic. From the same Unicode link I've been referring to but written further down the page is this:

Hyphenation, and therefore the SHY, can be used with the Arabic script. If the rendering system breaks at that point, the display—including shaping—should be what is appropriate for the given language. For example, sometimes a hyphen-like mark is placed on the end of the line. This mark looks like a kashida, but is not connected to the letter preceding it. Instead, the appearance of the mark is as if it had been placed—and the line divided—after the contextual shapes for the line have been determined. For more information on shaping, see [UAX9] and Section 9.2, Arabic, of [Unicode].

That said, I do not think the absent code point should be a PASS, but rather, a FAIL.

@vv-monsalve
Collaborator

vv-monsalve commented Mar 18, 2023

  • PASS if soft hyphen is absent
  • PASS if soft hyphen is present with 0-width and no outline
  • FAIL if softhyphen is present with contour with suggestion to either remove it for recent environments or leave it but invisible for backward compatibility.

From our previous tests and the new ones Rosalie has performed, I would support this.

However, since we are discussing a Universal profile check, I would like to hear from @tiroj, who provided the first argument we considered in the previous issue.

@tiroj

tiroj commented Mar 18, 2023

I think I would be inclined to go with

  • WARN or INFO if soft hyphen is absent
  • PASS if soft hyphen is present with 0-width and no outline
  • FAIL if softhyphen is present with contour with suggestion to either remove it for recent environments or leave it but invisible for backward compatibility

While soft hyphen not being present is likely to be fine in most instances, there are edge cases where it could be either desired or recommended for accurate Windows codepage coverage.

There are a lot of fonts in the wild that have visible soft-hyphen glyphs, or that dual-map the /hyphen glyph to U+00AD, since many people are confused about the purpose of the character. I made a few myself before Khaled set me right a few years ago.

@glenda-tn

glenda-tn commented Mar 18, 2023

There are a lot of fonts in the wild that have visible soft-hyphen glyphs, or that dual-map the /hyphen glyph to U+00AD, since many people are confused about the purpose of the character.

This. Noto, Roboto and RobotoFlex, Segoe, SourceSerif, SF Symbols, and so many Google fonts all have a visible contour or are double-encoded. I have looked at so many. I only know of SF Pro that completely omits the code point.

I would be fine with @tiroj's rationale for the Universal profile.

@simoncozens
Collaborator

Another point to note is that the softhyphen is used in other languages and connected scripts, like Arabic. From the same Unicode link I've been referring to but written further down the page is this:

Hyphenation, and therefore the SHY, can be used with the Arabic script. If the rendering system breaks at that point, the display—including shaping—should be what is appropriate for the given language. For example, sometimes a hyphen-like mark is placed on the end of the line.

This is true, in theory: the Uyghur language, at least, uses the Arabic script and supports hyphenation. But I am not sure that it is true in practice. I don't think that any layout system correctly implements hyphenation for Arabic. (Apart from my own SILE typesetter. ;-)

@vv-monsalve
Collaborator

vv-monsalve commented Mar 23, 2023

Taking everything into consideration, the check could use the following:

Log Level Result

  • INFO if soft hyphen is absent
  • PASS if soft hyphen is present with 0-width and no outline
  • FAIL if the /hyphen glyph is dual-mapped to U+00AD
  • FAIL if soft hyphen is present with a contour, with a suggestion to either remove it for recent environments or keep it but make it invisible for backward compatibility.

Rationale

According to Unicode:

Unlike U+2010 HYPHEN, which always has a visible rendition, the character U+00AD SOFT HYPHEN (SHY) is an invisible format character that merely indicates a preferred intraword line break position.

Nevertheless, it is recommended not to include it in the font at all, because discretionary hyphenation should be handled at the level of the shaping engine, not the font. If it has to be added for backward compatibility, U+00AD should be 0-width and have no outline.
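
As a rough sketch of how the log levels proposed in this comment might fit together (not the actual FontBakery implementation): `Glyph`, `check_soft_hyphen` and the message texts are hypothetical, and the inputs are plain dictionaries that a real check would build from the font's cmap, metrics and outlines.

```python
from dataclasses import dataclass

SOFT_HYPHEN = 0x00AD
HYPHEN_CODEPOINTS = (0x002D, 0x2010)  # HYPHEN-MINUS and HYPHEN


@dataclass
class Glyph:
    advance_width: int
    contour_count: int


def check_soft_hyphen(cmap, glyphs):
    """Return (status, message) following the proposal above.

    cmap: dict of code point -> glyph name.
    glyphs: dict of glyph name -> Glyph.
    """
    name = cmap.get(SOFT_HYPHEN)
    if name is None:
        return "INFO", "U+00AD is absent; this is fine in most environments."
    hyphen_names = {cmap[cp] for cp in HYPHEN_CODEPOINTS if cp in cmap}
    if name in hyphen_names:
        return "FAIL", f"U+00AD is dual-mapped to the '{name}' glyph."
    glyph = glyphs[name]
    if glyph.contour_count > 0:
        return "FAIL", (
            "U+00AD has a contour; remove the character, or keep it "
            "empty and 0-width for backward compatibility."
        )
    if glyph.advance_width != 0:
        # Assumption: the proposal does not name this case; it is read
        # as a FAIL here because the rationale asks for 0-width.
        return "FAIL", "U+00AD has no outline but a non-zero width."
    return "PASS", "U+00AD is present, 0-width and has no outline."
```

Returning a status and message pair keeps the recommendation attached to the result, so it still reaches the reader in reports that omit the rationale.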

@RosaWagner RosaWagner added the GF's priority list List of high priority issues for google/fonts CI label May 19, 2023