-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FLUID-6554: Handle soft hyphens in content #1010
Conversation
* markup as the separators and split the original textnode at the hpyhenation points. If there are no places to | ||
* inject a hyphenation point, the node will be left unchanged. If the original textnode inclues soft hyphens (i.e. | ||
* ­) characters, these will be used as a hyphenation point with the original location in the text node | ||
* replacded with the `softHyphenPlaceholderMarkup` to allow it to be replaced when syllabification is removed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typos hpyhenation and replacded. You probably want the last "replaced" to be "restored" for a bit of extra clarity
* | ||
* @param {HypherHyphenator} hyphenator - an instance of a Hypher Hyphenator | ||
* @param {DomNode} node - a DOM node containing text to be hyphenated | ||
* @param {String} separatorMarkup - the markup to be injected and used to indicate the separators/hyphens | ||
* @param {String} softHyphenPlaceholderMarkup - the markup to be injected and used to indicate the original | ||
* location of any soft hyphens inclded in `node` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo inclded
}, | ||
strings: { | ||
languageUnavailable: "Syllabification not available for %lang", | ||
patternLoadError: "The pattern file %src could not be loaded. %errorMsg" | ||
}, | ||
markup: { | ||
separator: "<span class=\"flc-syllabification-separator fl-syllabification-separator\"></span>" | ||
separator: "<span class=\"flc-syllabification-separator fl-syllabification-separator\"></span>", | ||
softHyphePlaceholder: "<span class=\"flc-syllabification-softHyphenPlaceholder\"></span>" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo name softHyphePlaceholder which at least was copy and pasted correctly at line 161 : P
@amb26 my spelling woes continue. Thanks for spotting these. I've pushed up fixes, ready for more review. |
* markup as the separators and split the original textnode at the hyphenation points. If there are no places to | ||
* inject a hyphenation point, the node will be left unchanged. If the original textnode inclues soft hyphens (i.e. | ||
* ­) characters, these will be used as a hyphenation point with the original location in the text node | ||
* replacded with the `softHyphenPlaceholderMarkup` to allow it to be restored when syllabification is removed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replacded -> replaced.
var separator = $(separatorMarkup)[0]; // converts the separatorMarkup markup string into a Dom Node | ||
node = node.splitText(seg.length); | ||
node.parentNode.insertBefore(separator, node); | ||
if (node.textContent.startsWith("\u00AD")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One problem is, when a soft hyphen is in a middle of a word, syllabification breaks the word into 2 syllables at where the soft hyphen is regardless if these 2 sections are proper syllables.
For example, back
is one single syllable. When a soft hyphen is in between ba
and ck
, hypher will mark them as 2 separate syllables, which might be misleading.
I was thinking about removing soft hyphens from texts before running them thru the hypher and inserting soft hyphens back. But the difficult part is how to keep track of positions of those soft hyphens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One suggestion I can think of is to simply clone any DOM that the soft hyphens are removed from and hold the original, together with a pointer to its parent, it in a private fragment out of the document context. This can then be swapped back in again when the enactor is turned off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing soft hyphens without inserting them back when the "syllable" setting is on will lead to the issue below that hyphenations are gone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cindyli I can see your point. However, the ability to break on soft hyphens provides a means for an integrator to define where the syllables are split. Unfortunately the hyphenation engine we use it's 100% perfect. If I remember correctly, it's closer to 80% accurate. So providing an ability for an integrator to explicitly define where breaks should happen may be useful. Also for novel words or localization specific pronunciation that we may not cover.
I'd suggest not using soft hyphens where a word wouldn't have a natural hyphenation point. Although I'm not sure if there are reasons that would contradict this nor is it something we could enforce in anyway.
More thoughts/comments welcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's reasonable to imagine that turning on the syllabification enactor represents the user's intention "I would like to substitute UIO's idea for where syllables should break for the original page author's, if any". So long as we restore the markup to its original condition when the enactor is deactivated, I think this will be accepted. Of course it would be desirable if we could somehow incorporate the "hint" from the markup author provided by the soft hyphen but this doesn't seem to be possible with hypher.
if (node.textContent.startsWith("\u00AD")) { | ||
// converts the softHyphenPlaceholderMarkup markup string into a Dom Node | ||
var placeholder = $(softHyphenPlaceholderMarkup)[0]; | ||
var newNode = fluid.prefs.enactor.syllabification.insertIntoTextNode(node, placeholder, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good remarks, @cindyli!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cindyli I've pushed up some changes which add a soft hyphen to the placeholder pseudo element. Please let me know if this addresses the issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cindyli pointed found a bug in the updated implementation if the soft hyphen appears in the last segment, it doesn't get replaced with the placeholder. This is because the algorithm removes the last segment as we don't want to add a separator after it. However, we do need to replace the soft hyphen in all segments.
Could you add tests to make sure all soft hyphens, including when it's on the last syllable segment, are replaced by the soft hyphen placeholder? Thanks. |
https://issues.fluidproject.org/browse/FLUID-6554