fix for FS#2676, inserting zero length spaces into long sequences of non-breaking characters in diffs #165
Chris--S merged 2 commits into dokuwiki:master
Should this be used in the diff mails, too? Or are (possibly mobile) mail clients better at that?
inc/html.php
I think that entities in the form &#xHEX; (where HEX is a hex value) are valid, too.
For simplicity, what do you think about changing to my later, simplified pattern?
&#?\w{1,4};
I don't think it's a good idea to make it overly complicated or accurate. I think it's OK to catch more than the set of valid HTML entities. That said, do any have more than 4 characters?
Yes, many. Have a look at http://htmlentities.com/html/entities/
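To make the point about entity length concrete, here is a small Python sketch (not the PHP from this pull request). It compares the simplified pattern quoted above with a hypothetical looser variant, `&#?\w+;`, which is an assumption introduced only for illustration. The sample entities are ordinary HTML entities chosen by hand.

```python
import re

# Simplified pattern quoted in the review comment above.
SIMPLE = re.compile(r"&#?\w{1,4};")
# Hypothetical looser variant (assumption): no upper bound on the name length.
LOOSE = re.compile(r"&#?\w+;")

samples = ["&lt;", "&amp;", "&#160;", "&nbsp;", "&#x200B;", "&hellip;", "&thetasym;"]


def matched(pattern, items):
    """Return the items that the pattern matches in full."""
    return [s for s in items if pattern.fullmatch(s)]


simple_hits = matched(SIMPLE, samples)
loose_hits = matched(LOOSE, samples)

# Entities whose name (or hex value including the 'x') exceeds 4 characters.
missed = [s for s in samples if s not in simple_hits]
print(missed)
```

With these samples, the `{1,4}` bound misses the hex form `&#x200B;` as well as the longer named entities, while the looser variant matches all of them at the cost of also accepting strings that are not valid entities.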
Although the original was about URLs because URLs are by far the longest string, I wonder if it also makes sense to do something about other potentially long strings. E.g.
The fix is for long strings, long being 12 characters without a breaking character.
Yes, I get that. But because the idea came from URLs, we only looked for typical characters in URLs to break a string. That's why we didn't think of
I don't think those two characters should be followed by zero length spaces as they tend to indicate full words. They could be followed by Thinking out loud ... we could do a second parse for long unbroken strings looking for '-' after the first. That would avoid breaking at '-' and '_' except when they were involved in long strings without the other break characters.
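The two-pass idea in the comment above could look roughly like the following Python sketch. It is only an illustration of the suggestion, not code from this pull request: the function names, the use of whitespace as the only "breaking character", and the reading of "long" as 12 or more characters are all assumptions.

```python
import re

ZWSP = "\u200b"  # zero-width space (U+200B)
LONG = 12        # threshold from the discussion; "12 or more" is an assumption


def break_after(text, chars):
    """Insert a ZWSP after each sequence of `chars` inside long unbroken runs."""
    # A run is a stretch without whitespace and without an earlier ZWSP.
    run = re.compile(r"[^\s%s]{%d,}" % (ZWSP, LONG))
    seq = re.compile("[%s]+" % re.escape(chars))
    return run.sub(
        lambda m: seq.sub(lambda s: s.group(0) + ZWSP, m.group(0)), text
    )


def soften(text):
    # First pass: the URL-style break characters.
    first = break_after(text, "/#!,:;")
    # Second pass: '-' and '_' are used only in runs that are still long.
    return break_after(first, "-_")


print(soften("a-very-long-hyphenated-identifier").split(ZWSP))
print(soften("http://example.com/a-b").split(ZWSP))
```

Because the second pass treats the zero-width spaces from the first pass as break points, a hyphen inside a URL segment that is already short enough is left alone, and '-' or '_' only become break points when nothing else could break the run.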
Using
I'd say let's keep it simple for now. Would a shy add a hyphen when the browser wraps it? If yes, I'd not use that for diffs as an additional character might be confusing.
fix for FS#2676, inserting zero length spaces into long sequences of non-breaking characters in diffs
Post-process the HTML content string returned by Diff->format to locate long, unbroken strings of characters. Examine those strings and insert zero-length (zl) spaces after certain characters (e.g. /#!,:;). When there is a sequence of these 'special' characters, insert the zl space only after the last character in the sequence.
Also, don't modify content within HTML tags, and keep HTML entities together.
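The behaviour described in this commit message can be sketched as follows. This is a Python illustration written for this summary, not the PHP added to inc/html.php. The name `insert_soft_breaks`, the naive tag regex, the unbounded entity pattern, and the reading of "long" as 12 or more non-whitespace characters are all assumptions.

```python
import re

ZWSP = "\u200b"  # zero-width space (U+200B)
MIN_RUN = 12     # "long" threshold from the discussion (assumed: 12 or more)

TAG = re.compile(r"(<[^>]*>)")           # naive HTML tag matcher (assumption)
RUN = re.compile(r"\S{%d,}" % MIN_RUN)   # long run without whitespace
# Either an entity (kept intact) or a sequence of 'special' characters.
TOKEN = re.compile(r"(&#?\w+;)|([/#!,:;]+)")


def _break_run(match):
    def repl(tok):
        if tok.group(1):            # HTML entity: return it unchanged
            return tok.group(1)
        return tok.group(2) + ZWSP  # break only after the last special char
    return TOKEN.sub(repl, match.group(0))


def insert_soft_breaks(html):
    """Add zero-width spaces to long runs of text that sit outside tags."""
    parts = TAG.split(html)  # even indexes: text, odd indexes: tags
    for i in range(0, len(parts), 2):
        parts[i] = RUN.sub(_break_run, parts[i])
    return "".join(parts)


link = '<a href="http://example.com/some/long/path">http://example.com/some/long/path</a>'
print(insert_soft_breaks(link).split(ZWSP))
```

Matching the entity alternative first means the `#` and `;` inside something like `&#160;` are consumed as part of the entity and never treated as break characters, and splitting on tags beforehand keeps attribute values such as `href` untouched.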