One of the features I found extremely useful in XSLT/XPath/XQuery was the normalisation of whitespace within a string (in other words, on top of trimming the string, any run of consecutive whitespace characters gets replaced by a single space character).
For example, the string hello\r\nworld!\t would be normalised to simply hello world!.
This is extremely useful in non-Latin languages (read: Japanese, I live in Tokyo), where a number of different characters can be used for separating words. In Japanese, people normally type the Unicode character \u3000, as that's what is entered by default when hitting "space" on the keyboard, but that's not necessarily what one might want to retain.
For example, I'd love for the string 山田　太郎 (the space used here is the normal space I get when using the Japanese keyboard; copy and paste it into hexdump -C on a UTF-8 console to check) to be turned into a more normal 山田 太郎 (with the space replaced by ASCII 0x20).
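To make the requested behaviour concrete, here is a minimal sketch in Python, chosen only for illustration. The function name normalize_space is hypothetical, borrowed from XPath's normalize-space(). Note that XPath itself only treats space, tab, CR and LF as whitespace, so this sketch assumes the broader Unicode definition, which is what covers \u3000.

```python
def normalize_space(text: str) -> str:
    """Trim the string and collapse each run of whitespace to one ASCII space.

    Hypothetical helper, not an existing API of any tool discussed here.
    """
    # str.split() with no arguments splits on runs of Unicode whitespace
    # (this includes U+3000, the ideographic space) and discards the empty
    # pieces that leading or trailing whitespace would otherwise produce.
    return " ".join(text.split())


print(normalize_space("hello\r\nworld!\t"))   # hello world!
print(normalize_space("山田\u3000太郎"))        # 山田 太郎 (separator is now 0x20)
```

Splitting and re-joining handles the trimming and the collapsing in one step, so no explicit list of whitespace characters has to be maintained.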
This thread has been automatically locked due to inactivity. Please open a new issue for related bugs or questions following the new issue template instructions.
lockbot locked as resolved and limited conversation to collaborators on Jan 9, 2020