Some characters disappear on Mac OS X #4896
A better bet might be to just hit the ones we know we want?
Don't break things that are already broken. :)
Only single-byte UTF-8 characters can be whitespace. How about this commit, @Karlson2k?
@sportica UTF-8 is backward compatible with US-ASCII only, not ASCII in general. Unicode has many more whitespace characters than US-ASCII.
Okay. Thank you, Karlson2k.
It now uses std::ctype functions instead of isspace(). std::ctype functions determine whitespace according to the system locale. How about this approach?
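For illustration, here is a minimal sketch of a check that goes through the locale-aware `std::isspace` overload from `<locale>` but pins it to the classic "C" locale, so the result should not follow the system locale. The name `is_space_classic` is hypothetical and is not taken from this pull request.

```cpp
#include <locale>

// Hypothetical helper (not from this PR): classify a char with the
// classic "C" locale instead of whatever the global/system locale is.
inline bool is_space_classic(char c)
{
    // std::isspace(charT, const std::locale&) uses the ctype facet
    // of the locale passed in, here std::locale::classic().
    return std::isspace(c, std::locale::classic());
}
```

With this sketch, a space or tab is reported as whitespace, while the byte 0xa0 is not, because it lies outside US-ASCII.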
How about writing our own function, @Karlson2k?
@Karlson2k It is based on your second suggestion, but it converts only as much as necessary, not the whole string. It converts one character at a time and compares it.
IMO I'd just implement using
Please let me know if you know of a suitable STL facility. There seems to be no template for our case. Otherwise we should keep the helper function.
The original? With the simple conditional on a char, you have a function that takes a char and returns a bool, which is exactly what the isspace_c() function does.
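A function of that shape might look like the sketch below. It is only an assumption about what such a helper could contain: the name `is_ascii_space` is hypothetical, and the body is not copied from the project's `isspace_c()`.

```cpp
// Hypothetical helper: true only for the six US-ASCII whitespace
// characters. The result never depends on the current locale, so a
// UTF-8 continuation byte such as 0xa0 is never treated as whitespace.
inline bool is_ascii_space(char c)
{
    return c == ' '  || c == '\t' || c == '\n' ||
           c == '\v' || c == '\f' || c == '\r';
}
```

Because the comparison is against plain character literals, the signedness of `char` on the target platform does not matter here.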
Please check again.
@Karlson2k mind taking a look?
Looks fine.
jenkins build this please
jenkins build this please again, without strange fails this time |
Some characters disappear on Mac OS X
std::isspace() on Mac OS X recognizes 0xa0 as whitespace.
I think this is a bug.
http://en.cppreference.com/w/cpp/string/byte/isspace
'죠' is 0xec, 0xa3, 0xa0
'고' is 0xea, 0xb3, 0xa0
Both '죠' and '고' end with the byte 0xa0 in UTF-8.
More characters are probably affected.
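To make the failure mode concrete, here is a rough sketch of a right-trim that checks bytes against US-ASCII whitespace only, so the trailing 0xa0 of '고' survives. Both `is_ascii_space_byte` and `trim_right_ascii` are hypothetical names used for illustration; they are not the project's functions.

```cpp
#include <string>

// Hypothetical helper: US-ASCII whitespace only, locale-independent.
inline bool is_ascii_space_byte(unsigned char c)
{
    return c == ' '  || c == '\t' || c == '\n' ||
           c == '\v' || c == '\f' || c == '\r';
}

// Hypothetical right-trim over raw UTF-8 bytes. A trim based on a
// locale-dependent std::isspace() could instead strip the final 0xa0
// byte and corrupt the last character.
inline std::string trim_right_ascii(const std::string& s)
{
    std::string::size_type end = s.size();
    while (end > 0 &&
           is_ascii_space_byte(static_cast<unsigned char>(s[end - 1]))) {
        --end;
    }
    return s.substr(0, end);
}
```

For example, trimming the bytes `0xea 0xb3 0xa0` followed by a space and a tab should return the three bytes of '고' unchanged.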