Properly handle supplementary characters when saving XML files #197
…on platforms where wchar_t (and, by extension, wxUChar) is 2 bytes. Also, ignore invalid surrogates and the noncharacters U+FFFE and U+FFFF.
|
Thanks for this proposal. Would it not be simpler to change the type of c to something sure to be 32 bits wide on all platforms? Could it be the char32_t type, which is standard since C++11? |
|
The problem appears to be much deeper than I originally thought. If you look up the return type of It looks like the real solution here is to change the entire source code over from EDIT: Or maybe I'm completely wrong there. What's definitely true, though, is that if I switch just that one function to use a 32-bit character type, it would still be necessary to deal with surrogate pairs, and additionally it would be necessary to decode and encode them in that function, which isn't necessary with the code that I submitted. The code also uses |
|
In case any developers see this, I posted this issue on audacity-devel (here: https://sourceforge.net/p/audacity/mailman/message/35931978/) almost two weeks ago; it has yet to get a response. |
|
Thanks Yarn. This fix has now been committed. |
This fixes the problem experienced by a number of users, including this one, where supplementary characters (which include many emoji) are written as pairs of escaped surrogate code points.
Note that the problem only affects platforms like Windows where wchar_t is 2 bytes.
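
For readers who want to see the idea in isolation, here is a minimal sketch of the approach described in this thread. It is not the code that was committed: the function name `EscapeUtf16ForXml` is hypothetical, and `std::u16string` stands in for a string of 2-byte `wchar_t` units so that the example behaves the same on every platform. It joins each valid surrogate pair into a single code point before escaping it, and drops unpaired surrogates as well as U+FFFE and U+FFFF.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper for illustration only (not Audacity's actual function).
// Input: UTF-16 code units, as on platforms where wchar_t is 2 bytes.
// Output: ASCII-only XML text with non-ASCII characters written as
// numeric character references.
std::string EscapeUtf16ForXml(const std::u16string &in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t c = in[i];

        if (c >= 0xD800 && c <= 0xDBFF) {
            // High surrogate: only valid if a low surrogate follows.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                c = 0x10000 + ((c - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i; // the low surrogate has been consumed
            } else {
                continue; // unpaired high surrogate: ignore it
            }
        } else if (c >= 0xDC00 && c <= 0xDFFF) {
            continue; // low surrogate without a preceding high one: ignore it
        } else if (c == 0xFFFE || c == 0xFFFF) {
            continue; // noncharacters: ignore them
        }

        // Control characters are passed through unchanged in this sketch.
        switch (c) {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        case '\'': out += "&apos;"; break;
        default:
            if (c < 0x80) {
                out += static_cast<char>(c);
            } else {
                // One reference per code point, never one per surrogate.
                char buf[16];
                std::snprintf(buf, sizeof buf, "&#x%lX;",
                              static_cast<unsigned long>(c));
                out += buf;
            }
        }
    }
    return out;
}
```

With this sketch, the emoji U+1F600 (stored as the code units 0xD83D 0xDE00) comes out as the single reference `&#x1F600;` rather than as two escaped surrogates, which is the symptom reported above.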