Is PLATFORM_TEXT_IS_CHAR16 in UnixDefines.h correct? #2090

Open
nothingTVatYT opened this issue Dec 16, 2023 · 3 comments

Comments

@nothingTVatYT
Contributor

UnixDefines.h defines:

#define PLATFORM_TEXT_IS_CHAR16 1

which leads to:

#if PLATFORM_TEXT_IS_CHAR16
typedef char16_t Char;

in BaseTypes.h

and that's the reason why the C/C++ wide character functions fail on Linux. I stumbled across this while trying to fix the command line parsing and found that I cannot use the standard library functions (see the C++ reference) to convert the arguments from the locale encoding to a Flax String.

On Windows, wchar_t is actually 16 bits wide (according to the Microsoft documentation), but on Linux, macOS, iOS and Android it's 32 bits.
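
The size difference can be checked at compile time. This is a minimal sketch; the names kWideCharBytes and WideCharMatchesChar16 are made up for illustration and are not part of Flax:

```cpp
#include <cstddef>

// char16_t is 2 bytes on all mainstream platforms (the standard only requires at least 16 bits).
static_assert(sizeof(char16_t) == 2, "char16_t is expected to be 2 bytes here");

// wchar_t is implementation-defined: typically 2 bytes on Windows, 4 bytes on Linux/macOS/iOS/Android.
constexpr std::size_t kWideCharBytes = sizeof(wchar_t);

// Hypothetical helper: true only where a wchar_t buffer has the same unit size as a char16_t buffer.
constexpr bool WideCharMatchesChar16()
{
    return sizeof(wchar_t) == sizeof(char16_t);
}
```

Where WideCharMatchesChar16() is false, a char16_t buffer cannot simply be passed to functions that expect wchar_t strings; the text has to be converted first.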

Now I wonder how data exchange with other C/C++ libraries that may use wchar_t is affected. What about file names on non-Windows platforms?

I found some code fragments that convert the Flax String to what is called Ansi, using char, when dealing with native code, filesystems or the OS. Won't that lead to data loss?
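
To illustrate the kind of loss meant here, the sketch below assumes a narrowing that keeps only the low byte of each UTF-16 code unit. That behaviour is an assumption for illustration; I have not confirmed that the Flax conversion works this way, and both function names are hypothetical:

```cpp
#include <string>

// Hypothetical narrowing: keeps only the low byte of each UTF-16 code unit.
// Assumed behaviour for illustration, not a confirmed description of Flax's Ansi conversion.
std::string NarrowByTruncation(const std::u16string& in)
{
    std::string out;
    out.reserve(in.size());
    for (char16_t unit : in)
        out.push_back(static_cast<char>(unit & 0xFF));
    return out;
}

// Reports whether every code unit is 7-bit ASCII, in which case narrowing to char cannot lose information.
bool IsAscii(const std::u16string& in)
{
    for (char16_t unit : in)
        if (unit > 0x7F)
            return false;
    return true;
}
```

With this narrowing, u"abc" survives unchanged, while the Euro sign (U+20AC) collapses to the single byte 0xAC and can no longer be recovered.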

Flax version:
1.7.1

@mafiesto4
Member

We use char16_t on Unix systems just because wchar_t is 16-bit on Windows and we want to maintain a single character size of 2 bytes across all platforms. We tend to use filesystem APIs that support Unicode filenames, but there might be some things to improve. That's definitely an Editor-only concern, as on cooked platforms (mobile, consoles) it's better to stick to ASCII filesystem names.

@nothingTVatYT
Contributor Author

You are free to use any encoding in the engine and I think many C/C++ programs do that.
What would be nice is a way to interact in a lossless fashion, like String::ToNativeWide and String::FromNativeWide, so that we have a chance to do it right.
Although Linux, macOS, iOS and Android typically use char, at least for command line arguments, that does not mean it's all US-ASCII. For the most part, and especially for your customers, it will probably be UTF-8.
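
A minimal sketch of what a lossless import of such arguments could look like, assuming the bytes are UTF-8. Utf8ToUtf16 is a hypothetical name, not an existing Flax function; malformed input is replaced with U+FFFD instead of being rejected:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Flax API): decodes UTF-8 bytes into UTF-16 code units.
std::u16string Utf8ToUtf16(const std::string& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size())
    {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0xFFFD; // replacement character unless decoding succeeds
        std::uint32_t min = 0;     // smallest code point allowed for this length (rejects overlong forms)
        std::size_t len = 1;
        if (b0 < 0x80) { cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }

        // Consume continuation bytes of the form 10xxxxxx.
        bool ok = (i + len <= in.size());
        for (std::size_t k = 1; ok && k < len; ++k)
        {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (b & 0x3F);
        }
        // Reject truncated, overlong, out-of-range and surrogate code points.
        if (!ok || (len > 1 && cp < min) || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        {
            cp = 0xFFFD;
            len = 1; // skip one byte and resynchronise
        }

        if (cp < 0x10000)
        {
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

Each argv[i] could be passed through such a function before building the engine string, so that non-ASCII arguments keep their characters.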

@nothingTVatYT
Contributor Author

One more problem with the UTF-16 approach on non-Windows platforms is that error printing does not work. On Linux, printing anything in headless mode results in something like:
Error: 0x7b1c53163de8
because the Char msg is not formatted like a char string.
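
One way around this would be to encode the message as UTF-8 before handing it to a char-based output function. The following is only a sketch; Utf16ToUtf8 is a hypothetical helper, not an existing Flax function, and unpaired surrogates are replaced with U+FFFD:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Flax API): encodes UTF-16 code units as UTF-8 bytes.
std::string Utf16ToUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        std::uint32_t cp = in[i];
        const bool isHigh = (cp >= 0xD800 && cp <= 0xDBFF);
        if (isHigh && i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF)
        {
            // Combine a valid surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        }
        else if (cp >= 0xD800 && cp <= 0xDFFF)
        {
            cp = 0xFFFD; // unpaired surrogate
        }

        if (cp < 0x80)
        {
            out += static_cast<char>(cp);
        }
        else if (cp < 0x800)
        {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The result can then be written with any char-based call, for example std::fputs(Utf16ToUtf8(msg).c_str(), stderr), assuming the terminal expects UTF-8.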

2 participants