Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reworked unicode support #17

Closed
lynxlynxlynx opened this issue Mar 11, 2013 · 4 comments
Closed

Reworked unicode support #17

lynxlynxlynx opened this issue Mar 11, 2013 · 4 comments

Comments

@lynxlynxlynx
Copy link
Member

Originally reported by: Gennady Trafimenkov (Bitbucket: gennady, GitHub: gennady)


At the moment there is a number of bugs related to unicode string support: #4, #5, #6

Most of them come from differences in wchar_t on various platforms (2 bytes on Windows, 4 bytes on Linux, ? bytes on MinGW on Linux, ? bytes on MinGW on Linux).

There is also an issue with Android NDK, which doesn't properly support wchar_t.

It might be a good idea to:

  • don't use wchar_t at all;
  • keep unicode strings incoded in utf-8;
  • transform to utf-16 where it is necessary (displaying, writing into files (saves, for example).

The ISO/IEC 10646:2003 Unicode standard 4.0 says: "The width of wchar_t is compiler-specific and can be as small as 8 bits. Consequently, _programs that need to be portable across any C or C++ compiler should not use wchar_t for storing Unicode text. The wchar_t type is intended for storing compiler-defined wide characters, which may be Unicode characters in some compilers."_

Here is a seemingly good library for the job: UTF8-CPP: UTF-8 with C++ in a Portable Way
Another option is QString, but it is a huge dependency for the project.


@derSteFfi
Copy link
Contributor

afaik we are using utf8-cpp since b668ee9

close?

@lynxlynxlynx
Copy link
Member Author

That commit only introduced it to the packaging dump and there's still plenty of use of wchar_t.

@derSteFfi
Copy link
Contributor

~/ja2-strat (git)-[master] % grep -r wchar_t * | wc -l
3648

OK, that's a lot.

edit:
a bit better if we do not count binary matches:

~/ja2-strat (git)-[master] % grep -rI wchar_t * | wc -l
3260

selaux pushed a commit that referenced this issue Jan 1, 2018
@lynxlynxlynx
Copy link
Member Author

Rechecking a year later: it indeed appears not to be needed any more (as in extra changes), especially since the new android test saw no issues (#228 ). Perhaps something from Omnigraphical's Android patchset (#435) would require it, but we can handle it then. This one is too vague, while wchar_t and original bugs are not a problem any more. Closing ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants