ICU for Unicode handling? #10
I ended up skirting the issue with libu8, and, provided no one tries to feed us hopelessly broken encoding, that does the job just fine without having to massively rework how strings are handled ;). (ICU is a very, very large hammer for the Unicode issue, and the fact that wchar_t is just hopelessly broken on Kobo probably doesn't help. Another thing against it is that some of our target devices either don't ship it or ship wildly different versions, and bundling it is not an option: besides the fact that it's C++ and takes forever to build, libicudata is over 25MB in ICU 60.2 ;)).
Ah, fair enough. Carry on... /me keeps forgetting kindles exist 😈 I read a blog post a while back, where the author advocated using UTF-8 internally, and therefore sticking with the standard char * data type. The author argued that many of the most common string operations only care about bytes, and not characters. Also, UTF-8 is a sequence of bytes, so endianness doesn't matter. I found it a rather fascinating read.
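To illustrate the bytes-versus-characters point, here is a minimal sketch in plain C. The helper name `utf8_count_codepoints` is hypothetical (it comes from neither FBInk nor libu8), and it assumes the input is already valid UTF-8. Byte-oriented operations such as `strlen` or `memcpy` work unchanged on UTF-8; only the rarer "how many characters" question needs to look at the encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count code points in a NUL-terminated UTF-8 string.
 * Continuation bytes have the bit pattern 10xxxxxx, so every byte that does
 * NOT match that pattern starts a new code point.
 * Assumption: the input is valid UTF-8 (no validation is done here). */
static size_t utf8_count_codepoints(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; p++) {
        if ((*p & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

For a string such as `"h" "\xC3\xA9" "llo"` (that is, "héllo"), `strlen` reports the storage size in bytes while the helper reports the number of code points, and neither needs `wchar_t` or any locale support.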
That's essentially what I ended up going with ;). I think I may have read that very same article (if it mentioned doing sanitization/conversions at I/O boundaries, that's the one). But with the hobbled libc, I can't really do the sanitization/conversion bit, since any libc-based locale/multibyte/widechar stuff is basically borked ;).
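For readers wondering what sanitization at an I/O boundary could look like without the libc locale, multibyte, or wide-character functions, the following is a rough sketch of a hand-rolled UTF-8 validity check. It is not what FBInk does (the thread says it relies on libu8), and the function name `utf8_is_valid` is made up for illustration. It rejects truncated sequences, stray continuation bytes, overlong encodings, UTF-16 surrogates, and values above U+10FFFF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if the first len bytes of s are
 * well-formed UTF-8. Uses no locale or wchar_t functionality. */
static bool utf8_is_valid(const unsigned char *s, size_t len)
{
    assert(s != NULL || len == 0);
    size_t i = 0;
    while (i < len) {
        unsigned char lead = s[i];
        size_t extra;        /* number of continuation bytes expected */
        unsigned long cp;    /* code point being decoded */
        unsigned long min;   /* smallest value allowed for this length */

        if (lead < 0x80) {   /* plain ASCII */
            i++;
            continue;
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1; cp = lead & 0x1F; min = 0x80;
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2; cp = lead & 0x0F; min = 0x800;
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3; cp = lead & 0x07; min = 0x10000;
        } else {
            return false;    /* stray continuation byte or invalid lead */
        }

        if (len - i - 1 < extra) {
            return false;    /* sequence truncated at end of buffer */
        }
        for (size_t k = 1; k <= extra; k++) {
            unsigned char c = s[i + k];
            if ((c & 0xC0) != 0x80) {
                return false; /* expected a continuation byte */
            }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return false;    /* overlong, out of range, or surrogate */
        }
        i += extra + 1;
    }
    return true;
}
```

A caller would run this once on each incoming buffer and then treat the string as opaque bytes from there on, which matches the "check at the edges, bytes in the middle" idea from the article being discussed.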
It probably was the same article :p I've been looking into this area a bit lately, because I'm trying to see if I can add differential support to my VHD library, and file path strings there are encoded as UTF-16BE. Incidentally, do you know of any good cross-platform C file path library?
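As a sketch of the kind of conversion this implies, the function below turns a UTF-16BE byte buffer into UTF-8 by assembling each 16-bit unit from two bytes explicitly, so host endianness never comes into it. The name `utf16be_to_utf8`, its signature, and the `UTF_CONV_ERROR` sentinel are assumptions made for this example and do not come from any VHD library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UTF_CONV_ERROR ((size_t)-1)

/* Hypothetical helper: convert in_len bytes of UTF-16BE into UTF-8.
 * Returns the number of bytes written to out (not NUL-terminated), or
 * UTF_CONV_ERROR on malformed input or if out_cap is too small. */
static size_t utf16be_to_utf8(const uint8_t *in, size_t in_len,
                              uint8_t *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);
    if (in_len % 2 != 0) {
        return UTF_CONV_ERROR;           /* dangling half code unit */
    }

    size_t o = 0;
    for (size_t i = 0; i < in_len; i += 2) {
        /* Big-endian: high byte first. */
        uint32_t cp = ((uint32_t)in[i] << 8) | in[i + 1];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (in_len - i < 4) {
                return UTF_CONV_ERROR;
            }
            uint32_t lo = ((uint32_t)in[i + 2] << 8) | in[i + 3];
            if (lo < 0xDC00 || lo > 0xDFFF) {
                return UTF_CONV_ERROR;
            }
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            i += 2;                      /* consumed the second unit too */
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return UTF_CONV_ERROR;       /* unpaired low surrogate */
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (out_cap - o < need) {
            return UTF_CONV_ERROR;       /* output buffer too small */
        }

        if (need == 1) {
            out[o++] = (uint8_t)cp;
        } else if (need == 2) {
            out[o++] = (uint8_t)(0xC0 | (cp >> 6));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (uint8_t)(0xE0 | (cp >> 12));
            out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (uint8_t)(0xF0 | (cp >> 18));
            out[o++] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        }
    }
    return o;
}
```

Since each UTF-16 code unit expands to at most three UTF-8 bytes, an output buffer of `in_len / 2 * 3` bytes (plus one if the caller wants to append a NUL) is always large enough.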
Not really, the only thing that comes to mind is C++ (namely, boost) :/.
And I really don't want to say glib on the C side of things, because glib's weird, and I'm not even sure it'd do what you need ;).
You might also find something interesting either in stb or some other small libs like that ;).
Thanks for the suggestions. I didn't see anything that really struck me as being suitable for my requirements (simple though they may be; path joining and normalization). |
I had another look at that STB link, and noticed I had missed the Oh my... that looks just about perfect :) |
Hi @NiLuJe
I'm mulling the idea of adding basic freetype2 support, and was having a look at the FBInk codebase to figure out how I might do that, when I noticed your rant on Kobo's broken libc with regard to Unicode support.
I notice that the Kobo firmware appears to include the ICU library (libicu*.so, version 4.6). Have you looked into using this library for dealing with strings in FBInk?
The API documentation for ICU 4.6.1 is here