Compiling Liblouis with 32-bit Unicode support #6695
Comments
Hi, have you contacted Liblouis folks about this possibility? Thanks.
From: André-Abush Clause [mailto:notifications@github.com]
Sent: Wednesday, January 4, 2017 8:29 AM
In my opinion, the Liblouis DLL included in NVDA should be compiled with 32-bit Unicode.
We are currently improving the French table and would like to add 32-bit Unicode characters such as \y1D4D0, \y1D49C, \y1D4D2, \y1D49E, etc.
What do you think?
|
@josephsl yes, it is possible. See https://github.com/liblouis/liblouis/blob/master/README.windows
|
CC @jcsteh, @dkager, @LeonarddeR
From: André-Abush Clause [mailto:notifications@github.com]
Sent: Wednesday, January 4, 2017 8:43 AM
@josephsl yes, it is possible. See https://github.com/liblouis/liblouis/blob/master/README.windows
Edit the file configure.mk. If you want 32-bit Unicode, change the 2 in the line UCS=2 to a 4.
|
AFAIK that needs an update to the Python wrapper similar to what is done in Java. |
|
@Andre9642, how important is this for French? Does the table fail to work with NVDA if you include these characters or do they just get ignored? Are these characters frequently used in French? |
Yes, the table fails to work with NVDA if I include these characters. Moreover, lou_checktable indicates that the table is invalid:
No, these characters aren't frequently used in French. They are used in mathematics. In fact, our additions relate in part to Unicode Math Symbols. Extract:
|
P2 because this prevents tables from even working in NVDA if they contain 32 bit characters. |
Just a gentle ping to bring the attention of NVDA code contributors to this P2 issue in case anyone desires to work on this. |
As noted in #6695 (comment) this needs some upstream work first. (In general, I'm not too fond of how liblouis deals with Unicode.) |
Hi, we need this for Python 3 (#7105). CC @jcsteh
From: bhavyashah [mailto:notifications@github.com]
Sent: Sunday, August 6, 2017 4:52 AM
Just a gentle ping to bring the attention of NVDA code contributors to this P2 issue in case anyone desires to work on this.
|
@josephsl commented on 7 Aug 2017, 03:31 GMT+10:
No, we don't. Python 3 Unicode does handle 32-bit characters (actually, internally, I believe it's variable width now). However, Windows is still UTF-16, so ctypes still treats Unicode in C functions as UTF-16 when running on Windows. So, this is not required for Python 3. |
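To make the width mismatch concrete, here is a minimal Python 3 sketch (not taken from NVDA or liblouis) that looks at U+1D4D0, one of the characters mentioned at the top of this thread. It only uses the standard library; the variable names are illustrative.

```python
import ctypes

# U+1D4D0 is one of the math characters mentioned at the top of this thread.
ch = "\U0001D4D0"

# Python 3 strings count code points, so this is a single character.
codepoints = len(ch)

# UTF-16 needs a surrogate pair (two 16-bit units) for this character,
# while UTF-32 needs a single 32-bit unit.
utf16_units = len(ch.encode("utf-16-le")) // 2
utf32_units = len(ch.encode("utf-32-le")) // 4

# Size in bytes of the wchar_t that ctypes uses on this platform:
# 2 on Windows, typically 4 on Linux and macOS.
wchar_size = ctypes.sizeof(ctypes.c_wchar)

print(codepoints, utf16_units, utf32_units, wchar_size)
```

On Windows the last value is 2, which is why a liblouis build with 4-byte widechars cannot simply be fed `c_wchar_p` values there.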
For anyone that does want to try to implement this into the liblouis Python bindings, here are some rough thoughts:
Note that this proposed implementation eliminates the need to consider the ctypes unicode width altogether. However, it may be slightly less optimal when the ctypes unicode width does match the liblouis unicode width because it always uses a Python byte string as an intermediary, where perhaps ctypes is more efficient internally; I'm not sure. You could get around this by checking the ctypes unicode width with |
There is also a function to get the widechar size, though that would require a library call from the wrapper:
A much more ambitious solution would be to make liblouis work with UTF-16 as opposed to UCS-2, i.e. add support for surrogate characters. |
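The byte-string intermediary described above could look roughly like the following sketch. It is an illustration only, not the code from the liblouis bindings: the helper names are hypothetical, and the character size is passed in as a parameter (in a real wrapper it might come from a build-time value or from the library's widechar-size function) so that the snippet runs without liblouis installed.

```python
import ctypes


def codec_for(char_size):
    """Map a widechar size in bytes to a little-endian codec name."""
    if char_size == 2:
        return "utf-16-le"
    if char_size == 4:
        return "utf-32-le"
    raise ValueError("unsupported widechar size: %r" % (char_size,))


def to_widechar_buffer(text, char_size):
    """Encode text into a raw char buffer; return (buffer, length in widechars)."""
    data = text.encode(codec_for(char_size))
    # Passing an explicit size avoids an extra trailing NUL byte.
    buf = ctypes.create_string_buffer(data, len(data))
    return buf, len(data) // char_size


def from_widechar_buffer(buf, length, char_size):
    """Decode the first `length` widechars of a raw char buffer back to text."""
    return buf.raw[: length * char_size].decode(codec_for(char_size))


sample = "a\U0001D4D0"
buf32, len32 = to_widechar_buffer(sample, 4)  # 2 widechars
buf16, len16 = to_widechar_buffer(sample, 2)  # 3 widechars (surrogate pair)
print(len32, len16, from_widechar_buffer(buf32, len32, 4) == sample)
```

Because the buffer is always plain bytes, the ctypes `wchar_t` width never enters the picture; the cost is one extra encode and decode per call.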
I'm going to give this a try. |
@jcsteh commented on 7 aug. 2017 01:04 CEST:
Shouldn't we get rid of autotools macro expansions in python bindings if possible? |
Ideally, I guess so, but this piece of information is inherently tied to
the specific build of liblouis. One way you could work around that is to
have some liblouis function which returns this information, but I'm not
sure whether that'd be acceptable to the maintainers. Given that we already
have LIBLOUIS_SONAME and I really don't see how we could get rid of that, I
think it'd be less friction to just go the ugly path and use another macro
expansion, but I'm not the one implementing this. :)
|
@jcsteh commented on 30 aug. 2018 18:58 CEST:
There is such a function, as @dkager pointed out in #6695 (comment).
@jcsteh commented on 7 aug. 2017 01:04 CEST:
For reference, we should explicitly use the little endian versions of utf-16 and utf-32.
Now, that fails, since for c_char_p the value is truncated as soon as a null character is encountered. Even if we used c_wchar_p on Windows, the value would be truncated for cases like 'h\x00\x00\x00e\x00\x00\x00l\x00\x00\x00l\x00\x00\x00o\x00\x00\x00' |
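The truncation is easy to reproduce with the standard library alone. The sketch below also shows one possible way around it, a fixed-size char buffer read through `.raw`; that workaround is an assumption for illustration and not necessarily what liblouis/liblouis#633 ended up doing.

```python
import ctypes

# "hello" as little-endian UTF-32: b'h\x00\x00\x00e\x00\x00\x00...'
data = "hello".encode("utf-32-le")

# c_char_p is a NUL-terminated string type, so reading the value back
# stops at the first zero byte.
truncated = ctypes.c_char_p(data).value

# A fixed-size char buffer keeps every byte; .raw returns the whole
# buffer without any NUL handling.
buf = ctypes.create_string_buffer(data, len(data))
roundtrip = buf.raw

print(truncated, roundtrip == data)
```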
Just filed liblouis/liblouis#633. I did not yet bother to create a try build due to several issues, including liblouis 3.7 refusing to build when building with NVDA's build system. Feedback is welcome though. |
Does their own NMake-based build still work? That seems to break every other release too. |
@jcsteh commented on 7 aug. 2017 00:25 CEST:
Actually, I'm afraid @josephsl has a point.
|
@leonardder commented on Sep 4, 2018, 6:44 PM GMT+10:
Sure, but that's because the bindings don't yet support both bit widths (regardless of Python bit width). Once that's fixed, this check can be removed from the bindings. That should be done as part of liblouis/liblouis#633. |
That's already covered in liblouis/liblouis#633 indeed. |
Ugh, there's one major thing I didn't think about yet. |
Hi,
As far as I know, this can break Braille Extender.
@Andre9642/BrailleExtender
From: Leonard de Ruijter <notifications@github.com>
Sent: Tuesday, December 4, 2018 9:10 AM
Ugh, there's one major thing I didn't think about yet.
When compiling liblouis with UCS-4, the cursor position breaks in a major way. This is because NVDA provides the cursor position based on the internal Python 2 unicode representation, which uses two offsets for characters that need surrogates in UTF-16, such as 😊. A switch to Python 3 would automatically fix this, I believe, since Python 3 unicode strings behave similarly to 4-byte widechars (i.e. one 4-byte liblouis widechar corresponds to one offset in a Python 3 string).
We could work around this by calculating the cursor position based on the UTF-32 representation of the text.
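As a rough illustration of that workaround (a Python 3 sketch with a hypothetical helper name, not the code that was later merged), a UTF-16 based cursor offset can be mapped to a code point offset by counting how many 16-bit units each character occupies:

```python
def utf16_offset_to_codepoint_offset(text, utf16_offset):
    """Convert an offset counted in UTF-16 code units to a code point offset.

    An offset that falls inside a surrogate pair maps to the character
    that contains it; an offset at or past the end maps to len(text).
    """
    units = 0
    for index, ch in enumerate(text):
        # Characters outside the Basic Multilingual Plane take two units.
        width = 2 if ord(ch) > 0xFFFF else 1
        if utf16_offset < units + width:
            return index
        units += width
    return len(text)


text = "a\U0001F60Ab"  # "a", the smiley from the comment above, "b"
# In UTF-16 the "b" sits at offset 3; as code points it sits at offset 2.
print(utf16_offset_to_codepoint_offset(text, 3))
```

The reverse mapping (braille cursor back to a UTF-16 offset) would need the same bookkeeping in the other direction.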
|
I've found a way to do this anyway. A pull request will be filed shortly. |
@LeonarddeR Should this issue be closed now that #9544 has been merged? |
Fixed in #9544 |