Right to left language (like Hebrew) causes name and size columns to be switched #4138
Comments
What is your locale? |
cat /etc/locale.conf |
Replying to ygoldfill:
Please show the output of command locale and files ~/.config/mc/in and ~/.config/mc/panels.ini |
panels.ini is empty |
|
mc-4.8.25. Works for me. See attached screenshot. |
4.8.24 built from source works too. |
Strange: I just tested it on the Linux virtual machine (Ubuntu-based) on my Chromebook and got the same results. The mc version there is 4.8.18. The locale is the same. |
Replying to ygoldfill:
You must run ./autogen.sh at first. |
It seems to depend on the system. On systems (like Fedora 31) where Hebrew is displayed properly (not reversed) when issuing 'ls' in the shell, mc swaps the columns. |
Same issue here. Right to left language, namely Hebrew, causes name and size columns to be switched. But only when mc runs on X-windows. Not when mc runs on a virtual console. On the virtual console I get Hebrew characters as filled squares, not readable characters. Because I don't have Hebrew fonts for the virtual console. Still, the switching of the columns does not happen on the virtual console. With X-Windows I am also using -S sand256 as a single command line option, if that matters.
(~/.config/mc/panels.ini is empty)
It could be that ygoldfill's #comment:10 holds here. But I am confused, because I think that sometimes I get Hebrew reversed, though possibly not in ls output. As I wrote, I am not sure why I think Hebrew words are sometimes reversed. |
More information for my #comment:11:
I have an old Linux liveCD, from 2017, on which the problem is not seen. I will provide the details of its mc version, and so on, if requested. However, I think that ygoldfill's #comment:10 is more relevant. I also wonder if the root of the problem is the terminal and its underlying engine.
But before that, I think we should examine the behavior on a Linux virtual console.
$ setfont /usr/share/kbd/consolefonts/LatArCyrHeb-14.psfu.gz
This font, which provides Latin, Hebrew, and 2 other character sets, might help. I believe many users have that file on their system. With such a font in use, the behavior I get on the virtual console is that both an ls command from the shell and mc display Hebrew file names in the reverse, wrong, direction. As if they were written from left to right. Not from right to left, as they should be. The names are also justified to the left. Not what a right-to-left reader is used to. Yet, on the virtual console, mc does not switch the order of the name and size columns. That is the behavior that I get here.
Leaving the virtual console aside, let us return to X-windows.
Looking again at the 2 screenshots that were posted earlier: ygoldfill's screenshot, https://github.com/MidnightCommander/trac-archive/blob/master/attachments/ticket/4138/Screenshot%20from%202020-10-22%2017-35-43.png, manifests the bug, while the Hebrew file names are displayed as they should be. The names are also justified to the right, as expected. In contrast, andrew_b's screenshot, https://github.com/MidnightCommander/trac-archive/blob/master/attachments/ticket/4138/Screenshot%20at%202020-10-25%2010-53-10.png, does not manifest the bug. But that screenshot does display the Hebrew file names in the wrong direction. The names are justified to the left, which is not the usual
Since andrew_b's screenshot behaves the same as a Linux virtual console, could the real problem be that there is no consensus on who, that is at which level, should handle right-to-left languages? Alternatively, should mc approach this bug by first obtaining a reasonable display on the virtual console? |
Replying to ZGMxYWFh:
There was mate-terminal-1.12.1 based on libvte-0.28.2. |
Replying to andrew_b:
sakura is a terminal from a 2017 liveCD, actually a liveDVD, that I have. It does not manifest the bug. Within sakura, mc displays Hebrew file names justified to the left. As if they were Latin names. And in an LTR, left-to-right, manner. Not RTL, right to left, as Hebrew requires. It uses libvte-2.91. And so does the newer lxterminal, in which I can see the bug. Still, doesn't it seem convincing that there is a tie between the bug and the terminal in which mc runs? I have copied the ldd output for my current, 2021Q4, lxterminal at the bottom of this message. It does mention usage of libfribidi. Perhaps this is the cause of the bug?
Other than the ldd output at the bottom of this message, all the following output is for mc 4.8.19 running in the sakura 3.3.4 terminal. That is, with packages from 2017.
So far for the liveDVD from 2017.
|
Replying to ZGMxYWFh:
Definitely yes. I've updated my libvte3 up to 0.58.3 with bidi support enabled and got the swapped name and size columns. |
|
Replying to andrew_b:
Firstly, I am not sure how you configured MC to show a column after the name column that is not numeric-only. Before writing down how I got it, I would like to mention that I suspect the problem is not only the swapping of the columns. It could be related to the inconsistency in the justification of RTL file names, such as Hebrew ones. I tried changing my MC to Left ⇒ Listing Format… ⇒ Long file list. The RTL file names in the Name column, which were previously right justified, became left justified. Getting back to Left ⇒ Listing Format… ⇒ Full file list changed it back to right justification. I think this too is a bidi issue. Which is probably handled, or should mostly be handled, by the bidi layer of the underlying terminal.
I used how can i set the default user defined listing mode in midnight commander and Creating Custom Format to get a non numeric only column after the file names column. I stored a suitable format in .config/mc/panels.ini. And after exiting and reactivating MC, I could activate that format at Left ⇒ Listing Format… ⇒ User defined. And the outcome was like you wrote. No columns were swapped. However, in my case, the justification of the Hebrew file names changed as well. They became left justified. I guess the same happened to you. But you forgot to mention it. Possibly because perhaps you are used to left justification.
As I wrote above, even though it looks like the swapping of columns depends on the listing format, I think this is not the full picture. Do note that even when the swapped columns are not seen, the wrong justification of Hebrew file names is there. The reason is that the problem is, probably, not only the swapped columns. I suspect it is also about the justification of the file names. In order to see that, I think we should talk about bidi at more length. I have already written that I suspect the justification of the names and the swapped columns issue are related. I haven’t proved that. I also suspect, but haven’t proved, that the justification problem is bidi related. If that is not enough, my knowledge about bidi is poor. Still, I hope that if you follow, you will see that my suspicions might stand on solid ground.
Many years ago, many people thought the RTL issue could be settled by simply mirroring the order of characters in a line. Have the last character become the 1st character, the 2nd to last character become the 2nd character, and so on. Problem solved. It turned out to be far more complex than that. Possibly, but perhaps not only, due to having a mixture of LTR and RTL writing in the same line. Which is why they created quite a complex algorithm. As can be seen at Bidirectional text, there are strong characters. Weak characters. Neutral characters. Strong direction of the line. Weak direction for only some parts of the line. Artificial characters for directional enforcement. And some more definitions that are used to state, and implement, the bidi algorithm. My main point here is that the algorithm may not be able to handle everything by itself. It must interact with the user. Sometimes it needs hints from the user. I think in our case, that user is the MC programmers. I think the MC program must control what libfribidi does.
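To make the strong, weak, and neutral categories mentioned above a little more concrete, here is a minimal Python sketch that only looks up the Unicode bidi class of each character. It uses the standard `unicodedata` module and does not run the bidi algorithm itself. The sample string and the helper names `bidi_classes` and `has_strong_rtl` are made up for illustration; they are not taken from MC or libfribidi.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidi class) pairs for each character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

def has_strong_rtl(text):
    """True if any character is strongly right-to-left (class R or AL)."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)

# Hypothetical mixed file name: Latin letters, space, Hebrew letters, digits.
sample = "ab \u05d0\u05d1 12"

for ch, cls in bidi_classes(sample):
    # L and R are strong classes, EN (European number) is weak,
    # WS (whitespace) is neutral.
    print(repr(ch), cls)
```

A helper like `has_strong_rtl` could, in principle, tell a program which file names are likely to be reordered by a bidi-aware terminal. Whether that is useful for MC is not established in this thread.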
My bottom line is that, in my opinion, MC probably needs a programmer who knows the bidi algorithm to fix the column swapping and the justification of the file names. And I am not sure there are no more issues. Such as locales that put dates, perhaps names of months, in an RTL language. Knowledge of the bidi algorithm will probably help such a programmer.
Thinking again about what looks to me like your persistence in ignoring the justification issue: perhaps you are trying to imply that the problem should be solved in smaller, measurable steps. As if you were saying: let us attempt to solve just the swapped columns issue that was raised here. Forget, for now, about possible month names in an RTL language. Or about justifying file names in the way that readers of RTL languages expect. Hopefully, solving the swapped columns issue will bring us to a better position. And then we will see how to continue. In that case, perhaps the first step should be for MC to insert an LTR-enforcing artificial character at the end of each column. That is, in each column, and for each line of that column, MC will insert a Left To Right enforcing artificial character. Putting that in other words: if today MC has
Where X is any character, of any language, including white space. And a bar is the column separator character. Then after the change MC will have
where c is the bidi pseudo character that enforces a Left To Right direction. It is a pseudo character partially because the human user will not see it. It has no width. It is more a control sequence than a character of the language. It is injected into the stream the terminal receives so that its bidi layer can arrange the actual characters correctly. Of course, the Linux console, and any terminal that is not bidi aware, will get confused by that pseudo bidi character. |
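As a rough illustration of the proposal above, the following Python sketch appends a zero-width U+200E LEFT-TO-RIGHT MARK to every column cell before the separator. It assumes U+200E is the character meant by c; the function name `panel_line` and the sample row are hypothetical, and this is not how MC builds its panel lines. Whether this would actually stop a bidi-aware terminal from swapping the columns has not been verified here.

```python
LRM = "\u200e"  # LEFT-TO-RIGHT MARK: zero-width, strongly left-to-right

def panel_line(cells, sep="|"):
    """Join column cells, appending an LRM to each cell before the separator."""
    return sep.join(cell + LRM for cell in cells)

# Hypothetical row: a Hebrew file name, a size, and a modification time.
row = panel_line(["\u05e9\u05dc\u05d5\u05dd.txt", "4096", "Oct 22 17:35"])

# Show the otherwise invisible marks so the result can be inspected.
print(row.replace(LRM, "<LRM>"))
```

Stripping the marks gives back the original visible text, so the column widths computed from visible characters would be unaffected. A terminal without bidi support may still render the mark as a placeholder glyph.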
|
I hope the following can be reproduced by many users. And supports my claim that the bidi layer of the terminal is the one involved in the undesired output. Because MC does not insert bidi control sequences into its output.
Edit: It could be important to mention that within LibreOffice Writer, I have used here
and
|
Hi ZGMxYWFh,
I think you misunderstand the situation. It's much simpler than you believe. None of us can speak an RTL language, so we don't really understand your problem, and can't provide help fixing it - we just don't have the skills and/or time for it. I think mooffie could speak and read Hebrew, but very unfortunately he vanished a long time ago.
If you can provide a patch that fixes RTL without breaking anything else, then we could try to help you to get it in. Otherwise, we'll have to wait until someone else can help us.
All the best, |
Replying to zaytsev:
The problem is shown in the first screenshot.
Unfortunately, that's true. |
Replying to zaytsev:
Hello Yuri
I think I do understand the situation. Sort of.
That is true.
That is partially wrong. I am quite sure many of you are well aware of https://en.wikipedia.org/wiki/Mirror_writing.
That is totally wrong. Absolutely wrong. You can provide help. You have the skills.
Unfortunately, with that I absolutely agree with you.
What I think has to be done is
As for a patch, and time, that is far beyond my ability.
Same for you: all the best. |
|
i think the correct pair would be FSI+PDI, see https://www.w3.org/International/questions/qa-bidi-unicode-controls for example.
anyway, the question would be how to get those characters into the stream, given that it's auto-generated by ncurses. i googled around a bit, but ncurses+bidi yields only open questions. i think it would need to gain an attribute bit for isolated text or something like that. |
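For reference, the FSI+PDI pair mentioned above can be sketched on a plain string as follows. This Python snippet only wraps text in U+2068 FIRST STRONG ISOLATE and U+2069 POP DIRECTIONAL ISOLATE; it does not address the open question of how to pass such characters through ncurses. The helper names `isolate` and `strip_isolates` are made up for illustration.

```python
import unicodedata

FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE

def isolate(text):
    """Wrap text so a bidi-aware renderer treats it as an isolated run."""
    return f"{FSI}{text}{PDI}"

def strip_isolates(text):
    """Remove the isolate controls, e.g. for a terminal without bidi support."""
    return text.replace(FSI, "").replace(PDI, "")

# Hypothetical Hebrew file name, isolated from the neighbouring columns.
name = isolate("\u05e9\u05dc\u05d5\u05dd.txt")
print([unicodedata.name(ch) for ch in (name[0], name[-1])])
```

With an isolate, the direction of the wrapped text is taken from its own first strong character, and the surrounding text is not supposed to be affected by it. That is the property that makes it a candidate for per-column wrapping.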
Two small notes, but not the real work. (1) libraqm
I have come across libraqm. I only briefly skimmed its description. Nothing else. I believe it is used by imagemagick. I am not aware of any other usage. I don't know how libraqm compares to other, similar, libraries. In fact I don't know of, and haven't searched for, similar libraries. Quoting the first line of its description
(2) GLib Unicode Support
While staring at (1), I realized mc already depends on GLib. Which has a Unicode Support component. Which might already be in use by mc. Quoting GLib Unicode Support
The words break type made me think it could be tightly related to the FSI+PDI, or whatever is required, by comment:23. I haven't pursued it further. Just a hunch I am putting on the table. |
Important
This issue was migrated from Trac:
ygoldfill (ygoldfil@….com)
u34@….ga, mooffie (@mooffie), dickey (dickey@….com)

When displaying directories or file names that are of both English and Hebrew, the size column contents is shifted to the left and the name is shifted to the right for Hebrew names.
$ LC_MESSAGES=C mc -V
GNU Midnight Commander 4.8.24
Built with GLib 2.62.4
Using the S-Lang library with terminfo database
With builtin Editor
With subshell support as default
With support for background operations
With mouse support on xterm and Linux console
With internationalization support
With multiple codepages support
Virtual File Systems: cpiofs, tarfs, sfs, extfs, ftpfs, sftpfs, fish, smbfs
Data types: char: 8; int: 32; long: 64; void *: 64; size_t: 64; off_t: 64;
$ mc --configure-options
Note
Original attachments:
ygoldfill (ygoldfil@….com) on Oct 23, 2020 at 0:56 UTC
andrew_b (@aborodin) on Oct 25, 2020 at 7:55 UTC
ZGMxYWFh (u34@….cf) on Oct 28, 2021 at 12:21 UTC
ZGMxYWFh (u34@….cf) on Nov 3, 2021 at 16:26 UTC