Not-locale characters clamping together #739

Open
AndyScull opened this Issue Jun 17, 2016 · 36 comments

AndyScull commented Jun 17, 2016

Versions

ConEmu build: 160612 x32 (stable)
OS version: Windows 7 x64
Used shell version: Far Manager

Problem description

Japanese characters overlap one another.
See the attached screenshots from build 141221, which does not have this problem.
The ConEmu text settings are the same: same checkboxes, fonts, and font sizes.
Switching to Monolength isn't acceptable, since it makes filenames with ASCII characters look really bad.

Steps to reproduce

  1. Update ConEmu from the 2014 build (141221) to the latest one.

Actual results

See the screenshots under "Additional files".

Expected results

The update should not break existing functionality.

Additional files

ConEmu 2016: some filenames do not keep the original character widths (which are probably defined in the TTF font?)
[screenshot: conemu_v2016]
ConEmu 2014: each character has its own width and is displayed correctly. Some characters get more space than the visible glyph needs, but at least they aren't drawn over one another.
[screenshot: conemu_v2014]
[screenshot: settings_comparison]

Owner

Maximus5 commented Jun 17, 2016

I need the exact characters that overlap.

AndyScull commented Jun 17, 2016

From what I know, Japanese letters and the general symbols they use start at 0x3000.
I used http://unicode-table.com/en/ to try a few different ranges and see what results I'd get.
Keep in mind that I kept using the MS Gothic font, so many characters were shown as squares (but the behavior of those squares differed).
0180 range - Latin Extended - no problems whatsoever
0370 - Greek characters - no problems
Arabic-like languages skipped, since they're right-to-left and the symbols appear at the start of the filename despite being typed at the end
0e00 - Thai - no problems
1800 - Mongolian - no problem, except that font rendering changes for the whole filename if there are visible Mongolian characters in the line. If the characters are cut off (left panel made narrower), the font returns to normal
1E00 - Latin Extended Additional - my font cannot show them, but the squares do not jump around as I change the panel width, and I believe they would show fine with a compatible font
2c60 - Latin Extended-C - same as above
2E80 - CJK radicals - the problem I mentioned appears.

Basically, to replicate this you can:

  1. Set the font to something that supports Japanese (with a random font, Japanese characters would still be shown using a substitute font and wouldn't be consistent with ASCII characters)
  2. Create a folder/file named "世界中のあらゆる情報を検索するためのツールを提供しています。さまざまな検索機能 を活用して、お探しの情報を見つけてください" (random string from Google)
  3. Change the Far panel width using Alt+Left / Alt+Right and watch how the file name behaves. If it's shorter than the file panel, everything is all right. When you shrink the panel, the letters are compressed too forcefully, and in the end you have to enable monowidth to be able to distinguish the characters
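
One possible reading of the range results above is that the trouble starts where Unicode classifies characters as East Asian Wide or Fullwidth. The sketch below checks that with Python's standard `unicodedata` module. The sample code points and the `is_wide` helper are illustrative assumptions; nothing here is taken from ConEmu or Far.

```python
import unicodedata

# One sample code point from each range tried above (illustrative picks).
samples = {
    "Latin Extended": 0x0180,
    "Greek": 0x0370,
    "Thai": 0x0E01,
    "Mongolian": 0x1820,
    "Latin Extended Additional": 0x1E00,
    "Latin Extended-C": 0x2C60,
    "CJK radicals": 0x2E80,
    "Ideographic space": 0x3000,
    "CJK ideograph": 0x4E16,  # 世
}


def is_wide(code_point: int) -> bool:
    """True if Unicode marks the character as East Asian Wide or Fullwidth."""
    return unicodedata.east_asian_width(chr(code_point)) in ("W", "F")


for name, cp in samples.items():
    kind = "wide" if is_wide(cp) else "narrow"
    print(f"U+{cp:04X} {name}: {kind}")
```

In this classification, only the samples from 2E80 upward come out as wide, which matches where the overlapping was observed.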

Maximus5 commented Jun 17, 2016

@AndyScull What is your OS? Show info from ConEmu/About/OS.

AndyScull commented Jun 17, 2016

Here you go:
ConEmu 160612 [32] Startup Info
OsVer: 6.1.7601.x64, Product: 1, SP: 1.0, Suite: 0x100, SM_SERVERR2: 0
CSDVersion: Service Pack 1, ReactOS: 0 (), Rsrv: 0
DBCS: 0, WINE: 0, PE: 0, Remote: 1, ACP: 1251, OEMCP: 866, Admin: 0
AppID: 41ff8172ae65e0896451e011f983072f::161
Desktop: Winsta0\Default, SessionId: 1, ConsoleSessionId: 4
Title: D:\Shell\Far\ConEmu.exe
Size: {0,0},{0,0}
Flags: 0x00000001, ShowWindow: 1, ConHWnd: 0x00000000
char: 1, short: 2, int: 4, long: 4, u64: 8
Handles: 0x00000000, 0x00000000, 0x00000000
Current PID: 15364, TID: 26648
Active HKL: 0x04090409
GetKeyboardLayoutList: 0x04090409 0x04190419

Maximus5 commented Jun 17, 2016

DBCS: 0, ACP: 1251, OEMCP: 866

And how do you imagine ConEmu would fit your string into console space that was not intended for it without shrinking?

You have two options: either install a CJK OS or do not use CJK.

AndyScull commented Jun 17, 2016

Oh... then I have a counter-question: why does the same thing happen on a fully Japanese Windows installation?

ConEmu 160612 [32] Startup Info
OsVer: 6.1.7601.x32, Product: 1, SP: 1.0, Suite: 0x100, SM_SERVERR2: 0
CSDVersion: Service Pack 1, ReactOS: 0 (), Rsrv: 30
DBCS: 1, WINE: 0, PE: 0, Remote: 0, ACP: 932, OEMCP: 932, Admin: 0
AppID: 478cb7f6f85177052b35c65a51839a7a::161
Desktop: Winsta0\Default, SessionId: 1, ConsoleSessionId: 1
Title: Z:\Far\ConEmu.exe
Size: {0,0},{0,0}
Flags: 0x00000001, ShowWindow: 1, ConHWnd: 0x00000000
char: 1, short: 2, int: 4, long: 4, u64: 8
Handles: 0x00000000, 0x00000000, 0x00000000
Current PID: 4200, TID: 4204
Active HKL: 0x04090409
GetKeyboardLayoutList: 0x04090411 0x04090409 0x04110411 0x04190419 0x08040804

AndyScull commented Jun 17, 2016

P.S.

> And how do you imagine ConEmu would fit your string in non-intended console space without shrinking???

The same way ConEmu v141221 did. That's why I wrote that the expected behavior when updating the program is not to lose current functionality.

Maximus5 commented Jun 17, 2016

> why does the same happens on complete japanese windows installation?

The same? I doubt it. Show screenshots of ConEmu and RealConsole on the DBCS system. And run chcp in that console: what codepage does it show?

> The same way ConEmu v 141221 did.

Previously, ConEmu trimmed glyphs that overran the intended rectangle. Just compare the screenshots. I'm sure partially displayed text is worse than compressed text.
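
The trade-off described here can be illustrated with a toy model. This is not ConEmu code: the function names `crop_run` and `compress_run`, the pixel widths, and the idea of a fixed cell width are all assumptions made for the sketch.

```python
def crop_run(glyph_widths, cells, cell_w):
    """Old behaviour as described: draw glyphs at natural width and drop
    whatever overruns the rectangle. Returns how many glyphs fit fully."""
    limit = cells * cell_w
    used = 0
    shown = 0
    for w in glyph_widths:
        if used + w > limit:
            break
        used += w
        shown += 1
    return shown


def compress_run(glyph_widths, cells, cell_w):
    """New behaviour as described: keep every glyph and shrink the run
    so that it fits the rectangle. Returns the scaled glyph widths."""
    limit = cells * cell_w
    total = sum(glyph_widths)
    scale = min(1.0, limit / total) if total else 1.0
    return [w * scale for w in glyph_widths]


# Ten CJK glyphs of 16 px each, written into ten cells of 8 px each.
widths = [16] * 10
print(crop_run(widths, cells=10, cell_w=8))      # 5 glyphs visible, 5 lost
print(compress_run(widths, cells=10, cell_w=8))  # all 10 kept at 8.0 px each
```

Under these assumptions, cropping keeps glyphs readable but silently hides half of the text, while compressing keeps all of the text at half its natural width.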

AndyScull commented Jun 17, 2016

OK, did it.

CHCP:
[screenshot: _chcp]

How it looks in plain Far:
[screenshot: _plain_far]

ConEmu (2016):
[screenshot: _settings]

And the specific long string I mentioned before:
[screenshot: _specific_long_filename]

RealConsole (cursor on the long filename):
[screenshot: _realconsole]

Maximus5 commented Jun 17, 2016

This looks like a bug in Far Manager.

On DBCS-enabled systems, a CJK character takes two cells instead of one. AFAIK Far ignores this fact, and that's an issue for Mantis (Far's bug tracker).

I'm not sure why your long string is displayed condensed in ConEmu. It depends on the exact location of the glyphs in the console. It seems that Far tries to fit data that exceeds the panel size, and ConEmu does its best to fit bad data...

AndyScull commented Jun 17, 2016

Oh well, OK, I'll stick to the old version then.
Even if the Far team acknowledges and fixes it, that would leave me with a VERY ugly view with a lot of spaces when I run Far without ConEmu (I don't always run it, only when I need to work with Japanese files and see their names).
Thanks for the help.

AndyScull closed this Jun 17, 2016

Maximus5 commented Jun 17, 2016

I can't understand why you prefer cropped content.

AndyScull commented Jun 17, 2016

Please elaborate: what exactly do you mean by cropped content?
If it's about my choice to stick with the old version, that's because in the new version I get garbled and unreadable 'content', which is no content at all. The old version has its disadvantages (mainly cutoff on the right in dialog boxes for very long strings), but at least my files are shown correctly and I see nice, charming words, as if I'd copied them into Word. With the new version I'd have to try hard to read any of the long filenames, or, if I used monowidth, I'd get ugly spaced-out ASCII. IMO, the old version totally wins this comparison.
Just to note, I don't need and don't use any of ConEmu's fancy features except the main ability to show Unicode filenames without switching locale, mucking with system fonts, et cetera. I'd use Total Commander, as it does this even better, but it sucks when working with the command line.

Maximus5 commented Jun 26, 2016

OK, more reasons and description.

In your own example, you have really long text in your console.

[screenshot: edited]

ConEmu can't do any magic. It shows in the virtual console the data that the console application printed. Let's take WinWord as an example. Would you like it if the text you typed ran past the page margins? What happens if you send this doc to a printer? You would get only half of the book, which makes absolutely no sense. You can't guess what was in the cropped text (which overruns the page margins).

What would happen when you try to copy this cropped string (old ConEmu behavior) from the console? It doesn't matter whether you use Far's grabber (Alt+Ins) or ConEmu's internal Copy. The behaviour would be weird: you would be copying something that does not exist on screen.

Well, some fun below:

[screenshot: 2016-06-26_14-10-22]

With the old behavior, when overruns were just dropped (cropped), you got this:

[screenshot: 2016-06-26_13-59-45]

And you just didn't see information that may be valuable:

[screenshot: 2016-06-26_13-59-58]

On a DBCS-enabled OS, a properly designed console application knows that CJK takes two cells and does not try to print more data than possible. Far was not designed for CJK; that's why I suggested you complain on Mantis.

AndyScull commented Jun 26, 2016

To prove my point, a single screenshot that demonstrates:
[screenshot: image]

  1. How text would be cropped if FAR used 2 cells for DBCS chars.
    The upper filename is 31 DBCS chars long (to count, I used a 60-digit file). This is how FAR would print it if the 'bug' was fixed.
    The lower filename is 62 DBCS chars long. It goes beyond the end of the tab.
    Now, tell me, which line of information should be tagged as 'cropped'? Especially consider that for the 31-char line, you don't even know that it is cropped. The filename may actually be 60 characters long, but FAR would print only 31 of them (one per 2 cells, right?), and ConEmu would nicely fit each of those 31 characters to the left (unless you enable monowidth, which looks ugly).
  2. How file information is not overlaid by text in my case. I see all the size/date tabs correctly. There must be some different font settings in your case, or maybe a different version of ConEmu? I couldn't replicate the same behavior in my installation.

//edit

> Lets take WinWord for example

Bad comparison. WinWord crops not by number of characters but by the glyph width of the line (cm/pixels). That's exactly how old ConEmu works, so you're actually rooting for my team here.

Maximus5 commented Jun 26, 2016

  1. You are wrong. If the bug were fixed in Far, it would print the following:
║世界中のあらゆる情報を検索するためのツールを提供しています。さまざまな検索機能 を活用}Folder ║

Then you would be able to scroll long names as usual in Far with Alt-Left/Alt-Right. For example:

{界中のあらゆる情報を検索するためのツールを提供しています。さまざまな検索機能 を活用し}Folder ║

That behaviour of the console application would be absolutely proper, and ConEmu would not crop/drop/whatever any parts of the text. And there would be no overlaps either.

> Now, tell me, which line of information should be tagged as 'cropped'?

You can easily see that part on the left. You have no idea at all that after it there is the long string て、お探しの情報を見つけてください. This data was cropped/dropped/hidden/...

  2. I showed you part of the status bar. There are no vertical bars; the text is printed as one continuous line. It's not correct to point at one example without mentioning a lot of other variants.

Look, I am not telling you that the current implementation is ideal. ConEmu just does its best to show what the console application asks it to show.

AndyScull commented Jun 27, 2016

> You are wrong. If the bug would be fixed in Far, it would print the following

Erm. Then you've probably lost me. I'll try to explain how I think, and please correct me where I am wrong.

  1. Didn't you say that the proper width for DBCS characters is 2 cells? Did you mean character placeholders, like 80x25 in the old-school default console size, or is it something else?
  2. If so, any proper console program should print one DBCS character per 2 cells.
  3. If so, it can fit width/2 characters into any given range of cells on the virtual console screen. And there will be spaces, like when I enable monowidth in ConEmu.
  4. If so, FAR would do the same. Not that the characters would really be shown, as FAR seems to be missing the required fonts on non-Japanese versions of Windows. Instead, it shows square placeholders.
  5. Now see my previous screenshot and note the right part of it with the Real Console output. FAR fits all the characters it can, showing them as squares. If it were DBCS-aware, it would print one character per 2 cells, effectively shortening the shown filename length to half of the tab width.
  6. Now, if FAR printed 30 characters of a long filename, ConEmu wouldn't go out of its way to find the file, get its name, and expand it to the end of the tab, right? ConEmu would just 'convert' the existing strings to proper Unicode glyphs, using whatever font is specified in the settings, in the process shrinking the whole line of text, as a lot of glyphs are less than 2 cells wide.
  7. Then you'll get a situation like in my screenshot, where 30 characters are printed aligned to the left, followed by a lot of empty space because FAR provided only those 30 chars and nothing more.

Actually, this little screenshot shows how it prints DBCS chars on native Japanese Windows, so I believe I made no mistakes in my logical chain:
[screenshot: image]
There are 2 digits per Japanese character, so a 'fixed' FAR on a Japanese system would show half as many of these characters as it shows now, limiting the string length to 30 chars. And if I didn't use a monowidth font in ConEmu, all glyphs would get packed tightly to the left, leaving unused space up to the end of the tab (unless I use a monolength font, which would probably look very ugly with non-Japanese filenames).

//edit
After a few more tests, I can't say definitively how the current FAR works on native DBCS locales. Without ConEmu, it mostly shows 1 char per 2 cells, but sometimes it starts clamping them together in every cell...
I won't post this bug to the FAR forum because I have no guarantee that a fixed FAR would work the way it works now on non-DBCS Windows. That's how I use it now, and I'm pretty happy with my experience. Japanese characters are shown correctly without using Total Commander and without changing the whole system locale. Upgrading to the latest ConEmu version was just an experiment to see what was fixed and improved. Unfortunately, it didn't improve anything for me, so I am staying on the old version. Probably forever.

Maximus5 commented Jun 27, 2016

DBCS versions of Windows work absolutely differently from non-DBCS ones.
When you run an application on DBCS Windows and use a double (four) byte codepage (like 932), each CJK character really takes two cells. Take a look at COMMON_LVB_LEADING_BYTE and COMMON_LVB_TRAILING_BYTE in the CHAR_INFO description. This looks absolutely weird and unbelievable at first sight, but it is actually the only way to print and display CJK using the [A] console functions. Moreover, even if a console application uses the [W] functions (as Far does) to write wchar_t sequences, the console doubles each CJK character (the first cell gets the COMMON_LVB_LEADING_BYTE flag and the second gets COMMON_LVB_TRAILING_BYTE), so you have this glyph in TWO cells; otherwise the [A] functions would fail to read the 932 codepage!

The only exception is codepage 65001. It uses one real Unicode (wchar_t) cell.

The console window (conhost.exe) and ConEmu know about that and display the sequence of cells (COMMON_LVB_LEADING_BYTE ... COMMON_LVB_TRAILING_BYTE) as one CJK glyph.

So, when you are using CJK Windows, console applications must work differently than on non-CJK Windows.
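
The display side of this can be sketched in a few lines. This is a simplified model, not conhost or ConEmu code: a row is assumed to be a list of `(character, attributes)` pairs, and `render` is a hypothetical helper. Only the two flag values are real; they are the ones defined in `wincon.h`.

```python
COMMON_LVB_LEADING_BYTE = 0x0100   # flag value as defined in wincon.h
COMMON_LVB_TRAILING_BYTE = 0x0200  # flag value as defined in wincon.h


def render(cells):
    """Collapse each leading/trailing pair of cells into one glyph.

    `cells` is a list of (character, attributes) tuples, a simplified
    stand-in for a row of CHAR_INFO entries.
    """
    glyphs = []
    for ch, attrs in cells:
        if attrs & COMMON_LVB_TRAILING_BYTE:
            continue  # second half of a wide glyph, already emitted
        glyphs.append(ch)
    return "".join(glyphs)


# Four buffer cells that display as three glyphs.
row = [
    ("世", COMMON_LVB_LEADING_BYTE),
    ("世", COMMON_LVB_TRAILING_BYTE),
    (" ", 0),
    ("A", 0),
]
print(len(row), repr(render(row)))
```

The point of the model is that the buffer holds more cells than the number of glyphs the user sees, so any code that counts cells and any code that counts characters will disagree on CJK text.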

Maximus5 commented Jun 27, 2016

> If so, it can fit width/2 characters for any given range of cells on virtual console screen. And there will be spaces, like when I enable monowidth in ConEmu

So, you are wrong here. There would be no spaces, only CJK glyphs, which are (unfortunately) doubled in the conhost internal buffer (taking two cells) but are displayed as one wide glyph.

Maximus5 commented Jun 27, 2016

A simple test in C++ is coming:

  1. Obtain the current cursor position
  2. Write two Unicode characters L"世 " (CJK + 0x20) using WriteConsoleW
  3. Read three wide chars from the cursor position (from step 1) using ReadConsoleOutputW
  4. Go crazy :(
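
The steps above call the real Windows console API, which only shows this behaviour on a DBCS system. As a platform-independent stand-in, here is a small Python simulation of what those steps are expected to show according to the description in this thread. `write_cells` and `read_cells` are hypothetical helpers that model the buffer; they are not wrappers around WriteConsoleW or ReadConsoleOutputW, and the wide-character test via `unicodedata` is an assumption.

```python
import unicodedata

LEADING, TRAILING = 0x0100, 0x0200  # COMMON_LVB_* flag values from wincon.h


def write_cells(text):
    """Model a DBCS console buffer: each wide character fills two cells."""
    cells = []
    for ch in text:
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            cells.append((ch, LEADING))
            cells.append((ch, TRAILING))
        else:
            cells.append((ch, 0))
    return cells


def read_cells(cells, start, count):
    """Model reading `count` cells back from position `start`."""
    return cells[start:start + count]


text = "世 "  # two characters: CJK + 0x20
buffer = write_cells(text)
print(len(text), len(buffer))        # characters written vs. cells occupied
print(read_cells(buffer, 0, 3))
```

Two characters go in and three cells come back, which is the surprise the test is meant to demonstrate.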
@AndyScull

This comment has been minimized.

Show comment
Hide comment
@AndyScull

AndyScull Jun 27, 2016

I'm not a programmer, least of all in C++.
I'd like to hear specific answers - if FAR was fixed and DBCS-aware, how many characters would be shown in 60-cell wide tab in DBCS, and non-DBCS windows. And how would they look mixed with ascii characters? And how many characters would be seen in ConEmu window? And consider it all with non-monowidth font...
And then compare that with how many characters I see now in the old ConEmu, which could be described by the phrase 'however many glyphs fit into the tab space'.

So, you are wrong here. There would be no spaces. Only CJK glyphs which are (unfortunately) doubled in conhost internal buffer (taking two cells) but are displayed as one wide glyph.

Well, I mean spaces between glyphs. Console should use monowidth font for those, so there should be more spacing between actual character lines when compared to pure graphic output


Maximus5 Jun 27, 2016

Owner

I'm preparing tests and screenshots. Later today...

Console should use monowidth font for those,

What do you mean? On a DBCS system a "monowidth" font has two different widths: one for double-width (CJK, full-width) characters and one for single-width (plain ASCII) characters.

PS. Is it possible to discuss in Russian to avoid translation problems?


AndyScull Jun 27, 2016

Yep.
English would be more accessible to other users, in case someone else ever runs into the same trouble as I did.
By monowidth in the console I mean something like the Courier fonts: the glyph of each character is padded with empty space up to a fixed width that is the same for the whole font. On DBCS this is either the cell width or twice the cell width. Perhaps the font itself is not monospaced, and it is the console programs that draw the character in the middle of a 1-cell or 2-cell wide space. As a result, quite a lot of empty space remains between the characters themselves, and the text comes out wider than if the same text were typed in Word with a non-monospaced font. The text is generally readable, but fewer characters fit on the screen.


Maximus5 Jun 29, 2016

Owner

Here are some tests: https://github.com/Maximus5/Write-Read-Test

how many characters would be shown in 60-cell wide tab in DBCS, and non-DBCS windows.

RealConsole (conhost.exe) physically can't show more than 30 full-width glyphs in a 60-cell console. Otherwise they wrap to the next line. Each CJK character takes two cells in any case on a DBCS OS.

And how would they look mixed with ascii characters?

Exactly as they must. ASCII (half-width) would take single cell.

And how many characters would be seen in ConEmu window?

Same as in RealConsole. There would be no compression at all, because every glyph would get the space it needs. On a DBCS OS, of course.

And consider it all with non-monowidth font...

Arial? Times New Roman? Tahoma? Awful... regardless of CJK or not.
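
The 60-cell arithmetic from the answers above can be sketched as follows. This is an illustration only: `isFullWidth`, `cellWidth` and `charsThatFit` are hypothetical helpers, and the full-width check is a crude range test rather than the rule conhost or ConEmu actually use.

```cpp
#include <cstddef>
#include <string>

// Crude approximation (assumption): treat U+3000..U+9FFF and the
// full-width forms U+FF01..U+FF60 as double-width characters.
bool isFullWidth(char32_t ch) {
    return (ch >= 0x3000 && ch <= 0x9FFF) || (ch >= 0xFF01 && ch <= 0xFF60);
}

// Number of console cells the text would occupy on a DBCS system:
// two per full-width character, one per half-width character.
std::size_t cellWidth(const std::u32string& text) {
    std::size_t cells = 0;
    for (char32_t ch : text) {
        cells += isFullWidth(ch) ? 2 : 1;
    }
    return cells;
}

// Number of leading characters of text that fit into maxCells cells.
std::size_t charsThatFit(const std::u32string& text, std::size_t maxCells) {
    std::size_t used = 0;
    std::size_t count = 0;
    for (char32_t ch : text) {
        const std::size_t w = isFullWidth(ch) ? 2 : 1;
        if (used + w > maxCells) break;
        used += w;
        ++count;
    }
    return count;
}
```

For a string of 40 copies of 世, `charsThatFit(text, 60)` gives 30, matching the limit stated above, while 40 ASCII letters all fit because each takes a single cell.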


@Maximus5 Maximus5 reopened this Jun 29, 2016


AndyScull Jun 29, 2016

Arial? Times New Roman? Tahoma? Awful... regardless CJK or not.

And here's the answer. It would definitely look awful for me if this bug in FAR was fixed. That's why I won't report it and prefer you'd not do it too.
I can live with older ConEmu version, but new versions of FAR are a must, they fix and add a lot of things. If at some point they'd 'fix' it, I'd either have much shorter CJK strings (compared to before), or would stick with outdated FAR version.
So just forget all this issue please. Anyway, I may be forced to switch to linux in 5 years...


Maximus5 added a commit that referenced this issue Jul 8, 2016

gh-739: New option ‘Compress long strings to fit space’ is turned on by default.

  By unchecking that option you'll get ‘old’ behavior, when ConEmu just trims text,
  which overruns dedicated space. Read comments in the issue for details:
  #739

AndyScull Mar 20, 2017

Thanks for the option! I only noticed it now and have finally updated my ConEmu version.

There's still a minor difference in character output compared to the old version, though: for some fonts, characters are not centered in their cell, with very little space on the left and a lot on the right. Since Monowidth doesn't do anything for them, I assume it has something to do with the font itself
(it happens with 'MS UI Gothic' but not with 'MS Gothic').
It's not something that really needs fixing, since I'm fine with changing the font, but who knows, maybe it can be fixed with a single line of code...

This is how text looks in the old version of ConEmu (size 20 MS UI Gothic, cell 0; I have been using this combination ever since I found it): [screenshot: msuigothic_oldconemu_cell0]

This is the new version with the same settings (I copied conemu.xml and disabled the 'compress long strings' option):
[screenshot: msuigothic_cell0]
The リ character is obviously shifted a little to the left; this is clearly visible if I edit the filename and select this character to see where its glyph ends.
This is the same string with cell=12:
[screenshot: msuigothic_cell12]
Readable but not very pretty : ) The gap could be mistaken for a space character.
So I tried other fonts, and surprisingly it displays correctly with MS Gothic:
[screenshot: msgothic_cell0]

It seems to happen only with fonts that aren't monowidth.
Here we have MS Gothic Japanese characters aligned as 2 ASCII cells:
[screenshot], and with a selection to show the actual placeholder: [screenshot]

And here's MS UI Gothic:
[screenshot], and with a selection: [screenshot]

That just feels like an error somewhere in the text handling code, so I am going to close this issue (since everything else works as intended), and if you want you can look into this further at your own pace.


@AndyScull AndyScull closed this Mar 20, 2017


Maximus5 Mar 20, 2017

Owner

If you think the output may be improved (it looks like it can), reopen the issue and post the file/text where the problem occurs.


AndyScull Mar 23, 2017

Well, if you have the time to fix it :) I can live with the current situation, though.

Examples of broken text (all with font MS UI Gothic, size=20, width=0, cell=0, all checkboxes unchecked):
淫行 - complex kanji often overlap. I'll give one example, but almost all of them have the wrong width:
[screenshot]
It is easier to see if you select one character in the text: the selection immediately crops that character.

The space is too thin (and is not affected when I enable monospace and cell=12):
cell=0, monospace disabled: [screenshot]
This is with cell=12: [screenshot]
The string is the same as in the first example; I just quickly typed a space between the characters to make the screenshots.

リ - [screenshot]
From what I can find, the same problem (too little space on the left and too much on the right) occurs with ッ, ク, タ, イ, ド, し.
There is no problem with ム, ー, い, ん, ス. Maybe they're too wide to have this problem, or it somehow depends on the Unicode code point.

CJK exclamation mark ！ (U+FF01): [screenshot]
Though even in my current browser it isn't centered in its placeholder.


@AndyScull AndyScull reopened this Mar 23, 2017


TGhoul Jul 20, 2018

@AndyScull Chinese characters have the same problem. Is there any solution now?

[screenshot: 1532072886 1]


AndyScull Jul 20, 2018

Sadly, no, I just use an old version of ConEmu, from 2014. I don't use the newer features; all I need is correct display of Unicode characters, and it does that.
The exact version is 141221 [32bit]


TGhoul Jul 20, 2018

@AndyScull Thank you very much, your answer is very helpful to me.


Maximus5 Jul 23, 2018

Owner

I don't think the issue is still relevant in current ConEmu builds. The option "Compress long strings to fit space" has existed for a long time.
There is no sense in using old builds.


TGhoul Jul 24, 2018

@Maximus5 Unfortunately, the option "Compress long strings to fit space" doesn't work in current ConEmu builds. This is my screenshot:

[screenshot: 1532394967 1]


AndyScull Jul 24, 2018

There is no sense in using old builds

I respectfully disagree with that.
This is from the old version of ConEmu, pure cmd output:
[screenshot]
And this is from the latest version:
[screenshot]
"Compress long strings to fit space" is irrelevant here, since the strings in my output aren't long enough to be affected by it.


Maximus5 Jul 31, 2018

Owner

Isn't it better to ping the issue?


Maximus5 Jul 31, 2018

Owner

@TGhoul So, would you prefer to lose the tail of the string completely in exchange for CJK glyphs not clamping together?


@Maximus5 Maximus5 added this to To Do in Drawing via automation Jul 31, 2018
