East asian characters are not aligned correctly in console output #604

spooning · 2014-06-29T22:21:42Z

Originally submitted to Google Code by xieyanbo on 1 Aug 2010

Some Unicode characters, notably East Asian ones, are two columns wide in a terminal, which causes pybot's console output to be aligned incorrectly.
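As a rough illustration of the width problem (a sketch only, not the attached patch), the standard `unicodedata.east_asian_width` function can be used to count terminal columns instead of characters. The helper name `display_width` is hypothetical, and the code is written for Python 3 rather than the Python 2.6 used in this report.

```python
import unicodedata


def display_width(text):
    """Return the number of terminal columns `text` is expected to occupy."""
    # Characters classified as 'W' (wide) or 'F' (fullwidth) take two columns;
    # everything else is counted as one column in this simplified sketch.
    return sum(2 if unicodedata.east_asian_width(ch) in "WF" else 1
               for ch in text)


name = "汉字应该正确对齐"
print(len(name), display_width(name))  # prints "8 16"
```

Padding the test name by `len(name)` instead of `display_width(name)` leaves the status column shifted by one column per wide character.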

Robot Framework 2.5 (Python 2.6.5 on darwin)

Demo:

0$ cat test_east_asian_width.txt
*** test cases ***
汉字应该正确对齐
    Log    Hello world!
#0$ pybot test_east_asian_width.txt

Test East Asian Width

汉字应该正确对齐 | PASS |

Test East Asian Width | PASS |
1 critical test, 1 passed, 0 failed
1 test total, 1 passed, 0 failed

After applying the patch:
#0$ pybot test_east_asian_width.txt

Test East Asian Width

汉字应该正确对齐 | PASS |

Test East Asian Width | PASS |
1 critical test, 1 passed, 0 failed
1 test total, 1 passed, 0 failed

The patch and a test case are attached.

spooning · 2014-06-29T22:21:43Z

Originally submitted to Google Code by xieyanbo on 1 Aug 2010

See also: East Asian Width, http://unicode.org/reports/tr11/

spooning · 2014-06-29T22:21:44Z

Originally submitted to Google Code by @pekkaklarck on 4 Aug 2010

Thanks for the bug report and patch. I was able both to verify the problem and to confirm that the patch fixes it.

Importing the unicodedata module used here is, unfortunately, very slow with Jython:

$ time jython -c "import sys"
real 0m5.243s
user 0m5.664s
sys 0m0.392s

$ time jython -c "from unicodedata import east_asian_width"
real 0m10.867s
user 0m15.545s
sys 0m0.488s

Applying the patch in its current form would thus slow start-up with Jython by about 5 seconds, which clearly is not acceptable. Do you know of any other way to find out how wide these characters actually are? If there isn't one, we need to use this fix only with Python.

Because this problem apparently only affects the console output, I consider it relatively low priority.
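One way the "only with Python" fallback could look is sketched below. It assumes that Jython can be detected through `sys.platform` starting with `"java"`, and it defers the `unicodedata` import until a non-ASCII character is actually seen. The names `ON_JYTHON` and `char_width` are hypothetical and not taken from the patch.

```python
import sys

# Assumption: Jython reports a platform string that starts with "java".
ON_JYTHON = sys.platform.startswith("java")


def char_width(ch):
    """Return 1 or 2 columns for a single character."""
    # ASCII is never wide, and on Jython the slow import is skipped entirely,
    # at the cost of treating every character as one column wide there.
    if ON_JYTHON or ord(ch) < 128:
        return 1
    # Deferred import: the cost is paid only when a non-ASCII character
    # is encountered for the first time.
    from unicodedata import east_asian_width
    return 2 if east_asian_width(ch) in "WF" else 1
```

With this shape, plain ASCII output never triggers the import on any interpreter, and Jython simply keeps the old (misaligned) behavior.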

spooning · 2014-06-29T22:21:45Z

Originally submitted to Google Code by xieyanbo on 5 Aug 2010

I'm sure I can optimize this code with pre-compiled data, and I have written a script that dumps all the wide characters for that purpose. I'd be glad to hear any suggestions. The script file is attached for anyone who is interested.

spooning · 2014-06-29T22:21:47Z

Originally submitted to Google Code by @pekkaklarck on 16 Aug 2010

Pre-compiled data sounds like a good solution. I modified the attached script to print the number of characters, and there were only 261 of them. I think it would be best to have a new module that contains both the characters and a single function that cuts (and justifies) the text correctly. xieyanbo, are you interested in trying that out? We are going to do RF 2.5.2 in the near future, and getting this in is still possible.
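A single "cut and justify" function of the kind suggested above might look roughly like this. It is only a sketch: the names `_char_width` and `pad_to_width` are hypothetical, and for brevity it calls `unicodedata` directly where the proposed module would consult its pre-compiled character data.

```python
import unicodedata


def _char_width(ch):
    # Stand-in for a lookup in pre-compiled data.
    return 2 if unicodedata.east_asian_width(ch) in "WF" else 1


def pad_to_width(text, width):
    """Cut `text` to at most `width` columns, then left-justify with spaces."""
    used = 0
    kept = []
    for ch in text:
        w = _char_width(ch)
        if used + w > width:
            break  # the next character would overflow the column budget
        kept.append(ch)
        used += w
    # A wide character that does not fit leaves one spare column to pad.
    return "".join(kept) + " " * (width - used)


print(pad_to_width("汉字应该正确对齐", 20) + "| PASS |")
```

Doing the cut and the padding in one pass keeps the column count consistent, so a wide character is never split across the boundary.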

spooning · 2014-06-29T22:21:48Z

Originally submitted to Google Code by xieyanbo on 16 Aug 2010

Actually, that script prints 261 ranges of wide characters; the total number of characters is 45647. I have implemented a prototype to replace the east_asian_width function. The attached generate_wild_chars.py outputs the source code of a module that includes a function "is_wild_char". "is_wild_char(c)" behaves the same as "eaw(c) in 'WF'". It could be optimized further, but I think "is_wild_char" is good enough to work in our product. Give it a try.
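The range-based idea can be sketched as follows. This is not the attached generate_wild_chars.py (which a later comment in this thread marks as incorrect). It is an independent illustration in Python 3 with hypothetical names (`build_wide_ranges`, `WIDE_RANGES`, `is_wide_char`). In a real module the range list would be emitted once as a literal so that `unicodedata` is never imported at run time.

```python
import sys
import unicodedata
from bisect import bisect_right


def build_wide_ranges():
    """Collect inclusive (start, end) code point ranges whose width is W or F."""
    ranges = []
    start = None
    for cp in range(sys.maxunicode + 1):
        wide = unicodedata.east_asian_width(chr(cp)) in "WF"
        if wide and start is None:
            start = cp                       # a new wide range begins
        elif not wide and start is not None:
            ranges.append((start, cp - 1))   # the current range just ended
            start = None
    if start is not None:
        ranges.append((start, sys.maxunicode))
    return ranges


# Generated here for the sake of a runnable example; a generator script
# would print this list into the source of the pre-compiled module.
WIDE_RANGES = build_wide_ranges()
_STARTS = [start for start, _ in WIDE_RANGES]


def is_wide_char(ch):
    """Binary-search the range table instead of calling east_asian_width."""
    cp = ord(ch)
    i = bisect_right(_STARTS, cp) - 1
    return i >= 0 and cp <= WIDE_RANGES[i][1]
```

Storing ranges rather than individual characters keeps the table to a few hundred entries, and the lookup costs one binary search per character. The exact number of ranges depends on the Unicode version bundled with the interpreter.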

spooning · 2014-06-29T22:21:49Z

Originally submitted to Google Code by @pekkaklarck on 23 Aug 2010

We will try to get this into 2.5.2, which we must get out this week. No promises at this point, though.

spooning · 2014-06-29T22:21:51Z

Originally submitted to Google Code by @pekkaklarck on 27 Aug 2010

Unfortunately we don't have time to get this into 2.5.2. =(

spooning · 2014-06-29T22:21:52Z

Originally submitted to Google Code by @jussimalinen on 31 Aug 2010

This is now committed in r4005, r4006, and r4007. We also implemented a check for combining characters, which have a width of 0. (These caused problems on Mac, which uses NFD normalization for file names.) This is now coming out in 2.5.3.

Thanks for the brilliant patch, xieyanbo!
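The combining-character case can be illustrated with a small sketch. This is not the committed code from the revisions above. It assumes that skipping characters with a non-zero canonical combining class (via the standard `unicodedata.combining`) is an adequate approximation of "width 0", and the name `console_width` is hypothetical.

```python
import unicodedata


def console_width(text):
    """Count terminal columns, treating combining marks as zero width."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks are drawn over the previous character
        width += 2 if unicodedata.east_asian_width(ch) in "WF" else 1
    return width


nfc = "\u00e4"                           # 'ä' as a single code point
nfd = unicodedata.normalize("NFD", nfc)  # 'a' followed by U+0308
print(len(nfc), len(nfd))                      # prints "1 2"
print(console_width(nfc), console_width(nfd))  # prints "1 1"
```

A file name stored in NFD form therefore has more code points than columns, which is the opposite of the wide-character case and throws off padding in the other direction.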

spooning · 2014-06-29T22:21:53Z

Originally submitted to Google Code by xieyanbo on 31 Aug 2010

Great job, thanks to you guys!

spooning · 2014-06-29T22:21:55Z

Originally submitted to Google Code by xieyanbo on 21 Mar 2012

The generator script and the East Asian character list on this page are not correct; do not use them. The correct version is in issue #1096; use that instead.

spooning added this to the 2.5.3 milestone Jun 29, 2014

spooning closed this as completed Jun 29, 2014

spooning assigned jussimalinen Jun 29, 2014
