cursor behavior for full/ambiguous width characters #9

buganini opened this Issue May 20, 2012 · 19 comments



4 participants


Currently I'm using the following hack in my program:

--- a/pyte/
+++ b/pyte/
@@ -28,6 +28,7 @@ from __future__ import (
     absolute_import, unicode_literals, division

+import bsdconv
 import copy
 import math
 import operator
@@ -36,6 +37,7 @@ from itertools import islice, repeat

 from . import modes as mo, graphics as g, charsets as cs


@@ -392,6 +394,10 @@ class Screen(list):
         char = char.translate([self.g0_charset,

+       width_counter.conv(char.encode("utf-8"))
+       width=width_info['full']*2+width_info['ambi']*2+width_info['half']
         # If this was the last column in a line and auto wrap mode is
         # enabled, move the cursor to the next line. Otherwise replace
         # characters already displayed with newly entered.
@@ -410,9 +416,13 @@ class Screen(list):
         self[self.cursor.y][self.cursor.x] = self.cursor.attrs \

+       if width>1:
+               self[self.cursor.y][self.cursor.x+1] = self.cursor.attrs \
+                   ._replace(data="")
         # .. note:: We can't use :meth:`cursor_forward()`, because that
         #           way, we'll never know when to linefeed.
-        self.cursor.x += 1
+        self.cursor.x += width

     def carriage_return(self):
         """Move the cursor to the beginning of the current line."""

There should be a similar but cleaner way to determine character width. In particular, there is a set of "East Asian ambiguous width" characters whose width depends on the locale; I treat them as fullwidth (*2) in my case.
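A minimal sketch of such a width check using only the standard library's `unicodedata` module, rather than `bsdconv` as in the patch above. The names `cell_width`, `string_width` and the `ambiguous_is_wide` flag are hypothetical, and the rules are simplified (control characters, for example, are not handled):

```python
import unicodedata


def cell_width(ch, ambiguous_is_wide=True):
    """Return the number of terminal cells assumed for one character."""
    # Combining marks attach to the previous character and take no cell.
    if unicodedata.combining(ch):
        return 0
    eaw = unicodedata.east_asian_width(ch)
    # "F" (fullwidth) and "W" (wide) always occupy two cells.
    if eaw in ("F", "W"):
        return 2
    # "A" (ambiguous) depends on the locale, so the caller decides.
    if eaw == "A":
        return 2 if ambiguous_is_wide else 1
    return 1


def string_width(text, ambiguous_is_wide=True):
    """Sum the per-character widths of a string."""
    return sum(cell_width(ch, ambiguous_is_wide) for ch in text)
```

Passing `ambiguous_is_wide=False` gives the behavior expected in most non-CJK locales, which is why the flag is left to the caller in this sketch.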

Selectel member

Hello, can you please submit a test case for the patch?


There is also a port of the same in .c, which I happen to use in a similar capacity.

Selectel member

Hello, Jeff, thank you for the link. wcwidth sounds like a way to go, but I'd prefer to re-use someone else's code here.

Do you know if there is a standalone Python implementation somewhere? If not, did you consider using wcwidth from libc via ctypes?

  1. Regarding ctypes, I prefer pure Python for these things. I presume you'd like your emulator to work equally well on Windows or Jython or whatever. You don't currently link with libc, do you?

  2. This is originally from who hosts many famous UTF-8 demo files you can test with in its parent directory. The Python version is almost directly translated. I think I found it from

  3. It's been an open issue in core Python for some time, I think perhaps longer than this, but this is a tracking and open issue,


I agree with Jeff, the pure python works wonderfully, but it shouldn't be in the x84 repository. Copy it over to the pyte library, or make a standalone Python package of it.


I'll make a standalone package. Just so we're clear: each glyph of コンニチハ in a terminal, when selected, should highlight two cells, not one. Even though it's only 5 glyphs, it's 10 cells.

In [11]: wcwidth.wcswidth(u'コンニチハ'), len(u'コンニチハ')
Out[11]: (10, 5)

Thanks, @jquast, for me that's perfect. Can you put that on pypi? And are you willing to maintain that package? (if there is any maintenance.) If you could add some tests and docs to the project, that would be even nicer. ;)

Selectel member

Sounds good to me as well. Also, +1 for the tests, there should be some for the libc version, right?


yup, working on it now.


Still working on it. Learning a lot about this stuff... I wrote a viewer utility to check my work,

The plan is to first write some tests out of what is there, then do a rewrite into something that should be automatically maintainable by just feeding it the latest Unicode spec docs. Then I'll publish to PyPI, ETA Sunday or so. I think the wcwidth() and wcswidth() interfaces, as module wcwidth, will remain. ETA to PyPI is about 5 days or so.
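One way a table "fed from the spec docs" could work is sketched below: parse lines in the layout used by the Unicode `EastAsianWidth.txt` data file into sorted ranges, then look code points up with a binary search. This is only an illustration, not the approach the wcwidth package actually takes; `SAMPLE` is a hand-written excerpt in that layout (not the full file), and `parse_wide_ranges` and `is_wide` are hypothetical names.

```python
import bisect

# Hand-written excerpt in the EastAsianWidth.txt layout (not the full file).
SAMPLE = """\
# code point or range ; property
0041..005A;Na
1100..115F;W
3041..3096;W
30A1..30FA;W
FF01..FF60;F
"""


def parse_wide_ranges(text):
    """Collect (start, end) code point ranges marked W (wide) or F (fullwidth)."""
    ranges = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        code_points, prop = (part.strip() for part in line.split(";", 1))
        if prop not in ("W", "F"):
            continue
        first, _, last = code_points.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start  # single code point entry
        ranges.append((start, end))
    ranges.sort()
    return ranges


def is_wide(ch, ranges):
    """Binary-search the sorted ranges for the code point of ch."""
    cp = ord(ch)
    # Index of the last range whose start is <= cp.
    idx = bisect.bisect_right(ranges, (cp, float("inf"))) - 1
    return idx >= 0 and ranges[idx][0] <= cp <= ranges[idx][1]


WIDE = parse_wide_ranges(SAMPLE)
```

Regenerating the table for a new Unicode release would then only require feeding the parser the updated data file.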


libc comparison complete, error report here:

@superbobry you should inspect this error report regarding your request for tests compared to libc. libc is grossly out of date compared to the spec, producing thousands of wrong results! Presumably libc conforms to a POSIX standard that conforms to a Unicode specification that is 10+ years behind today's monospaced fonts. The Unicode specifications provide an accurate report, so I will continue my efforts to work against those.

How else would you know the true width of a snowman, after all?

libc=-1, ours=1, name=SNOWMAN WITHOUT SNOW, url --oo⛄oo--

[ I do see a few errors, especially regarding combining characters (reported as 1 but should actually be -1), but this is unicodedata.combining() reporting that they are not combining when they most certainly are. Not sure if I want to fix that far in. ]

Selectel member

@jquast thank you, you've done an impressive amount of work with this! I'll try to change pyte to use wcwidth during the next week. When do you plan to make a release on PyPI?

I don't see any issues with being incompatible with libc as long as wcwidth follows the Unicode spec.


I've released to pypi,

Took a lot of (manual, visual) testing, and it's all plugged into Travis and Coveralls. Let me know if you have any suggestions; there are some ambiguous areas regarding control, combining, and emoticon characters.

Selectel member

I've been thinking about integrating wcwidth with pyte for a while and I'm not sure how best to approach this. Here's why: in the current implementation Screen is just a character matrix, which implies that a character is always of unit width. We can work around this by adding stub characters, i.e.

from wcwidth import wcswidth

def draw(self, ch):
    # ...
    width = wcswidth(ch)
    # Fill the extra cells to the right of a wide character with stubs.
    for offset in range(1, width):
        self.buffer[self.cursor.y][self.cursor.x + offset] = stub_character

but this is obviously a nasty hack, which might require more hacks in the future.
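To make the stub idea concrete, here is a self-contained sketch on a single row rather than on pyte's actual `Screen`. It assumes an empty string as the stub (as in the patch at the top of the thread) and uses `unicodedata` for widths so it runs without extra packages; `STUB`, `char_width` and `draw` are hypothetical names, not pyte's API.

```python
import unicodedata

STUB = ""  # assumed placeholder stored in the second cell of a wide character


def char_width(ch):
    """Two cells for East Asian wide/fullwidth characters, otherwise one."""
    return 2 if unicodedata.east_asian_width(ch) in ("F", "W") else 1


def draw(row, x, text):
    """Write text into row starting at column x; return the new cursor column."""
    for ch in text:
        width = char_width(ch)
        if x + width > len(row):
            break  # no line wrapping in this sketch
        row[x] = ch
        # Mark the remaining cells of a wide character as stubs.
        for offset in range(1, width):
            row[x + offset] = STUB
        x += width
    return x


row = [" "] * 6
cursor = draw(row, 0, "你好x")
```

After the call, `row` holds one entry per cell, so the cursor column stays in sync with what a terminal would display, while `"".join(row)` still yields the original text.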

For instance, imagine we have the following character matrix

["", S]
cursor is here

where S denotes a stub character. And then we do


We need to insert an empty character right after the cursor erasing whatever there is, but the current character has width = 2. So the question is: what resulting character matrix do we expect in this case?


I think we should replace the stub character by a space.

Try this:

echo -e "你好\e[Dx" # Print double width characters, move cursor one position left and print an 'x'

You'll see that we get a space there.

(on some terminals, you should use \b instead of \e[D)

Selectel member

I've tried this in Guake with TERM=xterm and probably got something else

$ echo -e "你好\e[Dx" 

Can you post the expected output?


I think it should print:

你 x

(We print "你好", move the cursor one position left, and print 'x'.)
That's how I get it in OS X Terminal, iTerm and xterm.

Just tried xfce4-terminal. It prints the left half of the 好 character, and prints the letter 'x' on top of the right part.
However, I would prefer to replace the left part of the wide character with a space, because if we are going to render the output of pyte to a terminal again, that will work everywhere. (Otherwise, I propose to make it configurable.)
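The "replace the other half with a space" rule could be sketched as follows. This is an illustration on a plain list of cells, not pyte's implementation; `STUB`, `char_width` and `put` are hypothetical names, and the stub is assumed to be an empty string.

```python
import unicodedata

STUB = ""  # assumed placeholder for the right half of a wide character


def char_width(ch):
    """Two cells for East Asian wide/fullwidth characters, otherwise one."""
    return 2 if unicodedata.east_asian_width(ch) in ("F", "W") else 1


def put(row, x, ch):
    """Place ch at column x, blanking any wide character it cuts in half."""
    width = char_width(ch)
    if x + width > len(row):
        raise IndexError("character does not fit in the row")
    # Landing on the right half of a wide character: blank its left half.
    if row[x] == STUB and x > 0:
        row[x - 1] = " "
    # Covering the left half of a wide character: blank its orphaned stub.
    end = x + width
    if end < len(row) and row[end] == STUB:
        row[end] = " "
    row[x] = ch
    for offset in range(1, width):
        row[x + offset] = STUB
    return end


# "你好" occupies columns 0-3; the cursor moved one cell left sits at column 3.
row = ["你", STUB, "好", STUB, " ", " "]
put(row, 3, "x")
rendered = "".join(row).rstrip()
```

Here `rendered` matches the "你 x" output described above for OS X Terminal, iTerm and xterm.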

@superbobry superbobry added a commit that closed this issue Jan 9, 2016
@superbobry superbobry Incorporated ``wcwidth``
  closes #9
@superbobry superbobry closed this in 00de944 Jan 9, 2016
@superbobry superbobry added a commit that referenced this issue Jan 9, 2016
@superbobry superbobry Incorporated ``wcwidth``
  closes #9
Selectel member

I've finally managed to add this to pyte. A big thanks to everyone involved.
