Wide unicode characters take two units #18

songgao · 2013-02-20T04:49:03Z

Hey, I was just trying out termbox (it's great, by the way!) and found a problem with wide Unicode characters, or more precisely with wide runes displayed in a fixed-width font. My terminal (iTerm) draws wide characters twice as wide as ASCII characters. As a result, if there's a wide rune, the characters drawn after it aren't aligned properly. Try this code:

package main

import (
    "github.com/nsf/termbox-go"
    "time"
)

func main() {
    err := termbox.Init()
    if err != nil {
        panic(err)
    }
    defer termbox.Close()

    cn := []rune{'你', '好', '，', '世', '界', '！'}
    en := []rune{'a', 'b', 'c', 'd', 'e', 'f'}
    for i := range cn {
        termbox.SetCell(i, 0, cn[i], termbox.ColorBlack, termbox.ColorWhite)
    }
    for i := range en {
        termbox.SetCell(i, 1, en[i], termbox.ColorRed, termbox.ColorWhite)
    }
    termbox.Flush()
    time.Sleep(5 * time.Second)
}

I'm not sure yet what should be changed to fix it. Some options:

Count wide runes as two characters in the buffer? (That could produce overlapping characters.)
If a wide rune is in the buffer, draw all non-wide runes 2x wide as well?
Provide an option that controls how runes are drawn?

nsf · 2013-02-20T18:29:39Z

The problem is that termbox has no control over the terminal, and it is the terminal that draws the runes. It's possible to make a termbox implementation that draws runes itself, but that has never been done; at the moment I only have a terminal-based backend. So termbox simply assumes that each rune takes exactly one terminal cell, and I know this assumption is false for some CJK runes. I will need to read more about when and why terminals draw double-width runes, and then think about what I can do about this.
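
To make the mismatch concrete, here is a minimal sketch (not termbox code) that compares the cell index given to each rune with the column where it would start on a terminal that draws wide runes two columns wide, assuming the runes are written back to back. The `isWide` and `terminalColumns` helpers are hypothetical, and `isWide` is deliberately simplified: it only covers the CJK Unified Ideographs and fullwidth-forms ranges.

```go
package main

import "fmt"

// isWide is a hypothetical, simplified check: it treats only CJK Unified
// Ideographs (U+4E00..U+9FFF) and fullwidth forms (U+FF01..U+FF60) as
// double-width. Real terminals consult a much larger table.
func isWide(r rune) bool {
	return (r >= 0x4E00 && r <= 0x9FFF) || (r >= 0xFF01 && r <= 0xFF60)
}

// terminalColumns returns, for each rune, the column at which it would start
// if the runes were written back to back and wide runes took two columns.
func terminalColumns(runes []rune) []int {
	cols := make([]int, len(runes))
	col := 0
	for i, r := range runes {
		cols[i] = col
		if isWide(r) {
			col += 2
		} else {
			col++
		}
	}
	return cols
}

func main() {
	cn := []rune{'你', '好', '，', '世', '界', '！'}
	for i, col := range terminalColumns(cn) {
		// i is the cell index an application would pass to SetCell;
		// col is where the rune would start under the assumption above.
		fmt.Printf("%c: cell %d, terminal column %d\n", cn[i], i, col)
	}
}
```

Under that assumption the two numbers agree only for the first rune; every later rune starts further right than its cell index suggests.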

nsf · 2013-03-03T17:32:14Z

Here's what I think. Even though it's possible to hack termbox so that it understands that CJK runes take 2 termbox cells, this approach isn't portable (it probably doesn't work like that on Windows), and every termbox-based application would have to be aware of it too and handle it specially. So I don't think I will do anything about it. If you want CJK support, hack termbox to suit your needs (it's a very simple library after all).

Honestly, I just don't know how to make it work with the current termbox abstractions. Termbox was designed around the notion of "cells", and this breaks it really hard. Perhaps there are terminals that render CJK runes with width == 1, but I'm not sure how readable that is.
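
As one illustration of the special handling an application would need, here is a hedged sketch of clipping a string to a fixed number of cells without cutting a double-width rune in half. `fitCells` and `isWide` are hypothetical names, not part of termbox, and `isWide` is a simplified stand-in for a real width table.

```go
package main

import "fmt"

// isWide is a hypothetical, simplified check covering only CJK Unified
// Ideographs (U+4E00..U+9FFF) and fullwidth forms (U+FF01..U+FF60).
func isWide(r rune) bool {
	return (r >= 0x4E00 && r <= 0x9FFF) || (r >= 0xFF01 && r <= 0xFF60)
}

// fitCells returns the longest prefix of s that fits in maxCells terminal
// cells, together with the number of cells that prefix uses. A wide rune
// that would straddle the limit is dropped rather than split.
func fitCells(s string, maxCells int) (string, int) {
	used := 0
	for i, r := range s {
		w := 1
		if isWide(r) {
			w = 2
		}
		if used+w > maxCells {
			// i is the byte offset of r, so s[:i] ends on a rune boundary.
			return s[:i], used
		}
		used += w
	}
	return s, used
}

func main() {
	prefix, used := fitCells("你好ab", 5)
	fmt.Printf("%q uses %d cells\n", prefix, used)
}
```

Note that the result can be one cell narrower than requested when a wide rune does not fit, so the caller may have to pad the last cell itself.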

songgao · 2013-03-03T18:41:46Z

Thanks for getting back to this. Sure, I'll try hacking on it and send a pull request if I work something out.

I don't think terminals that render CJK with width == 1 would be readable at all...

nsf · 2013-05-04T12:51:55Z

See also: #21

Should work now on linux/darwin. You can install https://github.com/nsf/godit to test it out.

axgle · 2014-05-09T06:04:29Z

This works on Windows XP, and over SSH on Ubuntu.

package main

import (
    "github.com/nsf/termbox-go"
)

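// rune_width returns the number of terminal cells r is expected to occupy:
// 2 for runes in the wide ranges listed below (Hangul, CJK, fullwidth
// forms, and so on), 1 for everything else.
func rune_width(r rune) int {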
    if r >= 0x1100 &&
        (r <= 0x115f || r == 0x2329 || r == 0x232a ||
            (r >= 0x2e80 && r <= 0xa4cf && r != 0x303f) ||
            (r >= 0xac00 && r <= 0xd7a3) ||
            (r >= 0xf900 && r <= 0xfaff) ||
            (r >= 0xfe30 && r <= 0xfe6f) ||
            (r >= 0xff00 && r <= 0xff60) ||
            (r >= 0xffe0 && r <= 0xffe6) ||
            (r >= 0x20000 && r <= 0x2fffd) ||
            (r >= 0x30000 && r <= 0x3fffd)) {
        return 2
    }
    return 1
}

func main() {
    err := termbox.Init()
    if err != nil {
        panic(err)
    }
    defer termbox.Close()

    //cn := []rune{'你', '好', '，', '世', '界', '！'}
    cn := "你好, 世界！"
    en := []rune{'a', 'b', 'c', 'd', 'e', 'f'}
    x := 0

    for _, v := range cn {
        termbox.SetCell(x, 0, v, termbox.ColorBlack, termbox.ColorWhite)
        x += rune_width(v)
    }
    for i := range en {
        termbox.SetCell(i, 1, en[i], termbox.ColorRed, termbox.ColorWhite)
    }
    termbox.Flush()
    termbox.PollEvent()
}

songgao · 2014-06-06T16:39:43Z

Looks like it's working now. Just tested on OS X with iTerm2. Thanks!

wang0z · 2014-11-07T09:36:38Z

I have to say the real output sequence is something like '\033[1;1H你\033[1;3H好\033[1;5H世\033[1;7H界', which includes unnecessary CSI codes. Anyway, it is better than nothing.
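
For illustration only, here is a sketch of how the redundant cursor moves could be skipped. It is not how termbox is implemented, and it assumes the terminal advances the cursor by exactly the width we compute for each rune; if that assumption fails, the output is misplaced, which is an argument for keeping the explicit moves. The `cell` type and `renderRow` function are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// cell is a hypothetical record: a rune, the 0-based column it starts at,
// and the number of columns it is assumed to occupy.
type cell struct {
	r     rune
	x     int
	width int
}

// renderRow builds the output for one row (0-based y). It emits a CSI
// cursor-position sequence only when the cursor is not already expected
// to be at the cell's column.
func renderRow(y int, cells []cell) string {
	var b strings.Builder
	cursor := -1 // unknown position, forces a move before the first cell
	for _, c := range cells {
		if c.x != cursor {
			// CSI row;col H takes 1-based coordinates.
			fmt.Fprintf(&b, "\033[%d;%dH", y+1, c.x+1)
		}
		b.WriteRune(c.r)
		cursor = c.x + c.width
	}
	return b.String()
}

func main() {
	row := []cell{{'你', 0, 2}, {'好', 2, 2}, {'世', 4, 2}, {'界', 6, 2}}
	fmt.Printf("%q\n", renderRow(0, row))
}
```

With these four cells only the initial move is emitted, because each rune starts where the previous one is assumed to end.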

shia86 · 2018-01-08T07:11:01Z

I recommend using go-runewidth.

karlek mentioned this issue Mar 7, 2014

Problem with unicode characters on name input. karlek/reason#30

Closed

songgao closed this as completed Jun 6, 2014

gizak mentioned this issue Mar 25, 2015

utf8 characters problem gizak/termui#10

Closed
