fix(wasm): handle zero-width characters in batch.text()#79

Merged: junkdog merged 1 commit into junkdog:main from kofany:fix/zero-width-grapheme-positioning on Jan 19, 2026

@kofany kofany commented Jan 18, 2026

Contributor

Summary

Found this issue accidentally while debugging a different problem. The batch.text() function in the JS API was calculating column positions using the grapheme enumeration index, which caused zero-width characters (like U+200B ZERO WIDTH SPACE) to incorrectly occupy terminal cells.

Before

for (i, ch) in text.graphemes(true).enumerate() {
    let current_col = x + width_offset + i as u16;  // i always increments
    // ...
    if is_double_width(ch) {
        width_offset += 1;
    }
}

Each grapheme advanced the column position by at least 1, including zero-width characters that should not occupy a cell.

After

for ch in text.graphemes(true) {
    let char_width = ch.width();
    if char_width == 0 {
        continue;  // skip zero-width
    }
    let current_col = x + col_offset;
    // ...
    col_offset += char_width as u16;
}

The loop now uses unicode_width::UnicodeWidthStr to determine each grapheme's actual display width.
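To make the difference concrete, here is a minimal, std-only sketch that compares the two column calculations. It is not the code from wasm.rs: `display_width`, `columns_by_index`, and `columns_by_width` are hypothetical names, `display_width` is a tiny stand-in for the unicode_width lookup that only knows a few ranges, and the loops iterate over `char`s rather than graphemes to avoid external crates.

```rust
/// Hypothetical stand-in for the unicode_width lookup; covers only a few
/// ranges, enough to illustrate zero-, single-, and double-width cases.
fn display_width(ch: char) -> u16 {
    match ch as u32 {
        // Zero-width space, non-joiner, joiner.
        0x200B..=0x200D => 0,
        // Combining diacritical marks.
        0x0300..=0x036F => 0,
        // CJK ideographs, Hangul syllables, fullwidth forms (partial list).
        0x4E00..=0x9FFF | 0xAC00..=0xD7A3 | 0xFF01..=0xFF60 => 2,
        _ => 1,
    }
}

/// Old approach: column comes from the enumeration index plus an extra
/// offset for wide characters, so zero-width characters still take a cell.
fn columns_by_index(x: u16, text: &str) -> Vec<(u16, char)> {
    let mut width_offset = 0u16;
    let mut cells = Vec::new();
    for (i, ch) in text.chars().enumerate() {
        cells.push((x + width_offset + i as u16, ch));
        if display_width(ch) == 2 {
            width_offset += 1;
        }
    }
    cells
}

/// New approach: skip zero-width characters and advance by measured width.
fn columns_by_width(x: u16, text: &str) -> Vec<(u16, char)> {
    let mut col_offset = 0u16;
    let mut cells = Vec::new();
    for ch in text.chars() {
        let width = display_width(ch);
        if width == 0 {
            continue; // zero-width: no cell, no column advance
        }
        cells.push((x + col_offset, ch));
        col_offset += width;
    }
    cells
}
```

For the input "a\u{200B}b" starting at column 0, the index-based version places 'b' at column 2, while the width-based version places it at column 1. For text without zero-width characters, such as "中a", both versions agree.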

Question

Do you think this fix is needed in this form? Happy to adjust if you prefer a different approach.


@junkdog junkdog left a comment

Owner

looking good, just need an ascii fast path. see comment.

Comment thread beamterm-renderer/src/wasm.rs Outdated
let current_col = x + width_offset + i as u16;
let mut col_offset: u16 = 0;
for ch in text.graphemes(true) {
let char_width = ch.width();

Owner

this is a performance regression for the default case; is_double_width() calls width() as a last resort.

Owner

when calculating col advancement, increment by one if ch.len() is 1, otherwise call ch.width()
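The suggested fast path could look roughly like the sketch below. This is an illustration under assumptions, not the merged code: `col_advance` and `measured_width` are hypothetical names, and `measured_width` is a std-only stand-in for `ch.width()` that sums a small hand-written per-char table. The caller is assumed to pass one grapheme cluster at a time.

```rust
/// Hypothetical stand-in for the full width lookup (`ch.width()` in the PR).
/// Sums per-char widths from a small table; a simplification for illustration.
fn measured_width(grapheme: &str) -> u16 {
    grapheme
        .chars()
        .map(|c| match c as u32 {
            // Zero-width space/joiners and combining diacritical marks.
            0x200B..=0x200D | 0x0300..=0x036F => 0u16,
            // CJK ideographs and Hangul syllables (partial list).
            0x4E00..=0x9FFF | 0xAC00..=0xD7A3 => 2,
            _ => 1,
        })
        .sum()
}

/// Column advance for one grapheme: a single-byte grapheme is ASCII, so it
/// advances by one column without the width lookup; anything longer takes
/// the slower measured path.
fn col_advance(grapheme: &str) -> u16 {
    if grapheme.len() == 1 {
        1
    } else {
        measured_width(grapheme)
    }
}
```

Because `str::len()` counts bytes, the check `grapheme.len() == 1` is true only for single ASCII characters, which is the common case in terminal output. Zero-width characters such as U+200B are three bytes in UTF-8, so they still reach the measured path and yield an advance of 0.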

Previously, the text() function calculated column positions using the
grapheme enumeration index, which caused zero-width characters (like
U+200B ZERO WIDTH SPACE) to incorrectly occupy terminal cells and shift
all subsequent characters to the right.

Now uses unicode_width::UnicodeWidthStr to determine the actual display
width of each grapheme:
- Zero-width characters (width=0) are skipped entirely
- Single-width characters advance by 1 column
- Double-width characters (CJK, emoji) advance by 2 columns

This fixes rendering artifacts in terminal applications that use
zero-width characters (e.g., weechat with GNU screen).
@kofany kofany force-pushed the fix/zero-width-grapheme-positioning branch from ef849ed to e6f738e on January 19, 2026 22:01
@kofany kofany requested a review from junkdog on January 19, 2026 22:02

junkdog commented Jan 19, 2026

Owner

🍪 thanks!

@junkdog junkdog merged commit 03628e9 into junkdog:main Jan 19, 2026
5 checks passed