Unicode paste formatter truncates Chinese text containing 公 or 破 #3152

@ZentZo86

Bug Description

ForgeCode's paste formatter truncates pasted Chinese text when the text contains certain Unicode characters such as 公 or 破.

This appears to affect both:

  • the zsh : prompt paste hook, because it calls forge zsh format --buffer "$BUFFER"
  • the interactive Forge TUI paste path, because paste events also call the same paste formatting logic

Steps to Reproduce

Run:

forge --version
forge zsh format --buffer ': abc公def'
forge zsh format --buffer ': abc破def'
forge zsh format --buffer ': abc布def'
forge zsh format --buffer ': abc突def'
forge zsh format --buffer ': 2026 春季 BOOOMJam 开发主题公布让我们一起探索,突破「视界限」!'

Expected Behavior

All non-path pasted text should be preserved unchanged.

Expected examples:

: abc公def
: abc破def
: 2026 春季 BOOOMJam 开发主题公布让我们一起探索,突破「视界限」!

Actual Behavior

On ForgeCode 2.12.5, the text is truncated. For example:

forge zsh format --buffer ': abc公def'
# outputs: :

forge zsh format --buffer ': abc破def'
# outputs: :

Characters like 布 and 突 do not trigger the same truncation in my test.

Suspected Cause

The likely issue is in crates/forge_main/src/zsh/paste.rs, specifically find_token_end() / wrap_tokens().

The code scans input.as_bytes(), casts each byte to char, and then calls is_whitespace():

.map(|b| (*b as char).is_whitespace())

For UTF-8 Chinese characters:

  • 公 = e5 85 ac
  • 破 = e7 a0 b4

When cast to char on their own, the continuation bytes 0x85 and 0xA0 become U+0085 (NEXT LINE) and U+00A0 (NO-BREAK SPACE), and is_whitespace() returns true for both. The formatter therefore treats the middle of a UTF-8 character as a token boundary. The later safe .get() calls prevent a crash, but the text is still truncated.
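To make the byte-level effect concrete, here is a minimal standalone sketch (not taken from the Forge source). The helper name `bytes_misread_as_whitespace` is made up for illustration; it only reproduces the `(*b as char).is_whitespace()` check quoted above and reports which bytes of a string would be misread.

```rust
/// Hypothetical helper for illustration only (not part of Forge).
/// Returns the UTF-8 bytes of `s` that are reported as whitespace when
/// each byte is cast to `char` in isolation, as the quoted snippet does.
fn bytes_misread_as_whitespace(s: &str) -> Vec<u8> {
    s.as_bytes()
        .iter()
        .copied()
        // Casting a u8 to char yields the code point U+0000..=U+00FF,
        // so 0x85 becomes U+0085 and 0xA0 becomes U+00A0.
        .filter(|b| (*b as char).is_whitespace())
        .collect()
}
```

Calling it on 公 and 破 returns the single bytes 0x85 and 0xA0 respectively, while 布 (e5 b8 83) and 突 (e7 aa 81) return an empty list, which matches the behaviour observed in the reproduction steps.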

A char-boundary-safe implementation should probably iterate with char_indices() instead of scanning individual bytes.
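A rough sketch of what such a scan could look like follows. I have not checked the real signature of find_token_end(), so the name `token_end`, its parameters, and its return convention are assumptions for illustration, not the actual Forge API.

```rust
/// Hypothetical stand-in for `find_token_end()`; the real function in
/// paste.rs may take different arguments.
/// Returns the byte offset of the first whitespace character at or after
/// `start`, or `input.len()` if there is none.
fn token_end(input: &str, start: usize) -> usize {
    // `get` returns None if `start` is out of range or not on a char
    // boundary, so this never panics on multi-byte input.
    match input.get(start..) {
        Some(rest) => rest
            // char_indices yields whole characters with their byte offsets,
            // so continuation bytes are never inspected on their own.
            .char_indices()
            .find(|(_, c)| c.is_whitespace())
            .map_or(input.len(), |(i, _)| start + i),
        None => input.len(),
    }
}
```

With this version, `token_end(": abc公def", 2)` runs to the end of the string (byte offset 11) instead of stopping at the 0x85 byte inside 公. A genuine U+00A0 in the input would still count as whitespace here, since that is what `char::is_whitespace` reports.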

Related Issues / PRs

This looks related to, but not fully fixed by:

Those fixed or reduced UTF-8 byte-boundary crashes, but this current case still reproduces on 2.12.5 as silent truncation.

Forge Version

forge 2.12.5

Operating System & Version

macOS, accessed from a PC over SSH.

Installation Method

Installed binary at ~/.local/bin/forge.

Configuration

No special configuration required to reproduce. The CLI command forge zsh format --buffer ... is enough.
