Fix UTF-8 char boundary handling in cursor movement and rendering #48

Merged
jvanderberg merged 1 commit into main from claude/investigate-github-notifications-SPS2o
May 1, 2026
Summary

This PR fixes potential panics and incorrect cursor positioning when working with multibyte UTF-8 characters. The changes ensure that cursor positions are always snapped to valid UTF-8 character boundaries during vertical movement and rendering.

Key Changes

  • Buffer vertical movement: Added a snap_to_char_boundary() helper method that walks backwards from a byte offset to the nearest valid UTF-8 character boundary. move_up() and move_down() now call it so the cursor never lands in the middle of a multibyte character.

  • Cursor rendering: Updated render_editor() in the UI layer to snap the cursor column to a valid char boundary before rendering. Also fixed the cursor character extraction to properly handle multibyte characters by using chars().next() instead of byte-based slicing, which could panic or produce incorrect output on non-ASCII text.

  • Parser safety: fixup_link_lines() now uses .get() instead of direct indexing, so an out-of-range access yields None rather than panicking.
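
A minimal sketch of the boundary-snapping idea described above. This is not the PR's actual code: the free-function signature and the clamping of out-of-range offsets are assumptions made for illustration, while `str::is_char_boundary` is the standard library call the description names.

```rust
/// Hypothetical free-function form of the helper; the PR describes a method
/// on the buffer, whose exact signature is not shown in the description.
fn snap_to_char_boundary(line: &str, offset: usize) -> usize {
    // Assumption: clamp first so an offset past the end of a shorter line
    // snaps to the end of that line.
    let mut idx = offset.min(line.len());
    // Walk backwards one byte at a time. Index 0 is always a boundary,
    // so the loop terminates without underflow.
    while !line.is_char_boundary(idx) {
        idx -= 1;
    }
    idx
}
```

With the Greek test string "αβγ" (two bytes per letter), a byte offset of 3 falls inside β and snaps back to 2, the start of β.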

Notable Implementation Details

  • The snap_to_char_boundary() method uses line.is_char_boundary() to safely identify valid UTF-8 boundaries, walking backwards one byte at a time until a boundary is found.
  • Cursor character rendering now handles multibyte characters correctly: it extracts the full character with chars().next() and to_string(), then uses that character's byte length to compute where the "after" portion starts.
  • Added tests for both move_up() and move_down() with multibyte UTF-8 characters (Greek letters) to prevent regressions.
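
The cursor-splitting step can be sketched as follows. The function name `split_at_cursor`, its return shape, and the single-space placeholder used when the cursor sits at the end of the line are all hypothetical; only the use of `is_char_boundary()`, `chars().next()`, and `to_string()` comes from the description above.

```rust
/// Hypothetical helper: split a line into the text before the cursor,
/// the character under the cursor, and the text after it.
fn split_at_cursor(line: &str, col: usize) -> (&str, String, &str) {
    // Snap the column back to a valid char boundary before slicing.
    let mut start = col.min(line.len());
    while !line.is_char_boundary(start) {
        start -= 1;
    }
    let before = &line[..start];
    match line[start..].chars().next() {
        Some(ch) => {
            // Advance by the character's full UTF-8 byte length, not by 1.
            let end = start + ch.len_utf8();
            (before, ch.to_string(), &line[end..])
        }
        // Assumption: at end of line, draw the cursor over a single space.
        None => (before, " ".to_string(), ""),
    }
}
```

For the line "aβc", a column of 2 lies inside β; it snaps to 1, and the split is "a", "β", "c". Slicing `line[col..col + 1]` at the same position would panic instead.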

https://claude.ai/code/session_01J88TRUMkjBFCGXeBwcMNSk

Vertical cursor movement (up/down) could place the cursor at a byte
offset that falls inside a multi-byte UTF-8 character, causing panics
when rendering the editor or slicing strings. Added snap_to_char_boundary
helper and applied defensive .get() slicing in render.rs and parser.rs.
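
As an illustration of the defensive slicing mentioned above (a sketch, not the code in render.rs or parser.rs): `str::get` returns `None` where `&line[start..end]` would panic, either because the range is out of bounds or because an endpoint falls inside a multibyte character. The helper name `safe_slice` and the empty-string fallback are assumptions.

```rust
/// Hypothetical helper: slice by byte range, falling back to an empty
/// string instead of panicking on an invalid range.
fn safe_slice(line: &str, start: usize, end: usize) -> &str {
    // `get` yields None for out-of-bounds ranges and for ranges whose
    // endpoints are not on UTF-8 char boundaries.
    line.get(start..end).unwrap_or("")
}
```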

https://claude.ai/code/session_01J88TRUMkjBFCGXeBwcMNSk
@jvanderberg jvanderberg merged commit 9bd67d5 into main May 1, 2026
3 checks passed
