@ferhatelmas
Contributor

A multi-byte rune at the truncation boundary gets corrupted:

  • Consider a title in which a multi-byte character spans the 150-byte boundary.
  • The Chinese character "中" is 3 bytes in UTF-8 (0xE4 0xB8 0xAD).
  • With 149 ASCII chars followed by "中", the "中" occupies byte offsets 149 through 151 (0-indexed).
  • cutLongTitle should cut before "中" and return 149 bytes.
  • The current implementation instead returns 150 bytes, keeping only the first byte of "中" (0xE4) and leaving the string as invalid UTF-8.

Signed-off-by: ferhat elmas <elmas.ferhat@gmail.com>
@LinkinStars LinkinStars self-requested a review November 25, 2025 09:46
Member

@LinkinStars LinkinStars left a comment

Great! Thank you for your contribution. This is important to us.

@LinkinStars LinkinStars merged commit ce053cc into apache:dev Nov 25, 2025
@LinkinStars LinkinStars self-assigned this Nov 25, 2025
@ferhatelmas ferhatelmas deleted the ferhat/cut-long-title-boundary branch November 25, 2025 10:13
@LinkinStars LinkinStars added this to the v1.7.1 milestone Nov 26, 2025