This repository was archived by the owner on May 15, 2026. It is now read-only.

Fix ASCII encoding issue when updating files with non-ASCII characters#116

Merged
ryanhoangt merged 3 commits into
OpenHands:mainfrom
erkinalp:fix-ascii-encoding-issue
May 2, 2025

Conversation

Contributor

@erkinalp erkinalp commented May 1, 2025

Description

This PR fixes an issue with file encoding detection that causes errors when trying to add non-ASCII characters (like Chinese text) to files that were initially created with only ASCII content.

Related Issue

This issue was reported in OpenHands issue #8209.

Motivation and Context

When a file is created with only ASCII characters, its encoding is detected as 'ascii'. A later attempt to add non-ASCII characters to that file then fails with a UnicodeEncodeError, because the 'ascii' codec cannot encode them.

Error message:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)
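The failure can be reproduced outside the project with plain Python. The snippet below is a minimal sketch, not the project's code: the sample string and the helper name `try_encode` are assumptions chosen for illustration. It encodes the same four-character Chinese string once with the 'ascii' codec and once with UTF-8.

```python
# Hypothetical reproduction; `try_encode` and the sample text are illustrative only.
text = "你好世界"  # four non-ASCII characters


def try_encode(s, encoding):
    """Return the encoded bytes, or the UnicodeEncodeError if encoding fails."""
    try:
        return s.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc


result_ascii = try_encode(text, "ascii")   # fails: no ASCII representation
result_utf8 = try_encode(text, "utf-8")    # succeeds: UTF-8 covers all of Unicode

print(type(result_ascii).__name__, result_ascii)
print(result_utf8.decode("utf-8"))
```

Because every ASCII byte sequence is also valid UTF-8, reading an ASCII-only file as UTF-8 yields the same text, which is why treating such files as UTF-8 is a safe way to avoid this error.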

This fix ensures that files initially containing only ASCII characters can later accept non-ASCII content (such as Chinese, Japanese, or other Unicode characters).

How Has This Been Tested?

Added a test case in tests/test_encoding.py that verifies ASCII files are detected as UTF-8, ensuring they can handle non-ASCII characters when edited later.
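The idea behind that test can be sketched as follows. This is an illustration under stated assumptions, not the code merged in this PR: the function name `normalize_detected_encoding` and the fallback to UTF-8 for a missing result are hypothetical, and the real detection logic in the repository may differ.

```python
import os
import tempfile
from typing import Optional


def normalize_detected_encoding(detected: Optional[str]) -> str:
    """Hypothetical helper: promote an 'ascii' detection result to 'utf-8'.

    ASCII is a strict subset of UTF-8, so existing content reads back
    unchanged while later non-ASCII writes become possible.
    """
    if detected is None or detected.lower() == "ascii":
        return "utf-8"
    return detected


# Usage sketch: an ASCII-only file later receives Chinese text.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w", encoding="ascii") as f:
        f.write("hello\n")

    # Pretend a detector reported 'ascii' for the file above.
    encoding = normalize_detected_encoding("ascii")

    with open(path, "a", encoding=encoding) as f:
        f.write("你好\n")

    with open(path, "r", encoding=encoding) as f:
        content = f.read()
```

Other detected encodings pass through unchanged in this sketch, so files that are genuinely stored in, say, GBK would still be read and written with their own codec.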

Does this PR introduce a breaking change?

No, this PR does not introduce a breaking change. It maintains backward compatibility while fixing the encoding issue.

Contributor

@ryanhoangt ryanhoangt left a comment


LGTM, thanks!

@ryanhoangt ryanhoangt merged commit 03a1a97 into OpenHands:main May 2, 2025
4 checks passed
@erkinalp erkinalp deleted the fix-ascii-encoding-issue branch May 2, 2025 17:59

3 participants