Fix ArgumentError: invalid byte sequence in UTF-8 for non-UTF-8 files #47

Merged
dblock merged 1 commit into master from fix/invalid-byte-sequence-utf8
Apr 12, 2026

dblock commented Apr 12, 2026

Fixes #37.

Problem

Files with non-UTF-8 encoding (e.g., GB2312/GB18030 commonly used in Chinese codebases on Windows) caused ArgumentError: invalid byte sequence in UTF-8 when fui tried to match import patterns against file contents.
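
The failure can be reproduced without fui. The following is a minimal sketch, assuming the file contents end up in a string tagged as UTF-8 and are then matched against a regular expression; the `import_pattern` shown here is a hypothetical stand-in, not fui's actual pattern.

```ruby
# GB2312 bytes for two Chinese characters; the sequence is not valid UTF-8.
gb_bytes = "\xD6\xD0\xCE\xC4".b

# Simulate file contents that were read and labeled as UTF-8 without transcoding.
contents = ("// ".b + gb_bytes + "\n#import \"Foo.h\"\n".b).force_encoding(Encoding::UTF_8)

valid = contents.valid_encoding?

# Hypothetical pattern for quoted imports; fui's real patterns may differ.
import_pattern = /#import\s+"([^"]+)"/

# Matching a regular expression against a string with broken encoding raises.
error_message = nil
begin
  contents.scan(import_pattern)
rescue ArgumentError => e
  error_message = e.message
end

puts "valid UTF-8? #{valid}"
puts "error: #{error_message}"
```

The string holds a perfectly ordinary `#import` line, but the match fails before reaching it because the regular expression engine rejects the whole string.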

Fix

Read files in binary mode and transcode to UTF-8, replacing any invalid byte sequences. This allows fui to gracefully handle files with any encoding while still correctly finding #import references.
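
A minimal sketch of this approach, assuming a helper named `read_as_utf8` and an `import_pattern` that are both hypothetical (the actual code in lib/fui/finder.rb may be structured differently):

```ruby
require "tempfile"

# Hypothetical helper (not fui's actual method): read raw bytes, then
# transcode to UTF-8, substituting a replacement character for any byte
# that cannot be converted. ASCII text such as "#import" lines survives.
def read_as_utf8(path)
  File.binread(path).encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end

# Hypothetical pattern for quoted imports; fui's real patterns may differ.
import_pattern = /#import\s+"([^"]+)"/

# Write a small file containing GB2312 bytes followed by an import line.
file = Tempfile.new(["sample", ".m"])
file.binmode
file.write("// \xD6\xD0\xCE\xC4\n#import \"Foo.h\"\n".b)
file.close

contents = read_as_utf8(file.path)
imports = contents.scan(import_pattern).flatten
file.unlink

puts "valid UTF-8? #{contents.valid_encoding?}"
puts "imports: #{imports.inspect}"
```

Because the source bytes are treated as binary, this sketch replaces non-ASCII bytes rather than decoding them. That loses the original comment text, which is acceptable when the only goal is to find import lines.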

Changes

  • lib/fui/finder.rb: Read files as binary then encode to UTF-8 (replacing invalid bytes) in process_code and process_xml. Also removed the unnecessary File.open wrapper since File.read was re-opening the file anyway.
  • spec/fui/finder_spec.rb: Added test cases for non-UTF-8 encoded files.
  • spec/fixtures/non_utf8/: Added fixture files including a .m file with GB2312-encoded bytes (invalid UTF-8).
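
For illustration, a fixture of this kind could be generated along the following lines. This is a sketch under assumptions: the file name `sample.m` and its contents are hypothetical, and GB18030 is used for the transcoding step because it encodes these two characters to the same bytes as GB2312.

```ruby
require "tmpdir"

# Two Chinese characters written as escapes so the snippet does not depend
# on the source file's encoding, transcoded to GB18030.
comment = "// \u4E2D\u6587\n".encode(Encoding::GB18030)
source = comment.b + "#import \"Foo.h\"\n".b

written = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, "sample.m") # hypothetical fixture name
  File.binwrite(path, source)
  written = File.binread(path)
end

# The raw bytes are not valid UTF-8, which is what makes the fixture useful.
utf8_valid = written.dup.force_encoding(Encoding::UTF_8).valid_encoding?
puts "valid UTF-8? #{utf8_valid}"
```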

…-UTF-8 files

Files with non-UTF-8 encoding (e.g., GB2312/GB18030 commonly used in
Chinese codebases) would cause an ArgumentError when fui tried to match
import patterns against file contents.

Read files in binary mode and transcode to UTF-8, replacing invalid byte
sequences. This allows fui to gracefully handle files with any encoding
while still finding #import references.

Fixes #37.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

Danger Report

No issues found.

dblock commented Apr 12, 2026

Looks similar to #39.

@dblock dblock merged commit 862bc11 into master Apr 12, 2026
5 checks passed

Development

Successfully merging this pull request may close these issues.

error: invalid byte sequence in UTF-8
