[bug] Read and Edit tools handle invalid UTF-8 inconsistently: one crashes, the other silently corrupts #33068

Description

@LifetimeVip

The Read tool (packages/core/src/tool/read-filesystem.ts) uses new TextDecoder("utf-8", { fatal: true }) on both the paginated and non-paginated read paths (lines ~131 and ~155). When the first 64KB of a file passes the binary detection check but later content contains invalid UTF-8 sequences (e.g. an orphan byte 0x80 or a truncated multi-byte sequence), decoder.decode() throws a TypeError. This exception is not caught and propagates as an unrecoverable Effect defect.
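The strict-decoding failure can be reproduced outside the tool with a few lines. This is a minimal sketch, not the Read tool's code: strictDecodeError and the buffer layout are hypothetical, and the orphan byte is placed at offset 69000 on the assumption that "64KB" means 65,536 bytes.

```typescript
// Returns the error thrown by a strict (fatal) UTF-8 decode, or undefined on success.
function strictDecodeError(bytes: Uint8Array): unknown {
  const decoder = new TextDecoder("utf-8", { fatal: true });
  try {
    decoder.decode(bytes);
    return undefined;
  } catch (e) {
    return e;
  }
}

// 70,000 ASCII 'a' bytes with one orphan continuation byte placed past 64 KiB.
const sample = new Uint8Array(70000).fill(0x61);
sample[69000] = 0x80;

const decodeError = strictDecodeError(sample);
console.log(decodeError instanceof TypeError);
```

Without the try/catch, the TypeError escapes the call site, which matches the uncaught behavior described above.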

The Edit tool (packages/core/src/tool/edit.ts, line ~31) uses new TextDecoder() without { fatal: true }, so invalid UTF-8 bytes are silently replaced with U+FFFD replacement characters. When the model provides an oldString containing the correct characters, Edit searches the replaced text instead, and matching fails with "Could not find oldString".
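The lossy path can be illustrated the same way. The sketch below assumes a hypothetical file containing "café" saved as Latin-1, which is one plausible way to end up with an invalid byte; it is not taken from the Edit tool. It also shows what would happen if the decoded text were written back, which is an assumption about the failure mode rather than a confirmed code path.

```typescript
// "café\n" as Latin-1 bytes: 0xE9 followed by 0x0A is not a valid UTF-8 sequence.
const latin1Bytes = new Uint8Array([0x63, 0x61, 0x66, 0xe9, 0x0a]);

// Default TextDecoder is non-fatal: the bad byte becomes U+FFFD.
const lossyText = new TextDecoder().decode(latin1Bytes);

// The search string has the intended character, the decoded text does not.
const matchIndex = lossyText.indexOf("caf\u00e9");

// Re-encoding turns the single bad byte into the three-byte encoding of U+FFFD.
const roundTripped = new TextEncoder().encode(lossyText);

console.log(JSON.stringify(lossyText), matchIndex, roundTripped.length);
```

Here the search misses, and a write-back would change the file from 5 bytes to 7, so the original byte is unrecoverable.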

Two tools in the same codebase handle invalid UTF-8 differently: one crashes, the other silently corrupts data.

Code location

Steps to reproduce

  1. Create a file whose first 64KB is valid ASCII/UTF-8 but which contains an orphan byte 0x80 at position ~65000
  2. Call read on this file → TextDecoder crashes with a TypeError
  3. Call edit on this file → the invalid byte is silently replaced with U+FFFD and oldString matching fails
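A fixture for step 1 could be generated with a short Node script. This is a sketch under assumptions: makeReproBytes, the file name, and the temp-directory location are all hypothetical. The offset follows the ~65000 figure above; since 65000 is still below 65,536, a larger offset may be needed if the binary check inspects a full 64 KiB.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: ASCII filler with a single orphan 0x80 byte at `offset`.
function makeReproBytes(size: number, offset: number): Uint8Array {
  const bytes = new Uint8Array(size).fill(0x61); // 'a'
  bytes[offset] = 0x80;
  return bytes;
}

// Write the fixture into a fresh temporary directory.
const reproDir = mkdtempSync(join(tmpdir(), "utf8-repro-"));
const reproPath = join(reproDir, "invalid-utf8.txt");
writeFileSync(reproPath, makeReproBytes(70000, 65000));

console.log(reproPath);
```

The printed path can then be passed to the read and edit tools for steps 2 and 3.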

Expected behavior

Both tools should handle invalid UTF-8 consistently. Either both use { fatal: true } with proper error handling, or both replace invalid sequences with U+FFFD gracefully.
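One way to get consistent behavior is a single decode helper shared by both tools. The sketch below is only an illustration of that idea: decodeUtf8 and DecodeOutcome are hypothetical names, and nothing here is taken from the codebase. It tries a strict decode first and falls back to a lossy decode while reporting that replacement happened, so each caller can decide what to do.

```typescript
type DecodeOutcome =
  | { lossy: false; text: string }
  | { lossy: true; text: string; reason: string };

// Hypothetical shared helper: strict decode first, lossy fallback with a flag.
function decodeUtf8(bytes: Uint8Array): DecodeOutcome {
  try {
    const text = new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return { lossy: false, text };
  } catch (e) {
    // Fatal decoders signal malformed input with a TypeError; rethrow anything else.
    if (!(e instanceof TypeError)) throw e;
    const text = new TextDecoder("utf-8").decode(bytes);
    return { lossy: true, text, reason: e.message };
  }
}

const clean = decodeUtf8(new Uint8Array([0x6f, 0x6b])); // "ok"
const damaged = decodeUtf8(new Uint8Array([0x6f, 0x80, 0x6b])); // 'o', orphan 0x80, 'k'

console.log(clean.lossy, damaged.lossy, JSON.stringify(damaged.text));
```

With a result like this, a read path could return the text with a warning, and an edit path could refuse to write when lossy is true rather than risk persisting replacement characters.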

Environment

OpenCode Desktop 1.17.8, Windows 11
