Description
On Windows, the write tool produces .bat / .cmd files that fail in two ways:
- Line endings: LF-only (\n) instead of CRLF (\r\n). cmd.exe expects CRLF; with LF-only input the script can exit immediately with no error.
- Code page: The file contains UTF-8 encoded non-ASCII text, but cmd.exe interprets the bytes using the console's active code page (e.g., 936 GBK for zh-CN, 932 Shift-JIS for ja-JP, 949 for ko-KR, 866 for ru-RU, 437 or 850 for English and Western European). Result: non-ASCII characters display as garbage.
The line-ending issue affects every Windows user; the code-page issue affects any script that contains non-ASCII text, regardless of language.
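To make the code-page failure concrete, the sketch below reinterprets UTF-8 bytes one byte at a time, the way a single-byte code page would. It uses ISO-8859-1 as a stand-in because that mapping needs no lookup table; the actual garbage on a given machine depends on its active code page. decodeAsSingleByte is a hypothetical helper for illustration, not part of opencode.

```typescript
// Hypothetical helper: decode bytes as ISO-8859-1, where each byte maps
// directly to the code point with the same value.
function decodeAsSingleByte(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("")
}

// "é" is one character but two bytes (0xC3 0xA9) in UTF-8.
const utf8Bytes = new TextEncoder().encode("echo é")
const misread = decodeAsSingleByte(utf8Bytes)

// The two bytes of "é" come back as two unrelated characters.
console.log(misread)
```

The same mechanism applies to multi-byte code pages such as 936 or 932, except that byte pairs are grouped differently, which is why the output looks like random CJK text there.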
Root Cause
Line endings
write.ts passes AI-generated content directly to fs.writeWithDirs() → fs.writeFileString() with no line ending normalization. The AI model generates LF (\n) by default.
edit.ts already handles this correctly at packages/opencode/src/tool/edit.ts:22-33:
  function normalizeLineEndings(text: string): string {
    return text.replaceAll("\r\n", "\n")
  }
  function detectLineEnding(text: string): "\n" | "\r\n" {
    return text.includes("\r\n") ? "\r\n" : "\n"
  }
  function convertToLineEnding(text: string, ending: "\n" | "\r\n"): string {
    if (ending === "\n") return text
    return text.replaceAll("\n", "\r\n")
  }
But these are local to edit.ts and are not used in write.ts.
Encoding
write.ts (line 47) reads existing files with TextDecoder("utf-8", { ignoreBOM: true }). writeFileString() on Node.js writes UTF-8. This is correct — the write tool always handles UTF-8 properly.
The problem is that cmd.exe on Windows defaults to the active code page (e.g., code page 936 for Chinese, 932 for Japanese). A .bat file written as UTF-8 will have its non-ASCII bytes misinterpreted. The fix is to instruct cmd.exe to switch to UTF-8 with chcp 65001 >nul as the second line, but the write tool has no mechanism to ensure this.
Global scope: this is NOT a Chinese-specific issue. Unless the system has been switched to UTF-8, the Windows console defaults to a legacy, non-UTF-8 code page in every locale. On Japanese Windows (932), Korean Windows (949), Russian Windows (866), etc., UTF-8 .bat files without chcp 65001 will all show garbled non-ASCII text.
Proposed Fix
Fix 1: Line endings
Move normalizeLineEndings, detectLineEnding, and convertToLineEnding from edit.ts into a shared utility (e.g., packages/opencode/src/util/line-endings.ts).
In write.ts, between reading the existing file (line 47) and writing (line 64), add:
  // Preserve existing line endings; for new .bat/.cmd on Windows, use CRLF
  if (exists) {
    const ending = detectLineEnding(contentOld)
    contentNew = convertToLineEnding(normalizeLineEndings(contentNew), ending)
  } else if (process.platform === "win32" && /\.(bat|cmd)$/i.test(filepath)) {
    contentNew = convertToLineEnding(normalizeLineEndings(contentNew), "\r\n")
  }
Logic:
- Existing files: detect and preserve the original file's line ending style. If the file uses CRLF, the new content will use CRLF too. This matches edit.ts behavior.
- New .bat/.cmd files on Windows: default to CRLF (Windows batch files require CRLF to avoid crashes).
- All other new files: no transformation (keep LF, which is the cross-platform git standard).
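The three rules can be sketched as one pure function, which makes them easy to unit-test without touching the file system. This is a minimal sketch under assumptions: applyLineEndingPolicy and its parameters are hypothetical names, the platform is passed in rather than read from process.platform, and the helpers are reimplemented with regex replace so the snippet stands alone.

```typescript
type LineEnding = "\n" | "\r\n"

// Same behavior as the edit.ts helpers, reimplemented here so the sketch
// is self-contained.
const normalize = (text: string): string => text.replace(/\r\n/g, "\n")
const detect = (text: string): LineEnding => (text.includes("\r\n") ? "\r\n" : "\n")
const convert = (text: string, ending: LineEnding): string =>
  ending === "\n" ? text : text.replace(/\n/g, "\r\n")

// Hypothetical wrapper around the three rules listed above.
// `contentOld` is undefined when the target file does not exist yet.
function applyLineEndingPolicy(
  contentNew: string,
  contentOld: string | undefined,
  filepath: string,
  platform: string,
): string {
  if (contentOld !== undefined) {
    // Existing file: follow whatever style it already uses.
    return convert(normalize(contentNew), detect(contentOld))
  }
  if (platform === "win32" && /\.(bat|cmd)$/i.test(filepath)) {
    // New batch file on Windows: force CRLF.
    return convert(normalize(contentNew), "\r\n")
  }
  // Any other new file: leave the content untouched.
  return contentNew
}

const script = "@echo off\necho hi\n"
console.log(JSON.stringify(applyLineEndingPolicy(script, undefined, "run.bat", "win32")))
```

Normalizing to LF before converting keeps content that already contains CRLF from ending up with doubled carriage returns.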
Fix 2: Code page for .bat/.cmd on Windows
When writing .bat/.cmd files on Windows that contain non-ASCII characters, automatically insert chcp 65001 >nul directly after a leading @echo off/on line (or as the first line if there is none), unless the file already has a chcp command or the user explicitly opted out.
Implementation in write.ts:
  function ensureChcpUtf8(content: string): string {
    // The caller must restrict this to .bat/.cmd files on Windows.
    // Nothing to do for pure-ASCII content.
    const hasNonAscii = /[\x80-\uFFFF]/.test(content)
    if (!hasNonAscii) return content
    const lines = content.split(/\r?\n/)
    // Don't inject if any line already starts with a chcp command (optionally @-prefixed)
    if (lines.some((l) => /^\s*@?chcp\s+\d+/i.test(l))) return content
    // If the first line is @echo off/on, insert as line 2;
    // otherwise insert at the beginning (becomes line 1)
    if (/^@echo\s+(off|on)\b/i.test(lines[0].trim())) {
      lines.splice(1, 0, "chcp 65001 >nul")
    } else {
      lines.unshift("chcp 65001 >nul")
    }
    return lines.join("\n") // LF only; the caller applies CRLF via Fix 1
  }
Then, for .bat/.cmd targets on Windows, call contentNew = ensureChcpUtf8(contentNew) before the line-ending step from Fix 1 and before writing.
Why this is safe:
- Only fires for .bat/.cmd on Windows with non-ASCII text
- Skips if ANY chcp is already present (no double injection)
- Insertion respects @echo off positioning (goes on line 2, not before it)
- Line ending normalization (Fix 1) runs after this, so CRLF is still applied
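Caveat: the check is purely textual. If chcp appears only in dead code (for example a block skipped with goto) or selects a different code page (e.g., chcp 936), injection is still skipped. This is a minor edge case that can be refined.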
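The behavior described above can be checked against a few sample scripts. The block below is a condensed sketch of the injection step, not the final implementation; accepting an @-prefixed chcp is an assumption on my part, and the output is deliberately LF-only because CRLF conversion happens in a later step.

```typescript
// Condensed sketch of the injection step (assumption: `@chcp` also counts
// as an existing chcp command).
function ensureChcpUtf8(content: string): string {
  // Pure ASCII renders the same under every code page: leave it alone.
  if (!/[^\x00-\x7F]/.test(content)) return content
  const lines = content.split(/\r?\n/)
  // Respect any code page the script already selects.
  if (lines.some((l) => /^\s*@?chcp\s+\d+/i.test(l))) return content
  // Insert after a leading "@echo off/on", otherwise at the very top.
  const afterEcho = /^@echo\s+(off|on)\b/i.test((lines[0] ?? "").trim())
  lines.splice(afterEcho ? 1 : 0, 0, "chcp 65001 >nul")
  return lines.join("\n")
}

const sample = "@echo off\necho 你好\n"
console.log(JSON.stringify(ensureChcpUtf8(sample)))
```

A commented-out line such as rem chcp 65001 does not match the pattern, so it does not block injection; only a line that begins with chcp does.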
Why not just use UTF-8 BOM?
Adding a UTF-8 BOM (byte order mark) to .bat files might seem like another way to make cmd.exe read them as UTF-8. However:
- cmd.exe does not strip the BOM; at least on older Windows it reads the BOM bytes as part of the first command, so a leading @echo off is not recognized and the script misbehaves
- BOM before @echo off violates the well-known Windows batch file convention
- BOM is an invisible character that confuses users and tools
So chcp 65001 >nul is the safer, more compatible approach.
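The BOM problem is easy to see at the byte level. The snippet below is only an illustration with made-up variable names: it shows the three bytes a UTF-8 BOM adds and why a parser that does not strip them no longer sees @echo at the start of the first line.

```typescript
const BOM = "\uFEFF"
const withBom = BOM + "@echo off\r\necho hi\r\n"

// In UTF-8 the BOM is encoded as the three bytes EF BB BF.
const firstBytes = Array.from(new TextEncoder().encode(withBom).slice(0, 3))
console.log(firstBytes.map((b) => b.toString(16)).join(" "))

// Without BOM stripping, the first token is the BOM followed by "@echo",
// so a plain prefix check fails.
const startsWithEcho = withBom.startsWith("@echo")
console.log(startsWithEcho)
```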
Testing
- On any non-English Windows (zh-CN, ja-JP, ko-KR, ru-RU, etc.):
  - Write a .bat file with non-ASCII characters → should run without crash, and Chinese/Japanese/etc. text should display correctly
- On English Windows:
  - Same test → should work (chcp 65001 is harmless here, and a no-op if the console is already on code page 65001)
- On Unix:
  - No behavior change (process.platform !== "win32" guards all new logic)
- Editing an existing CRLF file via write:
  - CRLF should be preserved (same as edit.ts behavior)
- Editing an existing LF file via write:
  - LF should be preserved (no unwanted conversion)
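The two line-ending checks in this list can be automated once the written file has been read back as a string. The helpers below are hypothetical test utilities, not existing opencode code, and reading the file back is assumed to happen elsewhere in the test.

```typescript
// Hypothetical test helper: true when every "\n" is preceded by "\r".
function usesOnlyCrlf(text: string): boolean {
  return !/(^|[^\r])\n/.test(text)
}

// Hypothetical test helper: true when the text contains no CRLF pair.
function usesOnlyLf(text: string): boolean {
  return !text.includes("\r\n")
}

console.log(usesOnlyCrlf("@echo off\r\necho hi\r\n"))
console.log(usesOnlyLf("#!/bin/sh\necho hi\n"))
```

Checking for bare LF rather than merely for the presence of CRLF catches mixed files, which is the likely failure mode if only part of the content is converted.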