cmd/go: detect UTF-16 encoded sources and improve error reporting #71872

@dolmen

Description

Go version

go version go1.24.0 windows/amd64

Output of go env in your module/workspace:

set GOOS=windows

What did you do?

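In Windows PowerShell on Windows 11, redirecting the output of the echo command to a file produces a UTF-16 file (little-endian with a BOM, as the Format-Hex output below shows). I created hello.go this way and then ran go build and go fmt on it.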

Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.

Install the latest PowerShell for new features and improvements! https://aka.ms/PSWindows

PS C:\Users\dolmen> cd Code
PS C:\Users\dolmen\Code> mkdir hello


    Directory: C:\Users\dolmen\Code


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d-----        21/02/2025     13:46                hello


PS C:\Users\dolmen\Code> cd hello
PS C:\Users\dolmen\Code\hello> go mod init github.com/dolmen-go/hello
go: creating new go.mod: module github.com/dolmen-go/hello
PS C:\Users\dolmen\Code\hello> echo "package main" > hello.go
PS C:\Users\dolmen\Code\hello> type hello.go
package main
PS C:\Users\dolmen\Code\hello> go build .
read C:\Users\dolmen\Code\hello\hello.go: unexpected NUL in input
PS C:\Users\dolmen\Code\hello> Format-Hex hello.go


           Path: C:\Users\dolmen\Code\hello\hello.go

           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000   FF FE 70 00 61 00 63 00 6B 00 61 00 67 00 65 00  .þp.a.c.k.a.g.e.
00000010   20 00 6D 00 61 00 69 00 6E 00 0D 00 0A 00         .m.a.i.n.....


PS C:\Users\dolmen\Code\hello> go fmt .
read C:\Users\dolmen\Code\hello\hello.go: unexpected NUL in input

What did you see happen?

PS C:\Users\dolmen\Code\hello> go build .
read C:\Users\dolmen\Code\hello\hello.go: unexpected NUL in input
PS C:\Users\dolmen\Code\hello> go fmt .
read C:\Users\dolmen\Code\hello\hello.go: unexpected NUL in input

What did you expect to see?

When the Go parser encounters a NUL byte, it should check whether the file starts with a UTF-16 byte order mark (BOM). If it does, the tool should report a specific message about the file's encoding instead of the generic "unexpected NUL in input".

This is a usability issue because some popular text editors, such as Visual Studio Code, do not (yet) flag the incorrect encoding.

Labels

FeatureRequest: Issues asking for a new feature that does not need a proposal.
GoCommand: cmd/go
NeedsInvestigation: Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
ToolProposal: Issues describing a requested change to a Go tool or command-line program.
