cmd/go: revisit allowed set of characters in module, import, and file paths #45549

jayconrod opened this issue Apr 13, 2021 · 10 comments
modules NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.


Currently, import paths have the following lexical restrictions (see module.CheckImportPath):

  • Must consist of valid path elements, separated by slashes. Must not begin or end with a slash.
  • A valid path element is a non-empty string that consists of ASCII letters, ASCII digits, and the punctuation characters - . _ ~. Must not end with a dot or contain two dots in a row.
  • A path element prefix up to the first dot must not be a reserved name on Windows, regardless of case (CON, com1, ...). An element must not have a suffix of a tilde followed by ASCII digits (like a Windows short name).
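The import path rules above can be sketched as a small standalone checker. This is only an illustration of the rules as listed, not the code in module.CheckImportPath: the names `importElemOK`, `importPathOK`, and `reservedNames` are hypothetical, and the reserved-name list is deliberately partial.

```go
package main

import (
	"fmt"
	"strings"
)

// reservedNames is a partial, illustrative list of Windows reserved names.
// The real list is longer; this is an assumption made for the sketch.
var reservedNames = []string{"CON", "PRN", "AUX", "NUL", "COM1", "LPT1"}

// isASCIIDigits reports whether s is non-empty and made only of ASCII digits.
func isASCIIDigits(s string) bool {
	if s == "" {
		return false
	}
	for i := 0; i < len(s); i++ {
		if s[i] < '0' || s[i] > '9' {
			return false
		}
	}
	return true
}

// importElemOK applies the listed rules to a single path element.
func importElemOK(elem string) bool {
	if elem == "" || strings.HasSuffix(elem, ".") || strings.Contains(elem, "..") {
		return false
	}
	for _, r := range elem {
		switch {
		case 'a' <= r && r <= 'z', 'A' <= r && r <= 'Z', '0' <= r && r <= '9':
		case r == '-', r == '.', r == '_', r == '~':
		default:
			return false // anything outside the ASCII set, e.g. CJK letters
		}
	}
	// The prefix up to the first dot must not be a reserved name (any case).
	prefix, _, _ := strings.Cut(elem, ".")
	for _, bad := range reservedNames {
		if strings.EqualFold(prefix, bad) {
			return false
		}
	}
	// Reject a suffix of a tilde followed by digits (like a Windows short name).
	if i := strings.LastIndexByte(elem, '~'); i >= 0 && isASCIIDigits(elem[i+1:]) {
		return false
	}
	return true
}

// importPathOK splits on slashes; a leading or trailing slash yields an
// empty element, which importElemOK rejects.
func importPathOK(path string) bool {
	for _, elem := range strings.Split(path, "/") {
		if !importElemOK(elem) {
			return false
		}
	}
	return true
}

func main() {
	for _, p := range []string{"example.com/foo/bar", "example.com/con.txt", "example.com/foo~1", "example.com/题目"} {
		fmt.Printf("%q ok=%v\n", p, importPathOK(p))
	}
}
```

Under this sketch only the first path in `main` passes; the other three fail on the reserved-name, short-name, and character-set rules respectively.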

Module paths have the same restrictions as import paths, with additional constraints (see module.CheckPath):

  • The first path element (by convention, a domain name) must contain only lower-case ASCII letters, ASCII digits, dots, and dashes. It must contain at least one dot and must not start with a dash.
  • If the path ends with /vN where N consists of ASCII digits and dots, N must not begin with 0, must not be 1, and must not contain any dots (there's a separate special case for gopkg.in paths).
  • No path element may begin with a dot.
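A minimal sketch of the additional module path constraints, assuming the import path rules have already been checked separately. The helper names (`firstElemOK`, `majorSuffixOK`, `modulePathOK`) are hypothetical, and the separate special case mentioned in the list is not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// firstElemOK checks the domain-like first element: lower-case ASCII letters,
// digits, dots, and dashes; at least one dot; no leading dash.
func firstElemOK(elem string) bool {
	if elem == "" || elem[0] == '-' || !strings.Contains(elem, ".") {
		return false
	}
	for _, r := range elem {
		switch {
		case 'a' <= r && r <= 'z', '0' <= r && r <= '9', r == '.', r == '-':
		default:
			return false
		}
	}
	return true
}

// majorSuffixOK checks a trailing /vN, where N is made of digits and dots.
// Paths without such a suffix pass trivially.
func majorSuffixOK(path string) bool {
	i := strings.LastIndex(path, "/v")
	if i < 0 {
		return true
	}
	n := path[i+2:]
	if n == "" {
		return true // the last element is just "v", not a version suffix
	}
	for _, r := range n {
		if r != '.' && (r < '0' || r > '9') {
			return true // e.g. "/vendor": not a /vN suffix at all
		}
	}
	return n[0] != '0' && n != "1" && !strings.Contains(n, ".")
}

// modulePathOK combines the three extra constraints from the list above.
func modulePathOK(path string) bool {
	elems := strings.Split(path, "/")
	if !firstElemOK(elems[0]) {
		return false
	}
	for _, e := range elems {
		if strings.HasPrefix(e, ".") {
			return false
		}
	}
	return majorSuffixOK(path)
}

func main() {
	for _, p := range []string{"example.com/mod/v2", "example.com/mod/v1", "Example.com/mod", "example.com/.hidden"} {
		fmt.Printf("%q ok=%v\n", p, modulePathOK(p))
	}
}
```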

File paths have the same restrictions as import paths, but the set of allowed characters is larger (see module.CheckFilePath):

  • Path elements may consist of Unicode letters, ASCII digits, ASCII spaces, and ASCII punctuation characters ! # $ % & ( ) + , - . = @ [ ] ^ _ { } ~. The remaining ASCII punctuation characters " * < > ? ` ' | / \ : are excluded.
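The file path character set can be sketched the same way. This is an illustrative approximation of the rule as stated here, using `unicode.IsLetter` for non-ASCII runes; `fileCharOK` and `firstBadRune` are hypothetical names, and the check is meant to be applied to one path element at a time.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// allowedPunct holds the ASCII punctuation listed above, plus the ASCII space.
const allowedPunct = "!#$%&()+,-.=@[]^_{}~ "

// fileCharOK reports whether r may appear in a file path element under the
// rule described above.
func fileCharOK(r rune) bool {
	if r < 0x80 { // ASCII range
		switch {
		case 'a' <= r && r <= 'z', 'A' <= r && r <= 'Z', '0' <= r && r <= '9':
			return true
		}
		return strings.ContainsRune(allowedPunct, r)
	}
	// Outside ASCII, only Unicode letters are accepted; symbols such as
	// emoji are not letters and are therefore rejected.
	return unicode.IsLetter(r)
}

// firstBadRune returns the first disallowed rune in a path element, if any.
func firstBadRune(elem string) (rune, bool) {
	for _, r := range elem {
		if !fileCharOK(r) {
			return r, true
		}
	}
	return 0, false
}

func main() {
	for _, name := range []string{"题目.txt", "😻.txt", "some' quotes\" in name.m4a"} {
		r, bad := firstBadRune(name)
		fmt.Printf("%q bad=%v rune=%q\n", name, bad, r)
	}
}
```

Under these assumptions a file name made of CJK letters passes, while an emoji or a single quote is reported as the first offending character.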

These restrictions are generally in place for good reasons (see Unicode restrictions):

  • Module paths are frequently written and encoded into URLs, and we don't want to allow strings that interfere with that (for example, non-ASCII domain names).
  • Module contents are extracted into directories on a variety of systems. We don't want to allow strings that aren't valid file names or might collide with a different string (on case-insensitive or Unicode normalizing systems). We don't want to allow strings that are reserved, might be interpreted by the shell, might be interpreted as a flag (starting with -), or might be interpreted as a repository (.git).
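To make the collision concern concrete, here is a rough sketch that flags file paths differing only by case, which would map to the same file on a case-insensitive file system. The function name `caseCollisions` is hypothetical, and `strings.ToLower` is only an approximation of real case folding; Unicode normalization is not covered.

```go
package main

import (
	"fmt"
	"strings"
)

// caseCollisions returns pairs of distinct paths that become identical
// once lower-cased (an approximation of a case-insensitive file system).
func caseCollisions(paths []string) [][2]string {
	seen := make(map[string]string) // lower-cased path -> first spelling seen
	var out [][2]string
	for _, p := range paths {
		key := strings.ToLower(p)
		if prev, ok := seen[key]; ok && prev != p {
			out = append(out, [2]string{prev, p})
			continue
		}
		seen[key] = p
	}
	return out
}

func main() {
	files := []string{"README.md", "readme.md", "main.go"}
	for _, pair := range caseCollisions(files) {
		fmt.Printf("%q collides with %q\n", pair[0], pair[1])
	}
}
```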

That being said, these restrictions are more English-centric than necessary (#45507). They're also more restrictive than GOPATH (#29101).

We should come up with a wider set of characters that may be allowed without causing compatibility problems, particularly for import and file paths.

cc @bcmills @matloob

Please support Chinese characters

ddbxyrj commented Jan 20, 2022

For cultural diversity, maybe we should take more Unicode character types into consideration.

@golang golang deleted a comment from yangyile1990 Mar 16, 2022
MawKKe added a commit to MawKKe/audiobook-split-ffmpeg-go that referenced this issue Mar 31, 2022
The file in question is not a Go file, but a file for testing. The
filename has quotes in it, causing error during install:

$ go install
create zip: test/beep with spaces and some' quotes" in name.m4a:
malformed file path "test/beep with spaces and some' quotes\" in
name.m4a": invalid char '\''

Perhaps these are related?
- golang/go#50396
- golang/go#45549

Idk, life is too short for dealing with shitty tooling...

Related: the handling of punycode domains. #20210

Also related, the conclusion that it's up to review tooling to keep homoglyph or LTR/RTL attacks at bay.

sxin0 commented Dec 29, 2022

Please support Chinese characters

Also related, #44970 discusses spec interactions.

yzzd commented Mar 29, 2023

Please support Chinese characters

go1.15.15 works normally; later versions report errors.

Proposal: skip checking resource file names
For example, the package "" contains a file named 😻.txt.
The file is not part of the module, but a resource used for tests.
Its path is valid under Unicode standards.
I would like to think the rules can be more flexible here ;)

When I use Go 1.15 without go.mod, I can name my package "ACM题目小马过河".

But once I use go.mod with go1.20 or go1.21, it reports that the name is not supported.

For me, "ACM题目小马过河" is easy to understand, easier than "ACM topic Pony Crossing the River".

So I think it's important to support native languages.

If you think this could cause mistakes, you could add a flag such as "support_native_language". When I turn it on, my package doesn't need to be popular; it is only for fun.

yzzd commented Sep 9, 2023 via email


8 participants