proposal: spec: disallow unicode import paths to avoid punycode attacks #20210

karalabe opened this issue May 2, 2017 · 7 comments

@karalabe karalabe commented May 2, 2017

If you take a look at the following snippet, it looks completely benign. It just pulls in the terminal package from the extended stdlib and reads the user's password.

If you copy and paste this snippet into a file and try to run it, you will get an error message along the lines of cannot find package "gο" in any of: [...]. The standard reaction from a user to this error message will be either to run go get or perhaps go get gο. And this is where things can go horribly wrong.

The thing is, the "gο" domain in the snippet's import path is not ASCII but Unicode. All modern URL libraries (including the one used by go and go get) will helpfully convert this URL to punycode, silently mapping the "gο" package to "". At this point, Go will fetch whatever package is at that import path, on a completely different domain than the user expected.

To fully demo this attack I tried to actually register that domain, but the top registrars refused. There are plenty of shadier ones that seem happy to do it, but I'm not sure I want to give my credit card to so many companies only to see whether I can get the domain registered.

For the purpose of this demo, let's assume that I did manage to register it, and instead simulate my doing so by manually resolving to You could do this by adding an entry to your hosts file (on Ubuntu this would be adding to /etc/hosts and possibly also enabling this in the network manager via adding addn-hosts=/etc/hosts to /etc/NetworkManager/dnsmasq.d/hosts.conf and doing a sudo service network-manager restart). Make sure nslookup works and results in before continuing.

Ok, now that we have our domain "registered", we can actually try to run that playground snippet I provided via go get --insecure && go run snippet.go. Note that I'm using --insecure since we only have a "simulated" domain, but a real one could have Cloudflare or Let's Encrypt in front of it, so the HTTPS part is irrelevant at this point.

The program output will be:

Please enter a secret: <you enter "Hello" for example>
Thank you for sharing your password 'Hello' with us! ~MitM

It should be pretty obvious by now what happened here. go get silently resolved my Unicode domain into punycode, reached out to wherever that was hosted, and downloaded something that to every human looked like a trusted repository. In this particular case, I was running a vanity import path resolver at the above IP, which resolves this single import path to, a demo attack package that simulates a MitM attack for SSH passwords.

In my opinion this is a horribly dangerous social engineering attack. All it takes to break a project (open source ones being the most vulnerable) is to send a patch to a random project on GitHub and, besides fixing whatever serves to hide the attack, also helpfully reorganize the imports, swapping one of the paths out for a custom homoglyph domain. The domain doesn't even have to attack immediately; it could just lie dormant, redirecting to the real import until someone decides to arm it.

This could allow arbitrary code to be injected at an arbitrary later point in time into a project without anyone being the wiser. For certain projects (e.g. Ethereum, which has its majority client written in Go) this is an end game scenario.

From Go's perspective, a possible solution against this attack vector could be to modify gofmt to swap out Unicode import paths for the final punycode variant that go get would download anyway. This could provide a strong enough "hint" to developers that there's something very wrong with an import path, without relying on outside tooling. Besides this, I'd also venture to suggest that the compiler could be modified to reject any import paths that resolve into punycode but are represented as Unicode in the source.
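To make the "reject Unicode import paths" idea concrete, here is a minimal sketch of such a check written as an outside tool, using only the standard go/parser package. It is not how gofmt or the compiler work today. The function name suspiciousImports and the host example.test are made up for illustration, and the check simply flags any import path containing a non-ASCII character rather than computing its punycode form.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// suspiciousImports is a hypothetical helper: it parses only the import
// section of a Go source file and returns every import path that
// contains at least one non-ASCII character.
func suspiciousImports(src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "snippet.go", src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var bad []string
	for _, imp := range f.Imports {
		// imp.Path.Value is the quoted string literal from the source.
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		for _, r := range path {
			if r > 0x7F { // anything beyond ASCII
				bad = append(bad, path)
				break
			}
		}
	}
	return bad, nil
}

func main() {
	// example.test is a placeholder host. \u03bf is GREEK SMALL LETTER
	// OMICRON, which renders almost identically to the Latin "o".
	src := "package main\n\nimport (\n\t\"fmt\"\n\t\"g\u03bf.example.test/pkg\"\n)\n"
	bad, err := suspiciousImports(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, p := range bad {
		fmt.Printf("non-ASCII import path: %q\n", p)
	}
}
```

A check like this could run in a pre-commit hook or a code review bot, which would catch a swapped import before it is merged.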

On the downside, of course, my suggestion would be equivalent to dropping support for all internationalized domain names, but the question is whether it's worth compromising the entire ecosystem for fancy import URLs.

Apparently "" is not so easily spoofable, since IDN domains are limited to characters from a single character set (at least for the popular TLDs), and the "g" character saves the day, being fairly unique to Latin. However, there are other interesting domains that can be fully represented in Cyrillic, for example, which use a single character set and thus pass all domain verification (e.g. "огео.com" actually being "", free to register at godaddy).
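The single character set restriction can be illustrated with a rough sketch that reports which scripts a domain label draws from. This is only an approximation for illustration: real registry rules are more involved, the helper name scriptsOf is made up, and it only looks at Latin, Greek and Cyrillic.

```go
package main

import (
	"fmt"
	"unicode"
)

// scriptsOf is a hypothetical helper that reports which of the Latin,
// Greek and Cyrillic scripts appear in a domain label. Characters from
// other scripts, digits and hyphens are ignored in this sketch.
func scriptsOf(label string) []string {
	tables := []struct {
		name  string
		table *unicode.RangeTable
	}{
		{"Latin", unicode.Latin},
		{"Greek", unicode.Greek},
		{"Cyrillic", unicode.Cyrillic},
	}
	var found []string
	for _, t := range tables {
		for _, r := range label {
			if unicode.Is(t.table, r) {
				found = append(found, t.name)
				break
			}
		}
	}
	return found
}

func main() {
	// Latin "g" followed by Greek omicron: two scripts in one label.
	fmt.Println(scriptsOf("g\u03bf"))
	// The Cyrillic label from the example above: a single script.
	fmt.Println(scriptsOf("\u043e\u0433\u0435\u043e"))
}
```

A label that mixes two scripts is what the registrars' rule rejects, while an all-Cyrillic label passes even though it looks like a Latin word.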

Given this constraint, the attack surface is much much smaller than I originally anticipated (most Go packages are hosted on github, which should arguably be harder to spoof), but there's still a potential to break future vanity addresses (e.g. if oreo decides to release a Go package in 5 years). Then again https://www.аррӏе.com

@gopherbot gopherbot added this to the Proposal milestone May 2, 2017
@gopherbot gopherbot added the Proposal label May 2, 2017

@josharian josharian commented May 2, 2017

See also #20115


@myitcv myitcv commented May 4, 2017

This is just a drive-by comment to add another alternative to the section that begins:

From Go's perspective, a possible solution...

Equally we could continue to support unicode import paths but enforce that the user acknowledge the punycode equivalent:

package main

import (

	"gο" //

func main() {
	fmt.Println("Please enter a secret:")

As I said, this comment is not per se a vote one way or the other...

As a total aside, @karalabe, Gmail routed your original email to the Spam folder with some obscure message that I basically interpreted as "suspicious homographs".


@karalabe karalabe commented May 4, 2017

Hah, nice :)

Btw, I think your comment proposal is really nice. Retains all current functionality, but protects users. We could also make gofmt expand to that by default and then problem's solved.


@sporkmonger sporkmonger commented May 5, 2017

The gofmt expansion idea strikes me as a very reasonable solution.


@randall77 randall77 commented May 5, 2017

See also #20209. TL;DR, I think we should solve this in code review tools, not the language.


@rsc rsc commented May 15, 2017

On hold for #20209.


@nigeltao nigeltao commented May 20, 2017

As per #20209, the fix might be in the tools instead of the language. For example, we could prohibit (not merely swap out) "go get" from following IDNs (Internationalized Domain Names).

I would then invert the suggestion above, so that the 'pretty' IDN would be in the (optional) comment and the unambiguous ASCII name would be in the import path. For example:

import "" // gο
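The idea of having "go get" refuse to follow IDNs could look roughly like the sketch below. It is an assumption-laden illustration, not the actual go get code: checkImportHost is a made-up name, example.test is a placeholder host, and the check treats the first path element as the host and rejects it if any label is non-ASCII or carries the "xn--" prefix that punycode-encoded labels use.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// checkImportHost is a hypothetical helper, not part of go get. It takes
// the first element of an import path as the host and returns an error
// if any of its labels is an IDN, either written in Unicode or already
// encoded with the "xn--" punycode prefix.
func checkImportHost(importPath string) error {
	host := strings.SplitN(importPath, "/", 2)[0]
	for _, label := range strings.Split(host, ".") {
		if strings.HasPrefix(strings.ToLower(label), "xn--") {
			return fmt.Errorf("host %q has punycode label %q", host, label)
		}
		for _, r := range label {
			if r > unicode.MaxASCII {
				return fmt.Errorf("host %q has non-ASCII character %U", host, r)
			}
		}
	}
	return nil
}

func main() {
	paths := []string{
		"example.test/pkg",          // plain ASCII host
		"g\u03bf.example.test/pkg",  // Greek omicron in the host
		"xn--abc.example.test/pkg",  // made-up label with the punycode prefix
	}
	for _, p := range paths {
		if err := checkImportHost(p); err != nil {
			fmt.Println("refusing:", err)
		} else {
			fmt.Println("ok:", p)
		}
	}
}
```

Rejecting the "xn--" form as well is a design choice that goes beyond the comment above, where the ASCII name stays allowed in the import path; drop that branch to match the inverted-comment suggestion.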