Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: get incorrectly rejects go-import/go-source meta tags with unquoted name attr #39748

Open
kortschak opened this issue Jun 21, 2020 · 6 comments
Labels
Milestone

Comments

@kortschak
Copy link
Contributor

@kortschak kortschak commented Jun 21, 2020

What version of Go are you using (go version)?

$ go version
go version go1.14.4 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN="/home/user/bin"
GOCACHE="/home/user/.cache/go-build"
GOENV="/home/user/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/user"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/user/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/user/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build389226641=/tmp/go-build -gno-record-gcc-switches"

What did you do?

This issue affecting us has been fixed via a workaround, so I can't post a working reproducer, but the previously it could have been replicated by doing the following.

GOPROXY=direct go get gonum.org/v1/gonum

What did you expect to see?

Downloading the module.

What did you see instead?

$ GOPROXY=direct go get gonum.org/v1/gonum
go get gonum.org/v1/gonum: unrecognized import path "gonum.org/v1/gonum": reading https://gonum.org/v1/gonum?go-get=1: 404 Not Found

Additional information

This was brought to golang-nuts and Gophers' slack where the issue was suggested to be that the the name attribute of the meta tag was unquoted. Ensuring that the attribute remains quoted did resolve the superficial issue, however on raising this with the author of the minifier that removes the quotes (a dependency of hugo), the quotes are apparently not required. This is confirmed by checking the following html here (note that go-import is not quoted).

<!doctype html><html lang=en-us>
<head>
<meta name=go-import content="domain.tld git https://github.com/domain/pkg">
<title>Title</title>
</head>

Also ref gohugoio/hugo#7415

@tdewolff
Copy link

@tdewolff tdewolff commented Jun 22, 2020

@jayconrod jayconrod added the NeedsFix label Jun 23, 2020
@jayconrod jayconrod added this to the Unplanned milestone Jun 23, 2020
@jayconrod
Copy link
Contributor

@jayconrod jayconrod commented Jun 23, 2020

Relevant code is in discovery.go.

The problem is that we use an XML parser instead of an HTML parser. XML requires quotes. We could potentially use golang.org/x/net/html instead.

@kortschak
Copy link
Contributor Author

@kortschak kortschak commented Jun 24, 2020

@jayconrod golang.org/x/net is already vendored into the tree, but does not include the html package. What is the process for including vendored code into the go tree?

@jayconrod
Copy link
Contributor

@jayconrod jayconrod commented Jun 24, 2020

@kortschak Mostly the same as other modules; import it and run go mod vendor.

html is a very large package though, and we don't need its full functionality. Perhaps instead, we should use a more limited ad hoc parser that just extracts this tag. Remote import paths already warns people not to expect a full parser to be used: in particular, avoid embedded JS or CSS before the meta tag.

@tdewolff
Copy link

@tdewolff tdewolff commented Jun 24, 2020

If it helps, here is an implementation of a low level HTML parser: https://github.com/tdewolff/parse/tree/master/html

But perhaps a simple regular expression could be the easiest ad-hoc solution, although it could give false-positives.

@jayconrod
Copy link
Contributor

@jayconrod jayconrod commented Jun 24, 2020

A regular expression might be good enough, since we just need to match one tag, not matching pairs of open / closing tags. We'd have to be smart about <script> and <style>.

Another thing to consider: any change here needs to test against a large set of Go projects to ensure we don't break anything that was working before. I think that's the hardest part of this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants
You can’t perform that action at this time.