Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: get fails on gitlab subgroups #34094

Open
umputun opened this issue Sep 5, 2019 · 39 comments

Comments

@umputun
Copy link

commented Sep 5, 2019

What version of Go are you using (go version)?

$ go version
go version go1.13 darwin/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/umputun/Library/Caches/go-build"
GOENV="/Users/umputun/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/umputun/go-home"
GOPRIVATE=""
GOPROXY="direct"
GOROOT="/usr/local/Cellar/go/1.13/libexec"
GOSUMDB="off"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.13/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/umputun/tmp/t/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/fx/rzs1_n8137qfktxcbt_2v8pc0000gp/T/go-build570104617=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

  1. Created a public project on gitlab https://gitlab.com/umputuntests/sub/example with a module gitlab.com/umputuntests/sub/example
  2. Tried to get it, i.e. go get -v gitlab.com/umputuntests/sub/example
  3. go get failed
get "gitlab.com/umputuntests/sub/example": found meta tag get.metaImport{Prefix:"gitlab.com/umputuntests/sub/example", VCS:"git", RepoRoot:"https://gitlab.com/umputuntests/sub/example.git"} at //gitlab.com/umputuntests/sub/example?go-get=1
get "gitlab.com/umputuntests/sub": found meta tag get.metaImport{Prefix:"gitlab.com/umputuntests/sub", VCS:"git", RepoRoot:"https://gitlab.com/umputuntests/sub.git"} at //gitlab.com/umputuntests/sub?go-get=1
go: finding gitlab.com/umputuntests/sub/example v0.0.2
go: downloading gitlab.com/umputuntests/sub/example v0.0.2
go: extracting gitlab.com/umputuntests/sub/example v0.0.2
go get gitlab.com/umputuntests/sub/example: git ls-remote -q https://gitlab.com/umputuntests/sub.git in /Users/umputun/go-home/pkg/mod/cache/vcs/410d993df4daac0579ff6b4402c511af12bd3d00851d05ab6183293d03664961: exit status 128:
	fatal: could not read Username for 'https://gitlab.com': terminal prompts disabled
Confirm the import path was entered correctly.
If this is a private repository, see https://golang.org/doc/faq#git_https for additional information.

What did you expect to see?

with prev versions of go (1.12.x) it works file:

 go get -v gitlab.com/umputuntests/sub/example
Fetching https://gitlab.com/umputuntests/sub/example?go-get=1
Parsing meta tags from https://gitlab.com/umputuntests/sub/example?go-get=1 (status code 200)
get "gitlab.com/umputuntests/sub/example": found meta tag get.metaImport{Prefix:"gitlab.com/umputuntests/sub/example", VCS:"git", RepoRoot:"https://gitlab.com/umputuntests/sub/example.git"} at https://gitlab.com/umputuntests/sub/example?go-get=1
go: finding gitlab.com/umputuntests/sub/example v0.0.2
go: finding github.com/stretchr/testify v1.3.0
go: finding github.com/stretchr/objx v0.1.0
go: finding github.com/davecgh/go-spew v1.1.0
go: finding github.com/pmezard/go-difflib v1.0.0
go: downloading gitlab.com/umputuntests/sub/example v0.0.2
go: extracting gitlab.com/umputuntests/sub/example v0.0.2
gitlab.com/umputuntests/sub/example/strategy
gitlab.com/umputuntests/sub/example

additional notes

I think this is both gitlab problem and go 1.13 change. Gitlab seems to return go-import tag on the parent, which doesn't exist and this seems to confuse "go get", trying to check module's parent.

curl "https://gitlab.com/umputuntests/sub?go-get=1"

<html><head><meta name="go-import" content="gitlab.com/umputuntests/sub git https://gitlab.com/umputuntests/sub.git" /></head></html>

In fact, gitlab returns good-looking go-import on any random path, for example:

curl "https://gitlab.com/random/xyz/something?go-get=1"
<html><head><meta name="go-import" content="gitlab.com/random/xyz git https://gitlab.com/random/xyz.git" /></head></html>
@magodo

This comment was marked as off-topic.

Copy link

commented Sep 5, 2019

Same problem here...

@kinwyb

This comment was marked as off-topic.

Copy link

commented Sep 5, 2019

Same problem here

@leonshaw

This comment has been minimized.

Copy link

commented Sep 5, 2019

Go 1.13 tries all module prefixes in parallel, and if another error other than "not-found" happens, the query fails. Of course the root cause is on gitlab side, but I think it's reasonable to ignore all errors if at least one attempt succeeds.
A quick workaround is clearing err if found is not empty in queryPrefixModules() in cmd/go/internal/modload/query.go.

@agnivade agnivade changed the title go get v1.13 fails on gitlab subgroups cmd/go: get fails on gitlab subgroups Sep 5, 2019

@agnivade agnivade added this to the Go1.14 milestone Sep 5, 2019

@agnivade

This comment has been minimized.

Copy link
Member

commented Sep 5, 2019

@bcmills @jayconrod

EDIT: As a reminder, please note our https://github.com/golang/go/wiki/noplusone policy. Comments like "Me too" do not add anything new to the discussion.

@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 5, 2019

I think it's reasonable to ignore all errors if at least one attempt succeeds.

I could see doing that if the provided Git URL itself returned a 404 or 410 status code, but note that gitlab.com today does not:

~$ curl -sI https://gitlab.com/nonexistentuser/xyz.git | grep HTTP
HTTP/1.1 401 Unauthorized
@tangtony

This comment has been minimized.

Copy link

commented Sep 5, 2019

I'm running into issues with GitLab subgroups as well, but I'm getting a different error:

$ go get -u example.com/group/subgroup/my-go-project.git
get "example.com/group/subgroup": found meta tag get.metaImport{Prefix:"example.com/group/subgroup", VCS:"git", RepoRoot:"https://example.com/group/subgroup.git"} at //example.com/group/subgroup?go-get=1
go: finding example.com/group/subgroup/my-go-project.git latest
go get example.com/group/subgroup/my-go-project.git: git ls-remote -q https://example.com/group/subgroup.git in C:\Users\dev\go\pkg\mod\cache\vcs\5af58fdf9c9bd0fd298253dde7a0add3f4e5d6be34206b0b46f6805ee365d4ea: exit status 128:
        remote: The project you were looking for could not be found.
        fatal: repository 'https://example.com/group/subgroup.git/' not found

It almost seems like it's assuming that the second part of the path must be the project. Going back to 1.12 fixes it.

Edit: Nevermind, re-ran it with -v and it's the same issue mentioned above with GitLab returning the parent.

@bcmills bcmills added the modules label Sep 5, 2019

@leonshaw

This comment has been minimized.

Copy link

commented Sep 5, 2019

@bcmills I mean if a try on gitlab.com/group/project succeeds, we can ignore errors encountered trying gitlab.com/group and gitlab.com/group/project/submodule, no matter what they are.

@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 5, 2019

@leonshaw, maybe..? The problem with that is that it's asymmetric.

If I have credentials for module example.com/foo containing subdirectory ./bar/baz, and you have credentials for example.com/foo/bar containing subdirectory ./baz, and we both run go get example.com/foo/bar/baz, then we'll end up with different results for the exact same query.

The solution to that is to treat responses other than 404/410 as hard errors. Then I'll fail to fetch example.com/foo/bar, and you'll fail to fetch example.com/foo, and we'll both (consistently!) know that we don't have enough credentials to produce a consistent result.

(More generally: if servers want to avoid leaking information to unauthenticated users, then it's more robust to deny the existence of every module rather than claim the existence of every module — especially now that #29888 is fixed.)

@umputun

This comment has been minimized.

Copy link
Author

commented Sep 5, 2019

f I have credentials for module example.com/foo containing subdirectory ./bar/baz, and you have credentials for example.com/foo/bar containing subdirectory ./baz, and we both run go get example.com/foo/bar/baz, then we'll end up with different results for the exact same query.

I don't think this is possible, at least with giltab, as they don't have "directory-level permissions" but repository-level only. Probably the same for gihub.

Even in case, if a user has credentials to every repo on the server, go get will fail to get that non-existing parent, but instead of the authorization failure git ls-remote complains about "Could not read from remote repository."

> go get -v example.com/commons/pkg/multierror
get "example.com/commons/pkg": found meta tag get.metaImport{Prefix:"example.com/commons/pkg", VCS:"git", RepoRoot:"https://example.com/commons/pkg.git"} at //example.com/commons/pkg?go-get=1
get "example.com/commons/pkg/multierror": found meta tag get.metaImport{Prefix:"example.com/commons/pkg/multierror", VCS:"git", RepoRoot:"https://example.com/commons/pkg/multierror.git"} at //example.com/commons/pkg/multierror?go-get=1
go get example.com/commons/pkg/multierror: git ls-remote -q https://example.com/commons/pkg.git in /Users/umputun/go-home/pkg/mod/cache/vcs/882b5ac6f69791979e3abd70af30e871e32fe91173f2e25079ef6a88464ec13e: exit status 128:
	> GitLab: The project you were looking for could not be found.
	fatal: Could not read from remote repository.
@leonshaw

This comment has been minimized.

Copy link

commented Sep 5, 2019

If I have credentials for module example.com/foo containing subdirectory ./bar/baz, and you have credentials for example.com/foo/bar containing subdirectory ./baz, and we both run go get example.com/foo/bar/baz, then we'll end up with different results for the exact same query.

Is this something more fundamental? (A module can't be uniquely located by an import path.)

@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 5, 2019

at least with [gitlab], as they don't have "directory-level permissions" but repository-level only. Probably the same for [github].

GitLab and GitHub are not the only ways to host a module. Ideally, the semantics of the go command should generalize to self-hosted domains with arbitrary structure.

@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 5, 2019

Is this something more fundamental? (A module can't be uniquely located by an import path.)

A module has a unique path. The mapping of package paths to modules is not guaranteed to be 1:1, although the go command fails with an error message if any one package within the package import graph is provided by more than one module in the module graph.

@leonshaw

This comment has been minimized.

Copy link

commented Sep 5, 2019

A module has a unique path. The mapping of package paths to modules is not guaranteed to be 1:1, although the go command fails with an error message if any one package within the package import graph is provided by more than one module in the module graph.

Yes, this is the problem. We need a way to avoid the collision, if possible. If it's simply treated as error, example.com/foo/bar will be unusable if example.com/foo exists.

@umputun

This comment has been minimized.

Copy link
Author

commented Sep 5, 2019

GitLab and GitHub are not the only ways to host a module. Ideally, the semantics of the go command should generalize to self-hosted domains with arbitrary structure.

Probably, self-hosted giltab used by a majority of self-hosted git systems. However, this new behavior seems to conflict not with giltab only, but gitea as well and may conflict with any system surprised by parent discovery calls. I appreciate the generalized approach and concerns about asymmetrical behavior in some theoretical cases, but the current problem if very practical one.

@bcmills bcmills self-assigned this Sep 5, 2019

@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 5, 2019

I have most of a solution coded up, but there is a bad interaction with proxy fallback that I still need to work out.

Even then, my expectation is that a server that wishes to indicate “there is no repository here” (for a repository explicitly mentioned in a go-import <meta> tag) will serve a 404 or a 410 for the URL. I'm not sure whether gitlab.com does that for authenticated users, but for unauthenticated users they're currently serving 401s instead.

@gopherbot

This comment has been minimized.

Copy link

commented Sep 10, 2019

Change https://golang.org/cl/194560 mentions this issue: cmd/go/internal/modfetch: report the module path for errors in (*codeRepo).Versions

@gopherbot

This comment has been minimized.

Copy link

commented Sep 10, 2019

Change https://golang.org/cl/194561 mentions this issue: cmd/go/internal/modfetch/codehost: treat nonexistent repositories as “not found”

gopherbot pushed a commit that referenced this issue Sep 10, 2019
cmd/go/internal/modfetch: report the module path for errors in (*code…
…Repo).Versions

Updates #34094

Change-Id: Ifd10b51c2b4ebe77c4f8f68726e411f54c13b9c9
Reviewed-on: https://go-review.googlesource.com/c/go/+/194560
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Jay Conrod <jayconrod@google.com>
@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 10, 2019

@gopherbot, please backport to 1.13: this is a regression, and prevents previously-working modules from being fetched.

@gopherbot

This comment has been minimized.

Copy link

commented Sep 10, 2019

Backport issue(s) opened: #34215 (for 1.13).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

gopherbot pushed a commit that referenced this issue Sep 11, 2019
cmd/go/internal/modfetch/codehost: treat nonexistent repositories as …
…“not found”

If a go-import directive refers to a nonexistent repository, today we
treat that as an error fetching a module that actually exists.
That makes the HTTP server responsible for determining which
repositories do or do not exist, which may in general depend on
the user's separately-stored credentials, and imposes significant
complexity on such a server, which can otherwise be very simple.

Instead, check the repository URL and/or error message to try to
determine whether the repository exists at all. If the repo does not
exist, treat its absence as a “not found” error — as if the server had
not returned it in the first place.

Updates #34094

Change-Id: I142619ff43b96d0de428cdd0b01cca828c9ba234
Reviewed-on: https://go-review.googlesource.com/c/go/+/194561
Reviewed-by: Jay Conrod <jayconrod@google.com>
@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 11, 2019

I've implemented the approach described in #34094 (comment). That should solve the problem if GitLab is correctly serving 404 or 410 responses for nonexistent paths for signed-in users.

However, I don't know whether they're doing that, so please help me confirm.

@umputun (or one of the other folks with this problem): please try a go command built from head and let me know whether your problem is resolved. (The easy way to try a go command built from head is to install golang.org/x/dl/gotip, then run gotip download and use gotip in place of the regular go command.)

@gopherbot

This comment has been minimized.

Copy link

commented Sep 11, 2019

Change https://golang.org/cl/194679 mentions this issue: [release-branch.go1.13] cmd/go/internal/modfetch/codehost: treat nonexistent repositories as “not found”

@umputun

This comment has been minimized.

Copy link
Author

commented Sep 11, 2019

@bcmills I have tried to test it with both public gitlab as well as self-hosted and it doesn't seem to work. I have also tried it with full credentials and git's "insteadOf" mapping from https:// to git@... and it failed either. Both are working fine with 1.12.

You can try it with public repo from the original ticket, i.e. go get -v gitlab.com/umputuntests/sub/example vs gotip get -v gitlab.com/umputuntests/sub/example.

That should solve the problem if GitLab is correctly serving 404 or 410 responses for nonexistent paths for signed-in users.

I don't think giltab does it. To me it looks like they return 401 on any non-existent paths. Probably adding 401 to the list of the errors on the parent discovery you consider as "missing module" may fix it.

@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 11, 2019

Probably adding 401 to the list of the errors on the parent discovery you consider as "missing module" may fix it.

I don't think that would be appropriate. Per RFC 2616 §10.4.2 (emphasis mine):

If the request already included Authorization credentials, then the 401 response indicates that authorization has been refused for those credentials. If the 401 response contains the same challenge as the prior response, and the user agent has already attempted authentication at least once, then the user SHOULD be presented the entity that was given in the response, since that entity might include relevant diagnostic information.

I can certainly make the case that we should interpret a repo 404 as “not found”, but a server that explicitly tells the go command to check a particular repo really should not also tell the go command that it is not authorized to know whether that repo actually exists.

At the very least, if a more RFC-compliant response code is not feasible I'd like to hear the rationale from someone at GitLab.

@umputun

This comment has been minimized.

Copy link
Author

commented Sep 11, 2019

The ticket I have opened on gitlab was closed by them as a duplicate to another, 2y old ticket they refused to address.

From what I understand, gitlab considers 404 on non-existent repos as a security issue, i.e. someone could realize what private repo exists. I don't think it has anything to do with authorized status - 401 seems to be returned on any nonexisting path regardless of the auth status.

Generally speaking, such behavior is a little bit odd, but not obviously wrong. go get is trying to access parent(s) of the repo which may not be repo at all, but some protected page/service. Sure it will be nice if gitlab return no go-import in such a case, but I guess they consider it security issue as well.

@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 11, 2019

gitlab considers 404 on non-existent repos as a security issue, i.e. someone could realize what private repo exists.

Ironically, GitHub seems to take the opposite approach: they return 404 uniformly for unauthenticated users.

Generally speaking, such behavior is a little bit odd, but not obviously wrong. go get is trying to access parent(s) of the repo which may not be repo at all, but some protected page/service.

For the example of gitlab.com/umputuntests/sub/example, the cat is already out of the bag: the non-401 response for https://gitlab.com/umputuntests/sub/example already reveals the existence of the entities gitlab.com/umputuntests and gitlab.com/umputuntests/sub.

The same would seem to be true for private subgroups for appropriately-credentialed users.
So the “security issue” explanation does not seem to hold water.

Moreover, given that #29888 has been addressed in Go 1.13, it seems that they could simply stop serving the go-import meta tags in 401 responses in order to avoid revealing information to unauthenticated users.

@umputun

This comment has been minimized.

Copy link
Author

commented Sep 11, 2019

the non-401 response for https://gitlab.com/umputuntests/sub/example already reveals the existence of the entities gitlab.com/umputuntests and gitlab.com/umputuntests/sub

Not sure I understand. Both gitlab.com/umputuntests/sub and gitlab.com/umputuntests/sub/example are public. If you try anything completely random, you will get 401 for any non-existent as well as any private repo. By doing this they are trying to hide the presence of private repos.

http https://gitlab.com/random-user/very-very-random-repo/sub/blah.git
HTTP/1.1 401 Unauthorized

Http access to the real private repo as well as to non-existent repo redirects to gitlab.com/users/sign_in

that they could simply stop serving the go-import meta tags in 401 response

I see what you mean, but technically it is not entirely accurate - they serve meta tags with 200 status, not 401

@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 11, 2019

If you try anything completely random, you will get 401 for any non-existent as well as any private repo.

Sure, but the go command is not trying “anything completely random”. It is trying prefixes of the requested import path, and repositories to which it was directed via <meta> tags at those prefixes. And we are assuming that the user has sufficient credentials to at least reveal the existence of some module with a matching prefix.

@umputun

This comment has been minimized.

Copy link
Author

commented Sep 11, 2019

Understood. I believe the root cause is meta, gitlab returns unconditionally with 200 on any path, but for the end-user, it looks and feels like a regression in go 1.3 - before update modules worked with gitlab, and after the update, it stopped to work.

I'm not sure what changed with the parent discovery between 1.2 and 1.3, but if this is due to "Go 1.13 tries all module prefixes in parallel, and if another error other than "not-found" happens, the query fails." can this logic be adjusted to succeed if the repo has go.mod regardless of the parrent status? If such behavior is not acceptable generally, maybe some env param to make gitlab happy and allow such forgiving modules discovery?

@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 12, 2019

can this logic be adjusted to succeed if the repo has go.mod regardless of the parrent status?

The presence or absence of a go.mod file seems largely independent. We could perhaps go back to ignoring errors for shorter paths if a longer path succeeded, but I would rather we get the server fixed to return semantically-appropriate responses.

@umputun

This comment has been minimized.

Copy link
Author

commented Sep 12, 2019

Agree, get the server fixed is the right way, but pragmatically, chances they are going to fix it very slim.

From what I see, as of now "1.13 -> gitlab" can be addressed in 4 ways:

  • gitlab change the behavior and makes it correct from go get (>=1.13) point of view. This unlikely to happen soon (ever?)
  • ignoring errors for shorter path - sounds doable, but you the expert.
  • Writing some kind of go-git-proxy sitting between "go get" and gitlab repo. It may work, but this thing is going to be ugly, with url rewrites and extra configuration of git. Doable if no other choice, but ignoring errors seems to be a better one.
  • There is one more workaround - define a bunch of rewrites in go.mod, one for each gitlab's module. This one will do the job, but it is not just an ugly hack but partially defeats the purpose of modules.
replace gitlab.example.com/commons/pkg/mongo => gitlab.example.com/commons/pkg/mongo.git v1.0.3
@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 12, 2019

@umputun, note that if the module is already in the build list, we don't bother searching all possible paths for it. So one workaround, if you already know the module or repo path, is to simply run something like

go mod edit -require gitlab.com/umputuntests/sub/example@master

in order to prime the appropriate paths before running any remaining go get subcommands.

foo.example.com$ go1.13 mod init foo.example.com
go: creating new go.mod: module foo.example.com

foo.example.com$ go1.13 mod edit -require gitlab.com/umputuntests/sub/example@master

foo.example.com$ go1.13 list -m all
go: finding gitlab.com/umputuntests/sub/example master
go: finding github.com/davecgh/go-spew v1.1.0
go: finding github.com/pmezard/go-difflib v1.0.0
go: finding github.com/stretchr/objx v0.1.0
go: finding github.com/stretchr/testify v1.3.0
foo.example.com
github.com/davecgh/go-spew v1.1.0
github.com/pmezard/go-difflib v1.0.0
github.com/stretchr/objx v0.1.0
github.com/stretchr/testify v1.3.0
gitlab.com/umputuntests/sub/example v0.0.2

@bcmills bcmills removed the WaitingForInfo label Sep 12, 2019

@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 12, 2019

I'd still like to hear from someone at @gitlabhq (maybe @stanhu?) to get their take on this.

@jamesramsay

This comment has been minimized.

Copy link

commented Sep 12, 2019

Agree, get the server fixed is the right way, but pragmatically, chances they are going to fix it very slim.

👋 thanks for the ping. I've read https://gitlab.com/gitlab-org/gitlab-ce/issues/67020#note_212856862 and the thread above, and want to confirm I understand the preferred fix.

@umputun by "right way", do you mean returning 404 rather than 401?

Since this used to work in 1.12, could you help me understand the changes made on the go side in 1.13 that means this is problematic? I'm just trying to get a handle on the broader context.

@umputun

This comment has been minimized.

Copy link
Author

commented Sep 12, 2019

@jamesramsay correct. I believe (@bcmills correct if I wrong) 404 will fix the problem.

Another way to fix it - return 404 on GET <path>?go-get=1 for non-repo paths, i.e. groups, subgroups and anything else what is not a repository.

@leonshaw

This comment has been minimized.

Copy link

commented Sep 13, 2019

@jamesramsay correct. I believe (@bcmills correct if I wrong) 404 will fix the problem.

Another way to fix it - return 404 on GET <path>?go-get=1 for non-repo paths, i.e. groups, subgroups and anything else what is not a repository.

Technically, GET <path>.git and GET <path>?go-get=1 should return the same type of error in this case.

@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 13, 2019

@jamesramsay

I've read […], and want to confirm I understand the preferred fix.

With CL 194679 in, there are several possible fixes:

  1. You could continue to serve <meta name="go-import" […]> tags for all URLs that could potentially exist as repositories, but serve 404s or 410s (instead of 401s) for requests for the actual repository (https://[…].git) when that the user is authorized to access any repository that has the requested path as a prefix.

    • For example: if the user is authorized to see https://gitlab.com/umputuntests/sub/example.git, then a response for https://gitlab.com/umputuntests/sub.git or https://gitlab.com/umputuntests.git should have code 404 or 410, not 401.
  2. You could remove the <meta name="go-import" […]> tags from https://[…]?go-get=1 responses for all URLs that don't correspond to actual repositories. (The HTTP response code does not matter as long as it is not a redirect.)

    • To preserve path secrecy, you would presumably want to also remove <meta […]> tags for repositories that the requester is not authorized to see. Unfortunately, that might cause go get to fail for users on Go 1.12 due to #29888 (which is fixed in 1.13).
  3. You could remove the <meta name="go-import" […]> tags from https://[…]?go-get=1 responses for all URLs that don't correspond to actual repositories when the user is authorized to access a repository that has the requested path as a prefix.

    • For example, if the user is authorized to see https://gitlab.com/umputuntests/sub/example.git, then a response for https://gitlab.com/umputuntests/sub?go-get=1 should omit the <meta […]> tag, since the user already knows that that URL does not correspond to an actual repository.
@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 13, 2019

@leonshaw

Technically, GET <path>.git and GET <path>?go-get=1 should return the same type of error in this case.

That's true, but turns out not to matter that much: the go command intentionally looks for <meta name="go-import" […]> tags even within 404 (or 401) responses.

(That is: the content is what matters for the ?go-get=1 request, but the status is what matters for the .git request, and the two are mostly orthogonal.)

@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 13, 2019

@jamesramsay

Since this used to work in 1.12, could you help me understand the changes made on the go side in 1.13 that means this is problematic?

In CL 173017, I changed the go command's algorithm for mapping an unresolved import path to the corresponding module path.

It was previously a sequential search starting from the longest path, working downward to progressively shorter paths, and the error handling was a bit sloppy.

Now, it searches in parallel for modules at all possible prefixes of the import path, which can significantly reduce overall latency (especially when a proxy is involved). However, the fact that the search occurs in parallel means that we end up trying a lot of potential module paths that turn out not to be valid.

We're fine with an invalid module path, as long as the error is that the module does not exist.

  • The main mechanism for a server to indicate “the module does not exist” is to not serve a <meta name="go-import" […]> tag for the ?go-get=1 request for that path.

  • As of CL 194679 (which is merged but not released), we also support a secondary mechanism: if the server provides a <meta […]> tag with an HTTPS URL, and that URL serves status 404 or 410, we interpret that to mean “the module would be at that URL if it existed, but it's not there so it doesn't exist.”

@umputun

This comment has been minimized.

Copy link
Author

commented Sep 16, 2019

@bcmills - I had a chance to test this issue with gitea. For 1.13 the situation is similar to gitlab's, i.e. the same regression (worked with 1.12 and fails on 1.13). However, with gotip it worked fine. gitea seems to return go meta on any path, pretty much the same way as gitlab. However for those non-existing repos gitea, unlike gitlab, returns 404.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants
You can’t perform that action at this time.