cmd/go: prune nested modules from git export zipfile #29987

Open
coconaut opened this issue Jan 30, 2019 · 9 comments

Comments

@coconaut

commented Jan 30, 2019

What version of Go are you using (go version)?

$ go version go1.11.5 darwin/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/alexberry/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/alexbarry/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/Cellar/go/1.11.5/libexec"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.11.5/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/cd/2sl0rxtn7y9bvdgyj8z7w6c80000gp/T/go-build437162119=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

I ran go build on a Go module that depends on a separate Go module, which uses git-lfs for large files. This build had been working as of go1.11.

What did you expect to see?

Go mod should fetch the dependency and go build should succeed.

What did you see instead?

The go build errored with: "unknown import path... : module source tree too big."

For some background, I had previously added several files to git-lfs for the sole purpose of getting around the codehost.MaxZipFile limit of 500MB so I could use go modules back when 1.11 was released. Before doing this, I was also hit with "module source tree too big." The large files are not used at all when this repo is used as a dependency here - only one small package from that repo is imported.

After a decent dive into go get back in 1.11, I figured out I also had to add "export-ignore" to my .gitattributes in this repo to exclude the lfs file patterns from the zip when git archive is called. This finally got my builds working and let me use go modules. Everything had been working great after that.

I just upgraded to 1.11.5 (after another developer did and committed some go.sum files that broke our checksums due to the symlink issue) and cleaned my modcache. Then this unrelated build stopped working.

I believe the change was first introduced in go1.11.2 by this commit to src/cmd/go/internal/modfetch/codehost/git.go:
05b2b9b

I understand the reasoning is to guarantee the checksums are consistent, but I really think the git-lfs case needs to be considered. I have to imagine others using git-lfs within go modules would prefer their modules are importable and deal with breaking checksums on their own terms. The original issue there was more focused around export-subst than export-ignore anyway.

If we can't bring back export-ignore, another option might be to make codehost.MaxZipFile configurable, or to provide some other way to specify in go.mod which packages should be included in or excluded from this size check.

If there's an alternate workaround here that doesn't keep me on 1.11 or force me to split apart my repo, I'd be happy to give that a shot too. I've been a bit focused on export-ignore, but maybe there's a better flow to getting git-lfs and go mod to play nice?

There are a few smaller items here that might be worth mentioning:

  1. Behind the scenes, as far as I can tell, go get is actually downloading the entire package (lfs files now included) before deciding it is too big. Not only do you have to wait, but the work has already been done, so it seems strange to error after all this. I guess we save the extraction?

  2. The archive itself is ~300 MB as downloaded; it seems that the uncompressed size is what triggers the error. Does it need to be this way?

  3. go mod tidy reports that it is downloading the dependency, and finishes with no errors. After it's done, it has not only failed to retrieve the module, but it has silently removed the require line altogether from the go.mod file.

  4. go mod vendor also fails silently. This means when you're using go build -mod=vendor, you'll later get a 'cannot find package "." in (path)' error. It's not clear at all this was related to your module being too large.
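To illustrate the difference raised in item 2 between the size of the archive as downloaded and the size of its uncompressed contents, here is a minimal sketch in Go. It is not the cmd/go implementation: the constant `maxZipFile` only mirrors the 500MB figure mentioned above, and the helper names `uncompressedSize` and `buildZip` are hypothetical.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
)

// maxZipFile mirrors the 500MB limit mentioned in this issue. How and where
// cmd/go applies its own limit is not shown here; this is an assumption.
const maxZipFile = 500 << 20

// uncompressedSize returns the total uncompressed size of all entries in the
// zip archive held in data, as recorded in the archive's central directory.
func uncompressedSize(data []byte) (uint64, error) {
	r, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return 0, err
	}
	var total uint64
	for _, f := range r.File {
		total += f.UncompressedSize64
	}
	return total, nil
}

// buildZip creates an in-memory archive with a single, highly compressible
// entry of n zero bytes (a stand-in for a large model file).
func buildZip(n int) ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	f, err := w.Create("models/big_binary_file_1.bin")
	if err != nil {
		return nil, err
	}
	if _, err := f.Write(make([]byte, n)); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := buildZip(1 << 20)
	if err != nil {
		panic(err)
	}
	total, err := uncompressedSize(data)
	if err != nil {
		panic(err)
	}
	fmt.Printf("archive as downloaded: %d bytes\n", len(data))
	fmt.Printf("uncompressed contents: %d bytes\n", total)
	fmt.Println("over limit:", total > maxZipFile)
}
```

The two numbers can differ widely, so an archive that is under a limit as downloaded may still exceed it once the entry sizes are summed.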

Thank you for all the help! Just want to add that in general, I've had great success with go modules and love using them.

@coconaut

Author

commented Jan 30, 2019

See #27153 for the issue that caused this change.

@bcmills

Member

commented Jan 30, 2019

> The large files are not used at all when this repo is used as a dependency here - only one small package from that repo is imported.

> I've been a bit focused on export-ignore, but maybe there's a better flow to getting git-lfs and go mod to play nice?

If the large files aren't needed by the packages in the module, that part of the source tree probably shouldn't be included in the module at all. Note that you can add go.mod files to subtrees to prune them out of the module containing the parent directory; have you tried that?

A concrete example of a repo with this sort of structure would be helpful.

@coconaut

Author

commented Jan 30, 2019

Thank you for the response. I haven't tried adding go.mod files to subtrees yet, I can give that a shot. Would that pruning be performed in time for the size check on the zip? If so, that might be perfect.

I'll work on getting a concrete sample repo up. The general structure looks something like this:

repo_root
  - models
    - big_binary_file_1.bin
    - big_binary_file_2.bin
  - some_library
    - some_library.go
  - client
    - client.go
  - main.go (runs a service that must load the model binaries)
  - go.mod
  - .gitattributes (has pattern for lfs and export-ignore on *.bin)

A second repo then imports both the client and library packages from this repo and communicates with the service.

@bcmills

Member

commented Jan 30, 2019

> Would that pruning be performed in time for the size check on the zip?

I'm not sure, but it arguably should be.

(At the very least, that's something we can address without impacting checksums or introducing too much deviation between git checkouts and module contents. Making a previously-unfetchable module work is much less invasive than changing the contents of a module that is already in wide use.)

@bcmills

Member

commented Jan 30, 2019

Sounds like you can probably drop a go.mod file in the models subdirectory to prune it out.

@coconaut

Author

commented Jan 30, 2019

> I'm not sure, but it arguably should be.

> (At the very least, that's something we can address without impacting checksums or introducing too much deviation between git checkouts and module contents. Making a previously-unfetchable module work is much less invasive than changing the contents of a module that is already in wide use.)

Sounds good to me.

I'll get a test going with another go.mod in the models dir in a bit, but if you want a clean test case for the original setup, I've made two repos. Gobig has the lfs files, and gosmall has a simple program to call a function imported from a package in gobig. Works in go 1.11, doesn't work in go 1.11.5: downloaded zip file too large (I think it may be failing slightly earlier than my original case, but still highlights the issue).

https://github.com/coconaut/gobig

https://github.com/coconaut/gosmall

@coconaut

Author

commented Jan 30, 2019

Okay, set up a third repo called gobigger. It's the same as gobig, but with a go.mod inside the models dir. Unfortunately I'm seeing the same behavior. You can test by using the gosmall repo and changing the go.mod and the import line in main.go to point to gobigger instead of gobig.

Not sure if I set up the models/go.mod correctly? I have it looking like this:

module github.com/coconaut/gobigger/models

https://github.com/coconaut/gobigger

@bcmills removed the WaitingForInfo label on Jan 30, 2019
@bcmills

Member

commented Jan 30, 2019

Thanks for the detail!

@coconaut

Author

commented Jan 31, 2019

Np, thanks for checking it out!

@bcmills self-assigned this on Feb 12, 2019
@andybons modified the milestones: Go1.13, Go1.14 on Jul 8, 2019
@bcmills changed the title from "cmd/go: ensureGitAttributes breaks git-lfs go modules" to "cmd/go: prune nested modules from git export data" on Sep 18, 2019
@bcmills changed the title from "cmd/go: prune nested modules from git export data" to "cmd/go: prune nested modules from git export zipfile" on Sep 18, 2019
@rsc modified the milestones: Go1.14, Backlog on Oct 9, 2019