
cmd/go: get reports "module source tree too big" #29210

Closed
orange-jacky opened this issue Dec 13, 2018 · 14 comments

4 participants
@orange-jacky

commented Dec 13, 2018

macbookpro:tmp fredlee$ go env

GOARCH="amd64"
GOBIN="/Users/fredlee/Documents/develop/go/workspace/bin"
GOCACHE="/Users/fredlee/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/fredlee/Documents/develop/go/workspace"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/34/z2z_vp1573g5014m8k7wys100000gn/T/go-build454370012=/tmp/go-build -gno-record-gcc-switches -fno-common"
macbookpro:tmp fredlee$ go version
go version go1.11 darwin/amd64


macbookpro:tmp fredlee$ go get -v github.com/therecipe/qt/cmd/...
go: cannot find main module; see 'go help modules'

macbookpro:tmp fredlee$ go mod init a
go: creating new go.mod: module a


macbookpro:tmp fredlee$ go get -v github.com/therecipe/qt/cmd/...
go: finding github.com/therecipe/qt/cmd/... latest
go: finding github.com/therecipe/qt/cmd latest
go: finding github.com/therecipe/qt latest
go: downloading github.com/therecipe/qt v0.0.0-20181205230143-33cc66032318
go get github.com/therecipe/qt/cmd/...: module source tree too big


macbookpro:tmp fredlee$ go get github.com/therecipe/qt/cmd/...
go: finding github.com/therecipe/qt/cmd/... latest
go: finding github.com/therecipe/qt/cmd latest
go: finding github.com/therecipe/qt latest
go: downloading github.com/therecipe/qt v0.0.0-20181205230143-33cc66032318
go get github.com/therecipe/qt/cmd/...: module source tree too big

How can I fix this?

@agnivade

Member

commented Dec 13, 2018

Looks like we hit #25470 again. The initial size limit was 100MB, which was raised to 500MB. Can you tell us the size of "github.com/therecipe/qt/cmd"?

/cc @bcmills @rsc

@agnivade changed the title from: go get report "module source tree too big" to: cmd/go: get report "module source tree too big" (Dec 13, 2018)

@orange-jacky

Author

commented Dec 13, 2018

macbookpro:github.com fredlee$ du -sm therecipe/
858 therecipe/
macbookpro:github.com fredlee$ cd therecipe/qt/cmd
macbookpro:cmd fredlee$ cd ..
macbookpro:qt fredlee$ du -sm cmd
1 cmd
macbookpro:qt fredlee$

@bcmills

Member

commented Dec 13, 2018

As I noted in https://golang.org/cl/150738:

I had a look at github.com/therecipe/qt, and the vast bulk of the repository is in .index files in versioned subdirectories of internal/binding/files/docs. Those are large XML files, and appear to be completely irrelevant to the Go build and test process.

Those files should be pruned from the module (either by pruning them from the repository itself, or by placing another go.mod file at an appropriate cut point).

At that point we may find that we need to adjust ReadZip to be able to prune nested modules more aggressively, but I don't think simply raising the limit is an appropriate solution.

@bcmills bcmills added the modules label Dec 13, 2018

@bcmills bcmills added this to the Unplanned milestone Dec 13, 2018

@bcmills

Member

commented Dec 13, 2018

(CC @therecipe; see also therecipe/qt#726)

@bcmills changed the title from: cmd/go: get report "module source tree too big" to: cmd/go: get reports "module source tree too big" (Dec 13, 2018)

@therecipe


commented Dec 13, 2018

I had a look at github.com/therecipe/qt, and the vast bulk of the repository is in .index files in versioned subdirectories of internal/binding/files/docs. Those are large XML files, and appear to be completely irrelevant to the Go build and test process.

Yeah, these files are only necessary for certain builds, such as Homebrew and MacPorts builds on macOS or MSYS2 builds on Windows, as well as for a backward-compatibility feature.

It is planned to prune and segregate the *.index files into a different repo at some point, but its uncompressed size will easily exceed the 500MB limit as well.
I could probably create a repo for each (major) version of Qt to work around this, but it's a rather ugly workaround for this limitation.
Also since the compression ratio is great (17:1) for these files, I see no problem in collecting them all in one repo atm.

Is there some technical reason for the size limitation? I took a look at the patch to work around this and noticed that there are also limits in place for the go.mod and License files. That strikes me as odd, to be honest: there is usually no reason for License files to exceed this limit, but why also limit the size of go.mod files?

I would propose to remove these limits altogether, if there is no technical reason for them to be in place.

Otherwise, other people will raise additional issues down the line if they try to bundle binaries and build something like https://pypi.org/project/tensorflow-gpu/#files in Go (which is 1GB uncompressed). One Go project I know of that already does this is https://github.com/dontpanic92/wxGo (which is 400MB uncompressed). I also plan to distribute binaries sometime in the future, to make go get as simple to use as pip install.

@bcmills

Member

commented Dec 13, 2018

It is planned to prune and segregate the *.index files into a different repo at some point, but its uncompressed size will easily exceed the 500MB limit as well.

Why would those repos be relevant to the module limit? The go command doesn't execute arbitrary commands to build modules, so if something needs to process those files it seems like you'll need some build step outside of the module system anyway. That build step will presumably not have the same constraints as the go command in module mode.

Also since the compression ratio is great (17:1) for these files, I see no problem in collecting them all in one repo atm.

The go tool unzips module contents in order to build the modules. Large files that are not pertinent to the module build and test cycle waste users' disk capacity, even if they compress well for transfer.

@therecipe


commented Dec 13, 2018

Why would those repos be relevant to the module limit? The go command doesn't execute arbitrary commands to build modules, so if something needs to process those files it seems like you'll need some build step outside of the module system anyway. That build step will presumably not have the same constraints as the go command in module mode.

Yeah, one could handle this in an additional build step, but what about C/C++ binaries someone wants to bundle to make one's repo "go get-able"?
The qt repo is currently using a custom build system, so it's not that difficult for me to work around this limitation, but what about other people?
Why force them to create a custom build system, if the underlying issue is just some easy to fix (probably arbitrary) size limit?

If there were some way to execute arbitrary commands from the "go get" command, then this wouldn't be an issue. But since there isn't, the only idiomatic way to go about this is to bundle these binaries in the repo or in some platform-specific extra dependency repo.

The go tool unzips module contents in order to build the modules. Large files that are not pertinent to the module build and test cycle waste users' disk capacity, even if they compress well for transfer.

Yes, this holds some truth, but the setup required for the repo to work is usually a couple of GB in size, so the size of these files is negligible in most cases.
But yeah, I planned to replace (or cut down) these index files anyway, regardless of the outcome of this issue, to hopefully make the repo easily "go get-able" in the future.

However, is there any technical reason for the size limit of repos?

@bcmills

Member

commented Dec 13, 2018

what about C/C++ binaries someone wants to bundle to make one's repo "go get-able"?

go get in module mode operates on modules, not entire repositories. If you want users to get the whole repository, please instruct them to use the relevant VCS command (such as git clone) instead.

Why force them to create a custom build system, if the underlying issue is just some easy to fix (probably arbitrary) size limit?

If we find concrete examples of otherwise-module-compatible code that hits the size limit due to content actually needed for the module build, we can certainly consider increasing the limit.

However, is there any technical reason for the size limit of repos?

For one, it detects modules that accidentally include large files that aren't relevant to module builds. 🙂
(That's a real, technical issue: large-but-unnecessary files waste users' disk space and bandwidth, both of which are finite.)

@therecipe


commented Dec 13, 2018

go get in module mode operates on modules, not entire repositories. If you want users to get the whole repository, please instruct them to use the relevant VCS command (such as git clone) instead.

Okay, that was poorly phrased from my end.
I don't want my users to need to resort to "git clone" or similar, I simply want them to be able to use go get.

For one, it detects modules that accidentally include large files that aren't relevant to module builds. 🙂
(That's a real, technical issue: large-but-unnecessary files waste users' disk space and bandwidth, both of which are finite.)

Hm, yes, that makes sense, but it seems it wasn't really considered when resolving #25470.

However, you got me thinking.
I haven't really looked into the Go module support yet, and just merged the PR that made it work for one of my users (the same one who filed the PR to increase the size limit, it seems).

Can I simply create another module (i.e. create another go.mod file in the qt/internal/binding dir) and exclude the files dir using the exclude directive, to solve this issue?
Or even directly exclude it from the root go.mod?

If yes, then this would be a trivial fix on my side, and I'm sorry for wasting both of our time.

@bcmills

Member

commented Dec 14, 2018

Can I simply create another module (i.e. create another go.mod file in the qt/internal/binding dir) and exclude the files dir using the exclude directive, to solve this issue?

That should work, I think. (If it does not, please file another issue and I'll investigate.)

@therecipe


commented Dec 14, 2018

Okay, I looked into the module support.
And I can't simply use exclude like I initially thought yesterday.
I found hacks to work around the error message, but they completely defeat the purpose of how I planned to use the module support.

I think I'm starting to understand how modules are intended to be used, and it seems I want to use them in a way they are not meant to be used:
mainly to also distribute versioned files that are only indirectly related to the go build process, since they would only be processed by the custom build system to generate Go code, which is then made available to the module through the replace directive.

The long-term plan is to also distribute versioned C binaries and header files to be used alongside pre-generated Go code without the need for a custom build system (i.e. to create a "go get-able" module). At that point the current size limit might be a problem as well.

You already said that if I use a custom build system then

That build step will presumably not have the same constraints as the go command in module mode.

Yes, I could fetch the files that are only indirectly related to the go build at some point after "go get", but that only makes things unnecessarily complicated for everybody involved.

I can agree with the reasoning behind these limits, but in my opinion they are only weak hurdles (which are also only weakly enforced, see #25470) to keep people on track.
Being a little opinionated is great, but imposing too many of these small annoyances on people who want to leave the track for a few steps will alienate them sooner or later.
Please don't get me wrong, I can understand why these limits are in place and see how they align with Go's philosophy in general.
But I will work around them nevertheless if I have to, so these limits will have been ineffective and nobody is happy in the end, which is not the outcome I hoped for.

Btw, thanks for your time :)
I should have looked into the module support a lot earlier...

@therecipe


commented Dec 16, 2018

Okay, I worked around the size limit with therecipe/qt@5251d34#diff-4461efdc93245626ff859334a9abf39c for now.
So from my side, feel free to close this issue.

However I still think that the size limit shouldn't be enforced by the go tool and people should be able to decide for themselves.
Ultimately only the people that really host the modules should be able to enforce size limits.
Maybe only print a warning or something instead...

@agnivade

Member

commented Dec 23, 2018

@bcmills - anything else to do here?

@bcmills

Member

commented Jan 10, 2019

Don't think so. Folks can file separate issues if they discover other repositories that don't fit.
