Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/archive.FileInfoHeader: fill file type bits #33935

Merged
merged 1 commit into from Jul 6, 2017

Conversation

AkihiroSuda
Copy link
Member

@AkihiroSuda AkihiroSuda commented Jul 4, 2017

Signed-off-by: Akihiro Suda suda.akihiro@lab.ntt.co.jp

- What I did

Fix Go 1.9 test failure #33892 (comment)

- How I did it

Go 1.9 (golang/go@66b5a2f) removed file type bits from
archive/tar.FileInfoHeader().

This commit ensures file type bits are filled even on Go 1.9 for
compatibility.

- How to verify it

Execute remotecontext tests on both Go 1.8 and Go 1.9beta2.

$ go test -c ./builder/remotecontext && sudo ./remotecontext.test -test.v
$ go1.9beta2 test -c ./builder/remotecontext && sudo ./remotecontext.test -test.v

go1.9beta2 can be installed by running go get golang.org/x/build/version/go1.9beta2.

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)
penguins

@vieux
Copy link
Contributor

vieux commented Jul 4, 2017

LGTM

func FileInfoHeader(name string, fi os.FileInfo, link string) (*tar.Header, error) {
hdr, err := tar.FileInfoHeader(fi, link)
if err != nil {
return nil, err
}
hdr.Mode = int64(chmodTarEntry(os.FileMode(hdr.Mode)))
hdr.Mode = fillFileTypeBits(int64(chmodTarEntry(os.FileMode(hdr.Mode))), fi)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we make it even more obvious that this is a stdlib workaround by calling it fillGo18FileTypeBits

@tonistiigi
Copy link
Member

Design LGTM

Go 1.9 (golang/go@66b5a2f) removed file type bits from
archive/tar.FileInfoHeader().

This commit ensures file type bits are filled even on Go 1.9 for
compatibility.

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
@AkihiroSuda
Copy link
Member Author

done @tonistiigi

@tiborvass
Copy link
Contributor

@AkihiroSuda @tonistiigi why were the tests failing on the 1.9 bump PR and why does this PR fix it?
IOW: I thought that Go1.9 would simply bust the cache from one docker release to another. But this clearly fixes something.

@tonistiigi
Copy link
Member

@tiborvass Build cache hashes are computed from tar headers. go1.9 changes tar.FileInfoHeader so it returns a different header and therefore a different hash. This PR modifies our own archive.FileInfoHeader function (that calls tar.FileInfoHeader internally) to revert the go1.9 changes.

@thaJeztah
Copy link
Member

@tonistiigi what I was wondering is if this only affects the build cache, or also the content-addressable store. If only the build-cache, I'm wondering if a one-time cache-bust would be acceptable (preventing us to having to keep carrying this patch)

@tiborvass
Copy link
Contributor

@tonistiigi @AkihiroSuda but why don't we let the cache be busted? If that's what we decide to do, all we need is to update the expected hashes in tarsum_test.go in the go1.9 PR.

@@ -322,6 +334,31 @@ func FileInfoHeader(name string, fi os.FileInfo, link string) (*tar.Header, erro
return hdr, nil
}

// fillGo18FileTypeBits fills type bits which have been removed on Go 1.9 archive/tar
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this override be in a separate file, fenced by a build flag? (// go1.8)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it were to have a tag it would have to have go1.9 tag, but even then it's not needed since in go1.8 these bits are already set, so it would just set it again (it's OR'd)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, meant to type 1.9 😅

@tonistiigi
Copy link
Member

@tiborvass because we care for our users :)

I'm with @AkihiroSuda for avoiding the cache bust. Cache bust will probably be unavoidable with https://github.com/moby/buildkit so let's not make it a recurring theme. There are real issues with the cache hash(some listed in moby/buildkit#38) that we haven't fixed because we don't want to break it. Let's do it only one time.

@tiborvass
Copy link
Contributor

I'm with you on trying our best to make it as seldom as possible. I didn't know about coming cache busts, so +1 for this PR. Let's not forget to remove the fix function once we are ready.

LGTM

@stevvooe
Copy link
Contributor

stevvooe commented Sep 1, 2017

@AkihiroSuda Was this reported upstream or are we just forever adding exceptions? Was this change even correct in Go 1.9?

@tonistiigi
Copy link
Member

@stevvooe This is not an upstream issue, we are just keeping the compatibility of a method in our own library. The point of archive.FileInfoHeader() is to list all docker specific exceptions(name, special device handling etc). Like commented above, only use case for these mode bits is the current build cache implementation, so this is one of the things that can be removed when moving builder to a more modern architecture.

@Larsjep
Copy link

Larsjep commented Sep 3, 2017

Sorry I didn't notice this before, but it is very unfortunate that these bits has been added for compatibility reasons.
The change in go-lang was actually made to solve this problem docker/docker-py#998.
I understand the arguments about not wanting to break the customers cache, but in this case I think it's needed, because the old cache is build on incorrect tar headers.
The problem with keeping these bits is that it makes it difficult for 3rd party clients (like docker-py) to create a build context that is compatible with the context used by "docker build", because standard tar libraries (like python/tar) doesn't allow setting these bits.

@tiborvass
Copy link
Contributor

tiborvass commented Sep 4, 2017

@Larsjep It is not exactly about not wanting to break users' cache: it is about limiting as much as possible the number of times we do it since it is very disruptive. Since we know we are going to break the cache again when we start depending more on buildkit, and that the issue you're referring to was always present, we think it is reasonable to not address the issue right away, and prefer to break the cache just once in the foreseeable future instead of twice. But the issue will be solved.

@Larsjep
Copy link

Larsjep commented Sep 11, 2017

@tiborvass Sounds reasonable. We must just remember to remove this patch at that point.
Do you know when it's planned that buildkit should be integrated ?

@edmorley
Copy link

Do we need to open an issue to track reverting this / is there an issue open already for the mentioned buildkit changes? It's just docker/compose#5090 is blocked on these changes :-)

@thaJeztah
Copy link
Member

ping @AkihiroSuda @tonistiigi ^^

@edmorley
Copy link

Do we need to open an issue to track reverting this / is there an issue open already for the mentioned buildkit changes? It's just docker/compose#5090 is blocked on these changes :-)

@tiborvass, @AkihiroSuda @tonistiigi - I don't suppose you saw the above? :-)

@tonistiigi
Copy link
Member

@edmorley BuildKit already doesn't use tar headers for content hash so is not affected by this(and doesn't have this patch anywhere)

@edmorley
Copy link

edmorley commented Dec 28, 2017

@tonistiigi - ah to clarify the question - the change we want to make will break existing caches so needs to be coordinated with other such cache-breaking changes, such as the buildkit related work (see #33935 (comment)). My question was asking if an issue was already filed for that work, so we can ensure the tar file related changes happen at the same time. (It would be incredibly frustrating to find the tar file changes were forgotten and we had to wait yet again for the next cache-breaking event... :-))

@tonistiigi
Copy link
Member

@edmorley When docker build is integrated with BuildKit the current code that does the cache checksum calculation or context extraction is removed and everything will be handled by BuildKit so there is no other change that needs to be made. There is no issue that I know of to revert this specific patch from archive package, but the builder will not be calling these archive package functions anymore.

@edmorley
Copy link

edmorley commented Feb 5, 2018

When docker build is integrated with BuildKit the current code that does the cache checksum calculation or context extraction is removed and everything will be handled by BuildKit so there is no other change that needs to be made.

Is there an issue/PR we can track where this is occurring? I'm just keen to keep momentum on this, so we can get the 3 years old docker/compose#883 closed out :-)

@AkihiroSuda
Copy link
Member Author

containerd content store needs to be integrated first

cc @dmcgowan Do we have a github issue?

@dmcgowan
Copy link
Member

dmcgowan commented Feb 6, 2018

@AkihiroSuda Not yet, expect something soon though

@tmo-trustpilot
Copy link

Any news on reverting the added bits? The long running issue with docker-compose and docker build sharing build caches (best explained here docker/compose#883 (comment)) is still a problem three years on...

@AkihiroSuda
Copy link
Member Author

I think docker-compose should use BuildKit filesync API instead

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet