New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Apply SOURCE_DATE_EPOCH to the files and the whiteouts inside OCI tar layers #3560
base: master
Are you sure you want to change the base?
Conversation
7a4ab1e
to
ac6f97b
Compare
cache/blobs.go
Outdated
| @@ -62,6 +63,8 @@ func (sr *immutableRef) computeBlobChain(ctx context.Context, createIfNeeded boo | |||
| } | |||
|
|
|||
| func computeBlobChain(ctx context.Context, sr *immutableRef, createIfNeeded bool, comp compression.Config, s session.Group, filter map[string]struct{}) error { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand how this is safe. If the blob already exists, then just the blob is used without any timestamp check. If the first build uses one epoch and the second uses another or no epoch at all then the second build gets epoch from the first build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now validated in cache/blobs.go:blobExistsWithSourceDateEpoch()
|
Is this and #3263 still blocked? |
Yes, will try to update next week, sorry |
6155189
to
45acbcd
Compare
|
Rebased |
75276f4
to
7121a39
Compare
|
Rebased again |
| if cdAddress := sb.ContainerdAddress(); cdAddress != "" { | ||
| // https://github.com/containerd/containerd/commit/9c9f564a35df7990870a53a125afbf88ac412753 | ||
| integration.CheckContainerdVersion(t, cdAddress, ">= 1.7.0-beta.1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good for containerd worker but I'm not sure what we should do for dockerd with containerd snapshotter enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe having a factory in the integration sandbox to return the containerd version depending on the worker if that makes sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it can be revisited in a separate PR?
The current code is passing the current CI.
7121a39
to
a274214
Compare
|
Rebased |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Iiuc correctly, this also converts the blobs of base images. I don't think that is correct, at least as the default behavior. I think it may be possible to use the DiffIDs in the image config that the frontend passes to the exporter to figure out which layers are from the base image and which are not.
Not for this PR, but we are also assuming that archive/tar and compress/gzip are deterministic. We at least need to add test coverage that would reveal possible changes in Go updates and give us option to vendor old library if needed.
| }) | ||
| } | ||
|
|
||
| if _, ok := filter[sr.ID()]; ok { | ||
| eg.Go(func() error { | ||
| _, err := g.Do(ctx, fmt.Sprintf("%s-%t", sr.ID(), createIfNeeded), func(ctx context.Context) (interface{}, error) { | ||
| if sr.getBlob() != "" { | ||
| if blobExistsWithSourceDateEpoch(sr, sourceDateEpoch) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that if there is no blob for epoch but there is one for another one, we will run the differ again? Is it a good idea? I think it would be better to do something similar to compressions there we convert a blob to another one.
| @@ -305,6 +320,9 @@ func (sr *immutableRef) setBlob(ctx context.Context, desc ocispecs.Descriptor) ( | |||
| sr.queueMediaType(desc.MediaType) | |||
| sr.queueBlobSize(desc.Size) | |||
| sr.appendURLs(desc.URLs) | |||
| if sourceDateEpoch != nil { | |||
| sr.queueSourceDateEpoch(*sourceDateEpoch) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand how this works. Doesn't it just fix a ref to a specific epoch forever? Instead, it should track multiple variants of blobs that are all associated with a ref (similar to compressions).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should track multiple variants of blobs that are all associated with a ref (similar to compressions).
Does that require putting the epoch value into an OCI descriptor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean for the remote build-cache or for something else? In build-cache we already have "created time" but not exactly the same(although both would be epoch value). We might need a separate field for that.
If you think for the case of avoiding regenerating blobs for base image layers if they were already generated with the same epoch, then yes, I think saving epoch would be needed. Otoh, I don't think any user would want us to regenerate the layers for the base image, just to change the timestamps in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sr.queueSourceDateEpoch() is only called for new diffs, so it seems safe for now
Maybe we can introcue an envvar to experimentally allow this behavour? Possible another way is managing epoch variants as compression variants by making |
300f800
to
415a7ae
Compare
Updated PR to avoid converting the base images, by checking
I don't see a usecase for this, but it can be covered in a follow-up PR if somebody needs it. |
| @@ -65,9 +65,12 @@ See also the [documentation](/frontend/dockerfile/docs/reference.md#buildkit-bui | |||
|
|
|||
| ## Caveats | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: The limitation described in this PR should be documented somewhere?
Limitations:
containerd 1.7 is needed for the containerd worker mode. (containerd/containerd@9c9f564)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mentioned in docs/build-repro.md
… layers Propagate the `build-arg:SOURCE_DATE_EPOCH` opt value to the differ, to limit the upper bound of the file timestamps and set the whiteout timestamps. With this commit, the following workarounds mentioned in `docs/build-repro.md` are no longer needed for reproducible builds: > ```dockerfile > # Limit the timestamp upper bound to SOURCE_DATE_EPOCH. > # Workaround for moby#3180 > ARG SOURCE_DATE_EPOCH > RUN find $( ls / | grep -E -v "^(dev|mnt|proc|sys)$" ) -newermt "@${SOURCE_DATE_EPOCH}" -writable -xdev | xargs touch --date="@${SOURCE_DATE_EPOCH}" --no-dereference > ``` > ```dockerfile > # Squash the entire stage for resetting the whiteout timestamps. > # Workaround for moby#3168 > FROM scratch > COPY --from=0 / / > ``` Limitations: * containerd 1.7 is needed for the containerd worker mode. Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
415a7ae
to
a7bc2f0
Compare
|
@tonistiigi Could you take a look? |
Propagate the
build-arg:SOURCE_DATE_EPOCHopt value to the differ, to limit the upper bound of the file timestamps and set the whiteout timestamps.With this commit, the following workarounds mentioned in
docs/build-repro.mdare no longer needed for reproducible builds:Limitations:
Backup
ac6f97b (withdrew due to dependency on ctx)