Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stage_executor: re-use all the layers from cache for squashed case and commit as late as possible. #3674

Merged
merged 1 commit into from Dec 16, 2021

Conversation

flouthoc
Copy link
Collaborator

@flouthoc flouthoc commented Dec 14, 2021

Current implementation of marking for re-use of cache commits on every
stage when used with --squashed however we should try to re-use as
many layers are possible if --layers is specified and commit only on
last instruction of last stage to perform final squash.

Closes: #3665

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 14, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: flouthoc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@flouthoc flouthoc changed the title stage_executor: re-use all the layers from cache for squashed case and commit as late as possible. stage_executor: re-use all the layers from cache for squashed case and commit as late as possible. Dec 14, 2021
@flouthoc
Copy link
Collaborator Author

flouthoc commented Dec 14, 2021

@Romain-Geissler-1A Could you try following PR with buildah build --layers --squash, this should re-use all the layers from cache except the last instruction of last stage where final commit/squash is performed.

As a result final sha would change for last step but all the other layers would be re-used from cache.

@flouthoc
Copy link
Collaborator Author

@nalind @rhatdan PTAL

@rhatdan
Copy link
Member

rhatdan commented Dec 14, 2021

LGTM

@nalind
Copy link
Member

nalind commented Dec 14, 2021

Do we have a test that ensures we don't reuse an image when we shouldn't?

@flouthoc
Copy link
Collaborator Author

@nalind The cache logic is same as a regular build. Following commit just ensures that for --squash builds we commit later than sooner.

Probably a check in this existing test to verify that sha of a regular build is not equivalent to sha of a build with --squash would that work ?

@TomSweeneyRedHat
Copy link
Member

LGTM
once the testing questions are answered.

@Romain-Geissler-1A
Copy link

Romain-Geissler-1A commented Dec 14, 2021

@flouthoc There is something I don't get. Should I test like you by doing a first build WITHOUT --squash to save the intermediate layers ?

Because, if I don't, it doesn't cache anything. I am testing inside docker, started with --privileged and a variant of this Dockerfile: https://github.com/containers/buildah/blob/main/contrib/buildahimage/upstream/Dockerfile

[root@b39d41335c2b src]# buildah version
Version:         1.24.0-dev
Go Version:      go1.16.11
Image Spec:      1.0.2-dev
Runtime Spec:    1.0.2-dev
CNI Spec:        1.0.0
libcni Version:  v1.0.1
image Version:   5.17.0
Git Commit:      700314d6
Built:           Tue Dec 14 20:14:23 2021
OS/Arch:         linux/amd64
BuildPlatform:   linux/amd64

[root@b39d41335c2b src]# cat Dockerfile
FROM registry.access.redhat.com/ubi8/ubi:latest

RUN sleep 20

[root@b39d41335c2b src]# time buildah build --layers --squash .
STEP 1/2: FROM registry.access.redhat.com/ubi8/ubi:latest
STEP 2/2: RUN sleep 20
COMMIT
--> 4bcecbb7680
4bcecbb7680e2475efbe89724567f6859c0580873f580ddc62a0cb523f013dba

real    0m34.544s
user    0m5.576s
sys     0m6.792s
[root@b39d41335c2b src]# time buildah build --layers --squash .
STEP 1/2: FROM registry.access.redhat.com/ubi8/ubi:latest
STEP 2/2: RUN sleep 20
COMMIT
--> c4229a3fa96
c4229a3fa96b93e97634ca3a323478505a17f504c4c796f115820a70e8206091

real    0m29.289s
user    0m5.333s
sys     0m6.301s

In both cases, the "sleep 20" command is ran, while I would expect it could be skipped the second time.

Note that my real use case is obviously more complex, and uses multi stage builds.

@rhatdan
Copy link
Member

rhatdan commented Dec 14, 2021

I would say the tests should be back to back squash, although I would also expect non squash followed by squash or vice versa to work.

@flouthoc flouthoc force-pushed the cache_squashed branch 3 times, most recently from 5a0a5c0 to 41c01a1 Compare December 15, 2021 09:40
@flouthoc
Copy link
Collaborator Author

@Romain-Geissler-1A Could you please try again. I made some changes and tested with your use-case, it should work now.

Note that my real use case is obviously more complex, and uses multi stage builds.

Yes I am aware and i have tested it against various complex squash builds.

@rhatdan @nalind I'll add more tests once @Romain-Geissler-1A confirms if PR is working for him now.

…ilds

Re-use all the layers from cache for squashed case and commit as late as possible

Current implementation of marking for re-use of cache `commits` on every
`stage` when used with `--squashed` however we should try to re-use as
many layers are possible if `--layers` is specified and `commit` only on
`last instruction` of `last stage` to perform final squash.

Also treat all other layers as if they are being written for regular
build and write them cache.

Signed-off-by: Aditya Rajan <arajan@redhat.com>
@Romain-Geissler-1A
Copy link

@flouthoc Thanks, it works on my internal Amadeus case (Dockerfile is unmodified) with this version:

[root@909f6d08cf06 src]# buildah version
Version:         1.24.0-dev
Go Version:      go1.16.11
Image Spec:      1.0.2-dev
Runtime Spec:    1.0.2-dev
CNI Spec:        1.0.0 
libcni Version:  v1.0.1
image Version:   5.17.1-dev
Git Commit:      a74add5b
Built:           Wed Dec 15 13:36:25 2021
OS/Arch:         linux/amd64
BuildPlatform:   linux/amd64

@rhatdan rhatdan merged commit 2d5a299 into containers:main Dec 16, 2021
flouthoc added a commit to flouthoc/podman that referenced this pull request May 23, 2022
Buildah already supports using `--layers` with `--squash` after containers/buildah#3674
if user wants to do so hence podman must honor similar configuration
in `--squash-all` behaviour if user wants to using cache.

PS: We cannot alter behaviour of `podman build --squash` for
docker-compat reasons hence this feature can be easily supported by
`--squash-all`.

Closes: containers/buildah#4011

Signed-off-by: Aditya R <arajan@redhat.com>
flouthoc added a commit to flouthoc/podman that referenced this pull request May 23, 2022
Buildah already supports using `--layers` with `--squash` after containers/buildah#3674
if user wants to do so hence podman must honor similar configuration
in `--squash-all` behaviour if user wants to using cache.

PS: We cannot alter behaviour of `podman build --squash` for
docker-compat reasons hence this feature can be easily supported by
`--squash-all`.

Closes: containers/buildah#4011

Signed-off-by: Aditya R <arajan@redhat.com>
flouthoc added a commit to flouthoc/podman that referenced this pull request May 23, 2022
Buildah already supports using `--layers` with `--squash` after containers/buildah#3674
if user wants to do so hence podman must honor similar configuration
in `--squash-all` behaviour if user wants to using cache.

PS: We cannot alter behaviour of `podman build --squash` for
docker-compat reasons hence this feature can be easily supported by
`--squash-all`.

Closes: containers/buildah#4011

Signed-off-by: Aditya R <arajan@redhat.com>
flouthoc added a commit to flouthoc/podman that referenced this pull request May 24, 2022
Buildah already supports using `--layers` with `--squash` after containers/buildah#3674
if user wants to do so hence podman must honor similar configuration
in `--squash-all` behaviour if user wants to using cache.

PS: We cannot alter behaviour of `podman build --squash` for
docker-compat reasons hence this feature can be easily supported by
`--squash-all`.

Closes: containers/buildah#4011

Signed-off-by: Aditya R <arajan@redhat.com>
flouthoc added a commit to flouthoc/podman that referenced this pull request May 24, 2022
Buildah already supports using `--layers` with `--squash` after containers/buildah#3674
if user wants to do so hence podman must honor similar configuration
in `--squash-all` behaviour if user wants to using cache.

PS: We cannot alter behaviour of `podman build --squash` for
docker-compat reasons hence this feature can be easily supported by
`--squash-all`.

Closes: containers/buildah#4011

Signed-off-by: Aditya R <arajan@redhat.com>
flouthoc added a commit to flouthoc/podman that referenced this pull request May 26, 2022
Buildah already supports using `--layers` with `--squash` after containers/buildah#3674
if user wants to do so hence podman must honor similar configuration
in `--squash-all` behaviour if user wants to using cache.

PS: We cannot alter behaviour of `podman build --squash` for
docker-compat reasons hence this feature can be easily supported by
`--squash-all`.

Closes: containers/buildah#4011

Signed-off-by: Aditya R <arajan@redhat.com>
flouthoc added a commit to flouthoc/podman that referenced this pull request May 26, 2022
Buildah already supports using `--layers` with `--squash` after containers/buildah#3674
if user wants to do so hence podman must honor similar configuration
in `--squash-all` behaviour if user wants to using cache.

PS: We cannot alter behaviour of `podman build --squash` for
docker-compat reasons hence this feature can be easily supported by
`--squash-all`.

Closes: containers/buildah#4011

Signed-off-by: Aditya R <arajan@redhat.com>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Building with "--squash" doesn't have any caching mechanism
5 participants