Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

containerd integration: docker diff #44964

Merged
merged 3 commits into from Mar 30, 2023
Merged

Conversation

laurazard
Copy link
Member

- What I did

Implement docker diff with containerd store

- How I did it

Upstream:

- How to verify it

(run a container and create a file)

  • docker run -it --name hello alpine
  • / # echo world > world.txt && exit

(verify diff works)

  • docker diff hello

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

image

@laurazard laurazard marked this pull request as draft February 9, 2023 14:12
@laurazard laurazard marked this pull request as ready for review February 9, 2023 14:32
@rumpl rumpl added area/images containerd-integration Issues and PRs related to containerd integration labels Feb 13, 2023
@laurazard
Copy link
Member Author

/cc @neersighted @thaJeztah @vvoland

Rebased this one (suffered from the same issues as #44934 that were solved by the changes to store the manifest in #44958), PTAL when you get a chance!

@thaJeztah thaJeztah added this to the v-next milestone Mar 8, 2023
daemon/containerd/image_changes.go Outdated Show resolved Hide resolved
daemon/containerd/image_changes.go Outdated Show resolved Hide resolved
Comment on lines 68 to 77
if m.Type == "overlay" {
opts := make([]string, 0, len(m.Options))
upper := ""
for _, o := range m.Options {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must admit that I'd have to dig deeper into some of this to fully understand, but this part feels scary. Does this mean we need to be aware of semantics of each possible storage driver / snapshotter? (Could other snapshotters such as btrfs, zfs, others..) also need specific handling?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very naive thoughts;

I guess I'm wondering;

  • As "creating a snapshot" involves "creating a diff"
  • Possibly BuildKit also having implementations to calculate diffs
  • How much code does already exist that could provide us "here's what changed?" (could we piggy-back on top of that?).
  • Basically along the lines of give me a snapshot ("docker commit?"), and tell me what's in there

Probably the complication may be in "changed vs new files" (both a file that was updated or a file that was new will be included in the snapshot, but for some snapshotters metadata-only changes could possibly be stored different?)

I guess this would be much out of scope for this PR. Perhaps @tonistiigi or @crazy-max may have more insight into this area and have some ideas on this (maybe others?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This piece of code might need a comment or something. This is needed so that we can mount the image a second time but in read only. I know that Buildkit has kinda the same code somewhere, @tonistiigi even opened a PR to containerd to add this ability (I think) but was never merged.

We need this to be read only for overlayfs because if we don't the mounts are in an undefined state and you get a lot of "No found" errors, basically nothing works.

As I understand it, this is only an issue for the overlayfs driver, other drivers handle multiple mounts with no issues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the extra details. Yes, I understood we needed a read-only mount here (but wasn't sure if other drivers could have their own "special needs"). We should definitely have a comment in this part of the code describing some of this, and we should try to resurrect the work in Containerd if possible.

The part that somewhat scares me with such code is that there may be other options to be used when mounting, such as

  • userxattr and index=on/off;
    opts = indexOff + userxattr + "lowerdir=" + diffDir + ":" + strings.Join(absLowers, ":")
  • selinux labels;
    mountData := label.FormatMountLabel(opts, mountLabel)
  • for the mount-paths, "special trickery" to make sure we can mount all layers
    // Use relative paths and mountFrom when the mount data has exceeded
    // the page size. The mount syscall fails if the mount data cannot
    // fit within a page and relative links make the mount data much
    // smaller at the expense of requiring a fork exec to chroot.
    if len(mountData) > pageSize-1 {

Missing some of those may be hard to miss ("it works" in most cases ... until it doesn't), which can make them hard to debug.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I came back to this PR to so much discussion 😅 thanks! I wonder what we should do then. I can try to investigate snapshotter.View here, and in the meantime I guess we're also stuck waiting for that PR in containerd adding the readOnly special handling code. Are we otherwise okay with adding the code for now (possibly with a comment + link to the BuildKit code)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we're "stuck" -- that PR appears to have stalled in review because the contributor lost interest. I think anyone interested could pick it up and get it past the review hurdles; now is probably better than later as well as containerd 1.7 is about to be cut in the next couple weeks.

Copy link
Member Author

@laurazard laurazard Mar 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking back at @thaJeztah's comments, I wonder – a lot of the code between docker diff and docker commit should be the same. I wonder if we can refactor some of that and share it, we're already using rootfs.CreateDiff() in the docker commit code, I think this could be easier and a bit more streamlined and we could reuse some code here Oooooh but we don't really want to put it in the content store, hmm

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like snapshotter.View doesn't really do what we want here – we use snapshotter.Mounts to get the active snapshot we've previously created for the container in PrepareSnapshot w/ snapshotter.Prepare. snapshotter.View just gets a new readonly view for the parent, which isn't what we want in docker diff.

I'll try to get that code merged into containerd, and in the meantime leave this as is with a todo to use the containerd code when it gets merged, if that's alright.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thaJeztah
Copy link
Member

thaJeztah commented Mar 8, 2023

@laurazard @rumpl ⚠️ I just realised there's another PR to be upstreamed which removes the readOnly function and moves it to oci.ReadonlyMounts() (also refactoring the ImageService.Changes() function;

Looks like we may want to include (at least some of that) in this PR (we could add a separate commit that implements the oci.ReadonlyMounts())

@rumpl
Copy link
Member

rumpl commented Mar 8, 2023

Oh boy… how did I forget that one

@laurazard
Copy link
Member Author

Took a look at rumpl#92 and added a commit moving readOnly() over to oci.ReadonlyMounts(). I still want to look at/play around a little bit with snapshotter.View though, reading it I can't really see any obvious issues.

github.com/containerd/containerd v1.6.19
github.com/containerd/containerd v1.6.20-0.20230322235238-de33abf0547c
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For others reviewing this; the upstream changes in containerd have been accepted and backported to the 1.6 and 1.7 release branches. I discussed options for this PR with @laurazard ;

  1. Stick with the current 1.6.19 version of containerd, keep the local copy of that code, and add a TODO: remove temporary copy of the containerd code
  2. Temporarily upgrade the containerd dependency to "tip of release/1.6"

Given that this PR is targeting master / v-next, and we don't have a GA release for it yet, we chose for 2., anticipating that containerd will have a new patch-release before we tag v-next (and if there's no patch release, we can ask the containerd maintainers to do one).

@@ -60,6 +61,7 @@ type ImageService interface {
GetContainerLayerSize(ctx context.Context, containerID string) (int64, int64, error)
Mount(ctx context.Context, container *container.Container) error
Unmount(ctx context.Context, container *container.Container) error
Changes(ctx context.Context, container *container.Container) ([]archive.Change, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a follow-up, we must look into this signature.

When working on the CLI I was wondering why pkg/archive was imported in some location where I wouldn't expect it, and I stumbled upon this code; https://github.com/docker/cli/blob/14482589df194a86b2ee07df643ba3277b40df7d/cli/command/container/formatter_diff.go#L56-L67

func (d *diffContext) Type() string {
	var kind string
	switch d.c.Kind {
	case archive.ChangeModify:
		kind = "C"
	case archive.ChangeAdd:
		kind = "A"
	case archive.ChangeDelete:
		kind = "D"
	}
	return kind
}

It turns out that the Client returns its own type for this, but that type does not have the enum values;

func (cli *Client) ContainerDiff(ctx context.Context, containerID string) ([]container.ContainerChangeResponseItem, error) {

type ContainerChangeResponseItem struct {

We should consider to either

  • add the enums to the API types
  • have the backend return the correct type
  • have a utility to convert the type
  • (possibly alias one for the other if needed)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

return nil, err
}

snapshotter := i.client.SnapshotService(i.snapshotter)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is also one of those places where we still need to decide if we want to have the container as source of truth for the snapshotter to use;

Suggested change
snapshotter := i.client.SnapshotService(i.snapshotter)
snapshotter := i.client.SnapshotService(container.Driver)

But we can look at those "as a whole" and change them all at once.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there's been a few fragmented discussions about this, maybe we can bring it up in one of the maintainers calls and get some input/come to a decision about what we'd like to do. For now I'll leave this one like this.

@thaJeztah
Copy link
Member

BuildKit test continues failing a few times now. I wonder if there could be changes in containerd (containerd/containerd@v1.6.19...de33abf) that require the test to be updated (test output mentions something about "the digest may change");

Possible suspect: containerd/containerd@f0dc029

--- FAIL: TestIntegration (1.45s)
    --- SKIP: TestIntegration/TestCmdShell/worker=dockerd-containerd/frontend=client (0.24s)
    --- FAIL: TestIntegration/TestReproSourceDateEpoch/worker=dockerd-containerd/frontend=client (11.91s)
    --- FAIL: TestIntegration/TestReproSourceDateEpoch/worker=dockerd-containerd/frontend=builtin (10.92s)
=== CONT  TestIntegration/TestReproSourceDateEpoch/worker=dockerd-containerd/frontend=client
    dockerfile_test.go:6540: SOURCE_DATE_EPOCH=1673354096
    dockerfile_test.go:6604: OCI archive digest="sha256:e26093cc8a7524089a1d0136457e6c09a34176e2b2efcf99ac471baa729c7dc9"
    dockerfile_test.go:6605: The digest may change depending on the BuildKit version, the snapshotter configuration, etc.
    dockerfile_test.go:6606: 
        	Error Trace:	/src/frontend/dockerfile/dockerfile_test.go:6606
        	            				/src/frontend/dockerfile/run.go:87
        	            				/src/frontend/dockerfile/run.go:196
        	Error:      	Not equal: 
        	            	expected: "sha256:9e36395384d073e711102b13bd0ba4b779ef6afbaf5cadeb77fe77dba8967d1f"
        	            	actual  : "sha256:e26093cc8a7524089a1d0136457e6c09a34176e2b2efcf99ac471baa729c7dc9"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-sha256:9e36395384d073e711102b13bd0ba4b779ef6afbaf5cadeb77fe77dba8967d1f
        	            	+sha256:e26093cc8a7524089a1d0136457e6c09a34176e2b2efcf99ac471baa729c7dc9
        	Test:       	TestIntegration/TestReproSourceDateEpoch/worker=dockerd-containerd/frontend=client



=== CONT  TestIntegration/TestReproSourceDateEpoch/worker=dockerd-containerd/frontend=builtin
    dockerfile_test.go:6540: SOURCE_DATE_EPOCH=1673354096
    dockerfile_test.go:6604: OCI archive digest="sha256:e26093cc8a7524089a1d0136457e6c09a34176e2b2efcf99ac471baa729c7dc9"
    dockerfile_test.go:6605: The digest may change depending on the BuildKit version, the snapshotter configuration, etc.
    dockerfile_test.go:6606: 
        	Error Trace:	/src/frontend/dockerfile/dockerfile_test.go:6606
        	            				/src/frontend/dockerfile/run.go:87
        	            				/src/frontend/dockerfile/run.go:196
        	Error:      	Not equal: 
        	            	expected: "sha256:9e36395384d073e711102b13bd0ba4b779ef6afbaf5cadeb77fe77dba8967d1f"
        	            	actual  : "sha256:e26093cc8a7524089a1d0136457e6c09a34176e2b2efcf99ac471baa729c7dc9"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-sha256:9e36395384d073e711102b13bd0ba4b779ef6afbaf5cadeb77fe77dba8967d1f
        	            	+sha256:e26093cc8a7524089a1d0136457e6c09a34176e2b2efcf99ac471baa729c7dc9
        	Test:       	TestIntegration/TestReproSourceDateEpoch/worker=dockerd-containerd/frontend=builtin

@thaJeztah
Copy link
Member

I see the test was updated twice

@thaJeztah
Copy link
Member

@laurazard
Copy link
Member Author

laurazard commented Mar 29, 2023

Added a commit to skip the failing BuildKit tests: see #44964 (comment) - moby/buildkit#3736 (took some inspiration from https://github.com/moby/moby/pull/45112/files)

@thaJeztah
Copy link
Member

vendor validation got stuck twice in a row (but don't see any output) 🤔
https://github.com/moby/moby/actions/runs/4558753083/jobs/8043710755?pr=44964

I thought we had timeouts on GitHub actions, but I see it's been running for nearly 6 hours now 🙀

Screenshot 2023-03-30 at 09 16 43

@thaJeztah
Copy link
Member

Yup, looks like it's using the default 6-hours timeout

Screenshot 2023-03-30 at 11 20 25

…` and vendor

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
(see moby/buildkit#3736)

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like it's going green now 🎉

LGTM

Copy link
Contributor

@vvoland vvoland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; just have one nit regarding the comment

@@ -119,5 +119,6 @@ jobs:
TEST_DOCKERD: "1"
TEST_DOCKERD_BINARY: "./build/moby/dockerd"
TESTPKGS: "./${{ matrix.pkg }}"
TESTFLAGS: "-v --parallel=1 --timeout=30m --run=//worker=${{ matrix.worker }}$"
# Skip buildkit tests checking the digest (see https://github.com/moby/buildkit/pull/3736)
TESTFLAGS: "-v --parallel=1 --timeout=30m --run=/^Test([^R]|.[^e]|..[^p]|...[^r]|....[^o]|.....[^S])/worker=${{ matrix.worker }}$"
Copy link
Contributor

@vvoland vvoland Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OOF 🙈
Could we also include the test name TestReproSourceDateEpoch in the comment? Just to make it easier to understand what is expected to be skipped without having to scan through the linked PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/images containerd-integration Issues and PRs related to containerd integration status/2-code-review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants