Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: update buildkit adapters to support hardlink merges. #43661

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

sipsma
Copy link
Contributor

@sipsma sipsma commented May 29, 2022

BuildKit's mergeop implementation relies on inspecting mount options in
order to implement the optimization where merged directories are created
using hardlinks from source files instead of copies. This is needed in
the overlay case to ensure that hardlinks are made from underlying
lowerdirs rather than from already mounted overlays.

The main difficulty is that the graphdriver model creates mounts for
callers and then just expects that they bind mount that somewhere, which
is a significant deviation from the containerd model BuildKit uses where
mount options are returned and callers are expected to mount those
themselves.

This update is a quite ugly but solves the problem by bolting on methods
to the relevant graphdriver implementations that enable retrieving mount
options directly instead of having the mounts be made on the callers
behalf.

Signed-off-by: Erik Sipsma erik@sipsma.dev


Ref: #43415

cc @tonistiigi @crazy-max

Rebased and did some minor cleanups (though overall the change is still quite ugly). This worked last time I tried, but when I was just trying to test again using the buildkit integ tests I was getting failures with the error failed to compute cache key: unlazy requires an applier. Is that a known issue? It seems unrelated to my exact change here, but not sure. Haven't dug in yet.

Either way, let me know if the change here is worth cleaning up more. Like I said in the commit message, it's quite ugly. Open to suggestions on other approaches (though I am limited in the time I can spend on this so updates may take a while).

BuildKit's mergeop implementation relies on inspecting mount options in
order to implement the optimization where merged directories are created
using hardlinks from source files instead of copies. This is needed in
the overlay case to ensure that hardlinks are made from underlying
lowerdirs rather than from already mounted overlays.

The main difficulty is that the graphdriver model creates mounts for
callers and then just expects that they bind mount that somewhere, which
is a significant deviation from the containerd model BuildKit uses where
mount options are returned and callers are expected to mount those
themselves.

This update is a quite ugly but solves the problem by bolting on methods
to the relevant graphdriver implementations that enable retrieving mount
options directly instead of having the mounts be made on the callers
behalf.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only just glancing over the changes (left some ramblings)

Comment on lines +114 to +117
// Get the direct mounts for the filesystem referred to by this id.
// This will cause calls to Get on the same id in the same driver to
// return an error (until all direct mounts have been released with the
// PutDirectMounts method).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep the linters happy (start with GetDirectMounts)

Suggested change
// Get the direct mounts for the filesystem referred to by this id.
// This will cause calls to Get on the same id in the same driver to
// return an error (until all direct mounts have been released with the
// PutDirectMounts method).
// GetDirectMounts returns the direct mounts for the filesystem referred to
// by id.
//
// This will cause calls to Get on the same id in the same driver to
// return an error (until all direct mounts have been released with the
// PutDirectMounts method).

Comment on lines +119 to +120
// Release a mount returned by GetDirectMounts. When all direct mounts
// have been released, the underlying filesystem directories are cleaned up.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Release a mount returned by GetDirectMounts. When all direct mounts
// have been released, the underlying filesystem directories are cleaned up.
// PutDirectMounts releases a mount returned by GetDirectMounts. When all
// direct mounts have been released, the underlying filesystem directories
// are cleaned up.

// have been released, the underlying filesystem directories are cleaned up.
PutDirectMounts(id string) error
}

// Capabilities defines a list of capabilities a driver may implement.
// These capabilities are not required; however, they do determine how a
// graphdriver can be used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if this functionality should be reflected in Capabilities as well? (not sure though! my eye just fell on this, so thinking out loud here)

Comment on lines +101 to +104
// this is the name that buildkit checks to enable
// overlay optimizations, better approach would be
// to update buildkit to check for something like
// "moby-overlay2"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there be a way to extend the Snapshotter interface and/or have an interface for feature detection for this? (string-matching the name feels a bit brittle indeed)

/cc @tonistiigi @crazy-max

Comment on lines +147 to +150
// Get the direct mounts for the filesystem underlying this layer.
// This will cause calls to Mount calls on this layer to return an
// error (until all direct mounts have been released with the
// PutDirectMounts method).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Get the direct mounts for the filesystem underlying this layer.
// This will cause calls to Mount calls on this layer to return an
// error (until all direct mounts have been released with the
// PutDirectMounts method).
// GetDirectMounts returns the direct mounts for the filesystem underlying
// this layer.
// This will cause calls to Mount calls on this layer to return an
// error (until all direct mounts have been released with the
// PutDirectMounts method).

Comment on lines +152 to +153
// Release a mount returned by GetDirectMounts. When all direct mounts
// have been released, the underlying filesystem directories are cleaned up.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Release a mount returned by GetDirectMounts. When all direct mounts
// have been released, the underlying filesystem directories are cleaned up.
// PutDirectMounts releases a mount returned by GetDirectMounts. When all
// direct mounts have been released, the underlying filesystem directories
// are cleaned up.

@@ -13,6 +13,7 @@ import (
"errors"
"io"

ctdmount "github.com/containerd/containerd/mount"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did mount conflict with something in this file? (I was looking at other places where we didn't use an alias 😅)

(basically, I started looking if we already used containerd's mount.Mount type, or if we should be using a local type, but I see it's already used in some places ☺️); also got slightly confused as container had both a types.Mount and mount.Mount (the former having a conversion to the latter).

// does the mount and then return a ContainerFS).
// It's currently only intended to be used with the BuildKit adapter to
// enable optimizations that require knowing the actual underlying mount.
type DirectMountRWLayer interface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if this interface should be defined on the receiver end (in builder-next/adapters/snapshotter) 🤔

@thaJeztah thaJeztah added this to the 22.06.0 milestone Jun 23, 2022
@thaJeztah thaJeztah modified the milestones: 22.06.0, v-next Nov 3, 2022
@rumpl rumpl modified the milestones: 24.0.0, v-future Apr 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants