-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support runtime level snapshotter for issue 6657 #6899
Support runtime level snapshotter for issue 6657 #6899
Conversation
Hi @shuaichang. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
a42c17b
to
6aaea06
Compare
6aaea06
to
366d89a
Compare
pkg/cri/server/container_create.go
Outdated
containerd.WithSnapshotter(c.config.ContainerdConfig.Snapshotter), | ||
containerd.WithSnapshotter(c.runtimeSnapshotter(ociRuntime)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does it work with criService.snapshotStore
? In the following-up PRs, changes might be needed for some other APIs that use snapshotStore
(e.g. ImageFsInfo, ContainerStats, etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good points.
- for ContainerStats this should be no change of behavior, the snapshot info should still be accurate for different snapshotters.
- For ImageFsInfo, we probably want to do something like the following. BTW, the device mapper snapshotter seems to be using
/var/lib/containerd/io.containerd.snapshotter.v1.devmapper/
as the mount point, which does not really mean anything, but seems to be reasonable to return snapshotter usages separately.
I can address the ImageFsInfo in a following-up PR.
9430e3f
to
afe2ec3
Compare
@AkihiroSuda @ktock @dmcgowan @Random-Liu kindly ping to check if you have more comments? Would appreciate if I can get the action items list to get to |
/ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! Couple comments
I'm going to take a more thorough look at this, but my main concern is that there could be cases that arise in CRI where a snapshotter is implied (or one is needed in a context where we don't have a runtime handler), and those parts may not play nicely with this. For instance:
|
nod.. and a "KEP adding the runtime handler to the image services part of CRI" is a must have for both (n-snapshotters 1/runtime class), and for in VM guest pulled images work (confidential containers / kata / ...), and we'll have to cover the extra aggregation and filtering work for image presence, info, and stats required in the sync/metrics/gc parts of kubelet due to the per runtime expansion. I'm looking at this as experimental work as a part of the larger advancement of the CRI image service. |
IMO, it is easy to handle the bookkeeping in containerd side. But the question The eviction manager relies on this If the main storage path is used for overlayfs but the most of snapshots uses devicemapper, // https://github.com/kubernetes/cri-api/blob/master/pkg/apis/runtime/v1/api.proto#L1400
// FilesystemUsage provides the filesystem usage information.
message FilesystemUsage {
// Timestamp in nanoseconds at which the information were collected. Must be > 0.
int64 timestamp = 1;
// The unique identifier of the filesystem.
FilesystemIdentifier fs_id = 2;
// UsedBytes represents the bytes used for images on the filesystem.
// This may differ from the total bytes used on the filesystem and may not
// equal CapacityBytes - AvailableBytes.
UInt64Value used_bytes = 3;
// InodesUsed represents the inodes used by the images.
// This may not equal InodesCapacity - InodesAvailable because the underlying
// filesystem may also be used for purposes other than storing images.
UInt64Value inodes_used = 4;
}
message ImageFsInfoResponse {
// Information of image filesystem(s).
repeated FilesystemUsage image_filesystems = 1;
} I just bring one case here. But I think it will be big change between kubelet and CRI-container |
@kevpar these are definitely great questions, as @mikebrow and @fuweid the comprehensive picture seems to be a bigger project and should be driven by a Kep. For now if we can provide an experimental feature, that should unblock us and potentially some other use cases.
A side note, I think for many snapshotter other than overlayfs, kubelet usage accounting is already not reliable. For example for devmapper, it counts I think all these deserve a KEP for a re-think on kubelet or CRI interactions with different snapshotters.
When a pod using snapshotter2 is scheduled on the node with image unpacked with snapshotter1, containerd has the logic to identify the image in |
@mikebrow addressed the following comments:
Done
Done |
@fuweid totally agree that CRI is a better place for bookkeeping. I think kubelet has difficulty to get total usage for most snapshotters other than overlayfs. Seems to me |
|
@dmcgowan I think the feature is only related to containerd, could you point me to the one that's owned by containerd? |
a46e4a8
to
6b9307a
Compare
As @dmcgowan suggested, I've updated the PR to use annotation name within the containerd namespace
@mikebrow @fuweid @kevpar @AkihiroSuda please feel free to comment if there's objection/concerns for the naming, thanks! |
6b9307a
to
74adc65
Compare
Signed-off-by: shuaichang <shuai.chang@databricks.com> Updated annotation name
74adc65
to
7b9f1d4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@shuaichang Will this feature be available in release 1.6.9? We are interested in the ability to specify a different snapshotter per runtime, but would like to avail of this feature without having to build containerd ourselves. |
Related with containerd#6899 The patch fixs the handle of sandbox run and container create. Signed-off-by: Yan Song <imeoer@linux.alibaba.com>
Related with containerd#6899 The patch fixs the handle of sandbox run and container create. Signed-off-by: Yan Song <imeoer@linux.alibaba.com>
What is this change
This is the implementation for #6657, the following lists core ideas:
snapshotter
option in CRI per-runtime configio.kubernetes.cri/runtimehandler
to specify runtime during pull time such that the appropriate snapshotter can be used.Pod sandbox config with annotations specifying the runtime, and image_pull will get the correct snapshotter for unpacking
Testings done
"io.kubernetes.cri/runtimehandler": "my-runtime"
, image is unpacked withdevmapper
overlayfs
my-runtime
, sandbox is created withdevmapper
snapshotteroverlayfs
snapshottermy-runtime
pod, container usesdevmapper
snapshotteroverlayfs
snapshotter"io.kubernetes.cri/runtimehandler": "my-runtime"
still used default snapshotter"io.kubernetes.cri/runtimehandler": "my-runtime"
, using default snapshotter"io.kubernetes.cri/runtimehandler": "my-runtime"
, using default snapshotterTest logs
"io.kubernetes.cri/runtimehandler": "my-runtime"
overlayfs
my-runtime
, sandbox is created withdevmapper
snapshotteroverlayfs
snapshotteroverlayfs
snapshotteroverlayfs
snapshotter"io.kubernetes.cri/runtimehandler": "my-runtime"
still used default"io.kubernetes.cri/runtimehandler": "my-runtime"
, using default snapshotter"io.kubernetes.cri/runtimehandler": "my-runtime"
, using default snapshotter