Propagate parent chain ID and diff ID via labels during snapshot preparation#13071
samuelkarp merged 1 commit into containerd:main
6a78431 to 5849afa
faf0d68 to 4a55c1f
labelSnapshotRef = "containerd.io/snapshot.ref"
unpackSpanPrefix = "pkg.unpack.unpacker"
labelSnapshotParent = "containerd.io/snapshot/parent"
How does the new commit look?
What do you mean by the final chain ID? You have the parent chain ID, the diff ID of the layer, and the chain ID of the layer itself. If you're looking for children, that can change since a single parent layer can easily have multiple children. So recording "a" final chain ID might be true for one unpack and immediately false for the next one. Am I understanding correctly?
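To make the three identifiers in this comment concrete, here is a minimal Go sketch of how a chain ID follows from a parent chain ID and a diff ID under the OCI image spec rule (the base layer's chain ID is its diff ID; every later chain ID is the digest of the parent chain ID, a space, and the layer's diff ID). It assumes SHA-256 digests and uses only the standard library. The helper names `chainID` and `chainIDs` and the placeholder diff IDs are illustrative, not code from this PR; real code would more likely use the `identity` package from opencontainers/image-spec.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// chainID returns the chain ID of a layer from its parent's chain ID and its
// own diff ID. A base layer has no parent (pass ""), and its chain ID is
// simply its diff ID.
func chainID(parentChainID, diffID string) string {
	if parentChainID == "" {
		return diffID
	}
	// OCI rule: digest of "<parent chain ID> <diff ID>" (SHA-256 assumed here).
	sum := sha256.Sum256([]byte(parentChainID + " " + diffID))
	return "sha256:" + hex.EncodeToString(sum[:])
}

// chainIDs walks the diff IDs of an image (base layer first) and returns the
// chain ID of every layer in the same order.
func chainIDs(diffIDs []string) []string {
	out := make([]string, 0, len(diffIDs))
	parent := ""
	for _, d := range diffIDs {
		parent = chainID(parent, d)
		out = append(out, parent)
	}
	return out
}
```

Calling `chainIDs` on two images that share a base layer, for example `{"sha256:base", "sha256:app-a"}` and `{"sha256:base", "sha256:app-b"}` (placeholder values, not real digests), gives the same first chain ID and different second ones. That is the situation described above: one parent snapshot, several children, so no single "final" chain ID belongs to the parent.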
> What do you mean by the final chain ID? You have the parent chain ID, the diff ID of the layer, and the chain ID of the layer itself. If you're looking for children, that can change since a single parent layer can easily have multiple children. So recording "a" final chain ID might be true for one unpack and immediately false for the next one. Am I understanding correctly?
I know a layer can still have children if this image is used as a base image.
Here labelSnapshotRootfsChainID just means that the recorded chain ID belongs to the final layer of some real container image, so that we don't need to generate an fsmerge image for every snapshot. Yes, it can still be a middle layer of another container image, but that doesn't matter, since the other container image will have its own labelSnapshotRootfsChainID layer and its own fsmerge too.
The whole point is to identify these special layers so that we only need to generate fsmerge for them rather than for every snapshot, which wastes CPU time and hurts performance.
Anyway, it's just an idea for better EROFS fsmerge generation performance, so that we don't have to generate fsmerge strictly at container launch (for that, we really need to know the special layers that are the final layers of container images).
If it sounds unclean, I'm fine with dropping it, but I hope there is an alternative way to do this (although I think labels are the only way to pass information down from the unpacker to the snapshotter).
I reverted the changes for labelSnapshotRootfsChainID to unblock the other changes. Let me know if this is okay.
> I reverted the changes for labelSnapshotRootfsChainID to unblock the other changes. Let me know if this is okay.
I'm fine with that, but I haven't had a response from @samuelkarp and other folks. I still wonder what's wrong with it.
> I'm fine with that, but I haven't had a response from @samuelkarp and other folks. I still wonder what's wrong with it.
Time zones and weekends.
> The whole point is to identify these special layers so that we only need to generate fsmerge for them rather than for every snapshot, which wastes CPU time and hurts performance.
Does this mean you're just looking for "is this layer a top layer"? I think that'd be a more reasonable thing to add than recording a final chain ID for an intermediate snapshot (where there can easily end up being multiple final chain IDs because of inheritance).
But I think we should do that in a separate PR from this one.
> > I'm fine with that, but I haven't had a response from @samuelkarp and other folks. I still wonder what's wrong with it.
> Time zones and weekends.
Yeah, sorry about that.
> > The whole point is to identify these special layers so that we only need to generate fsmerge for them rather than for every snapshot, which wastes CPU time and hurts performance.
> Does this mean you're just looking for "is this layer a top layer"? I think that'd be a more reasonable thing to add than recording a final chain ID for an intermediate snapshot (where there can easily end up being multiple final chain IDs because of inheritance).
> But I think we should do that in a separate PR from this one.
OK, I think "is this layer a top layer" as a boolean value is cleaner too. I will try it that way instead, thanks.
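As a rough illustration of the boolean idea agreed on here, the Go sketch below shows how an unpacker might attach a "top layer" flag next to the parent label, and how a snapshotter might read it before deciding to generate fsmerge. Everything in it is an assumption for discussion: `labelTopLayer` is a made-up key, not an existing containerd label; treating the value of `labelSnapshotParent` as the parent chain ID is inferred from the PR title; and `layerLabels` and `needsFsmerge` are hypothetical helpers, not containerd APIs.

```go
package main

import "strconv"

const (
	// labelSnapshotParent is the key shown in the diff; using it to carry the
	// parent chain ID is an assumption based on the PR title.
	labelSnapshotParent = "containerd.io/snapshot/parent"
	// labelTopLayer is a hypothetical key invented for this sketch.
	labelTopLayer = "example.io/snapshot/top-layer"
)

// layerLabels builds the labels an unpacker could pass when preparing the
// snapshot for layer i (0-based) of an image with n layers.
func layerLabels(parentChainID string, i, n int) map[string]string {
	labels := map[string]string{
		// Only the last layer of the image being unpacked is marked as top.
		labelTopLayer: strconv.FormatBool(i == n-1),
	}
	if parentChainID != "" {
		labels[labelSnapshotParent] = parentChainID
	}
	return labels
}

// needsFsmerge is the check a snapshotter could run: a missing or malformed
// flag is treated as "not a top layer".
func needsFsmerge(labels map[string]string) bool {
	top, err := strconv.ParseBool(labels[labelTopLayer])
	return err == nil && top
}
```

Because the flag describes the layer's role in the image currently being unpacked, it stays valid even if another image later builds on the same layer, which is the problem raised earlier with recording a final chain ID on intermediate snapshots.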
Signed-off-by: Hasan Siddiqui <hasiddiqui@google.com>
/retest
Implement #13070.