Transfer service #7320
Skipping CI for Draft Pull Request. |
4abab3b
to
fca4bd4
Compare
Have a few questions/comments on the new services.
rpc Stream(stream google.protobuf.Any) returns (stream google.protobuf.Any); | ||
} | ||
|
||
message StreamInit { |
What is the use of the StreamInit message? I could find only one usage in the code below. A comment here explaining it would be good.
Newbie question: if we use a message for stream init, shouldn't we also need a stream terminate message? Or is that somehow handled internally?
I believe this is a message sent on init to communicate the stream ID.
api/types/transfer/imagestore.proto
Outdated
|
||
// Content filters | ||
|
||
repeated string platforms = 3; // Does this need a separate type? MatchComparer |
Can't we have the platforms as a separate message type of OS/Arch/Variant? Since that is how we denote platforms everywhere else in containerd, I think that would be more suitable here.
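To illustrate the suggestion, here is a minimal sketch of what a structured platform could look like on the Go side. The `Platform` struct and `parsePlatform` helper are hypothetical names made up for this example, and the parser is deliberately stricter and simpler than containerd's own platform parsing.

```go
package main

import (
	"fmt"
	"strings"
)

// Platform is a hypothetical stand-in for a structured platform message.
type Platform struct {
	OS           string
	Architecture string
	Variant      string // optional, e.g. "v8"
}

// parsePlatform splits "os/arch[/variant]" into its parts. It is a
// simplified illustration and rejects any other shape.
func parsePlatform(s string) (Platform, error) {
	parts := strings.Split(s, "/")
	if len(parts) < 2 || len(parts) > 3 {
		return Platform{}, fmt.Errorf("invalid platform %q", s)
	}
	for _, p := range parts {
		if p == "" {
			return Platform{}, fmt.Errorf("invalid platform %q", s)
		}
	}
	pl := Platform{OS: parts[0], Architecture: parts[1]}
	if len(parts) == 3 {
		pl.Variant = parts[2]
	}
	return pl, nil
}

func main() {
	p, err := parsePlatform("linux/arm64/v8")
	fmt.Println(p.OS, p.Architecture, p.Variant, err)
}
```

A typed message would let the server validate each field separately instead of re-parsing strings at every use.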
api/types/transfer/imagestore.proto
Outdated
|
||
// Unpack Configuration | ||
|
||
repeated string unpack_platforms = 6; |
What's the difference between platforms and unpack_platforms?
I should probably just change unpack_platforms into a separate type. What it really indicates is different unpack configurations, which may be done in parallel. Normally this would be keyed on platform, and the snapshotter would be inferred from the platform. The client may want control over that, to target both the platform and the snapshotter.
api/types/transfer/import.proto
Outdated
@@ -0,0 +1,29 @@ | |||
/* |
The file name needs to be changed, as both import and export messages are defined here.
|
||
option go_package = "github.com/containerd/containerd/api/types/transfer"; | ||
|
||
message Data { |
Are these the messages that will be used by api/streaming service?
I need to document this a little more. The streaming API is only responsible for managing and identifying the stream. The protocol of the stream, including the messages sent across it, is determined by the APIs which use it. Inside the API messages only a stream ID string is communicated; the context in which it is used determines the protocol. At the very least, the protocol should be defined or mentioned next to where the stream ID appears in the proto files.
docs/transfer.md
Outdated
@@ -0,0 +1,32 @@ | |||
# Transfer Service | |||
|
|||
The transfer service is a simple flexible service which can be used to transfer artifact objects between a source and destination. The service determines whether the transfer between the source and destination is possible rather than the API. |
The service determines whether the transfer between the source and destination is possible rather than the API.
Can we have a little more detailed explanation of what is really meant here?
docs/transfer.md
Outdated
| Image Store | Object stream (Archive) | "export" | | ||
| Object stream (Layer) | Mount/Snapshot | "unpack" | | ||
| Mount/Snapshot | Object stream (Layer) | "diff" | | ||
| Image Store | Image Store | "tag" | |
Does retagging an image involve using the transfer service?
You could. Today tagging is two API calls, Get and Create. The point is to highlight the flexibility. We will never have a Tag API call, but you could do the equivalent using the image API or the transfer API. The transfer API could be more useful if the image stores have separate content backends, or if one of the image stores is actually inside a shim or some other sandboxed environment.
docs/transfer.md
Outdated
| Object stream (Layer) | Mount/Snapshot | "unpack" | | ||
| Mount/Snapshot | Object stream (Layer) | "diff" | | ||
| Image Store | Image Store | "tag" | | ||
| Registry | Registry | mirror registry image | |
When mirroring registry images, will it be a 2-step process that is abstracted:
registry1 -> image store
image store -> registry2
or a direct registry -> registry transfer?
The idea is that the transfer interface supports it; how it is implemented is up to the transfer plugin which implements it. The caller would just give two registry arguments, though.
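A rough sketch of that idea, not the code in this PR: the implementation inspects the concrete source and destination types and either handles the pair or reports that it is unsupported. The `Registry` and `ImageStore` types, the `transferKind` function, and the operation names returned are all hypothetical and exist only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Registry and ImageStore are hypothetical stand-ins for transfer
// source and destination types.
type Registry struct{ Ref string }
type ImageStore struct{ Name string }

var errNotSupported = errors.New("transfer not supported")

// transferKind reports which operation a (src, dest) pair maps to, or an
// error if this implementation does not handle the pair.
func transferKind(src, dest interface{}) (string, error) {
	switch src.(type) {
	case Registry:
		switch dest.(type) {
		case ImageStore:
			return "pull", nil
		case Registry:
			return "mirror", nil
		}
	case ImageStore:
		switch dest.(type) {
		case Registry:
			return "push", nil
		case ImageStore:
			return "tag", nil
		}
	}
	return "", fmt.Errorf("%T to %T: %w", src, dest, errNotSupported)
}

func main() {
	kind, err := transferKind(Registry{Ref: "a.example/img"}, Registry{Ref: "b.example/img"})
	fmt.Println(kind, err)

	_, err = transferKind("not a source", ImageStore{})
	fmt.Println(errors.Is(err, errNotSupported))
}
```

Because the decision lives in the implementation, a plugin could add a new pair without any change to the call signature.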
cmd/ctr/commands/images/pull.go
Outdated
@@ -66,6 +73,10 @@ command. As part of this process, we do the following: | |||
Name: "max-concurrent-downloads", | |||
Usage: "Set the max concurrent downloads for each pull", | |||
}, | |||
cli.BoolFlag{ | |||
Name: "local", | |||
Usage: "Print the resulting image's chain ID", |
Is this a typo??
6b95547
to
b167c71
Compare
docs/transfer.md
Outdated
|
||
The transfer service is a simple flexible service which can be used to transfer artifact objects between a source and destination. The flexible API allows each implementation of the transfer interface to determines whether the transfer between the source and destination is possible. This allows new functionality to be added directly by implementations without versioning the API or requiring other implementations to handle an interface change. | ||
|
||
The transfer service if built upon the core ideas put forth by the libchan project, that an API with binary streams and data channels as first class objects is more flexible and opens a wider variety of use cases without requiring constant protocol and API updates. To accomplish this, the transfer service makes use of the streaming service to allow binary and object streams to be accessible by transfer objects even when using grpc and ttrpc. |
The transfer service if built upon the core ideas put forth by the libchan project, that an API with binary streams and data channels as first class objects is more flexible and opens a wider variety of use cases without requiring constant protocol and API updates. To accomplish this, the transfer service makes use of the streaming service to allow binary and object streams to be accessible by transfer objects even when using grpc and ttrpc. | |
The transfer service is built upon the core ideas put forth by the libchan project, that an API with binary streams and data channels as first class objects is more flexible and opens a wider variety of use cases without requiring constant protocol and API updates. To accomplish this, the transfer service makes use of the streaming service to allow binary and object streams to be accessible by transfer objects even when using grpc and ttrpc. |
(very minor typo 👀)
774fbf3
to
ce1262a
Compare
Overall this looks great, only comments are on fit/finish sort of things.
I'd be for bringing this in sooner rather than later so we can iterate on it more easily and start integrating with cri.
transfer.go
Outdated
) | ||
|
||
func (c *Client) Transfer(ctx context.Context, src interface{}, dest interface{}, opts ...transfer.Opt) error { | ||
return proxy.NewTransferer(transferapi.NewTransferClient(c.conn), c.streamCreator()).Transfer(ctx, src, dest, opts...) |
Can we make the transfer service something that can be customized in the client like other services? e.g. containerd.New("", containerd.WithServices(containerd.WithTransferService(...))), and a way to get at just the transfer service (client.TransferService())
I guess we likely need the same for the stream service?
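For reference, the requested pattern could look roughly like the following self-contained sketch. Every type and function here (`Transferrer`, `Client`, `New`, `WithTransferService`, `TransferService`) is a hypothetical local definition that mirrors the names proposed above; none of it is containerd's actual client API.

```go
package main

import "fmt"

// Transferrer is a hypothetical minimal interface for a transfer service.
type Transferrer interface {
	Transfer(src, dest string) error
}

// defaultTransferrer stands in for the built-in proxy implementation.
type defaultTransferrer struct{}

func (defaultTransferrer) Transfer(src, dest string) error { return nil }

// recordingTransferrer is a custom implementation that records its calls.
type recordingTransferrer struct{ calls []string }

func (r *recordingTransferrer) Transfer(src, dest string) error {
	r.calls = append(r.calls, src+" -> "+dest)
	return nil
}

// Client and ClientOpt are hypothetical; they only model the option pattern.
type Client struct{ transfer Transferrer }
type ClientOpt func(*Client)

// WithTransferService overrides the default transfer service.
func WithTransferService(t Transferrer) ClientOpt {
	return func(c *Client) { c.transfer = t }
}

// New builds a client, applying overrides on top of the default service.
func New(opts ...ClientOpt) *Client {
	c := &Client{transfer: defaultTransferrer{}}
	for _, o := range opts {
		o(c)
	}
	return c
}

// TransferService exposes just the transfer service.
func (c *Client) TransferService() Transferrer { return c.transfer }

func main() {
	rec := &recordingTransferrer{}
	c := New(WithTransferService(rec))
	_ = c.TransferService().Transfer("registry", "image store")
	fmt.Println(rec.calls)
}
```

The same shape would presumably apply to the stream service.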
a666138
to
56b0b28
Compare
@dmcgowan In our use case, we need to customize image management for certain pods while using containerd's default implementation for all other pods. To this end, we have currently written our own snapshotter leveraging the "Remote Snapshotter" contract/interface. We still need #6899, so that we can specify a different snapshotter per runtime, to complete our implementation. In the future, do you think our requirements could be better implemented as a Transfer Service? Can we specify a separate service per pod/runtime?
Confidential computing has the same needs. In the future, can we use Transfer Service to support image pulling inside sandbox? And is it possible to apply to #5742? Or is this just the beginning of the next step? |
Needs rebase due to merge conflict |
e7e0ff1
to
96b1b5e
Compare
15dfe4d
to
b0bf030
Compare
images images.Store | ||
|
||
// semaphore.NewWeighted(int64(rCtx.MaxConcurrentDownloads)) | ||
limiter *semaphore.Weighted |
How about making this overridable on a per-Transfer basis (with transfer.Opt)?
This would allow the API client to choose the desired limit scope, for example:
- Single operation: in one pull operation a user may not want more than 3 layers being downloaded concurrently. To achieve this, the API client would pass a fresh semaphore.Weighted for each transfer. This would be the equivalent of how containerd's current WithMaxConcurrentDownloads and WithMaxConcurrentUploadedLayers behave.
- Global: a user may want the concurrent download/upload limits to be shared among all operations (within the Transfer service) and possibly with other components in their code. The API consumer would pass a semaphore.Weighted shared with other components. For instance, this is what Moby engine would do if it used this service in its containerd integration to support the max-concurrent-downloads and max-concurrent-uploads daemon configuration options.
This also allows more specific limits to be applied. For example, the API user might decide the limits dynamically, or whether to apply them at all, depending on other conditions such as whether the repository is remote or local.
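The two scopes could be sketched as follows. This is only an illustration under stated assumptions: `WithLimiter`, `transferConfig`, and `transfer` are hypothetical names, and a buffered channel stands in for semaphore.Weighted so the example needs nothing beyond the standard library.

```go
package main

import (
	"fmt"
	"sync"
)

// limiter is a stand-in for semaphore.Weighted, built on a buffered
// channel so the sketch needs only the standard library.
type limiter chan struct{}

func newLimiter(n int) limiter { return make(limiter, n) }
func (l limiter) acquire()     { l <- struct{}{} }
func (l limiter) release()     { <-l }

// transferConfig and WithLimiter are hypothetical; they mimic how a
// transfer.Opt could carry a caller-supplied limiter.
type transferConfig struct{ lim limiter }
type Opt func(*transferConfig)

func WithLimiter(l limiter) Opt { return func(c *transferConfig) { c.lim = l } }

// transfer pretends to download the given number of layers and returns
// the peak number that were in flight at the same time.
func transfer(layers int, opts ...Opt) int {
	cfg := transferConfig{lim: newLimiter(3)} // default: fresh per-transfer limit
	for _, o := range opts {
		o(&cfg)
	}
	var mu sync.Mutex
	var wg sync.WaitGroup
	inFlight, peak := 0, 0
	for i := 0; i < layers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			cfg.lim.acquire()
			defer cfg.lim.release()
			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()
			// Real download work would happen here.
			mu.Lock()
			inFlight--
			mu.Unlock()
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	shared := newLimiter(2) // global scope: reuse the same limiter everywhere
	fmt.Println(transfer(10) <= 3)
	fmt.Println(transfer(10, WithLimiter(shared)) <= 2)
}
```

Passing a fresh limiter per call gives the single-operation behaviour, while reusing one value across calls gives the global behaviour.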
Is this experimental? |
pkg/transfer/local/import.go
Outdated
tops.Progress(transfer.Progress{ | ||
Event: "saved", | ||
Name: img.Name, | ||
//Digest: img.Target.Digest.String(), |
?
pkg/transfer/local/progress.go
Outdated
Event: j.transferState, | ||
Name: job.name, | ||
Parents: job.parents, | ||
//Digest: job.desc.Digest.String(), |
?
pkg/transfer/transfer.go
Outdated
// Higher level implementation just takes strings and options | ||
// Lower level implementation takes pusher/fetcher? | ||
|
||
*/ |
Probably removable?
Yeah, can remove these now and refer to the docs
@@ -78,8 +78,12 @@ const ( | |||
EventPlugin Type = "io.containerd.event.v1" | |||
// LeasePlugin implements lease manager | |||
LeasePlugin Type = "io.containerd.lease.v1" | |||
// Streaming implements a stream manager | |||
StreamingPlugin Type = "io.containerd.streaming.v1" |
Will this cause confusion with stream processors?
I can't come up with a better name though 😅
I think if you are deep enough into containerd's architecture to be using stream processors, it won't be too hard to understand the difference and find where it is documented. Confusion is hard to avoid, but we can always address it in documentation if it becomes a frequently asked question. Also interesting here: you could potentially use the streaming API to do client-side stream processors, so maybe there is some relationship in the future.
services/streaming/service.go
Outdated
|
||
func (ss *serviceStream) Recv() (a typeurl.Any, err error) { | ||
a, err = ss.s.Recv() | ||
if err != io.EOF { |
errors.Is can be used here?
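A minimal sketch of why this matters, separate from the code under review: a direct comparison misses io.EOF once it has been wrapped, while errors.Is walks the wrap chain. The helpers `isStreamEnd` and `wrapEOF` are hypothetical names used only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// isStreamEnd is a hypothetical helper: it reports whether err signals a
// normal end of stream, even when io.EOF has been wrapped.
func isStreamEnd(err error) bool {
	return errors.Is(err, io.EOF)
}

// wrapEOF returns io.EOF wrapped with extra context, as a lower layer
// might do.
func wrapEOF() error {
	return fmt.Errorf("recv failed: %w", io.EOF)
}

func main() {
	wrapped := wrapEOF()

	fmt.Println(wrapped == io.EOF)    // false: direct comparison misses the wrapped EOF
	fmt.Println(isStreamEnd(wrapped)) // true: errors.Is unwraps and matches
	fmt.Println(isStreamEnd(io.EOF))  // true: the bare sentinel still matches
}
```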
Basically LGTM; I left some comments.
8750b92
to
c54f3ad
Compare
I'm good with getting this in and making improvements as needed in follow-ups; looks like it needs one last rebase, though. |
c54f3ad
to
d743ca4
Compare
Signed-off-by: Derek McGowan <derek@mcg.dev>
Disable using transfer service by default for now Signed-off-by: Derek McGowan <derek@mcg.dev>
29a3ab0
to
82ce861
Compare
LGTM
agree with Phil this is good to merge and iterate on..
Nice!
82ce861
to
f881625
Compare
/ok-to-test |
LGTM
Adds transfer service and associated streaming service
See #7592
The initial version is ready for review. Requested changes which do not change the API will be handled in follow-ups tracked in the transfer service issue.