Support syncing to S3 #704
@justinsb -- AFK for the moment, but can you check if it would instead make sense to implement portions of this within https://github.com/kubernetes-sigs/release-sdk/blob/main/object/store.go?
@justaugustus I'm not sure that API is a great match here: in order to copy files across providers, it looks like the providers would need to know about each other. (We have a pretty broad vfs library here if you want generic functionality; I think the abstractions there are a bit more evolved.) But for this PR, the real nuance is in carefully setting the metadata etc. to make the common case fast (i.e. the case when nothing has to be copied). That's why we have to use a non-parallel object upload, so that the ETag is the MD5 hash. I think for integrity purposes we also want to upload additional hashes as metadata, but we're really very optimized for our particular use-case here. I'm happy to evolve the API in release-sdk, but I'd probably steer it towards something with object paths as first-class concepts; WDYT about just sharing the vfs layer with kOps?
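To illustrate the fast path described above, here is a minimal sketch in Go of how a sync could decide that nothing has to be copied. It is not the code from this PR: `md5Hex` and `needsCopy` are hypothetical names, and the sketch assumes the destination ETag is the quoted hex MD5 of the object, which holds for single-part uploads but generally not for multipart uploads or some server-side encryption modes.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"strings"
)

// md5Hex returns the lowercase hex MD5 digest of data.
func md5Hex(data []byte) string {
	sum := md5.Sum(data)
	return hex.EncodeToString(sum[:])
}

// needsCopy reports whether the source content must be uploaded.
// destETag is the ETag reported by the destination store, or "" when
// the object does not exist there yet. (Hypothetical helper.)
func needsCopy(src []byte, destETag string) bool {
	// ETags are usually returned wrapped in double quotes.
	etag := strings.Trim(destETag, `"`)
	if etag == "" {
		return true // nothing at the destination yet
	}
	// A "-" suffix marks a multipart ETag, which is not a plain MD5,
	// so it cannot be compared; fall back to copying.
	if strings.Contains(etag, "-") {
		return true
	}
	return !strings.EqualFold(etag, md5Hex(src))
}
```

With this shape, an object whose ETag already matches the local MD5 is skipped without transferring any data, which is why a single-part (non-parallel) upload matters for the common case.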
TIL about https://github.com/kubernetes/kops/tree/master/util/pkg/vfs!
@justinsb -- Happy for you to move forward using kOps vfs here, if it's going to dedupe some of the work. Are changes to |
Definitely open to putting vfs somewhere shared :-) It's already shared between etcdadm and kOps, so it is generally useful. It's pretty stable, although it's recently seen a flood of activity because we're adding context.Context as part of the great logging/tracing rewrite. I was also thinking, though, that the API we have here is probably a reasonable minimal API, and we have higher-level functionality (like folder-to-folder syncing) in kpromo. So we could also just move the filestores. I think two good options are to put kOps VFS in a standalone repo somewhere (because it's not really only release tooling, either) or to sync kpromo's file abstractions with release-sdk (they are more logically coupled). I would ask, though, whether we can merge this PR in the interim: it's all part of the S3 mirroring that reduces the CNCF expenditure on egress bandwidth - we finally have AWS buckets :-)
Yep, let's just import what makes sense here and minimize the pain!
Thanks @justaugustus ... I was thinking that this and release are actually pretty specific functionality. For example, I think we want to upload in a potentially suboptimal way to preserve metadata, we might want to force the metadata always to be created, etc., so it belongs in a library that promo-tools and the rest of the release tooling share; it's not a generic VFS layer. So I propose we merge this here, and then I'll submit some refactorings both to release-sdk and here so that we can have a shared library in release-sdk and use it here (i.e. I'll try to upstream the code here and refactor release-sdk to avoid calling gsutil where it isn't applicable). I haven't yet poked around all the places where release-sdk is used, though, so I'm not entirely sure what I'm signing up for :-)
@justinsb -- Left some nits. If you want to fix them here, go for it.
If you want to do these as a follow-up, feel free to release the hold.
/hold
promoter/file/filestore.go
Outdated
@@ -62,6 +61,25 @@ func openFilestore(
	ctx context.Context,
	filestore *api.Filestore,
	useServiceAccount, confirm bool,
) (syncFilestore, error) {
	if strings.HasPrefix(filestore.Base, "gs://") {
- if strings.HasPrefix(filestore.Base, "gs://") {
+ if strings.HasPrefix(filestore.Base, object.GcsPrefix) {
promoter/file/filestore.go
Outdated
	if strings.HasPrefix(filestore.Base, "gs://") {
		return openGCSFilestore(ctx, filestore, useServiceAccount, confirm)
	}
	if strings.HasPrefix(filestore.Base, "s3://") {
Same comment about defining the scheme as above.
promoter/file/filestore.go
Outdated
	return nil, fmt.Errorf(
		"unrecognized scheme %q (supported schemes: s3, gs)",
		filestore.Base,
	)
Let's define something like:
var supportedURISchemes = []string{
"s3",
"gs",
}
And then:
return nil, fmt.Errorf(
"unrecognized scheme %q (supported schemes: %s)",
filestore.Base,
strings.Join(supportedURISchemes, ", "),
)
Created a register of providers, so that everything stays in sync here. I think that's the underlying concern, and I agree with it 👍
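A register of providers along these lines could look like the following minimal sketch. It is an illustration only, not the code merged in this PR: `opener`, `providers`, `supportedSchemes`, and `openStore` are hypothetical names, and the openers return a placeholder string instead of a real filestore. The point is that the dispatch and the error message are both derived from the same map, so they cannot drift apart.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// opener is a hypothetical constructor for a store rooted at base.
// A real implementation would return a filestore, not a string.
type opener func(base string) (string, error)

// providers maps each URI prefix to its opener. Adding a provider
// here updates both the dispatch and the error message below.
var providers = map[string]opener{
	"gs://": func(base string) (string, error) {
		return "gcs:" + strings.TrimPrefix(base, "gs://"), nil
	},
	"s3://": func(base string) (string, error) {
		return "s3:" + strings.TrimPrefix(base, "s3://"), nil
	},
}

// supportedSchemes lists the registered schemes in sorted order,
// so the error message is deterministic.
func supportedSchemes() []string {
	schemes := make([]string, 0, len(providers))
	for prefix := range providers {
		schemes = append(schemes, strings.TrimSuffix(prefix, "://"))
	}
	sort.Strings(schemes)
	return schemes
}

// openStore dispatches on the URI prefix of base.
func openStore(base string) (string, error) {
	for prefix, open := range providers {
		if strings.HasPrefix(base, prefix) {
			return open(base)
		}
	}
	return "", fmt.Errorf(
		"unrecognized scheme %q (supported schemes: %s)",
		base,
		strings.Join(supportedSchemes(), ", "),
	)
}
```

Calling `openStore` with an unregistered prefix returns an error that names every registered scheme, which addresses the concern about the hard-coded list in the original message.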
promoter/file/filestore.go
Outdated
		filestore.Base,
		object.GcsPrefix,
	)
	return nil, fmt.Errorf("unrecognized scheme %q, expected gs://", filestore.Base)
Same comment about object.GcsPrefix as above.
Done - put these into the api, as I think they are our values, not the library's values (i.e. if the library changed the value, we would not want to change our values!)
Initial support for kpromo to sync to S3 buckets. Co-authored-by: Stephen Augustus (he/him) <justaugustus@users.noreply.github.com>
Thanks @justinsb!
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: justaugustus, justinsb The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold cancel