Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support syncing to S3 #704

Merged
merged 1 commit into from
Jan 28, 2023
Merged

Conversation

justinsb
Copy link
Contributor

@justinsb justinsb commented Jan 17, 2023

Initial support for kpromo to sync to S3 buckets.

Added initial support for kpromo file synchronization to S3 buckets.

@k8s-ci-robot k8s-ci-robot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 17, 2023
@k8s-ci-robot k8s-ci-robot added area/artifacts Issues or PRs related to the hosting of release artifacts for subprojects area/release-eng Issues or PRs related to the Release Engineering subproject sig/release Categorizes an issue or PR as relevant to SIG Release. labels Jan 17, 2023
@justinsb justinsb changed the title Support syncing to S3 WIP: Support syncing to S3 Jan 17, 2023
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Jan 17, 2023
@justinsb justinsb force-pushed the s3_support branch 2 times, most recently from a49657a to f199ab0 Compare January 17, 2023 19:20
Copy link
Member

@justaugustus justaugustus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justinsb -- AFK for the moment, but can you check if it would instead make sense to implement portions of this within https://github.com/kubernetes-sigs/release-sdk/blob/main/object/store.go?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 18, 2023
@justinsb
Copy link
Contributor Author

@justaugustus I'm not sure that API is a great match here, as it looks like in order to copy files across providers we would need providers to know about each other. (We have an pretty broad vfs library here if you want generic functionality , I think the abstractions there are a bit more evolved.)

But for this PR, the real nuance is in carefully setting the metadata etc to make the common case fast (i.e. the case when nothing has to be copied). That's why we have to use a non-parallel object upload, so that the ETag is the MD5 hash. I think for integrity purposes we also want to upload additional hashes as metadata, but we're really very optimized for our particular use-case here.

I'm happy to evolve the API in release-sdk, but I'd probably steer it towards something with objects paths as first class concepts; WDYT about just sharing the vfs layer with kOps?

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 18, 2023
@justaugustus
Copy link
Member

@justaugustus I'm not sure that API is a great match here, as it looks like in order to copy files across providers we would need providers to know about each other. (We have an pretty broad vfs library here if you want generic functionality , I think the abstractions there are a bit more evolved.)

TIL about https://github.com/kubernetes/kops/tree/master/util/pkg/vfs!

I'm happy to evolve the API in release-sdk, but I'd probably steer it towards something with objects paths as first class concepts; WDYT about just sharing the vfs layer with kOps?

@justinsb -- Happy for you to move forward using kOps vfs here, if it's going to dedupe some of the work.

Are changes to kops/util/pkg generally minimal across Kubernetes major versions? (My concern being tying the release engineering tooling to the k/k versioning complexities.)
A way to minimize this concern could be to donate vfs (and/or some larger subset of the util) package to sigs.k8s.io/release-sdk. Would that be something y'all would be open to?

@justinsb
Copy link
Contributor Author

Definitely open to putting vfs somewhere shared :-) It's already shared between etcdadm and kOps, so it is generally useful. It's pretty stable although it's recently seen a flood of activity because we're adding context.Context as part of the great logging/tracing rewrite.

I was also thinking though that the API we have here is probably a reasonable minimal API, and we have higher level functionality (like folder-to-folder syncing) in kpromo. So we could also just move the filestores.

I think two good options are to put kOps VFS in a standalone repo somewhere (because it's not really only release tooling, either) or to sync kpromo's file abstractions with release-sdk (they are more logically coupled).

I would ask though if we can merge this PR in the interim though, it's all part of the S3 mirroring that reduces the CNCF expenditure on egress bandwidth - we finally have AWS buckets :-)

@justaugustus
Copy link
Member

I would ask though if we can merge this PR in the interim though, it's all part of the S3 mirroring that reduces the CNCF expenditure on egress bandwidth - we finally have AWS buckets :-)

Yep, let's just import what makes sense here and minimize the pain!

@justinsb justinsb changed the title WIP: Support syncing to S3 Support syncing to S3 Jan 19, 2023
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 19, 2023
@justinsb
Copy link
Contributor Author

Thanks @justaugustus ... I was thinking that this and release are actually pretty specific functionality, for example I think we want to upload in a potentially suboptimal way to preserve metadata, we might want to force the metadata always to be created etc, so it belongs in a library that promo-tools and the rest of the release tooling share,it's not a generic VFS layer.

So I propose we merge this here, and then I'll submit some refactorings both to release-sdk and here so that we can have a shared library in release-sdk and use it here (i.e. I'll try upstream the code here and refactor release-sdk to avoid calling gsutil where it isn't applicable). I haven't poked around where all the places release-sdk is used yet though, not entirely sure what I'm signing up for yet :-)

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jan 19, 2023
Copy link
Member

@justaugustus justaugustus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justinsb -- Left some nits. If you want to fix them here, go for it.

If you want to do these as a follow-up, feel free to release the hold.
/hold

api/files/validation.go Outdated Show resolved Hide resolved
@@ -62,6 +61,25 @@ func openFilestore(
ctx context.Context,
filestore *api.Filestore,
useServiceAccount, confirm bool,
) (syncFilestore, error) {
if strings.HasPrefix(filestore.Base, "gs://") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if strings.HasPrefix(filestore.Base, "gs://") {
if strings.HasPrefix(filestore.Base, object.GcsPrefix) {

if strings.HasPrefix(filestore.Base, "gs://") {
return openGCSFilestore(ctx, filestore, useServiceAccount, confirm)
}
if strings.HasPrefix(filestore.Base, "s3://") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about defining the scheme as above.

Comment on lines 72 to 87
return nil, fmt.Errorf(
"unrecognized scheme %q (supported schemes: s3, gs)",
filestore.Base,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's define something like:

var supportedURISchemes := []string{
  "s3",
  "gs",
}

And then:

	return nil, fmt.Errorf(
		"unrecognized scheme %q (supported schemes: %s)",
		filestore.Base,
                strings.Join(supportedURISchemes, ", "),
	)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created a register of providers, so that everything stays in sync here. I think that's the underlying concern, and I agree with it 👍

filestore.Base,
object.GcsPrefix,
)
return nil, fmt.Errorf("unrecognized scheme %q, expected gs://", filestore.Base)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about object.GcsPrefix as above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - put these into the api, as I think they are our values, not the library's values (i.e. if the library changed the value, we would not want to change our values!)

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 19, 2023
@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Jan 19, 2023
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 21, 2023
Initial support for kpromo to sync to S3 buckets.

Co-authored-by: Stephen Augustus (he/him) <justaugustus@users.noreply.github.com>
Copy link
Member

@justaugustus justaugustus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @justinsb!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 28, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: justaugustus, justinsb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@justaugustus
Copy link
Member

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 28, 2023
@k8s-ci-robot k8s-ci-robot merged commit 322d1dc into kubernetes-sigs:main Jan 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/artifacts Issues or PRs related to the hosting of release artifacts for subprojects area/release-eng Issues or PRs related to the Release Engineering subproject cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/release Categorizes an issue or PR as relevant to SIG Release. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants