Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add resource version checks #3993

Merged
merged 13 commits into from
Jul 7, 2023
Merged

Conversation

johnbelamaric
Copy link
Contributor

@johnbelamaric johnbelamaric commented Jun 30, 2023

Adds ResourceVersion handling, including using an annotation to manage it for rpkg pull and rpkg push.

Fixes #3991

@johnbelamaric johnbelamaric requested a review from a team as a code owner June 30, 2023 16:54
@johnbelamaric
Copy link
Contributor Author

Also adds some instrumentation in ensureSameOrigin. Right now, in Nephio, our build has that function disabled. However, I am curious if the problems we saw with that function were due to packages getting corrupted by the concurrency issues, so I am leaving it enabled in this PR.

@johnbelamaric
Copy link
Contributor Author

Ok, I still get the error Error: Internal error occurred: cannot create revision of free5gc-upf with a different origin than other package revisions in the same package even with this fix. So we will still need to disable that for the Nephio build.

Should I just comment it out in this PR?

@mortent
Copy link
Contributor

mortent commented Jul 4, 2023

So this adds the resourceVersion to the Kptfile. Is that to make sure it handles the resourceVersion comparison even if it goes through a pull and then later a pull?

I think there are three separate issues related to this and we should probably make sure we address them all (although not necessarily in this PR):

  • Make sure we correctly check the resourceVersion for normal fields on the PackageRevision resource.
  • Make sure we handle this correctly in terms of concurrency so we don't have races between concurrent updates to the same resource. I think this can be challenging due to the way we split data about each resource across both git/oci and a CRD. But we should at least do what we can to minimize the chance of issues here and make sure we understand the cases where it might not be handled correctly.
  • Handle resourceVersion for updates to PackageRevisionResources (i.e. what this PR is about).

@johnbelamaric
Copy link
Contributor Author

So this adds the resourceVersion to the Kptfile. Is that to make sure it handles the resourceVersion comparison even if it goes through a pull and then later a pull?

So, when using K8s clients, resourceVersion management works as expected simply by setting the metadata.ResourceVersion in the PackageRevision and PackageRevisionResources. However, if you do a pull, that metadata does not exist anywhere in the package. We do not store the source git commit ID from a pull, anywhere that I could see. So, in a pull I now stick it on an annotation in the Kptfile. But this is ephemeral; it is not stored in git. It is just added/removed by the API so that non-K8s clients (well, really just kpt) can work without modification.

Another option would be to have the kpt pull and push commands do this themselves. That may be preferable.

@johnbelamaric
Copy link
Contributor Author

  • Make sure we correctly check the resourceVersion for normal fields on the PackageRevision resource.

This is done in this PR as well.

  • Make sure we handle this correctly in terms of concurrency so we don't have races between concurrent updates to the same resource. I think this can be challenging due to the way we split data about each resource across both git/oci and a CRD. But we should at least do what we can to minimize the chance of issues here and make sure we understand the cases where it might not be handled correctly.

The resourceVersion check done in this PR should handle this in the happy path. The failure paths may be more challenging (if we make a change to the CR, then cannot change Git, or vice-versa, they may get out of sync).

There is nothing we can do about someone out-of-band modifying Git. We could put in a webhook to prevent someone from out-of-band modifying the packagerev.

@johnbelamaric
Copy link
Contributor Author

Another option would be to have the kpt pull and push commands do this themselves. That may be preferable.

I wasn't sure how the UI did things. So, I figured the server-side workflow would work for the UI as well. But maybe it's better to assume that the clients are properly behaving K8s clients. In that case, it should be implemented client side (in push and pull). @ChristopherFry do you know, does the UI do the equivalent of the kpt pull and push commands, OR does it use a normal K8s client to pull out the package contents and push new package contents? Specifically, does it handle metadata on the PackageRevision and PackageRevisionResource objects normally?

@mortent
Copy link
Contributor

mortent commented Jul 5, 2023

Yeah, I am also thinking that implementing the logic for including the resourceVersion as an annotation in the Kptfile would be better implemented client-side. That means that for Porch, the resourceVersion will always be available in the resourceVersion field from either the PackageRevision or PackageRevisionResources resources. Do you think there are any concerns about handling it that way?

@johnbelamaric
Copy link
Contributor Author

Yeah, I am also thinking that implementing the logic for including the resourceVersion as an annotation in the Kptfile would be better implemented client-side. That means that for Porch, the resourceVersion will always be available in the resourceVersion field from either the PackageRevision or PackageRevisionResources resources. Do you think there are any concerns about handling it that way?

No, it's probably better.

@johnbelamaric
Copy link
Contributor Author

/retest

@johnbelamaric
Copy link
Contributor Author

Ok, I think this is ready for review. I updated it to do the resource version annotation in pull and use it in push, removing it before pushing.

I haven't tried the UI yet.

@johnbelamaric
Copy link
Contributor Author

Ok, yeah, this break the UI. After a clone I get: "Internal error occurred: resourceVersion must be specified for an update". So, it's not using just the K8s API or if so it's not stashing away all the metadata. We'll need to drill into that.

Copy link
Contributor

@mortent mortent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few small questions, but looks good.

@@ -374,6 +383,11 @@ func ensureSameOrigin(ctx context.Context, repo repository.Repository, obj *api.
// If there are no tasks, or the first task is not init or clone, then this revision was not
// created from another package revision. That means we expect it to be the first revision
// for this package.
if len(tasks) == 0 {
klog.Warningf("not same origin due to no tasks")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional that we add this logging in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can remove it. This PR does not fix ensureSameOrigin; I think I will disable in a separate PR.

@@ -127,7 +131,12 @@ func (p *PackageRevision) KubeObjectName() string {
}

func (p *PackageRevision) GetResources(ctx context.Context) (*api.PackageRevisionResources, error) {
return p.repoPackageRevision.GetResources(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change just a leftover from previous drafts of the PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh, yes. I guess it's equivalent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can clean it up though, if you want

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not critical, just wanted to make sure there wasn't something here I was missing.

@@ -626,6 +644,15 @@ func (cad *cadEngine) UpdatePackageRevision(ctx context.Context, repositoryObj *
ctx, span := tracer.Start(ctx, "cadEngine::UpdatePackageRevision", trace.WithAttributes())
defer span.End()

newRV := newObj.GetResourceVersion()
if len(newRV) == 0 {
return nil, fmt.Errorf("resourceVersion must be specified for an update")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the normal behavior for resources?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, standard k8s behavior

@johnbelamaric johnbelamaric merged commit 70c5e27 into kptdev:main Jul 7, 2023
@efiacor
Copy link

efiacor commented Jul 20, 2023

Ok, yeah, this break the UI. After a clone I get: "Internal error occurred: resourceVersion must be specified for an update". So, it's not using just the K8s API or if so it's not stashing away all the metadata. We'll need to drill into that.

Hi @johnbelamaric , do we have an issue to address this? Seeing it now on deploying packages via UI. Also, blocks new revision creation.

@johnbelamaric
Copy link
Contributor Author

The UI is fixed, we just need to use the new build in the install.

@johnbelamaric
Copy link
Contributor Author

Sorry I forgot to bump that. Will send a PR today

@efiacor
Copy link

efiacor commented Jul 20, 2023

Sorry I forgot to bump that. Will send a PR today

No problem. Thank you sir. I should learn to use cli more in fairness.

@johnbelamaric
Copy link
Contributor Author

johnbelamaric added a commit to mortent/kpt that referenced this pull request Sep 18, 2023
* Add resource version checks

* Retrieve the updated object before a second update

* gofmt

* Increase timeout for cleanup wait

* Ignore resource-version in test output

* Oops

* Move annotation handling to the client

* Clean up go.mod

* Do not swallow errors

* Fix return value checking

* Update e2e tests

* Fix empty string issue with reorderYamlStdout

* Revert some unnecessary changes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Porch: concurrency issue
3 participants