
OCPBUGS-24691: remove all managed fields used by old manager #2114

Merged
merged 1 commit into openshift:master from jluhrsen:OCPBUGS-22293-4.15 on Dec 16, 2023
Conversation

@jluhrsen (Contributor)

The old manager "cluster-network-operator" was changed to "cluster-network-operator/operconfig" in 4.11, when the operator migrated from client-side apply to server-side apply. However, the old manager still had its own managed fields, and an interaction between those and the change that removed the preStop hooks from ovnkube-master's daemonset containers was causing upgrades to get stuck in the network operator.

This change removes the old manager (and its managed fields) entirely, and does so before the apply of the object instead of after.

JIRA: https://issues.redhat.com/browse/OCPBUGS-22293

deal w/ deprecated field manager before apply
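For context on the mechanics: removing a stale field manager amounts to filtering the object's `metadata.managedFields` and writing the result back before the server-side apply runs. Below is a minimal sketch of that idea in Go, hand-written for illustration; the function name and plumbing are assumptions, not this PR's exact diff.

```go
package cleanup

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/dynamic"
)

// removeDeprecatedManager (hypothetical name) GETs the live object, drops
// every managedFields entry owned by oldManager, and writes the result back.
// Doing this before the server-side apply means the apply never has to merge
// against the stale entries.
func removeDeprecatedManager(ctx context.Context, ri dynamic.ResourceInterface, name, oldManager string) error {
	us, err := ri.Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return fmt.Errorf("get %s: %w", name, err)
	}
	var kept []metav1.ManagedFieldsEntry
	for _, mf := range us.GetManagedFields() {
		if mf.Manager == oldManager {
			continue // drop the deprecated manager's entries entirely
		}
		kept = append(kept, mf)
	}
	if len(kept) == len(us.GetManagedFields()) {
		return nil // old manager not present; nothing to do
	}
	us.SetManagedFields(kept)
	_, err = ri.Update(ctx, us, metav1.UpdateOptions{})
	return err
}
```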

@jluhrsen jluhrsen changed the title remove all managed fields used by old manager 4.15: remove all managed fields used by old manager Nov 17, 2023
@jluhrsen (Contributor, Author)

/retest

(7 similar comments)

@jluhrsen (Contributor, Author)

/jira cherrypick OCPBUGS-22293

@openshift-ci-robot (Contributor)

@jluhrsen: Jira Issue OCPBUGS-22293 has been cloned as Jira Issue OCPBUGS-24036. Will retitle bug to link to clone.

WARNING: Failed to update the target version for the clone. Please update the target version manually. Full error below:

    customfield_12323140 - Field 'customfield_12323140' cannot be set. It is not on the appropriate screen, or unknown.: request failed. Please analyze the request body for more details. Status code: 400

/retitle OCPBUGS-24036: 4.15: remove all managed fields used by old manager

In response to this:

/jira cherrypick OCPBUGS-22293


@openshift-ci openshift-ci bot changed the title 4.15: remove all managed fields used by old manager OCPBUGS-24036: 4.15: remove all managed fields used by old manager Nov 28, 2023
@openshift-ci-robot openshift-ci-robot added jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Nov 28, 2023
@openshift-ci-robot (Contributor)

@jluhrsen: This pull request references Jira Issue OCPBUGS-24036, which is invalid:

  • expected the bug to target the "4.15.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

The old manager "cluster-network-operator" was changed to "cluster-network-operator/operconfig" in 4.11, when the operator migrated from client-side apply to server-side apply. However, the old manager still had its own managed fields, and an interaction between those and the change that removed the preStop hooks from ovnkube-master's daemonset containers was causing upgrades to get stuck in the network operator.

This change removes the old manager (and its managed fields) entirely, and does so before the apply of the object instead of after.

JIRA: https://issues.redhat.com/browse/OCPBUGS-22293

deal w/ deprecated field manager before apply


@jluhrsen jluhrsen changed the title OCPBUGS-24036: 4.15: remove all managed fields used by old manager OCPBUGS-24036: remove all managed fields used by old manager Nov 28, 2023
@jluhrsen (Contributor, Author)

/jira refresh

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Nov 28, 2023
@openshift-ci-robot (Contributor)

@jluhrsen: This pull request references Jira Issue OCPBUGS-24036, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.15.0) matches configured target version for branch (4.15.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @anuragthehatter

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

/jira refresh


@openshift-ci-robot openshift-ci-robot removed the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Nov 28, 2023
@martinkennelly (Contributor)

For other reviewers:

The bug attached to this PR: an apply performed by CNO failed with the error `must specify a handler type`.
This error comes from kube-apiserver (kapi) validation, here: kubernetes/pkg/apis/core/validation/validation.go

Code block:

func validateHandler(handler commonHandler, fldPath *field.Path) field.ErrorList {
	numHandlers := 0
	allErrors := field.ErrorList{}
	if handler.Exec != nil {
		if numHandlers > 0 {
			allErrors = append(allErrors, field.Forbidden(fldPath.Child("exec"), "may not specify more than 1 handler type"))
		} else {
			numHandlers++
			allErrors = append(allErrors, validateExecAction(handler.Exec, fldPath.Child("exec"))...)
		}
	}
	if handler.HTTPGet != nil {
		if numHandlers > 0 {
			allErrors = append(allErrors, field.Forbidden(fldPath.Child("httpGet"), "may not specify more than 1 handler type"))
		} else {
			numHandlers++
			allErrors = append(allErrors, validateHTTPGetAction(handler.HTTPGet, fldPath.Child("httpGet"))...)
		}
	}
	if handler.TCPSocket != nil {
		if numHandlers > 0 {
			allErrors = append(allErrors, field.Forbidden(fldPath.Child("tcpSocket"), "may not specify more than 1 handler type"))
		} else {
			numHandlers++
			allErrors = append(allErrors, validateTCPSocketAction(handler.TCPSocket, fldPath.Child("tcpSocket"))...)
		}
	}
	if handler.GRPC != nil {
		if numHandlers > 0 {
			allErrors = append(allErrors, field.Forbidden(fldPath.Child("grpc"), "may not specify more than 1 handler type"))
		} else {
			numHandlers++
			allErrors = append(allErrors, validateGRPCAction(handler.GRPC, fldPath.Child("grpc"))...)
		}
	}
	if numHandlers == 0 {
		allErrors = append(allErrors, field.Required(fldPath, "must specify a handler type"))
	}
	return allErrors
}

That func above is ONLY called for postStart/preStop hooks, when they're not nil:

func validateLifecycle(lifecycle *core.Lifecycle, fldPath *field.Path) field.ErrorList {
	allErrs := field.ErrorList{}
	if lifecycle.PostStart != nil {
		allErrs = append(allErrs, validateHandler(handlerFromLifecycle(lifecycle.PostStart), fldPath.Child("postStart"))...)
	}
	if lifecycle.PreStop != nil {
		allErrs = append(allErrs, validateHandler(handlerFromLifecycle(lifecycle.PreStop), fldPath.Child("preStop"))...)
	}
	return allErrs
}

PreStop is defined but doesn't have a handler defined (i.e. either exec, httpGet, or tcpSocket). This suggests the handler was removed but not the field above it (preStop).
I checked the must-gather (MG) in the case where the upgrade stalled (must-gather.local.3228488807846621353).
I can see there are two managers and they own the same preStop fields.
There are differences, though, that Jamo pointed out yesterday in the managed fields section for the preStop fields.
The "new" manager's managed fields for preStop look like:

                  f:preStop:
                    f:exec:
                      f:command: {}

and the old field manager's:

                  f:preStop:
                    .: {}
                    f:exec: {}

This looks like an API server bug "merging" the fields when using managed fields and multiple owners. Guessing: when the new manager drops the command field, the API server removes it, but doesn't remove preStop and preStop.exec entirely because exec is owned by the old manager. That leaves an invalid configuration, which fails validation.
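To make the failing state concrete, here is a hand-written illustration (not taken from the must-gather) of the lifecycle shape left behind after the merge; it is exactly the zero-handler case that validateHandler rejects:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	// Illustration only: after the merge, preStop survives (the old manager
	// still owns it), but no concrete handler (exec/httpGet/tcpSocket)
	// remains underneath it.
	lc := corev1.Lifecycle{
		PreStop: &corev1.LifecycleHandler{}, // non-nil, but all handler fields nil
	}
	// validateLifecycle sees PreStop != nil and calls validateHandler, which
	// counts zero handlers and returns "must specify a handler type".
	fmt.Printf("%#v\n", lc)
}
```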

So how do we move forward?
We can investigate fixing the kapi validation bug, but we need to work around it now for customers, because the apply will fail and the upgrade will be blocked; that's why Jamo has posted this patch.

Reviewing now.

@@ -47,11 +48,45 @@ func mergeManager(ctx context.Context, clusterClient client.ClusterClient, us *u
	if err != nil {
		return fmt.Errorf("failed to patch (type %s) for object %s %s: %v", patchType, objGVR.String(), us.GetName(), err)
	}
	// remove the old manager
Contributor:

I don't understand why this line and the reset below are needed.
The previous lines should remove the old manager.
Can you paste here what the managed fields look like at this point, following the patch?

Contributor (Author):

You are right: the changes you made to deprecate the old field manager were enough, and the extra I added was overkill. However, your change's code path wasn't executing, because it was being called after the APPLY, and the APPLY is where the upgrade failed and got stuck. This PR adds a GET before the APPLY and, if it sees an old manager, it then executes your code path. I tested with it, and it works like we want.

@martinkennelly (Contributor)

martinkennelly commented Nov 30, 2023

I can replicate this with simple pods, and also on OCP. I can confirm that the old field manager has an invalid managed-fields configuration, and that is why the apply fails. It's clear we need to remove the old field manager's configuration before attempting to apply.

The root of the problem is that the old field manager owns the preStop/postStart field and the exec field, but crucially doesn't own the command field. The new manager drops ownership of all these fields when we remove the hooks, which leaves the preStop/postStart + exec fields owned by the old manager. Those two fields are therefore not removed. However, this is an invalid configuration: there must also be a handler field (i.e. command / httpGet, etc.). So it leaves preStop/postStart + exec present in the YAML, this is fed into the API server validation phase, and validation fails, returning the error `must specify a handler type`.

To replicate on any OCP version without having to go through many upgrade jumps:

1. Unmanage CNO:

   oc patch clusterversion version --type json -p '[{"op":"add","path":"/spec/overrides","value":[{"kind":"Deployment","group":"apps","name":"network-operator","namespace":"openshift-network-operator","unmanaged":true}]}]'

2. Scale down the CNO replicas to 0:

   oc edit deployment network-operator -n openshift-network-operator

   Confirm no CNO pods are running.

3. We need to hack the managed fields to add in a preStop/postStart hook plus a "bad" managed-field config for the ovnkube-node daemonset.

   Save the artifact to a file:

   oc -n openshift-ovn-kubernetes get daemonset ovnkube-node -o yaml > /tmp/ok-node

   Add in a postStart hook. I added it to the ovn-acl-logging container; it doesn't matter which container you add it to:

       lifecycle:
         postStart:
           exec:
             command:
               - sh

   oc apply -f $file --field-manager=another-field-manager

   This will perform a client-side update to the artifact.
   Save the artifact to a file again and output the managed fields. Confirm the postStart hook is there:

   oc -n openshift-ovn-kubernetes get daemonset ovnkube-node --show-managed-fields -o yaml > /tmp/ok-node

   Do a server-side apply using the existing field manager. We do this so the existing field manager also owns the postStart hook and its subfields:

   oc apply -f /tmp/ok-node --field-manager=cluster-network-operator/operconfig --server-side

   Edit the file and, in the managed fields section for the field manager "another-field-manager", change the `postStart` entry from:

       f:postStart:
         f:exec:
           f:command: {}

   to:

       f:postStart:
         .: {}
         f:exec: {}

   This mimics what we see in the must-gather for the old field manager: it doesn't own the command field.

   Client-side apply this update using the current field manager:

   oc apply -f $file --field-manager=cluster-network-operator/operconfig

   Confirm that manager cluster-network-operator/operconfig owns the postStart, exec, and command fields.
   Confirm that manager another-field-manager owns just the postStart and exec fields (a small Go helper for this check is sketched after these steps).

4. Now scale the CNO back up; its apply, which removes the postStart field, should fail.
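The ownership checks in step 3 can also be done programmatically rather than by eyeballing the YAML. A small sketch, assuming a typed client-go clientset is already configured; the function name is illustrative:

```go
package cleanup

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// printManagers mirrors the manual --show-managed-fields inspection: it
// prints each field manager and the raw FieldsV1 set it owns, so you can
// grep for f:postStart under each manager. Namespace and name match the
// repro steps above.
func printManagers(ctx context.Context, cs kubernetes.Interface) error {
	ds, err := cs.AppsV1().DaemonSets("openshift-ovn-kubernetes").
		Get(ctx, "ovnkube-node", metav1.GetOptions{})
	if err != nil {
		return err
	}
	for _, mf := range ds.GetManagedFields() {
		raw := ""
		if mf.FieldsV1 != nil {
			raw = string(mf.FieldsV1.Raw)
		}
		fmt.Printf("manager=%s operation=%s\n%s\n\n", mf.Manager, mf.Operation, raw)
	}
	return nil
}
```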

	// consider removing in OCP 4.18 when we know field manager 'cluster-network-operator' no longer possibly
	// exists in any object from all upgrade paths
	if isDepFieldManagerCleanupNeeded(subcontroller) {
		// Retrieve the current state of the resource
Contributor:

nit: ideally I'd put in a comment saying why we needed to do this and what OCP version it can be removed in.

Contributor (Author):

Isn't that what's up above in lines 128-130? Although, correct me if I'm wrong, I think this can be removed in 4.16, right? 4.15 will have these changes from GA onward, and any cluster that eventually upgrades to 4.15 would have this resolved, so 4.16 and above wouldn't need it.

But does it hurt? I suppose in the future another field manager name change could occur.

in the case that we have a deprecated field manager that we
need to remove, if the Patch() were to fail that code path
would not execute. To move it before the Patch() we also
have to do one Get() to know if the removal is necessary.

this was uncovered while debugging this upgrade problem:

JIRA: https://issues.redhat.com/browse/OCPBUGS-22293

Signed-off-by: Jamo Luhrsen <jluhrsen@gmail.com>

deal w/ deprecated field manager before apply

Signed-off-by: Jamo Luhrsen <jluhrsen@gmail.com>
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 12, 2023
@jluhrsen (Contributor, Author)

jluhrsen commented Dec 12, 2023

I've updated this PR with one small change: the Get() now happens only if isDepFieldManagerCleanupNeeded(subcontroller) is true. I ran multiple tests with a TEST PR over the weekend, and the network-migration jobs are much more stable. I'm not sure why this works, but it is a little more efficient in the end: otherwise, any time ApplyObject() was called, this Get() would also be called whether it was needed or not.

I will /retest these network-migration jobs a few times as a sanity check. Otherwise, I think this is good now.
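Putting this thread together, the resulting order of operations looks roughly like the sketch below. It is hand-assembled from the diff fragments quoted in this review: isDepFieldManagerCleanupNeeded is the PR's name (stubbed here with an assumed body), and removeDeprecatedManager is the hypothetical helper from the sketch near the top, assumed to live in the same package.

```go
package cleanup

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/client-go/dynamic"
)

// Stub for illustration; the real predicate lives in the CNO code and keys
// off which subcontroller is applying the object.
func isDepFieldManagerCleanupNeeded(subcontroller string) bool {
	return subcontroller == "operconfig"
}

// applyWithCleanup (hypothetical wrapper) shows the ordering the PR
// establishes: gated cleanup first, server-side apply second, so a failing
// apply can no longer prevent the old manager from being removed.
func applyWithCleanup(ctx context.Context, ri dynamic.ResourceInterface,
	us *unstructured.Unstructured, subcontroller string) error {
	if isDepFieldManagerCleanupNeeded(subcontroller) {
		// One extra GET, paid only when a deprecated manager may exist.
		if err := removeDeprecatedManager(ctx, ri, us.GetName(), "cluster-network-operator"); err != nil {
			return err
		}
	}
	// With the stale entries gone, the merge can no longer leave the
	// invalid preStop shape behind.
	_, err := ri.Apply(ctx, us.GetName(), us, metav1.ApplyOptions{
		FieldManager: "cluster-network-operator/operconfig",
		Force:        true,
	})
	return err
}
```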

@jluhrsen (Contributor, Author)

/retest

@jluhrsen (Contributor, Author)

/test e2e-aws-sdn-network-reverse-migration

	} else {
		klog.Infof("Deprecated field manager %s for object %q %s %s", depreciatedFieldManager,
			gvk.String(), obj.GetNamespace(), obj.GetName())
		us, err := clusterClient.Dynamic().Resource(rm.Resource).Namespace(namespace).Get(ctx, name, metav1.GetOptions{})
Contributor:

I was thinking you could do all this once and, when it's successful, never do it again for the lifetime of the operator; this would reduce the API calls. Up to you though. Not a hard requirement for me.
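One way to realize this suggestion, purely illustrative and not code from the PR: memoize successful cleanups so the extra GET happens at most once per object for the lifetime of the operator. removeDeprecatedManager is the hypothetical helper sketched earlier.

```go
package cleanup

import (
	"context"
	"sync"

	"k8s.io/client-go/dynamic"
)

// cleanedUp records objects whose deprecated-manager cleanup has already
// succeeded, keyed by "namespace/name". Hypothetical, per the review
// suggestion above.
var cleanedUp sync.Map

func cleanupOnce(ctx context.Context, ri dynamic.ResourceInterface, namespace, name, oldManager string) error {
	key := namespace + "/" + name
	if _, done := cleanedUp.Load(key); done {
		return nil // already cleaned for the lifetime of this operator
	}
	if err := removeDeprecatedManager(ctx, ri, name, oldManager); err != nil {
		return err // retry on the next reconcile
	}
	cleanedUp.Store(key, struct{}{})
	return nil
}
```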

@jluhrsen (Contributor, Author)

/test e2e-aws-sdn-network-migration-rollback
/test e2e-aws-sdn-network-reverse-migration
/retest required

openshift-ci bot commented Dec 13, 2023

@jluhrsen: The /retest command does not accept any targets.
The following commands are available to trigger required jobs:

  • /test 4.15-upgrade-from-stable-4.14-images
  • /test e2e-aws-ovn-network-migration
  • /test e2e-aws-ovn-windows
  • /test e2e-aws-sdn-multi
  • /test e2e-aws-sdn-network-migration-rollback
  • /test e2e-aws-sdn-network-reverse-migration
  • /test e2e-gcp-ovn
  • /test e2e-gcp-ovn-upgrade
  • /test e2e-gcp-sdn
  • /test e2e-hypershift-ovn
  • /test e2e-metal-ipi-ovn-ipv6
  • /test e2e-vsphere-ovn-windows
  • /test images
  • /test lint
  • /test unit
  • /test verify

The following commands are available to trigger optional jobs:

  • /test 4.15-upgrade-from-stable-4.14-e2e-aws-ovn-upgrade
  • /test 4.15-upgrade-from-stable-4.14-e2e-azure-ovn-upgrade
  • /test 4.15-upgrade-from-stable-4.14-e2e-gcp-ovn-upgrade
  • /test e2e-aws-hypershift-ovn-kubevirt
  • /test e2e-aws-ovn-local-to-shared-gateway-mode-migration
  • /test e2e-aws-ovn-serial
  • /test e2e-aws-ovn-shared-to-local-gateway-mode-migration-periodic
  • /test e2e-aws-ovn-single-node
  • /test e2e-aws-sdn-upgrade
  • /test e2e-azure-ovn
  • /test e2e-azure-ovn-dualstack
  • /test e2e-azure-ovn-manual-oidc
  • /test e2e-metal-ipi-ovn-ipv6-ipsec
  • /test e2e-network-mtu-migration-ovn-ipv4
  • /test e2e-network-mtu-migration-ovn-ipv6
  • /test e2e-network-mtu-migration-sdn-ipv4
  • /test e2e-openstack-ovn
  • /test e2e-openstack-sdn
  • /test e2e-ovn-hybrid-step-registry
  • /test e2e-ovn-ipsec-step-registry
  • /test e2e-ovn-step-registry
  • /test e2e-vsphere-ovn
  • /test e2e-vsphere-ovn-dualstack
  • /test e2e-vsphere-ovn-dualstack-primaryv6
  • /test qe-perfscale-aws-ovn-medium-cluster-density
  • /test qe-perfscale-aws-ovn-medium-node-density-cni
  • /test qe-perfscale-aws-ovn-small-cluster-density
  • /test qe-perfscale-aws-ovn-small-node-density-cni

Use /test all to run the following jobs that were automatically triggered:

  • pull-ci-openshift-cluster-network-operator-master-4.15-upgrade-from-stable-4.14-e2e-aws-ovn-upgrade
  • pull-ci-openshift-cluster-network-operator-master-4.15-upgrade-from-stable-4.14-e2e-azure-ovn-upgrade
  • pull-ci-openshift-cluster-network-operator-master-4.15-upgrade-from-stable-4.14-e2e-gcp-ovn-upgrade
  • pull-ci-openshift-cluster-network-operator-master-4.15-upgrade-from-stable-4.14-images
  • pull-ci-openshift-cluster-network-operator-master-e2e-aws-hypershift-ovn-kubevirt
  • pull-ci-openshift-cluster-network-operator-master-e2e-aws-ovn-local-to-shared-gateway-mode-migration
  • pull-ci-openshift-cluster-network-operator-master-e2e-aws-ovn-network-migration
  • pull-ci-openshift-cluster-network-operator-master-e2e-aws-ovn-serial
  • pull-ci-openshift-cluster-network-operator-master-e2e-aws-ovn-shared-to-local-gateway-mode-migration-periodic
  • pull-ci-openshift-cluster-network-operator-master-e2e-aws-ovn-single-node
  • pull-ci-openshift-cluster-network-operator-master-e2e-aws-ovn-windows
  • pull-ci-openshift-cluster-network-operator-master-e2e-aws-sdn-multi
  • pull-ci-openshift-cluster-network-operator-master-e2e-aws-sdn-network-migration-rollback
  • pull-ci-openshift-cluster-network-operator-master-e2e-aws-sdn-network-reverse-migration
  • pull-ci-openshift-cluster-network-operator-master-e2e-aws-sdn-upgrade
  • pull-ci-openshift-cluster-network-operator-master-e2e-azure-ovn
  • pull-ci-openshift-cluster-network-operator-master-e2e-azure-ovn-dualstack
  • pull-ci-openshift-cluster-network-operator-master-e2e-gcp-ovn
  • pull-ci-openshift-cluster-network-operator-master-e2e-gcp-ovn-upgrade
  • pull-ci-openshift-cluster-network-operator-master-e2e-gcp-sdn
  • pull-ci-openshift-cluster-network-operator-master-e2e-hypershift-ovn
  • pull-ci-openshift-cluster-network-operator-master-e2e-metal-ipi-ovn-ipv6
  • pull-ci-openshift-cluster-network-operator-master-e2e-metal-ipi-ovn-ipv6-ipsec
  • pull-ci-openshift-cluster-network-operator-master-e2e-network-mtu-migration-ovn-ipv4
  • pull-ci-openshift-cluster-network-operator-master-e2e-network-mtu-migration-ovn-ipv6
  • pull-ci-openshift-cluster-network-operator-master-e2e-network-mtu-migration-sdn-ipv4
  • pull-ci-openshift-cluster-network-operator-master-e2e-openstack-ovn
  • pull-ci-openshift-cluster-network-operator-master-e2e-openstack-sdn
  • pull-ci-openshift-cluster-network-operator-master-e2e-ovn-hybrid-step-registry
  • pull-ci-openshift-cluster-network-operator-master-e2e-ovn-ipsec-step-registry
  • pull-ci-openshift-cluster-network-operator-master-e2e-ovn-step-registry
  • pull-ci-openshift-cluster-network-operator-master-e2e-vsphere-ovn
  • pull-ci-openshift-cluster-network-operator-master-e2e-vsphere-ovn-dualstack
  • pull-ci-openshift-cluster-network-operator-master-e2e-vsphere-ovn-dualstack-primaryv6
  • pull-ci-openshift-cluster-network-operator-master-e2e-vsphere-ovn-windows
  • pull-ci-openshift-cluster-network-operator-master-images
  • pull-ci-openshift-cluster-network-operator-master-lint
  • pull-ci-openshift-cluster-network-operator-master-unit
  • pull-ci-openshift-cluster-network-operator-master-verify

In response to this:

/test e2e-aws-sdn-network-migration-rollback
/test e2e-aws-sdn-network-reverse-migration
/retest required


@jluhrsen (Contributor, Author)

/test e2e-aws-sdn-network-migration-rollback
/test e2e-aws-sdn-network-reverse-migration

@jluhrsen (Contributor, Author)

/retest-required

@trozet (Contributor) left a comment:

I have no idea how the Get was causing that much delay...but it seems to have fixed the problem somehow...

/hold cancel
/lgtm

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Dec 15, 2023
openshift-ci bot commented Dec 15, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jluhrsen, martinkennelly, trozet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot (Contributor)

/retest-required

Remaining retests: 0 against base HEAD 0d15e45 and 2 for PR HEAD 711f8ff in total

openshift-ci bot commented Dec 15, 2023

@jluhrsen: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-azure-ovn-dualstack | 711f8ff | link | false | /test e2e-azure-ovn-dualstack |
| ci/prow/e2e-vsphere-ovn-dualstack-primaryv6 | 711f8ff | link | false | /test e2e-vsphere-ovn-dualstack-primaryv6 |
| ci/prow/e2e-metal-ipi-ovn-ipv6-ipsec | 711f8ff | link | false | /test e2e-metal-ipi-ovn-ipv6-ipsec |



@sdodson (Member)

sdodson commented Dec 16, 2023

/retest-required

@openshift-merge-bot openshift-merge-bot bot merged commit 95e2d18 into openshift:master Dec 16, 2023
38 of 41 checks passed
@openshift-ci-robot (Contributor)

@jluhrsen: Jira Issue OCPBUGS-24036: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-24036 has been moved to the MODIFIED state.

In response to this:

The old manager "cluster-network-operator" was changed to "cluster-network-operator/operconfig" in 4.11, when the operator migrated from client-side apply to server-side apply. However, the old manager still had its own managed fields, and an interaction between those and the change that removed the preStop hooks from ovnkube-master's daemonset containers was causing upgrades to get stuck in the network operator.

This change removes the old manager (and its managed fields) entirely, and does so before the apply of the object instead of after.

JIRA: https://issues.redhat.com/browse/OCPBUGS-22293

deal w/ deprecated field manager before apply


@openshift-bot (Contributor)

[ART PR BUILD NOTIFIER]

This PR has been included in build cluster-network-operator-container-v4.16.0-202312160833.p0.g95e2d18.assembly.stream for distgit cluster-network-operator.
All builds following this will include this PR.

@sdodson (Member)

sdodson commented Dec 18, 2023

/retitle OCPBUGS-24691: remove all managed fields used by old manager
Moving to the 4.16.0 bug now that we've branched and master is 4.16.

@openshift-ci openshift-ci bot changed the title OCPBUGS-24036: remove all managed fields used by old manager OCPBUGS-24691: remove all managed fields used by old manager Dec 18, 2023
@openshift-ci-robot (Contributor)

@jluhrsen: Jira Issue OCPBUGS-24691: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-24691 has been moved to the MODIFIED state.

In response to this:

The old manager "cluster-network-operator" was changed to "cluster-network-operator/operconfig" in 4.11, when the operator migrated from client-side apply to server-side apply. However, the old manager still had its own managed fields, and an interaction between those and the change that removed the preStop hooks from ovnkube-master's daemonset containers was causing upgrades to get stuck in the network operator.

This change removes the old manager (and its managed fields) entirely, and does so before the apply of the object instead of after.

JIRA: https://issues.redhat.com/browse/OCPBUGS-22293

deal w/ deprecated field manager before apply


@sdodson (Member)

sdodson commented Dec 18, 2023

/cherry-pick release-4.15

@openshift-cherrypick-robot

@sdodson: new pull request created: #2167

In response to this:

/cherry-pick release-4.15


@jluhrsen (Contributor, Author)

/cherry-pick release-4.15

@sdodson, thanks for doing this. Just FYI, I already have 4.14, 4.13, and 4.12 PRs up and am hoping to get those all green in CI, so maybe things will move faster.

@jluhrsen jluhrsen deleted the OCPBUGS-22293-4.15 branch April 19, 2024 22:11