
Deleting PropagationPolicy and OverridePolicy at the same time leads to the workload recovering to its original state #3873

Open
wu0407 opened this issue Aug 1, 2023 · 19 comments
Labels: kind/bug (Categorizes issue or PR as related to a bug.)

@wu0407 (Contributor) commented Aug 1, 2023

What happened:
I wanted to clean up the resources on the member cluster, so I deleted the OverridePolicy and the PropagationPolicy at the same time.
The OverridePolicy overrides the Deployment's spec.replicas to 0.
After the delete action, many Pods were created by the Deployment controller on the member cluster.

rule.yaml:

apiVersion: policy.karmada.io/v1alpha1
kind: ClusterPropagationPolicy
metadata:
  name: workload-propagation-policy
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
    - apiVersion: apps/v1
      kind: StatefulSet
  propagateDeps: true
  placement:
    clusterAffinity:
      clusterNames:
        - sh5-online
---
apiVersion: policy.karmada.io/v1alpha1
kind: ClusterOverridePolicy
metadata:
  name: workload-override-policy
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
    - apiVersion: apps/v1
      kind: StatefulSet
  overrideRules:
  - overriders:  
      plaintext:
      - path: /spec/replicas
        operator: replace
        value: 0
      - path: /spec/template/spec/tolerations
        operator: add
        value:
        - key: "eks.tke.cloud.tencent.com/eklet"
          operator: "Exists"
          effect: "NoSchedule"
      - path: /spec/template/spec/schedulerName
        operator: replace
        value: "tke-scheduler"

What you expected to happen:
No Pods are created and the Deployment is deleted on the member cluster.
How to reproduce it (as minimally and precisely as possible):

  1. kubectl apply -f rule.yaml
  2. Wait until the resources are synced.
  3. kubectl delete -f rule.yaml
  4. kubectl get pod -w -A
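
For convenience, the same steps as a copy-pasteable sequence (a sketch: it assumes kubectl points at the karmada-apiserver, and that a member-cluster kubeconfig is available as member.config):

# on the Karmada control plane
kubectl apply -f rule.yaml
# wait until the Deployment on the member cluster is synced (spec.replicas overridden to 0)
kubectl --kubeconfig member.config get deployment -A
kubectl delete -f rule.yaml
# watch Pods being created again on the member cluster
kubectl --kubeconfig member.config get pod -w -A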

Anything else we need to know?:

Environment:

  • Karmada version:
  • kubectl-karmada or karmadactl version (the result of kubectl-karmada version or karmadactl version): version.Info{GitVersion:"v1.6.1", GitCommit:"fdc7ac62c70b571d091a795cbe9b9fceac5f1c2c", GitTreeState:"clean", BuildDate:"2023-07-06T03:35:37Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"linux/amd64"}
  • Others:
wu0407 added the kind/bug label on Aug 1, 2023
@wu0407 (Contributor, Author) commented Aug 1, 2023

Similar behavior also occurs when creating an OverridePolicy and a PropagationPolicy at the same time.

@XiShanYongYe-Chang (Member)

Hi @wu0407, thank you for your feedback.

The deletion of PropagationPolicy and the deletion of OverridePolicy may need to be considered separately.

When a user deletes the PropagationPolicy, Karmada disconnects the resource template from the pp, but the ResourceBinding still exists. At that point, when the user also deletes the op, the ResourceBinding is still synchronized down to the member cluster, so the resources in the member cluster revert to being consistent with the template.
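
One way to observe this (a sketch; the kubeconfig and resource names are illustrative): after deleting the pp, the binding still exists on the Karmada control plane and still carries the placement, so it keeps being applied:

# ResourceBindings live in the karmada-apiserver, typically named <resource-name>-<kind>
kubectl --kubeconfig karmada-apiserver.config get resourcebindings.work.karmada.io -A
# inspect the retained spec.placement and the scheduling result in spec.clusters
kubectl --kubeconfig karmada-apiserver.config get resourcebinding nginx-deployment -n default -o yaml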

I would like to understand why you need to delete PropagationPolicy and OverridePolicy, but keep the resource template.

@wu0407 (Contributor, Author) commented Aug 2, 2023

I don't want the Deployment running on that member cluster, but I want it to keep running on the Karmada cluster, because the Karmada apiserver reuses an existing cluster's apiserver.

Related issue: the Deployment is synced back to the member cluster after the cluster rejoins, even though no PropagationPolicy selects that cluster anymore.
The reason is that the ResourceBinding is retained when the PropagationPolicy is deleted, even after the cluster is removed.

@wu0407 (Contributor, Author) commented Aug 2, 2023

The ResourceBinding should be deleted, or the related info needs to be cleaned up. For example, this placement is still retained on the ResourceBinding after the policy is deleted:

spec:
  placement:
    clusterAffinity:
      clusterNames:
      - sh5-online
    clusterTolerations:
    - effect: NoExecute
      key: cluster.karmada.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: cluster.karmada.io/unreachable
      operator: Exists
      tolerationSeconds: 300
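
A hypothetical manual workaround (not an official API; verify the field path against your Karmada version) would be to strip the retained placement from the binding by hand:

# illustrative names; removes the stale placement left over from the deleted policy
kubectl --kubeconfig karmada-apiserver.config patch resourcebinding nginx-deployment \
  -n default --type=json \
  -p='[{"op": "remove", "path": "/spec/placement"}]'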

@chaunceyjiang (Member)

The ResourceBinding should be deleted

The ResourceBinding currently has the same lifecycle as the resource template: if the resource template is not deleted, the ResourceBinding will not be deleted either.

or the related info needs to be cleaned up.

I have created an issue to track this problem. #3886

@XiShanYongYe-Chang (Member)

@wu0407 @chaunceyjiang Thanks! Your discussion has given me a new perspective. In fact, this issue has been discussed before: #3681

One previous conclusion was:

I'm wondering whether we can take a step forward and consider how to make this behavior more in line with user expectations. For example, can we give control of this behavior to users and let them decide whether the rb needs to be synchronized and deleted?

I think we can now refine this conclusion: the lifecycle of the rb and the resource template remains consistent, and users can decide whether to clean up the pp's scheduling information on the rb so that it will no longer be synchronized. What do you think?

@whitewindmills (Member)

I think we can now refine this conclusion: the lifecycle of the rb and the resource template remains consistent, and users can decide whether to clean up the pp's scheduling information on the rb so that it will no longer be synchronized. What do you think?

If users choose to clean up the previous scheduling results, should the resource template still be matched by other new appropriate policies? I think cleanup conflicts with rescheduling. It would be great if you could explain it to me.

@chaunceyjiang (Member)

I think we can now refine this conclusion: the lifecycle of the rb and the resource template remains consistent, and users can decide whether to clean up the pp's scheduling information on the rb so that it will no longer be synchronized. What do you think?

spec:
  placement:
    clusterAffinity:
      clusterNames:
      - sh5-online
    clusterTolerations:
    - effect: NoExecute
      key: cluster.karmada.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: cluster.karmada.io/unreachable
      operator: Exists
      tolerationSeconds: 300

I don't think these are scheduling results. I believe they are scheduling information. Since the PP has already been deleted, the related information should also be cleaned up.

@XiShanYongYe-Chang (Member) commented Aug 3, 2023

If users choose to clean up the previous scheduling results, should the resource template still be matched by other new appropriate policies?

I think that regardless of what users choose, resource templates should still be allowed to match new PropagationPolicies. These two points may not conflict.

I think cleanup conflicts with rescheduling.

When clearing the pp matching information from the resource template, the corresponding information on the rb also needs to be cleaned up.
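
For context, the policy match is recorded as labels on both the resource template and the rb; a sketch of how to inspect them (the label keys can differ across Karmada versions, e.g. clusterpropagationpolicy.karmada.io/name for a ClusterPropagationPolicy):

# show the claim labels on the template and on its binding (illustrative names)
kubectl --kubeconfig karmada-apiserver.config get deployment nginx -n default --show-labels
kubectl --kubeconfig karmada-apiserver.config get resourcebinding nginx-deployment -n default --show-labels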

@XiShanYongYe-Chang (Member)

I don't think these are scheduling results. I believe they are scheduling information. Since the PP has already been deleted,

I agree with you. The statement about the scheduling results is incorrect.

the related information should also be cleaned up.

Do you mean cleaning up directly, without requiring the user to choose?

@chaunceyjiang (Member)

Do you mean cleaning up directly, without requiring the user to choose?

Yes.

@XiShanYongYe-Chang (Member)

I'm inclined to agree, but I'm not sure whether any users rely on this behavior. It might be a good idea to hear what others think at the community meeting.

@XiShanYongYe-Chang (Member)

/cc @RainbowMango

@liangyuanpeng (Contributor) commented Aug 3, 2023

I would recommend introducing a feature gate as a first step; the default behavior remains the same as it is now.

Users can enable the feature gate if they want direct cleanup.

Then, someday in the future, when everything is stable, the default behavior can be changed and the feature gate removed.
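
For illustration, such a gate would be enabled the usual way on karmada-controller-manager; the gate name below is hypothetical, not an existing one:

containers:
- name: karmada-controller-manager
  command:
  - /bin/karmada-controller-manager
  # hypothetical gate name, for illustration only
  - --feature-gates=CleanupPolicyClaimOnDeletion=true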

@liangyuanpeng (Contributor) commented Aug 3, 2023

Then, someday in the future, when everything is stable, the default behavior can be changed and the feature gate removed.

A simple version evolution example:
1.7: alpha
1.8: beta, feature gate enabled by default
1.9: GA

We can simply leave the feature gate enabled by default; it doesn't have to be removed right away.

If a direct cleanup solution is adopted in the end, I suggest notifying users in advance in 1.7 and executing it in 1.8.

@weilaaa (Member) commented Aug 4, 2023

In another issue, #3681, we've already discussed this, e.g.: "I prefer to add a new gate in pp to determine this behavior, and the default is consistent with the current one to be compatible with the existing data. Users can open the new gate in pp to change this behavior if they know what they are doing."

A simple version evolution example:
1.7: alpha
1.8: beta, feature gate enabled by default
1.9: GA

For compatibility reasons, I recommend keeping the current behavior as the default. But if we can do some research, and the results show that users prefer to clean up the rb when deleting the pp, then I suggest we adopt the approach @liangyuanpeng described, just like Kubernetes' strategy for new feature gates.

@weilaaa (Member) commented Aug 4, 2023

By the way, a new feature gate in the pp seems better than a global feature gate switch, because the user can control the action for each pp.

I'll just leave open the question here of what the default action should be.

cc @olderTaoist
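
For illustration, a per-policy gate might look something like this (the field name is hypothetical, not an existing API):

apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: example
spec:
  # hypothetical field: clean up this policy's claim on the rb when the policy is deleted
  cleanupPolicyClaimOnDeletion: true
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment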

@RainbowMango (Member)

I'm looking at and thinking about this use case:

I don't want the Deployment running on that member cluster, but I want it to keep running on the Karmada cluster, because the Karmada apiserver reuses an existing cluster's apiserver.

A further question: why can't the Deployment be removed, even in the case of reusing the API server?

I feel it'd be better to have a chat about this at the community meeting.

@XiShanYongYe-Chang (Member) commented Aug 10, 2023

Hello everybody, @wu0407 @chaunceyjiang @liangyuanpeng @whitewindmills @weilaaa @olderTaoist

Can we discuss this issue at the community meeting next Tuesday (2023-08-15)? I hope that we can reach some consensus and make progress towards resolving this issue during the meeting.
