Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add policy preemption proposal #3684

Merged
merged 1 commit into from
Jul 21, 2023

Conversation

Poor12
Copy link
Member

@Poor12 Poor12 commented Jun 20, 2023

What type of PR is this?
/kind design

What this PR does / why we need it:
This proposal propose a strategy which allows policies to support preemption by priority at runtime.
Even if workloads have been propagated by a policy, they can be preempted by a high priority of policy.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:
None

@karmada-bot karmada-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/design Categorizes issue or PR as related to design. labels Jun 20, 2023
@karmada-bot karmada-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 20, 2023
@Poor12 Poor12 marked this pull request as ready for review June 20, 2023 08:56
@karmada-bot karmada-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 20, 2023
@RainbowMango
Copy link
Member

Retrying the CI jobs due to the issue tracked by #3674

@codecov-commenter
Copy link

codecov-commenter commented Jun 21, 2023

Codecov Report

Merging #3684 (97500c1) into master (5e14c5a) will decrease coverage by 0.94%.
The diff coverage is n/a.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

@@            Coverage Diff             @@
##           master    #3684      +/-   ##
==========================================
- Coverage   56.63%   55.70%   -0.94%     
==========================================
  Files         221      223       +2     
  Lines       20834    21236     +402     
==========================================
+ Hits        11799    11829      +30     
- Misses       8412     8775     +363     
- Partials      623      632       +9     
Flag Coverage Δ
unittests 55.70% <ø> (-0.94%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

see 8 files with indirect coverage changes

@Poor12 Poor12 force-pushed the add-preempt-failover branch 2 times, most recently from 8e7be87 to 97500c1 Compare July 3, 2023 07:09
@Poor12 Poor12 force-pushed the add-preempt-failover branch 2 times, most recently from 91f8cbb to bb954a4 Compare July 5, 2023 09:15
@Poor12 Poor12 force-pushed the add-preempt-failover branch 2 times, most recently from 3aa4057 to 3dfd2df Compare July 12, 2023 03:45
@Poor12
Copy link
Member Author

Poor12 commented Jul 12, 2023

/cc @RainbowMango

Copy link
Member

@RainbowMango RainbowMango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/assign

Copy link
Member

@whitewindmills whitewindmills left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to add feature gate PolicyPreemption to protect this feature.

@Poor12
Copy link
Member Author

Poor12 commented Jul 18, 2023

I think we need to add feature gate PolicyPreemption to protect this feature.

I think it does not need. We already provide this field to enable the feature and it's Never by default.

@RainbowMango
Copy link
Member

Hi @chaunceyjiang @whitewindmills Do you have any further comments or questions?

@whitewindmills
Copy link
Member

I think it does not need. We already provide this field to enable the feature and it's Never by default.

Other than that I have no problem. The new feature may have unexpected impact on the original environment if the administrator enabled this new feature but did not know enough about it. And feature gate as global switch can decide whether the new feature is enabled. For exmaple as PropagateDeps feature which field is also false by default to disable it, but its feature gate is also provided.
/cc @RainbowMango

@RainbowMango
Copy link
Member

@Poor12 asked me for suggestions about the feature gate, I said the introduced preemption field is a kind of feature gate now, so we might not need to introduce a dedicated feature gate.

But as I mentioned on #3805 (comment), we might need a more details design for it, if we are not fully confident, it would be better to provide an additional feature gate, let's take this as an experimental feature.

@whitewindmills
Copy link
Member

thanks, let's continue the previous discussion in this PR.
LGTM

@Poor12
Copy link
Member Author

Poor12 commented Jul 19, 2023

I update the proposal and add more details.

@RainbowMango
Copy link
Member

To be honest, I'm concerned about this feature, mainly because preemption is a dangerous operation that easily lead to unexpected consequences. I can feel the concerns from the comments at #3684 (comment), #3684 (comment), and we do facing the scalability challenges during the implementation, such as #3805 (comment) that might need to list all resource templates during response to preemption. I'm afraid that will have a significant impact on system performance.

In addition, we also need to handle some corner cases like #3684 (comment) which seems is missing from the proposal.

So, after revisiting the user case, when users want to use a PP to preempt a certain resource, the resource is often specific and can be specifically declared in the PP(by resource selector), e.g.

apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: foo
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: nginx             # with a concrete name

Or with multiple resources:

apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: nginx-propagation
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: nginx            # with a concrete name
    - apiVersion: v1
      kind: ConfigMap
      name: nginx            # with a concrete name

So, I'm thinking if we can narrow down the scope of preemption, that is by introducing a restriction that:
the resource name in ResourceSelecto must be specified. If it is acceptable then it will be a relief for users to adopt the feature and for the burden of implementation.

@RainbowMango
Copy link
Member

@chaunceyjiang @GitHubxsy @whitewindmills What do you think of #3684 (comment)?

@whitewindmills
Copy link
Member

If possible, we can have a brief meeting to discuss, I have some questions about this proposal.

@RainbowMango
Copy link
Member

Sounds great, I'll contact you to schedule the meeting. And will post the meeting here.

@RainbowMango
Copy link
Member

Talked to @whitewindmills.
We are going to call the meeting at 10:00 this morning.
Meeting link: https://zoom.com/my/karmada

@RainbowMango
Copy link
Member

Meeting minutes:

Members Present:

  • @chaunceyjiang @Garrybest @GitHubxsy @RainbowMango @wangyuan249 @whitewindmills

  • Reached an agreement: narrow down the scope of preemption as per Add policy preemption proposal #3684 (comment). That is following restrictions should be added to resource selectors:

    • For PropagationPolicy , the ApiVersion/Kind/Name must be provided when enabling preemption.
    • For PropagationPolcy, the ApiVersion/Kind/Name/Namespace must be provided when enabling preemption.
  • Introducing a feature gate for this feature(Considering the risk of this feature)

  • Recognized a corner case that How to transfer control to another PP/CPP during de-prioritize a PP/CPP?, no ideal solution available yet, talked alternatives:

      1. Hack the PP update handler, such as HandlePropagationPolicyCreationOrUpdate.
      1. Adding an annotation by karmada-webhook to record the priority change, and then reconcile in detector.
      1. maintain a mapping table between PP(with .spec.preemption:always) and resource template in memory.

Signed-off-by: Poor12 <shentiecheng@huawei.com>
Copy link
Member

@RainbowMango RainbowMango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

We can iterate the proposal during or after the implementation.

@karmada-bot karmada-bot added the lgtm Indicates that a PR is ready to be merged. label Jul 21, 2023
@karmada-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RainbowMango

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@karmada-bot karmada-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 21, 2023
@karmada-bot karmada-bot merged commit 5a911d5 into karmada-io:master Jul 21, 2023
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/design Categorizes issue or PR as related to design. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants