
Add QueueingStrategy in Capacity api #80

Merged (1 commit, Mar 3, 2022)

Conversation

Member

@denkensk denkensk commented Feb 28, 2022

What type of PR is this?

/kind feature
/kind api-change

What this PR does / why we need it:

Add QueueingStrategy in Capacity api

Which issue(s) this PR fixes:

First part of #8

Special notes for your reviewer:

@k8s-ci-robot k8s-ci-robot added kind/feature, kind/api-change, and cncf-cla: yes labels Feb 28, 2022
@k8s-ci-robot k8s-ci-robot added the size/L label Feb 28, 2022
Contributor

@ahg-g ahg-g left a comment

Before reviewing further, I would like to discuss at what level we should have this option. I think we need it at the capacity level to start with since multiple queues can point to the same capacity.

As a follow-up, and as we get more feedback on how people will be using the system, we could add a queue-level option that inherits the capacity-level one by default but can be overridden within that queue. There is a lot of nuance that we need to discuss in a separate, detailed design doc, though.

// queue. This field is immutable.
// +kubebuilder:default=createTimeFIFO
// +kubebuilder:validation:Enum=createTimeFIFO
SortingStrategy SortingStrategy `json:"sortingStrategy,omitempty"`
Contributor

We also need to think about order across queues referencing the same capacity. I feel that we should start by having this option in Capacity not queue.


var lessFun func(a, b workload.Info) bool

// TODO(denkensk): Add a Simple Framework to support different queuing
Contributor

Please use only references to issues, not GitHub handles.

@@ -25,11 +25,24 @@ type QueueSpec struct {
// capacity is a reference to a Capacity object from which resources
// are allocated to this queue.
Capacity CapacityReference `json:"capacity,omitempty"`

// SortingStrategy indicates the queueing strategy of the workloads in the
Contributor

naming: if we move this to Capacity, I would name it QueueingStrategy and name the field Queueing.


const (
// CreateTimeFIFO means that workloads are sorted strictly by creation time.
CreateTimeFIFO SortingStrategy = "createTimeFIFO"
Contributor

What would we name the approach based on re-queueing time? Perhaps we can name them StrictFIFO and BestEffortFIFO? I think creation time is implicit in the FIFO.
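
For illustration, a minimal Go sketch of how the constants might read under this suggestion; this is a hedged sketch of the naming proposal in this comment, not the final API.

// QueueingStrategy is the type name suggested elsewhere in this review;
// the values below only sketch the proposal.
type QueueingStrategy string

const (
    // StrictFIFO: workloads are ordered strictly by creation time, and an
    // older workload that cannot be admitted blocks the newer ones.
    StrictFIFO QueueingStrategy = "StrictFIFO"

    // BestEffortFIFO: one plausible reading of the "re-queueing time"
    // approach mentioned above; ordering is still by creation time, but a
    // workload that cannot be admitted does not block newer ones.
    BestEffortFIFO QueueingStrategy = "BestEffortFIFO"
)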

Contributor

API enums should start with a capital letter.

Contributor

@alculquicondor alculquicondor left a comment

+1 to start at the Capacity level, given that there is a sorting criterion there too.

Although I admit that the implementation will be harder, as the Capacity might not exist when the Queue is created. But it's solvable.


const (
// CreateTimeFIFO means that workloads are sorted strictly by creation time.
CreateTimeFIFO SortingStrategy = "createTimeFIFO"
Contributor

API enums should start with a capital letter.

@@ -26,7 +28,7 @@ import (
)

func TestFIFOQueue(t *testing.T) {
q := newQueue()
q, _ := newQueue()
Contributor

don't silence the error
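
For example, a hedged sketch of the kind of change being requested here, assuming newQueue now returns an error alongside the queue:

q, err := newQueue()
if err != nil {
    t.Fatalf("creating queue: %v", err)
}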

@denkensk
Member Author

I need to make sure I'm understanding this correctly: by a capacity-level policy, do you mean within or across queues?

If it's within a queue, I think we can set the policy in the capacity and, when a new queue is created, have the queue inherit it from the capacity.

If it's across queues, that addresses fairness rather than starvation; it can't solve the problem described in #46.

@ahg-g
Contributor

ahg-g commented Feb 28, 2022

Think of Capacity as a "global" queue of all the queues pointing to the capacity; so the order should be enforced on all workloads sent to the same capacity even if they were submitted to different queues.

@denkensk
Member Author

denkensk commented Mar 1, 2022

I also think a queueing strategy at the capacity level is needed, but there are still some things to discuss here. Could you share more thoughts on it? @ahg-g @alculquicondor

Personal opinion:
I think the strategy at the capacity level is for sorting between multiple Queues, e.g. which Queue's workloads we schedule first, while the strategy at the queue level is for sorting within a queue, such as which workload should be scheduled first.

If the strategy at the capacity level acts directly on all workloads belonging to the capacity, then the queue no longer serves the purpose of dividing tenants: a tenant could submit many workloads before the others so that no other tenant's workloads get scheduled. That would be no different from a one-to-one correspondence between capacity and queue.

@alculquicondor
Contributor

I don't think we should globally sort all the entries related to a capacity.

But we can sort the entries within a queue, and also sort the heads of the queues when they enter the scheduling cycle, based on the same criteria.
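
A minimal sketch of that two-level idea, using a hypothetical workloadInfo type and a creation-time criterion; the names are illustrative and not the kueue internals.

package sketch

import (
    "sort"
    "time"
)

type workloadInfo struct {
    Name         string
    CreationTime time.Time
}

// less is the shared criterion, applied both within a queue and across
// the heads of the queues.
func less(a, b workloadInfo) bool { return a.CreationTime.Before(b.CreationTime) }

// nextWorkload sorts each queue by the criterion and then picks the
// oldest head across all queues feeding the same capacity.
func nextWorkload(queues map[string][]workloadInfo) (workloadInfo, bool) {
    var best workloadInfo
    found := false
    for _, q := range queues {
        if len(q) == 0 {
            continue
        }
        sort.Slice(q, func(i, j int) bool { return less(q[i], q[j]) })
        if !found || less(q[0], best) {
            best, found = q[0], true
        }
    }
    return best, found
}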

@ahg-g
Contributor

ahg-g commented Mar 1, 2022

I mentioned there is a lot of nuance and we need a more detailed doc listing pros/cons and use cases, but let's continue to discuss here.

At some point we actually wanted to merge queue and capacity into one API and so fairness was always supposed to happen across all workloads submitted to the same capacity; but then we identified two reasons for continuing to have a namespaced queue API:

  1. discoverability: tenants can simply list the queues that exist in their namespace to find which ones they can submit their workloads to, so it is simply a pointer to Capacity.
  2. address the use case where a tenant is running an experiment and wants to define usage limits for that experiment; in this use case an experiment is modeled as a queue, which means the tenant should be able to create/delete queues as they see fit.

Given the above, the namespaced Queue API wasn't meant to offer isolation between tenants since tenants could be given the permission to create/delete queues, and it is not something always limited to the Batch Admin.

Ordering within a queue is not going to offer FIFO access to capacity even if we sort the heads based on the same criteria, and that could also be a problem/limitation, especially if a tenant is given permission to create queues.

For the time being, I am ok if we sort within queues and then sort the heads; but we may change this in a significant way later based on the following:

  1. In cases where tenants are allowed to create queues at will, what knobs do admins have to prevent tenants from using that to gain unfair access to shared capacity?
  2. How can admins force ordering to happen at the capacity level if they wish to do so?

@denkensk
Member Author

denkensk commented Mar 1, 2022

Given the above, the namespaced Queue API wasn't meant to offer isolation between tenants since tenants could be given the permission to create/delete queues, and it is not something always limited to the Batch Admin.

Thanks for the detailed explanation, @ahg-g. If that is the design intent, I understand it now: the queue is not the unit of isolation, the capacity is. A queue is more of a classification for jobs under the same tenant. Right?
If so, I wonder whether we still need to set up an internal queue (heap) for each Queue. Shouldn't we instead only set up an internal queue (heap) for each Capacity?

@ahg-g
Contributor

ahg-g commented Mar 2, 2022

If so, I wonder whether we still need to set up an internal queue (heap) for each Queue. Shouldn't we instead only set up an internal queue (heap) for each Capacity?

Yes; perhaps Capacity is confusing as a name. Maybe we should call it ClusterQueue (by analogy with ClusterResourceQuota, the cluster-scoped ResourceQuota from OpenShift).

@ahg-g
Contributor

ahg-g commented Mar 2, 2022

@alculquicondor is this going to be a significant change? If so, can you suggest a simpler path? Perhaps the two-step ordering Aldo suggested earlier could be one step in that direction.

@denkensk
Member Author

denkensk commented Mar 2, 2022

Yes; perhaps Capacity is confusing as a name. Maybe we should call it ClusterQueue (by analogy with ClusterResourceQuota, the cluster-scoped ResourceQuota from OpenShift).

Maybe ElasticQuota or ClusterElasticQuota? "Elastic" would indicate resource sharing. Also, is there a difference in nomenclature between ResourceQuota and ClusterResourceQuota?

@ahg-g
Contributor

ahg-g commented Mar 2, 2022

Quota in the name doesn't clarify that this is a queue (ordered entities waiting to get started), while it is not unusual to associate "Queue" with usage limits. Also, we want to introduce budgets, not just quotas (assuming a quota is defined as a limit at a specific point in time).

@alculquicondor
Contributor

Also, is there a difference in nomenclature between ResourceQuota and ClusterResourceQuota?

ResourceQuotas are namespaced objects; ClusterResourceQuotas are not (in other words, they are cluster-scoped).

is this going to be a significant change?

Do you mean having a single heap (not saying queue to avoid confusion) per Capacity? I think it is a reasonable simplification. We will still need a heap per Queue once we implement limits at that level, but at the moment they are not providing much.

It will mean moving a significant amount of code around, but not much re-implementation. Better now than later.

@ahg-g
Contributor

ahg-g commented Mar 2, 2022

Thanks, then I suggest we do the following:

  1. Move the API to Capacity
  2. Introduce a single heap per capacity to implement the API
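
A rough sketch of item 2, assuming a creation-time ordering and a hypothetical workload type; the real change would live in the internal queue manager and is not spelled out here.

package sketch

import (
    "container/heap"
    "time"
)

type workload struct {
    Name         string
    CreationTime time.Time
}

// capacityHeap holds every pending workload that points at a single
// Capacity, regardless of which namespaced Queue it was submitted through.
type capacityHeap []workload

func (h capacityHeap) Len() int           { return len(h) }
func (h capacityHeap) Less(i, j int) bool { return h[i].CreationTime.Before(h[j].CreationTime) }
func (h capacityHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }

func (h *capacityHeap) Push(x interface{}) { *h = append(*h, x.(workload)) }
func (h *capacityHeap) Pop() interface{} {
    old := *h
    n := len(old)
    w := old[n-1]
    *h = old[:n-1]
    return w
}

// Typical use: heap.Init(&h), then heap.Push(&h, w) and heap.Pop(&h).(workload).
var _ heap.Interface = (*capacityHeap)(nil)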

@denkensk
Member Author

denkensk commented Mar 2, 2022

Thanks, then I suggest we do the following:

  1. Move the API to Capacity
  2. Introduce a single heap per capacity to implement the API

SG. And how about renaming Capacity? We still have a chance before the official release.

@ahg-g
Contributor

ahg-g commented Mar 2, 2022

I created #85, let's discuss there.

@alculquicondor
Contributor

  1. Move the API to Capacity
  2. Introduce a single heap per capacity to implement the API

They probably need to happen in the opposite order. I would like to do 2, if you don't mind @denkensk

@k8s-ci-robot k8s-ci-robot added size/M and removed size/L labels Mar 3, 2022
@denkensk denkensk changed the title Add SortingStrategy in Queue api Add QueueingStrategy in Capacity api Mar 3, 2022
@denkensk denkensk requested a review from ahg-g March 3, 2022 03:05
@denkensk
Member Author

denkensk commented Mar 3, 2022

They probably need to happen in the opposite order. I would like to do 2, if you don't mind @denkensk

@alculquicondor I don't mind. Could you provide some basic implementation in your first PR?
That way I can work on efficient re-enqueueing in parallel. Thanks. 😄

@denkensk
Member Author

denkensk commented Mar 3, 2022

They probably need to happen in the opposite order

I think the order may not need to be so strict: we can merge this PR first (because it's ready) without using the new field yet, and then introduce a single heap per capacity after this.

Contributor

@alculquicondor alculquicondor left a comment

Ahh, I missed that you changed the PR to only have the API field.

@@ -132,8 +132,21 @@ type CapacitySpec struct {
// The name style is similar to label keys. These are just names to link QCs
// together, and they are meaningless otherwise.
Cohort string `json:"cohort,omitempty"`

// QueueingStrategy indicates the queueing strategy of the workloads
// across the queues in this Capacity. This field is immutable.
Contributor

Document the meaning of the values here too. The constants don't form part of the API documentation.

Member Author

updated.

@@ -83,6 +83,11 @@ type Capacity struct {
// Those keys define the affinity terms of a workload
// that can be matched against the flavors.
LabelKeys map[corev1.ResourceName]sets.String

// TODO introduce a single heap for per Capacity.
Contributor

TODO(#87)

Member Author

Done

default: StrictFIFO
description: 'QueueingStrategy indicates the queueing strategy of
the workloads across the queues in this Capacity. This field is
immutable. Current Supported Strategies: - StrictFIFO: workloads
Contributor

You need an extra empty line in the Go comment :)

Member Author

(⊙o⊙)… Added
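
Putting the two review points together, a hedged sketch of what the documented field might look like; the wording approximates the generated CRD description quoted above, and the json tag name is an assumption rather than a quote of the committed code.

// QueueingStrategy indicates the queueing strategy of the workloads
// across the queues in this Capacity. This field is immutable.
//
// Current Supported Strategies:
//
// - StrictFIFO: workloads are ordered strictly by creation time.
//
// +kubebuilder:default=StrictFIFO
// +kubebuilder:validation:Enum=StrictFIFO
QueueingStrategy QueueingStrategy `json:"queueingStrategy,omitempty"`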

Contributor

@alculquicondor alculquicondor left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm label Mar 3, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, denkensk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved label Mar 3, 2022
@ahg-g
Contributor

ahg-g commented Mar 3, 2022

/retest

@ahg-g
Contributor

ahg-g commented Mar 3, 2022

/lgtm
