Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graduate API to beta #23

Closed
4 tasks
Tracked by #360
alculquicondor opened this issue Feb 18, 2022 · 32 comments · Fixed by #604
Closed
4 tasks
Tracked by #360

Graduate API to beta #23

alculquicondor opened this issue Feb 18, 2022 · 32 comments · Fixed by #604
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence.

Comments

@alculquicondor
Copy link
Contributor

alculquicondor commented Feb 18, 2022

Currently, this would be very cumbersome due to the lack of support from kubebuilder kubernetes-sigs/controller-tools#656

Once the support is added and we are ready to publish a v1beta1, we should consider renaming the api group. Note that this requires an official api-review kubernetes/enhancements#1111

Summary doc: https://docs.google.com/document/d/1Uu4hfGxux4Wh_laqZMLxXdEVdty06Sb2DwB035hj700/edit?usp=sharing&resourcekey=0-b7mU7mGPCkEfhjyYDsXOBg (join https://groups.google.com/a/kubernetes.io/g/wg-batch to access)

Potential changes when graduating:

  • Move admission from Workload spec into status (from Enforce timeout for podsReady #498)
  • Rename min, max into something easier to understand.
  • Support queue name as a label, in addition to annotation (makes it easier to filter workloads by queue).
  • Add ObjectMeta into each PodSet template.
@alculquicondor
Copy link
Contributor Author

/kind feature
/priority backlog

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence. labels Feb 18, 2022
@ahg-g
Copy link
Contributor

ahg-g commented Apr 9, 2022

Using sigs.k8s.io as domain isn't necessary and doesn't add value. If this ever moves to core k8s, it would likely be under batch.k8s.io anyways.

For the time being, and to avoid the awkward x-, it seems reasonable to use the shorter kueue.sh domain.

@alculquicondor
Copy link
Contributor Author

I think the value is the direct association to the Kubernetes project. I prefer having it in the API name. But what do others think?

@denkensk
Copy link
Member

Anyway, it will be moved to core k8s in the future, so why not choose to use batch.k8s.io directly. This will reduce the work of users to change api groups in the future.

@alculquicondor
Copy link
Contributor Author

alculquicondor commented Apr 11, 2022

Assuming that you don't mean batch, which would require the API to be part of k/k.
Any k8s.io domain name requires an API review. This will take a week or two. Should we do it or just release the alpha APIs as-is? I think I prefer to release them sooner for more chances to get real-world feedback. And then we can go for an API review when we aim for beta.

@ahg-g
Copy link
Contributor

ahg-g commented Apr 11, 2022

I think the value is the direct association to the Kubernetes project. I prefer having it in the API name.

The x- makes it look weird.

Any k8s.io domain name requires an API review. This will take a week or two.

If that is a one time thing, then we can pursue it; but I am not sure if we want it if it requires a review for every change we make.

@alculquicondor
Copy link
Contributor Author

It probably does. But maybe we can just rename to batch.k8s.io and mark it as unapproved for now https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/2337-k8s.io-group-protection#proposal

@denkensk
Copy link
Member

+1 We can mark it unapproved at the alpha version.

@ahg-g
Copy link
Contributor

ahg-g commented Apr 11, 2022

This is discouraged, and it only adds to the confusion. Users don't look at CRD definitions.

ok, I don't want to delay things, I am fine with keeping the x-k8s.io domain just because it is used by other subprojects (it doesn't make it less weird though) and revisit when we go to Beta;

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 10, 2022
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 9, 2022
@alculquicondor
Copy link
Contributor Author

/remove-lifecycle stale

maybe for 0.3.0 :)

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Copy link
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@alculquicondor
Copy link
Contributor Author

/reopen

@k8s-ci-robot
Copy link
Contributor

@alculquicondor: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Sep 8, 2022
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot
Copy link
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 8, 2022
@alculquicondor
Copy link
Contributor Author

/reopen

/lifecycle frozen

we need to get back to this when targeting a v1beta1 API.

@k8s-ci-robot
Copy link
Contributor

@alculquicondor: Reopened this issue.

In response to this:

/reopen

/lifecycle frozen

we need to get back to this when targeting a v1beta1 API.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Oct 11, 2022
@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Oct 11, 2022
@alculquicondor
Copy link
Contributor Author

I'm starting a list of potential changes to the API. See Issue description.

@alculquicondor
Copy link
Contributor Author

cc @kerthcet for feedback (and anyone already in the thread, of course)

@alculquicondor alculquicondor changed the title Rename API group to kueue.sigs.k8s.io Graduate API to beta and rename API group to kueue.sigs.k8s.io Jan 5, 2023
@kerthcet
Copy link
Contributor

kerthcet commented Jan 6, 2023

Move admission from Workload spec into status (from Enforce timeout for podsReady #498)

I'm ok with this, admission presents the actual state of a workload.

Rename min, max into something easier to understand.

Before, we named them requests/limits? Anyway, I think reading document is always needed, whatever the name. So I'm fine with min/max, another pair maybe guarantee/capacity?

Support queue name as a label, in addition to annotation (makes it easier to filter workloads by queue).

Can you provide some more context why we need this? It looks like looking for workloads via label selector.

@alculquicondor
Copy link
Contributor Author

Before, we named them requests/limits?

No, that would have been very confusing because they already mean something in the pod spec. I'll start a separate doc to discuss options.

Can you provide some more context why we need this? It looks like looking for workloads via label selector.

It's actually about filtering Jobs using the queue name.

@tenzen-y
Copy link
Member

tenzen-y commented Jan 6, 2023

No, that would have been very confusing because they already mean something in the pod spec. I'll start a separate doc to discuss options.

When I joined the kueue project, one of the most confusing of the kueue specification was the relationship between min/max and cohort.

IMO, the guarantee/capacity looks good rather than min/max.

@kerthcet
Copy link
Contributor

kerthcet commented Jan 7, 2023

It's actually about filtering Jobs using the queue name.

It made me think of adding queueName to job's spec kubernetes/enhancements#3397, we can filter jobs with field-selector then.

@alculquicondor
Copy link
Contributor Author

Yes, that would be ideal, but that KEP got significant push back, so I don't see it happening anytime soon.

@alculquicondor
Copy link
Contributor Author

We might also need to add ObjectMeta into each PodSet template. Cluster autoscaler needs the metadata to properly scale up.

@kerthcet
Copy link
Contributor

Is this for scaling up in advance? Or autoscaler only watching the unschedulable pods, who contains the metadata.

@alculquicondor
Copy link
Contributor Author

Yes, to scale up in advance

@alculquicondor
Copy link
Contributor Author

alculquicondor commented Feb 2, 2023

I've created a summary doc with the proposed changes as we graduate to beta (also available in the issue description): https://docs.google.com/document/d/1Uu4hfGxux4Wh_laqZMLxXdEVdty06Sb2DwB035hj700/edit?usp=sharing&resourcekey=0-b7mU7mGPCkEfhjyYDsXOBg

Some of the enhancements come from UX study sessions that we have conducted, see notes here: https://docs.google.com/document/d/1xbN46OLuhsXXHeqZrKrl9I57kpFQ2yqiYdOx0sHZC4Q/edit?usp=sharing

I have a WIP in #532

@alculquicondor alculquicondor changed the title Graduate API to beta and rename API group to kueue.sigs.k8s.io Graduate API to beta Feb 7, 2023
@tenzen-y
Copy link
Member

tenzen-y commented Feb 7, 2023

/assign @alculquicondor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants