Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of coscheduling based on CRD [Part 1] #52

Merged
merged 1 commit into from
Sep 22, 2020

Conversation

cwdsuzhou
Copy link
Member

@cwdsuzhou cwdsuzhou commented Sep 16, 2020

Part 1: we implement the controller and crd
.
/cc @Huang-Wei

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 16, 2020
@denkensk
Copy link
Member

We need to separate this PR to two @cwdsuzhou @Huang-Wei

  1. CRD Define + Controller
  2. Use the PodGroup in the current coscheduling implemention (instead of deleteing the current coscheduling directly)

_ "sigs.k8s.io/scheduler-plugins/pkg/apis/config/scheme"
)

func main() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to separate main function for coscheduling.

@@ -0,0 +1,38 @@
/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same as scheduler, we don't need separate controller.

@@ -1,7 +1,6 @@
# Overview

This folder holds the coscheduling plugin implementations based on [Lightweight coscheduling based on back-to-back queue
sorting](https://github.com/kubernetes-sigs/scheduler-plugins/tree/master/kep/2-lightweight-coscheduling).
This folder holds the coscheduling plugin implementations based on [Coscheduling based on PodGroup CRD](https://github.com/kubernetes-sigs/scheduler-plugins/tree/master/kep/42-podgroup-coscheduling).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please leave the link of kep https://github.com/kubernetes-sigs/scheduler-plugins/tree/master/kep/2-lightweight-coscheduling here.

@@ -0,0 +1,21 @@
/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the apis from pkg/coscheduling/apis to pkg/apis/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I thinks this apis are only for coscheduling

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PodGroup is a universal concept. It maybe used by other plugins.

// MinResources defines the minimal resource of members/tasks to run the pod group;
// if there's not enough resources to start all tasks, the scheduler
// will not start anyone.
MinResources *v1.ResourceList `json:"minResources,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As comments before, MinResource was going to removed.
#42 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#42 (comment), this it the latest.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused about how to set the MinResource. It's the total minimal resource of the PodGroup or the pod?
yes


Yes? the PodGroup or the pod?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused about how to set the MinResource. It's the total minimal resource of the PodGroup or the pod?
yes

Yes? the PodGroup or the pod?

We have discussed about this question offline.

minimal resource of PodGroup

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. It's better to change like this:

// MinResources defines the minimal resource of members/tasks to run the pod group; ---> // MinResources defines the minimal resource to run the pod group;

@cwdsuzhou
Copy link
Member Author

We need to separate this PR to two @cwdsuzhou @Huang-Wei

  1. CRD Define + Controller
  2. Use the PodGroup in the current coscheduling implemention (instead of deleteing the current coscheduling directly)

Too many difference between the existing version and this one. The existing version even can not pass the integration test. If you do not like this, we would like move it to an independent pkg named coscheduling-with-crd. Actually we just use similar interface, detailed implementations are absolutely different.

@denkensk
Copy link
Member

Please separate this PR to two firstly.
CRD Define + Controller
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 16, 2020
Copy link
Contributor

@Huang-Wei Huang-Wei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @cwdsuzhou !

A high-level comment is that I'd still prefer not to mixin the implementation into this PR, i.e., it's ideal to have 2 PRs - one for CRD and controller, the 2nd one to include the implementation code. Additionally, the code structure needs to be tweaked a bit as we won't generate binaries for each plugin - all plugins will be compiled into particular type of binaries (scheduler, controller, admission, etc.).

With that said, here are my comments:

  • Retitle the PR to "Implementation of coscheduling based on CRD [Part 1]"
  • Move cmd/main.go and cmd/main_test.go to a new pkg cmd/scheduler/.
  • Make cmd/coscheduling/controller a generic controller, i.e., move cmd/coscheduler/controller/* to cmd/controller
  • Move pkg/coscheduling/apis/podgroup to pkg/apis/podgroup as PodGroup is a generic API, and hence may consumed by multiple plugins. It also follows the upstream code structure.
  • pkg/coscheduling/controller/* should also go to pkg/controller, and named as podgroup.go.
  • Move pkg/coscheduling/generated to pkg/generated
  • IMO pkg/coscheduling/scheduler/*, along with pkg/coscheduling/coscheduler.go can be introduced in the next PR.
  • Add tests to test PodGroup and controller logic only.
  • Modify Dockerfile and Makefile based on latest codebase.

FYI: I've exercised some (not all) of the above comments, you can supplement based on it:

@cwdsuzhou
Copy link
Member Author

Thanks for the PR @cwdsuzhou !

A high-level comment is that I'd still prefer not to mixin the implementation into this PR, i.e., it's ideal to have 2 PRs - one for CRD and controller, the 2nd one to include the implementation code. Additionally, the code structure needs to be tweaked a bit as we won't generate binaries for each plugin - all plugins will be compiled into particular type of binaries (scheduler, controller, admission, etc.).

With that said, here are my comments:

  • Retitle the PR to "Implementation of coscheduling based on CRD [Part 1]"
  • Move cmd/main.go and cmd/main_test.go to a new pkg cmd/scheduler/.
  • Make cmd/coscheduling/controller a generic controller, i.e., move cmd/coscheduler/controller/* to cmd/controller
  • Move pkg/coscheduling/apis/podgroup to pkg/apis/podgroup as PodGroup is a generic API, and hence may consumed by multiple plugins. It also follows the upstream code structure.
  • pkg/coscheduling/controller/* should also go to pkg/controller, and named as podgroup.go.
  • Move pkg/coscheduling/generated to pkg/generated
  • IMO pkg/coscheduling/scheduler/*, along with pkg/coscheduling/coscheduler.go can be introduced in the next PR.
  • Add tests to test PodGroup and controller logic only.
  • Modify Dockerfile and Makefile based on latest codebase.

FYI: I've exercised some (not all) of the above comments, you can supplement based on it:

Thanks!

@cwdsuzhou cwdsuzhou changed the title Implementation of coscheduling based on CRD Implementation of coscheduling based on CRD [Part 1] Sep 17, 2020
Copy link
Contributor

@Huang-Wei Huang-Wei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just ignore the flaky test, #54 is created for tracking that.

cmd/scheduler/main.go Outdated Show resolved Hide resolved
pkg/util/k8s.go Outdated Show resolved Hide resolved
pkg/util/k8s.go Outdated Show resolved Hide resolved
cmd/controller/app/server.go Outdated Show resolved Hide resolved
cmd/controller/app/server.go Outdated Show resolved Hide resolved
pkg/controller/podgroup_test.go Outdated Show resolved Hide resolved
pkg/controller/podgroup_test.go Outdated Show resolved Hide resolved
pkg/controller/podgroup_test.go Outdated Show resolved Hide resolved
pkg/controller/podgroup_test.go Outdated Show resolved Hide resolved
pkg/controller/podgroup_test.go Outdated Show resolved Hide resolved
@cwdsuzhou
Copy link
Member Author

PR updated according to your suggestions @Huang-Wei

// +optional
metav1.ListMeta `json:"metadata,omitempty"`

// items is the list of PodGroup
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// items is the list of PodGroup
// Items is the list of PodGroup

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@Huang-Wei
Copy link
Contributor

There is one comment unresolved: #52 (comment). LGTM otherwise.

@denkensk do you want to take another round of review?


const (
// PodGroupLabel is the default label of batch scheduler
PodGroupLabel = "podgroup.scheduling.sigs.k8s.io"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

podgroup.scheduling.sigs.k8s.io --> pod-group.scheduling.sigs.k8s.io

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, podgroup is alose OK. WDYT

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. pod-group conforms to the naming standard of k8s. We talk about it before.
  2. Hope to be consistent with the current implementation of the naming

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, done!


var (
// ErrorNotMatched means pod does not match batch scheduling
ErrorNotMatched = fmt.Errorf("not match batch scheduling")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not match batch scheduling --> not match coscheduling

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks

Failed uint32 `json:"failed,omitempty"`

// ScheduleStartTime of the group
ScheduleStartTime metav1.Time `json:"scheduleStartTime,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ScheduleStartTime is the create time of PG ?
If So, it's better to rename it to CreateTime?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is field just record when the group starts scheduling

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, It also need to add the field to record the create time of PG.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MetaData of CRD already includes this field!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, Thanks

Change codes according to round 1 suggestions

Update PR based on suggestions
@Huang-Wei
Copy link
Contributor

/approve

leaving /lgtm to @denkensk

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cwdsuzhou, Huang-Wei

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 21, 2020
@cwdsuzhou
Copy link
Member Author

ping @denkensk

@cwdsuzhou
Copy link
Member Author

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 22, 2020
@cwdsuzhou
Copy link
Member Author

@denkensk @Huang-Wei

seems not work. May be you should left only /lgtm ?

@denkensk
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 22, 2020
@k8s-ci-robot k8s-ci-robot merged commit 96fd011 into kubernetes-sigs:master Sep 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants