Add support for single pods #1072
Hi @achernevskii. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
Please rebase instead of merging. It might be OK to squash at this point.
b622135 to 31fd8e1
bbd171f to 15ca7b3
/approve
/hold for nits
I will leave LGTM to @tenzen-y
pkg/controller/jobs/kubeflow/jobs/pytorchjob/pytorchjob_controller.go
pkg/controller/jobs/kubeflow/jobs/xgboostjob/xgboostjob_controller.go
Also, can you rebase this PR and add IsManagingObjectsOwner to the PaddleJob?
utilruntime.Must(jobframework.RegisterIntegration(FrameworkName, jobframework.IntegrationCallbacks{
	SetupIndexes:  SetupIndexes,
	NewReconciler: NewReconciler,
	SetupWebhook:  SetupPaddleJobWebhook,
	JobType:       &kftraining.PaddleJob{},
	AddToScheme:   kftraining.AddToScheme,
}))
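For context, the requested callback decides whether an owner reference points at an object handled by this integration. The following is a minimal sketch, not the code from this PR: `OwnerReference` is a hypothetical stand-in for `metav1.OwnerReference`, and both the function name `isPaddleJobOwner` and the `kubeflow.org/` prefix check are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// OwnerReference is a hypothetical stand-in for metav1.OwnerReference,
// reduced to the two fields this sketch needs.
type OwnerReference struct {
	APIVersion string
	Kind       string
}

// isPaddleJobOwner reports whether ref points at a Kubeflow PaddleJob.
// The name and the API group prefix are assumptions, not taken from the PR.
func isPaddleJobOwner(ref *OwnerReference) bool {
	if ref == nil {
		return false
	}
	return ref.Kind == "PaddleJob" && strings.HasPrefix(ref.APIVersion, "kubeflow.org/")
}

func main() {
	paddle := &OwnerReference{APIVersion: "kubeflow.org/v1", Kind: "PaddleJob"}
	batch := &OwnerReference{APIVersion: "batch/v1", Kind: "Job"}
	fmt.Println(isPaddleJobOwner(paddle))
	fmt.Println(isPaddleJobOwner(batch))
}
```

A function of this shape could then be passed as the IsManagingObjectsOwner field of the callbacks struct, assuming the field accepts a predicate over an owner reference.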
})
})
})
})
Add test cases when starting manager with the scheduler like other integrations:
var _ = ginkgo.Describe("JobSet controller interacting with scheduler", ginkgo.Ordered, ginkgo.ContinueOnFailure, func() {
/release-note Add support for single pods
@achernevskii Can you add a release note?
* Add namespace/pod label filtering for Default pod webhook. Add PodIntegrationOptions configuration field containing namespace and pod label selectors.
* Simplify RunWithPodSetsInfo method for the pod controller.
* Create a new interface JobWithFinalize. Jobs implementing it support custom finalization logic.
* Add k8s version check. If the pod integration is enabled on k8s server versions < 1.27, the Kueue pod will stop with an error message.
* Add JobWithSkip interface. Jobs that implement this interface can introduce custom reconciliation skip logic.
* Add IsPodOwnerManagedByQueue function. Defaulting webhook will skip a pod if its owner is managed by Kueue. Reconciler will skip such a pod even if 'managed' label is set.
* Add integration tests for the pod controller and webhook.
* Add tests for pod controller interacting with scheduler.
* Change IsManagingObjectsOwner functions for Kubeflow jobs.
* Update webhook paths for integration tests.
* Add parent check for Kubeflow PaddleJob resource.
* Update helm chart.
* Replace patchesStrategicMerge with patches in webhook kustomization.yaml.
* Update unit/integration tests.
* Change pod webhook failure policy from Ignore to Fail, add webhook namespace selector patch.
* Merge tests in validation_test.go into a single test for exported ValidateConfiguration function.
* Update JobWithSkip interface.
* Add missing license comments.
* Rewrite some of the integration test messages and value validations.
* Update unit tests.
4e65928 to 2d5ba77
407abd4 to aac3eb4
@achernevskii This looks great! Thank you!
/lgtm
/approve
LGTM label has been added. Git tree hash: 305aa7f7f714fdb39db5225c75e94406e27ca817
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: achernevskii, alculquicondor, tenzen-y The full list of commands accepted by this bot can be found here. The pull request process is described here
/hold cancel
What type of PR is this?
/kind feature
What this PR does / why we need it:
Add support for pods (k8s 1.26+)
Add PodIntegrationOptions configuration field containing namespace and pod label selectors.
Create a new interface JobWithFinalize. Jobs implementing it support custom finalization logic.
Add namespace/pod label filtering for Default pod webhook.
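The namespace/pod label filtering described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `labelSelector` is a hypothetical matchLabels-only stand-in for a Kubernetes label selector, and the struct fields and `shouldManage` helper are invented names. An empty selector matches everything, which mirrors how an empty Kubernetes label selector behaves.

```go
package main

import "fmt"

// labelSelector is a hypothetical, matchLabels-only stand-in for a
// Kubernetes label selector. An empty selector matches every label set.
type labelSelector map[string]string

// matches reports whether every key/value pair in s is present in labels.
func (s labelSelector) matches(labels map[string]string) bool {
	for key, want := range s {
		if got, ok := labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

// podIntegrationOptions is an assumed shape for the configuration field;
// the field names are illustrative only.
type podIntegrationOptions struct {
	NamespaceSelector labelSelector
	PodSelector       labelSelector
}

// shouldManage returns true only when both the namespace labels and the
// pod labels satisfy their respective selectors.
func shouldManage(opts podIntegrationOptions, nsLabels, podLabels map[string]string) bool {
	return opts.NamespaceSelector.matches(nsLabels) && opts.PodSelector.matches(podLabels)
}

func main() {
	opts := podIntegrationOptions{
		NamespaceSelector: labelSelector{"team": "ml"},
		PodSelector:       labelSelector{"queue": "enabled"},
	}
	ns := map[string]string{"team": "ml"}
	fmt.Println(shouldManage(opts, ns, map[string]string{"queue": "enabled"}))
	fmt.Println(shouldManage(opts, ns, map[string]string{"app": "web"}))
}
```

Checking the namespace before the pod lets a webhook leave whole namespaces, such as system namespaces, untouched regardless of pod labels.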
Which issue(s) this PR fixes:
Related issue: #976
Special notes for your reviewer:
Does this PR introduce a user-facing change?