Ignore reconciling against unmanaged child jobs in the jobframework #821

Closed
wants to merge 1 commit into from

Conversation

tenzen-y
Member

@tenzen-y tenzen-y commented May 29, 2023

What type of PR is this?

/kind bug

What this PR does / why we need it:

This fixes a bug where the jobframework reconciler reconciled against unmanaged child batch/job objects.

Which issue(s) this PR fixes:

Fixes #800

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Fix a bug where the jobframework reconciler reconciled against unmanaged child batch/job objects.

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels May 29, 2023
@netlify

netlify bot commented May 29, 2023

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit 658457e
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/647f99c868a7df000824aebe

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels May 29, 2023
@tenzen-y
Member Author

/assign @alculquicondor

@tenzen-y tenzen-y changed the title Ignore recociling against unmanaged child jobs in the jobframework Ignore reconciling against unmanaged child jobs in the jobframework May 29, 2023
pkg/controller/jobframework/reconciler.go Outdated Show resolved Hide resolved
pkg/controller/jobframework/interface.go Outdated Show resolved Hide resolved
test/integration/controller/job/job_controller_test.go Outdated Show resolved Hide resolved
test/integration/controller/job/job_controller_test.go Outdated Show resolved Hide resolved
@@ -464,100 +461,6 @@ var _ = ginkgo.Describe("Job controller for workloads with no queue set", func()
return k8sClient.Get(ctx, wlLookupKey, createdWorkload)
}, util.Timeout, util.Interval).Should(gomega.Succeed())
})
ginkgo.When("The parent-workload annotation is used", func() {
Member Author

@alculquicondor We cannot run the integration test cases for the parent-workload annotation here, since they require a known workload owner (currently only MPIJobs) as the parent job.

@tenzen-y
Member Author

Maybe there isn't enough memory :(

...
go fmt ./...
go vet ./...
golang.org/x/net/http2: /usr/local/go/pkg/tool/linux_amd64/compile: signal: killed
github.com/prometheus/procfs: /usr/local/go/pkg/tool/linux_amd64/compile: signal: killed
make: *** [Makefile:144: vet] Error 1

@tenzen-y
Member Author

/retest

pkg/controller/jobframework/reconciler.go Outdated Show resolved Hide resolved
pkg/controller/jobframework/reconciler.go Outdated Show resolved Hide resolved
})

ginkgo.It("Should not suspend a child job if the parent job doesn't have a queue name", func() {
ginkgo.By("Creating the child job without ownerReference which uses the parent workload annotation")
Contributor

🤔 maybe we should not allow a job to have the parent workload annotation if it doesn't have an owner reference?

Member Author

That makes sense. Actually, we verify the ownerReference at a defaulting webhook:

if owner := metav1.GetControllerOf(job); owner != nil && jobframework.KnownWorkloadOwner(owner) {

However, we don't have validation webhooks for that.
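
As a rough illustration of the validation discussed here, the sketch below rejects a job that carries the parent-workload annotation without a controller ownerReference of a known kind. It is not Kueue's code: the annotation key, the `job` and `ownerRef` types, `controllerOf` (a stand-in for `metav1.GetControllerOf`), and `knownOwnerKinds` are all hypothetical simplifications so the example runs with the standard library only.

```go
package main

import (
	"errors"
	"fmt"
)

// parentWorkloadAnnotation is a hypothetical key used only in this sketch.
const parentWorkloadAnnotation = "example.com/parent-workload"

// ownerRef is a simplified stand-in for metav1.OwnerReference.
type ownerRef struct {
	Kind       string
	Controller bool
}

// job is a simplified stand-in for the metadata of a child job.
type job struct {
	Annotations map[string]string
	Owners      []ownerRef
}

// knownOwnerKinds is an assumed allow-list; the thread mentions only MPIJobs.
var knownOwnerKinds = map[string]bool{"MPIJob": true}

var errUnmanagedParent = errors.New(
	"parent-workload annotation requires a controller ownerReference of a known kind")

// controllerOf returns the owner marked as controller, or nil if there is none.
func controllerOf(j job) *ownerRef {
	for i := range j.Owners {
		if j.Owners[i].Controller {
			return &j.Owners[i]
		}
	}
	return nil
}

// validateParentWorkload accepts jobs without the annotation, and jobs whose
// controller owner is of a known kind; everything else is rejected.
func validateParentWorkload(j job) error {
	if _, ok := j.Annotations[parentWorkloadAnnotation]; !ok {
		return nil
	}
	owner := controllerOf(j)
	if owner == nil || !knownOwnerKinds[owner.Kind] {
		return errUnmanagedParent
	}
	return nil
}

func main() {
	orphan := job{Annotations: map[string]string{parentWorkloadAnnotation: "wl"}}
	managed := job{
		Annotations: map[string]string{parentWorkloadAnnotation: "wl"},
		Owners:      []ownerRef{{Kind: "MPIJob", Controller: true}},
	}
	fmt.Println("orphan:", validateParentWorkload(orphan))
	fmt.Println("managed:", validateParentWorkload(managed))
}
```

In a real validating webhook the same check would presumably run on create and update and return a field error rather than a plain sentinel.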

Member Author

@tenzen-y tenzen-y May 31, 2023

I'm thinking of adding that validation in a follow-up PR 🤔

Either way, I will move this case to a unit test.

test/integration/controller/mpijob/suite_test.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 31, 2023
@tenzen-y tenzen-y force-pushed the framework-reconciler branch 2 times, most recently from 76e315f to bb1ea47 Compare June 1, 2023 16:37
@@ -14,7 +14,7 @@ See the License for the specific language governing permissions and
limitations under the License.
*/

-package jobframework
+package constants
Member Author

I moved this to the constants package to avoid an import cycle.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 1, 2023
@tenzen-y
Member Author

tenzen-y commented Jun 1, 2023

Rebased to resolve conflicts.

@tenzen-y
Member Author

tenzen-y commented Jun 1, 2023

@alculquicondor I've addressed your suggestions. Can you please take another look?

@@ -1293,7 +1293,7 @@ func TestCacheWorkloadOperations(t *testing.T) {
}

gotError := step.operation(cache)
-if diff := cmp.Diff(step.wantError, messageOrEmpty(gotError)); diff != "" {
+if diff := cmp.Diff(step.wantError, utiltesting.MessageOrEmpty(gotError)); diff != "" {
Contributor

This escaped my review lens in the past 🧐

I'm not really a fan of checking errors by comparing strings. We should make the effort to implement Is or export the relevant errors in the packages, so that cmp.Diff can handle the rest.

But it can be left for a follow up.

Member Author

I see. I will work on the cache package in follow-up PRs.

pkg/controller/jobframework/reconciler_test.go Outdated Show resolved Hide resolved
pkg/controller/jobframework/reconciler.go Outdated Show resolved Hide resolved
pkg/controller/jobframework/reconciler.go Outdated Show resolved Hide resolved
pkg/util/testing/units.go Outdated Show resolved Hide resolved
@tenzen-y
Member Author

tenzen-y commented Jun 2, 2023

@alculquicondor I addressed your comments. PTAL

@tenzen-y tenzen-y force-pushed the framework-reconciler branch 2 times, most recently from c24c743 to 12122e1 Compare June 6, 2023 20:16
@tenzen-y
Member Author

tenzen-y commented Jun 6, 2023

@alculquicondor I'm sorry I didn't immediately understand your suggestion correctly.
I updated this PR. Can you re-check this?

Contributor

@alculquicondor alculquicondor left a comment

/approve

A couple of nits

pkg/controller/jobframework/reconciler.go Outdated Show resolved Hide resolved
pkg/controller/jobframework/reconciler.go Outdated Show resolved Hide resolved
pkg/controller/jobframework/reconciler_test.go Outdated Show resolved Hide resolved
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, tenzen-y

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 6, 2023
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
@tenzen-y
Member Author

tenzen-y commented Jun 6, 2023

@alculquicondor I fixed all nits and squashed all commits into one.

@alculquicondor
Contributor

Structured logging is typically output as JSON, so camelCase is preferred.

@tenzen-y
Member Author

tenzen-y commented Jun 7, 2023

> Structured logging is typically output as JSON, so camelCase is preferred.

I see. That makes sense.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 7, 2023
@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tenzen-y
Member Author

tenzen-y commented Jun 7, 2023

I have rebased and addressed all suggestions.

@tenzen-y
Member Author

tenzen-y commented Jun 7, 2023

GitHub is under disruption...
https://www.githubstatus.com/

@alculquicondor
Contributor

rebase needed 😅

@tenzen-y
Member Author

tenzen-y commented Jun 7, 2023

> rebase needed 😅

Actually, I rebased and pushed the commit, but due to the GitHub disruption this PR isn't updated.

$ git log --oneline
da57d98 (HEAD -> framework-reconciler, origin/framework-reconciler) Use camelCase for logs
4017e3e Ignore reconciling against unmanaged child jobs in the jobframework
97639b9 (origin/main, origin/HEAD) Merge pull request #831 from epam/kustomize-deprecated-fields
68825ad Merge pull request #833 from cpanato/migrate
7444b27 Partial admission (#771)
5f02e23 Migrate away from google.com gcp project k8s-testimages

tenzen-y@da57d98

@tenzen-y tenzen-y closed this Jun 7, 2023
@tenzen-y
Member Author

tenzen-y commented Jun 7, 2023

/reopen

@k8s-ci-robot
Contributor

@tenzen-y: Failed to re-open PR: state cannot be changed. The framework-reconciler branch was force-pushed or recreated.

In response to this:

/reopen

@tenzen-y
Member Author

tenzen-y commented Jun 7, 2023

This PR was broken :(

@tenzen-y
Member Author

tenzen-y commented Jun 7, 2023

/reopen

@k8s-ci-robot
Contributor

@tenzen-y: Failed to re-open PR: state cannot be changed. The framework-reconciler branch was force-pushed or recreated.

In response to this:

/reopen

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Bug: framework.Reconciler reconciles against unmanaged child batch/job
3 participants