[Flaky] Creating a Job In a Twostepadmission Queue [It] Should unsuspend a job only after all checks are cleared #1090

Closed
alculquicondor opened this issue Aug 30, 2023 · 11 comments · Fixed by #1127
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@alculquicondor
Contributor

What happened:

Kueue when Creating a Job In a Twostepadmission Queue [It] Should unsuspend a job only after all checks are cleared
/home/prow/go/src/sigs.k8s.io/kueue/test/e2e/e2e_test.go:215
  [FAILED] Timed out after 30.001s.
  Expected
      <[]interface {} | len:2, cap:2>: [
          <bool>true,
          <map[string]string | len:0>nil,
      ]
  to equal
      <[]interface {} | len:2, cap:2>: [
          <bool>false,
          <map[string]string | len:1>{
              "instance-type": "on-demand",
          },
      ]
  In [It] at: /home/prow/go/src/sigs.k8s.io/kueue/test/e2e/e2e_test.go:270 @ 08/30/23 19:13:45.757
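
The failure output reads as a polled comparison of two values, the Job's suspend flag and its node selector, that never reached the expected pair within 30 seconds: the job stayed suspended (`true`) with no node selector, while the test expected it to be unsuspended (`false`) with `instance-type: on-demand`. The sketch below only illustrates that kind of check, using the standard library. `jobState` and `waitForState` are hypothetical names and this is not the code in `e2e_test.go`.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// jobState is a hypothetical stand-in for the two values being compared:
// the Job's suspend flag and its pod template node selector.
type jobState struct {
	Suspend      bool
	NodeSelector map[string]string
}

// waitForState polls get until it returns want or the timeout elapses.
// It returns the last observed state and whether that state matched.
func waitForState(get func() jobState, want jobState, timeout, interval time.Duration) (jobState, bool) {
	deadline := time.Now().Add(timeout)
	for {
		got := get()
		if reflect.DeepEqual(got, want) {
			return got, true
		}
		if time.Now().After(deadline) {
			return got, false
		}
		time.Sleep(interval)
	}
}

func main() {
	want := jobState{
		Suspend:      false,
		NodeSelector: map[string]string{"instance-type": "on-demand"},
	}

	// A job that never leaves the suspended state, as in the failing run.
	stuck := func() jobState { return jobState{Suspend: true} }

	got, ok := waitForState(stuck, want, 50*time.Millisecond, 10*time.Millisecond)
	fmt.Printf("matched=%v suspend=%v selector=%v\n", ok, got.Suspend, got.NodeSelector)
}
```

A timeout in a check of this shape means only that the expected state was not observed in time. It does not say whether the job would have been unsuspended later.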

How to reproduce it (as minimally and precisely as possible):

https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_kueue/1031/pull-kueue-test-e2e-main-1-24/1696962388068143104

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
  • Kueue version (use git describe --tags --dirty --always):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
@alculquicondor alculquicondor added the kind/bug Categorizes issue or PR as related to a bug. label Aug 30, 2023
@alculquicondor
Contributor Author

@achernevskii could you take a look?

@achernevskii
Contributor

/assign

@alculquicondor
Contributor Author

/unassign @achernevskii
/assign @trasc

This is flaking a lot; can you PTAL, Traian?

@k8s-ci-robot k8s-ci-robot assigned trasc and unassigned achernevskii Sep 13, 2023
@trasc
Contributor

trasc commented Sep 13, 2023

Sure

@trasc
Contributor

trasc commented Sep 18, 2023

#1127 should fix this particular case; however, the main problem appears to be that eviction is no longer working as expected.

@alculquicondor
Contributor Author

Can you elaborate?

@tenzen-y
Member

however, the main problem appears to be that eviction is no longer working as expected.

@trasc Could you clarify the unexpected behavior in the eviction?

@trasc
Contributor

trasc commented Sep 26, 2023

Even with the "false" admission, the flow would have been:

  1. Admit (before the first reconcile)
  2. Evict (during the first reconcile, when the admission checks (ACs) are added)
  3. Readmit

However, at least within the configured timeout, the eviction does not finish, so the readmission does not take place.

If I remember correctly, "Kueue when Creating a Job With Queueing Should readmit preempted job with workloadPriorityClass into a separate flavor" was flaky as well, but since we no longer test 1.24 and no longer have access to the test grid, I'm not sure.
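
The admit, evict, readmit flow described above can be pictured as a small state machine. This is a toy model under stated assumptions: `phase`, `workload`, `step`, and `run` are hypothetical names, and Kueue tracks this state through workload conditions rather than a single enum. It only shows why a stalled eviction keeps the readmission from happening.

```go
package main

import "fmt"

// phase is a hypothetical, simplified workload phase (not a Kueue type).
type phase int

const (
	pending phase = iota
	admitted
	evicting
	readmitted
)

func (p phase) String() string {
	return [...]string{"Pending", "Admitted", "Evicting", "Readmitted"}[p]
}

// workload is a toy model of the flow in the comment above.
type workload struct {
	phase        phase
	checksAdded  bool // admission checks were attached by the first reconcile
	evictionDone bool // the eviction ran to completion
}

// step advances the workload by one reconcile and reports whether it made progress.
func (w *workload) step() bool {
	switch w.phase {
	case pending:
		w.phase = admitted // 1. admitted before the first reconcile
	case admitted:
		if !w.checksAdded {
			return false
		}
		w.phase = evicting // 2. evicted once the checks are added
	case evicting:
		if !w.evictionDone {
			return false // the eviction never finishes, so nothing follows
		}
		w.phase = readmitted // 3. readmitted
	default:
		return false
	}
	return true
}

// run steps the workload until it stops progressing or maxSteps is reached,
// and returns every phase it passed through.
func run(w *workload, maxSteps int) []phase {
	trace := []phase{w.phase}
	for i := 0; i < maxSteps && w.step(); i++ {
		trace = append(trace, w.phase)
	}
	return trace
}

func main() {
	fmt.Println(run(&workload{checksAdded: true, evictionDone: true}, 10))
	fmt.Println(run(&workload{checksAdded: true, evictionDone: false}, 10))
}
```

In this model the second workload stops in the evicting phase, which matches the symptom described: the test times out while waiting for a readmission that depends on the eviction completing.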

@tenzen-y
Member

Even with the "false" admission, the flow would have been:

  1. Admit (before the first reconcile)
  2. Evict (during the first reconcile, when the admission checks (ACs) are added)
  3. Readmit

However, at least within the configured timeout, the eviction does not finish, so the readmission does not take place.

If I remember correctly, "Kueue when Creating a Job With Queueing Should readmit preempted job with workloadPriorityClass into a separate flavor" was flaky as well, but since we no longer test 1.24 and no longer have access to the test grid, I'm not sure.

I see. Thanks for clarifying.
