Empty imagePullSecrets leads to status.conditions never being set #101697

Closed
jendrikjoe opened this issue May 2, 2021 · 7 comments · Fixed by #103133
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@jendrikjoe

I am aware that the following is a corner case that does not occur in day-to-day business, but it took me a long while to debug, so I thought I would raise this issue to save others from the same pain.

What happened:

Deploying the following Job with an empty entry in imagePullSecrets leaves its pod stuck in Pending forever, with a status that looks like this:

"status": {
    "phase": "Pending",
    "qosClass": "BestEffort"
}

apiVersion: batch/v1
kind: Job
metadata:
  name: hello
spec:
  template:
    spec:
      imagePullSecrets:
        - {}
      containers:
        - name: hello
          image: busybox
          command: ["sh", "-c", 'echo "Hello, Kubernetes!" && sleep 3600']
      restartPolicy: OnFailure
      tolerations:
        - key: "jobs"
          operator: "Equal"
          value: "1"
          effect: "NoSchedule"
      nodeSelector:
        jobs: "1"

What you expected to happen:

The status should look something like:

"status": {
    "phase": "Pending",
    "conditions": [
           {
                "type": "PodScheduled",
                "status": "False",
                "lastProbeTime": null,
                "lastTransitionTime": "2021-05-02T18:34:25Z",
                "reason": "Unschedulable",
                "message": "0/8 nodes are available: 8 node(s) didn't match node selector."
           }
    ],
    "qosClass": "BestEffort"
}

This would allow my autoscaler to kick in and add the corresponding nodes.
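
To make the difference between the two statuses concrete, here is a minimal sketch of how a consumer such as an autoscaler might test for the condition. It assumes the pod status has already been parsed into a Python dict; the helper name `is_unschedulable` is hypothetical and not part of any Kubernetes client.

```python
def is_unschedulable(status):
    """Return True if the status carries a PodScheduled=False/Unschedulable condition."""
    for cond in status.get("conditions") or []:
        if (cond.get("type") == "PodScheduled"
                and cond.get("status") == "False"
                and cond.get("reason") == "Unschedulable"):
            return True
    return False


# Status as observed in this issue: no conditions at all.
observed = {"phase": "Pending", "qosClass": "BestEffort"}

# Status as expected: the scheduler reports why the pod cannot be placed.
expected = {
    "phase": "Pending",
    "conditions": [{
        "type": "PodScheduled",
        "status": "False",
        "reason": "Unschedulable",
        "message": "0/8 nodes are available: 8 node(s) didn't match node selector.",
    }],
    "qosClass": "BestEffort",
}

print(is_unschedulable(observed))  # False
print(is_unschedulable(expected))  # True
```

With the observed status there is nothing for such a check to match, so a consumer written this way would never see the pod as unschedulable.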

How to reproduce it (as minimally and precisely as possible):

Run kubectl apply -f <file-with-above-yaml-as-content>

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.4-dirty", GitCommit:"e87da0bd6e03ec3fea7933c4b5263d151aafd07c", GitTreeState:"dirty", BuildDate:"2021-03-15T09:58:13Z", GoVersion:"go1.16.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.8-eks-96780e", GitCommit:"96780e1b30acbf0a52c38b6030d7853e575bcdf3", GitTreeState:"clean", BuildDate:"2021-03-10T21:32:29Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: AWS EKS
  • OS (e.g: cat /etc/os-release): MacOSX 11.3
  • Kernel (e.g. uname -a): Darwin Kernel Version 20.4.0
  • Install tools: brew
@jendrikjoe jendrikjoe added the kind/bug Categorizes issue or PR as related to a bug. label May 2, 2021
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 2, 2021
@jendrikjoe
Author

/sig scheduling (I assume)

@k8s-ci-robot k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label May 2, 2021
@k8s-ci-robot
Contributor

@jendrikjoe: The label(s) sig/(i, sig/assume) cannot be applied, because the repository doesn't have them.

In response to this:

/sig scheduling (I assume)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label May 2, 2021
@pacoxu
Member

pacoxu commented May 6, 2021

/assign

Scheduler log

E0506 06:11:37.642438 1 scheduler.go:342] "Error updating pod" err="failed to create merge patch for pod "default"/"hello-xzpmf": map: map[] does not contain declared merge key: name" pod="default/hello-xzpmf"
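
The error message suggests that the pod update is computed as a merge patch in which list items are matched by a declared merge key (`name` here), and an item without that key cannot be matched. The following is a simplified illustration of that idea only, not the actual Kubernetes patch code; `index_by_merge_key` and the secret name `regcred` are hypothetical.

```python
def index_by_merge_key(items, merge_key):
    """Index list items by their merge key; fail if an item lacks the key."""
    indexed = {}
    for item in items:
        if merge_key not in item:
            raise ValueError(
                f"map: {item} does not contain declared merge key: {merge_key}"
            )
        indexed[item[merge_key]] = item
    return indexed


# A well-formed entry can be indexed by its name.
ok = index_by_merge_key([{"name": "regcred"}], "name")

# The empty entry from the Job above has no name, so matching fails.
try:
    index_by_merge_key([{}], "name")
    failed = False
    message = ""
except ValueError as err:
    failed = True
    message = str(err)
```

Under this reading, the patch fails before any status is written, which would explain why no conditions ever appear on the pod.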

@tengqm
Contributor

tengqm commented Jun 24, 2021

Should we instead forbid empty items in imagePullSecrets? Is there a use case for using [{}] as imagePullSecrets?

@jendrikjoe
Author

Should we instead forbid empty items in imagePullSecrets? Is there a use case for using [{}] as imagePullSecrets?

For me personally that would be totally fine 👍 I had no use case for it. It came about by accident, as I wasn't aware that I had to handle the [{}] case separately. An error that caused an early failure would have been fine by me and would have saved me all the trouble of tracking this down.

@pacoxu pacoxu removed their assignment Jun 28, 2021
@pacoxu
Member

pacoxu commented Jun 28, 2021

/assign @marwanad
as #103133

@alculquicondor
Member

Forbidding it would be the appropriate solution. However, we can't do that, because it would break backwards compatibility of the API.
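
Since the API has to keep accepting such manifests, one option on the client side is to drop empty entries before submitting. This is a minimal sketch, assuming the manifest is already parsed into a Python dict; `strip_empty_pull_secrets` is a hypothetical helper, not a kubectl or client-library feature.

```python
import copy


def strip_empty_pull_secrets(job):
    """Return a copy of a Job manifest without nameless imagePullSecrets entries."""
    cleaned = copy.deepcopy(job)
    pod_spec = cleaned["spec"]["template"]["spec"]
    secrets = pod_spec.get("imagePullSecrets") or []
    # Keep only entries that actually name a secret.
    kept = [s for s in secrets if s and s.get("name")]
    if kept:
        pod_spec["imagePullSecrets"] = kept
    else:
        # Nothing usable left, so omit the field entirely.
        pod_spec.pop("imagePullSecrets", None)
    return cleaned


job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "hello"},
    "spec": {"template": {"spec": {
        "imagePullSecrets": [{}],
        "containers": [{"name": "hello", "image": "busybox"}],
        "restartPolicy": "OnFailure",
    }}},
}

cleaned = strip_empty_pull_secrets(job)
```

The helper works on a deep copy so that the original manifest stays available for comparison or logging.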

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 28, 2021
5 participants