lost annotation in task.template #1567

Closed
hwdef opened this issue Jun 28, 2021 · 12 comments · Fixed by #1649

@hwdef
Member

hwdef commented Jun 28, 2021

What happened:

When I add an annotation to a vcjob's spec.tasks.template.metadata, the running pod does not have this annotation.

What you expected to happen:

The annotation should be kept on the pod.

How to reproduce it (as minimally and precisely as possible):

Create a vcjob:

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: test-job
spec:
  schedulerName: volcano
  priorityClassName: high-priority
  policies:
    - event: PodEvicted
      action: RestartJob
  maxRetry: 10
  tasks:
    - replicas: 1
      name: "default-nginx"
      template:
        metadata:
          name: web
          annotations:
            "xxxxxxxxxxxxx": "xxxxxxxxxxxxxx"
        spec:
          containers:
            - image: nginx
              imagePullPolicy: IfNotPresent
              name: nginx
              resources:
                requests:
                  cpu: "100m"
          restartPolicy: OnFailure

Get this vcjob:

[centos@master ~]$ kubectl get vcjob test-job -oyaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"batch.volcano.sh/v1alpha1","kind":"Job","metadata":{"annotations":{},"name":"test-job","namespace":"default"},"spec":{"maxRetry":10,"policies":[{"action":"RestartJob","event":"PodEvicted"}],"priorityClassName":"high-priority","schedulerName":"volcano","tasks":[{"name":"default-nginx","replicas":1,"template":{"metadata":{"annotations":{"xxxxxxxxxxxxx":"xxxxxxxxxxxxxx"},"name":"web"},"spec":{"containers":[{"image":"nginx","imagePullPolicy":"IfNotPresent","name":"nginx","resources":{"requests":{"cpu":"100m"}}}],"restartPolicy":"OnFailure"}}}]}}
  creationTimestamp: "2021-06-28T02:15:27Z"
  generation: 1
  name: test-job
  namespace: default
  resourceVersion: "1446893"
  uid: 6e0ddf28-d920-4eae-a0c0-f38f4e35298e
spec:
  maxRetry: 10
  minAvailable: 1
  policies:
  - action: RestartJob
    event: PodEvicted
  priorityClassName: high-priority
  queue: default
  schedulerName: volcano
  tasks:
  - minAvailable: 1
    name: default-nginx
    replicas: 1
    template:
      metadata: {}
      spec:
        containers:
        - image: nginx
          imagePullPolicy: IfNotPresent
          name: nginx
          resources:
            requests:
              cpu: 100m
        restartPolicy: OnFailure
status:
  minAvailable: 1
  running: 1
  state:
    lastTransitionTime: "2021-06-28T02:15:32Z"
    phase: Running
  taskStatusCount:
    default-nginx:
      phase:
        Running: 1

Get the pod created by this vcjob:

[centos@master ~]$ kubectl get pod test-job-default-nginx-0 -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    scheduling.k8s.io/group-name: test-job
    volcano.sh/job-name: test-job
    volcano.sh/job-version: "0"
    volcano.sh/queue-name: default
    volcano.sh/task-spec: default-nginx
    volcano.sh/template-uid: test-job-default-nginx
  creationTimestamp: "2021-06-28T02:15:28Z"
  labels:
    volcano.sh/job-name: test-job
    volcano.sh/job-namespace: default
    volcano.sh/queue-name: default
  name: test-job-default-nginx-0
  namespace: default
  ownerReferences:
  - apiVersion: batch.volcano.sh/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Job
    name: test-job
    uid: 6e0ddf28-d920-4eae-a0c0-f38f4e35298e
  resourceVersion: "1446892"
  uid: e7266266-56bc-45b0-9f16-8ff976235b2d
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: nginx
    resources:
      requests:
        cpu: 100m
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-vw6sr
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: node1
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: OnFailure
  schedulerName: volcano
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-vw6sr
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2021-06-28T02:15:30Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2021-06-28T02:15:32Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2021-06-28T02:15:32Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2021-06-28T02:15:29Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://4181ba7f95dd9099fc284a9f3cc962f36f8f6b566d9208b62b26a31c671f2485
    image: nginx:latest
    imageID: docker-pullable://nginx@sha256:6d75c99af15565a301e48297fa2d121e15d80ad526f8369c526324f0f7ccb750
    lastState: {}
    name: nginx
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2021-06-28T02:15:31Z"
  hostIP: 10.40.20.107
  phase: Running
  podIP: 10.244.3.18
  podIPs:
  - ip: 10.244.3.18
  qosClass: Burstable
  startTime: "2021-06-28T02:15:30Z"
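
The comparison between the submitted task template and the running pod can also be scripted. This is a minimal sketch, not part of Volcano: `missing_metadata` is a hypothetical helper, and the two dictionaries are copied by hand from the manifest and the pod output above.

```python
import json


def missing_metadata(template_meta, pod_meta, field="annotations"):
    """Return template entries under `field` that are absent or different on the pod."""
    wanted = template_meta.get(field) or {}
    actual = pod_meta.get(field) or {}
    return {key: value for key, value in wanted.items() if actual.get(key) != value}


# Metadata as written in the submitted task template.
template_meta = {
    "name": "web",
    "annotations": {"xxxxxxxxxxxxx": "xxxxxxxxxxxxxx"},
}

# Annotations and labels observed on the running pod.
pod_meta = {
    "annotations": {
        "scheduling.k8s.io/group-name": "test-job",
        "volcano.sh/job-name": "test-job",
        "volcano.sh/job-version": "0",
        "volcano.sh/queue-name": "default",
        "volcano.sh/task-spec": "default-nginx",
        "volcano.sh/template-uid": "test-job-default-nginx",
    },
    "labels": {
        "volcano.sh/job-name": "test-job",
        "volcano.sh/job-namespace": "default",
        "volcano.sh/queue-name": "default",
    },
}

# Prints the user-supplied annotation, which never reached the pod.
print(json.dumps(missing_metadata(template_meta, pod_meta), indent=2))
```

Against a live cluster, the same helper could be fed the `metadata` objects parsed from `kubectl get ... -o json` output instead of the hand-copied dictionaries.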

Anything else we need to know?:

Get the podgroup:

[centos@master ~]$ kubectl get pg test-job -o yaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"batch.volcano.sh/v1alpha1","kind":"Job","metadata":{"annotations":{},"name":"test-job","namespace":"default"},"spec":{"maxRetry":10,"policies":[{"action":"RestartJob","event":"PodEvicted"}],"priorityClassName":"high-priority","schedulerName":"volcano","tasks":[{"name":"default-nginx","replicas":1,"template":{"metadata":{"annotations":{"xxxxxxxxxxxxx":"xxxxxxxxxxxxxx"},"name":"web"},"spec":{"containers":[{"image":"nginx","imagePullPolicy":"IfNotPresent","name":"nginx","resources":{"requests":{"cpu":"100m"}}}],"restartPolicy":"OnFailure"}}}]}}
  creationTimestamp: "2021-06-28T02:15:27Z"
  generation: 8
  name: test-job
  namespace: default
  ownerReferences:
  - apiVersion: batch.volcano.sh/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Job
    name: test-job
    uid: 6e0ddf28-d920-4eae-a0c0-f38f4e35298e
  resourceVersion: "1446896"
  uid: 36cd4e41-8054-4831-997e-fb0f7d4101e3
spec:
  minMember: 1
  minResources:
    cpu: 100m
  priorityClassName: high-priority
  queue: default
status:
  conditions:
  - lastTransitionTime: "2021-06-28T02:15:29Z"
    message: '1/0 tasks in gang unschedulable: pod group is not ready, 1 minAvailable.'
    reason: NotEnoughResources
    status: "True"
    transitionID: faada022-83fc-44f8-a4b0-475a8eeb8e68
    type: Unschedulable
  - lastTransitionTime: "2021-06-28T02:15:33Z"
    reason: tasks in gang are ready to be scheduled
    status: "True"
    transitionID: ec4c4b0f-3b92-4cc9-ba6a-3d8d16c886e1
    type: Scheduled
  phase: Running
  running: 1

Environment:

  • Volcano Version: 1.3.0
  • Kubernetes version (use kubectl version): 1.21.0
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release): centos 7
  • Kernel (e.g. uname -a): Linux master 3.10.0-1160.24.1.el7.x86_64 SMP Thu Apr 8 19:51:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: kubeadm
  • Others:
@hwdef hwdef added the kind/bug label Jun 28, 2021
@hwdef hwdef changed the title lose lost annotation in task.template Jun 28, 2021
@Thor-wl Thor-wl self-assigned this Jun 28, 2021
@Thor-wl
Contributor

Thor-wl commented Jun 28, 2021

Thanks for your report. I'll try to reproduce it.

@hwdef
Member Author

hwdef commented Jun 28, 2021

@Thor-wl

I will try to fix it, too.

@Thor-wl
Contributor

Thor-wl commented Jun 28, 2021

> @Thor-wl
>
> I will try to fix it, too.

Great!

@xizhang919

The label gets overwritten too.

@hwdef
Member Author

hwdef commented Jun 28, 2021

@xizhang919

Yes, you are right.

@kye308

kye308 commented Jun 28, 2021

I believe I am seeing this issue with labels being set in task.template as well

@max0ne

max0ne commented Jul 28, 2021

@hwdef @Thor-wl any update on this?

@Thor-wl
Contributor

Thor-wl commented Jul 29, 2021

> @hwdef @Thor-wl any update on this?

Sorry, I have been working on the hierarchical queue recently and have not retested this bug. Are you interested in helping with it?

@hwdef
Member Author

hwdef commented Jul 29, 2021

@max0ne @Thor-wl
I checked some of the code but couldn't find the cause. By the time the job reaches the webhook, the annotation has already been lost.

@hwdef
Member Author

hwdef commented Jul 29, 2021

Going forward, we may have to look at the code in the apiserver.

@shinytang6
Member

Related upstream issue: kubernetes-sigs/controller-tools#448

@shinytang6
Member

Maybe we should upgrade the controller-tools version; v0.6.0 seems to fix that.
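
One mechanism that would be consistent with the comments above (the annotation is already gone when the webhook sees the job, and the stored job shows `metadata: {}`) is structural-schema pruning: for CRDs that enable pruning, the API server drops fields the schema does not declare. This thread does not confirm that this is the cause. The sketch below is a simplified simulation under that assumption, not the apiserver's implementation; `prune` and both schema fragments are hypothetical and written only for illustration.

```python
from copy import deepcopy


def prune(value, schema):
    """Drop fields that the schema does not declare (simplified pruning model)."""
    if schema.get("x-kubernetes-preserve-unknown-fields"):
        return deepcopy(value)
    if isinstance(value, dict):
        props = schema.get("properties")
        extra = schema.get("additionalProperties")
        pruned = {}
        for key, item in value.items():
            if props is not None and key in props:
                pruned[key] = prune(item, props[key])
            elif isinstance(extra, dict):
                pruned[key] = prune(item, extra)
            # Otherwise the key is unknown to the schema and is dropped.
        return pruned
    if isinstance(value, list):
        item_schema = schema.get("items", {})
        return [prune(item, item_schema) for item in value]
    return value


# Hypothetical schema fragment: embedded metadata declared as a bare object.
BARE_TEMPLATE = {
    "type": "object",
    "properties": {
        "metadata": {"type": "object"},
        "spec": {"type": "object", "x-kubernetes-preserve-unknown-fields": True},
    },
}

# Hypothetical schema fragment that declares the embedded metadata fields.
FULL_TEMPLATE = deepcopy(BARE_TEMPLATE)
FULL_TEMPLATE["properties"]["metadata"] = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "annotations": {"type": "object", "additionalProperties": {"type": "string"}},
        "labels": {"type": "object", "additionalProperties": {"type": "string"}},
    },
}

template = {
    "metadata": {"name": "web", "annotations": {"xxxxxxxxxxxxx": "xxxxxxxxxxxxxx"}},
    "spec": {"restartPolicy": "OnFailure"},
}

print(prune(template, BARE_TEMPLATE)["metadata"])  # empty, like the stored vcjob
print(prune(template, FULL_TEMPLATE)["metadata"])  # name and annotations survive
```

If this is what happens, the fix would have to be in the generated CRD schema rather than in the controller or webhook code, which would fit the suggestion to upgrade controller-tools.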
