Retrying failed workflow using WorkflowTemplate get stuck in Running phase of the workflow and Pending phase of the node if POD_NAMES=v2 #7315

Closed
mikutas opened this issue Dec 1, 2021 · 0 comments · Fixed by #7316

mikutas commented Dec 1, 2021

Summary

What happened/what you expected to happen?

When retrying a failed workflow with POD_NAMES=v2, the old pod is not deleted and the new pod is not created.
The UI shows the workflow's phase as Running and the node's phase as Pending, and nothing changes after that.

What version of Argo Workflows are you running?

master (d4aa9d1) (includes #7301)

Diagnostics

examples for repro

# templates.yaml
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: workflow-template-whalesay-template
spec:
  entrypoint: whalesay-template
  templates:
  - name: whalesay-template
    inputs:
      parameters:
      - name: message
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
---
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: workflow-template-fail-template
spec:
  templates:
  - name: fail-template
    container:
      image: python:alpine3.6
      command: [python, -c]
      # fail with a 100% probability
      args: ["import sys; sys.exit(1)"]

# failure-in-step2.yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-template-retry-with-steps-
spec:
  entrypoint: retry-with-steps
  templates:
  - name: retry-with-steps
    steps:
    - - name: hello1
        templateRef:
          name: workflow-template-whalesay-template
          template: whalesay-template
        arguments:
          parameters:
          - name: message
            value: "I am hello1"
    - - name: hello2
        templateRef:
          name: workflow-template-fail-template
          template: fail-template

# fish
env POD_NAMES=v2 make start API=true UI=true
kubectl apply -f templates.yaml -n argo
argo submit failure-in-step2.yaml -n argo

What Kubernetes provider are you using?

k3d

What executor are you running?

Emissary? (just ran make argoexec-image)

# Logs from the server:
argo-server | time="2021-12-01T18:44:54.052Z" level=info msg="Deleting pod: workflow-template-retry-with-steps-qzzt8--2273160064"

The actual pod name is workflow-template-retry-with-steps-qzzt8-fail-template-2273160064, so the name in the delete request is missing the template name segment (fail-template).
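
The mismatch between the logged name and the actual name can be reproduced with a minimal sketch. It assumes that a v2 pod name has the shape `<workflow name>-<template name>-<hash suffix>`, which is inferred only from the two names quoted in this report. `podNameV2` is a hypothetical helper, not Argo's actual function, and it ignores any length truncation the real code may apply.

```go
package main

import "fmt"

// podNameV2 is a hypothetical helper (not Argo's real API). It joins the
// workflow name, the template name, and the node's hash suffix with hyphens,
// which is the shape assumed from the names in this report.
func podNameV2(workflowName, templateName, suffix string) string {
	return fmt.Sprintf("%s-%s-%s", workflowName, templateName, suffix)
}

func main() {
	wf := "workflow-template-retry-with-steps-qzzt8"
	suffix := "2273160064"

	// An empty template name leaves two adjacent hyphens, as in the server log.
	fmt.Println(podNameV2(wf, "", suffix))

	// With the template name filled in, the result matches the actual pod name.
	fmt.Println(podNameV2(wf, "fail-template", suffix))
}
```

Under that assumption, an empty template name yields exactly the name the server tried to delete, which would explain why the old pod is never found.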


mikutas added a commit to mikutas/argo-workflows that referenced this issue Dec 1, 2021
node has TemplateRef.Template field rather than TemplateName field
if using WorkflowTemplate

Signed-off-by: mikutas <23391543+mikutas@users.noreply.github.com>
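
The commit message above describes where the template name lives when a node comes from a WorkflowTemplate. A minimal sketch of that lookup might look like the following. The `Node` and `TemplateRef` types here are simplified, hypothetical stand-ins that mirror only the field names mentioned in the commit message, and `templateNameOf` is an illustrative helper rather than the code that was merged.

```go
package main

import "fmt"

// TemplateRef is a simplified stand-in for a reference to a template
// inside a WorkflowTemplate (hypothetical type for illustration).
type TemplateRef struct {
	Name     string // name of the WorkflowTemplate
	Template string // name of the template inside it
}

// Node is a simplified stand-in for a workflow node status.
type Node struct {
	TemplateName string       // set when the node uses a local template
	TemplateRef  *TemplateRef // set when the node uses templateRef
}

// templateNameOf returns the local template name when present, and
// otherwise falls back to the template named in the TemplateRef.
func templateNameOf(n Node) string {
	if n.TemplateName != "" {
		return n.TemplateName
	}
	if n.TemplateRef != nil {
		return n.TemplateRef.Template
	}
	return ""
}

func main() {
	// hello2 from the repro: the template is reached through templateRef,
	// so TemplateName is empty.
	hello2 := Node{TemplateRef: &TemplateRef{
		Name:     "workflow-template-fail-template",
		Template: "fail-template",
	}}
	fmt.Println(templateNameOf(hello2))
}
```

Reading only `TemplateName` for such a node gives an empty string, which is consistent with the doubled hyphen in the logged pod name.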
mikutas changed the title from "Retrying failed workflow get stuck in Running phase of the workflow and Pending phase of the node if POD_NAMES=v2" to "Retrying failed workflow using WorkflowTemplate get stuck in Running phase of the workflow and Pending phase of the node if POD_NAMES=v2" Dec 1, 2021
alexec pushed a commit that referenced this issue Dec 8, 2021
Signed-off-by: mikutas <23391543+mikutas@users.noreply.github.com>
@sarabala1979 sarabala1979 mentioned this issue Dec 15, 2021
73 tasks
sarabala1979 pushed a commit that referenced this issue Dec 15, 2021
Signed-off-by: mikutas <23391543+mikutas@users.noreply.github.com>