Pod not found when resubmit --memoized #3214

Closed
brendanstennett opened this issue Jun 11, 2020 · 9 comments
Labels: type/bug, type/regression (Regression from previous behavior, a specific type of bug)

Comments

@brendanstennett

What happened:

When the resubmit example is resubmitted with --memoized, the new workflow fails: the node for the pod that failed in the original run shows a "pod deleted" message instead of being rerun. See "workflow result" below.

What you expected to happen:

I expected that pod to be retried.
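
The expectation above can be sketched as follows. This is a hypothetical model for illustration only, not Argo's resubmit code; the function name `plan_memoized_resubmit` and the dictionary layout are assumptions. It assumes that a memoized resubmit reuses nodes that succeeded in the previous run and reruns everything else, and it uses the pod phases reported in the controller logs in this report.

```python
# Hypothetical model for illustration only; this is not Argo's resubmit code.
def plan_memoized_resubmit(node_phases):
    """Split pod nodes into those reused from the previous run and those rerun.

    node_phases maps a node's display name to its phase in the original workflow.
    """
    reuse = sorted(n for n, phase in node_phases.items() if phase == "Succeeded")
    rerun = sorted(n for n, phase in node_phases.items() if phase != "Succeeded")
    return {"reuse": reuse, "rerun": rerun}


# Pod phases of the original run (resubmit-j4phc), taken from the controller logs.
original = {"A": "Succeeded", "randfail1a": "Succeeded", "randfail1b": "Failed"}
plan = plan_memoized_resubmit(original)
print(plan)
```

Under this model only randfail1b would get a new pod in the resubmitted workflow, while A and randfail1a would be carried over.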

How to reproduce it (as minimally and precisely as possible):

Submit the example at https://github.com/argoproj/argo/blob/master/examples/resubmit.yaml, then resubmit the resulting workflow with $ argo resubmit --watch --memoized <workflow>

Anything else we need to know?:

Tried on local Kubernetes and GKE.

Environment:

  • Argo version:
argo: v2.8.1
  BuildDate: 2020-05-28T23:40:32Z
  GitCommit: 0fff4b21c21c5ff5adbb5ff62c68e67edd95d6b8
  GitTreeState: clean
  GitTag: v2.8.1
  GoVersion: go1.13.4
  Compiler: gc
  Platform: darwin/amd64
  • Kubernetes version:
clientVersion:
  buildDate: "2020-04-16T11:44:51Z"
  compiler: gc
  gitCommit: a17149e1a189050796ced469dbd78d380f2ed5ef
  gitTreeState: clean
  gitVersion: v1.16.9
  goVersion: go1.13.9
  major: "1"
  minor: "16"
  platform: darwin/amd64
serverVersion:
  buildDate: "2020-01-15T08:18:29Z"
  compiler: gc
  gitCommit: e7f962ba86f4ce7033828210ca3556393c377bcc
  gitTreeState: clean
  gitVersion: v1.16.6-beta.0
  goVersion: go1.13.5
  major: "1"
  minor: 16+
  platform: linux/amd64

Other debugging information (if applicable):

  • workflow result:
DEBU[0000] CLI version                                   version="{v2.8.1 2020-05-28T23:40:32Z 0fff4b21c21c5ff5adbb5ff62c68e67edd95d6b8 v2.8.1 clean go1.13.4 gc darwin/amd64}"
DEBU[0000] Client options                                opts="{{ false false} 0x2175d40 0xc00004dc70}"
Name:                resubmit-waxpt
Namespace:           default
ServiceAccount:      default
Status:              Failed
Conditions:
 Completed           True
Created:             Thu Jun 11 10:15:57 -0400 (34 seconds ago)
Started:             Thu Jun 11 10:15:58 -0400 (33 seconds ago)
Finished:            Thu Jun 11 10:15:58 -0400 (33 seconds ago)
Duration:            0 seconds

STEP                 TEMPLATE         PODNAME                   DURATION  MESSAGE
 ✖ resubmit-j4phc    rand-fail-dag
 ├-○ A               random-fail                                          original pod: resubmit-j4phc-2666829291
 └-✖ B               rand-fail-steps                                      child 'resubmit-waxpt-231622993' failed
   └-·-○ randfail1a  random-fail                                          original pod: resubmit-j4phc-2717564683
     └-⚠ randfail1b  random-fail      resubmit-waxpt-231622993  34s       pod deleted
  • executor logs:
$ kubectl logs resubmit-j4phc-2734342302 -c wait
time="2020-06-11T14:15:40Z" level=info msg="Starting Workflow Executor" version=v2.8.1+0fff4b2.dirty
time="2020-06-11T14:15:40Z" level=info msg="Creating a docker executor"
time="2020-06-11T14:15:40Z" level=info msg="Executor (version: v2.8.1+0fff4b2.dirty, build_date: 2020-05-29T00:08:35Z) initialized (pod: default/resubmit-j4phc-2734342302) with template:\n{\"name\":\"random-fail\",\"arguments\":{},\"inputs\":{},\"outputs\":{},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"python:alpine3.6\",\"command\":[\"python\",\"-c\"],\"args\":[\"import random; import sys; exit_code = random.choice([0, 0, 1]); print('exiting with code {}'.format(exit_code)); sys.exit(exit_code)\"],\"resources\":{}}}"
time="2020-06-11T14:15:40Z" level=info msg="Waiting on main container"
time="2020-06-11T14:15:41Z" level=info msg="main container started with container ID: 0c4e05181c407518826914674dd0243d5368cd96bb7c97b7cfbe5e8fd9b15495"
time="2020-06-11T14:15:41Z" level=info msg="Starting annotations monitor"
time="2020-06-11T14:15:41Z" level=info msg="docker wait 0c4e05181c407518826914674dd0243d5368cd96bb7c97b7cfbe5e8fd9b15495"
time="2020-06-11T14:15:41Z" level=info msg="Starting deadline monitor"
time="2020-06-11T14:15:41Z" level=info msg="Main container completed"
time="2020-06-11T14:15:41Z" level=info msg="No output parameters"
time="2020-06-11T14:15:41Z" level=info msg="No output artifacts"
time="2020-06-11T14:15:41Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2020-06-11T14:15:41Z" level=info msg="Capturing script exit code"
time="2020-06-11T14:15:41Z" level=info msg="[docker inspect 0c4e05181c407518826914674dd0243d5368cd96bb7c97b7cfbe5e8fd9b15495 --format='{{.State.ExitCode}}']"
time="2020-06-11T14:15:41Z" level=info msg="Annotations monitor stopped"
time="2020-06-11T14:15:41Z" level=info msg="Annotating pod with output"
time="2020-06-11T14:15:41Z" level=info msg="Killing sidecars"
time="2020-06-11T14:15:41Z" level=info msg="Alloc=4067 TotalAlloc=10977 Sys=70848 NumGC=4 Goroutines=9"
  • workflow-controller logs:
time="2020-06-11T14:14:37Z" level=info msg="config map" name=workflow-controller-configmap
time="2020-06-11T14:14:37Z" level=info msg="Configuration:\nartifactRepository: {}\nfeatureFlags: {}\nmetricsConfig:\n  disableLegacy: false\npodSpecLogStrategy: {}\ntelemetryConfig:\n  disableLegacy: false\n"
time="2020-06-11T14:14:37Z" level=info msg="Persistence configuration disabled"
time="2020-06-11T14:14:37Z" level=info msg="Starting CronWorkflow controller"
time="2020-06-11T14:14:37Z" level=info msg="Starting Workflow Controller" version=v2.8.1+0fff4b2.dirty
time="2020-06-11T14:14:37Z" level=info msg="Workers: workflow: 32, pod: 32"
time="2020-06-11T14:14:37Z" level=info msg="Starting workflow TTL controller (resync 20m0s)"
time="2020-06-11T14:14:37Z" level=info msg="Performing periodic GC every 5m0s"
time="2020-06-11T14:14:37Z" level=info msg="Persistence disabled - so archived workflow GC disabled - you must restart the controller if you enable this"
time="2020-06-11T14:14:37Z" level=info msg="Started workflow TTL worker"
time="2020-06-11T14:15:40Z" level=info msg="Processing workflow" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:40Z" level=info msg="Updated phase  -> Running" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:40Z" level=info msg="DAG node {resubmit-j4phc resubmit-j4phc resubmit-j4phc DAG rand-fail-dag nil   local/resubmit-j4phc Running   2020-06-11 14:15:40.080867138 +0000 UTC 0001-01-01 00:00:00 +0000 UTC   <nil> nil nil [] [] } initialized Running" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:40Z" level=info msg="All of node resubmit-j4phc.B dependencies [] completed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:40Z" level=info msg="Steps node {resubmit-j4phc-2683606910 resubmit-j4phc.B B Steps rand-fail-steps nil   local/resubmit-j4phc Running resubmit-j4phc  2020-06-11 14:15:40.081453903 +0000 UTC 0001-01-01 00:00:00 +0000 UTC   <nil> nil nil [] [] } initialized Running" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:40Z" level=info msg="StepGroup node {resubmit-j4phc-2787011136 resubmit-j4phc.B[0] [0] StepGroup rand-fail-steps nil   local/resubmit-j4phc Running resubmit-j4phc-2683606910  2020-06-11 14:15:40.081577742 +0000 UTC 0001-01-01 00:00:00 +0000 UTC   <nil> nil nil [] [] } initialized Running" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:40Z" level=info msg="Pod node {resubmit-j4phc-2717564683 resubmit-j4phc.B[0].randfail1a randfail1a Pod random-fail nil   local/resubmit-j4phc Pending resubmit-j4phc-2683606910  2020-06-11 14:15:40.081820535 +0000 UTC 0001-01-01 00:00:00 +0000 UTC   <nil> nil nil [] [] } initialized Pending" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:40Z" level=info msg="Created pod: resubmit-j4phc.B[0].randfail1a (resubmit-j4phc-2717564683)" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:40Z" level=info msg="Pod node {resubmit-j4phc-2734342302 resubmit-j4phc.B[0].randfail1b randfail1b Pod random-fail nil   local/resubmit-j4phc Pending resubmit-j4phc-2683606910  2020-06-11 14:15:40.095182705 +0000 UTC 0001-01-01 00:00:00 +0000 UTC   <nil> nil nil [] [] } initialized Pending" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:40Z" level=info msg="Created pod: resubmit-j4phc.B[0].randfail1b (resubmit-j4phc-2734342302)" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:40Z" level=info msg="Workflow step group node &NodeStatus{ID:resubmit-j4phc-2787011136,Name:resubmit-j4phc.B[0],DisplayName:[0],Type:StepGroup,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-j4phc-2683606910,Message:,StartedAt:2020-06-11 14:15:40.081577742 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-j4phc-2717564683 resubmit-j4phc-2734342302],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} not yet completed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:40Z" level=info msg="All of node resubmit-j4phc.A dependencies [] completed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:40Z" level=info msg="Pod node {resubmit-j4phc-2666829291 resubmit-j4phc.A A Pod random-fail nil   local/resubmit-j4phc Pending resubmit-j4phc  2020-06-11 14:15:40.1072837 +0000 UTC 0001-01-01 00:00:00 +0000 UTC   <nil> nil nil [] [] } initialized Pending" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:40Z" level=info msg="Created pod: resubmit-j4phc.A (resubmit-j4phc-2666829291)" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:40Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=826647 workflow=resubmit-j4phc
time="2020-06-11T14:15:41Z" level=info msg="Processing workflow" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:41Z" level=info msg="Updating node &NodeStatus{ID:resubmit-j4phc-2734342302,Name:resubmit-j4phc.B[0].randfail1b,DisplayName:randfail1b,Type:Pod,TemplateName:random-fail,TemplateRef:nil,Phase:Pending,BoundaryID:resubmit-j4phc-2683606910,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} status Pending -> Running"
time="2020-06-11T14:15:41Z" level=info msg="Updating node &NodeStatus{ID:resubmit-j4phc-2717564683,Name:resubmit-j4phc.B[0].randfail1a,DisplayName:randfail1a,Type:Pod,TemplateName:random-fail,TemplateRef:nil,Phase:Pending,BoundaryID:resubmit-j4phc-2683606910,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} message: ContainerCreating"
time="2020-06-11T14:15:41Z" level=info msg="Updating node &NodeStatus{ID:resubmit-j4phc-2666829291,Name:resubmit-j4phc.A,DisplayName:A,Type:Pod,TemplateName:random-fail,TemplateRef:nil,Phase:Pending,BoundaryID:resubmit-j4phc,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} message: ContainerCreating"
time="2020-06-11T14:15:41Z" level=info msg="Workflow step group node &NodeStatus{ID:resubmit-j4phc-2787011136,Name:resubmit-j4phc.B[0],DisplayName:[0],Type:StepGroup,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-j4phc-2683606910,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-j4phc-2717564683 resubmit-j4phc-2734342302],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} not yet completed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:41Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=826666 workflow=resubmit-j4phc
time="2020-06-11T14:15:42Z" level=info msg="Processing workflow" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:42Z" level=info msg="Setting node &NodeStatus{ID:resubmit-j4phc-2734342302,Name:resubmit-j4phc.B[0].randfail1b,DisplayName:randfail1b,Type:Pod,TemplateName:random-fail,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-j4phc-2683606910,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:docker-desktop,} outputs"
time="2020-06-11T14:15:42Z" level=info msg="Updating node &NodeStatus{ID:resubmit-j4phc-2734342302,Name:resubmit-j4phc.B[0].randfail1b,DisplayName:randfail1b,Type:Pod,TemplateName:random-fail,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-j4phc-2683606910,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:&Outputs{Parameters:[]Parameter{},Artifacts:[]Artifact{},Result:nil,ExitCode:*1,},Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:docker-desktop,} status Running -> Failed"
time="2020-06-11T14:15:42Z" level=info msg="Updating node &NodeStatus{ID:resubmit-j4phc-2734342302,Name:resubmit-j4phc.B[0].randfail1b,DisplayName:randfail1b,Type:Pod,TemplateName:random-fail,TemplateRef:nil,Phase:Failed,BoundaryID:resubmit-j4phc-2683606910,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:&Outputs{Parameters:[]Parameter{},Artifacts:[]Artifact{},Result:nil,ExitCode:*1,},Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:docker-desktop,} message: failed with exit code 1"
time="2020-06-11T14:15:42Z" level=info msg="Updating node &NodeStatus{ID:resubmit-j4phc-2666829291,Name:resubmit-j4phc.A,DisplayName:A,Type:Pod,TemplateName:random-fail,TemplateRef:nil,Phase:Pending,BoundaryID:resubmit-j4phc,Message:ContainerCreating,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:docker-desktop,} status Pending -> Running"
time="2020-06-11T14:15:42Z" level=info msg="Updating node &NodeStatus{ID:resubmit-j4phc-2717564683,Name:resubmit-j4phc.B[0].randfail1a,DisplayName:randfail1a,Type:Pod,TemplateName:random-fail,TemplateRef:nil,Phase:Pending,BoundaryID:resubmit-j4phc-2683606910,Message:ContainerCreating,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:docker-desktop,} status Pending -> Running"
time="2020-06-11T14:15:42Z" level=info msg="Workflow step group node &NodeStatus{ID:resubmit-j4phc-2787011136,Name:resubmit-j4phc.B[0],DisplayName:[0],Type:StepGroup,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-j4phc-2683606910,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-j4phc-2717564683 resubmit-j4phc-2734342302],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} not yet completed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:42Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=826677 workflow=resubmit-j4phc
time="2020-06-11T14:15:43Z" level=info msg="Processing workflow" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:43Z" level=info msg="Labeled pod default/resubmit-j4phc-2734342302 completed"
time="2020-06-11T14:15:43Z" level=info msg="Setting node &NodeStatus{ID:resubmit-j4phc-2666829291,Name:resubmit-j4phc.A,DisplayName:A,Type:Pod,TemplateName:random-fail,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-j4phc,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:docker-desktop,} outputs"
time="2020-06-11T14:15:43Z" level=info msg="Updating node &NodeStatus{ID:resubmit-j4phc-2666829291,Name:resubmit-j4phc.A,DisplayName:A,Type:Pod,TemplateName:random-fail,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-j4phc,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:&Outputs{Parameters:[]Parameter{},Artifacts:[]Artifact{},Result:nil,ExitCode:*0,},Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:docker-desktop,} status Running -> Succeeded"
time="2020-06-11T14:15:43Z" level=info msg="Setting node &NodeStatus{ID:resubmit-j4phc-2717564683,Name:resubmit-j4phc.B[0].randfail1a,DisplayName:randfail1a,Type:Pod,TemplateName:random-fail,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-j4phc-2683606910,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:docker-desktop,} outputs"
time="2020-06-11T14:15:43Z" level=info msg="Workflow step group node &NodeStatus{ID:resubmit-j4phc-2787011136,Name:resubmit-j4phc.B[0],DisplayName:[0],Type:StepGroup,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-j4phc-2683606910,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-j4phc-2717564683 resubmit-j4phc-2734342302],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} not yet completed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:43Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=826687 workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="Processing workflow" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="Labeled pod default/resubmit-j4phc-2666829291 completed"
time="2020-06-11T14:15:44Z" level=info msg="Updating node &NodeStatus{ID:resubmit-j4phc-2717564683,Name:resubmit-j4phc.B[0].randfail1a,DisplayName:randfail1a,Type:Pod,TemplateName:random-fail,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-j4phc-2683606910,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:&Outputs{Parameters:[]Parameter{},Artifacts:[]Artifact{},Result:nil,ExitCode:*0,},Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:docker-desktop,} status Running -> Succeeded"
time="2020-06-11T14:15:44Z" level=info msg="Step group node &NodeStatus{ID:resubmit-j4phc-2787011136,Name:resubmit-j4phc.B[0],DisplayName:[0],Type:StepGroup,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-j4phc-2683606910,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-j4phc-2717564683 resubmit-j4phc-2734342302],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} deemed failed: child 'resubmit-j4phc-2734342302' failed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="node &NodeStatus{ID:resubmit-j4phc-2787011136,Name:resubmit-j4phc.B[0],DisplayName:[0],Type:StepGroup,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-j4phc-2683606910,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-j4phc-2717564683 resubmit-j4phc-2734342302],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} phase Running -> Failed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="node &NodeStatus{ID:resubmit-j4phc-2787011136,Name:resubmit-j4phc.B[0],DisplayName:[0],Type:StepGroup,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Failed,BoundaryID:resubmit-j4phc-2683606910,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-j4phc-2717564683 resubmit-j4phc-2734342302],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} message: child 'resubmit-j4phc-2734342302' failed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="node &NodeStatus{ID:resubmit-j4phc-2787011136,Name:resubmit-j4phc.B[0],DisplayName:[0],Type:StepGroup,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Failed,BoundaryID:resubmit-j4phc-2683606910,Message:child 'resubmit-j4phc-2734342302' failed,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:2020-06-11 14:15:44.212409855 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-j4phc-2717564683 resubmit-j4phc-2734342302],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} finished: 2020-06-11 14:15:44.212409855 +0000 UTC" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="step group resubmit-j4phc-2787011136 was unsuccessful: child 'resubmit-j4phc-2734342302' failed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="Outbound nodes of resubmit-j4phc-2717564683 is [resubmit-j4phc-2717564683]" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="Outbound nodes of resubmit-j4phc-2734342302 is [resubmit-j4phc-2734342302]" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="Outbound nodes of resubmit-j4phc-2683606910 is [resubmit-j4phc-2717564683 resubmit-j4phc-2734342302]" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="node &NodeStatus{ID:resubmit-j4phc-2683606910,Name:resubmit-j4phc.B,DisplayName:B,Type:Steps,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-j4phc,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-j4phc-2787011136],OutboundNodes:[resubmit-j4phc-2717564683 resubmit-j4phc-2734342302],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} phase Running -> Failed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="node &NodeStatus{ID:resubmit-j4phc-2683606910,Name:resubmit-j4phc.B,DisplayName:B,Type:Steps,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Failed,BoundaryID:resubmit-j4phc,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-j4phc-2787011136],OutboundNodes:[resubmit-j4phc-2717564683 resubmit-j4phc-2734342302],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} message: child 'resubmit-j4phc-2734342302' failed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="node &NodeStatus{ID:resubmit-j4phc-2683606910,Name:resubmit-j4phc.B,DisplayName:B,Type:Steps,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Failed,BoundaryID:resubmit-j4phc,Message:child 'resubmit-j4phc-2734342302' failed,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:2020-06-11 14:15:44.219947517 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-j4phc-2787011136],OutboundNodes:[resubmit-j4phc-2717564683 resubmit-j4phc-2734342302],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} finished: 2020-06-11 14:15:44.219947517 +0000 UTC" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="Checking daemoned children of resubmit-j4phc-2683606910" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg=C namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg=D namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="Outbound nodes of resubmit-j4phc set to []" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="node &NodeStatus{ID:resubmit-j4phc,Name:resubmit-j4phc,DisplayName:resubmit-j4phc,Type:DAG,TemplateName:rand-fail-dag,TemplateRef:nil,Phase:Running,BoundaryID:,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-j4phc-2683606910 resubmit-j4phc-2666829291],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} phase Running -> Failed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="node &NodeStatus{ID:resubmit-j4phc,Name:resubmit-j4phc,DisplayName:resubmit-j4phc,Type:DAG,TemplateName:rand-fail-dag,TemplateRef:nil,Phase:Failed,BoundaryID:,Message:,StartedAt:2020-06-11 14:15:40 +0000 UTC,FinishedAt:2020-06-11 14:15:44.22738077 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-j4phc-2683606910 resubmit-j4phc-2666829291],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} finished: 2020-06-11 14:15:44.22738077 +0000 UTC" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="Checking daemoned children of resubmit-j4phc" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="Updated phase Running -> Failed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="Marking workflow completed" namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="Checking daemoned children of " namespace=default workflow=resubmit-j4phc
time="2020-06-11T14:15:44Z" level=info msg="Workflow update successful" namespace=default phase=Failed resourceVersion=826697 workflow=resubmit-j4phc
time="2020-06-11T14:15:45Z" level=info msg="Labeled pod default/resubmit-j4phc-2717564683 completed"
time="2020-06-11T14:15:45Z" level=info msg="Labeled pod default/resubmit-j4phc-2666829291 completed"
time="2020-06-11T14:15:57Z" level=info msg="Processing workflow" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:57Z" level=info msg="node &NodeStatus{ID:resubmit-waxpt-1139642181,Name:resubmit-waxpt.B[0],DisplayName:[0],Type:StepGroup,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Pending,BoundaryID:resubmit-waxpt-4141730081,Message:,StartedAt:2020-06-11 14:15:57 +0000 UTC,FinishedAt:2020-06-11 14:15:57 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-waxpt-181290136 resubmit-waxpt-231622993],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} phase Pending -> Running" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:57Z" level=info msg="Workflow step group node &NodeStatus{ID:resubmit-waxpt-1139642181,Name:resubmit-waxpt.B[0],DisplayName:[0],Type:StepGroup,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-waxpt-4141730081,Message:,StartedAt:2020-06-11 14:15:57 +0000 UTC,FinishedAt:2020-06-11 14:15:57 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-waxpt-181290136 resubmit-waxpt-231622993],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} not yet completed" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:57Z" level=info msg="Workflow update successful" namespace=default phase=Pending resourceVersion=826724 workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="Processing workflow" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=warning msg="pod resubmit-waxpt-231622993 deleted" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="Step group node &NodeStatus{ID:resubmit-waxpt-1139642181,Name:resubmit-waxpt.B[0],DisplayName:[0],Type:StepGroup,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-waxpt-4141730081,Message:,StartedAt:2020-06-11 14:15:57 +0000 UTC,FinishedAt:2020-06-11 14:15:57 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-waxpt-181290136 resubmit-waxpt-231622993],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} deemed failed: child 'resubmit-waxpt-231622993' failed" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="node &NodeStatus{ID:resubmit-waxpt-1139642181,Name:resubmit-waxpt.B[0],DisplayName:[0],Type:StepGroup,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Running,BoundaryID:resubmit-waxpt-4141730081,Message:,StartedAt:2020-06-11 14:15:57 +0000 UTC,FinishedAt:2020-06-11 14:15:57 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-waxpt-181290136 resubmit-waxpt-231622993],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} phase Running -> Failed" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="node &NodeStatus{ID:resubmit-waxpt-1139642181,Name:resubmit-waxpt.B[0],DisplayName:[0],Type:StepGroup,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Failed,BoundaryID:resubmit-waxpt-4141730081,Message:,StartedAt:2020-06-11 14:15:57 +0000 UTC,FinishedAt:2020-06-11 14:15:57 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-waxpt-181290136 resubmit-waxpt-231622993],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} message: child 'resubmit-waxpt-231622993' failed" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="step group resubmit-waxpt-1139642181 was unsuccessful: child 'resubmit-waxpt-231622993' failed" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="Outbound nodes of resubmit-waxpt-181290136 is [resubmit-waxpt-181290136]" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="Outbound nodes of resubmit-waxpt-231622993 is [resubmit-waxpt-231622993]" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="Outbound nodes of resubmit-waxpt-4141730081 is [resubmit-waxpt-181290136 resubmit-waxpt-231622993]" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="node &NodeStatus{ID:resubmit-waxpt-4141730081,Name:resubmit-waxpt.B,DisplayName:B,Type:Steps,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Pending,BoundaryID:resubmit-waxpt,Message:,StartedAt:2020-06-11 14:15:57 +0000 UTC,FinishedAt:2020-06-11 14:15:57 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-waxpt-1139642181],OutboundNodes:[resubmit-waxpt-181290136 resubmit-waxpt-231622993],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} phase Pending -> Failed" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="node &NodeStatus{ID:resubmit-waxpt-4141730081,Name:resubmit-waxpt.B,DisplayName:B,Type:Steps,TemplateName:rand-fail-steps,TemplateRef:nil,Phase:Failed,BoundaryID:resubmit-waxpt,Message:,StartedAt:2020-06-11 14:15:57 +0000 UTC,FinishedAt:2020-06-11 14:15:57 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-waxpt-1139642181],OutboundNodes:[resubmit-waxpt-181290136 resubmit-waxpt-231622993],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} message: child 'resubmit-waxpt-231622993' failed" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="Checking daemoned children of resubmit-waxpt-4141730081" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg=C namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg=D namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="Outbound nodes of resubmit-waxpt set to []" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="node &NodeStatus{ID:resubmit-waxpt,Name:resubmit-waxpt,DisplayName:resubmit-j4phc,Type:DAG,TemplateName:rand-fail-dag,TemplateRef:nil,Phase:Pending,BoundaryID:,Message:,StartedAt:2020-06-11 14:15:57 +0000 UTC,FinishedAt:2020-06-11 14:15:57 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[resubmit-waxpt-4141730081 resubmit-waxpt-4091397224],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:local/resubmit-j4phc,ResourcesDuration:ResourcesDuration{},HostNodeName:,} phase Pending -> Failed" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="Checking daemoned children of resubmit-waxpt" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="Updated phase Pending -> Failed" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="Marking workflow completed" namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="Checking daemoned children of " namespace=default workflow=resubmit-waxpt
time="2020-06-11T14:15:58Z" level=info msg="Workflow update successful" namespace=default phase=Failed resourceVersion=826729 workflow=resubmit-waxpt

Message from the maintainers:

If you are impacted by this bug please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.


alexec commented Jul 10, 2020

Investigating. Sorry about the delay.


alexec commented Jul 10, 2020

I think this is a bug related to the "node already initialized" panics.


alexec commented Jul 10, 2020

Related? #2721


alexec commented Jul 11, 2020

See #2385


alexec commented Jul 11, 2020

Related #1552

alexec linked a pull request on Jul 13, 2020 that will close this issue

alexec commented Jul 13, 2020

See #3097


alexec commented Jul 13, 2020

OK, so @sarabala1979 fixed this on master, but we now need to back-port a new fix to v2.9.


alexec commented Jul 14, 2020

Fixed in v2.9

alexec closed this as completed on Jul 14, 2020