cli: failed to stop workflow #266

Closed
2 tasks done
diegodelemos opened this issue Feb 7, 2020 · 1 comment · Fixed by reanahub/reana-workflow-controller#298

Comments


diegodelemos commented Feb 7, 2020

When trying to stop a running workflow:

$ # roofit example
$ git diff                                                                                                                                                      [5:00:56]
diff --git a/reana.yaml b/reana.yaml                                                                                                                                                                                                   
index e1355ca..51be33c 100644                                                                                      
--- a/reana.yaml                                                                                                   
+++ b/reana.yaml                                                                                                                                                                                                                       
@@ -19,7 +19,7 @@ workflow:                           
       - name: fitdata                                
         environment: 'reanahub/reana-env-root6:6.18.04'
         commands:                                                                                                                                                                                                                     
-        - root -b -q 'code/fitdata.C("${data}","${plot}")'                                                                                                                                                                            
+        - sleep 1000                              
 outputs:                                             
   files:                                             
     - results/plot.png 
$ reana-client run
[INFO] Creating a workflow...                         
workflow.19                                           
[INFO] Uploading files...                             
File code/gendata.C was successfully uploaded.        
File code/fitdata.C was successfully uploaded.                                                                                                                                                                                         
[INFO] Starting workflow...                                                                                                                                                                                                            
workflow.19 is running 

Then the first workflow job (14e3e86f-31ac-4990-9405-1f888d5adb98) finishes and the second one (70c6e3eb-6142-43fb-8169-aa90c060023c-4r2dn) starts to run:

$ kubectl get pods                                                                                                                                                      [5:01:49]
NAME                                                            READY   STATUS    RESTARTS   AGE
70c6e3eb-6142-43fb-8169-aa90c060023c-4r2dn                      1/1     Running   0          9s
reana-batch-serial-dcac5ed9-1d93-4a91-8631-ce518d4f8be2-l2gcg   2/2     Running   0          34s
reana-cache-88b76b854-85ptz                                     1/1     Running   0          36m
reana-db-5cff5946c7-xwmtk                                       1/1     Running   0          36m
reana-message-broker-65bbb5956f-7mgss                           1/1     Running   0          36m
reana-server-6487dc6d74-mkrcc                                   2/2     Running   0          32m
reana-traefik-86556f5678-h9nqk                                  1/1     Running   0          36m
reana-ui-597887756f-hhhsd                                       1/1     Running   0          36m
reana-wdb-574d66ff44-6kdhk                                      1/1     Running   0          36m
reana-workflow-controller-64d7cd56b4-4cq9q                      2/2     Running   0          34m

The user then tries to stop the workflow execution:

$ reana-client stop -w workflow.19 --force                                                                                                                          [5:01:45]
Workflow could not be stopped: 
(404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Date': 'Fri, 07 Feb 2020 16:01:49 GMT', 'Content-Length': '262'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch \"14e3e86f-31ac-4990-9405-1f888d5adb98\" not found","reason":"NotFound","details":{"name":"14e3e86f-31ac-4990-9405-1f888d5adb98","group":"batch","kind":"jobs"},"code":404}

We should:

  • Not try to delete a job that has already finished, or at least not fail on it and continue with the remaining jobs to finalise the stop
  • Give better error messages for stop
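
The two points above could be sketched roughly as follows. This is only an illustration, not the actual REANA code: `JobNotFound`, `stop_jobs`, `summarise_stop` and the `delete_job` callable are hypothetical names standing in for whatever the workflow controller uses to talk to the compute backend.

```python
from typing import Callable, Iterable, List, Tuple


class JobNotFound(Exception):
    """Hypothetical stand-in for a backend '404 Not Found' error."""


def stop_jobs(
    job_ids: Iterable[str], delete_job: Callable[[str], None]
) -> Tuple[List[str], List[str]]:
    """Try to delete every job, tolerating jobs that no longer exist.

    Returns (deleted, already_gone) so the caller can report a clear message.
    """
    deleted: List[str] = []
    already_gone: List[str] = []
    for job_id in job_ids:
        try:
            delete_job(job_id)
            deleted.append(job_id)
        except JobNotFound:
            # The job finished and was cleaned up earlier; for "stop" this
            # is not an error, so record it and carry on with the rest.
            already_gone.append(job_id)
    return deleted, already_gone


def summarise_stop(workflow: str, deleted: List[str], already_gone: List[str]) -> str:
    """Build a short user-facing message instead of a raw HTTP dump."""
    return (
        f"{workflow} has been stopped: {len(deleted)} job(s) deleted, "
        f"{len(already_gone)} already finished."
    )
```

With the Kubernetes Python client, the "not found" case would presumably surface as an `ApiException` whose `status` is 404, which a wrapper could translate into something like `JobNotFound` while re-raising any other status.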
@diegodelemos

This also happens when jobs are stuck in ImagePullBackOff; see the conversation and the complete issue about the ImagePullBackOff problem.

@diegodelemos diegodelemos self-assigned this Feb 10, 2020
@diegodelemos diegodelemos moved this from Ready for work to In work in Awesome-Workshop-Final Feb 10, 2020
diegodelemos pushed a commit to diegodelemos/reana-workflow-controller that referenced this issue Feb 10, 2020
* Reads job status from Job table, to get the running jobs
  (closes reanahub/reana#266).
diegodelemos pushed a commit to diegodelemos/reana-workflow-controller that referenced this issue Feb 11, 2020
* Reads job status from Job table to get the running jobs
  as the `job_progress` in the Workflow table is not correctly
  updated, finished jobs are reported as running creating an exception
  while trying to delete a job which doesn't exist (for more see
  reanahub#299)
  (closes reanahub/reana#266).
diegodelemos pushed a commit to diegodelemos/reana-workflow-controller that referenced this issue Feb 11, 2020
* Reads job status from Job table to get the running jobs
  as the `job_progress` in the Workflow table is not correctly
  updated, finished jobs are reported as running creating an exception
  while trying to delete a job which doesn't exist (for more see
  reanahub#299)
  (closes reanahub/reana#266).

* Provides a user understandable message (closes reanahub/reana#266).
@diegodelemos diegodelemos moved this from In work to In review in Awesome-Workshop-Final Feb 11, 2020
@diegodelemos diegodelemos moved this from In review to In work in Awesome-Workshop-Final Feb 11, 2020
diegodelemos pushed a commit to diegodelemos/reana-workflow-controller that referenced this issue Feb 12, 2020
* Reads job status from Job table to get the running jobs
  as the `job_progress` in the Workflow table is not correctly
  updated, finished jobs are reported as running creating an exception
  while trying to delete a job which doesn't exist (for more see
  reanahub#299)
  (closes reanahub/reana#266).

* Provides a user understandable message (closes reanahub/reana#266).
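
As a rough illustration of the approach described in the commit message above, the jobs to delete can be chosen from per-job status records rather than from an aggregated progress field that may be stale. The `JobRecord` class, its fields and the status strings below are hypothetical simplifications, not the real REANA database schema.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class JobRecord:
    """Hypothetical, simplified row of a job table."""

    backend_job_id: str
    status: str  # assumed values: "running", "finished", "failed"


def running_job_ids(jobs: Iterable[JobRecord]) -> List[str]:
    """Return only the jobs whose own record says they are still running."""
    return [job.backend_job_id for job in jobs if job.status == "running"]
```

Applied to the scenario in this issue, only the second job would be selected for deletion, so the already finished first job is never passed to the backend.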
@diegodelemos diegodelemos moved this from In work to In review in Awesome-Workshop-Final Feb 12, 2020
Awesome-Workshop-Final automation moved this from In review to Done Feb 12, 2020
mdonadoni added a commit to mdonadoni/reana that referenced this issue Mar 5, 2024
chore(reana-ui/master): release 0.9.4

build(reana-ui/package): update yarn.lock (reanahub#399)

build(reana-ui/package): require jsroot<7.6.0 (reanahub#399)

ci(reana-ui/commitlint): allow release commit style (reanahub#400)

docs(reana-ui/authors): complete list of contributors (reanahub#396)

ci(reana-ui/shellcheck): exclude node_modules from the analyzed paths (reanahub#387)

fix(reana-ui/progress): update failed workflows duration using finish time (reanahub#387)

feat(reana-ui/footer): link privacy notice to configured URL (reanahub#393)

refactor(reana-ui/docs): move from reST to Markdown (reanahub#391)

ci(reana-ui/commitlint): check for the presence of concrete PR number (reanahub#390)

ci(reana-ui/shellcheck): fix exit code propagation (reanahub#390)

fix(reana-ui/launcher): remove dollar sign in generated Markdown (reanahub#389)

ci(reana-ui/release-please): update version in package.json and Dockerfile (reanahub#385)

ci(reana-ui/release-please): switch to `simple` release strategy (reanahub#383)

fix(reana-ui/router): show 404 page for invalid URLs (reanahub#382)

ci(reana-ui/release-please): initial configuration (reanahub#380)

ci(reana-ui/commitlint): addition of commit message linter (reanahub#380)

chore(reana-message-broker/master): release 0.9.3

ci(reana-message-broker/commitlint): allow release commit style (reanahub#67)

docs(reana-message-broker/authors): complete list of contributors (reanahub#66)

refactor(reana-message-broker/docs): move from reST to Markdown (reanahub#65)

ci(reana-message-broker/commitlint): check for the presence of concrete PR number (reanahub#64)

ci(reana-message-broker/shellcheck): fix exit code propagation (reanahub#64)

ci(reana-message-broker/release-please): update version in Dockerfile (reanahub#63)

fix(reana-message-broker/startup): handle signals for graceful shutdown (reanahub#59)

ci(reana-message-broker/release-please): initial configuration (reanahub#60)

ci(reana-message-broker/commitlint): addition of commit message linter (reanahub#60)

chore(reana-server/master): release 0.9.3

build(reana-server/python): bump all required packages as of 2024-03-04 (reanahub#674)

build(reana-server/python): bump shared REANA packages as of 2024-03-04 (reanahub#674)

build(reana-server/python): bump shared modules (reanahub#676)

ci(reana-server/commitlint): allow release commit style (reanahub#675)

docs(reana-server/authors): complete list of contributors (reanahub#673)

ci(reana-server/pytest): move to PostgreSQL 14.10 (reanahub#672)

refactor(reana-server/docs): move from reST to Markdown (reanahub#671)

style(reana-server/black): format with black v24 (reanahub#670)

ci(reana-server/commitlint): check for the presence of concrete PR number (reanahub#669)

ci(reana-server/shellcheck): fix exit code propagation (reanahub#669)

ci(reana-server/release-please): update version in Dockerfile/OpenAPI specs (reanahub#668)

build(reana-server/docker): non-editable submodules in "latest" mode (reanahub#656)

build(reana-server/deps): pin invenio-userprofiles to 1.2.4 (reanahub#665)

ci(reana-server/release-please): initial configuration (reanahub#665)

ci(reana-server/commitlint): addition of commit message linter (reanahub#665)

chore(reana-workflow-controller/master): release 0.9.3

build(reana-workflow-controller/python): bump all required packages as of 2024-03-04 (reanahub#574)

build(reana-workflow-controller/python): bump shared REANA packages as of 2024-03-04 (reanahub#574)

feat(reana-workflow-controller/manager): increase termination period of run-batch pods (reanahub#572)

ci(reana-workflow-controller/commitlint): allow release commit style (reanahub#575)

feat(reana-workflow-controller/manager): pass custom env variables to job controller (reanahub#571)

feat(reana-workflow-controller/manager): pass custom env variables to workflow engines (reanahub#571)

docs(reana-workflow-controller/authors): complete list of contributors (reanahub#570)

ci(reana-workflow-controller/pytest): move to PostgreSQL 14.10 (reanahub#568)

fix(reana-workflow-controller/manager): use valid group name when calling `groupadd` (reanahub#566)

refactor(reana-workflow-controller/docs): move from reST to Markdown (reanahub#567)

fix(reana-workflow-controller/stop): store engine logs of stopped workflow (reanahub#563)

fix(reana-workflow-controller/manager): graceful shutdown of job-controller (reanahub#559)

feat(reana-workflow-controller/manager): call shutdown endpoint before workflow stop (reanahub#559)

refactor(reana-workflow-controller/consumer): do not update status of jobs (reanahub#559)

style(reana-workflow-controller/black): format with black v24 (reanahub#564)

ci(reana-workflow-controller/commitlint): check for the presence of concrete PR number (reanahub#562)

ci(reana-workflow-controller/shellcheck): fix exit code propagation (reanahub#562)

ci(reana-workflow-controller/release-please): update version in Dockerfile/OpenAPI specs (reanahub#558)

build(reana-workflow-controller/docker): non-editable submodules in "latest" mode (reanahub#551)

ci(reana-workflow-controller/release-please): initial configuration (reanahub#555)

ci(reana-workflow-controller/commitlint): addition of commit message linter (reanahub#555)

chore(reana-job-controller/master): release 0.9.3

build(reana-job-controller/python): bump all required packages as of 2024-03-04 (reanahub#442)

build(reana-job-controller/python): bump shared REANA packages as of 2024-03-04 (reanahub#442)

ci(reana-job-controller/commitlint): allow release commit style (reanahub#443)

build(reana-job-controller/certificates): update expired CERN Grid CA certificate (reanahub#440)

fix(reana-job-controller/database): limit the number of open database connections (reanahub#437)

docs(reana-job-controller/authors): complete list of contributors (reanahub#434)

perf(reana-job-controller/cache): avoid caching jobs when the cache is disabled (reanahub#435)

ci(reana-job-controller/pytest): move to PostgreSQL 14.10 (reanahub#429)

refactor(reana-job-controller/docs): move from reST to Markdown (reanahub#428)

ci(reana-job-controller/commitlint): check for the presence of concrete PR number (reanahub#425)

ci(reana-job-controller/shellcheck): fix exit code propagation (reanahub#425)

feat(reana-job-controller/shutdown): stop all running jobs before stopping workflow (reanahub#423)

refactor(reana-job-controller/monitor): move fetching of logs to job-manager (reanahub#423)

refactor(reana-job-controller/db): set job status also in the main database (reanahub#423)

refactor(reana-job-controller/monitor): centralise logs and status updates (reanahub#423)

style(reana-job-controller/black): format with black v24 (reanahub#426)

ci(reana-job-controller/release-please): update version in Dockerfile/OpenAPI specs (reanahub#421)

build(reana-job-controller/docker): non-editable submodules in "latest" mode (reanahub#416)

ci(reana-job-controller/release-please): initial configuration (reanahub#417)

ci(reana-job-controller/commitlint): addition of commit message linter (reanahub#417)

chore(reana-workflow-engine-cwl/master): release 0.9.3

build(reana-workflow-engine-cwl/python): bump all required packages as of 2024-03-04 (reanahub#267)

build(reana-workflow-engine-cwl/python): bump shared REANA packages as of 2024-03-04 (reanahub#267)

docs(reana-workflow-engine-cwl/conformance-tests): update CWL conformance test badges (reanahub#264)

ci(reana-workflow-engine-cwl/commitlint): allow release commit style (reanahub#268)

docs(reana-workflow-engine-cwl/authors): complete list of contributors (reanahub#266)

refactor(reana-workflow-engine-cwl/docs): move from reST to Markdown (reanahub#263)

fix(reana-workflow-engine-cwl/progress): handle stopped jobs (reanahub#260)

ci(reana-workflow-engine-cwl/commitlint): check for the presence of concrete PR number (reanahub#262)

ci(reana-workflow-engine-cwl/shellcheck): fix exit code propagation (reanahub#262)

build(reana-workflow-engine-cwl/docker): install correct extras of reana-commons submodule (reanahub#261)

ci(reana-workflow-engine-cwl/release-please): update version in Dockerfile (reanahub#259)

build(reana-workflow-engine-cwl/docker): non-editable submodules in "latest" mode (reanahub#255)

ci(reana-workflow-engine-cwl/release-please): initial configuration (reanahub#256)

ci(reana-workflow-engine-cwl/commitlint): addition of commit message linter (reanahub#256)

chore(reana-workflow-engine-serial/master): release 0.9.3

build(reana-workflow-engine-serial/python): bump all required packages as of 2024-03-04 (reanahub#200)

build(reana-workflow-engine-serial/python): bump shared REANA packages as of 2024-03-04 (reanahub#200)

ci(reana-workflow-engine-serial/commitlint): allow release commit style (reanahub#201)

docs(reana-workflow-engine-serial/authors): complete list of contributors (reanahub#199)

refactor(reana-workflow-engine-serial/docs): move from reST to Markdown (reanahub#198)

fix(reana-workflow-engine-serial/progress): handle stopped jobs (reanahub#195)

ci(reana-workflow-engine-serial/commitlint): check for the presence of concrete PR number (reanahub#197)

ci(reana-workflow-engine-serial/shellcheck): fix exit code propagation (reanahub#197)

build(reana-workflow-engine-serial/docker): install correct extras of reana-commons submodule (reanahub#196)

ci(reana-workflow-engine-serial/release-please): update version in Dockerfile (reanahub#194)

build(reana-workflow-engine-serial/docker): non-editable submodules in "latest" mode (reanahub#190)

ci(reana-workflow-engine-serial/release-please): initial configuration (reanahub#191)

ci(reana-workflow-engine-serial/commitlint): addition of commit message linter (reanahub#191)

chore(reana-workflow-engine-yadage/master): release 0.9.4

build(reana-workflow-engine-yadage/python): bump all required packages as of 2024-03-04 (reanahub#261)

build(reana-workflow-engine-yadage/python): bump shared REANA packages as of 2024-03-04 (reanahub#261)

ci(reana-workflow-engine-yadage/commitlint): allow release commit style (reanahub#262)

docs(reana-workflow-engine-yadage/authors): complete list of contributors (reanahub#260)

refactor(reana-workflow-engine-yadage/docs): move from reST to Markdown (reanahub#259)

fix(reana-workflow-engine-yadage/progress): correctly handle running and stopped jobs (reanahub#258)

ci(reana-workflow-engine-yadage/commitlint): check for the presence of concrete PR number (reanahub#257)

ci(reana-workflow-engine-yadage/shellcheck): fix exit code propagation (reanahub#257)

build(reana-workflow-engine-yadage/docker): install correct extras of reana-commons submodule (reanahub#256)

ci(reana-workflow-engine-yadage/release-please): update version in Dockerfile (reanahub#254)

build(reana-workflow-engine-yadage/docker): non-editable submodules in "latest" mode (reanahub#249)

ci(reana-workflow-engine-yadage/release-please): initial configuration (reanahub#251)

ci(reana-workflow-engine-yadage/commitlint): addition of commit message linter (reanahub#251)

chore(reana-workflow-engine-snakemake/master): release 0.9.3

build(reana-workflow-engine-snakemake/python): bump all required packages as of 2024-03-04 (reanahub#85)

build(reana-workflow-engine-snakemake/python): bump shared REANA packages as of 2024-03-04 (reanahub#85)

ci(reana-workflow-engine-snakemake/commitlint): allow release commit style (reanahub#86)

feat(reana-workflow-engine-snakemake/config): get max number of parallel jobs from env vars (reanahub#84)

feat(reana-workflow-engine-snakemake/executor): upgrade to Snakemake v7.32.4 (reanahub#81)

docs(reana-workflow-engine-snakemake/authors): complete list of contributors (reanahub#83)

refactor(reana-workflow-engine-snakemake/docs): move from reST to Markdown (reanahub#82)

fix(reana-workflow-engine-snakemake/progress): handle stopped jobs (reanahub#78)

ci(reana-workflow-engine-snakemake/commitlint): check for the presence of concrete PR number (reanahub#80)

ci(reana-workflow-engine-snakemake/shellcheck): fix exit code propagation (reanahub#80)

build(reana-workflow-engine-snakemake/docker): install correct extras of reana-commons submodule (reanahub#79)

ci(reana-workflow-engine-snakemake/release-please): update version in Dockerfile (reanahub#77)

build(reana-workflow-engine-snakemake/docker): non-editable submodules in "latest" mode (reanahub#73)

ci(reana-workflow-engine-snakemake/release-please): initial configuration (reanahub#74)

ci(reana-workflow-engine-snakemake/commitlint): addition of commit message linter (reanahub#74)