externalbackend: upgraded yadage #122

Merged
merged 1 commit into from Aug 30, 2019

Conversation

@dprelipcean
Contributor

dprelipcean commented Aug 14, 2019

Signed-off-by: Daniel Prelipcean daniel.prelipcean@cern.ch

@diegodelemos

Member

diegodelemos commented Aug 14, 2019

I've checked out everything locally, and this is what I see when running the examples:

$ DEMO=reana-demo-worldpopulation make example
[2019-08-14T11:44:39] reana: make example
source /Users/rodrigdi/.virtualenvs/reana/bin/activate && \
        eval $(reana-dev setup-environment) && \
        reana-dev run-example -c reana-demo-worldpopulation
...
[2019-08-14T11:46:33] reana-demo-worldpopulation: reana-client create -f reana-yadage.yaml -n worldpopulation.yadage
/Users/rodrigdi/.virtualenvs/reana/lib/python3.6/site-packages/urllib3/connectionpool.py:851: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
worldpopulation.yadage.2
[2019-08-14T11:46:35] reana-demo-worldpopulation: reana-client upload -w worldpopulation.yadage
/Users/rodrigdi/.virtualenvs/reana/lib/python3.6/site-packages/urllib3/connectionpool.py:851: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
File code/worldpopulation.ipynb was successfully uploaded.
/Users/rodrigdi/.virtualenvs/reana/lib/python3.6/site-packages/urllib3/connectionpool.py:851: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
File data/World_historical_and_predicted_populations_in_percentage.csv was successfully uploaded.
[2019-08-14T11:46:37] reana-demo-worldpopulation: reana-client start -w worldpopulation.yadage  
/Users/rodrigdi/.virtualenvs/reana/lib/python3.6/site-packages/urllib3/connectionpool.py:851: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
/Users/rodrigdi/.virtualenvs/reana/lib/python3.6/site-packages/urllib3/connectionpool.py:851: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
worldpopulation.yadage is running
[2019-08-14T11:46:43] reana-demo-worldpopulation: reana-client status -w worldpopulation.yadage
/Users/rodrigdi/.virtualenvs/reana/lib/python3.6/site-packages/urllib3/connectionpool.py:851: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
NAME                     RUN_NUMBER   CREATED               STATUS    PROGRESS
worldpopulation.yadage   2            2019-08-14T09:35:20   running   -/-     
[2019-08-14T11:46:50] reana-demo-worldpopulation: reana-client status -w worldpopulation.yadage
/Users/rodrigdi/.virtualenvs/reana/lib/python3.6/site-packages/urllib3/connectionpool.py:851: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
NAME                     RUN_NUMBER   CREATED               STATUS    PROGRESS
worldpopulation.yadage   2            2019-08-14T09:35:20   running   0/1     
[2019-08-14T11:46:56] reana-demo-worldpopulation: reana-client status -w worldpopulation.yadage
/Users/rodrigdi/.virtualenvs/reana/lib/python3.6/site-packages/urllib3/connectionpool.py:851: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
NAME                     RUN_NUMBER   CREATED               STATUS   PROGRESS
worldpopulation.yadage   2            2019-08-14T09:35:20   failed   0/1     
[2019-08-14T11:46:57] reana-demo-worldpopulation: reana-client ls -w worldpopulation.yadage
/Users/rodrigdi/.virtualenvs/reana/lib/python3.6/site-packages/urllib3/connectionpool.py:851: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
NAME                                                                SIZE    LAST-MODIFIED      
_yadage/yadage_snapshot_workflow.json                               8386    2019-08-14T09:35:42
data/World_historical_and_predicted_populations_in_percentage.csv   574     2019-08-14T09:35:22
code/worldpopulation.ipynb                                          19221   2019-08-14T09:35:22
[ERROR] Expected output file plot.png not found. Exiting.

And I get the following error:

$ kubectl logs batch-yadage-1262f796-6ce1-48fc-9955-4301a6d4692e-k7kss workflow-engine -f
2019-08-14 09:35:36,808 |       pack.init.step |   INFO | publishing data: <TypedLeafs: {u'year_max': 2012, u'input_file': u'/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/1262f796-6ce1-48fc-9955-4301a6d4692e/data/World_historical_and_predicted_populations_in_percentage.csv', u'region': u'Africa', u'output_file': u'results/plot.png', u'notebook': u'/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/1262f796-6ce1-48fc-9955-4301a6d4692e/code/worldpopulation.ipynb', u'year_min': 1500}>
2019-08-14 09:35:42,031 | adage | MainThread | ERROR | some weird exception caught in adage process loop
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/adage/__init__.py", line 51, in run_polling_workflow
    for stepnum, controller in enumerate(coroutine):
  File "/usr/local/lib/python2.7/site-packages/adage/pollingexec.py", line 93, in adage_coroutine
    process_dag(controller,submit_decider)
  File "/usr/local/lib/python2.7/site-packages/adage/pollingexec.py", line 64, in process_dag
    controller.submit_nodes(nodes)
  File "/usr/local/lib/python2.7/site-packages/adage/wflowcontroller.py", line 45, in submit_nodes
    ctrlutils.submit_nodes(nodes, self.backend)
  File "/usr/local/lib/python2.7/site-packages/adage/controllerutils.py", line 99, in submit_nodes
    nodeobj.resultproxy = backend.submit(nodeobj.task)
  File "/usr/local/lib/python2.7/site-packages/yadage/backends/federatedbackend.py", line 23, in submit
    return self.routedsubmit(task)
  File "/usr/local/lib/python2.7/site-packages/yadage/backends/packtivitybackend.py", line 76, in routedsubmit
    task.spec, task.parameters, task.state, task.metadata
  File "/usr/local/lib/python2.7/site-packages/reana_workflow_engine_yadage/externalbackend.py", line 97, in submit
    job = build_job(spec['process'], parameters, state, self.config)
  File "/usr/local/lib/python2.7/site-packages/packtivity/syncbackends.py", line 96, in build_job
    return handler(process, parameters, state)
  File "/usr/local/lib/python2.7/site-packages/packtivity/handlers/process_handlers.py", line 11, in stringinterp_handler
    if isinstance(parameters.typed(), dict):
AttributeError: 'tuple' object has no attribute 'typed'
2019-08-14 09:35:42,050 | root | MainThread | ERROR | Error while publishing channel disconnected
@dprelipcean

Contributor Author

dprelipcean commented Aug 14, 2019

Reproduced, working on it.

@lukasheinrich I see that for the new versions, the published data is not in JSON format, but as TypedLeafs. Do you have any hints on how that changes our API?
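
For context on the traceback above: the failing line calls `parameters.typed()`, so the handler appears to expect a wrapper object rather than a plain container. The sketch below reproduces that failure mode in isolation. `FakeTypedLeafs`, `handler`, and `call_handler` are hypothetical stand-ins written for illustration; they are not the real packtivity classes or signatures.

```python
class FakeTypedLeafs:
    """Hypothetical stand-in for a parameter wrapper that exposes .typed()."""

    def __init__(self, data):
        self._data = data

    def typed(self):
        # Return the wrapped data as plain Python objects.
        return self._data


def handler(parameters):
    # Mirrors the failing line in the traceback: a wrapper object is assumed.
    if isinstance(parameters.typed(), dict):
        return sorted(parameters.typed())
    return []


def call_handler(parameters):
    # Capture the error as text so both cases can be compared side by side.
    try:
        return handler(parameters)
    except AttributeError as exc:
        return "AttributeError: {}".format(exc)


ok = call_handler(FakeTypedLeafs({"region": "Africa", "year_min": 1500}))
bad = call_handler(({"region": "Africa"}, None))  # a tuple, as in the traceback
print(ok)
print(bad)
```

If the arguments passed to the job builder changed order or count between versions, a tuple landing in the `parameters` slot would produce exactly this kind of error; that is an assumption, not something the log confirms.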

@dprelipcean

Contributor Author

dprelipcean commented Aug 15, 2019

Status Quo

I've made changes and now the examples work, i.e. the outputs are generated correctly, but the workflow status is reported as failed. I assume that I have to make some changes in tracker.py as well.

$ reana-client status -w yadpar.2
NAME     RUN_NUMBER   CREATED               STATUS   PROGRESS
yadpar   2            2019-08-15T15:56:05   failed   1/1     
$ reana-client ls -w yadpar.2
NAME                                    SIZE   LAST-MODIFIED      
helloworld/greetings.txt                34     2019-08-15T15:57:13
_yadage/yadage_snapshot_workflow.json   4139   2019-08-15T15:57:21
data/names.txt                          20     2019-08-15T15:56:17
$ reana-client download helloworld/greetings.txt -w yadpar.2
File helloworld/greetings.txt downloaded to /home/dprelipc/project/reana/reana-demo-helloworld.

Debugging info

Since the outputs are generated correctly, and from inspecting the individual job nodes while debugging, I am fairly certain that each job node terminates successfully. What I get now is an error that does not stop my workflow (unlike @diegodelemos's run, which failed completely):

root | MainThread | ERROR | Error while publishing channel disconnected

My wild guess is that this is due to some changes in the API for publishing/submitting results.

@dprelipcean dprelipcean force-pushed the dprelipcean:upgrade_yadage branch 6 times, most recently from 04ea467 to 2edf862 Aug 15, 2019
@dprelipcean

Contributor Author

dprelipcean commented Aug 19, 2019

@lukasheinrich Can you point me to the API change that might be causing the tracker failure? The individual steps finish successfully and produce the expected output, but the workflow as a whole is reported as failed.

@lukasheinrich

Member

lukasheinrich commented Aug 19, 2019

Hm, I'm not sure. @dprelipcean, can you add logging in https://github.com/dprelipcean/reana-workflow-engine-yadage/blob/upgrade_yadage/reana_workflow_engine_yadage/tracker.py#L27 to see which node in the graph causes the workflow to be marked as failed?

@dprelipcean

Contributor Author

dprelipcean commented Aug 19, 2019

@lukasheinrich I've added logs that print the node states and the progress:

this is a tracking log at 2019-08-19T13:09:06.631935
 with nodes [{'state': 'succeeded', 'job_id': "{'job_id': '3b3bc371-e78f-4a53-bbf7-da7ad9a6a2ee'}"}, {'state': 'succeeded', 'job_id': "{'job_id': 'e0d42aa8-f6ad-41c9-b8e5-89358fbfc75c'}"}]
 and progress {'engine_specific': {'dag': {'edges': [['5916ea84-7ef0-4b85-b01b-2397d8c88d89', '22a435b9-0c02-4535-a757-284c4eb407aa'], ['5916ea84-7ef0-4b85-b01b-2397d8c88d89', '0a210b81-095a-4dd7-89fa-fb6e80f1c9dd'], ['22a435b9-0c02-4535-a757-284c4eb407aa', '0a210b81-095a-4dd7-89fa-fb6e80f1c9dd']], 'nodes': [{'metadata': {'name': 'init'}, 'id': '5916ea84-7ef0-4b85-b01b-2397d8c88d89', 'jobid': None}, {'metadata': {'name': 'gendata'}, 'id': '22a435b9-0c02-4535-a757-284c4eb407aa', 'jobid': "{'job_id': '3b3bc371-e78f-4a53-bbf7-da7ad9a6a2ee'}"}, {'metadata': {'name': 'fitdata'}, 'id': '0a210b81-095a-4dd7-89fa-fb6e80f1c9dd', 'jobid': "{'job_id': 'e0d42aa8-f6ad-41c9-b8e5-89358fbfc75c'}"}]}}, 'planned': {'total': 0, 'job_ids': []}, 'failed': {'total': 0, 'job_ids': []}, 'total': {'total': 0, 'job_ids': []}, 'running': {'total': 0, 'job_ids': []}, 'finished': {'total': 2, 'job_ids': ['3b3bc371-e78f-4a53-bbf7-da7ad9a6a2ee', 'e0d42aa8-f6ad-41c9-b8e5-89358fbfc75c']}}

but I don't see any failed node.

This is for the roofit example.
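
A check like the one described above can be sketched as follows, using the two node entries copied from the tracking log. The helper `summarize_nodes` is hypothetical and is not part of tracker.py. Note that each `job_id` in the log is the string form of a dict rather than a bare ID, so the sketch parses it with `ast.literal_eval`.

```python
import ast


def summarize_nodes(nodes):
    """Return (nodes that did not succeed, parsed job IDs) for a list of node dicts."""
    not_ok = [n for n in nodes if n.get("state") != "succeeded"]
    job_ids = []
    for n in nodes:
        raw = n.get("job_id")
        # In the log above, job_id is a stringified dict, so unwrap it.
        if isinstance(raw, str) and raw.startswith("{"):
            raw = ast.literal_eval(raw).get("job_id")
        job_ids.append(raw)
    return not_ok, job_ids


# Node entries copied from the tracking log above.
nodes = [
    {"state": "succeeded", "job_id": "{'job_id': '3b3bc371-e78f-4a53-bbf7-da7ad9a6a2ee'}"},
    {"state": "succeeded", "job_id": "{'job_id': 'e0d42aa8-f6ad-41c9-b8e5-89358fbfc75c'}"},
]

not_ok, job_ids = summarize_nodes(nodes)
print(not_ok)
print(job_ids)
```

An empty `not_ok` list is consistent with the observation that no node failed, which suggests the "failed" status is set somewhere other than the per-node state.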

@dprelipcean dprelipcean force-pushed the dprelipcean:upgrade_yadage branch from 2edf862 to eb9efda Aug 20, 2019
@dprelipcean

Contributor Author

dprelipcean commented Aug 22, 2019

@lukasheinrich I added more logs to it and now have:

2019-08-22 08:21:14,647 | root | MainThread | DEBUG | Publisher: message sent: {'workflow_uuid': '2906788f-93a8-46fb-be02-8c317e2a2dae', 'logs': 'this is a tracking log at 2019-08-22T08:21:14.642584\n', 'status': 1, 'message': {'progress': {'engine_specific': {'dag': {'edges': [['aade5b29-8e2a-4d3b-9478-5f42bef39fbd', '31eb4a02-07f4-4ae4-b49a-8db5b700faa7']], 'nodes': [{'metadata': {'name': 'init'}, 'id': 'aade5b29-8e2a-4d3b-9478-5f42bef39fbd', 'jobid': None}, {'metadata': {'name': 'helloworld'}, 'id': '31eb4a02-07f4-4ae4-b49a-8db5b700faa7', 'jobid': "{'job_id': 'f906b6ce-91b1-49e1-875d-97a8fe1d85bb'}"}]}}, 'planned': {'total': 0, 'job_ids': []}, 'failed': {'total': 0, 'job_ids': []}, 'total': {'total': 0, 'job_ids': []}, 'running': {'total': 0, 'job_ids': []}, 'finished': {'total': 1, 'job_ids': ['f906b6ce-91b1-49e1-875d-97a8fe1d85bb']}}}}
2019-08-22 08:21:14,647 | root | MainThread | DEBUG | Publisher: closing queue connection
2019-08-22 08:21:14,653 | amqp | MainThread | DEBUG | Closed channel #1
2019-08-22 08:21:14,657 | adage | MainThread | INFO | adage state loop done.
2019-08-22 08:21:14,658 | adage | MainThread | INFO | execution valid. (in terms of execution order)
2019-08-22 08:21:14,659 | adage | MainThread | INFO | workflow completed successfully.
2019-08-22 08:21:14,659 | yadage.steering_api | MainThread | INFO | done. dumping workflow to disk.
2019-08-22 08:21:14,670 | reana-workflow-engine-yadage | MainThread | INFO | workflow failed: Object of type 'TypedLeafs' is not JSON serializable
2019-08-22 08:21:14,674 | root | MainThread | ERROR | Error while publishing channel disconnected

I guess the error happens when the context manager is closed on our side.
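
The "is not JSON serializable" message in the log above is what `json.dumps` raises for any object it does not know how to encode. One generic way around it is a `default=` hook, sketched below. `FakeLeafs` and `to_jsonable` are hypothetical names, and the assumption that the wrapper can hand back plain data through a `json()` method is made only for this illustration; it is not necessarily how the fix in this PR works.

```python
import json


class FakeLeafs:
    """Hypothetical stand-in for an object json.dumps cannot encode natively."""

    def __init__(self, data):
        self._data = data

    def json(self):
        # Assumed accessor that returns plain, JSON-friendly data.
        return self._data


def to_jsonable(obj):
    # Fallback called by json.dumps for objects it cannot serialize itself.
    if hasattr(obj, "json") and callable(obj.json):
        return obj.json()
    raise TypeError(
        "Object of type {} is not JSON serializable".format(type(obj).__name__)
    )


result = {"output_file": FakeLeafs({"path": "results/plot.png"})}

# Without the hook, encoding fails in the same way as in the log.
try:
    json.dumps(result)
    plain_error = None
except TypeError as exc:
    plain_error = str(exc)

# With the hook, the wrapper is converted to plain data first.
encoded = json.dumps(result, default=to_jsonable, sort_keys=True)
print(plain_error)
print(encoded)
```

An alternative is to convert the result to plain data once, before it reaches any code that serializes it, which keeps the serialization call sites unchanged.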

@dprelipcean dprelipcean self-assigned this Aug 26, 2019
@dprelipcean dprelipcean added this to In progress in Physics-Examples-Summer-2019 via automation Aug 26, 2019
dprelipcean added a commit to dprelipcean/reana-workflow-engine-yadage that referenced this pull request Aug 26, 2019
* adjusted reana backend to new yadage API (Closes reanahub#122)

* upgraded to python 3.6

Signed-off-by: Daniel Prelipcean <daniel.prelipcean@cern.ch>
@dprelipcean dprelipcean force-pushed the dprelipcean:upgrade_yadage branch 4 times, most recently from d9dd899 to 311b4ec Aug 26, 2019
@dprelipcean dprelipcean moved this from In progress to Review in progress in Physics-Examples-Summer-2019 Aug 26, 2019
@dprelipcean dprelipcean force-pushed the dprelipcean:upgrade_yadage branch from 311b4ec to 758bdb1 Aug 27, 2019
* adjusted reana backend to new yadage API

* upgraded to python 3.6

Signed-off-by: Daniel Prelipcean <daniel.prelipcean@cern.ch>

WIP
@diegodelemos diegodelemos force-pushed the dprelipcean:upgrade_yadage branch from 758bdb1 to e4d0cb4 Aug 30, 2019
Physics-Examples-Summer-2019 automation moved this from Review in progress to Reviewer approved Aug 30, 2019
@diegodelemos diegodelemos merged commit e4d0cb4 into reanahub:master Aug 30, 2019
2 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
coverage/coveralls First build on upgrade_yadage at 3.139%
Physics-Examples-Summer-2019 automation moved this from Reviewer approved to Done Aug 30, 2019
@dprelipcean dprelipcean deleted the dprelipcean:upgrade_yadage branch Sep 2, 2019
@dprelipcean dprelipcean mentioned this pull request Sep 3, 2019
3 participants