BF produce an error in create and run_procedure if procedure does not exist #6143

Merged
mih merged 7 commits into datalad:master from mslw:bf-run-procedure on Nov 11, 2021
Contributor

@mslw mslw commented Nov 4, 2021

This PR changes the behaviour of create and run_procedure so that using an incorrect (unavailable) procedure name results in an error. Closes #6129

Specifically, the following changes were made:

  1. run_procedure itself now raises a ValueError instead of reporting a result with status='impossible'.
  2. datalad create now calls run_procedure(discover=True) before any creation takes place, to check whether the requested cfg_proc(s) are available.
  3. To avoid repeating the discovery process when the procedure is applied to the newly created dataset, the results of the call described above are reused. To make this possible, run_procedure(spec, ...) gained a new behaviour for the case where spec is a dictionary.
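
The ordering described in the three points above can be sketched without DataLad itself. This is a minimal illustration, not the code from this PR: `discover_procedures`, `apply_procedure`, and `create_dataset` are hypothetical stand-ins, and the layout of the discovery records (a `procedure_name` key plus a `path`) is an assumption made for the example. It only shows the sequence: discover once, fail with a ValueError before anything is written, then reuse the discovery record when applying the procedure.

```python
import os
import tempfile


def discover_procedures():
    # Hypothetical stand-in for run_procedure(discover=True).
    # The record layout is assumed for illustration only.
    return [
        {"procedure_name": "cfg_text2git", "path": "/fake/cfg_text2git.py"},
        {"procedure_name": "cfg_yoda", "path": "/fake/cfg_yoda.py"},
    ]


def apply_procedure(dataset_path, record):
    # Hypothetical stand-in for running a procedure from a discovery record.
    return f"ran {record['procedure_name']} on {dataset_path}"


def create_dataset(path, cfg_procs=()):
    # Discover once, up front, and index the records by name.
    available = {r["procedure_name"]: r for r in discover_procedures()}
    wanted = []
    for name in cfg_procs:
        # `create -c NAME` selects the procedure called cfg_NAME.
        record = available.get(f"cfg_{name}")
        if record is None:
            # Fail before anything is created on disk.
            raise ValueError(f"Cannot find procedure with name '{name}'")
        wanted.append(record)
    os.makedirs(path)
    # Reuse the discovery records instead of looking the procedures up again.
    return [apply_procedure(path, record) for record in wanted]


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "my-dataset")
    try:
        create_dataset(target, cfg_procs=["unknown"])
    except ValueError as exc:
        print(exc)  # Cannot find procedure with name 'unknown'
    print(os.path.exists(target))  # False
```

Because the check runs before `os.makedirs`, a misspelled procedure name leaves no half-created directory behind.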

The third change seems the most "invasive" to me, so I'd be happy to hear your opinions on whether this is the right approach.

Resulting behaviour:

datalad create -c unknown my-dataset
[ERROR  ] ValueError(Cannot find procedure with name 'unknown') (ValueError)

Remaining TODO items:

  • Add a test for create asserting that ValueError is raised and the dataset is not created
  • Add a test for run_procedure asserting that it can be done with a (proper) dictionary input.

mslw added 3 commits November 3, 2021 16:42

  • Previously the operation reported a result with status: impossible. This commit also changes an applicable test to use assert_raises().
  • If the specified cfg_proc cannot be discovered, throw an error before creating the dataset.
  • This commit allows run_procedure to accept as input a dictionary reported by run_procedure(discover=True), causing it to skip calling _get_procedure_implementation(). This mechanism is used in create: first run_procedure(discover=True) is executed to establish availability before creating the dataset, and then the results are used to run the procedure on the created dataset.
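
A rough sketch of the dictionary shortcut described in the last commit message. Everything here is illustrative: `find_procedure` stands in for the lookup that `_get_procedure_implementation()` performs, `KNOWN` is made-up data, and the `procedure_name`/`path` keys are assumed rather than taken from DataLad's result format.

```python
KNOWN = {"cfg_yoda": "/fake/procedures/cfg_yoda.py"}  # made-up registry
lookups = []  # records every lookup so the shortcut is observable


def find_procedure(name):
    # Hypothetical lookup; raises for an unknown name, as described in the PR.
    lookups.append(name)
    if name not in KNOWN:
        raise ValueError(f"Cannot find procedure with name '{name}'")
    return {"procedure_name": name, "path": KNOWN[name]}


def run_procedure_sketch(spec):
    if isinstance(spec, dict):
        # Already a discovery record: use it as-is and skip the lookup.
        record = spec
    else:
        # Assumed here: a string is a name, a list is a name plus arguments.
        name = spec if isinstance(spec, str) else spec[0]
        record = find_procedure(name)
    return record["path"]


print(run_procedure_sketch("cfg_yoda"))  # goes through find_procedure
print(run_procedure_sketch({"procedure_name": "cfg_yoda",
                            "path": "/elsewhere/cfg_yoda.py"}))
print(lookups)  # only the string call was looked up
```

The trade-off of this design is that a dictionary is trusted without re-validation, so it should only come from a prior discovery call.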

codecov Bot commented Nov 4, 2021

Codecov Report

Merging #6143 (18774bf) into master (5adfde7) will decrease coverage by 53.59%.
The diff coverage is 29.01%.

❗ Current head 18774bf differs from pull request most recent head f1d3ac9. Consider uploading reports for the commit f1d3ac9 to get more accurate results

@@             Coverage Diff             @@
##           master    #6143       +/-   ##
===========================================
- Coverage   89.73%   36.13%   -53.60%     
===========================================
  Files         317      318        +1     
  Lines       42394    41871      -523     
===========================================
- Hits        38042    15131    -22911     
- Misses       4352    26740    +22388     
Impacted Files Coverage Δ
datalad/core/local/tests/test_status.py 98.47% <ø> (-0.02%) ⬇️
datalad/customremotes/base.py 26.37% <0.00%> (-56.90%) ⬇️
datalad/customremotes/tests/test_archives.py 0.00% <0.00%> (-89.41%) ⬇️
...ad/distributed/tests/test_create_sibling_ghlike.py 0.00% <0.00%> (-69.70%) ⬇️
datalad/distributed/tests/test_drop.py 0.00% <0.00%> (ø)
datalad/distribution/dataset.py 87.39% <ø> (-9.25%) ⬇️
datalad/distribution/drop.py 0.00% <0.00%> (-96.52%) ⬇️
datalad/distribution/remove.py 0.00% <0.00%> (-90.60%) ⬇️
datalad/distribution/tests/test_create_sibling.py 0.00% <0.00%> (-76.14%) ⬇️
datalad/distribution/tests/test_dataset.py 0.00% <0.00%> (-99.71%) ⬇️
... and 305 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

In create, test whether running with an incorrect cfg_proc produces
a ValueError and no dataset is created.

In cfg_proc, test whether a dictionary returned by
run_procedure(discover=True) will be accepted as the spec argument.
@mslw mslw marked this pull request as ready for review November 4, 2021 21:14
@mslw mslw changed the title from "Bf run procedure" to "BF produce an error in create and run_procedure if procedure does not exist" Nov 4, 2021
Member

@mih mih left a comment

Thx for the PR!

Apart from a performance-related aspect, the approach looks good and clean to me. Once the multi-call issue is fixed, we can merge this IMHO. Thx!

Comment thread datalad/core/local/create.py Outdated
Comment thread datalad/interface/run_procedure.py Outdated
mslw and others added 2 commits November 5, 2021 18:04
Co-authored-by: Michael Hanke <michael.hanke@gmail.com>
Call run_procedure(discover=True) once, creating a list which is
then checked, instead of calling it with return_type='generator'
for each specified cfg_proc.
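
The performance point in this last commit can be illustrated with a counter. This is a toy model, not DataLad code: `discover` is a hypothetical stand-in for the discovery call and simply counts how often it is invoked, and the `procedure_name` key is assumed for the example.

```python
calls = {"discover": 0}


def discover():
    # Hypothetical stand-in for an expensive discovery pass.
    calls["discover"] += 1
    return [{"procedure_name": n} for n in ("cfg_text2git", "cfg_yoda")]


def check_per_proc(names):
    # Earlier shape: one full discovery pass per requested procedure.
    for name in names:
        if not any(r["procedure_name"] == name for r in discover()):
            raise ValueError(f"Cannot find procedure with name '{name}'")


def check_once(names):
    # Revised shape: materialize the results once, then check every name.
    available = {r["procedure_name"] for r in discover()}
    for name in names:
        if name not in available:
            raise ValueError(f"Cannot find procedure with name '{name}'")


requested = ["cfg_text2git", "cfg_yoda"]
check_per_proc(requested)
print(calls["discover"])  # 2: one pass per requested name

calls["discover"] = 0
check_once(requested)
print(calls["discover"])  # 1: a single pass regardless of how many names
```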
@mslw mslw requested a review from mih November 5, 2021 17:09
Member

adswa commented Nov 10, 2021

The failure on travis seems unrelated:

=====================================================================
ERROR: datalad.interface.tests.test_download_url.test_download_url_need_datalad_remote
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/datalad/tests/utils.py", line 190, in _wrap_skip_if_no_network
    return func(*args, **kwargs)
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/datalad/tests/utils.py", line 737, in _wrap_with_tempfile
    return t(*(arg + (filename,)), **kw)
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/datalad/interface/tests/test_download_url.py", line 237, in test_download_url_need_datalad_remote
    ds_a.download_url([url], path="foo")
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/datalad/distribution/dataset.py", line 485, in apply_func
    return f(**kwargs)
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/datalad/interface/utils.py", line 484, in eval_func
    return return_func(generator_func)(*args, **kwargs)
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/datalad/interface/utils.py", line 476, in return_func
    results = list(results)
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/datalad/interface/utils.py", line 461, in generator_func
    raise IncompleteResultsError(
datalad.support.exceptions.IncompleteResultsError: Command did not complete successfully. 1 failed:
[{'action': 'download_url',
  'exception': BotoServerError: 503 Slow Down
 [download_url.py:__call__:199,base.py:download:517,base.py:access:162,s3.py:_establish_session:228,utils.py:_wrap_try_multiple_dec:2053,s3.py:authenticate:121,s3.py:get_bucket:131,utils.py:_wrap_try_multiple_dec:2053,connection.py:get_bucket:509,connection.py:head_bucket:528,connection.py:make_request:667,connection.py:make_request:1070,connection.py:_mexe:1028],
  'exception_traceback': '[download_url.py:__call__:199,base.py:download:517,base.py:access:162,s3.py:_establish_session:228,utils.py:_wrap_try_multiple_dec:2053,s3.py:authenticate:121,s3.py:get_bucket:131,utils.py:_wrap_try_multiple_dec:2053,connection.py:get_bucket:509,connection.py:head_bucket:528,connection.py:make_request:667,connection.py:make_request:1070,connection.py:_mexe:1028]',
  'message': 'BotoServerError(BotoServerError: 503 Slow Down\n)',
  'path': '/tmp/datalad_temp_test_download_url_need_datalad_remotenico1j3p/a/foo',
  'status': 'error',
  'type': 'file'}]

You can fix the semver test by adding a versioning label, e.g., patch (see https://github.com/datalad/datalad/blob/master/CONTRIBUTING.md#labelling-pull-requests). Should you not have permissions to add a label, I think this should be fixed :)

I tried your change, works nicely for me - thanks!

Contributor Author

mslw commented Nov 10, 2021

Thank you for taking a look at the proposed changes and the checks @adswa!

> The failure on travis seems unrelated

Guess so, unless that's a weird collateral - unfortunately, I have nothing to add (edit: this test passes on my laptop).

> You can fix the semver test by adding a versioning label, e.g., patch (see https://github.com/datalad/datalad/blob/master/CONTRIBUTING.md#labelling-pull-requests). Should you not have permissions to add a label, I think this should be fixed :)

Thanks for pointing it out (we also talked about it with @bpoldrack yesterday). Indeed, I do not have permissions - no button saying "Add label" where it would normally be. Whether that's something to be fixed, I don't know. I think that's tied to repository write access?

Member

adswa commented Nov 10, 2021

I think @mih should add you to the datalad organization - do you know, @mih?

Member

@mih mih left a comment

Thx for the update, LGTM now.

@mih mih added the semver-patch label (Increment the patch version when merged) Nov 11, 2021
@mih mih merged commit e7dd141 into datalad:master Nov 11, 2021

Labels

semver-patch Increment the patch version when merged

Development

Successfully merging this pull request may close these issues.

No warning / error when incorrect config procedure is given to datalad create -c

3 participants