BF produce an error in create and run_procedure if procedure does not exist #6143

Merged
merged 7 commits into from
Nov 11, 2021

Contributor

@mslw mslw commented Nov 4, 2021

This PR changes the behaviour of create and run_procedure so that using an incorrect (unavailable) procedure name results in an error. Closes #6129

Specifically, the following changes were made:

  1. run_procedure itself now raises a ValueError instead of reporting a result with status='impossible'.
  2. datalad create now calls run_procedure(discover=True) before any creation takes place, to check whether the requested cfg_proc(s) are available.
  3. To avoid repeating the discovery when the procedure is applied to the newly created dataset, the results of that call are reused. To make this possible, run_procedure(spec, ...) gained a new behaviour for the case where spec is a dictionary.
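
For illustration, the first two changes can be sketched with a toy model. This is not DataLad's code: `KNOWN_PROCEDURES`, `check_procedures` and this `create` are hypothetical stand-ins, the procedure names are only examples, and the "dataset" is just a directory. The point is the ordering: validation happens before anything touches the filesystem.

```python
import os
import tempfile

# Hypothetical registry standing in for whatever procedures can be discovered.
KNOWN_PROCEDURES = {"text2git", "yoda"}


def check_procedures(names):
    """Raise ValueError for the first requested name that is not available."""
    for name in names:
        if name not in KNOWN_PROCEDURES:
            raise ValueError(f"Cannot find procedure with name '{name}'")


def create(path, cfg_proc=()):
    """Toy 'create': validate every procedure first, then create the directory."""
    check_procedures(cfg_proc)  # fails here, before anything is created
    os.makedirs(path)
    return path


error_message = None
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "my-dataset")
    try:
        create(target, cfg_proc=["unknown"])
    except ValueError as exc:
        error_message = str(exc)
    # Record whether the failed call left anything behind.
    dataset_exists = os.path.exists(target)
```

In this sketch the failed call leaves no directory behind, which is the property the first TODO item below is meant to test.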

The third change seems the most "invasive" to me so I'd be happy to have your opinions on whether this is the correct way.

Resulting behaviour:

datalad create -c unknown my-dataset
[ERROR  ] ValueError(Cannot find procedure with name 'unknown') (ValueError)

Remaining TODO items:

  • Add a test for create asserting that ValueError is raised and the dataset is not created
  • Add a test for run_procedure asserting that it can be done with a (proper) dictionary input.

Previously the operation reported a result with status: impossible.
This commit also changes an applicable test to use assert_raises().

If the specified cfg_proc cannot be discovered, throw an error
before creating the dataset.

This commit allows run_procedure to accept as input a dictionary
reported by run_procedure(discover=True), causing it to skip
calling _get_procedure_implementation().
This mechanism is used in create: first run_procedure(discover=True)
is executed to establish availability before creating the dataset,
and then the results are used to run the procedure on the created
dataset.
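
A rough sketch of the dictionary-spec idea, under loud assumptions: `PROCEDURE_FILES`, `find_implementation`, `discover` and this `run_procedure` are hypothetical, as are the record keys `name` and `path` and the file location. The real discovery results and the real `_get_procedure_implementation()` are not reproduced here. The sketch only shows that a record obtained from discovery can be passed back in so that the lookup is skipped.

```python
# Hypothetical lookup table; in this toy model it plays the role of the
# search for a procedure implementation.
PROCEDURE_FILES = {"cfg_yoda": "/opt/procedures/cfg_yoda.py"}

lookups = []  # records every name that went through the lookup


def find_implementation(name):
    """Look a procedure up by name, or raise ValueError if it is unknown."""
    lookups.append(name)
    if name not in PROCEDURE_FILES:
        raise ValueError(f"Cannot find procedure with name '{name}'")
    return {"name": name, "path": PROCEDURE_FILES[name]}


def discover():
    """Return one record per available procedure."""
    return [find_implementation(name) for name in sorted(PROCEDURE_FILES)]


def run_procedure(spec):
    """Accept a name (looked up) or a discovery record (used as is)."""
    if isinstance(spec, dict):
        record = spec  # already resolved during discovery, no second lookup
    else:
        record = find_implementation(spec)
    return f"would run {record['path']}"


records = discover()
lookups.clear()                       # forget the lookups done by discovery
result = run_procedure(records[0])    # dictionary input: no new lookup
lookups_after_dict_call = list(lookups)
```

Passing a plain name such as `run_procedure("cfg_yoda")` still goes through the lookup in this model, so the string form keeps working.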

codecov bot commented Nov 4, 2021

Codecov Report

Merging #6143 (18774bf) into master (5adfde7) will decrease coverage by 53.59%.
The diff coverage is 29.01%.

❗ Current head 18774bf differs from pull request most recent head f1d3ac9. Consider uploading reports for the commit f1d3ac9 to get more accurate results

@@             Coverage Diff             @@
##           master    #6143       +/-   ##
===========================================
- Coverage   89.73%   36.13%   -53.60%     
===========================================
  Files         317      318        +1     
  Lines       42394    41871      -523     
===========================================
- Hits        38042    15131    -22911     
- Misses       4352    26740    +22388     
Impacted Files | Coverage Δ
datalad/core/local/tests/test_status.py | 98.47% <ø> (-0.02%) ⬇️
datalad/customremotes/base.py | 26.37% <0.00%> (-56.90%) ⬇️
datalad/customremotes/tests/test_archives.py | 0.00% <0.00%> (-89.41%) ⬇️
...ad/distributed/tests/test_create_sibling_ghlike.py | 0.00% <0.00%> (-69.70%) ⬇️
datalad/distributed/tests/test_drop.py | 0.00% <0.00%> (ø)
datalad/distribution/dataset.py | 87.39% <ø> (-9.25%) ⬇️
datalad/distribution/drop.py | 0.00% <0.00%> (-96.52%) ⬇️
datalad/distribution/remove.py | 0.00% <0.00%> (-90.60%) ⬇️
datalad/distribution/tests/test_create_sibling.py | 0.00% <0.00%> (-76.14%) ⬇️
datalad/distribution/tests/test_dataset.py | 0.00% <0.00%> (-99.71%) ⬇️
... and 305 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

In create, test whether running with an incorrect cfg_proc produces
a ValueError and no dataset is created.

In cfg_proc, test whether a dictionary returned by
run_procedure(discover=True) will be accepted as the spec argument.
@mslw mslw marked this pull request as ready for review November 4, 2021 21:14
@mslw mslw changed the title Bf run procedure BF produce an error in create and run_procedure if procedure does not exist Nov 4, 2021
Member

@mih mih left a comment

Thx for the PR!

Apart from a performance-related aspect, the approach looks good and clean to me. Once the multi-call issue is fixed, we can merge this IMHO. Thx!

datalad/core/local/create.py (outdated, resolved)
datalad/interface/run_procedure.py (outdated, resolved)
mslw and others added 2 commits November 5, 2021 18:04
Co-authored-by: Michael Hanke <michael.hanke@gmail.com>
Call run_procedure(discover=True) once, creating a list which is
then checked, instead of calling it with return_type='generator'
for each specified cfg_proc.
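
The difference between the two strategies can be sketched as follows. This is a toy model, not the code from this PR: `discover`, `check_each_time` and `check_once` are hypothetical names, and the counter exists only to make the number of discovery runs visible.

```python
discover_calls = 0  # counts how often the (potentially slow) discovery runs


def discover():
    """Stand-in for a discovery call; returns one record per procedure."""
    global discover_calls
    discover_calls += 1
    return [{"name": "cfg_text2git"}, {"name": "cfg_yoda"}]


def check_each_time(requested):
    """Earlier approach in this sketch: one discovery run per requested name."""
    for name in requested:
        if not any(rec["name"] == name for rec in discover()):
            raise ValueError(f"Cannot find procedure with name '{name}'")


def check_once(requested):
    """Revised approach: discover once, then check every name against the list."""
    available = {rec["name"]: rec for rec in discover()}
    for name in requested:
        if name not in available:
            raise ValueError(f"Cannot find procedure with name '{name}'")
    # Return the matching records so they can be reused later.
    return [available[name] for name in requested]


wanted = ["cfg_text2git", "cfg_yoda"]

discover_calls = 0
check_each_time(wanted)
calls_per_name = discover_calls   # grows with the number of requested names

discover_calls = 0
matched = check_once(wanted)
calls_once = discover_calls       # stays at one regardless of the request size
```

Returning the matched records from `check_once` is what makes it possible to hand them on to the later procedure run without discovering again.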
@mslw mslw requested a review from mih November 5, 2021 17:09
Member

adswa commented Nov 10, 2021

The failure on travis seems unrelated:

=====================================================================
ERROR: datalad.interface.tests.test_download_url.test_download_url_need_datalad_remote
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/datalad/tests/utils.py", line 190, in _wrap_skip_if_no_network
    return func(*args, **kwargs)
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/datalad/tests/utils.py", line 737, in _wrap_with_tempfile
    return t(*(arg + (filename,)), **kw)
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/datalad/interface/tests/test_download_url.py", line 237, in test_download_url_need_datalad_remote
    ds_a.download_url([url], path="foo")
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/datalad/distribution/dataset.py", line 485, in apply_func
    return f(**kwargs)
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/datalad/interface/utils.py", line 484, in eval_func
    return return_func(generator_func)(*args, **kwargs)
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/datalad/interface/utils.py", line 476, in return_func
    results = list(results)
  File "/tmp/dl-miniconda-duu7yni7/lib/python3.9/site-packages/datalad/interface/utils.py", line 461, in generator_func
    raise IncompleteResultsError(
datalad.support.exceptions.IncompleteResultsError: Command did not complete successfully. 1 failed:
[{'action': 'download_url',
  'exception': BotoServerError: 503 Slow Down
 [download_url.py:__call__:199,base.py:download:517,base.py:access:162,s3.py:_establish_session:228,utils.py:_wrap_try_multiple_dec:2053,s3.py:authenticate:121,s3.py:get_bucket:131,utils.py:_wrap_try_multiple_dec:2053,connection.py:get_bucket:509,connection.py:head_bucket:528,connection.py:make_request:667,connection.py:make_request:1070,connection.py:_mexe:1028],
  'exception_traceback': '[download_url.py:__call__:199,base.py:download:517,base.py:access:162,s3.py:_establish_session:228,utils.py:_wrap_try_multiple_dec:2053,s3.py:authenticate:121,s3.py:get_bucket:131,utils.py:_wrap_try_multiple_dec:2053,connection.py:get_bucket:509,connection.py:head_bucket:528,connection.py:make_request:667,connection.py:make_request:1070,connection.py:_mexe:1028]',
  'message': 'BotoServerError(BotoServerError: 503 Slow Down\n)',
  'path': '/tmp/datalad_temp_test_download_url_need_datalad_remotenico1j3p/a/foo',
  'status': 'error',
  'type': 'file'}]

You can fix the semver test by adding a versioning label, e.g., patch (see https://github.com/datalad/datalad/blob/master/CONTRIBUTING.md#labelling-pull-requests). Should you not have permissions to add a label, I think this should be fixed :)

I tried your change, works nicely for me - thanks!

Contributor Author

mslw commented Nov 10, 2021

Thank you for taking a look at the proposed changes and the checks @adswa!

> The failure on travis seems unrelated

Guess so, unless that's a weird collateral - unfortunately, I have nothing to add (edit: this test passes on my laptop).

> You can fix the semver test by adding a versioning label, e.g., patch (see https://github.com/datalad/datalad/blob/master/CONTRIBUTING.md#labelling-pull-requests). Should you not have permissions to add a label, I think this should be fixed :)

Thanks for pointing it out (we also talked about it with @bpoldrack yesterday). Indeed, I do not have permissions - no button saying "Add label" where it would normally be. Whether that's something to be fixed, I don't know. I think that's tied to repository write access?

Member

adswa commented Nov 10, 2021

I think @mih should add you to the datalad organization - do you know, @mih?

Member

@mih mih left a comment

Thx for the update, LGTM now.

@mih mih added the semver-patch Increment the patch version when merged label Nov 11, 2021
@mih mih merged commit e7dd141 into datalad:master Nov 11, 2021
Linked issue: No warning / error when incorrect config procedure is given to datalad create -c