Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BF: Don't skip subds underneath paths in reckless & recursive drop #7308

Merged
merged 3 commits into from
Mar 22, 2023

Conversation

adswa
Copy link
Member

@adswa adswa commented Feb 27, 2023

This fixes #7013.
If a drop --recursive --what all --reckless kill was provided with paths
to directories, it silently skipped killing the contained subdatasets.
The problem arises when the internal resolving of paths to datasets
determines that the provided directory path belongs to the superdataset.
Instead of then recursing into the subdirectory's potential subdatasets,
drop restrained the scope of what was to be dropped to only
'filecontent' in order to limit the drop from the determined
superdataset to the provided path as an internal safety mechanism:

if paths is not None and paths != [ds.pathobj] and what == 'all':
    # so we have paths constraints that prevent dropping the full dataset
    lgr.debug('Only dropping file content for given paths in %s, '
              'allthough instruction was to drop %s', ds, what)
    what = 'filecontent'

This change lifts this safety mechanism partially: Whenever reckless is
set to 'kill', and the provided path does not match the currently
investigated dataset, it adds a subdataset call constrained to the path
in question. If this returns a dataset, we know that this dataset is
meant to be dropped as well, as recursive (which has to be set whenever
reckless==kill) must be set, too.

@adswa adswa added semver-patch Increment the patch version when merged CHANGELOG-missing When a PR's description does not contain a changelog item, yet. labels Feb 27, 2023
@github-actions github-actions bot removed the CHANGELOG-missing When a PR's description does not contain a changelog item, yet. label Feb 27, 2023
@adswa adswa force-pushed the bf-7013 branch 2 times, most recently from af804e0 to ecb94d4 Compare March 2, 2023 07:58
This fixes datalad#7013.
If a drop --recursive --what all --reckless kill was provided with paths
to directories, it silently skipped killing the contained subdatasets.
The problem arises when the internal resolving of paths to datasets
determines that the provided directory path belongs to the superdataset.
Instead of then recursing into the subdirectory's potential subdatasets,
drop restrained the scope of what was to be dropped to only
'filecontent' in order to limit the drop from the determined
superdataset to the provided path as an internal safety mechanism:

if paths is not None and paths != [ds.pathobj] and what == 'all':
    # so we have paths constraints that prevent dropping the full dataset
    lgr.debug('Only dropping file content for given paths in %s, '
              'allthough instruction was to drop %s', ds, what)
    what = 'filecontent'

This change lifts this safety mechanism partially: Whenever reckless is
set to 'kill', and the provided path does not match the currently
investigated dataset, it adds a subdataset call constrained to the path
in question. If this returns a dataset, we know that this dataset is
meant to be dropped as well, as recursive (which has to be set whenever
reckless==kill) must be set, too.
@adswa
Copy link
Member Author

adswa commented Mar 8, 2023

The failure is unrelated:

FAILED ../datalad/support/tests/test_sshconnector.py::test_ssh_copy - datalad.runner.exception.CommandError: CommandError: 'scp -o ControlPath=/home/travis/.cache/datalad/sockets/1d11e952 /tmp/datalad_temp_test_ssh_copyxs_2d2y5 /tmp/datalad_temp_test_ssh_copy8zhyfqs1 datalad-test:/tmp/datalad_temp_test_ssh_copywrh7glsq' failed with exitcode 1 [err: 'lost connection']

Copy link
Member

@bpoldrack bpoldrack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @adswa!

Overall reads right, but I think it needs a minor change. Correct me if I'm wrong.

datalad/distributed/drop.py Show resolved Hide resolved
@codeclimate
Copy link

codeclimate bot commented Mar 17, 2023

Analysis results are not available for those commits

View more on Code Climate.

@adswa
Copy link
Member Author

adswa commented Mar 17, 2023

I've cancelled the appveyor build because a few time sensitive CI runs in next and dataverse have been waiting for hours.

@adswa
Copy link
Member Author

adswa commented Mar 20, 2023

I have restarted the appveyor build I cancelled on Friday. There is one Travis run that failed, but it looks like something in the test setup glitched and caused widespread failures

@bpoldrack
Copy link
Member

AppVeyor fails are #7320

Travis failed with only one build and this one is - once more, sigh - just completely failing to start up the http server for the tests. However, that build should not be relevant WRT to correctness of this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver-patch Increment the patch version when merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants