Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forbid drop operation from symlink'ed annex (e.g. due to being cloned with --reckless=ephemeral) to prevent data-loss #6959

Merged
merged 1 commit into from
Aug 18, 2022

Conversation

mih
Copy link
Member

@mih mih commented Aug 18, 2022

This is a blunt, but effective measure to prevent a data security issue,
where dropping annex keys from 'here' is actually also dropping them
from 'origin' -- thereby potentially deleting the last existing copy.

Even once a more elegant error anticipation and handling is implemented,
it should be kept for safety reasons.

Closes #6948

PS: I will be working on a more elegant wrapping of this fix. But given the severity, this should be merged and released ASAP. #4750 continues to remain as a reminder.

I came to the conclusion that it might be best to remove the entire feature. I filed #6961

Spellchecker errors are addressed in #6956

@mih mih added severity-critical makes it unusable, causes data loss, introduces security vulnerability semver-patch Increment the patch version when merged labels Aug 18, 2022
@adswa
Copy link
Member

adswa commented Aug 18, 2022

But given the severity, this should be merged and released ASAP.

I agree

@codecov
Copy link

codecov bot commented Aug 18, 2022

Codecov Report

Merging #6959 (df38f9e) into maint (5f0b28d) will increase coverage by 0.98%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##            maint    #6959      +/-   ##
==========================================
+ Coverage   89.97%   90.96%   +0.98%     
==========================================
  Files         354      354              
  Lines       46259    46290      +31     
==========================================
+ Hits        41622    42108     +486     
+ Misses       4637     4182     -455     
Impacted Files Coverage Δ
datalad/distributed/drop.py 97.70% <100.00%> (+0.02%) ⬆️
datalad/distributed/tests/test_drop.py 100.00% <100.00%> (ø)
datalad/_version.py 45.68% <0.00%> (-0.28%) ⬇️
datalad/tests/test_config.py 99.73% <0.00%> (+<0.01%) ⬆️
datalad/config.py 97.26% <0.00%> (+<0.01%) ⬆️
datalad/tests/utils.py 65.10% <0.00%> (+11.03%) ⬆️
datalad/tests/test_tests_utils.py 92.34% <0.00%> (+92.34%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@yarikoptic
Copy link
Member

Crippledfs where no symlinjs supported failure is relevant

@@ -727,6 +727,9 @@ def _kill_dataset(ds):
def _drop_allkeys(ds, repo, force=False, jobs=None):
"""
"""
assert not (repo.dot_git / 'annex').is_symlink(), \
"Dropping from a symlinked annex is unsupported to prevent data-loss"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I am good with such a stop gap feature (after testing is fixed up on crippled fs). But also such message could confuse some naive user a little. We might better rely first on

[datalad "clone"]
	reckless = ephemeral

being present in repo.config -- and inform user that dropping in ephemeral clone (the term they must have heard about since they had to use it to get here) is not supported. But indeed checking for symlink is more robust and could be the 2nd line of defense if so desired.

Copy link
Member Author

@mih mih Aug 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree. Whether or not this config item is still set makes little difference re the outcome -- it remains data loss.

This conclusion matches the outcome of #4750

I don't understand what "2nd line of defense if so desired" wants to say.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I decode @yarikoptic's comment correctly, its about rewording the message under certain circumstances:
1: Check if the config item about an ephemeral clone
2: If it is present, reword the warning to talk about "ephemeral clone" instead of "symlinked annex"
The symlink check and warning would co-exist and catch a symlinked annex even without the ephemeral config.
Is that interpretation correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes @adswa , thank you!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now let's just merge, and if @mih decides to improve UX, could be in a follow up PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will adjust the title to be little more "obvious" and reflective of the change.

This is a blunt, but effective measure to prevent a data security issue,
where dropping annex keys from 'here' is actually also dropping them
from 'origin' -- thereby potentially deleting the last existing copy.

Even once a more elegant error anticipation and handling is implemented,
it should be kept for safety reasons.

Fixes datalad#6948
@yarikoptic yarikoptic changed the title Safety-net to prevent data-loss by drop from symlink'ed annex Forbid drop operation from symlink'ed annex (e.g. due to being cloned with --reckless=ephemeral) to prevent data-loss Aug 18, 2022
@yarikoptic yarikoptic merged commit c3fa185 into datalad:maint Aug 18, 2022
@mih mih deleted the bf-6948 branch August 21, 2022 19:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver-patch Increment the patch version when merged severity-critical makes it unusable, causes data loss, introduces security vulnerability
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants