Add annex.private to ephemeral clones. That would make git-annex not assign shared (in git-annex branch) annex uuid. #6702

Merged (2 commits) on May 31, 2022

@bpoldrack bpoldrack commented May 24, 2022

Set annex.private=true when cloning with --reckless=ephemeral. This is
better than declaring 'dead here', since it prevents not only the
availability of 'here' from being propagated, but also the location
(uuid) itself. Otherwise, workflows based on ephemeral clones would
accumulate lots of locations that are essentially gone by the time
another repo learns about their existence.

However, still declare dead regardless. This is seemingly superfluous in
combination with private mode, but serves as a safeguard: should an older
git-annex happen to touch the repo, at least the availability is still
not propagated.
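
A minimal sketch of the two steps described above, assuming they are applied through the plain git and git-annex command lines rather than through DataLad's internals (the actual change in clone.py is not reproduced here). The helper names `ephemeral_setup_commands` and `configure_ephemeral_clone` are hypothetical, and the sketch assumes `annex.private` is set before git-annex first initializes the clone, so that no uuid has been recorded in the git-annex branch yet.

```python
import subprocess
from typing import Callable, List


def ephemeral_setup_commands() -> List[List[str]]:
    """Return the commands for the two steps named in the description.

    Hypothetical helper: the real implementation lives in DataLad's
    clone.py and may apply these settings differently.
    """
    return [
        # Step 1: private mode, so neither availability nor the uuid
        # of this clone is recorded in the shared git-annex branch.
        ["git", "config", "--local", "annex.private", "true"],
        # Step 2: safeguard for a git-annex that ignores annex.private.
        ["git", "annex", "dead", "here"],
    ]


def configure_ephemeral_clone(
    repo_path: str, run: Callable = subprocess.run
) -> None:
    """Run both steps inside repo_path; `run` is injectable for testing."""
    for cmd in ephemeral_setup_commands():
        run(cmd, cwd=repo_path, check=True)
```

Keeping the command list separate from the runner makes the order of the two steps easy to inspect without a git-annex installation.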

Closes #5835

Changelog

💫 Enhancements and new features

  • datalad clone --reckless=ephemeral now uses git-annex' private repositories
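
As a rough illustration of the changelog entry, the sketch below builds the command line for an ephemeral clone and checks whether the resulting repository has private mode enabled. The function names are hypothetical, the source and destination values are placeholders, and reading the setting back with `git config --bool --get annex.private` is an assumption about how one might verify the behavior, not something the PR itself does.

```python
import subprocess
from typing import List


def ephemeral_clone_argv(source: str, dest: str) -> List[str]:
    """Build the command line for an ephemeral clone (flag from the changelog)."""
    return ["datalad", "clone", "--reckless=ephemeral", source, dest]


def annex_private_enabled(config_output: str) -> bool:
    """Interpret the output of `git config --bool --get annex.private`.

    With --bool, git normalizes the value to "true" or "false";
    an unset key produces empty output.
    """
    return config_output.strip() == "true"


def clone_is_private(dest: str) -> bool:
    """Query the clone at `dest`; requires git to be installed."""
    result = subprocess.run(
        ["git", "-C", dest, "config", "--bool", "--get", "annex.private"],
        capture_output=True,
        text=True,
    )
    return annex_private_enabled(result.stdout)
```

After running the command returned by `ephemeral_clone_argv`, `clone_is_private(dest)` would be expected to return `True` on a DataLad version that includes this change.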

@bpoldrack bpoldrack added the semver-patch Increment the patch version when merged label May 24, 2022
@mih mih left a comment

Makes sense and LGTM. Thx!

codecov bot commented May 24, 2022

Codecov Report

Merging #6702 (436e7e5) into maint (476b48d) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##            maint    #6702   +/-   ##
=======================================
  Coverage   91.18%   91.18%           
=======================================
  Files         353      353           
  Lines       44515    44523    +8     
=======================================
+ Hits        40591    40599    +8     
  Misses       3924     3924           
Impacted Files                                 Coverage Δ
datalad/interface/common_opts.py               100.00% <ø> (ø)
datalad/core/distributed/clone.py              91.24% <100.00%> (+0.03%) ⬆️
datalad/core/distributed/tests/test_clone.py   97.59% <100.00%> (+0.01%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 476b48d...436e7e5.

@yarikoptic

Something is nagging me about a use case where I would want to know, but I can't come up with it... hence, sounds great! @bpoldrack, please rebase on maint. I have mitigated the test__version__ failure and do not have time to check whether that is the only one keeping us red here.

adswa commented May 31, 2022

I rebased to current maint

adswa commented May 31, 2022

The doc failure is fixed in master, and the macOS failure is known. There is a test failure on Appveyor that I haven't seen before:

======================================================================
ERROR: datalad.support.tests.test_parallel.test_creatsubdatasets
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/appveyor/projects/datalad/datalad/support/parallel.py", line 368, in _iter_threads
    raise _FinalShutdown()
datalad.support.parallel._FinalShutdown
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/appveyor/venv3.7.12/lib/python3.7/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/home/appveyor/projects/datalad/datalad/tests/utils.py", line 874, in _wrap_with_tempfile
    return t(*(arg + (filename,)), **kw)
  File "/home/appveyor/projects/datalad/datalad/support/tests/test_parallel.py", line 133, in test_creatsubdatasets
    list(ProducerConsumer(paths, create_, safe_to_consume=no_parentds_in_futures, jobs=5))
  File "/home/appveyor/projects/datalad/datalad/support/parallel.py", line 265, in __iter__
    yield from self._iter_threads(self._jobs)
  File "/home/appveyor/projects/datalad/datalad/support/parallel.py", line 417, in _iter_threads
    self.shutdown(force=True, exception=self._producer_exception or interrupted_by_exception)
  File "/home/appveyor/projects/datalad/datalad/support/parallel.py", line 233, in shutdown
    raise exception
  File "/home/appveyor/projects/datalad/datalad/support/parallel.py", line 401, in _iter_threads
    done_useful |= self._pop_done_futures(lgr)
  File "/home/appveyor/projects/datalad/datalad/support/parallel.py", line 463, in _pop_done_futures
    raise exception
  File "/home/appveyor/.localpython3.7.12/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/appveyor/projects/datalad/datalad/support/parallel.py", line 329, in consumer_worker
    for r in res:
  File "/home/appveyor/projects/datalad/datalad/interface/utils.py", line 369, in generator_func
    allkwargs):
  File "/home/appveyor/projects/datalad/datalad/interface/utils.py", line 544, in _process_results
    for res in results:
  File "/home/appveyor/projects/datalad/datalad/core/local/create.py", line 291, in __call__
    paths=[check_path.relative_to(parentds_path)])
  File "/home/appveyor/projects/datalad/datalad/support/gitrepo.py", line 2881, in status
    eval_submodule_state=eval_submodule_state)
  File "/home/appveyor/projects/datalad/datalad/support/gitrepo.py", line 2974, in diffstatus
    paths=paths, ref=None, untracked=untracked)
  File "/home/appveyor/projects/datalad/datalad/support/gitrepo.py", line 2774, in get_content_info
    read_only=True)
  File "/home/appveyor/projects/datalad/datalad/dataset/gitrepo.py", line 437, in call_git
    read_only=read_only))
  File "/home/appveyor/projects/datalad/datalad/dataset/gitrepo.py", line 482, in call_git_items_
    sep=sep):
  File "/home/appveyor/projects/datalad/datalad/dataset/gitrepo.py", line 345, in _generator_call_git
    for file_no, content in generator:
  File "/home/appveyor/projects/datalad/datalad/runner/gitrunner.py", line 268, in run_on_filelist_chunks_items_
    yield from chunk_generator
  File "/home/appveyor/.localpython3.7.12/lib/python3.7/_collections_abc.py", line 317, in __next__
    return self.send(None)
  File "/home/appveyor/projects/datalad/datalad/runner/nonasyncrunner.py", line 88, in send
    runner.process_queue()
  File "/home/appveyor/projects/datalad/datalad/runner/nonasyncrunner.py", line 557, in process_queue
    self.remove_file_number(file_number)
  File "/home/appveyor/projects/datalad/datalad/runner/nonasyncrunner.py", line 592, in remove_file_number
    self.fileno_mapping[file_number],
KeyError: 34

I restarted the failed Appveyor build to see if it's a fluke. In any case, it seems unrelated to the changes proposed in this PR.

@yarikoptic

Even though it brings a behavioral change, I would not mind having it merged.

@yarikoptic yarikoptic changed the title Add annex.private to ephemeral clones Add annex.private to ephemeral clones. That would make git-annex not assign shared (in git-annex branch) annex uuid. May 31, 2022
@yarikoptic yarikoptic merged commit 9b21717 into datalad:maint May 31, 2022
github-actions bot commented Jun 2, 2022

🚀 PR was released in 0.16.4 🚀

Labels: released, semver-patch