Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: when installing/getting a submodule, make checkout branch to point to that commit #1370

Merged
merged 3 commits into from
Mar 17, 2017

Conversation

yarikoptic
Copy link
Member

Closes #937
but now I wonder if we should place more safe-guards, e.g. to not perform it if previous position of the branch is not available within any remote, thus signalling that it probably has local changes, thus we shouldn't just 'loose' them.

@@ -327,7 +364,7 @@ def _get_flexible_source_candidates_for_submodule(ds, sm_path, sm_url=None):
sm_url,
remote_url if remote_url else ds.path)

return clone_urls
return unique(clone_urls)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spotted it going through identical URLs -- so fixed with our helper

@@ -214,7 +215,43 @@ def _install_subds_from_flexible_source(ds, sm_path, sm_url, reckless):
# do fancy update
if sm_path in ds.get_subdatasets(absolute=False, recursive=False):
lgr.debug("Update cloned subdataset {0} in parent".format(subds))
# TODO: move all of that into update_submodule ??
# TODO: direct mode ramifications?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

started to move into update_submodule but that one is defined in GitRepo, and so far nothing decorated at AnnexRepo level. @bpoldrack -- should it work currently in direct mode somehow?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm too late obviously, but for FTR: Yes, it should. AnnexRepo makes sure, that all the git calls are done with -c core.bare=False automatically in direct mode. It needs to override it only if this isn't sufficient, but in most cases it is.

@codecov-io
Copy link

codecov-io commented Mar 14, 2017

Codecov Report

Merging #1370 into master will decrease coverage by 0.04%.
The diff coverage is 94.54%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1370      +/-   ##
==========================================
- Coverage    89.4%   89.36%   -0.05%     
==========================================
  Files         236      236              
  Lines       24682    24773      +91     
==========================================
+ Hits        22068    22138      +70     
- Misses       2614     2635      +21
Impacted Files Coverage Δ
datalad/support/tests/test_gitrepo.py 100% <100%> (ø) ⬆️
datalad/distribution/tests/test_install.py 99.77% <100%> (+0.01%) ⬆️
datalad/support/gitrepo.py 86.27% <100%> (+0.94%) ⬆️
datalad/distribution/utils.py 90.68% <85%> (-0.67%) ⬇️
datalad/tests/test__main__.py 93.54% <0%> (-6.46%) ⬇️
datalad/ui/progressbars.py 91.35% <0%> (-5.71%) ⬇️
datalad/log.py 92.17% <0%> (-5.22%) ⬇️
datalad/crawler/nodes/matches.py 83.49% <0%> (-4.86%) ⬇️
datalad/tests/utils.py 88.41% <0%> (-1.13%) ⬇️
datalad/support/annexrepo.py 90.69% <0%> (ø) ⬆️
... and 10 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ddae14a...f6f8d69. Read the comment docs.

@yarikoptic
Copy link
Member Author

@bpoldrack @mih -- what do you think? should I implement more safe-guards or should be ok as is for now?

@mih
Copy link
Member

mih commented Mar 16, 2017

There is indeed the potential to loose something. Especially, if we start to prune things automatically, as suggested in #1371

I think we should have protection against that.

…as already installed (should not happen actually)
@yarikoptic
Copy link
Member Author

ha -- looking back at the code -- I think we are safe since this function is to install (clone) new sub-datasets so there is nothing too loose since we didn't have anything before running it. Nevertheless I have pushed f6f8d69 to

  • fix a bug of me taking branch name from the super-dataset not sub-dataset ;)
  • adding an additional check and not perform this smart branch update whenever dataset
    was somehow pre-installed first.

This whole logic/adjustment I guess is also relevant to 'update' action, which should be actually the one worrying about loosing stuff ;) but I did not find other uses of update_submodule so I guess it is done differently in the update, and could be the next thing to tackle for consistent with this change behavior

@mih
Copy link
Member

mih commented Mar 17, 2017

The reason why you don't find any other use of that function is that update is really dumb (I mentioned it repeatedly). While it works recursively, it treats all datasets it acts upon absolutely independently. I will simple wrack the ship in any more than trivial update scenario. It should not cause any real harm, but it would leave behind a dataset hierarchy that has lots of dirty-submodule-superdataset relationships.

update needs an update that makes it figure out what changed, and what needs to be saved where, compiles that into a content_by_ds dict and gives it to save_dataset_hierarchy.

@mih mih merged commit 4ae7520 into datalad:master Mar 17, 2017
@mih mih mentioned this pull request Mar 17, 2017
20 tasks
@mih
Copy link
Member

mih commented Mar 19, 2017

#1350 is now broken after having merged this PR. The last line of the log snippet below documents the origin of the failure. That line was added in this PR and accesses the repo attribute of the to-be-installed dataset -- which is None in this case.

I am not sure why this is happening at all. I would appreciate any hints. Thanks.

DATALAD_LOG_LEVEL=debug PATH=~/hacking/datalad/git/bin:$PATH PYTHONPATH=~/hacking/datalad/git DATALAD_TESTS_SSH=1 nosetests3 -s datalad/distribution/tests/test_install.py:test_install_skip_failed_recursive
....
[DEBUG  ] Determined class of decorated function: <class 'datalad.distribution.get.Get'> 
[DEBUG  ] Resolved input path arguments: ['/tmp/datalad_temp_test_install_skip_failed_recursiveuj9ipxvl'] 
[INFO   ] Obtaining <Dataset path=/tmp/datalad_temp_test_install_skip_failed_recursiveuj9ipxvl>  recursively 
[DEBUG  ] Failed to detect a valid repo at /tmp/datalad_temp_test_install_skip_failed_recursiveuj9ipxvl/subm 1 
[DEBUG  ] Failed to detect a valid repo at /tmp/datalad_temp_test_install_skip_failed_recursiveuj9ipxvl/subm 1 
[DEBUG  ] Retrieving a dataset from URL: /tmp/datalad_temp_testrepo_bd4ykdz9/subm 1 
[DEBUG  ] Failed to retrieve from URL: /tmp/datalad_temp_testrepo_bd4ykdz9/subm 1 
[DEBUG  ] Retrieving a dataset from URL: file:///tmp/datalad_temp_testrepo_from62uw 
[DEBUG  ] Failed to retrieve from URL: file:///tmp/datalad_temp_testrepo_from62uw 
[DEBUG  ] Failed to detect a valid repo at /tmp/datalad_temp_test_install_skip_failed_recursiveuj9ipxvl/subm 1 
[WARNING] Target /tmp/datalad_temp_test_install_skip_failed_recursiveuj9ipxvl/subm 1 already exists and is not an installed dataset. Skipped. 
[DEBUG  ] Original failure:
| Cmd('git') failed due to: exit code(128)
|   cmdline: git clone -v file:///tmp/datalad_temp_testrepo_from62uw /tmp/datalad_temp_test_install_skip_failed_recursiveuj9ipxvl/subm 1
|   stderr: 'fatal: destination path '/tmp/datalad_temp_test_install_skip_failed_recursiveuj9ipxvl/subm 1' already exists and is not an empty directory.' [gitrepo.py:clone:576] 
[DEBUG  ] Update cloned subdataset <Dataset path=/tmp/datalad_temp_test_install_skip_failed_recursiveuj9ipxvl/subm 1> in parent 
[DEBUG  ] Failed to detect a valid repo at /tmp/datalad_temp_test_install_skip_failed_recursiveuj9ipxvl/subm 1 
[DEBUG  ] Failed to detect a valid repo at /tmp/datalad_temp_test_install_skip_failed_recursiveuj9ipxvl/subm 1 
[ERROR  ] Installation of subdatasets <Dataset path=/tmp/datalad_temp_test_install_skip_failed_recursiveuj9ipxvl/subm 1> failed with exception: 'NoneType' object has no attribute 'get_active_branch' [utils.py:_install_subds_from_flexible_source:146] 

@mih mih mentioned this pull request Mar 19, 2017
39 tasks
@yarikoptic
Copy link
Member Author

sorry -- can't find that code within 0.4.1-688-g55689abc of your branch -- 146th line of that file is just "return subds" for me

@mih
Copy link
Member

mih commented Mar 20, 2017 via email

@yarikoptic yarikoptic deleted the enh-no-detached branch April 2, 2017 01:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants