Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RF: Implement new install/get API #613

Merged
merged 69 commits into from
Sep 21, 2016
Merged

Conversation

bpoldrack
Copy link
Member

@bpoldrack bpoldrack commented Jul 7, 2016

No actual description. Just referencing things ...

Closes #553
Closes #834
Closes #724
Closes #693
Closes #787

@coveralls
Copy link

coveralls commented Jul 7, 2016

Coverage Status

Coverage increased (+0.04%) to 85.51% when pulling b196cc8 on bpoldrack:rf-get into 797b455 on datalad:master.

@codecov-io
Copy link

codecov-io commented Jul 7, 2016

Codecov Report

Merging #613 into master will increase coverage by 0.19%.
The diff coverage is 92.26%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #613      +/-   ##
==========================================
+ Coverage   87.48%   87.68%   +0.19%     
==========================================
  Files         216      219       +3     
  Lines       19912    20369     +457     
==========================================
+ Hits        17421    17861     +440     
- Misses       2491     2508      +17
Impacted Files Coverage Δ
datalad/interface/__init__.py 100% <ø> (ø) ⬆️
datalad/distribution/add.py 84.82% <ø> (ø) ⬆️
datalad/tests/utils.py 90.07% <ø> (+0.15%) ⬆️
datalad/distribution/tests/test_dataset.py 100% <100%> (ø) ⬆️
datalad/interface/save.py 96.59% <100%> (+1.13%) ⬆️
datalad/crawler/nodes/annex.py 84.39% <100%> (ø) ⬆️
datalad/distribution/tests/test_get.py 100% <100%> (ø)
datalad/distribution/create_test_dataset.py 84.7% <100%> (-0.18%) ⬇️
datalad/tests/test_dochelpers.py 100% <100%> (ø) ⬆️
datalad/auto.py 78.46% <100%> (ø) ⬆️
... and 28 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d5d30e0...310bc1e. Read the comment docs.

@yarikoptic
Copy link
Member

how is it going? I would be excited to try new install and get combination ;)

annex_get_opts=annex_get_opts)

@staticmethod
@datasetmethod(name='add')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name='get'?

@coveralls
Copy link

coveralls commented Sep 6, 2016

Coverage Status

Coverage decreased (-2.2%) to 86.674% when pulling 465b985 on bpoldrack:rf-get into 5ff2165 on datalad:master.

2 similar comments
@coveralls
Copy link

Coverage Status

Coverage decreased (-2.2%) to 86.674% when pulling 465b985 on bpoldrack:rf-get into 5ff2165 on datalad:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-2.2%) to 86.674% when pulling 465b985 on bpoldrack:rf-get into 5ff2165 on datalad:master.

recursion_limit=recursion_limit)
if p_ds is None:
raise FileNotInRepositoryError(
msg="{0} not in dataset.".format(p))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just a warning that this will be ignored?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be. Same for not existing files?

@mih
Copy link
Member

mih commented Sep 6, 2016

Observation:

% ll -l
total 16K
drwxr-xr-x 3 mih mih 4,0K Sep  1 19:37 deep/
lrwxrwxrwx 1 mih mih  108 Sep  1 19:36 one -> .git/annex/objects/2W/kW/MD5E-s0--d41d8cd98f00b204e9800998ecf8427e/MD5E-s0--d41d8cd98f00b204e9800998ecf8427e
lrwxrwxrwx 1 mih mih  108 Sep  1 19:36 three -> .git/annex/objects/2W/kW/MD5E-s0--d41d8cd98f00b204e9800998ecf8427e/MD5E-s0--d41d8cd98f00b204e9800998ecf8427e
lrwxrwxrwx 1 mih mih  108 Sep  1 19:36 two -> .git/annex/objects/2W/kW/MD5E-s0--d41d8cd98f00b204e9800998ecf8427e/MD5E-s0--d41d8cd98f00b204e9800998ecf8427e
% datalad get one two three
one ... failed.
two ... failed.
three ... failed.

All of the above symlinks already point to existing content.

@mih
Copy link
Member

mih commented Sep 6, 2016

Somewhat overwhelmed by the verbosity on standard talkative level:

% datalad get deep/some
2016-09-06 15:54:26,226 [ERROR  ] FileNotInRepositoryError: command ''.
| /tmp/some/deep/some belongs to subdataset <Dataset path=/tmp/some/deep>. To get its content use option `recursive` or call get on the subdataset.
| [Errno None] : /tmp/some/deep/some belongs to subdataset <Dataset path=/tmp/some/deep>. To get its content use option `recursive` or call get on the subdataset.: '' [get.py:__call__:133] (FileNotInRepositoryError) (main.py:259)

@bpoldrack
Copy link
Member Author

Re observation on identical content: Yes. Fixed already (partially), but not pushed yet. Result reporting is a topic for hangout. I have questions ... ;-)

resolved_datasets.get(p_ds.path, []) + [p]

# TODO: Change behaviour of Dataset: Make subdatasets singletons to
# always get the same object referencing a certain subdataset.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would worry about that later whenever we actually run into the problem... for now I think the most important is to get get and install working correctly

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, "the problem" is that it currently leads to a less clean and straight implementation of API. But I agree - getting them working correctly again is the more important thing right now.

@yarikoptic
Copy link
Member

yarikoptic commented Sep 20, 2016

yes

edit: we discussed that before, agreed that it is useful, and we had it implemented (and I thought we had a test for that). So imho it should stay implemented

edit2: later we might want to make it up to configuration variable to do such installations in general or not, and ask user upon initial invocation

@bpoldrack
Copy link
Member Author

Well, if you say so, I will add it.

@@ -262,6 +270,9 @@ def __call__(
relpath(opj(ds_path, local_results[i]['file']), ds.path)

global_results.extend(local_results)

if not found_an_annex:
lgr.warning("Found no annex. Could not perform any get operation.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am ok with such approach instead of mine... just fix up the test... I forgot also to improve that test by making an annex subdataset and testing actual recursive call on super pure-git dataset... we really need those tests

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just cherry pick that commit ENHing single_or_plural please.

ok -- I will leave you alone for now ;)

@yarikoptic
Copy link
Member

I think that the merge of this PR is upon us -- basic real life testing identified a few issue, but it is quite good already. so I will mark it as for milestone 0.3 ;-)

@yarikoptic
Copy link
Member

So we are just 1 quick fix away from the merge! ;) whoohooo -- great job @bpoldrack! It feels like the beast could be usable now! But I would really appreciate if you give a test yourself on our /// and especially those datasets which require authentication (crcns, kaggle, etc)

# Keep original in debug output:
lgr.debug("Original failure:{0}"
"{1}".format(linesep, exc_str(e)))
return None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope -- not None. Remember -- "fulfilling the promise!" ;) and that is why tests failed as well btw

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, None! ;-)
I think you misread it. If there is an installed dataset at the target we return it. But if it is something else we return None since we neither installed a dataset nor is it already there.

But there's something else wrong with it. Condition conflicts with URL guessing, since an empty directory is okay, but could still fail in an attempt to guess the correct URL.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully works now ;-)

# (TODO: eventually check for being the one, that this
# is about)
if current_dataset.is_installed():
lgr.info("{0} appears to be installed already.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forgotten .format()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

break
else:
lgr.warning("Target {0} already exists and is not an "
"installed dataset. Skipped.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again forgotten format or I am not aware of some magic? (in python3.5 e.g. you could do smth like r"{path} blah" which would use local var path)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Committed to fast.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

BF: Tested wrong condition
get was unnecessarily recursivly called from within install
@bpoldrack
Copy link
Member Author

Damn it. OSX failing ...

# is about)
if current_dataset.is_installed():
lgr.info("{0} appears to be installed already."
"".format(current_dataset))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ha - "".format -- cute... ;) thought myself about what to do in such cases ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants