Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

run: Fix path handling for gh-3746 and for subdirectory runs #3747

Merged
merged 16 commits into from
Oct 9, 2019

Conversation

kyleam
Copy link
Contributor

@kyleam kyleam commented Oct 4, 2019

This sits on top of gh-3746 and should fix the run failures (including hopefully the Windows one; edit: that was too hopeful). Here's the restricted diff. This series could be merged into that PR or merged into master following that PR. Or, if desired, I can separate this series from that one, though there will end up being at least be some minor conflicts to deal with.

This fixes gh-3551. The added test, based off of the examples in that issue, will run on Windows, so it might need a compatibility fixup.

@kyleam
Copy link
Contributor Author

kyleam commented Oct 4, 2019

on Windows, so it might need a compatibility fixup.

Sadly it does: https://github.com/datalad/datalad/pull/3747/checks?check_run_id=248506560#step:8:254

@kyleam kyleam added the WIP work in progress label Oct 4, 2019
mih added a commit to mih/datalad that referenced this pull request Oct 5, 2019
mih and others added 16 commits October 6, 2019 11:23
This tries to be a gentle RF that doesn't change (much) of the original
behavior (see test diff). However, a change in-line with datalad#3742 is
unavoidable (see removed test).

This is bringing path semantics in line with status/subdatasets, and is
a precondition to fixing datalad#3469
Also update that filter to match the new conventions
This test leads to run() calling `get(dataset=ds.path, path=['*'])`.
On non-Windows systems, the '*' is considered a literal path and
reported to be absent, as expected.  On Windows, the underlying
os.stat() call fails [0] with

    OSError: [WinError 123] The filename, directory name, or volume
    label syntax is incorrect:
    'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\datalad_temp_5m0lh8wt\\*'

As discussed here [1], we might want to rework run() and/or
GlobbedPaths to avoid hanging on to the unmatched glob and feeding
them as a literal paths.  That might get rid of the failure for this
particular test, though it would just paper over the core Windows path
handling problem shown by this failure..

[0]: https://github.com/datalad/datalad/pull/3746/checks?check_run_id=248850583#step:8:417
[1]: datalad#3746 (comment)
The result records are no longer based on annotate_paths.
Th next commit will pass None as the `path` argument instead of ".",
which would be a breaking change to how the `saver` argument is.  But
adding a compatibility kludge for outside callers isn't worth the
effort.  This argument was added so that the datalad-revolution
extension could use rev_save() instead of add() as the saver, and that
code has been absorbed into the core.  So let's just ignore the
`saver` argument entirely and slate it for removal.
We pass "." as the path argument to Save() to mean "save all changes".
The next commit will change the Save() call from a dataset-bound call
to an unbound call.  That means that Save(dataset=<string>, path=".")
when called in a subdirectory will just save the subdirectory changes.
Instead we need to pass path=None.
We already access this in a couple of spots, and the next commit will
add several more.
As described here [1], there are two approaches to dealing with the
current path rules: (1) resolve all paths to absolute paths and use
bound dataset methods or (2) use unbound dataset methods and relative
paths.

run() has mostly taken the first approach.  There is one spot that
incorrectly passes relative paths to install() [2].  If this is
changed to use absolute paths, it reveals a deeper incompatibility
with the new path handling on master.  rev_resolve_path() isn't happy
with absolute paths that look like "/go/../upstairs" (see its
docstring and code comments for why).  run() constructs paths like
these when the run is happening in a subdirectory and there are inputs
or outputs upstairs (see the examples in the added test).

So instead let's switch to the second approach.

Note that the compatibility kludge around the dataset handling is
mainly for the sake of datalad-htcondor.

Also note that needing to use chpwd() in several spots is ugly.  We do
this because we need to make sure we're in the correct directory when
rerunning.  We might want to make rerun() responsible for calling
chpwd() around its call of run_command(), but that would take a bit
more refactoring.

Fixes datalad#3551.
Re: datalad#3746

[1] datalad#3746 (comment)
[2] ... though these incorrect still work in at least some cases. See
    dataladgh-3650.
@codecov
Copy link

codecov bot commented Oct 6, 2019

Codecov Report

Merging #3747 into master will decrease coverage by 0.04%.
The diff coverage is 94.51%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #3747      +/-   ##
=========================================
- Coverage   82.94%   82.9%   -0.05%     
=========================================
  Files         273     273              
  Lines       35943   35940       -3     
=========================================
- Hits        29813   29795      -18     
- Misses       6130    6145      +15
Impacted Files Coverage Δ
datalad/metadata/aggregate.py 58.16% <ø> (ø) ⬆️
datalad/interface/tests/test_rerun.py 100% <100%> (ø) ⬆️
datalad/core/local/tests/test_run.py 99.55% <100%> (+0.03%) ⬆️
datalad/metadata/metadata.py 88.08% <100%> (ø) ⬆️
datalad/distribution/tests/test_get.py 100% <100%> (ø) ⬆️
datalad/interface/results.py 95.08% <100%> (+0.81%) ⬆️
datalad/distribution/tests/test_uninstall.py 99.69% <100%> (ø) ⬆️
datalad/interface/tests/test_annotate_paths.py 100% <100%> (ø) ⬆️
datalad/distribution/tests/test_install.py 99.8% <100%> (ø) ⬆️
datalad/core/local/run.py 92.2% <84.61%> (-2.51%) ⬇️
... and 12 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1c598c1...2be5c1c. Read the comment docs.

@kyleam kyleam removed the WIP work in progress label Oct 6, 2019
Copy link
Member

@mih mih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (the pieces that you wrote ;-)

@yarikoptic yarikoptic merged commit 2be5c1c into datalad:master Oct 9, 2019
@kyleam kyleam deleted the run-go-unbound branch October 9, 2019 13:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

run: seems to incorrectly handle relative paths if ran within subdirectory
3 participants