
Updates for (unreleased) DataLad 0.13.0 #506

Merged: 15 commits into ReproNim:master from kyleam:datalad-bump on May 28, 2020

Conversation

@kyleam (Contributor) commented on Apr 9, 2020

This series updates run-related functionality for changes in the next feature release of DataLad. That version will contain some functionality that was prompted by issues on ReproMan's end, in particular improvements to datalad update and datalad create-sibling.

The changes above will require bumping our dependency to DataLad 0.13.0, so other commits in this series prune no longer needed compatibility kludges. In addition, 0.13 removes the deprecated datalad add command, so a commit here migrates to datalad save (which was held off to retain compatibility with DataLad 0.11.x line).

I'm marking this as a draft because (1) this shouldn't be merged until DataLad 0.13.0 is released and (2) this should resolve at least a few open issues, but I still need to reference them in the commit messages and add relevant tests. The latter will probably point out the need for at least minor tweaks.

@codecov bot commented on Apr 9, 2020

Codecov Report

Merging #506 into master will increase coverage by 0.05%.
The diff coverage is 96.47%.


@@            Coverage Diff             @@
##           master     #506      +/-   ##
==========================================
+ Coverage   89.60%   89.66%   +0.05%     
==========================================
  Files         148      148              
  Lines       12290    12388      +98     
==========================================
+ Hits        11013    11108      +95     
- Misses       1277     1280       +3     
Impacted Files Coverage Δ
reproman/support/jobs/orchestrators.py 93.54% <88.88%> (+1.69%) ⬆️
reproman/support/jobs/tests/test_orchestrators.py 94.89% <100.00%> (+1.15%) ⬆️
reproman/support/exceptions.py 75.00% <0.00%> (-3.50%) ⬇️
reproman/interface/jobs.py 96.11% <0.00%> (-2.89%) ⬇️
reproman/interface/tests/test_run.py 98.46% <0.00%> (-1.15%) ⬇️
reproman/interface/run.py 100.00% <0.00%> (ø)
reproman/support/jobs/submitters.py 79.25% <0.00%> (+1.04%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 214dbea...a1653bd.

@kyleam (Contributor, Author) commented on Apr 9, 2020

The Travis run isn't included in the checks box (at least at the moment), but here's the run:

https://travis-ci.org/github/ReproNim/reproman/builds/673081638

The build that covers the most run functionality is still in progress, but scanning a couple of runs that already failed, it looks like the failures are due to test_conda_init_install_and_detect failing. That's unlikely to be related to this PR (*) and is probably a failure we need to address on master. (Perhaps this is related to gh-446, and the workaround from gh-443 is no longer sufficient.)

(*) edit: As expected, the test_conda_init_install_and_detect failure is showing up after the latest push to gh-505.

@kyleam (Contributor, Author) commented on Apr 17, 2020

Updated to resolve conflicts with master.

range-diff
 1:  32938c829 =  1:  4345efaf6 TST: travis: Install development version of DataLad
 2:  afecf0060 !  2:  7192205d4 RF: run: Use 'datalad save' instead of 'datalad add'
    @@ reproman/support/jobs/tests/test_orchestrators.py: def dataset(tmpdir_factory):
          return ds
      
     @@ reproman/support/jobs/tests/test_orchestrators.py: def run_and_check(spec):
    -         # The submitter log file is ignored (currently only relevant for
    -         # condor; see b9277ebc0 for more details). Add the directory to get to
    -         # a clean state.
    --        dataset.add(".reproman")
    -+        dataset.save(".reproman")
    -     orc = run_and_check(dumped_spec)
    - 
    +             # The submitter log file is ignored (currently only relevant for
    +             # condor; see b9277ebc0 for more details). Add the directory to get
    +             # to a clean state.
    +-            dataset.add(".reproman")
    ++            dataset.save(".reproman")
    +         orc = run_and_check(dumped_spec)
    +     return fn
      
     @@ reproman/support/jobs/tests/test_orchestrators.py: def test_orc_datalad_run_change_head(job_spec, dataset, shell):
      
 3:  b413335f7 =  3:  db477e416 RF: orchestrators: Prune now unneeded SSH URL compatibility kludge
 4:  f3effe526 =  4:  ace092835 RF: orchestrators: Adjust datalad run imports
 5:  20c6b9125 =  5:  fe302e8a2 MNT: orchestrators: Drop warning about an unsupported DataLad version
 6:  4f3e114f1 <  -:  --------- RF: test_orchestrators: Use DataLad's repo.call_git()
 -:  --------- >  6:  b87cbe762 RF: test_orchestrators: Use DataLad's repo.call_git()
 7:  ba6ca4bb6 !  7:  493ac8533 ENH: orchestrators: Meld handling of local and SSH sessions
    @@ reproman/support/jobs/orchestrators.py: def fetch(self, on_remote_finish=None):
     
      ## reproman/support/jobs/tests/test_orchestrators.py ##
     @@ reproman/support/jobs/tests/test_orchestrators.py: def run_and_check(spec):
    -         # Our reproman-based copying of data doesn't isn't (yet) OK with data
    -         # files that already exist.
    -         dumped_spec["inputs"] = []
    --    dataset.repo.call_git(["reset", "--hard", "start-pt"])
    -+    dataset.repo.call_git(["checkout", "-b", "other", "start-pt"])
    -     if dataset.repo.dirty:
    -         # The submitter log file is ignored (currently only relevant for
    -         # condor; see b9277ebc0 for more details). Add the directory to get to
    +             # Our reproman-based copying of data doesn't isn't (yet) OK with
    +             # data files that already exist.
    +             dumped_spec["inputs"] = []
    +-        dataset.repo.call_git(["reset", "--hard", "start-pt"])
    ++        dataset.repo.call_git(["checkout", "-b", "other", "start-pt"])
    +         if dataset.repo.dirty:
    +             # The submitter log file is ignored (currently only relevant for
    +             # condor; see b9277ebc0 for more details). Add the directory to get
 8:  da5bd5f4d =  8:  7c9e6cee8 BF: orchestrators: Initialize submodules
 9:  a41694f1e =  9:  3ff339655 ENH: orchestrators: Fetch with 'update --follow=parentds'
10:  776ffb290 = 10:  a902ac9cc BF: orchestrators: Fetch with --recurse-submodules=no
11:  190abdb1e = 11:  c4347675a [todo] Add placeholder comment

kyleam added 12 commits May 6, 2020 14:23
The upcoming commits will depend on features and fixes that will be a
part of DataLad's 0.13 release.
The changes in this series depend on the 0.13 release.  While not done
by this commit, we should eventually do version checks on the remote
end as well (issue 477).  In that case, datalad v0.12.x would probably
be sufficient, though I'm not sure it's worth making the distinction.
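For illustration, a minimal sketch of the kind of local version guard this implies; the helper name is hypothetical, and the remote-end check mentioned in issue 477 is not covered here:

```python
from packaging.version import Version

import datalad


def require_datalad(minimum="0.13.0"):
    """Abort early if the installed DataLad is too old (hypothetical helper)."""
    installed = Version(datalad.__version__)
    if installed < Version(minimum):
        raise RuntimeError(
            "DataLad >= {} is required, found {}".format(minimum, installed))
```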
add() was deprecated in DataLad v0.12 and has been dropped in the
upcoming 0.13 release.

Closes ReproNim#463.
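For illustration, a minimal sketch of the add -> save switch via DataLad's Python API; the dataset path is a placeholder, and the `.reproman` target mirrors the test change shown in the range-diff above:

```python
from datalad.api import Dataset

ds = Dataset("/path/to/dataset")  # placeholder path

# Previously (deprecated in DataLad 0.12, removed in 0.13):
#   ds.add(".reproman")
# Now:
ds.save(".reproman")
```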
In DataLad v0.12 the module was moved to datalad.core.local.run;
datalad.interface.run is now just a compatibility shim.
DataLad v0.12 exposed a call_git() method that should be used instead
of _git_custom_command().
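A minimal sketch of the call_git() usage the commit above switches to; the git arguments and dataset path are placeholders, not reproman's actual calls:

```python
from datalad.api import Dataset

ds = Dataset("/path/to/dataset")  # placeholder path

# Since DataLad 0.12, GitRepo.call_git() is the supported way to run
# arbitrary git commands; it replaces the private _git_custom_command().
out = ds.repo.call_git(["rev-parse", "HEAD"])
print(out.strip())
```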
With the upcoming DataLad 0.13 release, create-sibling now supports
local paths in addition to SSH URLs.

Closes ReproNim#462.
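A minimal sketch of the new capability, with illustrative sibling names and paths rather than reproman's actual orchestrator code:

```python
from datalad.api import Dataset

ds = Dataset("/path/to/dataset")  # placeholder path

# SSH URL target, as before:
#   ds.create_sibling("ssh://remote-host/home/me/run-target", name="myres")
# With DataLad 0.13, a plain local path also works as the target:
ds.create_sibling("/tmp/run-target", name="myres")
```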
In order to run something that involves submodules, we of course need
to populate them.  create-sibling won't do this for us as of DataLad's
78e00dcd23 (RF: do not run submodule update --init in the post-update
hook, 2019-04-10), so let's do a blanket 'update --init'.

This will probably need to be refined later, as getting all
subdatasets is unlikely to be desirable.  And we very likely should
try harder to stay on a branch.  This shouldn't functionally matter
with the upcoming 'update --follow=parent' change, but it risks
confusing users that inspect the remote directly.
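Sketch of the blanket initialization described above, assuming the remote dataset's working directory is known; this is illustrative, not the orchestrator's exact call:

```python
from datalad.api import Dataset

remote_ds = Dataset("/path/to/remote/working/directory")  # placeholder path

# Populate all registered submodules.  As noted above, this blanket
# 'update --init' will probably need to be refined later.
remote_ds.repo.call_git(["submodule", "update", "--init"])
```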
By default 'git fetch' will try to fetch submodules if the recorded
commit isn't available locally.  Avoid this because 1) with the
upcoming 0.13 release of DataLad, update() will try to fetch a commit even
if it isn't in a ref that's brought down by the initial fetch [*] and
2) submodule.c hard codes 'origin' as a fallback to fetch from, so the
fetch is likely to fail, as 'run' names remotes by the resource name.

[*] See DataLad's dbd84e5662 (ENH: update: Try to fetch submodule
    commit explicitly if needed, 2020-02-07).

Fixes ReproNim#499.
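Sketch of the fetch described above, with an illustrative remote name and refspec (reproman names remotes by resource, and the exact refspec spelling here is an assumption):

```python
from datalad.api import Dataset

ds = Dataset("/path/to/dataset")  # placeholder path

# Disable git's automatic submodule fetching so that the hard-coded
# 'origin' fallback in submodule.c cannot make the fetch fail.
ds.repo.call_git(["fetch", "--recurse-submodules=no", "myres",
                  "refs/reproman/*:refs/reproman/*"])
```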
head_at() already aborts with a dirty repository to avoid losing any
changes.  But, if a submodule changed between the current and target
commit, the repository may become dirty after checking out the target.
In that case, the submodule is carrying an undefined set of changes
that could cause confusion (e.g., if the submodule change came in with
a save).
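A minimal sketch of the additional safety check, with an illustrative target ref; the surrounding context-manager machinery of head_at() is omitted:

```python
from datalad.api import Dataset

ds = Dataset("/path/to/dataset")  # placeholder path

ds.repo.call_git(["checkout", "refs/reproman/some-job"])  # illustrative ref
if ds.repo.dirty:
    # A changed submodule can leave the tree dirty after the checkout,
    # carrying an undefined set of changes; refuse to continue.
    raise RuntimeError("repository dirty after checkout; aborting")
```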
datalad-pair's fetch() fetches the job ref from the resource and then
relies on 'datalad update' to bring down changes from the remote.  If
the merged in branch doesn't contain the job ref, then we temporarily
switch to the job ref, get the outputs, and then let the caller know
that the changes weren't brought in.

This approach is problematic with submodules.  One issue is that the
head switch mentioned above does not check out submodules, and in
general DataLad doesn't yet offer a way to set/reload a dataset
hierarchy to a particular state.

Another issue is that update() selects a merge target from the
submodule remote that doesn't necessarily contain the registered
commit.  See DataLad's 94efa8637a (NF: update: Add option to merge in
recorded submodule commit, 2020-02-17) for more information.  In what
will be the 0.13 release, update() learned a --follow=parentds mode
that makes update() take the registered commit as the merge target
when updating submodules.  Use it.

However, this still doesn't deal with bringing a non-mainline job ref
into the top-level dataset because update() unconditionally merges in
a branch from the remote.  Try harder to bring in the job ref by doing
an explicit 'git merge' of the ref before the update() call.
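Putting the pieces together, a sketch of the adjusted fetch flow with illustrative remote and ref names; the real code lives in the datalad-pair orchestrator's fetch():

```python
from datalad.api import Dataset

ds = Dataset("/path/to/dataset")  # placeholder path

# Bring down the job ref from the run resource (names are illustrative).
ds.repo.call_git(["fetch", "--recurse-submodules=no", "myres",
                  "refs/reproman/*:refs/reproman/*"])
# Try harder to land the job ref on the current branch before updating.
ds.repo.call_git(["merge", "refs/reproman/some-job"])
# Let 'datalad update' take each submodule to its registered commit.
ds.update(sibling="myres", merge=True, follow="parentds", recursive=True)
```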
@kyleam force-pushed the datalad-bump branch 3 times, most recently from 5dd5d8a to cdce122 on May 7, 2020 16:57
@@ -7,5 +7,5 @@ sudo apt-get install git-annex-standalone
 # Install datalad system-wide for use with localhost ssh
 sudo apt-get install datalad
 # ... and install it into the virtualenv.
-pip install datalad
+pip install git+https://github.com/datalad/datalad.git
@yarikoptic (Member) commented on this diff:
Now that there is a datalad 0.13.0rc1, I think it should be sufficient to declare ~= 0.13.0rc1 for datalad. That would, however, limit us to the 0.13 "major" series of datalad.

@kyleam (Contributor, Author) replied:

This line will be reverted and setup.py adjusted with the dependency anyway.

To help debug failures, make all results available at the DEBUG level,
and display status="error" results directly in the OrchestratorError
message.

At this point, we're not doing any analysis of the error message, so
remove the datalad-update suggestion.  As demonstrated by ReproNim#441, this
may fail for a variety of reasons, and in some cases the
datalad-update snippet will just be confusing.  Eventually it might be
good to do more analysis on an error to give targeted suggestions.
And at the very least it'd probably be good to format the message field
if it's coming in as a (<format string>, <args>) tuple.

Closes ReproNim#441.
Re: ReproNim#511
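A sketch of the result handling described above; the helper and its name are hypothetical, and a local exception class stands in for reproman's OrchestratorError:

```python
import logging

lgr = logging.getLogger("reproman.sketch")


class OrchestratorError(RuntimeError):
    """Stand-in for reproman's OrchestratorError."""


def check_results(results):
    """Log all DataLad result records at DEBUG; raise on status='error'."""
    failed = []
    for res in results:
        lgr.debug("datalad result: %s", res)
        if res.get("status") == "error":
            failed.append(res)
    if failed:
        raise OrchestratorError("DataLad call failed: {}".format(failed))
```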
A recursive create-sibling is called only if the top-level remote
dataset doesn't exist.  That means that, after an initial run, another
run that has new subdatasets locally will fail because the subdataset
won't yet have a remote for this resource.

Update prepare_remote() to always call create-sibling, skipping
existing targets, to handle this situation.

Ref: ReproNim#511 (comment)
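Sketch of the adjusted prepare_remote() behavior, with illustrative target and sibling names rather than the orchestrator's real arguments:

```python
from datalad.api import Dataset

ds = Dataset("/path/to/dataset")  # placeholder path

# Always run create-sibling recursively; existing targets are skipped,
# so only newly added subdatasets get a sibling on the resource.
ds.create_sibling("ssh://remote-host/home/me/run-target",
                  name="myres", recursive=True, existing="skip")
```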
This should have been done when we started using DataLad
functionality.

Adjust the .travis.yml install target so that DataLad continues to
only be pulled in for the INSTALL_DATALAD=1 run.
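A hedged sketch of what the setup.py adjustment could look like; whether DataLad lands in install_requires or an extra, and the exact version spelling, are assumptions rather than a description of reproman's actual setup.py:

```python
from setuptools import setup

setup(
    name="reproman",
    # ... other metadata and requirements ...
    extras_require={
        # Illustrative pin: require the 0.13 series this PR depends on.
        "datalad": ["datalad>=0.13.0"],
    },
)
```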
@kyleam kyleam marked this pull request as ready for review May 27, 2020 20:57
@yarikoptic yarikoptic merged commit d47e38b into ReproNim:master May 28, 2020
@kyleam kyleam deleted the datalad-bump branch May 28, 2020 14:59