Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

addurls: Don't pass --nosave to create (and use rev-create) #3259

Merged
merged 4 commits into from Mar 23, 2019

Conversation

@kyleam
Copy link
Contributor

@kyleam kyleam commented Mar 22, 2019

In gh-3252, @mih says

addurls offer a global save switch that causes any dataset modification to not be saved. This includes the initial dataset content placed in by create as well. I don't see why this would be needed, and it is off by default already.

I don't either. @yarikoptic, this is from your d3d6610 (ENH: --no-save flag for addurls (TODO: tests), 2018-03-14). Do you recall any reason why you wanted create to honor --nosave?

This PR stops propagating --nosave to create, and updates addurls to use rev-create.

kyleam added 4 commits Mar 22, 2019
This isn't a real attempt to make these tests portable, but we might
as well avoid joining _local_ paths with "/". (Many of the hard-coded
"/"s are within the HTTPPath URL paths.)
d3d6610 (ENH: --no-save flag for addurls, 2018-03-14) propagated the
--nosave flag through to the create calls.  However, there's not an
obvious scenario where an addurls caller would want to leave the
initial dataset files untracked, and create's replacement, rev-create,
doesn't offer a --nosave option.

Start saving the dataset on creation even when --nosave is passed.

Re: datalad#3252
dataladgh-3252 didn't do this because rev-create dropped create's --nosave
flag, and addurls unnecessarily propagated --nosave to create.  This
is no longer the case.
@codecov
Copy link

@codecov codecov bot commented Mar 22, 2019

Codecov Report

Merging #3259 into master will increase coverage by 0.04%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3259      +/-   ##
==========================================
+ Coverage   90.97%   91.02%   +0.04%     
==========================================
  Files         263      263              
  Lines       34125    34131       +6     
==========================================
+ Hits        31046    31068      +22     
+ Misses       3079     3063      -16
Impacted Files Coverage Δ
datalad/plugin/addurls.py 99.68% <100%> (ø) ⬆️
datalad/plugin/tests/test_addurls.py 100% <100%> (ø) ⬆️
datalad/tests/test_log.py 99% <0%> (+0.99%) ⬆️
datalad/log.py 90% <0%> (+7.14%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0bb31b3...58189e9. Read the comment docs.

1 similar comment
@codecov
Copy link

@codecov codecov bot commented Mar 22, 2019

Codecov Report

Merging #3259 into master will increase coverage by 0.04%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3259      +/-   ##
==========================================
+ Coverage   90.97%   91.02%   +0.04%     
==========================================
  Files         263      263              
  Lines       34125    34131       +6     
==========================================
+ Hits        31046    31068      +22     
+ Misses       3079     3063      -16
Impacted Files Coverage Δ
datalad/plugin/addurls.py 99.68% <100%> (ø) ⬆️
datalad/plugin/tests/test_addurls.py 100% <100%> (ø) ⬆️
datalad/tests/test_log.py 99% <0%> (+0.99%) ⬆️
datalad/log.py 90% <0%> (+7.14%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0bb31b3...58189e9. Read the comment docs.

@yarikoptic
Copy link
Member

@yarikoptic yarikoptic commented Mar 22, 2019

the PR was #2302 which says

This pull request fixes #2280. It adds a --nosave flag to addurls so that it's possible to say datalad run 'datalad addurls ...'.

So it still sounds like a legit use-case to me - to be able to datalad run our own commands, which otherwise do not create run record, but when their invocation is "loaded" enough to warrant book keeping. Somewhat relates to #3236 (commit messages) as well in that.

@mih
Copy link
Member

@mih mih commented Mar 22, 2019

I don't understand why a dataset's config file must not be commited after creating a fresh dataset in order to be able to call run.

@yarikoptic
Copy link
Member

@yarikoptic yarikoptic commented Mar 22, 2019

The main target for nosave_opt was to not save the entire result so it could be saved by run in a single commit (with run record) in the superds. This way it was possible to catch what has changed inside of it as a result of addurls was ran. We do not have "split run" recording (i.e. this was the first commit for the run command, and this was the last one).
I guess it should be ok to commit (save) those created subdatasets (i.e. during create), but without reflecting that in the super dataset (i.e. not being invoked via dataset.rev_create("subds", ...)). If subdatasets had their individual "create" commits, should be ok and contribute to the commit in the super.
As for the top dataset - it would then loose information that it was created as a part of the addurls run.

From the other side - we do not demand config being saved to be used, so there should be no penalty from no saving during create, or is there some side-effect?

@kyleam
Copy link
Contributor Author

@kyleam kyleam commented Mar 22, 2019

the PR was #2302 which says

This pull request fixes #2280. It adds a --nosave flag to addurls so that it's possible to say datalad run 'datalad addurls ...'.

Ah, thanks for digging that up.

@mih
Copy link
Member

@mih mih commented Mar 23, 2019

The side effect would be that addurls is a (if not the) command that cannot use rev_create due to its particular demands, or would need to git reset afterwards. I dont think that is desired.

Having run be able to capture command effects that alter a dataset's history seems like a worthwhile thing. But it is not a problem that is to be solved in each and every command, but in run. @bpoldrack has faced this issue before in the context of datalad-hirni.

The general aspects of this issue could move to a dedicated issue for planning.

mih
mih approved these changes Mar 23, 2019
@mih mih merged commit 828a6c7 into datalad:master Mar 23, 2019
5 checks passed
@yarikoptic
Copy link
Member

@yarikoptic yarikoptic commented Mar 23, 2019

spoken by a true revolutionist @mih

Весь мир насилья мы разрушим
До основанья, а затем
Мы наш, мы новый мир построим, 

--- https://ru.wikipedia.org/wiki/Интернационал_(гимн)#Официальная_версия_гимна

But I agree -- there should be a more generic way to reflect information that a series of commits correspond to a single run command. Thus #3265

@kyleam kyleam deleted the addurls-create-save branch Apr 12, 2019
@kyleam kyleam mentioned this pull request May 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants