Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: -d^. to point to the top of the current dataset #3242

Merged
merged 2 commits into from Dec 13, 2019

Conversation

yarikoptic
Copy link
Member

@yarikoptic yarikoptic commented Mar 19, 2019

It is often desired to reference current dataset. E.g. to add an existing
dataset as subdataset to the current dataset somewhere not at the top of the
dataset directories tree. Especially in quite nested hierarchies it
becomes a burden to figure out where is the boundary of the dataset to specify
it in -d option. -d . would point to the (not installed, and possibly illegit
see #3211) subdataset in current directory, whenever actual need is to specify
smth like -d ../../.. .

With -d^. it becomes possible to say "refer to the current dataset I am in".

The only gotcha I am aware of is that shell might like to use "^" as
a quick substitution for previous command, that is why I specified it
as -d^., i.e. without space - then seems to work as intended -- running
"datalad add -d^. ./subds" would add subds in current subdirectory to the
dataset without requiring full path in -d.

Depending on the outcome of #3230 discussion, this change might not be really
needed, e.g. if default context for operations (including on subdatasets) would
migrate to be of "current dataset". Otherwise, if -d would be needed to "bind"
the context to current dataset - at least with this helper it would be easy to
unambigously reference current dataset regardless of the position in the
directories hierarchy.

If you have a better suggestion instead of ^. - please suggest. My choice was because we have ^ to point to supermost dataset, thus use it to point to the (very) top. So it kinda made sense to me to point to the top of dataset curdir belongs to with ^., and have it on the left because that target path is to the left of curdir (although .^ probably would have been better for use in shell due to aforementioned gotcha)

Instructions

  • Delete these instructions. :-)

@mih
Copy link
Member

@mih mih commented Mar 19, 2019

FTR: I think we should be reluctant to add general meaning to argument values that are only needed to shoehorn particular behavior into the existing machinery. This issue is a flavor of "the shell is too dumb". In Python code this should be rather unnecessary (one would use proper Dataset instances). Maybe it is worth implementing such argument transformation in the initial cmdline API layer, and not further complicate path interpretation globally.

Given that the example uses add, I would prefer to discuss this in the context of rev-save which doesn't have many of add's issues and also consider my proposal in #3230 (comment) for any command to stop behaving differently with or without -d given -- which would make the entire issue go away, without adding more decision making logic.

@codecov
Copy link

@codecov codecov bot commented Mar 19, 2019

Codecov Report

Merging #3242 into 0.11.x will increase coverage by 9.66%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           0.11.x   #3242      +/-   ##
=========================================
+ Coverage   81.14%   90.8%   +9.66%     
=========================================
  Files         256     252       -4     
  Lines       34050   33018    -1032     
=========================================
+ Hits        27629   29982    +2353     
+ Misses       6421    3036    -3385
Impacted Files Coverage Δ
datalad/distribution/tests/test_dataset.py 100% <100%> (ø) ⬆️
datalad/distribution/dataset.py 94.27% <100%> (+2.34%) ⬆️
datalad/distribution/create_sibling_github.py 45.45% <0%> (-38.16%) ⬇️
datalad/distribution/tests/test_create_github.py 93.02% <0%> (-2.93%) ⬇️
datalad/plugin/tests/test_plugins.py 89.15% <0%> (-2.67%) ⬇️
datalad/support/tests/test_sshrun.py 96% <0%> (-1.78%) ⬇️
datalad/downloaders/tests/test_http.py 91.08% <0%> (-1.3%) ⬇️
datalad/distribution/tests/test_create_sibling.py 98.61% <0%> (-0.14%) ⬇️
datalad/tests/test_utils.py 96.39% <0%> (-0.13%) ⬇️
datalad/ui/tests/test_dialog.py 98.79% <0%> (-0.11%) ⬇️
... and 118 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fd38a1b...a543e07. Read the comment docs.

@yarikoptic
Copy link
Member Author

@yarikoptic yarikoptic commented Mar 19, 2019

Maybe it is worth implementing such argument transformation in the initial cmdline API layer

It could be done there but I do not see a real need to break concurrency between python and cmdline APIs for this.
Re #3230 - indeed, if we make commands behave similar with and without -d such shortcut might not be needed (as I noted in the PR description), so I am ok for this PR to linger around until we have a clear(er) picture.

@yarikoptic yarikoptic added the on hold - input required label Mar 19, 2019
@kyleam
Copy link
Contributor

@kyleam kyleam commented Mar 19, 2019

It makes sense to see how things settle, but

Maybe it is worth implementing such argument transformation in the initial cmdline API layer

It could be done there but I do not see a real need to break concurrency between python and cmdline APIs for this.

... from the standpoint of someone inspecting the code, I like that in this patch ^. is handled in the same location as ///.

@driusan
Copy link
Collaborator

@driusan driusan commented Apr 1, 2019

I think the usage of "^" is likely to cause some confusion, since it already has meaning to git and the usage here is context sensitive. With this change places referring to datasets it means one thing, while places specifying git revisions it means something completely unrelated.

@kyleam
Copy link
Contributor

@kyleam kyleam commented Apr 1, 2019

I think the usage of "^" is likely to cause some confusion, since it already has meaning to git and the usage here is context sensitive

Yeah, I could see that, though ^ (as opposed to this PR's ^.) has been around since 1c62887 (ENH: Dataset understands now ^ and /// aliases, 2016-09-30).

FWIW: #1024 proposed extending -d^ and tried to make a parallel with git's revision modifiers.

@yarikoptic yarikoptic added this to the 0.11.x milestone Apr 15, 2019
@mih
Copy link
Member

@mih mih commented Jun 25, 2019

Ok, I am with @kyleam now. Good to go.

It would be very good, though, to have these symbols and behavior documented somewhere. Thanks in advance!

@mih
Copy link
Member

@mih mih commented Oct 20, 2019

What shall we do with this one?

@mih
Copy link
Member

@mih mih commented Dec 9, 2019

Ping!

@mih
Copy link
Member

@mih mih commented Dec 9, 2019

Shall I change the base and merge this one into master?

yarikoptic added 2 commits Dec 10, 2019
It is often desired to reference current dataset. E.g. to add an existing
dataset as subdataset to the current dataset somewhere not at the top of the
dataset directories tree. Especially in quite nested hierarchies it
becomes a burden to figure out where is the boundary of the dataset to specify
it in -d option.  -d .  would point to the (not installed, and possibly illegit
see datalad#3211) subdataset in current directory, whenever actual need is to specify
smth like -d ../../..  .

With -d^. it becomes possible to say "refer to the current dataset I am in".

The only gotcha I am aware of is that shell might like to use "^" as
a quick substitution for previous command, that is why I specified it
as -d^., i.e. without space - then seems to work as intended -- running
"datalad add -d^. ./subds"  would add subds in current subdirectory to the
dataset without requiring full path in -d.

Depending on the outcome of datalad#3230 discussion, this change might not be really
needed, e.g. if default context for operations (including on subdatasets) would
migrate to be of "current dataset".  Otherwise, if -d would be needed to "bind"
the context to current dataset - at least with this helper it would be easy to
unambigously reference current dataset regardless of the position in the
directories hierachy.
@yarikoptic yarikoptic changed the base branch from 0.11.x to master Dec 10, 2019
@yarikoptic
Copy link
Member Author

@yarikoptic yarikoptic commented Dec 10, 2019

Without agreeing on smth like 1. of #3905 I guess we should indeed have something like this. I have rebased and added minimal doc mentioning. Note: Since --dataset is command specific, duplicated across all the commands, we do not mention /// or ^ anywere "in the API docs", so the same remains for ^.

kyleam added a commit that referenced this issue Dec 13, 2019
@kyleam kyleam merged commit a543e07 into datalad:master Dec 13, 2019
14 of 15 checks passed
@kyleam
Copy link
Contributor

@kyleam kyleam commented Dec 13, 2019

@yarikoptic Thanks for the updates.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
on hold - input required
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants