-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BF(DOC)+RF: report "present" for the state of a present/installed/fulfilled (sub)dataset #5655
Conversation
…filled dataset I think that docstring was left from old(er) times whenever `git submodule` was used to report the "state" of the subdatasets. After it was replaced with ad-hoc parsing of .gitmodules as far as I see it "state" no longer corresponds to any of those values besides "absent", and we simply do not report any state (as changed unittest shows) for "present" (a clear antonym to "absent") datasets. All those states are reported by "status" command though. E.g. compare: $> datalad -f '{type} {path}: {state} {status}' --report-type dataset status -d . | grep dataset dataset /tmp/testds3/mod-inited: clean ok dataset /tmp/testds3/not-committed: added ok dataset /tmp/testds3/not-mod-inited: clean ok dataset /tmp/testds3/openneuro: clean ok dataset /tmp/testds3/uninstalled: clean ok to (with these changes, without -- would say NA instead of "present" since there would be no "state") $> datalad -f '{type} {path}: {state} {status}' --report-type dataset subdatasets -d . | grep dataset dataset /tmp/testds3/mod-inited: present ok dataset /tmp/testds3/not-committed: present ok dataset /tmp/testds3/not-mod-inited: present ok dataset /tmp/testds3/openneuro: present ok dataset /tmp/testds3/uninstalled: absent ok to see that "status" does not care about "subdataset state" and reports it as "clean" if absent. To mitigate it, I have established "present" as a possible value for the subdataset's "state", thus overall making it a "bool" value as just a field with two possible states. ATM I do not see any other possible value we might want to report which would be complimentary (not a sub-category) to those two. This might establish a precedent for RFing of --fulfilled into --state={any,present,absent} for selection of subdatasets according to their state, making interface more coherent IMHO. But with that in mind I would still have consideration of the fact that "state" means different things depending on the command (subdatasets vs status) which might cause confusion/conflict here. But I think it is status="ok" (pun intended) as long as we document that it is "subdatasets state" and possibly use "--status-state" or just "--status" whenever we decide to harmonize options for that.
Codecov Report
@@ Coverage Diff @@
## master #5655 +/- ##
===========================================
- Coverage 90.35% 78.85% -11.50%
===========================================
Files 305 302 -3
Lines 41740 41705 -35
===========================================
- Hits 37713 32888 -4825
- Misses 4027 8817 +4790
Continue to review full report at Codecov.
|
failed travis run - conda connection fluke (hopefully), restarted |
well, if nobody objects, will merge monday. It seems to not break anything and makes returned records more consistent etc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think your reasoning is sound. Moreover, --state={any,present,absent}
feel natural to me.
I cannot say that I have a good picture of when and why this all "broke", but this seems like a reasonable way forward.
Thx!
PS: Please don't take this approval as a statement of confidence that this move would not break anything else -- however, nothing comes to my mind, right now.
awesome, thanks @mih! Let's proceed! |
I think that docstring was left from old(er) times whenever
git submodule
wasused to report the "state" of the subdatasets. After it was replaced with
ad-hoc parsing of .gitmodules as far as I see it "state" no longer corresponds
to any of those values besides "absent", and we simply do not report any state
(as changed unittest shows) for "present" (a clear antonym to "absent")
datasets. All those states are reported by "status" command though. E.g.
compare:
to (with these changes, without -- would say NA instead of "present" since
there would be no "state")
to see that "status" does not care about "subdataset state" and reports it as
"clean" if absent.
To mitigate it, I have established "present" as a possible value for the
subdataset's "state", thus overall making it a "bool" value as just a field
with two possible states. ATM I do not see any other possible value we might
want to report which would be complimentary (not a sub-category) to those two.
This might establish a precedent for RFing of --fulfilled into
--state={any,present,absent} for selection of subdatasets according to their
state, making interface more coherent IMHO. But with that in mind I would
still have consideration of the fact that "state" means different things
depending on the command (subdatasets vs status) which might cause
confusion/conflict here. But I think it is status="ok" (pun intended) as long
as we document that it is "subdatasets state" and possibly use "--status-state"
or just "--status" whenever we decide to harmonize options for that.
WDYT? if we decide to proceed with this one, then for
foreach
(#749) I would add--state={present,absent,any}
and possibly suggest a PR for--fulfilled
->--state={any,present,absent}
RFing/deprecation of fulfilled