Limit recursive install on a per-subdataset basis #1598
If set to 'skip', the respective subdataset is skipped when DataLad
is recursively installing its superdataset. However, the subdataset
remains installable when explicitly requested, and no other features
are impaired.
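To make the documented behavior concrete, here is a minimal sketch of the selection rule, not DataLad's actual implementation. It assumes subdataset query results are plain dicts that carry the property under the key `gitmodule_datalad-recursiveinstall` (the prefix mentioned in the docs); the function name `select_for_recursive_install` and the `explicit_paths` parameter are hypothetical.

```python
def select_for_recursive_install(subdatasets, explicit_paths=()):
    """Return the paths a recursive install would visit.

    `subdatasets` is assumed to be a list of dicts, each with a "path" key
    and optionally a "gitmodule_datalad-recursiveinstall" key.
    `explicit_paths` (hypothetical) lists subdatasets the user asked for by name.
    """
    selected = []
    for sub in subdatasets:
        marked_skip = sub.get("gitmodule_datalad-recursiveinstall") == "skip"
        # A 'skip' marker only affects recursion; an explicit request wins.
        if marked_skip and sub["path"] not in explicit_paths:
            continue
        selected.append(sub["path"])
    return selected


subs = [
    {"path": "inputs/raw", "gitmodule_datalad-recursiveinstall": "skip"},
    {"path": "code"},
]
print(select_for_recursive_install(subs))
print(select_for_recursive_install(subs, explicit_paths=["inputs/raw"]))
```

The first call leaves out the marked subdataset; the second includes it because it was requested explicitly, which is the "remains installable when explicitly requested" part of the description.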
But what if I do want to get the full tree/hierarchy? Since it must stop at some point, how would I do it?
If that is an actual use case it would need a switch.
I wonder if we had better just record the dataset UUID and have a configurable per-install strategy for what to do with datasets that were already installed elsewhere (above the current dataset). With all the new ways of handling args, could we also pass such information inside?
My point is that I am not sure this is the right place to decide... Sure, we could add more switches and knobs as we discover more.
WOW, replicated the
So you are saying that anyone that installs /// should receive a list of
uuids that they should supply to prevent 7 levels of redundant datasets?
Maybe not...
…On Jun 21, 2017 18:27, "Yaroslav Halchenko" ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In datalad/distribution/subdatasets.py
<#1598 (comment)>:
> - notably slower (performs one call to Git per dataset versus a single call
- for all combined).
+ Performance note: Property modification, requesting `bottomup` reporting
+ order, or a particular numerical `recursion_limit` implies an internal
+ switch to an alternative query implementation for recursive query that is
+ more flexible, but also notably slower (performs one call to Git per
+ dataset versus a single call for all combined).
+
+ The following properties for subdatasets are recognized by DataLad
+ (without the 'gitmodule_' prefix that is used in the query results):
+
+ "datalad-recursiveinstall"
+ If set to 'skip', the respective subdataset is skipped when DataLad
+ is recursively installing its superdataset. However, the subdataset
+ remains installable when explicitly requested, and no other features
+ are impaired.
Just to clarify: without a switch for a maintainer to indicate that a subdataset is not for install, we will have a problem once studyforrest is part of ///, simply because some subdatasets will never be available, hence
No. When you install the top level, you know the UUIDs of the datasets at that level. When you install a subdataset recursively, you can know which datasets are available at the levels above.
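As a rough illustration of this UUID idea (a sketch under stated assumptions, not anything that exists in DataLad): the planner below walks a nested description of datasets and skips any subdataset whose UUID was already seen at a level above. The dict layout (`path`, `uuid`, `subdatasets`), the function name `plan_recursive_install`, and the `on_duplicate` switch are all hypothetical.

```python
def plan_recursive_install(dataset, known_above=frozenset(), on_duplicate="skip"):
    """Return subdataset paths to install, depth-first.

    `dataset` is assumed to be a dict {"path", "uuid", "subdatasets": [...]}.
    `known_above` holds UUIDs of datasets visible at the levels above.
    """
    plan = []
    children = dataset.get("subdatasets", [])
    # Everything visible at this level is "known" to the levels below.
    known_here = known_above | {dataset["uuid"]} | {c["uuid"] for c in children}
    for child in children:
        if child["uuid"] in known_above and on_duplicate == "skip":
            continue  # already available higher up in the hierarchy
        plan.append(child["path"])
        plan.extend(plan_recursive_install(child, known_here, on_duplicate))
    return plan


top = {
    "path": ".", "uuid": "top",
    "subdatasets": [
        {"path": "A", "uuid": "a",
         "subdatasets": [{"path": "A/B", "uuid": "b", "subdatasets": []}]},
        {"path": "B", "uuid": "b", "subdatasets": []},
    ],
}
print(plan_recursive_install(top))
print(plan_recursive_install(top, on_duplicate="install"))
```

Note that this only addresses redundant copies inside a bigger collection; it does nothing for a subdataset that can never be obtained at all.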
It does not help in the situation that I outlined: such a dataset will never be installed and will always cause a failure.
Could you please point to that "outline"? ;-)
5cm up: #1598 (comment)
Ah, so for some internal subdatasets, not to be shared (yet), etc. Yeah, those need an explicit marker. I thought it was also to be used for marking the ones included in multiple places within the bigger collection.
In studyforrest I use that same marker for both. There is simply no point in having them installed on
That being said, the marker could have a different name. For example, it would be nice to be able to say "uninstall all input datasets --recursive".
Codecov Report
@@ Coverage Diff @@
## master #1598 +/- ##
===========================================
- Coverage 85.54% 49.79% -35.75%
===========================================
Files 260 254 -6
Lines 29848 27750 -2098
===========================================
- Hits 25532 13817 -11715
- Misses 4316 13933 +9617
So there you have it. This PR has virtually no new code that actually runs, yet it still segfaults. My point: the cause is not in the PRs; it is already in master, and we are sampling random variations that finally make it go KABOOM. I am out of ideas. In any case, holding off on merging PRs because of this segfault seems counterproductive.
I will look at the segfault as soon as I find an Ethernet port/internet for the laptop.
28766dd to d0c5eb6
Docs are inside.
So now we could install studyforrest? :-)
No, functionality is not in master...
…On Jul 7, 2017 03:00, "Yaroslav Halchenko" ***@***.***> wrote:
So now we could install study Forrest? :-)
Docs are in the diff.
Data dependency datasets in the studyforrest collection need to be marked up with this before they come on board.