Switch/abandon ORA abstraction paradigm #30
Comments
I see two things to think about: 1.)
This comes with the implication that we can't have a local reconfig (which we recently introduced), since 2.)
I could not come up with a use case that would require a local reconfiguration. AFAIR all such scenarios lead to problems down the line. Most, if not all, consumption scenarios are fully addressed via datalad/datalad#5835 (stale), in which committing a local reconfiguration is not an issue.
I don't understand what you are saying. The current system can stay in place forever. If it works for people with its limitations, nothing needs to be done on their end. And there are no redirections needed.
I think we had a bunch of cases where one would want to have a local clone from a store that is also served over HTTP/SSH. Operations on such a local clone would ideally not go via HTTP/SSH. All the issues with that which I remember were that either we couldn't detect whether this was needed, or that the reconfiguration was committed. Both led to changes, and local reconfiguration in particular should address a lot of the trouble in that regard.
Yes, that's what I am saying. It needs to stay in some shape. But ideally it would share code with the new special remotes you aim for, rather than us having two implementations, I think. Hence, it seems to me that it would evolve into the very thing this approach tries to avoid. Anyway, that's not a fundamental objection. Maybe it helps getting there. However, the more special remote types there are (and are part of existing datasets), the more we need to maintain. If we figure out a way to have a proper RIA abstraction along the way that can be used with pretty much any (special) remote, that would be cool nevertheless.
Can you describe a concrete case where this is desired and that is not a plain consumption (read-only) case?
RIA store, which for consumption is set up to be served over HTTP. A dataset maintainer/curator is making updates to datasets in the store. For large data additions, a local clone is desired, b/c an awesome network allows quickly downloading lots of stuff into a local clone, then committing and pushing to the store locally, rather than downloading elsewhere and pushing over SSH. Am I making sense?
However, this business might be addressable by having yet another
I think the scenario you describe should be covered by the normal "ephemeral clone" setup, which can directly interface any store on the local machine. All file content is directly available. The only case not covered is a store that hosts file content in 7z archives. So taken together, it would not cover the use case of having to modify existing file content in a dataset, kept in a store with archive.7z, and push the modified content back to that store. That seems like a corner case. If there was an archive 7z before, there will likely have to be one after the update too. And if so, a push doesn't give that. Instead the archive file needs to be updated by a manual process outside the special remote universe.
I just came across the need to turn a provided special remote configuration into a working one (the configured URL was not accessible to me, but the location was accessible via another channel).
Has worked great. It promises to create a new special remote that is not shared and points to the original source. Worth noting that one can override
The ORA remote uses an internal IO abstraction that aims to make handling uniform across protocols (`file://`, `ssh://`, `http(s)://`) while everything is going through a single special remote implementation.

This sounds nice on paper, but creates a complex problem of supporting a uniform set of operations and using these exact same operations across all implementations. The present implementation fails to deliver on this promise.
I'd argue that a simpler system can be implemented that is more in line with the paradigm preferred by git-annex. Rather than having a single complex beast, let's have the individual pieces implemented properly (one protocol per implementation). Rather than supporting push/pull URL combinations in a single remote, let's use two in such cases (with `--sameas`), one for pull, and possibly another one, or none at all, for push. Rather than fiddling with the internal parameterization of a single special remote type, let's switch `externaltype=` when a reconfig is required.

This will make the code base simpler, easier to maintain, and most importantly enable 3rd-party extensions without having to touch -core code.
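To make the proposal more concrete, here is a minimal sketch of what the pull/push split could look like. It assumes hypothetical single-protocol special remote types named `ria-file`, `ria-ssh`, and `ria-http`, and a `url=` parameter on each of them; none of these names come from this issue or from an existing package. The helper functions are likewise made up for illustration. The sketch only assembles `git annex` argument lists (git-annex spells the option `--sameas`) and runs nothing.

```python
from urllib.parse import urlparse

# Hypothetical mapping from URL scheme to a single-protocol special remote
# type. The type names are placeholders for illustration only.
EXTERNALTYPE_BY_SCHEME = {
    "file": "ria-file",
    "ssh": "ria-ssh",
    "http": "ria-http",
    "https": "ria-http",
}


def externaltype_for(url):
    """Pick the one special remote type that handles the protocol of `url`."""
    scheme = urlparse(url).scheme
    if scheme not in EXTERNALTYPE_BY_SCHEME:
        raise ValueError(f"no special remote type for scheme {scheme!r}")
    return EXTERNALTYPE_BY_SCHEME[scheme]


def initremote_args(name, url, sameas=None):
    """Build (but do not run) a `git annex initremote` argument list."""
    args = ["git", "annex", "initremote", name]
    if sameas is not None:
        # Declare that this remote shares its storage with an existing one.
        args.append(f"--sameas={sameas}")
    args += ["type=external", f"externaltype={externaltype_for(url)}"]
    if sameas is None:
        # Assumption: with --sameas the encryption setting comes from the
        # existing remote, so it is only given for the first remote.
        args.append("encryption=none")
    # Assumption: each hypothetical remote type takes a `url=` parameter.
    args.append(f"url={url}")
    return args


def reconfig_args(name, new_url):
    """Build an `enableremote` call that switches `externaltype=` for `name`."""
    return ["git", "annex", "enableremote", name,
            f"externaltype={externaltype_for(new_url)}", f"url={new_url}"]


if __name__ == "__main__":
    # One remote for pull over HTTPS, a second one for push over SSH.
    print(" ".join(initremote_args("store-pull", "https://example.com/store")))
    print(" ".join(initremote_args("store-push", "ssh://example.com/store",
                                   sameas="store-pull")))
    # Reconfigure the pull remote to use a path on the local machine.
    print(" ".join(reconfig_args("store-pull", "file:///data/store")))
```

Because each remote type handles exactly one protocol, a reconfiguration reduces to choosing a different type for the same storage, and a 3rd-party extension would only have to register a new scheme-to-type entry.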