-
Notifications
You must be signed in to change notification settings - Fork 110
Non-installed Dataset
instance is using a separate, but non-global config manager
#7299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Here is another side to the same problem. I now think I understood that the situation that triggered this issue led to two different config managers to be used by two different commands. The problem could be paved over by making these two particular commands behavior more uniformly, and thereby end up using the exact same instance. However, as a general setup, this seems undesirable as a minimum requirement. What I sketched about still makes sense to me to do. However, it does not address a related problem: Presume a GLOBAL config is modified via a DATASET-bound However, there is only one global instance. When any Such an operation can be triggered bluntly in |
When called with `dataset=None`, `credentials()` would auto-locate a dataset containing CWD (and use its configuration), whereas `download()` would use the global configuration in this case. This change makes the behavior uniform and chooses what `download()` did as the model behavior. This uniformatization is meaningful on its own. However, due to datalad/datalad#7299 it is also a necessity, because otherwise credentials deposited by `download()` remain invisible to `credentials()` -- which is the superficial trigger for datalad#249 Closes datalad#249
Forcing reload of global makes sense to me as a workaround. Workaround because in principle global, when queried, should detect global config file modification and reload automagically. But our magic overall is a bit flawed here AFAIK, and #5824 was previous general concern on that. |
Also ping datalad#7300 re diversity of exception behavior for one and the same issue.
This has seemlingly no purpose or use case. It only complicates the synchronization between, here needlessly, many manager instances. This addressed half of datalad#7299
...whenever the 'global' configuration scope was modified. Theoretically, this should also cover a 'system' scope, but presently `ConfigManager` does not recognize or support that. Closes datalad#7299
My my concern with update-on-read is that reads are FAR more frequent than writes. Updates are also expensive. When there is an updated-on-read expectation, it would be sensible to have that expectation cover any sources of configuration. Hence a modification check of the global config file is insufficient. The original reason for not doing that to start with was that it would require per filesystem implementation to get that right, and it might still be limited to the time resolution supported by each particular file system. |
Also ping datalad#7300 re diversity of exception behavior for one and the same issue.
This has seemlingly no purpose or use case. It only complicates the synchronization between, here needlessly, many manager instances. This addressed half of datalad#7299
...whenever the 'global' configuration scope was modified. Theoretically, this should also cover a 'system' scope, but presently `ConfigManager` does not recognize or support that. Closes datalad#7299
Issue fixed in |
* origin/maint: (47 commits) Set minimal version of typing_extensions to be 3.10.0.0 Fix Providers caching of config files [skip ci] Update docs/source/changelog.rst [skip ci] Update CHANGELOG Fix spelling pythonic object comparison Fix git identity test [release-action] Autogenerate changelog snippet for PR 7301 Reload global ConfigManager via any instance Stop returning a custom ConfigManager with Dataset[uninstalled].config Document datalad#7299 [release-action] Autogenerate changelog snippet for PR 7353 [release-action] Autogenerate changelog snippet for PR 7355 BF: save - force git based on particular repo having uuid, not the reference ds [release-action] Autogenerate changelog snippet for PR 7342 TST(BK): extend test to demonstrate datalad#7351 Ensure that GitRepo initiated repo remains without git-annex upon Dataset.create Fix test for incompatible annex and rsync versions Do use/pass repocls into get_convoluted_situation Refine type of `content` argument to `SignalingThread.signal()` ... Conflicts: setup.py -- tqdm dependency was versioned in master due to deprecations
Chasing datalad/datalad-next#249 led me to realize that calling
Dataset().config
for any Dataset that does not actually exist yet locally, the returned instance is NOT the same asdatalad.cfg
.It is a dedicated instance for a
Dataset
instance, which is parameterized like the global one, but still a separate instance.I think this does not make sense, and the docs provide no counter arguments.
They say:
There is a lot that is not persistent about this. For example, I can add overrides to this "persistent" configuration, but these overrides evaporate, as soon as a repo materializes for this
Dataset
.Moreover, calling
set()
on this separateConfigManager
instance will not trigger the update routines in the global manager. Even worse, theConfigManager
instance is bound to aDataset
instance and will persist, as long as that parent instance persists. Given that DataLad implements a Flyweight pattern for datasets, this means that such configuration containers will survive for a surprisingly long time. Commands would "rediscover" a dataset instance for a particular path, and if there continues to be no repo, manipulations done via the global config manager will NOT be reflected in this separate instance -- even if the config items come fromsystem
orglobal
scope!I think this is rather undesirable complexity. I cannot come up with a reason why
Dataset().config
for an uninstalled dataset should not simply returndatalad.cfg
.The text was updated successfully, but these errors were encountered: