Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RF: Change ConfigManager.get() behavior for multiple identical key specs #4924

Merged
merged 6 commits into from Sep 18, 2020

Conversation

mih
Copy link
Member

@mih mih commented Sep 16, 2020

Before this change, the values of all specifications of an identical key
were returned as a tuple. Now only the last value is returned, and
get_all=True has to be set to get the previous behavior.

Rationale: That a single value is returned is a sensible default
expectation. Valid use cases for multiple values are few, and require
custom handling in consuming code. The previous behavior would require
protection against the multi-value-return possibility all throughout
the code base. This switch is disruptive, but makes overall operation
more robust.

Fixes gh-2628
Fixes gh-4919

Before this change, the values of all specifications of an identical key
were returned as a tuple. Now only the last value is returned, and
`get_all=True` has to be set to get the previous behavior.

Rationale: That a single value is returned is a sensible default
expectation. Valid use cases for multiple values are few, and require
custom handling in consuming code. The previous behavior would require
protection against the multi-value-return possibility all throughout
the code base. This switch is disruptive, but makes overall operation
more robust.

Fixes dataladgh-2628
Fixes dataladgh-4919
@mih mih requested a review from kyleam September 16, 2020 13:21
With the last commit, config.get() now returns a single value by
default instead of a tuple in the case of multiple values for the same
key.  Adjust the config.get() callers that detect a tuple and pull out
the last value, as this is now what config.get() does by default.
These two methods are very similar, and in the next commit they are
going to start using the same helper.  Grouping them makes that a bit
more obvious.
The config get*() methods return a tuple if there are multiple values
for the same key (in the non-committed git configuration, if the key
is present there, or in the dataset configuration).  Very few get*()
callers in the code base account for the possibility of a tuple return
value.

With the first commit of this series, get() now returns the last value
of the tuple by default, following the behavior of `git config <key>`.
This eliminates the above problem for get() callers, but it doesn't
cover get_value() or the convenience methods that go through
get_value(): getint(), getfloat(), and getbool().

get_value() probably isn't worth adjusting; it isn't widely used and
it exists for historical reasons (compatibility with the now
irrelevant GitPython parser).  Aside from splitting the `key`
parameter into `section` and `option`, get_value() differs from get()
in one way: if `default` is None, a KeyError is raised rather than
returning None.

getbool(), on the other hand, is used a good amount in non-test code,
and it seems that all the callers expect the single value behavior of
`git config <key>`, as none account for the exception that would
happen in the tuple case (see dataladgh-2628).  Adjust getbool() to go
through get() for the tuple-to-single-value handling, while keeping
the 'default=None -> KeyError' handling for backward compatibility.
For completeness, do similar for getfloat() and getint(), though
combined these are only used in one non-test spot.

Re: datalad#4923
Fixes datalad#2628
Copy link
Contributor

@kyleam kyleam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rationale: That a single value is returned is a sensible default
expectation. Valid use cases for multiple values are few, and require
custom handling in consuming code. The previous behavior would require
protection against the multi-value-return possibility all throughout
the code base. This switch is disruptive, but makes overall operation
more robust.

That matches my view. I think the number of potential bugs/errors this squashes outweighs any disruption. And quickly checking the extensions, I didn't spot any callers that accounted for a possible tuple return value, though it'd be good to do a more thorough check. (Perhaps the -neuroimaging test failure is related, though I didn't look closely yet.)

I've pushed some additional changes (most notably, making getbool go through get). Of course feel free to drop anything you don't agree with.

The test failures on Travis seem to be in test_create_github.test_integration1_datalad_tester_org. I'm not sure if that's related.

@mih
Copy link
Member Author

mih commented Sep 17, 2020

Thanks for the additional changes @kyleam I'd prefer to keep them all. I will try to have a look at the test failures later today.

test_create_github.test_integration1_datalad_tester_org is also unhappy on appveyor. Seems to be causal... and replicates locally.

And indeed, we explicitly rely on multi-value returns for configured GitHub access tokens. Pushed a fix.

Previous implementation was suboptimal/wrong as it did not ensure
the "X in Y" test is happening with Y always being a tuple/list.

Now also uses the new ConfigManager.get(get_all=True) option to
explicitly request all values to be reported.

Adjusted a related test to trigger multi-value issues in the -core
tests, instead of solely relying on an associated -neuroimaging test
as an indicator.
@codecov
Copy link

codecov bot commented Sep 17, 2020

Codecov Report

Merging #4924 into master will increase coverage by 0.07%.
The diff coverage is 95.83%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4924      +/-   ##
==========================================
+ Coverage   89.77%   89.85%   +0.07%     
==========================================
  Files         290      289       -1     
  Lines       40708    40711       +3     
==========================================
+ Hits        36546    36581      +35     
+ Misses       4162     4130      -32     
Impacted Files Coverage Δ
datalad/core/local/run.py 99.09% <ø> (-0.01%) ⬇️
datalad/support/github_.py 78.65% <ø> (ø)
datalad/config.py 96.65% <95.00%> (-0.16%) ⬇️
datalad/core/local/tests/test_resulthooks.py 100.00% <100.00%> (ø)
datalad/interface/run_procedure.py 91.41% <100.00%> (+0.50%) ⬆️
datalad/tests/test_config.py 100.00% <100.00%> (ø)
datalad/core/local/create.py 93.05% <0.00%> (-1.39%) ⬇️
datalad/support/annexrepo.py 88.99% <0.00%> (-0.01%) ⬇️
datalad/distribution/clone.py
datalad/support/tests/test_gitrepo.py 99.77% <0.00%> (+<0.01%) ⬆️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c33bd4e...250be7e. Read the comment docs.

@kyleam
Copy link
Contributor

kyleam commented Sep 17, 2020

And indeed, we explicitly rely on multi-value returns for configured GitHub access tokens. Pushed a fix.

Thanks. The only failure now is the unrelated -container issue.

@bpoldrack Since we've discussed the single/multi-value config behavior in the past: does this direction look okay to you?

@bpoldrack
Copy link
Member

@bpoldrack Since we've discussed the single/multi-value config behavior in the past: does this direction look okay to you?

Yes, it does. In most cases only the "effective" config is relevant indeed and wherever we want to account for several we do know that this is spot where we need get_all. There is one piece still missing, I guess: We cannot have a "multi-config" that still allows to be overwritten by more specific config files (like: ~/.gitconfig defines A and B, but I want to overwrite in .git/config by C and D). However, this doesn't become harder by this PR, so - perfectly fine with it.

@mih
Copy link
Member Author

mih commented Sep 18, 2020

Ok, seems we're on the some page. Thx much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants