-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clear repo content type multi instance #40
Clear repo content type multi instance #40
Conversation
I see why the static checks failed. The analysis score regressed, and it's due to the addition of this warning from
Stylistically, I would prefer the allowed-in-Python3 way of being able to write the init method def __init__(self, *args, split_on=",", **kwargs) which is a syntax error in Python 2. However this way def __init__(self, split_on=",", *args, **kwargs) caused this warning in static analysis. I suppose I could forgo a keyword argument, but I'd like to be able to keep it to separate it from the keyword args of the argparse API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The analysis score regressed
FYI the outcome of the test isn't based on whether the score regressed, it's strictly based on whether there were any complaints of severity fatal, error or warning.
The pylint complaint is valid, isn't it? There wouldn't be a way to pass split_on
as a keyword argument and also pass more positional arguments?
My suggestion would be one of:
(1) Extract split_on from kwargs
def __init__(self, *args, **kwargs):
split_on = kwargs.pop('split_on', ',')
Or, (2) don't implement any split_on argument, just support ','. Other characters aren't needed to meet the requirements. They may never be needed; if they later are, maybe we've managed to drop python2 by that time so we'd be able to use py3-only syntax freely.
Anyway, it needs to be made to pass somehow.
A separate issue - maybe you were already planning to remove this, but can we please not have commits like "Trigger update on upstream pull request due to GitHub outage" in PRs? You should be able to get the same effect by just a "git commit --amend" with no content. Maybe adding "--reset-author" (which also resets timestamp) if the former doesn't seem to do the trick.
Add an argparse Action subclass to help accumulate multiple instances of a command-line option in to a list.
abb3051
to
d3ec452
Compare
I see now that you are correct. I had experimented with what I thought was a recreation of this in a 2.7 interpreter and I was able to call it and print all the arguments, but now I'm questioning my experiment. I have fixed this in the latest commit by taking your first suggestion above using
Thank you for the suggestion! I will use this approach in the future rather than |
pubtools/_pulp/arguments.py
Outdated
# calling filter(None, Iterable) is equivalent to | ||
# filter(lambda item: bool(item), Iterable) | ||
# i.e. yields items which evaluate to True. | ||
split = list(filter(None, split)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we do this filtering?
# (1)
"value1," => ["value1"]
"a,,b" => ["a", "b"]
# (2)
"value1," => ["value1", ""]
"a,,b" => ["a", "", "b"]
As I understand it, this implements (1). Is there a reason the behavior in (1) is preferred over (2)?
I feel it's the other way around - (1) tries to guess what the caller probably meant, which is dangerous as the guess can be wrong. (2) accurately maintains the data passed in by the caller, who may have had a valid reason for passing an empty string. Generic code like this can't know that "silently ignore empty arguments" is the right thing to do for all use-cases.
If this was inspired by the [x for x in type_strs if x] or None
in content_type which is being replaced - I think that code was just a way of making it return None if no argument was provided, it wasn't a goal to filter out empty strings passed by the caller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct this filtering implements (1) in your snippet. I noted in a comment within __call__
that the intent is to remove empty strings introduced by something like --content-type rpm,
as you demonstrated. The motivation is to prevent ""
from appearing in the list of content types.
While this code was not inspired by [x for x in type_strs if x] or None
, it does have the same effect as the first operand in that expression. So with [x for x in type_strs if x] or None
being replaced, removing the filter
statement would introduce the possibility of empty strings in the list of content types going forward.
You make a good point that this code can't know that filtering empty arguments is the right thing to do in all cases, so perhaps removing the filtering is the correct thing to do overall. I just want to make it known that this will be new behavior.
Perhaps it makes more sense that if ""
needs to be filtered out of the list of content types then that should be implemented in pubtools-pulplib
by the Repository or Client for operations where that is the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps removing the filtering is the correct thing to do overall
Yeah, can we please do that? I'm concerned the filtering will have unintended consequences in the long term.
removing the filter statement would introduce the possibility of empty strings in the list of content types going forward
I'm not concerned about that - I'm pretty certain that, in the previous code:
type_strs = (self.args.content_type or "").split(",")
return [x for x in type_strs if x] or None
The filtering of empty values was just a shortcut to deal with the fact that the default value calculated by the first line would end up as [""]
rather than []
. If callers actually pass empty strings through here, I'd say it's acceptable to crash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filtering empty strings has been removed and the test cases updated to reflect this.
tests/shared/test_arguments.py
Outdated
def test_split_and_extend_single_delimited_instance(): | ||
"""Test a single option instance, with all values in a delimited list.""" | ||
# test one instance, all delimited using different delimiters | ||
for delimiter in ",.-/": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also consider using pytest.fixture or pytest.mark.parametrize for delimiter here.
The pytest.fixture approach might not be obvious... it'd look something like:
@pytest.fixture(params=[',', '.', '-', '/'])
def any_delimiter(request):
return request.param
Then any test which accepts 'any_delimiter' as an argument will be invoked once for each possible value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is much cleaner, great suggestion!
tests/shared/test_arguments.py
Outdated
assert args.option == FAKE_OPTION_VALUES | ||
|
||
|
||
def test_split_and_extend_multiple_instances_with_trailing_delimiters(parser): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The filtering of empty arguments seems like it'll affect more than just trailing delimiters, so it'd be nice to see tests for empty values appearing elsewhere as well, e.g. ",a,b" , "a,,b" .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and updated this test to use a variety of cases that would produce empty values in different places and quantities. I am happy to make separate test functions for each instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, if it were me I'd go the opposite direction and have more of these tests combined together and parametrized, such as:
@pytest.mark.parametrize("argv,expected_option", [
(["--option", "x"], ["x"]),
(["--option", "x,y"], ["x", "y"]),
(["--option", "x", "--option", "y"], ["x", "y"]),
(["--option", "x", "--option", "y,z"], ["x", "y", "z"]),
(["--option", "x,y", "--option", "a,b"], ["x", "y", "a", "b"]),
(["--option", "a,,b", "--option", "c,"], ["a", "", "", "b", "c", ""]),
])
def test_split_extend(argv, expected_option, parser, delimiter):
parser.add_argument("--option", type=str, action=SplitAndExtend, split_on=delimiter)
sys.argv = ["command"] + argv
args = parser.parse_args()
assert args.option == expected_option
i.e. there's just a mapping between input values and expected output values, and the test implementation is the same for all cases (you might only need different test functions if you have e.g. some crashing cases or something else where it doesn't work normally).
I find this easier to read, more maintainable and I think it can achieve greater coverage with less test code needed. This is all quite subjective though and I'm just happy if the relevant cases are tested at all, regardless of the style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you, so I went ahead and updated the test code to make use of this.
I started down the path of adding a dependency on delimiter
so that cases 1, 3, 4, and 5 in your example would use the given delimiter instead of ","
e.g. a,,b
, a--b
, a//b
, and a..b
. The code started getting messy, so I elected for a second simple test function to test that SplitAndExtend
can handle different delimiters for a basic case.
18cbc32
to
f16e5fa
Compare
@@ -73,6 +73,8 @@ def add_args(self): | |||
"--content-type", | |||
help="remove only content of these comma-separated type(s)", | |||
type=str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't notice this until after submit - I wonder does the 'type' argument do anything now?
If it has any effect now, it's probably meant to be list
. But maybe it just does nothing once you start using action
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like using type=str
isnt needed since values are read in as strings by default, but I don't mind it. It doesn't hurt to be explicit about the type of the parsed values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the point is that parsed.content_type used to return a str, so that was consistent with type, but now it returns a list, so that's inconsistent with type.
Is it correct that type now means "type of elements within list"? Docs don't say anything about how type
and action
work together as far as I can tell.
If you find for example that setting type=int doesn't cause any tests to fail, I'd take that as evidence that the argument has no effect now and in that case I'd be in favor of removing it, because I think it just misleads the reader into thinking that it has some effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I see what you mean. ClearRepo.content_type
returns a list
or None
as it did before, but ClearRepo.args.content_type
now returns a list
instead of a comma-separated str
.
I did not check for access to ClearRepo.args.content_type
directly, but no tests failed with that change so we might be safe.
I'd be a fan of adding an additional property to the ClearRepo
API to return the comma-separated string if it's a common need. But I can also understand the case to remove the type
argument all together for the reason you mentioned.
Overview
Allow the
--content-type
option toclear-repo
to be specified multiple times. Each instance of--content-type
still conforms to the current behavior of being a comma-separated list of content types.Approach
Add a new argparse.Action subclass called
SplitAndExtend
that can be re-used for any situation like this and allows for delimiters other than","
to be used (though","
is the default). This Action class is now used inClearRepo#add_args
to configure the parser to accept multiple instances of--content-type
The
ClearRepo#content_type
property accessor has been updated so that it no longer has to do string splitting and instead just returns the accumulated list parsed from the command-line.Multiple new test cases are added to cover some different situations that could arise when using the new Action, especially if the options list is being built programmatically.
One additional test case is added specifically for the
clear-repo
command to ensure it behaves as expected with multiple content types.Closes #25.