-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEATURE] Argument Parser: new member function that checks if an option was set. #1859
Conversation
31297e0
to
cc66b9a
Compare
Codecov Report
@@ Coverage Diff @@
## master #1859 +/- ##
==========================================
+ Coverage 98.08% 98.09% +0.01%
==========================================
Files 262 262
Lines 10679 10744 +65
==========================================
+ Hits 10474 10539 +65
Misses 205 205
Continue to review full report at Codecov.
|
Resolution 24.06.2020We have a preference for the member function design. |
cc66b9a
to
47c1534
Compare
auto end = std::find(cmd_arguments.begin(), cmd_arguments.end(), "--"); | ||
return detail::format_parse::find_option_id(cmd_arguments.begin(), end, id) != end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having trouble understanding what you are doing.
- Why search for
--
when you are looking at short ids? - Wouldn't you want to use
std::find_end
since you surely don't wantend
to point to the first occurrence of--
? - Why don't you just pass begin and end of cmd_arguments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed that you answered yesterday under my month old comment, maybe you can add the reasoning for the --
as a comment, because it is really confusing.
Also, where does the --
come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No this is something different.
I actually do want to search for the first occurrence of "--". Note: Not as a prefix (or other -fix) of another word but ONLY "--" because it signals the end of options and marks the beginning of arguments for positional options. It is special in its meaning. And since I am searching for option names I must only search before the arguments for positional options.
Was that explained in an understandable way? Otherwise I'll try again.
--
is a POSIX convention to singal that there are no more option names after this. That way the user can use dash in arguments without confusing the argument parser. This done like this in most argument parsers.
So to your questions: I explicitly do NOT want to search the whole command line but only that part where option names can be found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you add the --
somewhere or is it magically there?
Or is this user input that may be there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is user input on the command line that may be there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And while looking at it. You should make --
a static constexpr variable somewhere in the class's scope such that it also gets a proper name: end_of_options_marker
or something like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some more thoughts about that. If I understand this correctly, then we have to call parse before we can ask whether an option was set by the user. So should it not throw in the case parse wasn't called before? I think we have a variable that caches this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other thing I'd like to reconsider from the implementation is the duplication of this code.
First, I observed that an extra copy is made to store the parsed arguments in the argument parser.
Second, the arguments are only necessary if format_parse was called so also this option can only be available if format_parse was called.
So the code should be refactored that I can ask for the set option in the format_parse class. Calling it for any other parsing option can then throw, right? So since all formats inherit from a base class this can be achieved easily. Then only the format_parse class overloads this and inside of the argument_parser you call std::visit on the selected format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some more thoughts about that. If I understand this correctly, then we have to call parse before we can ask whether an option was set by the user. So should it not throw in the case parse wasn't called before? I think we have a variable that caches this?
No, we do not have to call parse before checking this. It is independent. (see test case)
This is the case because we have the list argv
already on construction with all the information we need.
We can of course enforce this restriction anyway if we think it is confusing for the user or more intuitive 🤷.
The other thing I'd like to reconsider from the implementation is the duplication of this code.
Which duplication? Do you mean the extra argv
copy?
First, I observed that an extra copy is made to store the parsed arguments in the argument parser.
Yes. We now have to copies of argv
. One in the argument_parser
and one in format_parse
.
But this is needed either way, even if I move this list into format_parse
because format_parse is stripping it's copy of argv
of arguments while parsing the information s.t. it can afterwards detect "too many" or "unknown option identifiers". So in order to enable is_option_set
I need to store the original argv
in some place anyway.
Second, the arguments are only necessary if format_parse was called so also this option can only be available if format_parse was called.
Technically no, see above.
So the code should be refactored that I can ask for the set option in the format_parse class. Calling it for any other parsing option can then throw, right? So since all formats inherit from a base class this can be achieved easily. Then only the format_parse class overloads this and inside of the argument_parser you call std::visit on the selected format.
IMHO I think that is_option_set
is not strongly attached to format_parse
. It's a feature of the argument parser that doesn't involve actual parsing of argument values but only checking if an id can be found. Is this such a strong design decision, whether we call a static function of format_parse or use std::visit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we do not have to call parse before checking this. It is independent. (see test case)
IMHO I think that is_option_set is not strongly attached to format_parse. It's a feature of the argument parser that doesn't involve actual parsing of argument values but only checking if an id can be found. Is this such a strong design decision, whether we call a static function of format_parse or use std::visit?
But the design doesn't add up in my opinion. If I understand it correctly, then right now the API requires to call parse in order to validate and set the parameters with the options from the command line or to print the help page. So the design deliberately forces us to call parse after we added the options to the parser. Everything is deferred. Accordingly, in the proper design the check if an option was set should be put after the parse method was called. The reason is that the interface should only work for valid options and throw if you try to ask for an option that wasn't added to the parser. But this can only be known after the option was added, and parse functions here as the terminator. So that's why I think that the current solution does not reflect the design of the argument parser.
I hope I have answered all your questions. If I missed a requested change please tell me again :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small thing for the changelog, otherwise lgtm
11669f1
to
4d76bdc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I have some general considerations about the implementation of this feature. Please find my comments below.
auto end = std::find(cmd_arguments.begin(), cmd_arguments.end(), "--"); | ||
return detail::format_parse::find_option_id(cmd_arguments.begin(), end, id) != end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should definitely have a documentation entry that points to the POSIX conventions.
And second, it would be good to annotate the intention on the line above with a comment. This avoids confusion later on, because from the code it is not immediately clear why you are looking for --
.
auto end = std::find(cmd_arguments.begin(), cmd_arguments.end(), "--"); | ||
return detail::format_parse::find_option_id(cmd_arguments.begin(), end, id) != end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the second though, this should also throw an exception if the id has a dash, since you explicitly requiring it not to.
auto end = std::find(cmd_arguments.begin(), cmd_arguments.end(), "--"); | ||
return detail::format_parse::find_option_id(cmd_arguments.begin(), end, id) != end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some more thoughts about that. If I understand this correctly, then we have to call parse before we can ask whether an option was set by the user. So should it not throw in the case parse wasn't called before? I think we have a variable that caches this?
auto end = std::find(cmd_arguments.begin(), cmd_arguments.end(), "--"); | ||
return detail::format_parse::find_option_id(cmd_arguments.begin(), end, id) != end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other thing I'd like to reconsider from the implementation is the duplication of this code.
First, I observed that an extra copy is made to store the parsed arguments in the argument parser.
Second, the arguments are only necessary if format_parse was called so also this option can only be available if format_parse was called.
So the code should be refactored that I can ask for the set option in the format_parse class. Calling it for any other parsing option can then throw, right? So since all formats inherit from a base class this can be achieved easily. Then only the format_parse class overloads this and inside of the argument_parser you call std::visit on the selected format.
Can you rebase? |
Since there are several open discussion points I will wait with the work on this PR. |
EXPECT_TRUE(parser.is_option_set('l')); | ||
EXPECT_TRUE(parser.is_option_set("foobar")); | ||
|
||
EXPECT_FALSE(parser.is_option_set("foo")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add a test with an empty string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see below
auto end = std::find(cmd_arguments.begin(), cmd_arguments.end(), "--"); | ||
return detail::format_parse::find_option_id(cmd_arguments.begin(), end, id) != end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we do not have to call parse before checking this. It is independent. (see test case)
IMHO I think that is_option_set is not strongly attached to format_parse. It's a feature of the argument parser that doesn't involve actual parsing of argument values but only checking if an id can be found. Is this such a strong design decision, whether we call a static function of format_parse or use std::visit?
But the design doesn't add up in my opinion. If I understand it correctly, then right now the API requires to call parse in order to validate and set the parameters with the options from the command line or to print the help page. So the design deliberately forces us to call parse after we added the options to the parser. Everything is deferred. Accordingly, in the proper design the check if an option was set should be put after the parse method was called. The reason is that the interface should only work for valid options and throw if you try to ask for an option that wasn't added to the parser. But this can only be known after the option was added, and parse functions here as the terminator. So that's why I think that the current solution does not reflect the design of the argument parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still a little bit unhappy with the design. This is merely because I believe that the argument parser design might not be reflected properly. I tried to explain my concerns in the comments. Please feel free to put it into core discussion if you have a different opinion or if I did not clear things up. Thank you!
d6b52c8
to
bd7aa56
Compare
@rrahn polite ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely done. Some suggestions to make the code more concise.
/*!\brief Checks whether the option distinguished by `id` was set on the command line by the user. | ||
* \param[in] id The option identifier to search for (must not contain dashes!). | ||
* \returns `true` if option identifier `id` was set on the command line by the user. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation is part of the public API and should be a little bit more verbose. For example it is not clear that the function might throw. Second the documentation says that id should not have dashes. But the overloaded function would get the same information, while it is not true in this case.
bool is_option_set(char const id) | ||
{ | ||
if (!parse_was_called) | ||
throw design_error("You can only asked which options have been set after calling the function `parse()`."); | ||
|
||
auto constexpr allowed = is_alnum || is_char<'_'> || is_char<'@'>; | ||
|
||
if (!allowed(id)) | ||
throw design_error("Short option identifiers may only contain alphanumeric characters, '_', or '@'."); | ||
|
||
// we only need to search for an option before the `end_of_options_indentifier` (`--`) | ||
auto end_of_options = std::find(cmd_arguments.begin(), cmd_arguments.end(), end_of_options_indentifier); | ||
return detail::format_parse::find_option_id(cmd_arguments.begin(), end_of_options, id) != end_of_options; | ||
} | ||
|
||
// allows all types that are implicitly convertible to std::string | ||
//!\overload | ||
bool is_option_set(std::string const & id) | ||
{ | ||
if (!parse_was_called) | ||
throw design_error("You can only asked which options have been set after calling the function `parse()`."); | ||
|
||
if (id.size() <= 1) | ||
throw design_error("Long IDs must be longer than one character."); | ||
else if (is_char<'-'>(id[0])) | ||
throw design_error("First character of long ID cannot be '-'."); | ||
|
||
auto constexpr allowed = is_alnum || is_char<'_'> || is_char<'@'> || is_char<'-'>; | ||
std::for_each(id.begin(), id.end(), [&allowed] (char c) | ||
{ | ||
if (!allowed(c)) | ||
throw design_error("Long identifiers may only contain alphanumeric characters, '_', '-', or '@'."); | ||
}); | ||
|
||
// we only need to search for an option before the `end_of_options_indentifier` (`--`) | ||
auto end_of_options = std::find(cmd_arguments.begin(), cmd_arguments.end(), end_of_options_indentifier); | ||
return detail::format_parse::find_option_id(cmd_arguments.begin(), end_of_options, id) != end_of_options; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second, what happens if I call is_option_set("a")? is that really a design error? It would be a user error, right? But it should also just work. And is is_option_set('_') allowed as an option?
In addition, I see a lot of potential to DRY this function. For example, you could write a detail function that get's the option identifier always as a string and in addition the set of valid characters. Then you can only need to handle the really different things in the tpo-level functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this again and I propose the following: Since we do enforce that parse()
is called before this function can be used, I can now just check if the requested option identifier is in the list of used_option_identifiers
. Those have already been tested in the add_option
calls beforehand so I can omit all the testing completely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you specify the option by a string, which could contain invalid characters, aren't you? So you would need to have some checks but you could resue what is already there to test them properly, which is probably the same code as done when adding the options?
8f6be37
to
58a5ffb
Compare
Sorry I had to rebase because of conflicts so I rebased all the commits correctly right away |
58a5ffb
to
1a37788
Compare
bool is_option_set(std::string const & id) const | ||
{ | ||
if (!parse_was_called) | ||
throw design_error{"You can only asked which options have been set after calling the function `parse()`."}; | ||
|
||
if (id.size() == 1) | ||
throw design_error{"Long option identifiers must be longer than one character! If " + id + "' was meant to " | ||
"be a short identifier, please pass it as a char ('') not a string (\"\")!"}; | ||
|
||
if (std::find(used_option_ids.begin(), used_option_ids.end(), id) == used_option_ids.end()) | ||
throw design_error{"You can only ask for option identifiers that you added with add_option() before."}; | ||
|
||
// we only need to search for an option before the `end_of_options_indentifier` (`--`) | ||
auto end_of_options = std::find(cmd_arguments.begin(), cmd_arguments.end(), end_of_options_indentifier); | ||
return detail::format_parse::find_option_id(cmd_arguments.begin(), end_of_options, id) != end_of_options; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool is_option_set(std::string const & id) const | |
{ | |
if (!parse_was_called) | |
throw design_error{"You can only asked which options have been set after calling the function `parse()`."}; | |
if (id.size() == 1) | |
throw design_error{"Long option identifiers must be longer than one character! If " + id + "' was meant to " | |
"be a short identifier, please pass it as a char ('') not a string (\"\")!"}; | |
if (std::find(used_option_ids.begin(), used_option_ids.end(), id) == used_option_ids.end()) | |
throw design_error{"You can only ask for option identifiers that you added with add_option() before."}; | |
// we only need to search for an option before the `end_of_options_indentifier` (`--`) | |
auto end_of_options = std::find(cmd_arguments.begin(), cmd_arguments.end(), end_of_options_indentifier); | |
return detail::format_parse::find_option_id(cmd_arguments.begin(), end_of_options, id) != end_of_options; | |
} | |
bool is_option_set(char const id) const | |
{ | |
return is_option_set_impl(std::string{id}, false); | |
} | |
bool is_option_set(std::string const & id) const | |
{ | |
return is_option_set_impl(id, true); | |
} | |
bool is_option_set_impl(std::string const & id, bool const is_long_option_id) const | |
{ | |
if (!parse_was_called) | |
throw design_error{"You can only ask which options have been set after calling the function `parse()`."}; | |
if (is_long_option_id && id.size() == 1) | |
throw design_error{"Long option identifiers must be longer than one character! If " + id + "' was meant to " | |
"be a short identifier, please pass it as a char ('') not a string (\"\")!"}; | |
if (std::find(used_option_ids.begin(), used_option_ids.end(), id) == used_option_ids.end()) | |
throw design_error{"You can only ask for option identifiers that you added with add_option() before."}; | |
// we only need to search for an option before the `end_of_options_indentifier` (`--`) | |
auto end_of_options = std::find(cmd_arguments.begin(), cmd_arguments.end(), end_of_options_indentifier); | |
return detail::format_parse::find_option_id(cmd_arguments.begin(), end_of_options, id) != end_of_options; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also also reduces the typo to only one occurrence :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not work because
is_option_set_impl(std::string{id}, false);
will trigger the exception of long ids, that they must be longer than one char.
I'll try to refactor otherwise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait what? I did care of this by passing this additional bool parameter to the internal interface. See, the check whether the long_id has only one element is only evaluate if is_long_option_id
is set to true. But it is not in the case of a short id. Other than that the code was exactly the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, no. This goes on in the parse operations as well. I see.
* \details | ||
* | ||
* You can only ask for option identifiers that were added to the parser beforehand via | ||
* seqan3::argument_parser::add_option. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API documentation is missing the throw statement and a description when and what this function throws.
@rrahn polite ping :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm! only renaming of the id_ variable. You can already rebase. Thank you!
"be a short identifier, please pass it as a char ('') not a string (\"\")!"}; | ||
// the detail::format_parse::find_option_id call in the end expects either a char or std::string | ||
using char_or_string_t = std::conditional_t<std::same_as<id_type, char>, char, std::string>; | ||
char_or_string_t id_ = {id}; // e.g. convert char * to string here if necessary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
char_or_string_t id_ = {id}; // e.g. convert char * to string here if necessary | |
char_or_string_t short_or_long_id{id}; // e.g. convert char * to string here if necessary |
ae80f7c
to
b801b4f
Compare
Fixes #1741
Current look:
would return 1 if called with
./test_parser -f 1
would return 0 if called with
./test_parser -g 1
There are open design questions:
option_was_set
,option_is_set
,is_set
,was_set
,option_is_overwritten
... suggestions?(a) a respective option was added (
parser.add_option('f', ...)
)(b)
parser.parse()
was called and all options values were parsedDo we want to allow this? We can easily introduce these semantic requirements.
Resolution:
parser.is_option_set
The function will return an enum that can be implicitly cast to bool.