Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove uses of char* in app CLI #2911

Merged
merged 13 commits into from
Aug 13, 2024
Merged

Remove uses of char* in app CLI #2911

merged 13 commits into from
Aug 13, 2024

Conversation

daljit46
Copy link
Member

@daljit46 daljit46 commented May 21, 2024

This PR aims to replace uses of C-style const char* arguments with std::string in our CLI. C-style arrays for list of arguments/options are also replaced in favour of std::vector<std::string>. Partially fulfils one of the objectives in #2585 and #2111.

@daljit46 daljit46 self-assigned this May 21, 2024
@daljit46
Copy link
Member Author

daljit46 commented May 21, 2024

@Lestropie @jdtournier I will need your help with migrating this part of the code in src/gui/mrview.cpp:721-755:

  if (!MR::App::argument.empty()) {
    if (!MR::App::option.empty()) {
      // check that first non-standard option appears after last argument:
      size_t last_arg_pos = 1;
      for (; MR::App::argv[last_arg_pos] != MR::App::argument.back().c_str(); ++last_arg_pos)
        if (MR::App::argv[last_arg_pos] == nullptr)
          throw Exception("FIXME: error determining position of last argument!");

      // identify first non-standard option:
      size_t first_option = 0;
      for (; first_option < MR::App::option.size(); ++first_option) {
        if (size_t(MR::App::option[first_option].opt - &MR::App::__standard_options[0]) >=
            MR::App::__standard_options.size())
          break;
      }
      if (MR::App::option.size() > first_option) {
        first_option = MR::App::option[first_option].args - MR::App::argv;
        if (first_option < last_arg_pos)
          throw Exception("options must appear after the last argument - see help page for details");
      }
    }

    std::vector<std::unique_ptr<MR::Header>> list;
    for (size_t n = 0; n < MR::App::argument.size(); ++n) {
      try {
        list.push_back(std::make_unique<MR::Header>(MR::Header::open(MR::App::argument[n])));
      } catch (CancelException &e) {
        for (const auto &msg : e.description)
          CONSOLE(msg);
      } catch (Exception &e) {
        e.display();
      }
    }
    add_images(list);
  }

@daljit46 daljit46 marked this pull request as draft May 21, 2024 11:08
github-actions[bot]

This comment was marked as outdated.

@Lestropie
Copy link
Member

Message ""options must appear after the last argument - see help page for details"" isn't quite right; it's specifically *mrview-specific` options, ie. not standard options, that must appear after all positional arguments and standard options (at least from looking at what the code is doing).

@Lestropie
Copy link
Member

Decision during yesterday's meeting was that the scope of this PR would be limited to the CLI; while there are uses of char* elsewhere in the code base, it would be better to address those separately in the future.

@daljit46 daljit46 force-pushed the no_char_arguments branch 3 times, most recently from 25736a7 to f0ae136 Compare May 28, 2024 17:42
@daljit46
Copy link
Member Author

Ok so the above snipped in window.cpp has now been rewritten. @Lestropie let me know if the logic still makes sense.

github-actions[bot]

This comment was marked as outdated.

@Lestropie
Copy link
Member

It seems counter-intuitive to me that you'd use iterator offsets to interrogate one argument position, but then rely on upstream internal storage of argument position of each command-line option to interrogate another argument position. I would have thought that if you need to determine two different command-line argument positions, the same mechanism should be used for both.

  1. Find last item that is an argument or standard option
    (or an argument that is provided to a standard option)
  2. Find first item that is not an argument or standard option
    (ie. it's an mrview-specific option)
  3. If there's at least one instance of both 1 and 2, make sure that position 1 does not exceed position 2.

Ultimately, whether this does as intended is best verified by adding command tests, with both valid and invalid uses, and make sure that the former succeeds and the latter fails with the intended error message.

@daljit46
Copy link
Member Author

It seems counter-intuitive to me that you'd use iterator offsets to interrogate one argument position, but then rely on upstream internal storage of argument position of each command-line option to interrogate another argument position.

That's a good point and I agree. To improve this, we could add a new index variable for ParsedArgument (just like I did for ParsedOption), which keeps track of the position in the raw command line argument list. Alternatively, we can loop over the raw argument list and compare the id of a ParsedOption against the raw arguments.
I don't know which one I prefer.

Ultimately, whether this does as intended is best verified by adding command tests, with both valid and invalid uses, and make sure that the former succeeds and the latter fails with the intended error message.

Yes, but since this change concerns mrview it'd be difficult to verify it with an ordinary bash test. The app may fail to launch since GUI commands may not work on the CI machines (e.g. due to missing GUI libraries). This is a case where having the possibility to add unit tests that verify the correctness of a single function may be useful.

@daljit46
Copy link
Member Author

daljit46 commented May 30, 2024

@Lestropie btw can an option be repeated? In other words, is this syntax valid: command arg1 arg2 -opt1 -opt1?
If not, then I guess this could work:

      const auto first_non_standard_option_pos = std::distance(
          MR::App::raw_arguments_list.begin(),
          std::find(MR::App::raw_arguments_list.begin(),
                    MR::App::raw_arguments_list.end(),
                    first_non_standard_option->opt->id)
      );

@jdtournier
Copy link
Member

can an option be repeated?

Yes, but not by default - only if it's been labelled with .allow_multiple(). There are quite a few examples of that in the existing commands (both for arguments or options).

@Lestropie
Copy link
Member

Also, with respect to the option repeating: you'll sometimes see the value of an argument passed via a command-line option accessed as something like opt[0][0]. The first index references one of possibly multiple appearances of that particular command-line option; the second index references one of potentially multiple arguments expected to appear after that command-line option. Good example would be something like mredit -plane: usage; access.

@daljit46 daljit46 force-pushed the no_char_arguments branch 2 times, most recently from 155c1b4 to ed8fdd6 Compare June 6, 2024 11:16
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

@@ -1128,10 +1140,10 @@ void init(int cmdline_argc, const char *const *cmdline_argv) {

terminal_use_colour = !ProgressBar::set_update_method();

argc = cmdline_argc;
argv = cmdline_argv;
raw_arguments_list = std::vector<std::string>(cmdline_argv, cmdline_argv + cmdline_argc);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: do not use pointer arithmetic [cppcoreguidelines-pro-bounds-pointer-arithmetic]

  raw_arguments_list = std::vector<std::string>(cmdline_argv + 1, cmdline_argv + cmdline_argc);
                                                             ^

@@ -1128,10 +1140,10 @@

terminal_use_colour = !ProgressBar::set_update_method();

argc = cmdline_argc;
argv = cmdline_argv;
raw_arguments_list = std::vector<std::string>(cmdline_argv, cmdline_argv + cmdline_argc);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: do not use pointer arithmetic [cppcoreguidelines-pro-bounds-pointer-arithmetic]

  raw_arguments_list = std::vector<std::string>(cmdline_argv + 1, cmdline_argv + cmdline_argc);
                                                                               ^

argv = cmdline_argv;
raw_arguments_list = std::vector<std::string>(cmdline_argv, cmdline_argv + cmdline_argc);
NAME = Path::basename(raw_arguments_list.front());
raw_arguments_list.erase(raw_arguments_list.begin());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: do not use pointer arithmetic [cppcoreguidelines-pro-bounds-pointer-arithmetic]

  NAME = Path::basename(cmdline_argv[0]);
                        ^

@@ -22,7 +22,7 @@

namespace MR::DWI::Tractography::Connectome {

const char *statistics[] = {"sum", "mean", "min", "max", NULL};
std::vector<std::string> statistics = {"sum", "mean", "min", "max"};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: variable 'statistics' is non-const and globally accessible, consider making it const [cppcoreguidelines-avoid-non-const-global-variables]

std::vector<std::string> statistics = {"sum", "mean", "min", "max"};
                         ^

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

@@ -1128,10 +1140,10 @@ void init(int cmdline_argc, const char *const *cmdline_argv) {

terminal_use_colour = !ProgressBar::set_update_method();

argc = cmdline_argc;
argv = cmdline_argv;
raw_arguments_list = std::vector<std::string>(cmdline_argv, cmdline_argv + cmdline_argc);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: do not use pointer arithmetic [cppcoreguidelines-pro-bounds-pointer-arithmetic]

  raw_arguments_list = std::vector<std::string>(cmdline_argv, cmdline_argv + cmdline_argc);
                                                                           ^

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

core/app.cpp Outdated
for (size_t i = 0; i < OPTIONS.size(); ++i)
get_matches(candidates, OPTIONS[i], root);
get_matches(candidates, __standard_options, root);
if (isdigit(no_dash_arg.front() != 0) || no_dash_arg.front() == '.') {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: implicit conversion 'int' -> bool [readability-implicit-bool-conversion]

Suggested change
if (isdigit(no_dash_arg.front() != 0) || no_dash_arg.front() == '.') {
if ((isdigit(no_dash_arg.front() != 0) != 0) || no_dash_arg.front() == '.') {

core/app.cpp Outdated
for (size_t i = 0; i < OPTIONS.size(); ++i)
get_matches(candidates, OPTIONS[i], root);
get_matches(candidates, __standard_options, root);
if (isdigit(no_dash_arg.front() != 0) || no_dash_arg.front() == '.') {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: implicit conversion bool -> 'int' [readability-implicit-bool-conversion]

Suggested change
if (isdigit(no_dash_arg.front() != 0) || no_dash_arg.front() == '.') {
if (isdigit(static_cast<int>(no_dash_arg.front() != 0)) || no_dash_arg.front() == '.') {

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

core/app.cpp Outdated
const Option *match_option(std::string_view arg) {
// let's check if arg starts with a dash
auto no_dash_arg = without_leading_dash(arg);
if (arg.size() == no_dash_arg.size() || no_dash_arg.empty() || isdigit(no_dash_arg.front() != 0) != 0 ||
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: implicit conversion bool -> 'int' [readability-implicit-bool-conversion]

Suggested change
if (arg.size() == no_dash_arg.size() || no_dash_arg.empty() || isdigit(no_dash_arg.front() != 0) != 0 ||
if (arg.size() == no_dash_arg.size() || no_dash_arg.empty() || isdigit(static_cast<int>(no_dash_arg.front() != 0)) != 0 ||

@daljit46 daljit46 force-pushed the no_char_arguments branch 2 times, most recently from c3eeefe to d73ce09 Compare June 14, 2024 13:29
@Lestropie Lestropie mentioned this pull request Jun 19, 2024
3 tasks
Copy link
Member

@Lestropie Lestropie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few minor comments. Might be worth adding tests for some of the bespoke CLI handling. Otherwise happy to be merged once you're content.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

cmd/mrcalc.cpp Show resolved Hide resolved
core/app.cpp Show resolved Hide resolved
@daljit46
Copy link
Member Author

Review feedback addressed

Copy link
Member

@Lestropie Lestropie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be good once the C++ CLI failure on Windows is identified.

Edit: Guessing it's this one:

testing_cpp_cli: [ERROR] Expected exactly 0 arguments (2 supplied)
testing_cpp_cli: [ERROR] Usage: testing_cpp_cli
testing_cpp_cli: [ERROR] Yours: testing_cpp_cli �bool false

MSYS2 is breaking either the em or en dash?

@daljit46
Copy link
Member Author

daljit46 commented Jul 8, 2024

@Lestropie I've spent a lot of time to try to understand why the parsing of utf-8 characters is failing on MSYS2, but without success. I've found that the MSYS2 port of GCC only supports the "C" and "POSIX" locales. Then I thought that the best thing to do would just to call std::setlocale(LC_ALL,"") to try to automatically select the user's preferred locale. This seem to work at first try with a simple test application I wrote, but when trying to parse out dash characters, the program didn't parse them correctly. To make things worse, when running the executables under CTest the detection of UTF-8 characters didn't even work.

I'm not sure what more I could try as the documentation on these matters is rather limited and sparsed all over the internet with no real source of truth.

@Lestropie
Copy link
Member

If that magnitude of effort invested hasn't yielded a solution, then I suggest that when running under MSYS2 we bypass the tests that verify handling of non-ASCII dashes. It's a nice-to-have to handle cases of people copy&pasting code examples from programs that have auto-bastardised text, but it's not mission critical.

daljit46 and others added 13 commits August 2, 2024 12:50
- Logic has been rewritten using STL algorithms. Not that previously we
were using pointer arithmetic to achieve this, which was technically
more efficient but less readable and safe. However, performance is
not a concern considering the data size involved.

- A new `index` member has been added to ParsedOption to identify an
option's position in the list of raw command-line arguments list.
This change reflects the same idea implemented in ParsedOption
Also adds a new test that verifies that mrcalc parsing of arguments
behaves as expected when using MRtrix3 standard options for commands.
Co-authored-by: Robert Smith <robert.smith@florey.edu.au>
@daljit46 daljit46 merged commit d3bb1b4 into dev Aug 13, 2024
6 checks passed
@daljit46 daljit46 deleted the no_char_arguments branch August 13, 2024 14:24
daljit46 added a commit that referenced this pull request Aug 21, 2024
In 6286158, as part of #2911,
ColourMap::maps was converted from a null-terminated C array to a
std::vector. However, the loops iterating over this list weren't updated
and were still relying on null-termination causing an invalid memory
access. This also resulted mrview crashing on MacOS.
daljit46 added a commit that referenced this pull request Aug 21, 2024
In 6286158, as part of #2911,
ColourMap::maps was converted from a null-terminated C array to a
std::vector. However, the loops iterating over this list weren't updated
and were still relying on null-termination causing an invalid memory
access. This also resulted mrview crashing on MacOS.
daljit46 added a commit that referenced this pull request Aug 21, 2024
In 6286158, as part of #2911,
ColourMap::maps was converted from a null-terminated C array to a
std::vector. However, the loops iterating over this list weren't updated
and were still relying on null-termination causing an invalid memory
access. This also resulted mrview crashing on MacOS.
Lestropie pushed a commit that referenced this pull request Aug 23, 2024
In 6286158, as part of #2911,
ColourMap::maps was converted from a null-terminated C array to a
std::vector. However, the loops iterating over this list weren't updated
and were still relying on null-termination causing an invalid memory
access. This also resulted mrview crashing on MacOS.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants