Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Clap with a simpler crate #1549

Closed
Minoru opened this issue Mar 25, 2021 · 19 comments · Fixed by #1845
Closed

Replace Clap with a simpler crate #1549

Minoru opened this issue Mar 25, 2021 · 19 comments · Fixed by #1845
Labels
good first issue Working on this issue is an easy way to start with Newsboat development refactoring This issue describes a way in which some particular part of the code could be improved rust-port
Milestone

Comments

@Minoru
Copy link
Member

Minoru commented Mar 25, 2021

Out of all our dependencies, Clap takes the longest to compile according to cargo +nightly build -Z timings --package libnewsboat-ffi --release on my machine. We already disable all the Cargo features of this crate, but it's still too big. And we probably don't need it at all: we only use it to parse the arguments, while all the validation and help messages are handled by our code.

There must be simpler crates for this, something closer to C's getopt(). We should migrate to that.

The module to edit is rust/libnewsboat/src/cliargsparser.rs.

Be mindful of #1357. The new crate should support OsString, maybe even use it by default.

This sounds like an easy issue: pick a crate, change our module to use it, check that the tests still pass. So I'm adding a "good first issue" label. Please drop a comment below if you need some advice or run into problems — I'm glad to help!

@Minoru Minoru added good first issue Working on this issue is an easy way to start with Newsboat development rust-port refactoring This issue describes a way in which some particular part of the code could be improved labels Mar 25, 2021
@amritbrar
Copy link
Contributor

amritbrar commented Jul 16, 2021

lexopt supports OsString, has no dependencies, but is very new. Lexopt also provided a handy table at the bottom of the README comparing other parsers. Pico-args also looks interesting.

@Minoru
Copy link
Member Author

Minoru commented Jul 17, 2021

Thanks! Have you looked at the code of those two? Does they inspire confidence? I'll try to take a look tomorrow.

@amritbrar
Copy link
Contributor

I have taken a look at both, but my understanding of Rust is limited (currently learning). They are both minimal, thus easy for a Rust programmer to audit.

@Minoru
Copy link
Member Author

Minoru commented Jul 18, 2021

i spent about an hour trying to understand what the real difference between those two is, and failed. They aren't as simple as I expected them to be :) I'll make another go at them tomorrow.

I also noticed that our test suite for CliArgsParser doesn't check the behaviour for combined short arguments (.e.g. -vc /path/to/cache.db), unseparated options (e.g. -c/path/to/cache.db), and short options with = (e.g. -c=/path/to/cache.db). It's just a matter for adding new assertions to existing tests, so shouldn't be hard to do while replacing Clap.

@Minoru
Copy link
Member Author

Minoru commented Jul 19, 2021

I now reviewed both crates and I like lexopt slightly better. It's less mature, but OsString support is much better: in lexopt it's built-in, in pico-args you have to roll your own handling of -ovalue and --option=value.

@amritbrar, please add the tests I mentioned in the previous comment, and let's try lexopt!

amritbrar added a commit to amritbrar/newsboat that referenced this issue Jul 19, 2021
Add tests for:
- combined short options (e.g. `-vc /path/to/cache.db`)
- combined arguments and options (e.g. `-c/path/to/cache.db`)
- short options with `=` (e.g. `-c=/path/to/cache.db`)

See also: newsboat#1549
amritbrar added a commit to amritbrar/newsboat that referenced this issue Jul 20, 2021
Add cpp and rust tests for:
- combined short options (e.g. `-vc /path/to/cache.db`)
- combined arguments and options (e.g. `-c/path/to/cache.db`)
- short options with `=` (e.g. `-c=/path/to/cache.db`)

See also: newsboat#1549
amritbrar added a commit to amritbrar/newsboat that referenced this issue Jul 20, 2021
Add cpp and rust tests for:
- combined short options (e.g. `-vc /path/to/cache.db`)
- combined arguments and options (e.g. `-c/path/to/cache.db`)
- short options with `=` (e.g. `-c=/path/to/cache.db`)

See also: newsboat#1549
@amritbrar
Copy link
Contributor

amritbrar commented Jul 20, 2021

Now that I think about it, should we wait for lexopt to mature a little bit before replacing Clap? Maybe wait for it to reach 1.0.0? Or maybe more downloads?

@Minoru
Copy link
Member Author

Minoru commented Jul 21, 2021

Did you find any problems with the crate itself? I read its code, it seems good enough for our use-case, but that's just my understanding — you're the one who's doing the actual work, so you are better informed than me. Please share your findings :)

For this issue, I expect there to be two inconveniences:

  1. lexopt's approach to parsing is different from Clap's. You have to rewrite the entire parser;
  2. lexopt returns OsStrings so you have to think carefully what to do with them: you can convert to Path, you can convert to String lossily, or you can try to convert to String and panic if that fails.

The first one is just part of the job, no way around it. (If that helps, you can look for the PR that introduced Clap, and see how out previous parser looked like). The second one is split off into #1357, so for this issue, you can just call into_string().expect(…) everywhere and be done with it.

@amritbrar
Copy link
Contributor

I had a look at the crate and didn't find any issues with it, it's well documented and small enough for me to slowly get through. I was thinking that the crate is super new, has very few downloads, but that probably doesn't matter since it's fairly small.

@amritbrar
Copy link
Contributor

lexopt was also posted on r/rust. Good engagement going on, with author of pico-args also involved.
Reddit thread

@Minoru
Copy link
Member Author

Minoru commented Jul 22, 2021

Ah, cool! This comment and this document are particularly insightful.

@theIDinside
Copy link
Contributor

I will be investigating, crates other than clap. As far as I was aware, it was a wrapper around getopt, but at first glance it seems I was wrong in that asssumption (I am still pretty sure there is a thin wrapper crate around getopts though, out there).

@theIDinside
Copy link
Contributor

theIDinside commented Oct 18, 2021

Lexopt as previously mentioned, seems like a good candidate. However, it seems as though specific scenarios do not work, that currently work, at least according to tests. If we're willing to live without these then lexopt is fine:

  • short=value, like for instance -c=/some/file (if one uses short options, it has to be in the format of -c/some/file/or/value
    or -c /some/file/or/value)
  • single dash long options, such as -loglevel

We can also add some functionality of checking the long options provided, so that for instance, there's a long option called log-level and if the user inputs --lgo-levl we can provide some helpful error message like
"Unknown option "lgo-levl": Perhaps you meant log-level?" depending on a ceiling number of edits of the input required to reach a specific option, and then exit the application (we could even ask the user "did you mean 'log-level' y/n" and then replace it and continue parsing & run the application)

If (any of) this is desirable, I'll start tinkering with it and change the tests that currently test for inputs that the lexopt crate do not support.

@Minoru
Copy link
Member Author

Minoru commented Oct 18, 2021

short=value, like for instance -c=/some/file

Oh, good catch! I missed that, even though it's mentioned right in lexopt's README >_<

It doesn't sound catastrophic though. When Newsboat encounters a short option, it can just strip the leading = from the option's value.

single dash long options, such as -loglevel

I don't think those are supported by Newsboat. This particular one leads to an error saying "newsboat: oglevel: invalid loglevel value".

We can also add some functionality […]

Please file a separate issue about this. For this one, let's concentrate on refactoring, i.e. replacing Clap without affecting anything else.

@theIDinside
Copy link
Contributor

Oh, good catch! I missed that, even though it's mentioned right in lexopt's README >_<

Yeah :P I found some article some where going through different build times of different crates and lexopt is in the bottom 2 I believe in compile time (meaning fastest 2)

It doesn't sound catastrophic though. When Newsboat encounters a short option, it can just strip the leading = from the option's value.

Right, that's true, that seems rather trivial to handle actually. Just grab the underlying bytes of the OsString, convert s[0] to char and if == '=' then handle it accordingly.

I'll get to actually fixing this issue before filing a new issue with respect to the "extra helpful" error messages. It's just implementing a levensteihn distance calculation, which is fairly easy to do, but not important right now. I'll probably have something working within the next few days, maybe sooner.

@theIDinside
Copy link
Contributor

theIDinside commented Oct 18, 2021

Oh. My. F* god. I spent a few hours writing super error prone code (since we have to do a bunch of extra checking and splitting of = if we run into them) and guess what... .from_args(some_into_iter_type) is static - which makes the entire thing useless. That's what I get for not double, triple, quadruple, quintuple proof read and I think I'm pretty good at that.

So, it looks like forking the crate and hacking it to do what we want instead is better at this point. The crate is just 1k lines long, throwing in a bunch of functions, is probably easier, maybe I can even hack in -s=value options into it? That would make our life a lot easier. Because the code as it looks now (that isn't even working, due to from_args being bound to a 'static life time 😠) is uuuuuuuuugly.

It's way too late to be hacking away, I'll get to it tomorrow after work.

@Minoru
Copy link
Member Author

Minoru commented Oct 18, 2021

Okay, okay, calm down :) I'm not sure I understand the problem just yet. If I had to guess, you might be calling lexopt::Parser::from_args(opts.iter()) in CliArgsParser::new(); if that's the case, try opts.into_iter() instead. If there's something else going on, please make a draft PR, comment on the line that causes all this grief, and let me think what to be done here. Hopefully we can get out of this without forking lexopt :)

@theIDinside
Copy link
Contributor

theIDinside commented Oct 19, 2021

Okay, okay, calm down :) I'm not sure I understand the problem just yet. If I had to guess, you might be calling lexopt::Parser::from_args(opts.iter()) in CliArgsParser::new(); if that's the case, try opts.into_iter() instead. If there's something else going on, please make a draft PR, comment on the line that causes all this grief, and let me think what to be done here. Hopefully we can get out of this without forking lexopt :)

I'm calm and good now after a good night's sleep :) I was obviously way, way too tired and confused 😆 from_args should obviously work, I just saw errors involving "static" and thought "omg" - but using into_iter will obviously work. Lifetimes in Rust is something that never quite sticks with me to the degree that I feel comfortable knowing that I "got it right".

None the less though, hacking on it to make it accept -o=value types would be nice. I suppose I could file an issue with the repo, it's not exactly needed right now.

The code is going to look rather convoluted though/one big massive pattern match, but that's the price we pay for using lexopt - I still think it's the right choice though (initially I tried writing a macro to reduce the repeating patterns, I couldn't figure out how to do it though). I'll get back to you later today with what I got so far, once I get home.

theIDinside added a commit to theIDinside/newsboat that referenced this issue Oct 19, 2021
Added lexopt and required parsing loop.
Code is a bit more verbose, due to the nature of lexopt.
Added LexoptWrapper to be able to track last seen
option type, if it was in long form or short form.
This adds an additional check per option (not the values though).
The reason for this additional check was to reduce code bloat
and thus make it a lot easier to read, I figured this was a trade off
we're willing to do, since it's not in a time critical section of the
application.

Lexopt also does not come with built in functionality
for handling short options "-o=value", this
therefore also required some patching over,
by the local function strip_eq(). An issue
has been filed with the crate maintainer who has
agreed to implement the feature, so in the future
we can change this (and thus also remove the
"double checking" as mentioned above).
@theIDinside
Copy link
Contributor

theIDinside commented Oct 19, 2021

So that was what I came up with - if you think this code is ok enough to send a PR for (so we can discuss possible changes), let me know. I managed to reduce some of the verbosity and as stated in the commit, if/when the feature is implemented that I requested of the maintainer, some more verbosity / double checking can be removed with ease (since then, we don't have to ask for stripping the equal signs out of option values).

The compile time in comparison for lexopt is 0.7s on my machine compared to the 9.1s for clap, so quite a large reduction in compile time, with a total rebuild of the rust side reduced with more than 1.4s.

All the tests also pass.

@Minoru
Copy link
Member Author

Minoru commented Oct 24, 2021

This is fixed by #1845.

@Minoru Minoru closed this as completed Oct 24, 2021
@Minoru Minoru linked a pull request Oct 24, 2021 that will close this issue
6 tasks
@Minoru Minoru added this to the 2.26 milestone Dec 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Working on this issue is an easy way to start with Newsboat development refactoring This issue describes a way in which some particular part of the code could be improved rust-port
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants