Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite builtin functions in rust #9944

Merged
merged 6 commits into from Aug 13, 2023
Merged

Conversation

faho
Copy link
Member

@faho faho commented Aug 10, 2023

Description

This is a rough draft of rewriting builtin functions in rust. Basically the first time it passed the tests.

It fixes two pre-existing bugs:

  1. functions --details foo bar will mention whatever the last option is in the error - so functions --details --verbose foo bar will mention "--verbose". I don't believe the error is all that great anyway, but now it always uses --details because that's the "mode" that functions is in
  2. The event::print code compared the enum values too deeply, and so it printed the "Event thing" header too often

Honestly the second felt extremely roundabout and awkward, and I'm pretty sure there's a nicer way.

In terms of implementation decisions I mostly decided not to make a report_function_metadata helper function because using that many mutable variables felt like going against the grain and didn't seem like it would have helped a lot.

TODOs:

  • Changes to fish usage are reflected in user documentation/manpages.
  • Tests have been added for regressions fixed
  • User-visible changes noted in CHANGELOG.rst

if opts.list || args.is_empty() {
let mut names = function::get_names(opts.show_hidden);
names.sort();
if !streams.out_is_redirected && unsafe { isatty(STDOUT_FILENO) == 1 } {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've had this a few times, a helper method might be nice to avoid multiple "unsafe" blocks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once IO is ported, we can bump our MSRV to 1.70 so we can use is_terminal. Unless that has some behavior-difference I am not aware of, have not seen the internals of it

Copy link
Member Author

@faho faho Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless that has some behavior-difference I am not aware of, have not seen the internals of it

There is no behavior difference because the internals are:

https://github.com/rust-lang/rust/blob/4d7a80d48697171ed151c4667b74fc7d784c1836/library/std/src/sys/unix/io.rs#L79-L82

It does the exact same thing we already do - excluding the "out_is_redirected" check, so it would only replace the "isatty" part. (isatty is defined to only return 1 or 0, so the "==" vs "!=" makes no difference)

I would simply add a helper method to streams - streams.out_is_terminal() that encapsulates both checks.

I know that projects in the rust ecosystem like to move very fast wrt MSRV, and like to include a lot of dependencies, but that's not to my taste and I consider it to be one of the annoyances of the ecosystem. I would like to require a more compelling reason to increase MSRV or take dependencies than to replace one line with a slightly shorter one.

Of course once we have an MSRV of 1.70 for other reasons, it's fine to use is_terminal - but even then we would still keep the helper because of the "out_is_redirected" check. (unless we decide to change that behavior)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

22d92b8 should be the answer here.

res += 1;
}
} else if let Some(props) = function::get_props_autoload(arg, parser) {
let path = props.definition_file().unwrap_or(L!(""));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awkward, taken from type. This should probably be a match, and could also be a helper function.

Would that go in builtins/shared.rs?

.into_iter()
.filter(|b| *b)
.count()
> 1
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seemed like the cleanest way to do this sort of "only one" check. The C++ just added the bools together, that's possible here but would require an as i32 cast for every bool.

} else {
res += 1;
}
first = false;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(this always feels extremely clunky in any programming language for what is a very common thing - "only print a newline for every following entry")

fish-rust/src/event.rs Outdated Show resolved Hide resolved
@@ -9,6 +9,10 @@ end
functions --details f1 f2
#CHECKERR: functions: --details: expected 1 arguments; got 2

# Verify that it still mentions "--details" even if it isn't the last option.
functions --details --verbose f1 f2
#CHECKERR: functions: --details: expected 1 arguments; got 2
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This used to print "--verbose: expected"


functions --handlers-type signal
# CHECK: Event signal
# CHECK: SIGTRAP fish_sigtrap_handler
Copy link
Member Author

@faho faho Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This used to print a second "Event signal" after the SIGTRAP line

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not print an extra "Event signal" on LastC++11, and the output was sorted reverse of what it now prints, this appears to then 1. fix an earlier regression introduced in the Rust code, and 2. there is a behavior difference in the sorting, which is probably fine, but should perhaps be mentioned in the release notes?

I did a git bisect and it points to 9ac6cbe being the first commit that sorted it to how it is now.

This was the output of running the test after git checkout LastC++11

Testing file checks/functions.fish ... Failure:

  The CHECK on line 186 wants:
    SIGTERM term1

  which failed to match line stdout:58:
    SIGTERM term3

  Context:
    [...] from line 164 (stdout:55):
    end
    Event signal
    SIGTRAP fish_sigtrap_handler
     <= nothing to match CHECK on line 186: SIGTERM term1
     <= nothing to match CHECK on line 187: SIGTERM term2
    SIGTERM term3
    SIGTERM term2 <= no more checks
    SIGTERM term1

  when running command:
    ../test/root/bin/fish checks/functions.fish

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix an earlier regression introduced in the Rust code

Yes, the issue was new in the rust version, caused by enum comparisons being deep.

Probably something to look out for - if we replace C++ enums with Rust enums, the C++ version only compares variants.

  1. there is a behavior difference in the sorting, which is probably fine, but should perhaps be mentioned in the release notes?

I do not believe this is a thing that is expected to be stable. This isn't sorted alphabetically, it's sorted by definition order - and used to be in reverse definition (oldest last) order.

@faho faho marked this pull request as ready for review August 12, 2023 15:56
Copy link
Contributor

@henrikhorluck henrikhorluck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nits, some wrong short-options that should have tests, and the event-signal appears to have a sorting difference that should be mentioned in release-notes or reverted, otherwise LGTM

}
}

const NO_METADATA_SHORT: char = 2 as char;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We use NONOPTION_CHAR_CODE elsewhere, except for abbr.rs which needs 1–3. No need to change

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I quite misunderstood what NONOPTION_CHAR_CODE meant, then?

So it's used to be the char code for the one option that is long-only?

And if we have multiple options that are long-only?

Tbh I prefer naming the constant after the option.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears I might be one misunderstanding how NONOPTION_CHAR_CODE is supposed to be used then, I just assumed it should be used instead of 1 as char / '\x01' / '\u{1}', which is how I have used it in builtins/string/{escape,unescape} and builtins/path.

Looking at the commit that introduced it, it appears '\x01' has a special meaning as a non-option argument (duh), but only if the ordering is REQUIRE_ORDER | RETURN_IN_ORDER, which is set by + and - in short_opts respectively.

When the ordering is the default aka PERMUTE, \x01 does not have that special meaning.

My takeaway here is:

  1. I dislike wgetoptr_t, since it mixes special meanings with legitimate return values, only configured through a char in a string. A minimal way of improving it would be to instead return an enum like:
enum WGetOptRetVal {
    LongOpt(usize), // where the usize is the "id" of the long-only option
    Opt(char),
    NonOption,
}

We could also make wopt accept using a literal number, making it result in a different enum-variant, to make this difference in usage much clearer. (This requires a IntoOption trait)

  1. We should probably avoid using \x01 (regardless of whether it is as NONOPTION_CHAR_CODE or another named constant) as a short-option for long-only arguments to avoid accidentally mixing this usage, in that way your change to 2 instead of 1 as was present in the original code is good.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike wgetoptr_t, since it mixes special meanings with legitimate return values, only configured through a char in a string

Yup, it has some real crusty C heritage.

A minimal way of improving it would be to instead return an enum like:

This would return a different value for -D and --details, which is awkward. I don't think you could match like

LongOpt(id) | Opt(char) => {
}

even if you called the variables the same?

We should probably avoid using \x01 (regardless of whether it is as NONOPTION_CHAR_CODE or another named constant) as a short-option for long-only arguments to avoid accidentally mixing this usage

Agreed. Tho note that "PERMUTE" is the overwhelming majority of uses and changing from that to a different ordering is a pretty big change that is never going to happen for most uses.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would return a different value for -D and --details, which is awkward. I don't think you could match like
LongOpt(id) | Opt(char)
even if you called the variables the same?

Ah, I named it a bit unspecificly, I intended LongOpt to be LongOnlyOpt, if that makes more sense? And then keep the same behaviour for Opt if the option has both long and short variants

But a move towards something like your interpretation would make it trivial to determine whether we got the argument as a short- or long-opt, just with another enum-variant.

We could also add more variants to move the optarg to this enum, avoiding all the w.optarg.unwrap()-s.

fish-rust/src/builtins/functions.rs Outdated Show resolved Hide resolved
fish-rust/src/builtins/functions.rs Outdated Show resolved Hide resolved
fish-rust/src/builtins/functions.rs Outdated Show resolved Hide resolved
fish-rust/src/builtins/functions.rs Outdated Show resolved Hide resolved
fish-rust/src/builtins/functions.rs Outdated Show resolved Hide resolved
fish-rust/src/builtins/functions.rs Outdated Show resolved Hide resolved
fish-rust/src/builtins/functions.rs Outdated Show resolved Hide resolved

functions --handlers-type signal
# CHECK: Event signal
# CHECK: SIGTRAP fish_sigtrap_handler
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not print an extra "Event signal" on LastC++11, and the output was sorted reverse of what it now prints, this appears to then 1. fix an earlier regression introduced in the Rust code, and 2. there is a behavior difference in the sorting, which is probably fine, but should perhaps be mentioned in the release notes?

I did a git bisect and it points to 9ac6cbe being the first commit that sorted it to how it is now.

This was the output of running the test after git checkout LastC++11

Testing file checks/functions.fish ... Failure:

  The CHECK on line 186 wants:
    SIGTERM term1

  which failed to match line stdout:58:
    SIGTERM term3

  Context:
    [...] from line 164 (stdout:55):
    end
    Event signal
    SIGTRAP fish_sigtrap_handler
     <= nothing to match CHECK on line 186: SIGTERM term1
     <= nothing to match CHECK on line 187: SIGTERM term2
    SIGTERM term3
    SIGTERM term2 <= no more checks
    SIGTERM term1

  when running command:
    ../test/root/bin/fish checks/functions.fish

fish-rust/src/event.rs Outdated Show resolved Hide resolved
@faho
Copy link
Member Author

faho commented Aug 13, 2023

Okay, reformat_for_screen is currently broken - it does not seem to ever finish. Running functions interactively let's fish spin and isn't even interruptible with ctrl-c.

I'm gonna try to find out where the issue is, if I can't find it I'm gonna remove it from functions for now.

Edit: Alright, found it. Mainly one check was the wrong way around, and another incorrectly compared to 0 instead of a variable that is initially set to 0.

@faho faho force-pushed the riir-functions-2 branch 2 times, most recently from b737e78 to 73bc55a Compare August 13, 2023 11:56
event filter names, function::set_desc, common::reformat_for_screen

This is the first use for each
Turns out doing `==` on Enums with values will do a deep comparison,
including the values.

So EventDescription::Signal(SIGTERM) is !=
EventDescription::Signal(SIGWINCH).

That's not what we want here, so this does a bit of a roundabout thing.
This had an infinite loop because it had two checks broken
This encapsulates a "is our output going to the terminal" check we do
in a few places - functions, type, set_color, possibly test
And the C++ reformat_for_screen and event_filter_names as there are no more users.
@faho faho merged commit 5f0df35 into fish-shell:master Aug 13, 2023
7 checks passed
@faho faho deleted the riir-functions-2 branch August 13, 2023 12:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants