New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default `string split` to removing empty entries with option to keep #4779

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
3 participants
@mqudsi
Copy link
Contributor

mqudsi commented Mar 5, 2018

The official fish documentation makes no mention of how string split treats
empty tokens, e.g. splitting 'key1##key2' on '#' or (more confusingly)
splitting '/path' on '/'. With this commit, string split by default excludes
zero-length substrings from the resulting array, but gains a new
--keep-empty/-k option to explicitly include them in the results.

Author's note: I'm unsure if there is any code out there that will be broken
by this. Whether we default to keeping empty substrings or not, I believe we
need to officially document it one way or the other for fish 3.0. My
preference is to exclude empty substrings because there isn't much you can do
with them and they can result in unexpected (but wholly correct) output
depending on the nature of the input, e.g. while string split 'key1##key2' '#' returning an empty string in the middle makes sense, string split /some/path/to/a/file / returning an empty string at the start of the array is
less obvious.

Default `string split` to removing empty entries with option to keep
The official fish documentation makes no mention of how `string split`
treats empty tokens, e.g. splitting 'key1##key2' on '#' or (more
confusingly) splitting '/path' on '/'. With this commit, `string split`
by default excludes zero-length substrings from the resulting array, but
gains a new `--keep-empty/-k` option to explicitly include them in the
results.

Author's note: I'm unsure if there is any code out there that will be
broken by this. Whether we default to keeping empty substrings or not, I
believe we need to officially document it one way or the other for fish
3.0. My preference is to exclude empty substrings because there isn't
much you can do with them and they can result in unexpected (but wholly
correct) output depending on the nature of the input, e.g. while `string
split 'key1##key2' '#' returning an empty string in the midddle makes
sense, `string split /some/path/to/a/file /` returning an empty string
at the start of the array is less obvious.
@mqudsi

This comment has been minimized.

Copy link
Contributor

mqudsi commented Mar 5, 2018

As mentioned, I believe this needs to be implemented one way or the other (with either this commit's --keep-empty or its complement, --no-empty). I'll hold off on updating the docs until I see how everyone feels about defaulting to --no-empty-like behavior.

@faho

This comment has been minimized.

Copy link
Member

faho commented Mar 6, 2018

I'm kinda wary of changing the default here. That would break a bunch of stuff.

E.g. setenv has:

set -gx $var (string split -- ':' $val)

given that empty elements in $PATH have a specific meaning, that would change behavior.

Other things would benefit from this - the hg completions have

set -l parts (string trim $line | string split " " -m 1)

which just looks like an attempt to remove the empty elements at the beginning and end.

But that would be alright as an additional option.

So: "--no-empty" please!

@faho faho added this to the fish-3.0 milestone Mar 6, 2018

@mqudsi

This comment has been minimized.

Copy link
Contributor

mqudsi commented Mar 10, 2018

Nice catch on the path splitting. I question whether that particular case had been considered at the time or if it's just a lucky coincidence! I actually didn't know there could be an empty entry in the PATH, what meaning does that have?

It's perhaps a question of utility, however. If we are taking this opportunity to break things, I suppose we should ask "how much will break" and "is the future benefit worth it" (i.e. to which degree is either option "the intended behavior" when used)?

Actually, to take the example of PATH we were just discussing. Is the effect of an empty entry in the PATH more often a confusing side effect one wasn't expecting, or purposeful behavior desired and consciously achieved?

@faho

This comment has been minimized.

Copy link
Member

faho commented Mar 13, 2018

I question whether that particular case had been considered at the time or if it's just a lucky coincidence!

We have had discussions about empty $PATH components, but IIRC not in that context.

I actually didn't know there could be an empty entry in the PATH, what meaning does that have?

Same thing as "." - i.e. the current $PWD. (Yeah, not very exciting)

If we are taking this opportunity to break things, I suppose we should ask "how much will break" and "is the future benefit worth it"

I'd add a third and fourth question:

  • How easy is it to find breakage?

  • How easy is it to be cross-compatible with older fishes?

And the former is the big issue here. You need to check every single use of string split. And you need to actually think about it. There's no easy way to grep for "string split, but with this option".

Compared with e.g. my {,,,} change. To check, you simply grep for {.*,,.*} or similar. And to be cross-compatible, you just avoid that syntax. (same goes for {}, but that was useless before 3.0)

Or the read -s change - you grep for read -s, and then you get a couple results (it's used much less than string split is). You look at that and it's not too hard to see if it was supposed to mean "silent" or "shell".

It's not that I'm against changing things for the better, it's that I want to have a smooth upgrade path for users.

@mqudsi

This comment has been minimized.

Copy link
Contributor

mqudsi commented Mar 13, 2018

That's certainly fair enough and very well put. Obviously it makes my job easier, too 😉

@faho

This comment has been minimized.

Copy link
Member

faho commented Mar 25, 2018

I'm closing this PR now.

@mqudsi: If you change the default, you can just merge it. (I'd probably call it "--remove-empty" or "--filter-empty" or something then)

@faho faho closed this Mar 25, 2018

@mqudsi mqudsi deleted the mqudsi:string_split_keep_empty branch Mar 29, 2018

@zanchey zanchey modified the milestones: fish-3.0, will-not-implement Aug 10, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment