New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add `read --delimiter` #4256

Closed
wants to merge 23 commits into
base: master
from

Conversation

Projects
None yet
2 participants
@faho
Copy link
Member

faho commented Jul 27, 2017

Description

This is a continuation of #4160. The delimiter is now used as a complete string like in string split.

The IFS fallback remains - we should not remove it in 3.0, but we should deprecate it somehow.

Fixes issue #4156.

TODOs:

  • Changes to fish usage are reflected in user documentation/manpages.
  • Tests have been added
  • User-visible changes noted in CHANGELOG.md

krader1961 and others added some commits Jul 20, 2017

Copy local-exported variables
When executing a function, local-exported (`set -lx`) variables
previously were not accessible at all. This is weird e.g. in case of
aliases, since

```fish
set -lx PAGER cat
git something # which will call $PAGER
```

would not work if `git` were a function, even if that ends up calling
`command git`.

Now, we copy these variables, so functions get a local-exported copy.

```fish
function x
    echo $var
    set var wurst
    echo $var
end
set -lx var banana
x # prints "banana" and "wurst"
echo $var # prints "banana"
```

One weirdness here is that, if a variable is both local and global,
the local-copy takes precedence:

```fish
set -gx var banana
set -lx var pineapple
echo $var # prints "pineapple"
x # from above, prints "pineapple" and "wurst"
echo $var # still prints "pineapple"
set -el var # deletes local version
echo $var # "banana" again
```

I don't think there is any more consistent way to handle this - the
local version is the one that is accessed first, so it should also be
written to first.

Global-exported variables are _not_ copied, instead they still offer
full read-write access.
Fix local-exported vars with "--no-scope-shadowing"
This used to create copies even then, which meant it couldn't modify them.
Document local-exported variable change
This is a bit minimal, but I'm not sure how often it should be mentioned.
change how `argparse` handles boolean flags
When reporting whether a boolean flag was seen report the actual flags
rather than a summary count. For example, if you have option spec `h/help`
and we parse `-h --help -h` don't do the equivalent of `set _flag_h 3`
do `set _flag_h -h --help -h`.

Partial fix for #4226
fix `argparse` handling of short flag only specs
@faho noticed that option specs which don't have a long flag name are
not handled correctly. This fixes that and adds unit tests.

Fixes #4232
simplify `history` function
The fix for #4232 allows us to simplify the `history` function slightly.
first step in refactoring the `set` implementation
The *src/builtin_set.cpp* code needs a major refactoring. This is the
first baby step in doing so.

Partial fix for #4236
refactor `set` builtin
This completes the refactoring of the `set` builtin. It also removes a
seemingly never used feature of the `set` command. It also eliminates all
the lint warnings about this module.

Fixes #4236
remove some uses of `$IFS`
This is a step towards resolving issue #4156. It replaces uses of `$IFS`
with other solutions.
remove some uses of `$IFS`
This is a step towards resolving issue #4156. It replaces uses of `$IFS`
with other solutions.
Extract split_about from string
Put it into wcstringutil for use with builtin_read.
Implement `read --delimiter`
This takes a string that is then split upon like `string split`.

Unlike $IFS, the string is used as one piece, not a set of characters.

There is still a fallback to IFS if no delimiter is given, that
behaves exactly as before.

Work towards #4156.

@faho faho added the enhancement label Jul 27, 2017

@faho faho added this to the fish-3.0 milestone Jul 27, 2017

@faho

This comment has been minimized.

Copy link
Member

faho commented Jul 27, 2017

Note: I created this branch before pulling major again, so this shows a bunch of unrelated commits. The interesting ones are the last 5.

I'll merge this locally and push it once it has been accepted.

@faho

This comment has been minimized.

Copy link
Member

faho commented Jul 27, 2017

Also, split_about should probably interpret max = 0 to mean unlimited splits.

Currently, we pass a max of LONG_MAX instead, which (in theory) still limits the number of splits. Also, it requires including limits.h in builtin_read.

@krader1961

This comment has been minimized.

Copy link
Contributor

krader1961 commented Jul 27, 2017

Currently, we pass a max of LONG_MAX instead, which (in theory) still limits the number of splits.

While I agree that's the better API design I'm not worried about it. Because with 32 bit ints we would have to do more than 2 billion splits before unintentionally and prematurely terminating the loop. Something that is unlikely to ever happen. Even with 16 bit ints it is pretty unlikely to ever be a problem.

@faho

This comment has been minimized.

Copy link
Member

faho commented Jul 27, 2017

While I agree that's the better API design I'm not worried about it.

Sure. There's no real rush.

One more thing that would change is that currently string split -m 0 does no split. I'm not sure if that's useful or not.

Anyway, I just wanted to mention it since I'm touching the function and making it more prominent.

// not on the entire thing at once.
wcstring tokens;
tokens.reserve(buff.size());
bool empty = true;

This comment has been minimized.

@krader1961

krader1961 Jul 28, 2017

Contributor

I know you just moved this code but it seems like we can eliminate the empty var and just test tokens.empty().

@@ -39,7 +41,7 @@ The following options are available:

- `-z` or `--null` reads up to NUL instead of newline. Disables interactive mode.

`read` reads a single line of input from stdin, breaks it into tokens based on the `IFS` shell variable, and then assigns one token to each variable specified in `VARIABLES`. If there are more tokens than variables, the complete remainder is assigned to the last variable. As a special case, if `IFS` is set to the empty string, each character of the input is considered a separate token.
`read` reads a single line of input from stdin, breaks it into tokens based on the delimiter set via `-d`/`--delimiter` as a complete string or, if that has not been given the (deprecated) `IFS` shell variable as a set of characters, and then assigns one token to each variable specified in `VARIABLES`. If there are more tokens than variables, the complete remainder is assigned to the last variable. As a special case, if `IFS` is set to the empty string, each character of the input is considered a separate token.

This comment has been minimized.

@krader1961

krader1961 Jul 28, 2017

Contributor

Should there also be a reference to how read --delimiter behaves the same as string split? And vice-versa. The string split description points out that read --delimiter can be used to sometimes avoid the need to subsequently call string split.

@krader1961

This comment has been minimized.

Copy link
Contributor

krader1961 commented Jul 28, 2017

LGTM. One minor code simplification and a suggestion for the documentation to make it clear how this new feature relates to string split.

@faho

This comment has been minimized.

Copy link
Member

faho commented Jul 28, 2017

Merged as 78889cc..b1866b1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment