Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add read --delimiter #4256

Closed
wants to merge 23 commits into from
Closed

Add read --delimiter #4256

wants to merge 23 commits into from

Conversation

@faho
Copy link
Member

@faho faho commented Jul 27, 2017

Description

This is a continuation of #4160. The delimiter is now used as a complete string like in string split.

The IFS fallback remains - we should not remove it in 3.0, but we should deprecate it somehow.

Fixes issue #4156.

TODOs:

  • Changes to fish usage are reflected in user documentation/manpages.
  • Tests have been added
  • User-visible changes noted in CHANGELOG.md
krader1961 and others added 23 commits Jul 20, 2017
When executing a function, local-exported (`set -lx`) variables
previously were not accessible at all. This is weird e.g. in case of
aliases, since

```fish
set -lx PAGER cat
git something # which will call $PAGER
```

would not work if `git` were a function, even if that ends up calling
`command git`.

Now, we copy these variables, so functions get a local-exported copy.

```fish
function x
    echo $var
    set var wurst
    echo $var
end
set -lx var banana
x # prints "banana" and "wurst"
echo $var # prints "banana"
```

One weirdness here is that, if a variable is both local and global,
the local-copy takes precedence:

```fish
set -gx var banana
set -lx var pineapple
echo $var # prints "pineapple"
x # from above, prints "pineapple" and "wurst"
echo $var # still prints "pineapple"
set -el var # deletes local version
echo $var # "banana" again
```

I don't think there is any more consistent way to handle this - the
local version is the one that is accessed first, so it should also be
written to first.

Global-exported variables are _not_ copied, instead they still offer
full read-write access.
This used to create copies even then, which meant it couldn't modify them.
This is a bit minimal, but I'm not sure how often it should be mentioned.
When reporting whether a boolean flag was seen report the actual flags
rather than a summary count. For example, if you have option spec `h/help`
and we parse `-h --help -h` don't do the equivalent of `set _flag_h 3`
do `set _flag_h -h --help -h`.

Partial fix for fish-shell#4226
@faho noticed that option specs which don't have a long flag name are
not handled correctly. This fixes that and adds unit tests.

Fixes fish-shell#4232
The fix for fish-shell#4232 allows us to simplify the `history` function slightly.
The *src/builtin_set.cpp* code needs a major refactoring. This is the
first baby step in doing so.

Partial fix for fish-shell#4236
This completes the refactoring of the `set` builtin. It also removes a
seemingly never used feature of the `set` command. It also eliminates all
the lint warnings about this module.

Fixes fish-shell#4236
This is a step towards resolving issue fish-shell#4156. It replaces uses of `$IFS`
with other solutions.
This is a step towards resolving issue fish-shell#4156. It replaces uses of `$IFS`
with other solutions.
Put it into wcstringutil for use with builtin_read.
This takes a string that is then split upon like `string split`.

Unlike $IFS, the string is used as one piece, not a set of characters.

There is still a fallback to IFS if no delimiter is given, that
behaves exactly as before.

Work towards fish-shell#4156.
@faho faho added this to the fish-3.0 milestone Jul 27, 2017
@faho
Copy link
Member Author

@faho faho commented Jul 27, 2017

Note: I created this branch before pulling major again, so this shows a bunch of unrelated commits. The interesting ones are the last 5.

I'll merge this locally and push it once it has been accepted.

Loading

@faho
Copy link
Member Author

@faho faho commented Jul 27, 2017

Also, split_about should probably interpret max = 0 to mean unlimited splits.

Currently, we pass a max of LONG_MAX instead, which (in theory) still limits the number of splits. Also, it requires including limits.h in builtin_read.

Loading

@krader1961
Copy link
Contributor

@krader1961 krader1961 commented Jul 27, 2017

Currently, we pass a max of LONG_MAX instead, which (in theory) still limits the number of splits.

While I agree that's the better API design I'm not worried about it. Because with 32 bit ints we would have to do more than 2 billion splits before unintentionally and prematurely terminating the loop. Something that is unlikely to ever happen. Even with 16 bit ints it is pretty unlikely to ever be a problem.

Loading

@faho
Copy link
Member Author

@faho faho commented Jul 27, 2017

While I agree that's the better API design I'm not worried about it.

Sure. There's no real rush.

One more thing that would change is that currently string split -m 0 does no split. I'm not sure if that's useful or not.

Anyway, I just wanted to mention it since I'm touching the function and making it more prominent.

Loading

// not on the entire thing at once.
wcstring tokens;
tokens.reserve(buff.size());
bool empty = true;
Copy link
Contributor

@krader1961 krader1961 Jul 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you just moved this code but it seems like we can eliminate the empty var and just test tokens.empty().

Loading

@@ -39,7 +41,7 @@ The following options are available:

- `-z` or `--null` reads up to NUL instead of newline. Disables interactive mode.

`read` reads a single line of input from stdin, breaks it into tokens based on the `IFS` shell variable, and then assigns one token to each variable specified in `VARIABLES`. If there are more tokens than variables, the complete remainder is assigned to the last variable. As a special case, if `IFS` is set to the empty string, each character of the input is considered a separate token.
`read` reads a single line of input from stdin, breaks it into tokens based on the delimiter set via `-d`/`--delimiter` as a complete string or, if that has not been given the (deprecated) `IFS` shell variable as a set of characters, and then assigns one token to each variable specified in `VARIABLES`. If there are more tokens than variables, the complete remainder is assigned to the last variable. As a special case, if `IFS` is set to the empty string, each character of the input is considered a separate token.
Copy link
Contributor

@krader1961 krader1961 Jul 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there also be a reference to how read --delimiter behaves the same as string split? And vice-versa. The string split description points out that read --delimiter can be used to sometimes avoid the need to subsequently call string split.

Loading

@krader1961
Copy link
Contributor

@krader1961 krader1961 commented Jul 28, 2017

LGTM. One minor code simplification and a suggestion for the documentation to make it clear how this new feature relates to string split.

Loading

@faho
Copy link
Member Author

@faho faho commented Jul 28, 2017

Merged as 78889cc..b1866b1.

Loading

@faho faho closed this Jul 28, 2017
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 17, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants