Various problems with word recognition #384

Closed
SanskritFritz opened this Issue · 40 comments

9 participants

@SanskritFritz

This concerns Alt-Left/Right and Alt-Backspace.
The recognition of a word on the command line has been changed to be more permissive. For example, fish now treats this as one word:
firstword|secondword;
so Alt-Backspace deletes it in one stroke.
I also remember "file.ext" being two words; now it is treated as one. I would like the former behaviour back.

Also a bug with Alt-Backspace:
I have this on the command line: "locate completions" and the cursor stands on the c of "completions". When I hit Alt-Backspace, the word "locate" will be correctly deleted, but also the first character of "completions", leaving me with "ompletions".

@leoboiko

there are at least three definitions of word boundaries to think about: emacs, vi, and unicode… presumably users of emacs keybindings would prefer emacs words, and similarly for vi? unicode always sounds nice, but I'm not sure it's necessarily a good idea for the command-line...

@SanskritFritz

I just want the old behaviour back.

@stestagg
@leoboiko

Well, Emacs speakers will really expect alt+b to stop at the underscore in ls foo/my_dir[alt+b], whereas vi natives will really expect esc+b to stop on slash, and esc+shift+B to stop on space (and emacs+vi bilinguals like me will expect it to depend on keybinding). This is how bash does it, depending on whether it's in vi or emacs mode.

However, I agree that one could argue that both emacs and vim are living fossils, and that the "friendly interactive" shell could well choose more sensible word boundaries for commandline arguments. I'm just reviewing all the options.

@maxfl

I also have problems with the new behaviour.
@stestagg is right about the context. I do not want word characters to be like in emacs or like in vim. They should be comfortable for command-line editing: first of all, the cursor should stop on space, /, =, -, ., >, ^ and possibly _, because _ is often used as an inner separator.

@SanskritFritz

Honestly I think we should find out what commit caused the change and revert that part.

@maxfl

It seems that the behaviour changed after commit 8eb53ea:

commit 8eb53ea
Author: ridiculousfish <corydoras@ridiculousfish.com>
Date: Thu Oct 4 14:35:03 2012 -0700

Rewrite kill behavior (aka control-W) to do something better
Fixes https://github.com/fish-shell/fish-shell/issues/327
@ridiculousfish

The following change makes word movement stop at separator boundaries: |, ;, <, etc. It also fixes the issue where the character under the cursor is deleted as well.

My thinking is that for command line shells, 'words' are either path components, which are separated by /, or tokens, which are separated by whitespace and characters like |, ;, <. However I am interested in hearing more discussion on the issue.

To git@github.com:fish-shell/fish-shell.git
2667868..44ce3e6 master -> master
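
A minimal Python sketch of the word model described in this comment: tokens end at whitespace and characters like |, ;, <, and path components end at /. The exact separator set and the name backward_kill are assumptions for illustration, not fish's actual implementation. It also shows the cursor fix: only text to the left of the cursor is removed.

```python
# Assumed separator set, taken from the characters mentioned in the comment.
TOKEN_SEPARATORS = set(" \t|;<>&")
PATH_SEPARATOR = "/"


def is_separator(ch: str) -> bool:
    return ch in TOKEN_SEPARATORS or ch == PATH_SEPARATOR


def backward_kill(line: str, cursor: int) -> tuple:
    """Delete the 'word' left of the cursor; return (new_line, new_cursor)."""
    start = cursor
    # Skip separators immediately left of the cursor.
    while start > 0 and is_separator(line[start - 1]):
        start -= 1
    # Then consume the word itself, back to the previous separator.
    while start > 0 and not is_separator(line[start - 1]):
        start -= 1
    # The character under the cursor (line[cursor]) is left untouched.
    return line[:start] + line[cursor:], start


print(backward_kill("locate completions", 7))
print(backward_kill("firstword|secondword;", 21))
```

With the cursor on the c of "completions", only "locate " is removed, and "firstword|secondword;" loses "secondword;" rather than the whole string.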

@maxfl

It also should be comfortable to:
1) Work with option arguments: change --option=arg1 to --option=arg2, i.e. the word is divided at '='.
2) Sometimes it is also useful when changing long options:
--long-option1 -> --long-option2
which is possible when words are separated by '-'.
3) Changing the file extension (a very common need). One could think of the file extension as a path component, i.e. stop on '.'.
4) Work with comma-separated lists. Stop on ','. {foo,bar,bar} is now a single word.
5) Work with long filenames with underscores. A lot of people use underscores to add suffixes to a filename, and often only the suffix needs changing. It would be very helpful to stop on underscore.
6) Quotes: "$foo"bar is considered to be a single word now. To change bar comfortably one should introduce '"' as word separator.
7) Brackets/parentheses: these are all words now $var[1], text(echo), text{text}.
8) Dollar sign: $PATH$PATH is also a single word.
9) '@' and ':' are also used in paths: username@hostname.com:some/path.
10) Wildcards to flexibly change patterns: foo*bar?
11) Tilde, plus sign, percent sign, backslash.

The trade-off, taking --option=argument1 and --option=argument2 as an example, is:
Either you press ^W two or three times to delete the whole token, but you can delete argument1 with a single ^W.
Or you delete the whole token with a single ^W, but have to press backspace 9 times (or hold it for a while) to delete argument1. The same holds for all the examples: you save a few ^W/Alt-Left/Alt-Right key presses at the cost of losing the ability to save many more backspace/left/right presses.

@leoboiko

@maxfl : Notice though that Emacs word separation (and therefore bash's default) stops in all the cases you give (namely [ "'/=-.>^_,]). So, with the cursor following 'r', meta-backspace in emacs/bash will delete the three characters "bar" in all of the following:

"$foo"bar
'$foo'bar
foo bar
foo/bar
foo=bar
foo-bar
foo.bar
foo>bar
foo^bar
foo,bar
foo_bar
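
To make the claim concrete, here is a small Python sketch of that rule, assuming a word is simply a run of alphanumeric characters (the definition bash documents for backward-word). The function name is borrowed from readline; the code is an illustration, not readline's source.

```python
def backward_kill_word(line: str, cursor: int) -> str:
    """Kill back over non-alphanumerics, then over one alphanumeric run."""
    start = cursor
    # Skip any non-alphanumeric characters left of the cursor.
    while start > 0 and not line[start - 1].isalnum():
        start -= 1
    # Then skip the alphanumeric run that forms the word.
    while start > 0 and line[start - 1].isalnum():
        start -= 1
    return line[:start] + line[cursor:]


examples = ['"$foo"bar', "'$foo'bar", "foo bar", "foo/bar", "foo=bar",
            "foo-bar", "foo.bar", "foo>bar", "foo^bar", "foo,bar", "foo_bar"]

for text in examples:
    # Cursor sits after the final 'r' in every example.
    print(repr(text), "->", repr(backward_kill_word(text, len(text))))
```

Each line loses exactly its trailing "bar", because every character in front of it is non-alphanumeric.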
@maxfl

@leoboiko, I'm well aware of that. Vim and most other sane editors do the same.
I'm almost sure that we can simply copy the emacs/bash/vim behaviour without any problems. But there is nothing wrong with trying to understand which word separators we want to use and why.

@SanskritFritz

@ridiculousfish It is much better now, thank you. But please, pretty please, could you make it so that it would stop at dashes and periods as well? For example deleting a file extension with alt-backspace is very common for me and is natural in any editor I use.

@maxfl

By the way a possible solution for this is to introduce a variable holding word separators. It is not very fishy, I know. But I find this acceptable because it generally doesn't affect fish performance, but only user interaction. In this way it would be comparable to the color variables.

@darthdeus

Another argument for this is that both zsh and bash stop at ._- while fish doesn't.

A dot . is very handy when you're just changing a file extension, or part of a URL.

Dashes - and underscores _ on the other hand are often used for kind of namespacing things and in URLs. For example you might have something like some-app-with-very-long-name-1.example.com. It is impossible to navigate parts of the URL with M-b and M-f and you basically have to use arrows to find the right place.

Another example might be git branches, where you can have hotfix-hamburger-issue and you want to change it to hotfix-cheesburger-issue. In zsh you can press M-b M-b M-d and end up with hotfix-<cursor here>-issue. But in fish it means you have to use arrows for this.

In VIM you can easily do something like F-cT-, or F_cT_ in case of underscores (because b in VIM would also jump over the whole thing in case of underscores). So having it behave the same way as VIM does also makes no sense, because it is a completely different context.

@SanskritFritz

I fully agree with @darthdeus here. Please, pretty please make fish stop again at periods and dashes and stuff.

@maxfl

Not much discussion still.

@ridiculousfish, can you try the following example:
Use this command line and try to change longvariableB/suffixB to something else. Start from the first symbol and try to use AltF to jump the word boundaries:
echo $longvariableA$longvariableB$onemorelongvaria
echo /some/long/path/directoryA_{suffix1,suffixB,OneMoreSuffixC}_continuation/file.dat

Sorry for bothering.

@stestagg

If this is that important to you, I would suggest you apply patch #410

This allows you to set the word separator characters as you need, by changing the FISH_WORD_SEPARATORS environment variable
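
A rough Python sketch of the idea of a separator variable. How patch #410 actually parses FISH_WORD_SEPARATORS is not shown in this thread, so the format (one plain string of characters), the default value, and the function names here are all assumptions.

```python
import os

# Hypothetical default, built from characters suggested in this thread.
DEFAULT_SEPARATORS = " /=-.>^_,"


def word_separators(environ=None) -> set:
    """Read the separator characters from the environment (assumed format)."""
    if environ is None:
        environ = os.environ
    return set(environ.get("FISH_WORD_SEPARATORS", DEFAULT_SEPARATORS))


def backward_kill_word(line: str, cursor: int, separators: set) -> str:
    start = cursor
    # Skip separators left of the cursor, then the word before them.
    while start > 0 and line[start - 1] in separators:
        start -= 1
    while start > 0 and line[start - 1] not in separators:
        start -= 1
    return line[:start] + line[cursor:]


line = "vim src/file.ext"
print(backward_kill_word(line, len(line), word_separators({})))
print(backward_kill_word(line, len(line),
                         word_separators({"FISH_WORD_SEPARATORS": " /"})))
```

With the assumed default the extension alone is removed; with only space and slash as separators the whole file name goes.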

@maxfl
@stestagg

Yeah, I know @ridiculousfish doesn't want to put it into the repo, but with five of us wanting the same thing, I figured advertising it here would allow some people to stash it locally if they're compiling off the default branch.

Thanks

@JanKanis
Collaborator

I'm also in favor of the old behavior!

@ridiculousfish

I'm still considering it and reading discussions.

@ridiculousfish

@maxfl, in your example, stopping at commas and { } seems good. I think it was just an oversight that I omitted those.

@darthdeus, when I try with bash and zsh, it does not stop at . or _:

echo hello_world.txt
                    ^

I hit control-W and it results in:

echo 
    ^

with both bash and zsh. Is there some configuration option that we have set differently? As far as I know I'm using the default behavior.

@SanskritFritz

I thought fish boasts of being more convenient and easy to use than bash ;)
In my eyes not stopping at periods is a huge disadvantage of both bash and zsh.

@ridiculousfish

@SanskritFritz I agree, fish should do the right thing, regardless of what bash and zsh do. I'm just interested in understanding the behavior that darthdeus sees.

Here's one case I run into a lot:

cd /Users/me/github/fish-shell/fish.o<Ctrl-w>

I want to cd to a directory open in Finder, so I drag a file from Finder into Terminal, then hit Ctrl-W to delete the last path component. Deleting only the extension would be frustrating.

Maybe I want to go back further:

cd /Users/me/github/fish-shell/<Ctrl-w>

I would again find it frustrating to stop at cd /Users/me/github/fish- because there's no other directory with that prefix.

It's possible my resistance to the old behavior extends only to cd. How would people feel about different behavior when navigating a cd command? When using cd it stops at whole path components, and in other cases it stops at '.-_' within a path component.

(I'm also considering whether it would make sense to stop at path extensions only if there is a file with the same name but different extension. However that would require doing I/O, which would introduce a lot of complications, so I'm leaning against it.)
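
The cd-specific idea could look roughly like this in Python. This is only a sketch of the proposal: the rule "the first token is exactly cd" and both separator sets are assumptions, and no file system I/O is done.

```python
PATH_ONLY = set(" \t/")                # whole path components
FINE_GRAINED = PATH_ONLY | set(".-_")  # also stop inside a component


def separators_for(line: str) -> set:
    # Assumption: "navigating a cd command" means the first token is "cd".
    tokens = line.split()
    return PATH_ONLY if tokens and tokens[0] == "cd" else FINE_GRAINED


def backward_kill(line: str) -> str:
    """Kill the word at the end of the line using context-dependent rules."""
    separators = separators_for(line)
    start = len(line)
    while start > 0 and line[start - 1] in separators:
        start -= 1
    while start > 0 and line[start - 1] not in separators:
        start -= 1
    return line[:start]


print(backward_kill("cd /Users/me/github/fish-shell/fish.o"))
print(backward_kill("vim /Users/me/github/fish-shell/fish.o"))
```

The cd line drops the whole last component, while the vim line drops only the extension.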

@SanskritFritz

@ridiculousfish So maybe configurability isn't so bad after all? We all have our preferences. Staying with your first example: for me it is frustrating that ctrl-w doesn't stop at the extension :)
We have already compromised on zero configurability by allowing colours to be set, so maybe we could make another exception here and apply the FISH_WORD_SEPARATORS patch.

@pgan002

@ridiculousfish
Bash has two relevant functions:
unix-word-rubout (C-w)
Kill the word behind point, using white space as a word boundary. The killed text is saved on the kill-ring.
backward-kill-word (M-Rubout)
Kill the word behind point. Word boundaries are the same as those used by backward-word.
backward-word (M-b)
Move back to the start of the current or previous word. Words are composed of alphanumeric characters (letters and digits).

> echo hello_world.txt<C-w>
> echo 
> echo hello_world.txt<M-backspace>
> echo hello_world.<M-backspace>
> echo hello_<M-backspace>
> echo 
@pgan002

@SanskritFritz
Configuration makes the user think about the interface instead of use it. It tempts us away from making a solution that does the right thing in all (most) cases. It lets the user mess up their shell, and makes documentation harder to write and understand, and code harder to maintain. For those who like configurability, we already have Zsh.

Sometimes configuration is necessary because we have very disparate users or use cases, but let's try very hard to make it easy to use for all use cases. We can do that in this case.

@pgan002

For usability, I think these principles are important:

  1. Moving and deleting should use the same word separators
  2. Only one way to delete, unlike Bash
  3. No context-specific behavior (such as whether you are editing an existing path)
  4. No configuration, except maybe inputrc.
  5. A simple rule
  6. Few keystrokes for common use cases
  7. Be similar to text editors and other shells (least surprise)

I propose to stop at any non-alphanumeric characters.

This is also what most modern editors do, and what Bash does. This is the set from the 11 points proposed by @maxfl a month ago, plus the following:
12. > - Easy to edit averylongcommandname>averylongfilename
13. <%\^!& - Simplicity and consistency (Principle 5). Also for easy editing of regular expressions and expressions for test, sed, awk, expr etc.

Should we stop at repeated word delimiters, like echo >> filename? My gut says no.

Should we stop at adjacent word delimiters, like ls abc*{,1,2,3}?zzz?

Should we stop only before the word delimiter, like Bash:

hello_world<C-backspace>
hello

or before and after, like many text editors:

hello_world<C-backspace>
hello_<C-backspace>
hello
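
The two options in the last question can be written down as a short Python sketch. Both functions act on the end of the line and treat any non-alphanumeric character as a delimiter, as proposed above; the function names are made up for this illustration.

```python
def _skip_back(text: str, pos: int, in_run) -> int:
    """Move pos left while the character before it satisfies in_run."""
    while pos > 0 and in_run(text[pos - 1]):
        pos -= 1
    return pos


def is_delim(ch: str) -> bool:
    return not ch.isalnum()


def kill_stop_before_delimiter(text: str) -> str:
    """Remove the trailing word and the delimiters in front of it."""
    pos = _skip_back(text, len(text), str.isalnum)
    pos = _skip_back(text, pos, is_delim)
    return text[:pos]


def kill_stop_both_sides(text: str) -> str:
    """Remove either the trailing word or the trailing delimiter run."""
    if text and text[-1].isalnum():
        pos = _skip_back(text, len(text), str.isalnum)
    else:
        pos = _skip_back(text, len(text), is_delim)
    return text[:pos]


print(kill_stop_before_delimiter("hello_world"))
print(kill_stop_both_sides("hello_world"))
print(kill_stop_both_sides("hello_"))
```

The first variant needs one keystroke to get from hello_world to hello; the second needs two, but lets you keep the delimiter.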
@darthdeus

@ridiculousfish C-w is not what I meant. There's a difference between C-w and M-b and M-backspace (not sure how those are exactly represented), but

echo hello_world.txt<M-backspace>
echo hello_world.<M-backspace>
echo hello_

M-d also behaves the same way, so that it is easy to use M-b to jump in between of somewhere, and then just forward delete a part of something with M-d. For example if you have

ssh foo@bar.com:/var/apps

try doing M-b M-b M-b M-d M-backspace or M-b M-b M-b M-b M-d M-d

which will yield

ssh foo@<cursor here>:/var/apps

on the other hand, in fish you have to use arrow keys, because you can only navigate like this

ssh foo@bar.com:/<cursor here>var/apps

and from that point if you try M-backspace, you'll lose the whole expression

ssh <cursor here>var/apps

There are a lot of times when I need to just delete a part of some command and I have to use arrow keys and keep tapping C-d or backspace to delete a word, because M-d would delete the whole command.
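
For anyone who wants to replay the key sequences above without a shell at hand, here is a small Python model of M-b, M-d and M-backspace. It assumes the readline-style rule that a word is a run of alphanumeric characters; the class and method names are invented for this sketch.

```python
class Line:
    """Toy command line with a cursor, starting at the end of the text."""

    def __init__(self, text: str):
        self.text = text
        self.cursor = len(text)

    def _word_start(self) -> int:
        pos = self.cursor
        # Skip non-alphanumerics, then one alphanumeric run, going left.
        while pos > 0 and not self.text[pos - 1].isalnum():
            pos -= 1
        while pos > 0 and self.text[pos - 1].isalnum():
            pos -= 1
        return pos

    def _word_end(self) -> int:
        pos = self.cursor
        # Skip non-alphanumerics, then one alphanumeric run, going right.
        while pos < len(self.text) and not self.text[pos].isalnum():
            pos += 1
        while pos < len(self.text) and self.text[pos].isalnum():
            pos += 1
        return pos

    def m_b(self) -> None:
        self.cursor = self._word_start()

    def m_d(self) -> None:
        self.text = self.text[:self.cursor] + self.text[self._word_end():]

    def m_backspace(self) -> None:
        start = self._word_start()
        self.text = self.text[:start] + self.text[self.cursor:]
        self.cursor = start


line = Line("ssh foo@bar.com:/var/apps")
for key in (line.m_b, line.m_b, line.m_b, line.m_d, line.m_backspace):
    key()
print(line.text[:line.cursor] + "<cursor here>" + line.text[line.cursor:])
```

Both sequences from the comment end with the cursor right after the @ and the host name gone.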

@maxfl

I agree with @pgan002 about avoiding any complexities like context-dependent behaviour or different behaviour for different keys.

@darthdeus

@pgan002 different behaviour for different keys is how all of the other shells behave by default. That does not mean it is more complicated, because everyone who has been using emacs-like keybindings in shells is used to these.

@ridiculousfish

@darthdeus Thank you for the explanation - I was unaware of the difference between C-w and M-backspace.

I see no reason to expect that C-w and M-backspace, being totally different key combinations, should have the same behavior; in fact doing so would seem like needless duplication. Realizing that other shells have different behavior for C-w and M-backspace makes me drop any objection to different word navigation, so I'll restore the behavior of stopping on punctuation for M-f, M-b, and M-backspace, and we can discuss further from there.

@pgan002

Having the C-w behavior as a second method of deleting has two advantages that I can think of:
1. It is more flexible
2. Users of other shells won't miss it

It has several disadvantages:
1. It is not orthogonal
2. One of the methods is not very discoverable once you know the other (as illustrated by ridiculousfish)
3. If you do discover it, it can be confusing. Also (as far as I can see) there is no corresponding method of navigating.
4. It is more complicated to explain
5. It impedes habituation because the user has to think about which method to use

I think advantage 1 is minor. I expect most users would naturally habituate to using one method because otherwise they would have to think about which method is appropriate every time they want to delete. This is a big reason to strive for orthogonality in general.

Advantage 2 is minor because:
a. Getting used to using the only way to do something is relatively easy, especially if it is the way you are already used to using in GUI text editors
b. Being like other shells is not a high priority for Fish.

Sure, some of the disadvantages are minor too. But on the whole, I feel the arguments are strong against having an additional deletion behavior.

@cben
@ridiculousfish

These two commits should cause meta-f, meta-b, and meta-backspace to stop at more punctuation and more closely match what fish 1.x did.

6b35250
0b1e371

I'm leaving this issue open to permit more discussion.

@maxfl

@ridiculousfish, thank you very much!

Obviously it's now missing 'kill-path-component' in addition to 'backward-kill-path-component'.

There is also an open question: what should be used for partial auto-completion by AltF: word separator or path separator? I myself have not decided yet, so I just raise the question.

@SanskritFritz

I'm happy now :) Thanks @ridiculousfish !

@darthdeus

@ridiculousfish perfect, thank you sooooo much :beers: It works exactly as I imagined :)

@SanskritFritz

May I close this issue, or should we wait for more comments?

@davidzchen davidzchen referenced this issue from a commit
@ridiculousfish ridiculousfish Changes to work recognition per #384
Word movement should be very similar to fish 1.x
backward-kill-word remains more liberal, but now stops at any of {,'"=}
0b1e371