Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tab completion menu causes inconsistent editor state in emacs #71

Closed
posguy99 opened this issue Jul 11, 2020 · 37 comments
Closed

Tab completion menu causes inconsistent editor state in emacs #71

posguy99 opened this issue Jul 11, 2020 · 37 comments
Labels
bug Something is not working

Comments

@posguy99
Copy link

Line editing is set thus:

[44] mbp13 $ set -o
emacs                    on
vi                       off
viraw                    on

Start typing, get to bin/p, hit TAB. Nothing happens. Hit TAB again. (why do I have to hit TAB twice?) Get:

[04:59 PM][ttys001][~]
[45] mbp13 $ bin/p
1) pkg-config
2) pscal
3) pscal.saf
[45] mbp13 $ bin/pscal       

Hit '2', hit TAB again, shell fills out bin/pscal.

[45] mbp13 $ bin/pscal

Now, backspace over the command line...

[04:59 PM][ttys001][~]
[45] mbp13 $ bin/p
1) pkg-config
2) pscal
3) pscal.saf
[45] mbp13 $ b 

I'm left with the 'b', which isn't actually there, I can hit ENTER and not get an error message, and it's not stored in history.

10.15.5, ksh93u+m, ksh93u+ does it too.

@McDutchie
Copy link

McDutchie commented Jul 11, 2020

Start typing, get to bin/p, hit TAB. Nothing happens. Hit TAB again. (why do I have to hit TAB twice?) Get:

I'm pretty sure that having to hit TAB twice there is normal, but I don't have access to the Korn Shell book right now to verify it.

edit: It's in sh.1: "When using a tab for completion that does not yield a unique match, a subsequent tab will provide a numbered list of matching alternatives."

Hit '2', hit TAB again, shell fills out bin/pscal.

Ah. And it also wrongly moves the cursor two positions to the right of the file name instead of one position. Which would explain why you can't backspace past the first character. In fact you are deleting it, the terminal display just isn't reflecting it because the cursor is positioned wrong.

I'll try to hunt down that bug in the coming days. Thanks for the report.

@McDutchie McDutchie added the bug Something is not working label Jul 11, 2020
@McDutchie
Copy link

Just to confirm that ksh2020 also has this bug, so there is no bugfix from there or the preceding 93v- beta to backport.

And it looks like the menu completion in vi mode is also buggy, but in a different way; an inconsistent state occurs if you cause the menu to appear with two TABs (which also puts you in command mode) but then re-enter insert mode (e.g. with a) instead of entering a number. Fun fun fun.

$ ksh -o vi
$ cd /
$ bin/p            [press TAB twice]
1) pax
2) ps
3) pwd             [now press 'a' to re-enter insert mode, type 'wd' and then return]
$ bin/pwd
>                  [the PS2 prompt wrongly appears; pressing return then executes 'pwd']
/
$ 

@McDutchie McDutchie changed the title Line-editing leaves extraneous characters Tab completion menu causes inconsistent editor state in emacs and vi modes Jul 11, 2020
@posguy99
Copy link
Author

Is this one part of the same issue or is it a new issue? Found this on the old ast-users ML, apparently the user complaining about his issue got told it was normal behavior, but in testing that, I found this:

[07:14 PM][ttys000][~/test]
[135] mbp13 $ ls
JHS_REFRESH-1.txt  JHS_REFRESH-A.txt
[07:14 PM][ttys000][~/test]
[136] mbp13 $ JH 

hit TAB

[136] mbp13 $ JH      

Four spaces get inserted. Why? Hit TAB again

[136] mbp13 $ JH      JHS_REFRESH-

Huh? This time it did the filename completion. Hit TAB again

[136] mbp13 $ JH      JHS_REFRESH-
1) JHS_REFRESH-1.txt
2) JHS_REFRESH-A.txt
[136] mbp13 $ JH      JHS_REFRESH-   

And now it displays the completion list menu?

There's some confusion going on here between what KSH does if it's trying to complete a command (the first token on the line) and trying to complete a filename (everything else).

The original thread: https://www.mail-archive.com/ast-users@lists.research.att.com/msg00419.html
The simpler testcase: https://www.mail-archive.com/ast-users@lists.research.att.com/msg00436.html

@McDutchie
Copy link

McDutchie commented Jul 11, 2020

Thanks for the links. They correctly report another bug with menu tab completion. According to my testing, the linked bug can only be reproduced in emacs mode and not in vi mode, and ksh2020 still has this bug so no fix is available to backport. Shame the original reporter was ignored back in 2013. I'll try to find a fix for it as part of this issue.

Re your own observations...

hit TAB

[136] mbp13 $ JH      

Four spaces get inserted. Why?

What was inserted was a literal tab character (note that it takes only one backspace to remove it), because "JH" does not match any command. ksh falls back to a literal tab if there is no match, which would have been the behaviour without any edit mode active.

I'm pretty sure this is the intended behaviour and not really a bug. However, I think this particular ksh behaviour is annoying and ill thought out, and no other shell acts this way (not even pdksh/mksh). So I might be willing to consider changing it in 93u+m, but that would have to be dealt with in a separate issue.

Hit TAB again

[136] mbp13 $ JH      JHS_REFRESH-

Huh? This time it did the filename completion.

That's correct behaviour (although your surprise at it does illustrate that the literal tab behaviour above is ill thought out). A command name has been entered and followed by a blank separator (the literal tab character), so at this point it doesn't matter if the command exists or not; you're now completing file names for a JH command.

By the same token, displaying the file completion menu after a second TAB is also normal behaviour.

@JohnoKing
Copy link

Now, backspace over the command line...

[04:59 PM][ttys001][~]
[45] mbp13 $ bin/p
1) pkg-config
2) pscal
3) pscal.saf
[45] mbp13 $ b 

I'm left with the 'b', which isn't actually there, I can hit ENTER and not get an error message, and it's not stored in history.

I can only reproduce this bug when the multiline option is enabled. If multiline editing is disabled with set +o multiline, the bug vanishes.

@McDutchie
Copy link

I wrote:

ksh falls back to a literal tab if there is no match, which would have been the behaviour without any edit mode active.

I'm pretty sure this is the intended behaviour and not really a bug.

While this does seem to work correctly in vi mode, it turns out to be very broken in emacs mode. If you try to type a second literal tab, you get a completion menu with all possible commands, filling your scrollback buffer. So that's another bug.

@posguy99
Copy link
Author

I can only reproduce this bug when the multiline option is enabled. If multiline editing is disabled with set +o multiline, the bug vanishes.

Confirmed. I didn't think of that.

posguy99 referenced this issue Feb 22, 2021
In the emacs editor:
 1.  press ESC
 2.  change the size of your terminal window
and your screen is mysteriously cleared. (Until recent fixes, the
shell probably also crashed somewhere in the job control code.)

The cause is the way SIGWINCH is handled by ed_read() in edit.c.
For the emacs editor, it sends a Ctrl+L character to the input
buffer. The Ctrl+L command refreshes the command line. And it so
happens that ESC plus Ctrl+L is a command to clear the screen in
the emacs editor.

With the exeption of vi insert/command mode for which it uses a
shared flag, edit.c does not know the state of the editor, because
their data are internal to emacs.c and vi.c. So it doesn't know
whether you're in some mode that treats keyboard input specially.
Which means this way of dealing with SIGWINCH is fundamentally
misdesigned and is not worth fixing.

It gets sillier: in addition to sending keyboard commands, edit.c
was also communicating directly with emacs.c and vi.c via a flag,
e_nocrnl, which means 'please don't make Ctrl+L emit a linefeed'
(it normally refreshes on a new line but that is undesirable for
SIGWINCH). So there is already a hack that breaks the barrier
between edit.c and emacs.c/vi.c. Let's do that properly instead.

As of this commit, ed_read() does not send any fake keystrokes.
Instead, two extern functions, emacs_redraw() and vi_redraw(), are
defined for redrawing the command line. These are put in emacs.c
and vi.c so they have access to relevant static data and functions.
Then, instead of sending keyboard commands to the editor and
returning, ed_read() simply calls the redraw function for the
active editor, then continues and waits for input. Much cleaner.

src/cmd/ksh93/include/edit.h:
- Remove e_nocrnl flag from Edit_t struct.
- Define externs emacs_redraw() and vi_redraw(). Since Emacs_t and
  Vi_t types are not known here, we have to declare void* pointers
  and the functions will have to use typecasts.

src/cmd/ksh93/edit/edit.c:
- ed_read(): Call emacs_redraw() or vi_redraw() as per above.
- ed_getchar(): Remove comment about a nonexistent while loop.

src/cmd/ksh93/edit/emacs.c:
- Updates corresponding to removal of e_nocrnl flag.
- Add emacs_redraw(). This one is pretty simple. Refresh the
  command line, then ed_flush() to update the cursor display.

src/cmd/ksh93/edit/vi.c:
- Updates corresponding to removal of e_nocrnl flag. Also remove a
  similar internal 'nonewline' flag which is now equally redundant.
- Move the Ctrl+L handling code (minus writing the newline) into
  the vi_redraw() function.
- Change two cases where vi set nonewline and sent Ctrl+L to itself
  into simple vi_redraw() calls.
- Add vi_redraw(). This is more complicated as it incorporates the
  previous Ctrl+L code. It needs an added refresh() call with a
  check whether we're currently in command or insert mode, as those
  use different refresh methods. Luckily edit.c already maintains
  an *e_vi_insert flag in ed_getchar() that we can use. Since vi's
  refresh() already calls ed_flush(), we don't need to add that.
@McDutchie
Copy link

McDutchie commented Feb 24, 2021

So, I did many fruitless attempts to debug the completion code, only to find the bug may not in the completion code at all. It could be in these two lines in draw(), REFRESH mode. Or at least they are involved.

ksh/src/cmd/ksh93/edit/emacs.c

Lines 1556 to 1557 in caf7ab6

if(ep->ed->e_multiline && option == REFRESH)
ed_setcursor(ep->ed, ep->screen, ep->cursor-ep->screen, ep->ed->e_peol, -1);

The following diff seems to fix it, but I don't understand exactly why yet nor do have I fully tested for possible side effects…

edit: obsolete patch, do not apply

diff --git a/src/cmd/ksh93/edit/emacs.c b/src/cmd/ksh93/edit/emacs.c
index aa6fcd3..091e7fc 100644
--- a/src/cmd/ksh93/edit/emacs.c
+++ b/src/cmd/ksh93/edit/emacs.c
@@ -1554,7 +1559,7 @@ static void draw(register Emacs_t *ep,Draw_t option)
 #endif /* SHOPT_MULTIBYTE */
 	}
 	if(ep->ed->e_multiline && option == REFRESH)
-		ed_setcursor(ep->ed, ep->screen, ep->cursor-ep->screen, ep->ed->e_peol, -1);
+		ed_setcursor(ep->ed, ep->screen, ep->cursor-ep->screen, ep->ed->e_peol - 1, -1);
 
 	
 	/******************

This could mean that there is an off-by-one error in this ed_setcursor() call and that subtracting 1 from ep->ed->e_peol in the call is the correct fix, or it could mean that the off-by-one is in the value of ep->ed->e_peol to begin with, in which case I need to track down where that is set.

@McDutchie
Copy link

McDutchie commented Feb 24, 2021

ep->ed->e_peol is also often referred to as eol via a macro, or (in completion.c) *eol via a pointer. When I press ^X^D to show debug info (enabled now in non-release builds), then when the cursor is at the end of the line, eol and cur have the same value. That suggests that the ed_setcursor() call above is correct, and the bug is that an off-by-one value is stored in ep->ed->e_peol somewhere. Either that, or the bug is in the ed_setcursor() function itself.

@McDutchie
Copy link

McDutchie commented Feb 24, 2021

No off-by-one value appears to be stored in ep->ed->e_peol. When I cause the cursor to be one too far to the right and then press ^X^D, cur and eol show the same value. When I press ^A and then ^E as a workaround to correct the cursor display, cur and eol have the same value as before, even though the cursor display has been corrected. This suggests the bug must be somewhere in ed_setcursor() itself.

@McDutchie
Copy link

McDutchie commented Feb 24, 2021

The diff below fixes the bug reported in the top comment for me, and I haven't been able to discover any breakage introduced by it, which doesn't mean there isn't any. I'm still not quite sure what's going on or if this is the correct fix. Nonetheless, please test this and try to break it.

The clear flag in this code is set if ed_setcursor() is called with -1 as its last argument. The only other call that does this is in the vi.c refresh() function. So we also need to look for side effects of this diff in the vi editor.

edit: obsolete patch, do not apply

diff --git a/src/cmd/ksh93/edit/edit.c b/src/cmd/ksh93/edit/edit.c
index 3338bd3..01b6afd 100644
--- a/src/cmd/ksh93/edit/edit.c
+++ b/src/cmd/ksh93/edit/edit.c
@@ -1338,8 +1338,9 @@ int ed_setcursor(register Edit_t *ep,genchar *physical,register int old,register
 			delta = new-first;
 		}
 	}
-	while(delta-->0)
-		ed_putchar(ep,physical[old++]);
+	if(!clear)
+		while(delta-- > 0)
+			ed_putchar(ep,physical[old++]);
 	return(new);
 }
 #endif /* SHOPT_ESH || SHOPT_VSH */

@McDutchie
Copy link

Meanwhile here's another diff I'm more sure of. It should fix the buggy emacs literal-tab behaviour detailed here and here and make ksh act like bash, mksh, zsh, etc. – to insert a literal tab you have to use your lnext character (^V by default) or the backslash. It does not fix the issue reported on the old mailing list; that's a separate problem.

diff --git a/src/cmd/ksh93/edit/emacs.c b/src/cmd/ksh93/edit/emacs.c
index aa6fcd3..4b21379 100644
--- a/src/cmd/ksh93/edit/emacs.c
+++ b/src/cmd/ksh93/edit/emacs.c
@@ -371,6 +371,8 @@ int ed_emacsread(void *context, int fd,char *buff,int scend, int reedit)
 				}
 				ep->ed->e_tabcount = 0;
 			}
+			beep();
+			continue;
 		do_default_processing:
 		default:
 

@McDutchie
Copy link

McDutchie commented Feb 24, 2021

The fix for the mailing list issue is also a fairly simple one. ed_expand() has three modes, and for a change they've documented them in the comments:

/*
* file name generation for edit modes
* non-zero exit for error, <0 ring bell
* don't search back past beginning of the buffer
* mode is '*' for inline expansion,
* mode is '\' for filename completion
* mode is '=' cause files to be listed in select format
*/
int ed_expand(Edit_t *ep, char outbuff[],int *cur,int *eol,int mode, int count)
{

The menu is printed here, with the sh_menu() call:

if(mode=='=')
{
if (strip && !cmd_completion)
{
register char **ptrcom;
for(ptrcom=com;*ptrcom;ptrcom++)
/* trim directory prefix */
*ptrcom = path_basename(*ptrcom);
}
sfputc(sfstderr,'\n');
sh_menu(sfstderr,narg,com);
sfsync(sfstderr);
ep->e_nlist = narg;
ep->e_clist = com;
goto done;
}

So the fix is simply to check even in = mode if there are actually two or more arguments to choose from before printing a menu. If not, switch back to ordinary \ file name completion mode and skip the menu.

diff --git a/src/cmd/ksh93/edit/completion.c b/src/cmd/ksh93/edit/completion.c
index 1efc09b..a6086f9 100644
--- a/src/cmd/ksh93/edit/completion.c
+++ b/src/cmd/ksh93/edit/completion.c
@@ -343,6 +343,8 @@ int ed_expand(Edit_t *ep, char outbuff[],int *cur,int *eol,int mode, int count)
 		}
 		if(mode=='\\' && out[-1]=='/'  && narg>1)
 			mode = '=';
+		else if(mode=='=' && narg<2)
+			mode = '\\';  /* no filename menu if there is only one choice */
 		if(mode=='=')
 		{
 			if (strip && !cmd_completion)

@McDutchie
Copy link

McDutchie commented Feb 24, 2021

That leaves the fourth bug in this issue, the vi one. I've no clue where to look for the cause yet -- it may or may not even be in vi.c. Any ideas anyone? @JohnoKing @hyenias @lkujaw @hvdijk

@posguy99
Copy link
Author

posguy99 commented Feb 24, 2021

Oh, man, number 3 is so cool, it's like a pain you don't know how to get rid of until it's suddenly just... gone. What crazy person EVER thought the original behavior was a correct one?

Ahem.

Ok, number 1... yup, fixes it. behavior is correct now with multiline on or off... no extra space and no hanging characters.

For number 2, also fixes it, just get a terminal beep now with a TAB rather than a jump to the next tabstop and then a menu full of everything.

@McDutchie
Copy link

McDutchie commented Feb 24, 2021

The last fix for number 1 may accidentally have worked, but is not correct. It did not sit right with me that the same bug did not occur in vi mode even though it uses that same ed_setcursor() function. I believe I have now found the correct fix, and it's in emacs.c after all.

These two lines are the wrong way around. The cursor needs to be set for the internal screen buffer after increasing that pair of pointers, not before. Setting it before caused this ed_setcursor() call (which updates the actual screen cursor) to calculate a delta (position difference) of one, which is what caused the cursor to be positioned one position too far to the right.

Correct patch: edit: obsolete patch, do not apply

diff --git a/src/cmd/ksh93/edit/emacs.c b/src/cmd/ksh93/edit/emacs.c
index aa6fcd3..6d6502f 100644
--- a/src/cmd/ksh93/edit/emacs.c
+++ b/src/cmd/ksh93/edit/emacs.c
@@ -1540,8 +1542,8 @@ static void draw(register Emacs_t *ep,Draw_t option)
 			sptr++;
 			continue;
 		}
-		setcursor(ep,sptr-ep->screen,*nptr);
 		*sptr++ = *nptr++;
+		setcursor(ep,sptr-ep->screen,*nptr);
 #if SHOPT_MULTIBYTE
 		while(*nptr==MARKER)
 		{

@hvdijk
Copy link

hvdijk commented Feb 24, 2021

That leaves the fourth bug in this issue, the vi one. I've no clue where to look for the cause yet -- it may or may not even be in vi.c. Any ideas anyone? @JohnoKing @hyenias @lkujaw @hvdijk

What little I can quickly find without trying to understand the code: with the following debug patch:

--- a/src/cmd/ksh93/sh/lex.c
+++ b/src/cmd/ksh93/sh/lex.c
@@ -397,6 +397,7 @@ int sh_lex(Lex_t* lp)
                                fcseek(-LEN);
                                goto breakloop;
                        case S_EOF:
+                               write(2, "S_EOF\n", 6);
                                sp = fcfile();
                                if((n=lexfill(lp)) > 0)
                                {

After the Enter after tab completion, the S_EOF case is hit again, and the call to lexfill(lp) is what causes the prompt for more input (I think).

After the Enter without tab compltion, the S_EOF case is not hit again.

Hopefully this will speed up a deeper investigation by someone who actually understands the code.

@McDutchie
Copy link

Hopefully this will speed up a deeper investigation by someone who actually understands the code.

Which, as far as I can tell, is no one. I'm not convinced Korn understood his own code by 2009 or so. That's what you get for not commenting stuff properly. Thanks, though, this will probably help me get started on that one.

@McDutchie
Copy link

Here's another reproducer of the vi bug, suggesting the problem is a write past the end of the screen buffer (which also explains the S_EOF case being hit where it shouldn't be):

$ set -o vi
$ cd /
$ bin/p            [press TAB twice]
1) pax
2) ps
3) pwd             [now press '0' for start of line, then '$' for end of line]
$ bin/p            [cursor is positioned one too far to the right! should be on the 'p', not next to it]

And from there on, if you press a and then type wd(return) to complete the command, you get unpredictable symptoms because we're now past the screen buffer, accessing presumably uninitialised memory. I've gotten everything from

$ bin/p????????wd

to

$ bin/pd

to an outright crash.

@hvdijk
Copy link

hvdijk commented Feb 24, 2021

On the lex side, the reason S_EOF is hit is because the newline character is not part of its input buffer. At that point, _Fcin.fcptr is "bin/pwd\n" when the bad tab completion is not performed, or "bin/pwd" if it is. If "bin/pwd" has been read without a terminating newline, getting more input is the right thing to do.

@McDutchie
Copy link

McDutchie commented Feb 24, 2021

Yes. That looks like correct behaviour on the lexer side of things. The bug is most likely somewhere in edit/vi.c, possibly in this block or a function called by it.

@posguy99
Copy link
Author

The patch referenced in #71 (comment):

With this patch, ksh no longer deletes on-screen characters when you backspace, the cursor just moves back over them. They ARE being removed from the input buffer.

@McDutchie
Copy link

McDutchie commented Feb 25, 2021

The patch referenced in #71 (comment):

With this patch, ksh no longer deletes on-screen characters when you backspace, the cursor just moves back over them. They ARE being removed from the input buffer.

Thanks for catching that. I'd noticed that symptom but I mislead myself into thinking it was due to wonky terminal settings. :-/

So those two lines were not the wrong way around. I still think the correct fix is something in that general direction. Current hypothesis: we need an extra setcursor() call to sync the screen buffer cursor state again after updating sptr and nptr, but before calling ed_setcursor().

Undo the previous diff and try:
edit: obsolete patch, do not apply

diff --git a/src/cmd/ksh93/edit/emacs.c b/src/cmd/ksh93/edit/emacs.c
index aa6fcd3..7b7eab1 100644
--- a/src/cmd/ksh93/edit/emacs.c
+++ b/src/cmd/ksh93/edit/emacs.c
@@ -1552,6 +1555,7 @@ static void draw(register Emacs_t *ep,Draw_t option)
 			ep->cursor++;
 		}
 #endif /* SHOPT_MULTIBYTE */
+		setcursor(ep,sptr-ep->screen,*nptr);
 	}
 	if(ep->ed->e_multiline && option == REFRESH)
 		ed_setcursor(ep->ed, ep->screen, ep->cursor-ep->screen, ep->ed->e_peol, -1);

@McDutchie
Copy link

McDutchie commented Feb 25, 2021

I'm now confident about the two emacs mode fixes and the completion.c fix, so I'll push those and mark this issue resolved. If any breakage in those should be discovered, I'll reopen it. The remaining vi bug is proving elusive and warrants its own issue, so I'll open a new one for that.

@posguy99
Copy link
Author

Sorry, but... this is with patches 2&3 and the latest patch 1...

[12:10 PM][ttys000 +1][~/src/ksh-71][master@82d6272]
[1305] mbp13 $ bin/shtests pty                                     
#### Regression-testing /Users/mwilson/src/ksh-71/arch/darwin.i386-64/bin/ksh ####
test pty begins at 2021-02-25+12:11:27
	pty.sh[664]: emacs backslash escaping: line 673: expected "^:test-2: set -o emacs$", got ":test-2: set -o emacs em\bma\bac\bcs\bs"
	pty.sh[664]: emacs backslash escaping: line 677: expected "true \^C", got ""
test pty failed at 2021-02-25+12:11:40 with exit code 2 [ 28 tests 2 errors ]
test pty(C.UTF-8) begins at 2021-02-25+12:11:40
	pty.sh[664]: emacs backslash escaping: line 673: expected "^:test-2: set -o emacs$", got ":test-2: set -o emacs em\bma\bac\bcs\bs"
	pty.sh[664]: emacs backslash escaping: line 677: expected "true \^C", got ""
test pty(C.UTF-8) failed at 2021-02-25+12:11:54 with exit code 2 [ 28 tests 2 errors ]
test pty(shcomp) begins at 2021-02-25+12:11:54
	shcomp-pty.ksh[664]: emacs backslash escaping: line 673: expected "^:test-2: set -o emacs$", got ":test-2: set -o emacs em\bma\bac\bcs\bs"
	shcomp-pty.ksh[664]: emacs backslash escaping: line 677: expected "true \^C", got ""
test pty(shcomp) failed at 2021-02-25+12:12:08 with exit code 2 [ 28 tests 2 errors ]
Total errors: 6
CPU time       user:      system:
main:      0m00.011s    0m00.023s
tests:     0m00.738s    0m01.224s

@posguy99
Copy link
Author

For clariity, with only this patch:

diff --git a/src/cmd/ksh93/edit/emacs.c b/src/cmd/ksh93/edit/emacs.c
index aa6fcd3..7b7eab1 100644
--- a/src/cmd/ksh93/edit/emacs.c
+++ b/src/cmd/ksh93/edit/emacs.c
@@ -1552,6 +1555,7 @@ static void draw(register Emacs_t *ep,Draw_t option)
                        ep->cursor++;
                }
 #endif /* SHOPT_MULTIBYTE */
+               setcursor(ep,sptr-ep->screen,*nptr);
        }
        if(ep->ed->e_multiline && option == REFRESH)
                ed_setcursor(ep->ed, ep->screen, ep->cursor-ep->screen, ep->ed->e_peol, -1);

the test failures above occur.

@McDutchie McDutchie changed the title Tab completion menu causes inconsistent editor state in emacs and vi modes Tab completion menu causes inconsistent editor state in emacs Feb 25, 2021
@McDutchie
Copy link

McDutchie commented Feb 25, 2021

Yup. Thanks, again.

Try this instead... and I did run the tests this time and found no regressions. Sorry about that.
edit: obsolete patch, do not apply

diff --git a/src/cmd/ksh93/edit/emacs.c b/src/cmd/ksh93/edit/emacs.c
index aa6fcd3..9922ff5 100644
--- a/src/cmd/ksh93/edit/emacs.c
+++ b/src/cmd/ksh93/edit/emacs.c
@@ -1554,7 +1556,10 @@ static void draw(register Emacs_t *ep,Draw_t option)
 #endif /* SHOPT_MULTIBYTE */
 	}
 	if(ep->ed->e_multiline && option == REFRESH)
+	{
+		setcursor(ep, sptr - ep->screen, *nptr);  /* need extra sync to avoid off-by-one cursor display */
 		ed_setcursor(ep->ed, ep->screen, ep->cursor-ep->screen, ep->ed->e_peol, -1);
+	}
 
 	
 	/******************

@McDutchie
Copy link

McDutchie commented Feb 25, 2021

Nope. That one gives a strange side effect when you do a completion at the bottom line of the terminal -- your terminal scrolls up and you get a number of empty lines at the bottom.

I'm about ready to pull my hair out here. None of the fixes that seem correct actually work. This one seems to work without any side effects, but it just seems wrong. If there is a bug in that function, it should also occur in vi mode and it does not. Obviously, I have no idea what I'm doing, and have no business maintaining this project.

@hyenias
Copy link

hyenias commented Feb 25, 2021

@McDutchie: Just hang in there. I, for one, know that you are doing a good job with this new ksh93 endeavor. Please take a break and get some rest. Perhaps take a day off and relax. Because, there is other fruit to pick from the issues tree.

@McDutchie
Copy link

McDutchie commented Feb 26, 2021

Thanks for the encouragement. :)

OK, since I can't sleep, I'm back to my original hypothesis that there is somehow an off-by-one error related to the ed_setcursor() call that gets executed when in multiline mode. I cannot confirm whether that off-by-one error is actually in the call itself, or occurs sometime earlier on one of the many possible occasions where ep->cursor is changed. But everything else appears to work correctly, so it's not unlikely that the problem is in the call itself.

For reference, this is the original version of that call in emacs.c:

ksh/src/cmd/ksh93/edit/emacs.c

Lines 1556 to 1557 in df2b9bf

if(ep->ed->e_multiline && option == REFRESH)
ed_setcursor(ep->ed, ep->screen, ep->cursor-ep->screen, ep->ed->e_peol, -1);

There is a corresponding call in the vi.c refresh() function (which does the same thing as draw() in emacs.c), where the third (old) and fourth (new) arguments are actually identical:

ksh/src/cmd/ksh93/edit/vi.c

Lines 2086 to 2087 in df2b9bf

if(vp->ed->e_multiline && vp->ofirst_wind==INVALID)
ed_setcursor(vp->ed, physical, last_phys+1, last_phys+1, -1);

The expectation for this particular call is in fact that they should be identical, so that a delta of zero is calculated in that function. Delta not being zero is what causes the cursor to be positioned wrong.

In vi.c, last_phys is a macro that is defined as editb.e_peol, and editb is a macro that is defined as (*vp->ed). Which means last_phys means vp->ed->e_peol, which is the same as ep->ed->e_peol in emacs.c. (These editors were originally separate programs by different authors, and I suppose this is how it shows. Korn didn't want to change all the variable names to integrate them, so made macros instead.)

That leaves the question of why vi.c adds 1 to both last_phys a.k.a. e_peol arguments, and emacs.c uses e_peol for new without adding anything. Analysing the ed_setcursor() code could answer that question.

So, this patch makes emacs.c do it the same way vi.c does. Let's make the third argument identical to the fourth. My brief testing shows the bug is fixed, and the regression tests yield no failures. This fix is also the most specific change possible, so there are few opportunities for side effects (I hope). Please test…

diff --git a/src/cmd/ksh93/edit/emacs.c b/src/cmd/ksh93/edit/emacs.c
index aa6fcd3..3f56625 100644
--- a/src/cmd/ksh93/edit/emacs.c
+++ b/src/cmd/ksh93/edit/emacs.c
@@ -1554,7 +1556,7 @@ static void draw(register Emacs_t *ep,Draw_t option)
 #endif /* SHOPT_MULTIBYTE */
 	}
 	if(ep->ed->e_multiline && option == REFRESH)
-		ed_setcursor(ep->ed, ep->screen, ep->cursor-ep->screen, ep->ed->e_peol, -1);
+		ed_setcursor(ep->ed, ep->screen, ep->ed->e_peol, ep->ed->e_peol, -1);
 
 	
 	/******************

@lkujaw
Copy link

lkujaw commented Feb 26, 2021

Not sure if this helps, but I noticed that if you comment out the following in completion.c (normally responsible for adding a space after a non-directory is printed), the problem also goes away.

			if(cp[strlen(cp)-1]!='/')
			{
				if(var=='$' && begin[-1]=='{')
					*out = '}';
				else {
					*out = ' ';
                                }
				out++;
			}
			else if(out[-1] =='"' || out[-1]=='\'')
				out--;
			*out = 0;

@McDutchie
Copy link

A little hint: if you select multiple lines on a blob page like https://github.com/ksh93/ksh/blob/master/src/cmd/ksh93/edit/completion.c, then choose Copy permalink from the three-dots menu and paste that link in your comment, it's automagically replaced by a syntax-coloured source code extract like this:

if(cp[strlen(cp)-1]!='/')
{
if(var=='$' && begin[-1]=='{')
*out = '}';
else
*out = ' ';
out++;
}
else if(out[-1] =='"' || out[-1]=='\'')
out--;
*out = 0;

Anyway, that's an interesting datapoint. Commenting that out breaks things in an informative way.

@lkujaw
Copy link

lkujaw commented Feb 26, 2021

Thanks for the tip. It seems like that's the only time the two values diverge (ep->cursor-ep->screen being one less) in my very limited testing.

@lkujaw
Copy link

lkujaw commented Feb 26, 2021

Based on

ksh/src/cmd/ksh93/edit/edit.c

Lines 1246 to 1260 in df2b9bf

int ed_setcursor(register Edit_t *ep,genchar *physical,register int old,register int new,int first)
{
static int oldline;
register int delta;
int clear = 0;
Edpos_t newpos;
delta = new - old;
if(first < 0)
{
first = 0;
clear = 1;
}
if( delta == 0 && !clear)
return(new);

It seems like setting old and new to the same value would normally be a noop (if it wasn't set to clear.) Otherwise, the following segment gets run:

			ed_nputchar(ep,clear,' ');
			ed_nputchar(ep,clear,'\b');
			return(new);

@lkujaw
Copy link

lkujaw commented Feb 26, 2021

I did a bit of research on this, and I think the fix to have the Emacs editing mode do the same as Vi is correct.

From RELEASE:
08-05-01 In multiline edit mode, the refresh operation will now clear
the remaining portion of the last line.

Here's a fragment from the completion.c of the venerable but dated CDE DtKsh:

                else
                        while (*com)
                        {
                                *out++  = ' ';
                                out = strcopy(out,*com++);
                        }
                *cur = (out-outbuff);
                /* restore rest of buffer */
                out = strcopy(out,stakptr(0));
                *eol = (out-outbuff);

Noticeably missing is the code to add a space after file name completions. So, it seems plausible that if multiline editing mode was added beforehand,the ep->ed->p_eol != ep->cursor-ep->screen case might never have occurred during testing.

Setting the 'first' parameter to -1 seems to be a pretty explicit indicator that the author(s) intended the line clearing code to run, hence the entry in RELASE.

The real issue is that if we update the cursor by calling ed_setcursor on line 1554 with old != new, the later call to setcursor on line 1583, here:

	I = (ncursor-nscreen) - ep->offset;
	setcursor(ep,i,0);

will use outdated screen information to call setcursor, which, coincidentally, calls ed_setcursor.

Well, that was quite the adventure... Could we add issue IDs/bug numbers to the regression tests, perhaps? It seems like useful information to catalog the defect delta from the last AST release, considering the effort going into this project.

@McDutchie
Copy link

McDutchie commented Feb 26, 2021

Many thanks Lev. That does properly make sense of this bug.

I'm not sure yet how to go about a regression test for this one. pty is good at verifying that the expected output is written to the terminal, but I don't think it's capable of checking what position the cursor is at. Maybe it can test a backspace operation and detect the resulting display corruption, though.

For seeing what was fixed by what commit, git blame is your friend, and Github's interface to that is even nicer. In README.md, I've used that for the NEWS file to link significant user-visible fixes with their respective commits. But that doesn't have everything in it. For the complete catalogue of fixes, see the commit log. As you're probably aware, I have a policy of documenting the what, how and why of every significant change there.

@McDutchie
Copy link

I can't seem to get a regression test working for this bug, and I'm ready to mark this issue resolved. If anyone feels there should be a regression test, please feel free to have a shot at it. I'd be happy to accept a pull request for a working one. The existing ones are in tests/pty.sh. Use arch/*/bin/pty --man to get documentation for the one-letter pty dialogue commands.

posguy99 added a commit to posguy99/dotfiles that referenced this issue Feb 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is not working
Projects
None yet
Development

No branches or pull requests

6 participants