Skip to content
This repository has been archived by the owner on Apr 19, 2024. It is now read-only.

Intermittent False EOF or panic when run under screen #408

Closed
mgabeler-lee-6rs opened this issue Feb 16, 2022 · 1 comment · Fixed by #409
Closed

Intermittent False EOF or panic when run under screen #408

mgabeler-lee-6rs opened this issue Feb 16, 2022 · 1 comment · Fixed by #409
Labels

Comments

@mgabeler-lee-6rs
Copy link
Contributor

What operating system and terminal are you using?

  • Linux, Debian bullseye
  • screen --version: Screen version 4.08.00 (GNU) 05-Feb-20
  • gnome-terminal --version: # GNOME Terminal 3.38.3 using VTE 0.62.3 +BIDI +GNUTLS +ICU +SYSTEMD

An example that showcases the bug.
https://github.com/mgabeler-lee-6rs/survey-screen-eof-demo

What did you expect to see?

  • Doesn't report EOF when there isn't one
  • Doesn't panic
  • Doesn't cause my terminal to be filled with garbage after exiting

What did you see instead?

This is a bit on and off for whether it reproduces, but given a screen session that is "doing it", it becomes 100% reproducible in producing one of the failure modes, though which failure happens seems to be a bit variable run to run.

The demo app linked above is just a trivial call to AskOne with a Confirm, but it seems that pretty much any Ask will do it. This spun out of problems from multiple users with the gh CLI: cli/cli#4739

Failure mode 1: panic

This looks superficially similar to #394, but I'm unsure if it is really the same.

? Does it work? (y/N) panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4ec8ae]

goroutine 1 [running]:
github.com/AlecAivazis/survey/v2/terminal.Coord.CursorIsAtLineEnd(...)
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/terminal/cursor.go:157
github.com/AlecAivazis/survey/v2/terminal.(*RuneReader).ReadLineWithDefault.func2(...)
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/terminal/runereader.go:65
github.com/AlecAivazis/survey/v2/terminal.(*RuneReader).ReadLineWithDefault(0xc0001b26c0, 0xe010, {0x691658, 0x0, 0x0}, {0x0, 0x40caf4, 0x52bd80})
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/terminal/runereader.go:321 +0x58e
github.com/AlecAivazis/survey/v2/terminal.(*RuneReader).ReadLine(...)
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/terminal/runereader.go:37
github.com/AlecAivazis/survey/v2.(*Confirm).getBool(0xc000098000, 0x0, 0xc00009a048)
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/confirm.go:57 +0x186
github.com/AlecAivazis/survey/v2.(*Confirm).Prompt(0xc000098000, 0xc00009a048)
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/confirm.go:136 +0x105
github.com/AlecAivazis/survey/v2.Ask({0xc0000bbe00, 0x1, 0x1}, {0x526380, 0xc000096000}, {0x0, 0x0, 0x0})
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/survey.go:308 +0x214
github.com/AlecAivazis/survey/v2.AskOne({0x581798, 0xc000098000}, {0x526380, 0xc000096000}, {0x0, 0x0, 0x4125b1})
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/survey.go:251 +0x94
main.main()
        /home/mgl/src/survey-screen-eof-demo/main.go:28 +0x1a5

Failure mode 2: TTY garbage:
This is a much more common failure mode

? Does it work? (y/N) failed: EOF
^[[67;274R^[[5;23Rmgl@nigripes:~/src/survey-screen-eof-demo$ 7;274R23R

Note that the "gobbledygook" is inserted into my terminal buffer.

Attempted Analysis:

Based on the "TTY garbage" mode, and from attaching a debugger to the demo app, it appears that one or more of the terminal control commands being sent is producing unexpected responses that are treated as user input. Running it under strace -r -v -f unfortunately seems to heisenbug it and produce a third failure mode where it gets the false EOF, but no TTY garbage. In this case, the only read that shows in strace is like so:

915027      0.000037 read(0, "", 4096)  = 0

Which ... does look like an EOF. FWIW, the preceding TCSETS was:

919540      0.000018 ioctl(0, SNDCTL_TMR_START or TCSETS, {c_iflags=0x506, c_oflags=0x1805, c_cflags=0x8bd, c_lflags=0x8a30, c_line=0, c_cc[VMIN]=0, c_cc[VTIME]=0, c_cc="\x03\x1c\x7f\x15\x04\x00\x00\x00\x11\x13\x1a\x00\x12\x0f\x17\x16\x00\x00\x00"}) = 0

(full strace: trace.txt)

Going back to stepping through the debugger, it looks like the "TTY garbage" is coming from the terminal responding to the DSR escape sequence sent during cursor.Cursor.Location(), and I think it may be happening because the second call in this sequence, in runereader.go, is returning an error that is ignored:

	// we get the terminal width and height (if resized after this point the property might become invalid)
	terminalSize, _ := cursor.Size(rr.Buffer())
	// we set the current location of the cursor once
	cursorCurrent, _ := cursor.Location(rr.Buffer())

However, for whatever reason, with the debugger attached, it seems to be usually failing on the first one instead, which produces the "panic" failure mode (I think) if it ends up seeing the actual DSR response as user input, or as the EOF if that somehow doesn't happen in time to prevent it getting an EOF error when it gets here in runereader.go:

		// wait for some input
		r, _, err := rr.ReadRune()
		if err != nil {
			return line, err
		}

Trying to trace down where some of the EOFs are coming from, I see the EOF seeming to come from this bit in stdlib's bufio.go:

	// Read new data: try a limited number of times.
	for i := maxConsecutiveEmptyReads; i > 0; i-- {
		n, err := b.rd.Read(b.buf[b.w:])
		if n < 0 {
			panic(errNegativeRead)
		}
		b.w += n
		if err != nil {
			b.err = err
			return
		}

This brought me back to the TCSETS call, with its VMIN and VTIME values of 0. It definitely doesn't seem like the code is expecting these settings, and is instead expecting the terminal i/o to still be in blocking i/o mode, even when having cleared canonical mode (ref: https://man7.org/linux/man-pages/man3/tcflow.3.html#:~:text=MIN%20%3D%3D%200%2C%20TIME%20%3D%3D%200%20(polling%20read)).

On a working terminal, I get the TCGETS/TCSETS calls having (among the other flags), c_cc[VMIN]=1, c_cc[VTIME]=0.

Which finally tells me how I can submit a PR to fix this :D I don't know why my screen sessions sometimes have VMIN=0, but since this application is configuring the terminal, it should configure it to the way it needs instead of assuming someone else will have :)

@mislav
Copy link
Collaborator

mislav commented Feb 16, 2022

Wow, awesome job tracking this down, and thank you for the fix! 🎉

abayer added a commit to abayer/cli that referenced this issue Jun 28, 2022
I was having trouble debugging some tests I was working on refactoring, and eventually realized this was due to AlecAivazis/survey#408, which was fixed in survey v2.3.3.

This also ends up upgrading hinshun/vt10x, getting rid of the `NewVT10xTerminal` function we'd been calling in a few places, so I band-aided that by adding an equivalent function in `test/helper/helpers.go`.

Signed-off-by: Andrew Bayer <andrew.bayer@gmail.com>
abayer added a commit to abayer/cli that referenced this issue Jun 28, 2022
I was having trouble debugging some tests I was working on refactoring, and eventually realized this was due to AlecAivazis/survey#408, which was fixed in survey v2.3.3.

This also ends up upgrading hinshun/vt10x, getting rid of the `NewVT10xTerminal` function we'd been calling in a few places, so I band-aided that by adding an equivalent function in `test/helper/helpers.go`.

Signed-off-by: Andrew Bayer <andrew.bayer@gmail.com>
tekton-robot pushed a commit to tektoncd/cli that referenced this issue Jun 29, 2022
I was having trouble debugging some tests I was working on refactoring, and eventually realized this was due to AlecAivazis/survey#408, which was fixed in survey v2.3.3.

This also ends up upgrading hinshun/vt10x, getting rid of the `NewVT10xTerminal` function we'd been calling in a few places, so I band-aided that by adding an equivalent function in `test/helper/helpers.go`.

Signed-off-by: Andrew Bayer <andrew.bayer@gmail.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants