Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

read1 in cbreak mode returns 0x0D (instead of 0x0A) when <enter> is entered. #114328

Closed
johannesnoordanus opened this issue Jan 19, 2024 · 18 comments
Assignees
Labels
3.12 bugs and security fixes type-bug An unexpected behavior, bug, or error

Comments

@johannesnoordanus
Copy link

johannesnoordanus commented Jan 19, 2024

Bug report

Bug description:

A read1 of 'stdin' in cbreak mode returns 0x0D (instead of 0x0A) when <enter> is entered.
That is for Python version 3.12.1, version 3.11.6 returns 0x0A.

The following code demonstrates this. When run enter <enter>.

"""
cbreak <enter> test
"""

import sys
import tty
import select
import termios
from time import sleep

# save stdin attributes
prevStdinAttributes = termios.tcgetattr(sys.stdin)

# set cbreak mode:
# "Enter cbreak mode. In cbreak mode (sometimes called “rare” mode) normal tty line buffering
#  is turned off and characters are available to be read one by one. However, unlike raw mode,
#  special characters (interrupt, quit, suspend, and flow control) retain their effects on the
#  tty driver and calling program. Calling first raw() then cbreak() leaves the terminal in cbreak mode."
tty.setcbreak(sys.stdin, when = termios.TCSANOW)

c = b''
# wait for a byte
while True:
    if select.select([sys.stdin], [], [], 0) == ([sys.stdin], [], []):
        c = sys.stdin.buffer.read1(1)
        break
    sleep(.1)

# restore stdin attributes
termios.tcsetattr(sys.stdin, termios.TCSADRAIN, prevStdinAttributes)

# print result byte in hex
print(f"bytes: '\\x{c.hex().upper()}'")

CPython versions tested on:

3.11, 3.12

Operating systems tested on:

Linux, macOS

Linked PRs

@ronaldoussoren
Copy link
Contributor

Looks like this was introduced in 486bc8e, based on issue #85984, PR #101832. This user-visible change of behaviour is not mentioned anywhere though.

@johannesnoordanus
Copy link
Author

johannesnoordanus commented Jan 19, 2024

Looks like this was introduced in 486bc8e, based on issue #85984, PR #101832. This user-visible change of behaviour is not mentioned anywhere though.

Linux and MacOs require/define <enter> (LF) mapped to 0x0A. This cannot be correct in my opinion (user-visible or not). See https://en.wikipedia.org/wiki/Newline (Unicode is similar.)

@ronaldoussoren
Copy link
Contributor

ronaldoussoren commented Jan 19, 2024

Looks like this was introduced in 486bc8e, based on issue #85984, PR #101832. This user-visible change of behaviour is not mentioned anywhere though.

Linux and MacOs require/define (LF) mapped to 0x0A. This cannot be correct in my opinion (user-visible or not). See https://en.wikipedia.org/wiki/Newline (Unicode is similar.)

The change feels accidental given that it isn't mentioned anywhere (including in the discussion on the PR and issue)

cc @gpshead (who merged #101832)

@gpshead gpshead added the 3.12 bugs and security fixes label Jan 19, 2024
@gpshead
Copy link
Member

gpshead commented Jan 19, 2024

I'm assuming it's caused by these opening lines in the new tty.cfmakecbreak function:

def cfmakecbreak(mode):
    """Make termios mode cbreak."""
    # Do not map CR to NL on input.
    mode[IFLAG] &= ~(ICRNL)
    ...

@8vasu - any thoughts?

@gpshead
Copy link
Member

gpshead commented Jan 19, 2024

To restore the behavior and retain compatibility we could presumably just remove that ICRNL clearing. It feels worth looking things up to reference where in what standard (posix?) this intent is defined one way or another in a comment.

presumably: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap11.html

@gpshead
Copy link
Member

gpshead commented Jan 19, 2024

The posix doc is a bit hard to reason about w.r.t. this as it does not mention mode names as used in stty... But is what underlies all these things.

The Linux stty man page doesn't explicitly mention icrnl in cbreak AKA -icanon mode... suggesting it should just leave the bit alone.

There appear to be other opinions on this, AIX's "understanding terminals" docs explicitly say cbreak has ICRNL cleared.

Yet the very name cbreak implies to me in this context that the c stands for Canonical in which one could reasonably expect ICRNL to be set to make the read data... canonical. Regardless of keyboard or terminal type. (Enter aka Return keys typically producing a CR aka \r aka 0x0d - which some terminals remap to a NL aka \n aka 0x0a)

So I agree we should just remove the explicit clearing of ICRNL as a bugfix for this 3.12 code (and note this in our docs). I believe that means we just inherit the current setting for that flag, which is what our previous behavior had long been.

@gpshead
Copy link
Member

gpshead commented Jan 19, 2024

The FreeBSD and macOS stty man pages appear to agree with my reading of "icrnl shouldn't be touched for cbreak".

@gpshead gpshead self-assigned this Jan 19, 2024
gpshead added a commit to gpshead/cpython that referenced this issue Jan 19, 2024
CR -> NL mapping should be inherited in cbreak mode as OSes do not
specify altering it as part of their `stty` cbreak mode definition.
@johannesnoordanus
Copy link
Author

Question: this issue should be covered by a specific test I think?

@8vasu
Copy link
Contributor

8vasu commented Jan 20, 2024

@gpshead

To my knowledge, the cbreak/rare mode is defined neither in the POSIX manual for stty (https://pubs.opengroup.org/onlinepubs/9699919799/) nor in the POSIX General Terminal Interface.

I think the authoritative source in this case is X/Open Curses, Issue 7 (https://pubs.opengroup.org/onlinepubs/9699909599/toc.pdf). A quick search though the pdf reveals the following definition of cbreak mode, which posits clearing ICRNL:

Characters typed by the user are immediately available to the
application and Curses does not perform special processing on
either the erase character or the kill character. An application can
select cbreak mode to do its own line editing but to let the abort
character be used to abort the task. This mode achieves the same
effect as non-canonical-mode, Case B input processing (with
MIN set to 1 and ICRNL cleared) as specified in the XBD
specification. The state of the ISIG and IXON flags are not
changed upon entering this mode.

For an example implementation, please refer to the source of NCURSES_SP_NAME(cbreak) in ncurses, which clears ICRNL: https://github.com/mirror/ncurses/blob/87c2c84cbd2332d6d94b12a1dcaf12ad1a51a938/ncurses/tinfo/lib_raw.c#L177

In my opinion, the functions in tty.py before #101832 were making arbitrary changes to the termios, but now they comply with standards, which might unfortunately break earlier behavior of Python. I tried to keep as much of earlier behavior intact as possible while complying to the standard; for example, before #101832, tty.setcbreak() was clearing ECHO, which I have retained (again, an arbitrary change not specified in the standard; the linked ncurses implementation is not doing this).

@johannesnoordanus
Copy link
Author

johannesnoordanus commented Jan 20, 2024

@8vasu

Following the standards is of course the way to go, but cbreak seems not to be fully defined.
From https://linux.die.net/man/1/stty (for example) I get that:

cbreak
    same as -icanon
-icanon
    disable erase, kill, werase, and rprnt special characters
sane
    same as cread -ignbrk brkint -inlcr -igncr ICRNL -iutf8 -ixoff -iuclc -ixany imaxbel opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0 isig icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop -echoprt echoctl echoke, all special characters to their default values```
and non of <i>disable erase, kill, werase, and rprnt special characters</i> seem to indicate a that ICRNL should be cleared.

Also from the macOS stty manpage I get:

     sane        Resets all modes to reasonable values for interactive terminal use.

     cbreak      If set, enables brkint, ixon, imaxbel, opost, isig, iexten,
                 and -icanon.  If unset, same as sane.

If I combine the two, I get that ICRNL must be set when cbreak is cleared (but not specifically cleared when cbreak is set)
My two cents.

@8vasu
Copy link
Contributor

8vasu commented Jan 20, 2024

It is not going to be fully defined in the standard. Only what it must do will be specified; rest can be implementation-specific as long as it is consistent with what it must do.

Expect severe fragmentation and platform-specific chaos if you deviate from the standard (which is what the stty(1) authors of the various operating systems did unfortunately, and here we are wasting our time trying to patch these manpages together to derive a possibly non-existent sane middle ground).

Since in this case the original behavior of tty.setcbreak() is desired, and since it was a very short function definition, maybe users can define their private functions for this?

Also, maybe we should include a link to the relevant section of X/Open Curses, Issue 7 or The Single UNIX Specification, Version 4 in tty.py adjoining the line that unsets ICRNL so that future contributors are aware of the origin of that? Again, ultimately it is up to the maintainers :)

@johannesnoordanus
Copy link
Author

johannesnoordanus commented Jan 20, 2024

That makes sense. I agree especially about the source comment.
I think this info can also be added to the python documentation of these function(s)?
Because I - and I assume others - didn’t expect this quirky behavior at all.

@gpshead
Copy link
Member

gpshead commented Jan 20, 2024

thanks for the additional research. Agreed with "keep compatibility with prior pythons" in terms of setcbreak behavior. tty.py logic on these hasn't changed in 30 years and there is no trail to indicate why the current values were chosen - they no doubt worked for whatever Steen Lumholt was trying to do at the time on a very old OS. They appear to match what Linux and macOS stty cbreak actually do.

I've updated my draft PR with better documentation to along with the behavior restoration.


Testing actual behaviors of the stty command on Linux and macOS: stty cbreak matches the documentation: it does not touch ICRNL at all on its own. If I stty -icrnl before a stty cbreak, it remains disabled. if I re-enable it, it remains enabled.

You can take the example program up top and change the tty.setcbreak call to os.system("stty cbreak") to see this in action. (and do additional test runs that explicitly set flags to verify that running stty is actually working on your stdin such as os.system("stty -icrnl -echo -icanon"))

Reading some of the linked docs, the "c" in "cbreak" might stand for "Cooked" rather than "Canonical" as I had earlier assumed? Regardless of what the presumed abbreviation might have meant at some ancient point in time, it doesn't matter. We've found the behavior users expect.

@gpshead
Copy link
Member

gpshead commented Jan 20, 2024

I suppose a question about setcraw having also changed to do more could be raised. In that case I don't expect the changes cause problems. The point of raw mode is basically disabling most behaviors. That the implementation after 486bc8e appears more pedantic about doing that seems unlikely to be an observable problem unlike the behavior change in setcbreak.

@8vasu
Copy link
Contributor

8vasu commented Jan 21, 2024

I sent a message to GNU coreutils and to FreeBSD about stty. I think as usual macos is using the FreeBSD stty.

I suspect that since the standard linked above pertains to Curses and not to stty, no implementation of stty needs to follow it. However, it will still be interesting to see their reply.

@8vasu
Copy link
Contributor

8vasu commented Jan 21, 2024

Here is the conversation with GNU Coreutils: https://lists.gnu.org/archive/html/coreutils/2024-01/msg00016.html

Therefore, the rigorous conclusion is this: we should do exactly as @gpshead suggested. At the same time, I will try to add a cursescbreak option to stty for those that prefer it.

@ronaldoussoren
Copy link
Contributor

thanks for the additional research. Agreed with "keep compatibility with prior pythons" in terms of setcbreak behavior. tty.py logic on these hasn't changed in 30 years and there is no trail to indicate why the current values were chosen - they no doubt worked for whatever Steen Lumholt was trying to do at the time on a very old OS. They appear to match what Linux and macOS stty cbreak actually do.

I've updated my draft PR with better documentation to along with the behavior restoration.

That PR looks good.

I agree that we should restore the previous behaviour. That's what Python has done for a long time and there is no strong reason to change the behaviour.

gpshead added a commit that referenced this issue Jan 21, 2024
The terminal CR -> NL mapping setting should be inherited in cbreak mode as OSes do not specify altering it as part of their stty cbreak mode definition.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jan 21, 2024
)

The terminal CR -> NL mapping setting should be inherited in cbreak mode as OSes do not specify altering it as part of their stty cbreak mode definition.
(cherry picked from commit fd49e22)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
@gpshead
Copy link
Member

gpshead commented Jan 21, 2024

Thanks for the good report, reproducer, and discussions! Fixed in main, the 3.12 backport is set to automerge.

@gpshead gpshead closed this as completed Jan 21, 2024
gpshead added a commit that referenced this issue Jan 21, 2024
…114410)

The terminal CR -> NL mapping setting should be inherited in cbreak mode as OSes do not specify altering it as part of their stty cbreak mode definition.
(cherry picked from commit fd49e22)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
kulikjak pushed a commit to kulikjak/cpython that referenced this issue Jan 22, 2024
The terminal CR -> NL mapping setting should be inherited in cbreak mode as OSes do not specify altering it as part of their stty cbreak mode definition.
aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024
The terminal CR -> NL mapping setting should be inherited in cbreak mode as OSes do not specify altering it as part of their stty cbreak mode definition.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

4 participants