Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-37640: Fix telnetlib crash in Python3 while receiving un-printable characters from server #22814

Closed
wants to merge 11 commits into from
Closed

bpo-37640: Fix telnetlib crash in Python3 while receiving un-printable characters from server #22814

wants to merge 11 commits into from

Conversation

Sh3idan
Copy link
Contributor

@Sh3idan Sh3idan commented Oct 20, 2020

Hi,
This pull request is based on the bpo-37640, this closed pull request #14877 and the pseudocode of sys.displayhook.

This fix doesn't break the current telnetlib tests:

python -m unittest test_telnetlib.py
...................
----------------------------------------------------------------------
Ran 19 tests in 0.035s

OK

If you have any suggestions for adding tests, let me know.

As requested in #14877, I added an entry in Misc/NEWS.

This fix should be backported by creating a pull request for each version above or equal to 3.6. Are you agree?

https://bugs.python.org/issue37640

Copy link
Member

@Fidget-Spinner Fidget-Spinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! Referring to the original pull request, you may need to write new tests to ensure the error doesn't happen again (probably test against an integer value not found in the utf-8/utf-16 range that would cause the old code to error but not the new one). To find out how to run/write the tests, you can check out https://devguide.python.org/runtests/. I'm guessing the tests should go into Lib/test/test_telnetlib.py.

Edit: I also have one worry - dumping the bytes to the terminal seems like it might have security problems, you might need to convert it to a string representation first.

Lib/telnetlib.py Outdated Show resolved Hide resolved
Lib/telnetlib.py Outdated Show resolved Hide resolved
@Sh3idan
Copy link
Contributor Author

Sh3idan commented Oct 20, 2020

Thank you for this first review!

I also have one worry - dumping the bytes to the terminal seems like it might have security problems, you might need to convert it to a string representation first.

Can you tell me more about this security problems? (any documentation is also appreciated).

Based on your comments, the available data are written to stdout as string by decoding these data as sys.stdout.encoding. That allow to use the variable environment PYTHONIOENCODING if needed.
Finally, as expected, I added two tests for this interact method to ensure that we don't have any regression from an ascii and utf-8 strings.

Copy link
Member

@Fidget-Spinner Fidget-Spinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, the new code doesn't seem to have that problem anymore, I was just worried that sys.stdout.buffer.write(text) # text is bytes object would allow writing bytes directly to the terminal. A quick search leads to this useful post on why that isn't that great https://superuser.com/questions/1520750/can-binary-strings-in-a-terminal-do-anything-dangerous.

I think the current fixes might not fix the scenario give in the bug report. The tests also don't seem to be testing for that case too. The main issue lies with text.decode(), rather than telnetlib per se. From cursory searching, /dev/urandom can produce bytes that either

  1. do not contain correct start bytes for unicode or
  2. are out of range for ascii

For example here's some code that also raises a similar UnicodeDecodeError:

import struct
import ctypes

# create a 16 byte buffer
data = ctypes.create_string_buffer(16)
struct.pack_into(">l", data, 1, 0xFFFFFFF)

# will throw an error
bytes(data).decode('ascii')

# will throw an error
bytes(data).decode('utf-8')

# will probably throw an error too
bytes(data).decode(sys.stdout.encoding)

The problem is, I don't know what's the right way forward for this - technically writing the bytes directly to sys.stdout.buffer would fix the problems. However, I don't know if the responsibility lies on Python to prevent the security issues mentioned above, or that it's on the user to be more cautious.

Edit: I think a triage member or core dev should be added to the issue's nosy list, however I have no clue who to ping since https://devguide.python.org/experts/ 's telnetlib entry is empty.

@Sh3idan
Copy link
Contributor Author

Sh3idan commented Oct 21, 2020

A quick search leads to this useful post on why that isn't that great https://superuser.com/questions/1520750/can-binary-strings-in-a-terminal-do-anything-dangerous.

Thanks for the link about binary strings in terminal.

I don't know if the responsibility lies on Python to prevent the security issues mentioned above, or that it's on the user to be more cautious.

I think it's up to the user to be careful or the designer of the terminals to add security. Terminal supports ANSI/VT escape sequences that's why it can lead to a security problem. Anyway, it is already possible from file:

$ cat -v script.sh#!/bin/sh

echo "evil!"
exit 0
^[[2Aecho "Hello World!"

Then, let's print the content in python:

with open('script.sh') as fd:
    print(fd.read())

The result will be:

#!/bin/sh

echo "Hello World!"
exit 0

Edit: I think a triage member or core dev should be added to the issue's nosy list, however I have no clue who to ping since https://devguide.python.org/experts/ 's telnetlib entry is empty.

As it is a unicode issue, you could ping one of unicodedata expert.

By the way, in binary mode and according to the RFC856, Section 5. Description:

the receiver should interpret characters received from the transmitter which are not preceded with IAC as 8 bit binary data, with the exception of IAC followed by IAC which stands for the 8 bit binary data with the decimal value 255.

With this naive fix, some characters are not interpreted. This is my draft test:

    @unittest.mock.patch('telnetlib.sys.stdin', new_callable=io.StringIO)
    def test_interact_bin(self, stdin):
        # [bpo-37640](https://bugs.python.org/issue37640)
        encoding = 'utf-8'
        want = [bytes(range(256))]
        f = io.TextIOWrapper(io.BytesIO(), encoding)
        telnet = test_telnet(want)
        with contextlib.redirect_stdout(f):
            telnet.interact()
        diff = set(want[0]).difference(f.buffer.getvalue())
        self.assertEqual(
            f.buffer.getvalue(),
            want[0],
            msg=str(diff)
        )

These bytes are not supported:

{0, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255}

We should implement the binary transmission at another time.

@Fidget-Spinner
Copy link
Member

I think you accidentally removed the sys.displayhook lookalike code (the try...except UnicodeDecodeError blocks are gone). I think the original PR without the isinstance(sys.stdout.buffer) checks would've worked fine for our scenario. Sorry for any inconvenience caused.

@Sh3idan
Copy link
Contributor Author

Sh3idan commented Oct 21, 2020

I added the previous code without the isinstance condition and commited your suggestion! If the windows tests doesn't pass, I'll look at it in few hours.

Sorry for any inconvenience caused.

No worries ;)

@Sh3idan
Copy link
Contributor Author

Sh3idan commented Oct 22, 2020

As the function is implemented differently depending on the OS, I have distinguished the tests for Windows and the others. On Windows, listener is tested instead of interact on other OS. Note that we must sanitize the output of the listener function because of an EOF exception.

@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Dec 16, 2020
@hugovk
Copy link
Member

hugovk commented Apr 12, 2022

Note the telnetlib module is deprecated in 3.11 and set for removal in 3.13.

See PEP 594 – Removing dead batteries from the standard library and #91217.

@AlexWaygood
Copy link
Member

@Fidget-Spinner, given that this is already reviewed and approved and has no merge conflicts, would you like to merge this one? If not, it should perhaps be closed due to the deprecation.

@Fidget-Spinner
Copy link
Member

@AlexWaygood I approved this way back - before I was even a triager.

Unfortunately, Alex and Hugo are right. Since telnetlib is deprecated, the usual policy IMO is to not accept bugfixes unless it's a security fix. I think we should close this PR and the issue.

@AlexWaygood
Copy link
Member

@AlexWaygood I approved this way back - before I was even a triager.

Oh, I had no idea! I hope you don't mind the throwback 🙂

I think we should close this PR and the issue.

Agreed. I'm sorry that this was open for so long without being either merged or rejected, @Sh3idan. We appreciate the contribution — I hope that this does not dissuade you from contributing to CPython in the future 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting core review stale Stale PR or inactive for long period of time.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants