Can’t safely put NUL or CR bytes inside a double-quoted string #186

Open
andersk opened this issue Apr 5, 2016 · 4 comments

@andersk
Contributor

andersk commented Apr 5, 2016

Inside a double-quoted string, Pyth translates CR (\r) to LF (\n). NUL bytes (\000) seem to work unless followed by a digit 0–7, because Pyth translates them to \0 instead of \000, so the following digits get absorbed into the octal escape.

$ printf '"\r"' | xxd
00000000: 220d 22                                  "."
$ printf '"\r"' | pyth -d /dev/stdin
==================== 3 chars =====================
"
"
==================================================
imp_print("\n")
==================================================

$ printf '"\00012"' | xxd
00000000: 2200 3132 22                             ".12"
$ printf '"\00012"' | pyth -d /dev/stdin 
==================== 5 chars =====================
"12"
==================================================
imp_print("\012")
==================================================
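
The second case follows from Python's octal escapes: \0 followed by octal digits is read as one longer escape, so \0 plus 12 becomes \012, which is a line feed. Here is a minimal sketch of the effect and of one way to avoid it. `escape_nul` is a hypothetical helper for illustration, not Pyth's actual code.

```python
import ast

def escape_nul(text, pad=True):
    """Escape NUL bytes for embedding in a Python string literal.

    With pad=False this mimics the short \\0 form; with pad=True it emits
    the full three-digit \\000 form, which cannot absorb following digits.
    """
    return text.replace("\0", "\\000" if pad else "\\0")

source = "\0" + "12"  # NUL followed by the digits 1 and 2

# Build the literal the way a code generator might (hypothetical).
short_literal = '"' + escape_nul(source, pad=False) + '"'   # "\012"
padded_literal = '"' + escape_nul(source, pad=True) + '"'   # "\00012"

# Evaluate both literals to see what string Python actually produces.
short_value = ast.literal_eval(short_literal)
padded_value = ast.literal_eval(padded_literal)

print(repr(short_value))   # the short form collapses into a single newline
print(repr(padded_value))  # the padded form keeps NUL, '1', '2'
```

Padding to three octal digits works because Python reads at most three digits after the backslash.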
@isaacg1
Owner

isaacg1 commented Apr 5, 2016

I've fixed the null byte issue, but the CR issue seems to be introduced by Python. I'll need to investigate more for that one.

@andersk
Contributor Author

andersk commented Apr 5, 2016

If you replace open(file_or_string, encoding='iso-8859-1') with open(file_or_string, encoding='iso-8859-1', newline=''), then Python will stop translating \r and \r\n to \n. Of course, you may then need to teach Pyth to keep accepting \r and \r\n in various other places where newlines are significant, to keep Mac and Windows users happy.

(It may be cleaner, but more work, to open in binary mode and use bytes everywhere?)
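
A small sketch of the difference between the three ways of opening the file, using a temporary file in place of a real Pyth program. The variable names are made up for illustration.

```python
import os
import tempfile

# Write a three-byte program: a double quote, CR, and a double quote.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'"\r"')
    path = tmp.name

try:
    # Default text mode: universal newlines turn \r (and \r\n) into \n.
    with open(path, encoding='iso-8859-1') as f:
        translated = f.read()

    # newline='' disables newline translation when reading.
    with open(path, encoding='iso-8859-1', newline='') as f:
        untranslated = f.read()

    # Binary mode never translates; decoding becomes an explicit step.
    with open(path, 'rb') as f:
        raw = f.read().decode('iso-8859-1')
finally:
    os.remove(path)

print(repr(translated), repr(untranslated), repr(raw))
```

Only the default text-mode read loses the CR. The other two hand back the bytes unchanged, which is why the rest of the parser would then have to recognize \r and \r\n itself.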

@vendethiel
Contributor

\r hasn't been used on Mac for a while now

@andersk
Contributor Author

andersk commented Aug 2, 2016

There are similar issues with \ followed by NUL or LF or CR.

\␀ → imp_print("␀") → ValueError: source code string cannot contain null bytes

\␊ or \␍ → IndexError: string index out of range
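
For the \␀ case, the generated code carries the raw NUL byte, which Python refuses to compile. One possible approach, sketched here under the assumption that the generator can build the literal itself, is to let `repr()` produce it. `char_literal` is a hypothetical name, not a function in Pyth.

```python
import ast

def char_literal(ch):
    """Return Python source text for the one-character string ch.

    repr() escapes control characters (NUL -> \\x00, LF -> \\n, CR -> \\r),
    so the generated source never contains the raw byte.
    """
    return repr(ch)

# Generate a literal for each problematic character, then evaluate it back.
literals = {ch: char_literal(ch) for ch in ("\0", "\n", "\r")}
round_trip = {ch: ast.literal_eval(lit) for ch, lit in literals.items()}

for ch, lit in literals.items():
    print(lit, round_trip[ch] == ch)
```

This only covers the code-generation side. The IndexError for \␊ and \␍ is raised earlier and would need its own fix in the parser.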
