SyntaxError: encoding problem: iso-8859-1 on Windows #65043
Comments
Microsoft Windows [Version 6.1.7601]
C:\bug>python
Python 3.3.5rc2 (v3.3.5rc2:ca5635efe090, Mar 2 2014, 18:18:29) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
C:\bug>python test2.py
|
It's a duplicate of the issue bpo-20731. |
It seems that this is not fixed in 3.3.5. Someone please reproduce it. |
Works fine for me |
Thanks Mark. Perhaps the problem is text-mode handling. With a Windows text-mode stream, ftell() may return -1 even if no error occurred. |
When a file with LF newlines is opened, ftell() may return zero even when the position is not at the beginning of the file. Maybe files with LF newlines should be opened in binary mode. |
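A minimal sketch of the binary-mode idea, in Python rather than the C tokenizer. The file name and contents are hypothetical, and Python 3's io layer does not go through the CRT's FILE streams, so this does not reproduce the bug. It only illustrates that in binary mode no newline translation happens, so the reported offset is an exact byte count.

```python
import os
import tempfile

# Hypothetical LF-only source file with a coding declaration.
data = b"# coding: latin-1\nprint('hi')\n"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "lf_only.py")
    with open(path, "wb") as f:
        f.write(data)

    # Binary mode: bytes are returned untranslated, so tell() is the
    # number of bytes consumed so far.
    with open(path, "rb") as f:
        first_line = f.readline()
        position = f.tell()

print(first_line, position)
```

Here `position` equals `len(first_line)`, which is the property the tokenizer would need in order to seek back reliably.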
I can reproduce this one. There are a few conditions which need to be met:
More observations:
|
I can reproduce this with 3.4.1 and 3.5.0. |
This fix for bpo-20731 doesn't address this bug completely because it's possible for ftell to return -1 without an actual error, as test2.py demonstrates.
In text mode, CRLF is translated to LF by the CRT's _read function (Win32 ReadFile). So the buffer that's used by FILE streams is already translated. To get the stream position, ftell first calls _lseek (Win32 SetFilePointer) to get the file pointer. Then it adjusts the file pointer for the unwritten/unread bytes in the buffer. The problem for reading is how to tell whether or not LF in the buffer was translated from CRLF? The chosen 'solution' is to just assume CRLF.
The example file test2.py is 33 bytes. At the time fp_setreadl calls ftell(tok->fp), the file pointer is 33, and Py_UniversalNewlineFgets has read the stream up to '#coding:latin-1\n'. That leaves 17 newline characters buffered. As stated above, ftell assumes CRLF, so it calculates the stream position as 33 - (17 * 2) == -1. That happens to be the value returned for an error, but who's checking? In this case, errno is 0 instead of the documented errno constants EBADF or EINVAL.
Here's an example in 2.7.7, since it uses FILE streams:
>>> f = open('test2.py')
>>> f.read(16)
'#coding:latin-1\n'
>>> f.tell()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 0] Error
Can the file be opened in binary mode in Modules/main.c? Currently it's using |
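The arithmetic in the comment above can be modeled in a few lines of Python. This is a sketch of the described behaviour, not the CRT's actual code: the function name `crt_text_mode_ftell` is hypothetical, and the file contents are reconstructed from the sizes given (a 16-byte first line followed by 17 LF bytes, 33 bytes in total).

```python
def crt_text_mode_ftell(file_pointer, buffered_unread):
    """Model of the text-mode ftell estimate described in the comment.

    Assumption: every LF still in the buffer is treated as if it had
    been translated from CRLF, so it is counted as two bytes on disk.
    """
    newlines = buffered_unread.count(b"\n")
    # Subtract the unread bytes, plus one extra byte per LF for the
    # CR that is assumed to have preceded it.
    return file_pointer - (len(buffered_unread) + newlines)


content = b"#coding:latin-1\n" + b"\n" * 17   # 33 bytes, LF only
consumed = b"#coding:latin-1\n"               # read by the tokenizer
unread = content[len(consumed):]              # 17 LF bytes still buffered

estimate = crt_text_mode_ftell(len(content), unread)
true_offset = len(consumed)

print(estimate, true_offset)  # -1 16
```

The estimate is off by exactly the number of buffered LF bytes, which is why a file with enough LF-only newlines after the coding line can push the result to -1 and make it indistinguishable from an error return.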
I've tried to make the title more meaningful, feel free to change it if you can think of something better. |
This bug just bit me. Changing "# coding: utf8" to "# coding: utf-8" works around it. |
(oops: with Python 3.4.1 on Windows) |
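The thread does not say why the spelling matters. As a small check, assuming only the standard `codecs` module, both spellings resolve to the same codec, so the workaround changes how the declaration is spelled and not how the file is decoded:

```python
import codecs

# Both spellings are aliases of the same codec; lookup() returns a
# CodecInfo whose .name is the canonical codec name.
plain = codecs.lookup("utf8").name
hyphenated = codecs.lookup("utf-8").name

print(plain, hyphenated)
```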
I've just been bitten by this on 3.6.2, Windows Server 2008 R2, when running the setup.py script for QuantLib-SWIG: It seems there is different behaviour depending on whether:
Some of that has been mentioned previously, but I think the 4096-byte limit might be new, which is why I'm posting. I've attached a script I used to come up with the results below. It contains:
The file's length is exactly 4096 bytes. Running this, or slightly modified versions of this, with a 3.6.2 interpreter gave the following results:
I had no issues with python 2.7.13. |