BufferedIncrementalEncoder violates IncrementalEncoder interface #64619
The documentation of IncrementalEncoder.getstate() says: """
But the implementation of BufferedIncrementalEncoder.getstate() is:

    def getstate(self):
        return self.buffer or 0

self.buffer is "unencoded input that is kept between calls to encode()", e.g. a string.
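The mismatch can be observed directly on the encoder, without going through TextIOWrapper. The following is a minimal sketch, assuming the stdlib `idna` codec still builds on `BufferedIncrementalEncoder` and that `getstate()` is implemented as quoted above; the variable names are illustrative only.

```python
import codecs

# Build the incremental encoder for the stdlib "idna" codec.
enc = codecs.getincrementalencoder("idna")()

# A label without a trailing dot may still be unfinished, so the
# encoder is expected to hold it back instead of emitting bytes.
out = enc.encode("x")

# With getstate() returning `self.buffer or 0`, the state is the
# buffered text itself rather than an integer.
state = enc.getstate()

print(repr(out), repr(state), type(state).__name__)
```

If the implementation is as quoted, the reported type is `str`, not `int`.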
I dug up an ancient email about that subject:
And indeed the incremental encoder for idna behaves strangely:
>>> import io
>>> b = io.BytesIO()
>>> s = io.TextIOWrapper(b, 'idna')
>>> s.write('x')
1
>>> s.tell()
0
>>> b.getvalue()
b''
>>> s.write('.')
1
>>> s.tell()
2
>>> b.getvalue()
b'x.'
>>> b = io.BytesIO()
>>> s = io.TextIOWrapper(b, 'idna')
>>> s.write('x')
1
>>> s.seek(s.tell())
0
>>> s.write('.')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/walter/.local/lib/python3.3/codecs.py", line 218, in encode
    (result, consumed) = self._buffer_encode(data, self.errors, final)
  File "/Users/walter/.local/lib/python3.3/encodings/idna.py", line 246, in _buffer_encode
    result.extend(ToASCII(label))
  File "/Users/walter/.local/lib/python3.3/encodings/idna.py", line 73, in ToASCII
    raise UnicodeError("label empty or too long")
UnicodeError: label empty or too long

The cleanest solution would probably be to switch to a (buffered_input, additional_state_info) state. However, I don't know what changes this would require in the seek/tell implementations.
IncrementalNewlineDecoder requires the decoder state to be an integer (the C implementation requires at most a 63-bit unsigned integer). TextIOWrapper requires the decoder state to be at most a 64-bit unsigned integer (only 63 bits if universal newlines is enabled).
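For illustration only, one way a buffered encoder could report an integer state is to pack its pending text into an int. This is a rough sketch, not a proposed patch: `IntStateLabelEncoder` and its packing scheme are hypothetical, and the resulting integer outgrows 64 bits as soon as more than a few characters are buffered, so it would not by itself satisfy size limits like the ones described above for decoders.

```python
import codecs


class IntStateLabelEncoder(codecs.BufferedIncrementalEncoder):
    """Hypothetical encoder that only emits text up to the last '.'."""

    def _buffer_encode(self, input, errors, final):
        # Hold back everything after the last dot until more input arrives.
        cut = len(input) if final else input.rfind(".") + 1
        return input[:cut].encode("ascii", errors), cut

    def getstate(self):
        # Pack the pending text into an int. The leading 0x01 byte
        # preserves leading zero bytes in the payload.
        if not self.buffer:
            return 0
        raw = b"\x01" + self.buffer.encode("utf-8")
        return int.from_bytes(raw, "big")

    def setstate(self, state):
        if state == 0:
            self.buffer = ""
            return
        raw = state.to_bytes((state.bit_length() + 7) // 8, "big")
        self.buffer = raw[1:].decode("utf-8")  # drop the 0x01 marker


enc = IntStateLabelEncoder()
enc.encode("x")                 # "x" is buffered, nothing emitted
state = enc.getstate()          # an int carrying the buffered "x"

restored = IntStateLabelEncoder()
restored.setstate(state)        # a fresh encoder picks up the pending "x"
result = restored.encode(".")
```

The design choice here is only to show that the buffer can round-trip through an integer; whether seek/tell could make use of such a state is a separate question.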
For what it’s worth, both io.TextIOWrapper and _pyio.TextIOWrapper appear to only ever call IncrementalEncoder.setstate(0). And the newline _decoder_ is not relevant because it doesn’t use any _encoder_.
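The effect of calling setstate(0) on an encoder that still holds buffered input can be sketched as follows. `DotBufferedEncoder` is a hypothetical stand-in for the idna encoder (it holds back text after the last dot but does no label validation), and the sketch assumes `BufferedIncrementalEncoder.setstate()` simply replaces the buffer.

```python
import codecs


class DotBufferedEncoder(codecs.BufferedIncrementalEncoder):
    """Hypothetical stand-in for idna: holds back text after the last '.'."""

    def _buffer_encode(self, input, errors, final):
        cut = len(input) if final else input.rfind(".") + 1
        return input[:cut].encode("ascii", errors), cut


enc = DotBufferedEncoder()
first = enc.encode("x")          # nothing emitted; "x" waits in enc.buffer
enc.setstate(0)                  # the only call TextIOWrapper appears to make
pending_after_reset = enc.buffer # the buffered "x" has been discarded
second = enc.encode(".")         # encoded without the earlier "x"
```

With the real idna encoder, the same loss of the buffered label would leave an empty label in front of the dot, which is consistent with the UnicodeError in the traceback above.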