stdlib wrongly uses len() for bytes-like object #88605
If you run this code, it raises an exception:

    import pickle
    import lzma
    import pandas as pd
    with lzma.open("test.xz", "wb") as file:
        pickle.dump(pd.DataFrame(range(1_000_000)), file, protocol=5)

The exception:

    Traceback (most recent call last):
      File "E:\testlen.py", line 7, in <module>
        pickle.dump(pd.DataFrame(range(1_000_000)), file, protocol=5)
      File "D:\Python39\lib\lzma.py", line 234, in write
        self._pos += len(data)
    TypeError: object of type 'pickle.PickleBuffer' has no len()

The exception is raised in the lzma.LZMAFile.write() method:
https://github.com/python/cpython/blob/v3.10.0b2/Lib/lzma.py#L238
PickleBuffer doesn't have a .__len__ method. Is that intended? |
Oh, LZMAFile.write() should not use len() directly on input data because it does not always work correctly with memoryview and other objects supporting the buffer protocol. It should use memoryview(data).nbytes or data = memoryview(data).cast('B') if other byte-oriented operations (indexing, slicing) are used. See for example Lib/gzip.py, Lib/_pyio.py, Lib/_compression.py, Lib/ssl.py, Lib/socketserver.py, Lib/wave.py. |
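
To illustrate the point above, here is a minimal sketch of the difference between len() and memoryview(data).nbytes. The helper names byte_length and as_byte_view are hypothetical and exist only for this example; they are not part of lzma.py or of any fix discussed in this issue.

```python
import array
import pickle


def byte_length(data):
    """Return the size in bytes of any buffer-protocol object (hypothetical helper)."""
    with memoryview(data) as view:
        return view.nbytes


def as_byte_view(data):
    """Return a flat, byte-indexed view for slicing or indexing (hypothetical helper)."""
    return memoryview(data).cast("B")


# len() counts items, not bytes, for multi-byte element types.
words = array.array("H", [1, 2, 3])
print(len(words), byte_length(words))

# PickleBuffer supports the buffer protocol but defines no __len__.
buf = pickle.PickleBuffer(b"abc")
try:
    len(buf)
except TypeError as exc:
    print("len() failed:", exc)
print(byte_length(buf))

# After cast("B"), len() and slicing operate on bytes.
view = as_byte_view(words)
print(len(view) == byte_length(words))
```

A write() method that tracks its position could then add byte_length(data) instead of len(data), which would cover bytes, memoryview, array.array and PickleBuffer inputs alike.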
Ok, I'm working on a PR. |
I am checking all the .py files in
I think PR 26764 is ready; it fixes the len() bugs in the bz2.py and lzma.py files. |
Thank you for your contribution Ma Lin. |
Serhiy Storchaka: Sorry, I found
This bug was reported & fixed by GitHub user
The second commit fixes an omission of bpo-41735. It is a very simple fix, so I fixed it in PR 29468 along the way. |
Can this be closed now or is there anything else to do? |
But this bug will not be triggered, because callers of this method always pass bytes data.
So I think this issue can be closed. |
Would it make sense to backport this change to Python 3.8? |
3.8 is in security-fix-only mode. Unless this is a security issue, it cannot be backported there. |