OverflowError: size does not fit in an int #122
Comments
Hi Cory, I believe that the core of this issue is a bug in the Python standard library. I have the impression that you are hitting this overflow when using compression.
Without compression it works fine; the only disadvantage is that it takes ~4.6 GB on disk.
Glad to hear that it is working without compression. I am sorry to hear that compression is not working, but I do believe that this is a limitation of the Python standard library.
IIRC zlib has a 2 GB limit for a single buffer, which stems from a use of `int32` (not `uint32`) inside zlib.
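A chunked workaround can sidestep that limit by never handing zlib a single huge buffer. This is an illustrative sketch, not joblib's actual code path; the `compress_chunked` helper and the chunk size are assumptions for the example:

```python
import zlib

# Hypothetical workaround sketch: feed zlib one slice at a time via compressobj
# so that no single call receives a buffer near the 2 GB (int32) limit.
def compress_chunked(data, chunk_size=1 << 20):
    comp = zlib.compressobj()
    parts = []
    for start in range(0, len(data), chunk_size):
        parts.append(comp.compress(data[start:start + chunk_size]))
    parts.append(comp.flush())
    return b"".join(parts)

payload = b"x" * (4 << 20)  # 4 MiB stand-in for a multi-GB buffer
compressed = compress_chunked(payload)
assert zlib.decompress(compressed) == payload
```

The same idea works for decompression with `zlib.decompressobj()`, feeding the compressed stream back in slices.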
Python 3.4 works fine with the compress option on large files; this may have been fixed as far back as 2011.
I am having the same issue with Python 2.7.
Yes, this is a bug in the zlib module, which is part of the Python standard library. Please upgrade to Python 3 if you want to use compression on large numpy arrays (3.5 is the current stable release).
@ogrisel how was this fixed in Python 3?
I don't know, I just blindly trusted @thm1118's report. Let me check using the zlib module directly.
Indeed, Python 3.5 works fine while Python 2.7 crashes. Python 3.5's zlib wrapper probably uses a 64-bit length type internally.
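The check described above can be exercised at a small scale with the zlib module directly (a real reproduction would need a buffer over 2 GiB, which is impractical to run inline):

```python
import zlib

# On Python 2.7, zlib.compress() raised "OverflowError: size does not fit in an int"
# once the input exceeded 2**31 - 1 bytes; on Python 3.5+ the same call succeeds.
# Exercising the same call path on a small buffer:
data = b"\x00" * (1 << 20)  # 1 MiB; a real reproduction needs > 2 GiB
roundtrip = zlib.decompress(zlib.compress(data))
assert roundtrip == data
```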
@ogrisel Thanks for the help.
I opened #300 to track the issue of better documenting that limitation of Python 2. Let me close the current issue.
The bug mentioned here still exists in joblib 0.10 and Python 2.7 for large non-array objects, which go through Python's built-in pickle.
Actually, there is no easy way to fix this limitation of Python in the joblib code. Let's document it as a limitation of Python 2, as tracked by #300.
@przemyslslaw My previous comment is wrong: the traceback you reported is not related to the problem reported by @corydolphin, which was about zlib compression. In your case zlib is not involved at all. Please feel free to open a new issue with a minimal reproduction script.
I am still getting this error on Python 3.
It would be great if you could post a stand-alone snippet reproducing the problem, together with the full stack trace.
I am having trouble with joblib failing to dump a large (~60GB) numpy matrix.
Using joblib 8.03a, numpy 1.8, and scipy 0.13 on Ubuntu 13.04, on a 64-bit CPU. The error seems to suggest that the size of the data in bytes is larger than an int, which should be impossible: a 64-bit int has a max value of ~2^63, which is on the order of exabytes.
Has anyone seen this error? I plan to pull the latest from numpy, joblib and scipy and see if there is something unreleased that fixes this issue.
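The arithmetic in the report can be sanity-checked: ~60 GB fits comfortably in a 64-bit int but exceeds the signed 32-bit limit discussed earlier in the thread. The constant names below are illustrative, not taken from any library:

```python
# Sanity-checking the sizes involved (constant names are illustrative):
INT32_MAX = 2**31 - 1            # signed 32-bit limit (~2.1 GB), used inside zlib
INT64_MAX = 2**63 - 1            # signed 64-bit limit (~9.2 exabytes)
size_60gb = 60 * 1024**3         # the ~60 GB matrix from the report

assert size_60gb < INT64_MAX     # fits comfortably in a 64-bit int
assert size_60gb > INT32_MAX     # but overflows a 32-bit int, hence the error
```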