OverflowError: size does not fit in an int #122

Closed
corydolphin opened this issue Mar 4, 2014 · 17 comments

@corydolphin

I am having trouble with joblib failing to dump a large (~60GB) numpy matrix.

Using joblib 8.03a, numpy 1.8, scipy 13, on Ubuntu 13.04, on a 64-bit CPU. The error seems to suggest that the size of the data in bytes does not fit in an int, which should be impossible: a signed 64-bit int has a max value of 2^63 - 1, which is on the order of exabytes.
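
For reference, the limits in question can be worked out directly. This is a minimal sketch in plain Python arithmetic, not joblib code; the constant names are made up for illustration. It compares the largest signed 32-bit and 64-bit values when read as byte counts.

```python
# Largest values representable by signed 32-bit and 64-bit integers.
INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1

GIB = 2**30  # bytes per gibibyte
EIB = 2**60  # bytes per exbibyte

# A size stored in a signed 32-bit int tops out just under 2 GiB.
print("int32 max in GiB:", INT32_MAX / GIB)
# A size stored in a signed 64-bit int tops out just under 8 EiB.
print("int64 max in EiB:", INT64_MAX / EIB)
```

A multi-gigabyte array fits easily in the second limit but not in the first, which would be consistent with a 32-bit size field somewhere in the call chain.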

Has anyone seen this error? I plan to pull the latest from numpy, joblib and scipy and see if there is something unreleased that fixes this issue.

Failed to save <type 'numpy.ndarray'> to .npy file:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/numpy_pickle.py", line 241, in save
    obj, filename = self._write_array(obj, filename)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/numpy_pickle.py", line 214, in _write_array
    compress=self.compress)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/numpy_pickle.py", line 89, in write_zfile
    file_handle.write(zlib.compress(asbytes(data), compress))
OverflowError: size does not fit in an int

Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/ubuntu/data-exploration/lin_regression/get_data.py", line 177, in <module>
    joblib.dump(X, 'scratch/lin_regression_data/%s_X.pkl' % experimentName, compress=3)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/numpy_pickle.py", line 367, in dump
    pickler.dump(value)
  File "/usr/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/numpy_pickle.py", line 249, in save
    return Pickler.save(self, obj)
  File "/usr/lib/python2.7/pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
    save(state)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/numpy_pickle.py", line 249, in save
    return Pickler.save(self, obj)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
    save(v)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/numpy_pickle.py", line 249, in save
    return Pickler.save(self, obj)
  File "/usr/lib/python2.7/pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
    save(state)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/numpy_pickle.py", line 249, in save
    return Pickler.save(self, obj)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 562, in save_tuple
    save(element)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/numpy_pickle.py", line 249, in save
    return Pickler.save(self, obj)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 486, in save_string
    self.write(BINSTRING + pack("<i", n) + obj)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
@GaelVaroquaux
Member

Hi Cory,

I believe that the core of this issue is a bug in the Python standard
library, which uses 32-bit ints in some places, and thus you are hitting an
overflow.

I have the impression that you are hitting this overflow when using
compression. My hunch is that the code path not using compression might be
more robust to this overflow (because it relies in less subtle ways on the
standard library). Could you try without compression?

@corydolphin
Author

Without compression it works fine, the only disadvantage being that it takes ~4.6Gb on disk

@GaelVaroquaux
Member

Without compression it works fine, the only disadvantage being that it takes ~4.6Gb on disk

Glad to hear that it is working without compression.

I am sorry to hear that compression is not working, but I do believe that
it is outside of our control and in Python's standard library's codebase.
I think that it may very well be fixed in some Python 3 version.

@esc
Contributor

esc commented Jul 4, 2014

IIRC zlib has a 2GB limit for a single buffer, which stems from a use of 'int32' (not 'uint32') in zlib, observe:

In [9]: f = "a" * (2**31-1)

In [10]: c = zlib.compress(f)

In [11]: f = "a" * (2**31)

In [12]: c = zlib.compress(f)
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-12-e21f43971797> in <module>()
----> 1 c = zlib.compress(f)

OverflowError: size does not fit in an int
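
If the limit applies per call, as the session above suggests, one possible workaround is to feed the data to a streaming compressor in smaller pieces. The following is a minimal sketch under that assumption, not joblib's implementation: `write_compressed_in_chunks` and its parameters are hypothetical names, the code is written for Python 3, and whether it sidesteps the error on Python 2.7 has not been verified here.

```python
import io
import zlib

def write_compressed_in_chunks(data, file_handle, level=3,
                               chunk_size=64 * 1024 * 1024):
    """Stream `data` through zlib, passing at most `chunk_size` bytes per call."""
    compressor = zlib.compressobj(level)
    view = memoryview(data)  # slices without copying; assumes bytes or bytearray
    for start in range(0, len(view), chunk_size):
        file_handle.write(compressor.compress(view[start:start + chunk_size]))
    # flush() emits whatever the compressor still buffers and ends the stream.
    file_handle.write(compressor.flush())

# Small demonstration; the input is kept tiny so that it runs quickly.
payload = b"a" * (4 * 1024 * 1024)
out = io.BytesIO()
write_compressed_in_chunks(payload, out, chunk_size=1024 * 1024)
assert zlib.decompress(out.getvalue()) == payload
```

The output is a single ordinary zlib stream, so it can be read back with `zlib.decompress` or, for large files, with `zlib.decompressobj` in chunks.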

@thm1118

thm1118 commented May 5, 2015

Python 3.4 works fine with the compress option for large files; this may have been fixed in 2011.
It would be helpful to mention this tip in the documentation.

@mdasadul

I am having the same issue with python 2.7

@ogrisel
Contributor

ogrisel commented Jan 18, 2016

I am having the same issue with python 2.7

Yes this is a bug in the zlib module which is part of the standard library of Python. Please upgrade to Python 3 if you want to use compression on large numpy arrays (3.5 is the current stable release).

@esc
Contributor

esc commented Jan 18, 2016

@ogrisel how was this fixed in Python 3?

@ogrisel
Contributor

ogrisel commented Jan 18, 2016

I don't know, I just blindly trusted @thm1118's report. Let me check using the zlib module directly.

@ogrisel
Contributor

ogrisel commented Jan 18, 2016

Indeed Python 3.5 works fine while Python 2.7 crashes:

$ python3.5 -c "import zlib; zlib.compress(b'a' * (2 ** 31))"
$ python2.7 -c "import zlib; zlib.compress(b'a' * (2 ** 31))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
OverflowError: size does not fit in an int

Python 3.5's zlib wrapper probably uses uint instead of int for storing the size of the byte array. Or maybe even the size_t type to be able to address much larger arrays.
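
To make the difference between these C types concrete, here is a small sketch using the standard `ctypes` module to report their widths on the machine it runs on. It says nothing about what the zlib wrapper actually uses; it only shows why a size held in a C `int` can overflow on a 64-bit CPU while a `size_t` would not.

```python
import ctypes

# Width in bits of the two C types on the current platform.
int_bits = 8 * ctypes.sizeof(ctypes.c_int)
size_t_bits = 8 * ctypes.sizeof(ctypes.c_size_t)

# Largest byte count each type can hold.
int_max = 2 ** (int_bits - 1) - 1   # C int is signed
size_t_max = 2 ** size_t_bits - 1   # size_t is unsigned

print("int:   ", int_bits, "bits, max", int_max)
print("size_t:", size_t_bits, "bits, max", size_t_max)
```

On typical 64-bit Linux builds one would expect `int` to be 32 bits and `size_t` to be 64 bits, but the script reports whatever the local platform defines.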

@mdasadul

@ogrisel Thanks for the help

@ogrisel
Contributor

ogrisel commented Jan 18, 2016

I opened #300 to track the issue of better documenting that limitation of Python 2.

Let me close the current issue.

@ogrisel ogrisel closed this as completed Jan 18, 2016
@przemyslslaw

przemyslslaw commented Aug 17, 2016

The bug mentioned here still exists in joblib 0.10 and Python 2.7 for large non-array objects that are pickled via Parallel. It happens because CustomizablePickler fails to dump, presumably because of the issue described in #300. The merge #260 does not fix it, because CustomizablePickler uses its own dump. Is there any workaround?

Process PoolWorker-16:
Traceback (most recent call last):
  File "/NS/anaconda/anaconda/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/NS/anaconda/anaconda/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/NS/anaconda/anaconda/lib/python2.7/multiprocessing/pool.py", line 122, in worker
    put((job, i, (False, wrapped)))
  File "/NS/anaconda/anaconda/lib/python2.7/site-packages/joblib/pool.py", line 386, in put
    return send(obj)
  File "/NS/anaconda/anaconda/lib/python2.7/site-packages/joblib/pool.py", line 371, in send
    CustomizablePickler(buffer, self._reducers).dump(obj)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 554, in save_tuple
    save(element)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 554, in save_tuple
    save(element)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 401, in save_reduce
    save(args)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 554, in save_tuple
    save(element)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 492, in save_string
    self.write(BINSTRING + pack("<i", n) + obj)
error: 'i' format requires -2147483648 <= number <= 2147483647

@ogrisel ogrisel reopened this Aug 17, 2016
@ogrisel
Contributor

ogrisel commented Aug 17, 2016

Actually there is no easy way to fix this limitation of Python in the code of joblib. Let's document it as a limitation of Python 2 as tracked by #300.

@ogrisel ogrisel closed this as completed Aug 17, 2016
@ogrisel
Contributor

ogrisel commented Aug 17, 2016

@przemyslslaw My previous comment is wrong: the traceback you reported is not related to the problem reported by @corydolphin which was about zlib compression. In your case zlib is not involved at all. Please feel free to open a new issue with a minimalistic reproduction script.
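
The failing line in that traceback is `pack("<i", n)` in `save_string`, where `n` is the length of the string being pickled. The sketch below reproduces only that packing step with the standard `struct` module, without allocating any large object; the helper name `fits_in_signed_int32` is hypothetical.

```python
import struct

def fits_in_signed_int32(n):
    """Return True if n can be packed with the "<i" format used in the traceback."""
    try:
        struct.pack("<i", n)
        return True
    except struct.error:
        return False

print(fits_in_signed_int32(2**31 - 1))  # largest length the format accepts
print(fits_in_signed_int32(2**31))      # one past the limit
```

Any single string of 2 GiB or more would exceed this length field, independently of zlib.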

@listentojohan

listentojohan commented Mar 14, 2017

I am still getting this error with Python 3.
I had compress=3.
Running it with compress=4 fixed it...

@lesteve
Member

lesteve commented Mar 14, 2017

It would be great if you could post a stand-alone snippet reproducing the problem together with the full stacktrace.
