Proposed new feature or change:
I'm working with a bunch of large 2D arrays (about 10000 x 10000), and I need to compress and save the data quickly.
The problem is that savez_compressed() always uses the default compression level of the zipfile.ZipFile class.
(The default compression level for zipfile.ZIP_DEFLATED is 6.)
Here is how compression levels 1 and 6 compare on my data:
- With compression level 1: compression ratio 66.8%, execution time about 3.2 seconds
- With compression level 6: compression ratio 69.5%, execution time about 24.7 seconds
--> Compression level 1 is the most suitable for my project.
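For anyone who wants to run a similar comparison on their own data, here is a minimal sketch. It assumes that calling zlib directly is a fair stand-in for the deflate step inside the zip archive, and it uses a small synthetic array rather than my real data, so the numbers it prints will not match the ones above. The helper name `measure` and the sample array are made up for illustration, and "ratio" here means compressed size divided by original size.

```python
import time
import zlib

import numpy as np


def measure(data, level):
    """Return (compressed size / original size, elapsed seconds) for one zlib level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(compressed) / len(data), elapsed


# Hypothetical stand-in data: small integers stored as int32, so it compresses well.
rng = np.random.default_rng(0)
sample = rng.integers(0, 16, size=(1000, 1000), dtype=np.int32)
raw = sample.tobytes()

# Compare the two levels discussed above.
results = {level: measure(raw, level) for level in (1, 6)}
for level, (ratio, seconds) in results.items():
    print(f"level {level}: {ratio:.1%} of original size, {seconds:.3f} s")
```

How much faster level 1 is, and how much compression it gives up, depends heavily on the data, so it is worth measuring with a representative array.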
Therefore, I would like to be able to configure the compression level in savez_compressed().
Here is my solution. I was able to solve it by adding just a few lines.
Below is my modified _savez() function definition from numpy/lib/npyio.py:
    def _savez(file, args, kwds, compress, allow_pickle=True, pickle_kwargs=None):
        # Import is postponed to here since zipfile depends on gzip, an optional
        # component of the so-called standard library.
        import zipfile

        if not hasattr(file, 'write'):
            file = os_fspath(file)
            if not file.endswith('.npz'):
                file = file + '.npz'

        namedict = kwds
        for i, val in enumerate(args):
            key = 'arr_%d' % i
            if key in namedict.keys():
                raise ValueError(
                    "Cannot use un-named variables and keyword %s" % key)
            namedict[key] = val

        if compress:
            compression = zipfile.ZIP_DEFLATED
        else:
            compression = zipfile.ZIP_STORED

        # Take the optional compression level out of the keyword arguments so
        # that it is not also saved as an array named 'compresslevel'.
        compresslevel = namedict.pop('compresslevel', None)
        # Fall back to the zipfile default for anything that is not an int in 1..9.
        if not isinstance(compresslevel, int) or compresslevel < 1 or compresslevel > 9:
            compresslevel = None

        zipf = zipfile_factory(file, mode="w", compression=compression, compresslevel=compresslevel)

        for key, val in namedict.items():
            fname = key + '.npy'
            val = np.asanyarray(val)
            # always force zip64, gh-10776
            with zipf.open(fname, 'w', force_zip64=True) as fid:
                format.write_array(fid, val,
                                   allow_pickle=allow_pickle,
                                   pickle_kwargs=pickle_kwargs)

        zipf.close()
It works!!
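Something similar can also be done from user code without patching NumPy. The following is only a sketch under a few assumptions: it uses the standard-library zipfile.ZipFile with its compresslevel argument (available since Python 3.7) together with numpy.lib.format.write_array, and the helper name `savez_with_level` is hypothetical, not a NumPy function. It mirrors the member naming and force_zip64 choice from the snippet above, but it is not the NumPy implementation.

```python
import io
import zipfile

import numpy as np
from numpy.lib.format import write_array


def savez_with_level(file, compresslevel=1, **arrays):
    """Write arrays into an .npz-style zip archive with an explicit deflate level.

    Hypothetical helper for illustration; not part of NumPy.
    """
    with zipfile.ZipFile(file, mode="w",
                         compression=zipfile.ZIP_DEFLATED,
                         compresslevel=compresslevel) as zipf:
        for name, value in arrays.items():
            # Each array becomes one '<name>.npy' member, as in np.savez.
            with zipf.open(name + ".npy", "w", force_zip64=True) as fid:
                write_array(fid, np.asanyarray(value), allow_pickle=False)


# Round trip through an in-memory buffer to check that np.load can read it back.
buffer = io.BytesIO()
data = np.arange(12, dtype=np.float64).reshape(3, 4)
savez_with_level(buffer, compresslevel=1, data=data)

buffer.seek(0)
with np.load(buffer) as loaded:
    restored = loaded["data"]

print(np.array_equal(restored, data))
```

Because the compression level is a named parameter of the helper rather than an entry in the keyword arguments, it cannot collide with an array that happens to be called compresslevel.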
I don't yet understand the GitHub process for taking an idea like this all the way to a merge in such a big open source repository, because I have no experience contributing upstream on GitHub. So I'm just sharing my idea through an issue.
Thanks.