
FIX: Fixed np.load time when file is large & compressed. #26509

Open

rajat315315 wants to merge 3 commits into main
Conversation

rajat315315

Fixes: #26498

Instead of passing a zipfile.ZipExtFile object to format.read_array(), I am now passing an _io.BufferedReader, whose read() is faster.

```python
# The fast path only applies to stored (uncompressed) members.
assert info.compress_type == 0
# Seek the archive's underlying file object to the start of the
# member's raw data, past its local file header.
self.zip.fp.seek(
    info.header_offset + len(info.FileHeader()) + 20
)
```
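
For illustration, here is a minimal sketch of that approach (it relies on zipfile internals such as `zf.fp` and on the ZIP local-file-header layout, and is not the PR's exact code):

```python
import struct
import zipfile

from numpy.lib import format


def read_member_fast(zf: zipfile.ZipFile, name: str):
    """Sketch: read a stored (uncompressed) .npy member through the
    archive's own buffered file object instead of a ZipExtFile."""
    info = zf.getinfo(name)  # documented, unlike zf.NameToInfo
    if info.compress_type != zipfile.ZIP_STORED:
        # Compressed members still need ZipExtFile to inflate the data.
        return format.read_array(zf.open(name))

    fp = zf.fp  # the underlying _io.BufferedReader (an implementation detail)
    fp.seek(info.header_offset)
    header = fp.read(30)  # the fixed-size part of the local file header
    fname_len, extra_len = struct.unpack("<HH", header[26:30])
    # The raw data starts right after the header, file name, and extra field.
    fp.seek(info.header_offset + 30 + fname_len + extra_len)
    return format.read_array(fp)
```

For a stored member the bytes on disk are the .npy stream itself, which is why the excerpt above can assert `compress_type == 0` and hand the raw file object straight to `read_array()`.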
Member

A small nitpick: NameToInfo seems undocumented, so I think using .getinfo() would be nicer (unfortunately, FileHeader() is undocumented as well).

One worry: without duplicating fp first, this isn't thread-safe. Or is NpzFile so fundamentally not thread-safe that we don't have to worry about it?
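
(For context, "duplicating fp first" might look like the sketch below, assuming the NpzFile was opened from a real path so that `zf.filename` is usable; `data_offset` stands for the member's data position, computed as in the sketch above.)

```python
import zipfile

from numpy.lib import format


def read_member_threadsafe(zf: zipfile.ZipFile, data_offset: int):
    # One private handle per read: concurrent readers each seek their
    # own file object instead of racing on the shared zf.fp position.
    with open(zf.filename, "rb") as fp:
        fp.seek(data_offset)
        return format.read_array(fp)
```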


I have been wondering if we could clean up read_array to use fp.readinto(res_array) to save unnecessary copies. Unfortunately, zipfile doesn't implement a specialized readinto, so that only saves one of the two copies (which may still be nice on its own).
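
(For illustration, the readinto() idea could look like the sketch below; `shape` and `dtype` stand in for whatever read_array parses out of the .npy header.)

```python
import numpy as np


def read_data_readinto(fp, shape, dtype):
    # Preallocate the result and let the file object fill its buffer
    # directly, avoiding the intermediate bytes object that fp.read(n)
    # followed by np.frombuffer(...) would create.
    res_array = np.empty(shape, dtype=dtype)  # C-contiguous, fixed-size dtype
    nread = fp.readinto(memoryview(res_array).cast("B"))
    assert nread == res_array.nbytes
    return res_array
```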

Author

Maybe the OP can answer about thread-safety, as he seems to have introduced the code.


Successfully merging this pull request may close these issues.

np.load("a.npz") very slow when a.npz file very large