on_demand for xlsx and fixed MemoryError for mmaped big xls file #368

Closed
Dragon2fly wants to merge 7 commits into python-excel:master from Dragon2fly:master

Conversation

Dragon2fly

  • xlsx: on_demand now also works with xlsx files, which significantly reduces workbook loading time when you only need data from a particular sheet. A unit test is included.

  • xls: on_demand with mmap used to have no effect on xls files that have non-contiguous sectors. The last line of _locate_stream loaded the entire contents of the mmapped file into memory, which was time-consuming and raised MemoryError for big xls files (>100MB). The mmap path now works as intended: only the needed range of memory is loaded when a sheet is accessed. This reduces access time and memory use when you only need a particular sheet's data. A unit test is included.
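
To illustrate the idea behind the xls change, here is a minimal standard-library sketch of reading only selected, non-contiguous sectors from a memory-mapped file instead of copying the whole mapping into memory. It is not xlrd's actual `_locate_stream` code; `read_sectors`, `demo`, and the 512-byte `SECTOR_SIZE` are hypothetical names and values chosen for this example.

```python
import mmap
import os
import tempfile

SECTOR_SIZE = 512  # assumed sector size for this sketch only


def read_sectors(mm, sector_ids, sector_size=SECTOR_SIZE):
    """Copy only the listed sectors out of a memory-mapped file.

    Slicing an mmap object copies just that byte range, so the rest
    of the file is never pulled into a Python bytes object.
    """
    chunks = []
    for sid in sector_ids:
        start = sid * sector_size
        chunks.append(mm[start:start + sector_size])
    return b"".join(chunks)


def demo():
    # Build a small file of 8 sectors, each filled with its own index byte.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        for i in range(8):
            tmp.write(bytes([i]) * SECTOR_SIZE)
        path = tmp.name
    try:
        with open(path, "rb") as f:
            mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
            try:
                # A non-contiguous sector chain: sectors 5, 2, then 7.
                data = read_sectors(mm, [5, 2, 7])
            finally:
                mm.close()
    finally:
        os.remove(path)
    return data
```

Calling `demo()` returns three sectors' worth of bytes in chain order. The design point is that memory use scales with the size of the requested stream rather than with the size of the file.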

…o work with xlsx

Add unittest case: xls on_demand mmap: memory error
Add unittest case: xlsx on_demand: loads sheet faster, unload sheet does work
note: xls test file size reduced to 99.9MB to be able to push to github.
@coveralls

coveralls commented May 1, 2020

Coverage increased (+1.2%) to 63.394% when pulling e97443b on Dragon2fly:master into f8371f0 on python-excel:master.

@cjw296
Member

cjw296 commented May 4, 2020

Please see the note under https://github.com/python-excel/xlrd#xlrd.

@cjw296 cjw296 closed this May 4, 2020