make_response_from_records with large data sets (xlsx) crashes #16

Closed
rotten opened this issue Aug 24, 2016 · 5 comments

Comments

rotten commented Aug 24, 2016

If I call make_response_from_records() to convert 1M rows to CSV, it works without a problem. However, if I do the same thing with XLSX, memory usage climbs until the whole Flask application crashes and dies.

Granted, 1M rows is already close to what an Excel worksheet can hold anyhow (the XLSX limit is 1,048,576 rows per sheet).

I can check the number of rows before I call make_response_from_records() to mitigate this problem.

I was opening this issue to see if that function could be updated to throw an exception if too many rows are passed instead of happily ingesting them until it dies.
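For anyone wanting the row-count check in the meantime, here is a minimal sketch of such a guard. The names `check_row_count`, `TooManyRowsError` and `XLSX_MAX_ROWS` are made up for illustration and are not part of Flask-Excel or pyexcel; the only hard number is the XLSX per-sheet row limit, and a safe threshold for a given server is probably much lower and has to be found by testing.

```python
XLSX_MAX_ROWS = 1_048_576  # per-sheet row limit of the .xlsx format


class TooManyRowsError(ValueError):
    """Hypothetical exception: the record set is too large to export."""


def check_row_count(records, max_rows=XLSX_MAX_ROWS - 1):
    """Return records unchanged, or raise if there are more than max_rows.

    The default leaves one row for the header. Assumes records is a sized
    sequence (for example a list of dicts), not a one-shot iterator.
    """
    count = len(records)
    if count > max_rows:
        raise TooManyRowsError(
            f"{count} rows exceeds the limit of {max_rows} for this export"
        )
    return records
```

In a view this could wrap the existing call, along the lines of `excel.make_response_from_records(check_row_count(rows, max_rows=100000), "xlsx")`, where `100000` is just a placeholder value.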

Member

chfw commented Aug 25, 2016

I think this is a challenge for the underlying library, openpyxl. The boundary condition depends on the local computer's memory capacity and the number of cells to be written, so the condition for throwing an exception would vary per computer and per writing task. To be honest, I do not know how to find a universal boundary condition.

I found a couple of threads relevant to this topic on Stack Overflow:

http://stackoverflow.com/questions/21328884/openpyxl-writing-large-excel-files-with-python
http://stackoverflow.com/questions/16608028/set-cell-format-and-style-using-optimized-writer-in-openpyxl

So to handle huge data, either the optimized writer in openpyxl or xlsxwriter could provide some hope. I will look at it later.
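As a rough illustration of the openpyxl option, the sketch below uses its write-only mode, where rows are appended one at a time instead of being held as a full in-memory sheet. The helper names `iter_rows` and `write_xlsx_streaming` are hypothetical, and this is not how Flask-Excel or pyexcel is wired internally; it only shows the general shape, assuming every record is a dict with the same keys.

```python
def iter_rows(records):
    """Yield a header row, then one list of values per record, lazily."""
    header = None
    for record in records:
        if header is None:
            # Column order is taken from the first record's keys.
            header = list(record.keys())
            yield header
        yield [record.get(key) for key in header]


def write_xlsx_streaming(records, path):
    """Write records to path using openpyxl's write-only mode."""
    # Imported here so iter_rows stays usable without openpyxl installed.
    from openpyxl import Workbook

    workbook = Workbook(write_only=True)
    sheet = workbook.create_sheet()  # a write-only workbook starts with no sheets
    for row in iter_rows(records):
        sheet.append(row)  # write-only sheets accept rows via append() only
    workbook.save(path)
```

Passing a generator or database cursor as `records` avoids building the whole list first, which is where most of the saving would come from.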

Member

chfw commented Aug 25, 2016

Please try the pyexcel-xlsxw plugin, which tries to use constant memory:

pip uninstall pyexcel-xlsx
pip install https://github.com/pyexcel/pyexcel-xlsxw/archive/master.zip
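For context on what "constant memory" means here: the plugin name suggests it is built on XlsxWriter, which has a `constant_memory` workbook option that flushes each row once the next one starts. Assuming that is what is used under the hood, writing directly with XlsxWriter would look roughly like the sketch below. `generate_rows` and `write_rows_constant_memory` are hypothetical names for illustration only.

```python
def generate_rows(count):
    """Yield a header row followed by count data rows, one at a time."""
    yield ["id", "square"]
    for i in range(count):
        yield [i, i * i]


def write_rows_constant_memory(rows, path):
    """Write an iterable of row lists to path with constant_memory enabled."""
    import xlsxwriter  # third-party; imported lazily so generate_rows works without it

    workbook = xlsxwriter.Workbook(path, {"constant_memory": True})
    worksheet = workbook.add_worksheet()
    # With constant_memory, rows must be written in increasing row order.
    for row_index, row in enumerate(rows):
        worksheet.write_row(row_index, 0, row)
    workbook.close()
```

Something like `write_rows_constant_memory(generate_rows(1000000), "big.xlsx")` would then stream the rows out without keeping them all in memory at once.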

Author

rotten commented Aug 25, 2016

Thanks! I'll give it a try.

Author

rotten commented Aug 25, 2016

So far, so good. Requests which crashed my Flask API server yesterday (in development) are running without issue ... and much faster too. We'll continue to test. Thanks for finding a solution so quickly!

Member

chfw commented Aug 26, 2016

Released it to PyPI. Please raise an issue if any problems emerge.

chfw closed this as completed Aug 26, 2016