Support large tables in output #64

Open
GoogleCodeExporter opened this issue Sep 1, 2015 · 1 comment

@GoogleCodeExporter

What steps will reproduce the problem?
1. Generate data (e.g. 2.9 million rows)
2. Load into PrettyTable using add_row()
3. Print the table

Example:

from prettytable import PrettyTable

table_columns = ['file id', 'parent directory (directory id)', 'file name',
                 'type', 'extra info']
display_table = PrettyTable(table_columns)
# file_query() is the reporter's own data source; this could also be the db_cursor variant
for file_data in file_query():
    display_table.add_row([file_data[0], file_data[1], file_data[2], file_data[3], file_data[4]])

print(display_table)


What is the expected output? What do you see instead?

I expect the table to be displayed.
Instead, it crashes, complaining about memory, as PrettyTable tries to convert
itself into one big string.


What version of the product are you using? On what operating system?

Linux (Kubuntu 14.10), Python 2.7, prettytable 0.7.2

Please provide any additional information below.

Original issue reported on code.google.com by gpcl...@gmail.com on 15 Dec 2014 at 5:02

@GoogleCodeExporter

If you were using GitHub or git I'd submit a PR, but since you're using SVN 
I've attached an updated copy of prettytable.py. This version does two 
things:

1. Enables my use case above by introducing a new function, print_table(), 
that prints the lines to a file (default sys.stdout) instead of building them 
into a list.
Instead of:

print(myprettytable)

You do:

myprettytable.print_table()

It also takes file and end parameters, as print() does, so callers can 
redirect the output as desired.

2. Reduces memory use significantly by using a generator. In my test, RAM 
usage went from 11-12 GB down to just under 7 GB.

The original get_string() was split into a few more functions so that 
get_string() and print_table() can share code.
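
As a rough illustration of that split (not the attached file, and not PrettyTable's actual internals), the sketch below shares one line generator between a get_string()-style function and a print_table()-style function. The helper name iter_table_lines and the simplified border layout are assumptions made for this example; only the file and end parameters come from the description above.

```python
import io
import sys


def iter_table_lines(columns, rows):
    """Yield the table one line at a time (hypothetical helper)."""
    # Column widths depend on every value, so rows must be a stored
    # sequence here: it is walked once for widths and once for output.
    widths = [len(str(name)) for name in columns]
    for row in rows:
        for i, value in enumerate(row):
            widths[i] = max(widths[i], len(str(value)))

    border = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def fmt(values):
        cells = (" " + str(v).ljust(w) + " " for v, w in zip(values, widths))
        return "|" + "|".join(cells) + "|"

    yield border
    yield fmt(columns)
    yield border
    for row in rows:
        yield fmt(row)
    yield border


def get_string(columns, rows):
    # Builds the whole table as one string, as print(table) would need.
    return "\n".join(iter_table_lines(columns, rows))


def print_table(columns, rows, file=None, end="\n"):
    # Writes each line as soon as it is produced; nothing is accumulated.
    out = sys.stdout if file is None else file
    for line in iter_table_lines(columns, rows):
        out.write(line + end)


# Usage: redirect into any file-like object.
buffer = io.StringIO()
print_table(["id", "name"], [[1, "a.txt"], [22, "b"]], file=buffer)
```

Only get_string() ever holds the full output; print_table() needs memory for the stored rows plus one line at a time.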

This version works and does a great job on really big tables, but it 
could be improved further if the formatted data did not have to be saved.

prettytable_alternate.py is an attempt to use more generators to reduce memory. 
It did work: peak usage was down to just over 5 GB, and normal usage was around 
4.7 GB. However, it took a lot longer to output the data, because the row 
generator means the data has to be formatted twice. In both versions, output 
starts earlier than in the original implementation, since lines can be written 
before all the data is completely built up.
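
The trade-off described above (less memory, but every value formatted twice) can be sketched as follows. This is not the code from prettytable_alternate.py; stream_table and its make_rows argument are hypothetical names, and the sketch assumes the row source can be re-run to produce the same rows a second time.

```python
import io
import sys


def stream_table(columns, make_rows, file=None, end="\n"):
    """Write a table without keeping rows or formatted cells in memory.

    make_rows is a hypothetical zero-argument callable that returns a
    fresh iterator over the rows each time (e.g. by re-running a query).
    """
    out = sys.stdout if file is None else file

    # Pass 1: stringify every cell only to measure the column widths.
    widths = [len(str(name)) for name in columns]
    for row in make_rows():
        for i, value in enumerate(row):
            widths[i] = max(widths[i], len(str(value)))

    border = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def fmt(values):
        cells = (" " + str(v).ljust(w) + " " for v, w in zip(values, widths))
        return "|" + "|".join(cells) + "|"

    out.write(border + end)
    out.write(fmt(columns) + end)
    out.write(border + end)
    # Pass 2: stringify again, writing each line immediately.
    for row in make_rows():
        out.write(fmt(row) + end)
    out.write(border + end)


# Usage: a lambda over an in-memory list stands in for a real row source.
data = [(1, "a.txt"), (22, "b")]
buffer = io.StringIO()
stream_table(["id", "name"], lambda: iter(data), file=buffer)
```

Memory stays bounded by one row and the width list, at the cost of walking and formatting the data twice.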

Perhaps you have other ideas on how to speed this all up and reduce memory 
consumption for the very large table variants.

Original comment by gpcl...@gmail.com on 15 Dec 2014 at 11:35
