Blaze support #127
Comments
I'll see what I can do. This should be combined with the work on splitting the common code out of tabview and gtabview. I was a little disappointed in the file:// behavior for large files. One very large file (~800M, with a couple million rows, I believe) never opened at all in gtabview, even after it had been working on it for about 5 minutes. A smaller file (380M) opened, but there was a lag of almost 10 seconds each time it loaded a new section of the file. It also didn't work at all for a Latin-1 encoded file. I tried it with MySQL tables as well, and of course most of my tables have a DECIMAL(10,2) data type, which isn't yet supported by odo (blaze/odo#206). It's just a little frustrating that it didn't handle the data I was throwing at it very well!
Scott
About the DECIMAL issue, it's more a [...] I'm just adding some code here to create a big random CSV file
I tried both
and you are right, that's not usable with big files!
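The snippet mentioned in this comment did not survive in this copy of the thread. As a stand-in, here is a minimal sketch of how a large random CSV test file could be generated with the standard library only. The name `write_random_csv`, its parameters and the column layout (alternating two-decimal numbers and short strings) are assumptions made for illustration, not the original code.

```python
import csv
import os
import random
import string


def write_random_csv(path, rows, cols=10, seed=0):
    """Write a header plus `rows` random lines to `path`; return the file size in bytes."""
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    with open(path, "w", newline="") as fd:
        writer = csv.writer(fd)
        writer.writerow(["col%d" % i for i in range(cols)])
        for _ in range(rows):
            row = []
            for i in range(cols):
                if i % 2 == 0:
                    # Numeric column with two decimals, similar to DECIMAL(10,2)
                    row.append("%.2f" % rng.uniform(0, 1e6))
                else:
                    # Short random lowercase string
                    row.append("".join(rng.choice(string.ascii_lowercase) for _ in range(8)))
            writer.writerow(row)
    return os.path.getsize(path)
```

Calling something like `write_random_csv("big.csv", rows=2000000)` should produce a file in the size range discussed above, although the exact size depends on the number of columns chosen.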
AFAIK blaze is just reading the file in chunks somehow. It initially opens quicker, but then it's just as slow whenever it requests a new chunk. And if I'm not mistaken, pandas read_csv is just csv in disguise, without the little tweaks we added in tabview. We could do much better than that if we assume files and sequential reads, i.e. read only 'n' lines (exactly one chunk) when the file size is beyond a certain threshold. For files (not streams) we could do that in both the forward and reverse directions, avoiding memory allocation at the expense of extra I/O.
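To make the chunked-read idea above concrete, here is a rough sketch, not code from tabview or gtabview. It reads exactly one chunk of `n` lines at a time and remembers only the byte offset where each chunk starts, so moving backwards is a seek plus a re-read instead of keeping rows in memory. The class name `ChunkedCSV`, the helper `should_chunk`, the chunk size and the size threshold are all hypothetical. It also assumes one record per physical line (no newlines inside quoted fields) and an ASCII-compatible encoding such as UTF-8 or Latin-1.

```python
import csv
import io
import os

CHUNK_LINES = 1000            # hypothetical chunk size
SIZE_THRESHOLD = 50 * 2**20   # hypothetical threshold: 50 MiB


def should_chunk(path, threshold=SIZE_THRESHOLD):
    """Only switch to chunked reads when the file is beyond the threshold."""
    return os.path.getsize(path) > threshold


class ChunkedCSV:
    def __init__(self, path, chunk_lines=CHUNK_LINES, encoding="utf-8"):
        self.path = path
        self.chunk_lines = chunk_lines
        self.encoding = encoding
        self.offsets = [0]  # byte offset where each known chunk starts

    def _read_lines(self, fd):
        """Read at most one chunk of raw lines from the current position."""
        lines = []
        for _ in range(self.chunk_lines):
            line = fd.readline()
            if not line:
                break
            lines.append(line)
        return lines

    def read_chunk(self, chunk_no):
        """Return the parsed rows of chunk `chunk_no` (0-based), [] past the end."""
        with open(self.path, "rb") as fd:
            # Walk forward only as far as needed, recording chunk offsets.
            while len(self.offsets) <= chunk_no:
                fd.seek(self.offsets[-1])
                if len(self._read_lines(fd)) < self.chunk_lines:
                    return []  # the file ends before the requested chunk
                self.offsets.append(fd.tell())
            # Known chunk: seek straight to it (works forward and backward).
            fd.seek(self.offsets[chunk_no])
            lines = self._read_lines(fd)
        text = b"".join(lines).decode(self.encoding)
        return list(csv.reader(io.StringIO(text, newline="")))
```

The trade-off is the one described in the comment: the only state kept is one integer per chunk, and every page change costs an extra read from disk.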
Hello,
@wavexx did excellent work adding Blaze support to gtabview,
see TabViewer/gtabview#10
It's now possible to connect to any database supported by SQLAlchemy and display a table (even a very long one) using a table URI: http://blaze.pydata.org/en/latest/uri.html
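For readers unfamiliar with table URIs: Blaze appends the table name to a SQLAlchemy-style database URL after a `::` separator, for example `sqlite:///path/to/db.sqlite::mytable`. The sketch below is not Blaze or gtabview code. It only illustrates the URI shape and how a very long table can be shown one page at a time, using the standard `sqlite3` module. The function names `split_table_uri` and `fetch_page` are hypothetical, and only `sqlite:///` URLs are handled.

```python
import sqlite3


def split_table_uri(uri):
    """Split 'dburl::table' into (dburl, table); table is None when absent."""
    dburl, sep, table = uri.rpartition("::")
    if not sep:
        return uri, None
    return dburl, table


def fetch_page(uri, offset=0, limit=100):
    """Fetch `limit` rows starting at `offset` from a sqlite table URI."""
    dburl, table = split_table_uri(uri)
    prefix = "sqlite:///"
    if table is None or not dburl.startswith(prefix):
        raise ValueError("expected a URI like sqlite:///file.db::table")
    if not table.isidentifier():
        # Table names cannot be bound as SQL parameters, so validate instead.
        raise ValueError("unsupported table name: %r" % table)
    conn = sqlite3.connect(dburl[len(prefix):])
    try:
        query = 'SELECT * FROM "%s" LIMIT ? OFFSET ?' % table
        return conn.execute(query, (limit, offset)).fetchall()
    finally:
        conn.close()
```

Only the rows of the visible page are ever fetched, which is what makes browsing a long table practical over a slow link.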
Some other improvements have been raised in #116
(especially regarding Pandas DataFrames with a multi-index).
It would be nice to have the same Blaze support on the tabview side, because it would then be possible to display the contents of very long tables over an SSH connection (for example).
Kind regards