This makes DataFrames show up as nice HTML tables in the IPython notebook. This initial implementation is very plain - but better than the plaintext output. We need to think more about how we want to handle these things, but this is a start.
Add _repr_html_ method to make DataFrames nice in IPython notebook.
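For context, the IPython notebook looks for a `_repr_html_` method on an object and renders the string it returns as HTML. A minimal sketch of that hook, assuming a hypothetical `TinyFrame` class as a stand-in for a DataFrame (this is not the pandas implementation):

```python
from html import escape


class TinyFrame:
    """Hypothetical stand-in for a DataFrame: column names plus rows."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = [list(r) for r in rows]

    def to_html(self):
        # Build a plain HTML table; header and cell values are escaped.
        head = "".join(f"<th>{escape(str(c))}</th>" for c in self.columns)
        body = "".join(
            "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
            for row in self.rows
        )
        return f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"

    def _repr_html_(self):
        # Hook picked up by the notebook's rich display machinery.
        return self.to_html()
```

In a notebook cell, evaluating `TinyFrame(["a", "b"], [[1, 2]])` as the last expression would then show the table instead of the default text repr.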
Good idea, I'd been thinking about this recently.
If the dataframe is longer than some limit, the standard __repr__ will switch to a brief form just showing info about each column, rather than showing a massive table. I don't think to_html() currently does that, but it should probably be implemented for this.
I think it would be nice to put the table in a scrollable div of fixed size. I haven't looked at to_html yet, but do you think that approach makes sense?
That mitigates the problem, but just testing, I can quickly make a DataFrame with 10M rows. That HTML will probably take some time to generate, and take up a lot of memory (first on the server, then in the browser). So I think there needs to be some cut-off.
Another option would be to do a head view - so show the first, say, 50 rows, and have something at the bottom that indicates there's more.
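A rough sketch of the head-view idea, assuming rows are held in a plain list and using a hypothetical helper name `head_html`. The default of 50 rows comes from the suggestion above; the wording of the footer is only an illustration:

```python
from html import escape


def head_html(columns, rows, max_rows=50):
    """Render at most max_rows rows and note how many were left out."""
    shown = rows[:max_rows]
    hidden = len(rows) - len(shown)
    parts = ["<table>"]
    parts.append(
        "<tr>" + "".join(f"<th>{escape(str(c))}</th>" for c in columns) + "</tr>"
    )
    for row in shown:
        parts.append(
            "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        )
    if hidden > 0:
        # Footer row spanning all columns signals the truncation.
        parts.append(
            f'<tr><td colspan="{len(columns)}">... {hidden} more rows</td></tr>'
        )
    parts.append("</table>")
    return "".join(parts)
```

Because only the sliced rows are rendered, the size of the generated HTML depends on `max_rows` rather than on the total length of the data.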
to_html() does not have the __repr__() cleverness to switch between a brief form and a full dump. I don't think it is really needed; there is plenty one can do in the HTML world to display tables in whatever form. FYI, I'm using to_html() combined with mako.
Now for the IPython notebook, it sounds like a good idea to do something extra on top of to_html() (e.g. a scrollable div, as suggested) to handle large DataFrames. This can be done without changing to_html() itself.
I pulled your code, added a scrollable div and a fallback to the info representation for large DataFrames (b570153). Also added a little unit test.
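The combination described here (scrollable div for normal frames, text summary for large ones) might look roughly like the sketch below. The function name `notebook_html`, the row limit, and the div dimensions are assumptions for illustration and are not taken from commit b570153; only the column limit of 20 is mentioned in this thread.

```python
from html import escape


def notebook_html(table_html, summary_text, n_rows, n_cols,
                  max_rows=1000, max_cols=20):
    """Return a scrollable table, or an escaped text summary if too large."""
    if n_rows > max_rows or n_cols > max_cols:
        # escape() keeps text such as "<class '...'>" from being parsed as a tag.
        return "<pre>\n" + escape(summary_text) + "\n</pre>"
    # Hypothetical fixed size; overflow:auto adds scroll bars only when needed.
    style = "max-height:1000px;max-width:1500px;overflow:auto;"
    return f'<div style="{style}">\n{table_html}\n</div>'
```

Escaping matters for the fallback branch: summary text containing something like `<class '...'>` would otherwise be treated by the browser as an unknown HTML tag and not be displayed.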
I see the mechanics of the fallback for large dataframes returns text in a <pre> tag. I think there's a neater way of falling back to a text repr, though I forget whether it's to return None or raise an error. Brian will know, I'm sure.
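A sketch of the "return None" variant, using a hypothetical `MaybeRich` class. As far as I know, current IPython versions treat a `None` return from `_repr_html_` as "no HTML representation" and show the plain-text repr instead; whether the IPython version under discussion behaved the same way is not confirmed in this thread.

```python
class MaybeRich:
    """Hypothetical object that offers HTML only when it is small."""

    def __init__(self, values, max_items=100):
        self.values = list(values)
        self.max_items = max_items

    def __repr__(self):
        return f"MaybeRich({len(self.values)} items)"

    def _repr_html_(self):
        if len(self.values) > self.max_items:
            # No HTML offered; the frontend is left with the plain repr.
            return None
        items = "".join(f"<li>{v}</li>" for v in self.values)
        return f"<ul>{items}</ul>"
```

With this approach the text fallback needs no `<pre>` wrapper, since the frontend displays the regular `__repr__` output itself.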
@lodagro the summary repr from _repr_html_ is missing the class header.
Also maybe it should use print_config.max_columns instead of 20?
The reason I did not use the print_config variables is that the table is in a scrollable div.
This means that with the default setting of print_config.max_columns there would be no vertical scroll bar, since repr would switch over to the summary view if the DataFrame is wider than the terminal.
OK, I will use the same switch-over between full and summary for _repr_html_ and __repr__, and have a look at the missing class header.
Ah, that's a good point. Maybe just have no limit then on the number of columns since you can scroll right?
I did put a limit on rows/columns to avoid sending massive amounts of HTML to the notebook in case of very large DataFrames. So what do we choose?
Now that I think of it, what about adding some CSS styling? I don't know if the notebook can handle it. I can give it a try.
Any preferences on style?
No preferences there, but go right ahead. I have never been the best with those kinds of aesthetics.
A few points:
Any plans to add an option to the notebook to disable the enriched repr? I actually rather like the plain text output for demos.
I was patiently waiting for feedback, but apparently my last comments did not reach GitHub, so I am repeating them.
'<pre>' + multi_line_string + '</pre>'
'<pre>\n' + multi_line_string + '\n</pre>'
I don't really have time to work on this right now but a few comments:
DataFrame_repr_html changes according to #772 discussion.