Followup to Conditional HTML Styling #11610

Open
TomAugspurger opened this Issue Nov 16, 2015 · 40 comments

Contributor

TomAugspurger commented Nov 16, 2015

Follows #10250

For 0.17.1

  • Nicer table styling for the pydata.org website
  • Include a visual example in 0.17.1.rst
  • remove doc/source/html-styling.html and find a way to include doc/source/html-styling.ipynb in the doc build (should use --template=basic)
  • update print_versions
  • update requirements_all.txt to include jinja

For 0.18.0 / Future

  • sparsify MultiIndex repr (maybe push till 0.18)
  • easy alignment styles #12144
  • Template modification: This is for things like wrapping base64 encoded values in img tags, urls, etc. flows into...
  • Break the large template in Styler.template into smaller blocks. Let people extend that. We could (maybe) allow users to choose which template to use to render each column/cell with solving the template modification problem
  • Truncated repr
  • hook into pd.options, allow setting of default reprs with styles
  • Refactor parts of Styler into a BaseStyler, maybe add a LaTeX styler (maybe deprecate / replace the to_html and to_latex methods; Jinja templates are much more pleasant to work with), xref #11700
  • Categoricals / Booleans builtin stylings

will add more as we go.

TomAugspurger added this to the 0.17.1 milestone Nov 16, 2015

Contributor

jreback commented Nov 16, 2015

@TomAugspurger I had to slightly change style.rst to get this to appear in the index. You may want to further modify.

Contributor

janschulz commented Nov 16, 2015

A comment on the API of the highlighter:

def color_negative_red(val):
    """
    Takes a scalar and returns a string with
    the css property `'color: red'` for negative
    values, black otherwise.
    """
    color = 'red' if val < 0 else 'black'
    return 'color: %s' % color

This basically only works for the HTML representation ("color: red" is CSS speak) and assumes that the return value should go into the CSS.

In LaTeX (see e.g. this example), you prefix/surround the value with a command:

[...]
\usepackage[table]{xcolor}% http://ctan.org/pkg/xcolor
  Some & \cellcolor{blue!25}coloured & contents \\

would render the second cell blue (by using a command from a special package).

So for LaTeX you probably need templates à la \cellcolor{blue!25}%s or \textbf{%s} (which makes the value bold).
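
A minimal sketch of that template idea in Python: the style function returns a %-style template instead of a CSS string, and the renderer substitutes the cell value into it. The names `latex_negative_red` and `render_latex_cell` are hypothetical and not part of pandas; `\textcolor` assumes the LaTeX document loads the xcolor package.

```python
def latex_negative_red(val):
    """Return a LaTeX template: red text for negative numbers, plain otherwise."""
    # \textcolor needs \usepackage{xcolor} in the target document (assumption).
    return r"\textcolor{red}{%s}" if val < 0 else "%s"


def render_latex_cell(val, style_func):
    """Substitute the cell value into the template chosen by the style function."""
    return style_func(val) % (val,)


row = [1.5, -2.0, 3]
# Join the rendered cells into one LaTeX table row.
print(" & ".join(render_latex_cell(v, latex_negative_red) for v in row) + r" \\")
```

The same style function shape (value in, string out) is kept; only the meaning of the returned string changes from "CSS declaration" to "template around the value".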

kynan commented Nov 16, 2015

@TomAugspurger I think you made a great start on this! A few ideas for making this approach potentially more flexible:

  1. Allow extra attributes on the <table>.

    Common use case: You want to export an HTML table with sortable columns, e.g. using sortable. For that you would need the following opening tag:

    <table class="sortable-theme-bootstrap" data-sortable>
    
  2. Support "external" styling via existing custom CSS style sheets.

    For this to work, the current approach with unique ids per cell, which are then targeted with auto-generated CSS, doesn't really work. Instead it would be useful to be able to assign additional classes (or data attributes) to cells based on the Styler rules.

    Example use case: assign class positive to all values > 0 and negative to all values < 0

    A potential advantage of this would be much smaller size of generated code (since we don't need a custom CSS block for each cell) and better performance when rendering in the browser (fewer rules to apply).

  3. Allow setting arbitrary attributes on table cells based on rules.

This extends the previous suggestion and would allow using exported html tables with any kind of JavaScript library using specific attributes.
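
To make suggestions 2 and 3 concrete, here is a rough, pandas-free sketch of what rule-based classes and data attributes on cells could look like. `sign_class` and `render_row` are hypothetical helpers written for illustration only; they are not an existing Styler API.

```python
from html import escape


def sign_class(val):
    """Hypothetical rule: pick a CSS class name from the sign of the value."""
    if val > 0:
        return "positive"
    if val < 0:
        return "negative"
    return ""


def render_row(values, rule, data_attr=None):
    """Render one <tr>, tagging each <td> with the class chosen by `rule`.

    If `data_attr` is given, the raw value is also exposed as data-<name>.
    """
    cells = []
    for val in values:
        attrs = ""
        css_class = rule(val)
        if css_class:
            attrs += ' class="%s"' % escape(css_class, quote=True)
        if data_attr:
            attrs += ' data-%s="%s"' % (data_attr, escape(str(val), quote=True))
        cells.append("<td%s>%s</td>" % (attrs, escape(str(val))))
    return "<tr>" + "".join(cells) + "</tr>"


print(render_row([3, -1, 0], sign_class))
```

An external stylesheet then only needs two rules (`.positive` and `.negative`) no matter how many cells the table has.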

Sorry for being so late in making these suggestions - I didn't manage to read through all of #10250 before it was merged.

Contributor

TomAugspurger commented Nov 16, 2015

@kynan fantastic, thanks for the feedback. I'll go through it in more detail later.

Your item 1. sounds pretty simple. Is attribute the correct term for the items in the opening tag? <table class="sortable-theme-bootstrap" data-sortable> We could include a method like .set_table_attributes for that.

kynan commented Nov 16, 2015

@TomAugspurger I believe attribute is the common term and also the one the W3C uses. set_table_attributes sounds sensible to me.

jreback closed this in #11634 Nov 18, 2015

jreback reopened this Nov 18, 2015

Contributor

jreback commented Nov 18, 2015

@TomAugspurger merged your PR; I'll leave you to close when you are ready.

Contributor

TomAugspurger commented Nov 18, 2015

Thanks, I want to get a better solution in place for including notebooks in the sphinx build, but that works for now.

@kynan, for your second item, assigning classes to cells. My current thinking is to have a method on Styler (would it be terrible to call it .classify?) that takes a function to be evaluated and a class to assign to the cells where that function evaluates to True. The limitation here is that the class that's assigned doesn't get to refer to the data. So you couldn't (easily) do something like our .background_gradient, which is why I discarded this approach originally. But it might make sense to have in addition to the one-class-per cell approach we have. This will probably need to wait for the next release though.
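
Roughly what I have in mind, written as a free function rather than a Styler method. This is only a sketch: `classify` is a hypothetical name and nothing like it exists in pandas yet.

```python
import numpy as np
import pandas as pd


def classify(df, func, css_class):
    """Return a DataFrame of class names.

    Cells where func(cell) is truthy get `css_class`; all others get ''.
    """
    # Evaluate the predicate cell by cell, column by column.
    mask = df.apply(lambda col: col.map(func)).astype(bool)
    return pd.DataFrame(
        np.where(mask.to_numpy(), css_class, ""),
        index=df.index,
        columns=df.columns,
    )


df = pd.DataFrame({"a": [1.0, np.nan], "b": [np.nan, 4.0]})
classes = classify(df, pd.isnull, "null_class")
print(classes)
```

The limitation shows up directly in the signature: `css_class` is fixed before any data is seen, so the assigned class cannot vary with the cell value.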

Contributor

TomAugspurger commented Nov 18, 2015

docs are up if anyone sees anything. The page does still have the [In] and [Out] tags and ¶ markers, which I might be able to hide.

Contributor

jreback commented Nov 18, 2015

@TomAugspurger yep look great!

on the css side, I think it's possible if we tag with the SAME names, e.g.

df.style.highlight_null(css='null_class').background_gradient(css='gradients_class')

then as long as you tag THOSE cells with that class it would work. we could have default class names (based on the function name), and have this kw to override.

Contributor

TomAugspurger commented Nov 18, 2015

The df.style.highlight_null(css='null_class') I could see working like that, since that fits the binary "if this condition is true, apply this style".

This could be my limited understanding of CSS, but I don't see how you could accomplish df.style.background_gradient(css='gradients_class') in CSS, just knowing that these columns have this class. (I think) you'd need a class per color you want to assign.

I suppose we could add a data attribute to each cell with the value of that cell... You might be able to pull off some CSS wizardry to accomplish it in that case, but I don't see the average Python user being able to write or customize that.

Contributor

jreback commented Nov 18, 2015

@TomAugspurger I think you would actually construct the classes WITH, in this case, the level embedded (for some you wouldn't need to do this), maybe something like 'gradient_level_0_class' (e.g. say you have ten levels of gradient). But this is a refinement.
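
A sketch of the level-embedded class names, assuming linear bucketing between the minimum and maximum of the data. `gradient_level_classes` is a hypothetical helper, and the stylesheet would still need one rule per level (e.g. `.gradient_level_0_class { background-color: ... }`).

```python
def gradient_level_classes(values, n_levels=10):
    """Map each number to a class name such as 'gradient_level_3_class'."""
    lo, hi = min(values), max(values)
    span = hi - lo
    classes = []
    for val in values:
        if span == 0:
            # All values equal: everything lands in the first level.
            level = 0
        else:
            # Scale to [0, 1], then bucket; the maximum goes into the last level.
            level = min(int((val - lo) / span * n_levels), n_levels - 1)
        classes.append("gradient_level_%d_class" % level)
    return classes


print(gradient_level_classes([0, 5, 10], n_levels=10))
```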

jreback modified the milestone: Next Major Release, 0.17.1 Nov 18, 2015

I've been playing around with the new styling features and have a few comments, overall this is a great new addition.

The highlight_min, highlight_max and highlight_null would be a lot better if instead of taking a color argument they would actually take the css format string or **kwargs that correspond to css style names - this would allow:

  1. using inverted colors (background-color: black; color: white)
  2. other formatting options than background shading (font-weight: bold)
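
As an illustration of the **kwargs idea, here is a small sketch that turns keyword arguments into a CSS declaration string and applies it to the maximum of a list. Both `css_from_kwargs` and `highlight_max_css` are hypothetical names, not existing Styler methods.

```python
def css_from_kwargs(**kwargs):
    """Build a CSS declaration string; underscores in names become hyphens."""
    return "; ".join(
        "%s: %s" % (name.replace("_", "-"), value) for name, value in kwargs.items()
    )


def highlight_max_css(values, **css):
    """Return one CSS string per value: the given style for the maximum, '' otherwise."""
    style = css_from_kwargs(**css)
    top = max(values)
    return [style if v == top else "" for v in values]


# Inverted colors instead of a single background shade.
print(highlight_max_css([1, 7, 3], background_color="black", color="white"))
```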

The documentation is also a little confusing in terms of debugging the styling functions.

Debugging Tip: If you're having trouble writing your style function, try just passing it into df.apply. Styler.apply uses that internally, so the result should be the same.

The full stop and space between apply and Styler is somewhat difficult to spot (I only noticed it when pasting the quote here) - I was looking for df.apply.Styler.apply which obviously doesn't exist, a little rewording would fix this. (You need to look at the text at https://pandas-docs.github.io/pandas-docs-travis/style.html to spot it)

Debugging Tip: If you're having trouble writing your style function, try passing it into df.apply. Internally Styler.apply uses that, so the result should be the same.
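
For anyone else who trips over this, the tip in practice looks something like the following. The `highlight_max` function and the sample frame are just an illustration; the point is that calling `df.apply` directly shows the strings the styler would be given.

```python
import pandas as pd


def highlight_max(col):
    """Style function: yellow background for the column maximum, '' elsewhere."""
    is_max = col == col.max()
    return pd.Series(
        ["background-color: yellow" if flag else "" for flag in is_max],
        index=col.index,
    )


df = pd.DataFrame({"a": [1, 3, 2], "b": [9, 4, 6]})

# Plain DataFrame.apply: inspect the CSS strings without rendering anything.
result = df.apply(highlight_max)
print(result)
```

Once the printed frame looks right, the same function can be handed to `df.style.apply(highlight_max)`.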

jreback modified the milestone: 0.18.0, Next Major Release Nov 20, 2015

Contributor

TomAugspurger commented Nov 20, 2015

On your first point, I agree that would be useful. If we end up going with a .classify that takes a function returning booleans (pd.isnull) and assigns classes where that's True, we should be able to handle that pretty easily. I think we should hold off adding more keywords to .highlight_null until we decide what to do there.

I just pushed a PR to clarify the documentation. That was confusing, thanks.

@TomAugspurger to repeat, awesome work!

A question on the 'provisional status'. First, as I said on gitter, I think it is a good idea to put the same provisional note from the notebook in the whatsnew note (experimental = can still change + feedback wanted).
Secondly, we could also emit a warning about this on first usage to be even more explicit? (but maybe that's a bit too intrusive). But if it is only on the first import of the style module, and not each time you use it, maybe it is OK?
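
Something along these lines is what I mean by warning only once. It is a sketch using the standard `warnings` module; `warn_provisional`, the message text and the choice of `FutureWarning` are assumptions, not what pandas does.

```python
import warnings

_warned = False  # module-level flag, so the notice is shown at most once per session


def warn_provisional():
    """Emit the 'provisional API' notice on the first call, then stay quiet."""
    global _warned
    if not _warned:
        warnings.warn(
            "The styling API is provisional and may change in a future release.",
            FutureWarning,
            stacklevel=2,
        )
        _warned = True
```

The function would be called from wherever the style machinery is first touched, so repeated use in one session stays silent.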

Question for the docs: the built notebook in html form is still in the source code. Is this on purpose? (as eg in the latest PR you only updated the notebook and not the html file)

Contributor

TomAugspurger commented Nov 20, 2015

Having the generated HTML is not intended, I thought I deleted that. I'll remove it in my PR adding the provisional note.

For the warning. I never was a fan of always getting the warnings when using IPy widgets. I can go either way though. At the very least I'm going to add a note to the docstring for Styler.

Another small note on the docs: maybe it would be good to include a link to the notebook on nbviewer? It actually still looks better than the one included in the docs, where the table styling (the borders) is 'uglier'.

Contributor

TomAugspurger commented Nov 20, 2015

I was trying to figure out how to include a link that points to the same version of the notebook, but adding the link changes the notebook :) I suppose we just link to https://github.com/pydata/pandas/blob/master/doc/source/html-styling.ipynb, understanding that the contents at that URL can change?

Contributor

jreback commented Nov 20, 2015

@TomAugspurger we can host a rendered version on pandas.pydata.org easily in the doc directory and just link to it (from the docs). IIRC this was your original suggestion :)

github doesn't render these properly AFAICT

I think putting a link 'See this notebook on nbviewer' that points to the one in master is OK (to be fully correct, it should point to the version in the version tag, but that is a bit difficult as that does not yet exist :-))

Contributor

TomAugspurger commented Nov 20, 2015

Just iterate on the links until they converge :)

Just pushed pydata#11664 with an NBviewer link pointing to master until we get the version uploaded to pandas.pydata.org as part of the doc build.

Contributor

jreback commented Nov 20, 2015

hmm, so this is the argument then for including the nbconvert outputted .html, which we can then directly link as a file

A possible advantage of using the link to nbviewer, is that it is then easier to download the notebook to run things yourself

Contributor

TomAugspurger commented Nov 20, 2015

Yeah I think we should definitely link to nbviewer since they already have the stuff in place to download as a link. I'm not sure how including the rendered HTML in the doc build helps (or hurts) with this.

OK, merging the PR then!

Contributor

jreback commented Nov 25, 2015

@TomAugspurger let's try to link all relevant HTML issues at the top of the tracker (as most/maybe all can be accomplished via .style).

Contributor

jreback commented Nov 25, 2015

xref #11700

kynan commented Nov 25, 2015

@TomAugspurger Sorry for the delay. classifier sounds good to me. Is there a reason it could only return a boolean and not a string with the class name?

121onto commented Nov 26, 2015

This feature looks really promising. Thanks all who worked on it!

Wouldn't it be great if I could render DataFrames in html outside of ipython notebooks!? I'm not a fan of developing inside ipython notebook, and working with matplotlib entails a lot of overhead.

Two workflows that come to mind are as follows. First, if you are working on a mac, keep a quicklook window open on a PDF file that you use to store current output. Then, define a my_print function that render()s an html string and prints it to your PDF file:

from weasyprint import HTML

def my_style(frame):
    return frame.style.highlight_null(null_color='red') # or whatever

def my_print_pdf(frame, styler, filename='/Users/username/temp/frame_viewer.pdf'):
    style = styler(frame)
    html = HTML(style.render())
    html.write_pdf(filename)
    return None

The Quicklook should update each time you my_print a DataFrame.

Second, use something like Browsersync to watch an HTML file. To watch a file with Browsersync, you'd type the following in your terminal:

cd ~/temp
browser-sync start --server --index "frame_viewer.html" --files "*.html"

With this approach, you'd write a my_print that dumps the output from render() to an html file. Because Browsersync expects body tags, you'll need to wrap the output from render in them:

def my_print_html(frame, styler, filename='/Users/username/temp/frame_viewer.html'):
    style = styler(frame)
    html = "<html><head></head><body>" + style.render() + "</body></html>"
    with open(filename, 'w') as f:
        print(html, file=f)

    return None

Each time you call my_print_html Browsersync will refresh your browser automatically.

Notes: code not tested.

Edits 1: tested Browsersync example and updated code so it works on my machine.
Edits 2: updated browsersync command.

Contributor

TomAugspurger commented Nov 26, 2015

Those are both possible now, right? You'd just need to write the code / start up the server? Something like your code would fit well in the cookbook documentation.

Adding "builtin" support for rendering in the notebook had a very favorable cost-benefit ratio. Two lines of code for something used by so many. I don't anticipate adding other backends.

121onto commented Nov 26, 2015

@TomAugspurger yes, possible right now. The Browsersync example should work now.

Contributor

jreback commented Dec 4, 2015

@TomAugspurger just came across this from xlsxwriter here
might be a nugget we could steal....

Contributor

janschulz commented Dec 7, 2015

This is an interesting approach in R: https://github.com/renkun-ken/formattable -> see the last example

joekane3 referenced this issue Dec 15, 2015

Closed

Style source #11844

Hi,

This feature is great - thanks.

I wanted a way to style a column based on data in another column. I couldn't see a way to do this so made a change to the .bar() styler method. Suggestions on how else to perform such a thing would be appreciated.

image

Apologies if I have not done this correctly.

Contributor

TomAugspurger commented Dec 15, 2015

@joekane3 something like that should be possible through the .apply method. Your style function will get the entire DataFrame, so you can use the values in column A to apply styles in column B. Your function should return a DataFrame with background-color: <color> for column B and empty strings everywhere else.

Of course, you'll have to do all the conversion from values to colors on your own. Pandas just uses matplotlib internally, so that's probably your best bet.
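
A minimal sketch of such a function. The column names A and B, the red/green rule and the name `color_b_by_a` are all made up for the example; a real version would map values to colors however you like (e.g. through a matplotlib colormap).

```python
import pandas as pd


def color_b_by_a(df):
    """Return a same-shaped DataFrame of CSS strings: style column B from values in A."""
    # Start with empty strings everywhere, i.e. no styling.
    styles = pd.DataFrame("", index=df.index, columns=df.columns)
    # Hypothetical rule: red where A is negative, green otherwise.
    colors = df["A"].map(lambda a: "red" if a < 0 else "green")
    styles["B"] = "background-color: " + colors
    return styles


df = pd.DataFrame({"A": [-1, 2], "B": [10, 20]})
print(color_b_by_a(df))
```

To use it with the styler, pass `axis=None` so the function receives the whole frame: `df.style.apply(color_b_by_a, axis=None)`.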

monkh commented Dec 30, 2015

I'm new to github, sorry if this is wrong place to post this.
I don't think there is, but is there a way to hide the index column when outputting a styled DataFrame through the .render() function?

Also it seems like the Styler is going to (in the future) make .to_html() obsolete.

Contributor

TomAugspurger commented Dec 30, 2015

Correct, the index is unstyleable right now. I plan to fix that in the future, including an option to hide it.

We'll always have to_html, but the implementation might reuse the code here.

jreback modified the milestone: 0.18.1, 0.18.0 Feb 13, 2016

TomAugspurger added the Style label Mar 11, 2016

jreback modified the milestone: 0.18.1, Next Major Release Apr 26, 2016

TomAugspurger removed the Style label May 17, 2016

mjmyers commented Jul 18, 2016

Correct, the index is unstyleable right now. I plan to fix that in the future, including an option to hide it.

@TomAugspurger : Was there ever any headway on this? I can't find mention of it in the docs. I've had to do some pretty hacky css to hide the index while using the styler.

Contributor

jreback commented Jul 18, 2016

actually a bit of work here: pydata#11655

Contributor

TomAugspurger commented Jul 20, 2016

@mjmyers nothing for styling the index yet though. The big thing is finding an API that's nice to work with. Some possibilities:

  • Adding a target={data,index,columns} keyword to relevant functions for what to style
  • Adding dedicated methods like apply_axis/apply_labels or apply_index/apply_columns
  • Adding a namespace df.style.index/columns.<method>

But I haven't thought too much about it yet.
