Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Color text based on background color when using _background_gradient() #21263

Merged

Conversation

joelostblom
Copy link
Contributor

@joelostblom joelostblom commented May 30, 2018

The purpose of this PR is to automatically color text dark or light based on the background color of the HTML table:

================old behavior ===========================new behavior========
image image

As described in #21258, I use the luminance-based approach from seaborn's annotated heatmaps. Tagging @WillAyd who commented on that issue. A few comments on this PR

  1. Initially, I was not sure if defining the relative_luminance() method within _background_gradient() was the right way to go, but I opted for this since I saw the same approach elsewhere in the file. Let me know if you prefer a different approach.

  2. I am not sure how intuitive it is that a parameter named text_color takes a numeric argument and not a color, but I think it is a good name for discoverability. Naming it luminance_threshold or similar might be confusing for users looking for a way to change the text color.

  3. I opted to make the light text not completely white. The focus should be the background color so the text should not pop out too much. Thoughts?

================#ffffff ============================#f1f1f1 (current choice)===
image image

image image

  1. I think 0.2 is a good default threshold value for text_color based on my own qualitative assessment, feel free to disagree. The seaborn default of 0.4 makes too much text white in my opinion. Since colors and contrast can be quite subjective, I thought I would include some comparisons.

============text_color=0.4 ======================text_color=0.2 (current choice)===
image image
image image
image image

This is my first PR, apologies if I have misunderstood something.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. Here are some initial comments

text_color: float or int
luminance threshold for determining text color. Facilitates text
visibility across varying background colors. From 0 to 1.
0 = all text is dark colored, 1 = all text is light colored.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can add a version added for 0.24.0 for this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the common name for this field in mpl?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO text_color=0.2 looks very counterintuitive to me. It almost looks like the exact opposite of what this feature is about (not having a constant text color).
Shouldn't the name at least contain "threshold", e.g. text_color_threshold?

Copy link
Contributor Author

@joelostblom joelostblom May 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that text_color_threshold is suitable, is it ok with multiple underscores in the parameter name? I don't think there is a common name for this in mpl (for coloring text in general, they tend to use just color, but I believe it would be easy to mistake that for the backgruond color here due to the method's name)

@@ -863,7 +863,7 @@ def highlight_null(self, null_color='red'):
return self

def background_gradient(self, cmap='PuBu', low=0, high=0, axis=0,
subset=None):
subset=None, text_color=0.2):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To your question of what value to use as a default I don't have a preference visually, but if Seaborn is using 0.4 I'd rather just fall inline with that. Would certainly make the look and feel more consistent for users using both in say a Jupyter notebook

if (not isinstance(text_color, (float, int)) or
not 0 <= text_color <= 1):
msg = "`text_color` must be a value from 0 to 1."
raise ValueError(msg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test to ensure this raises?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am having troubles getting the test correct for this. When I try the function manually, it raises ValueError when called with any one of the parameters in the test, but pytest keeps failing saying that it doesn't raise a ValueError. Would you have time to check my latest commits and advise?

raise ValueError(msg)

def relative_luminance(color):
"""Calculate the relative luminance of a color according to W3C
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a standard for pandas docstrings you'll want to follow:

https://python-sprints.github.io/pandas/guide/pandas_docstring.html

Off the top of my head:

  • The first row should be only one line
  • The type and description of parameter(s) should be on separate lines
  • You'll want a space before the Returns section
  • Could add a Raises section for the bad luminance

@@ -1031,7 +1031,9 @@ def test_background_gradient(self):

result = df.style.background_gradient(
subset=pd.IndexSlice[1, 'A'])._compute().ctx
assert result[(1, 0)] == ['background-color: #fff7fb']

assert result[(1, 0)] == ['background-color: #fff7fb',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a more comprehensive / dedicated test for this? Something that encompasses the full range of expected values

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a suggestion

@codecov
Copy link

codecov bot commented May 31, 2018

Codecov Report

Merging #21263 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #21263      +/-   ##
==========================================
+ Coverage   91.84%   91.85%   +<.01%     
==========================================
  Files         153      153              
  Lines       49538    49555      +17     
==========================================
+ Hits        45499    45518      +19     
+ Misses       4039     4037       -2
Flag Coverage Δ
#multiple 90.25% <100%> (ø) ⬆️
#single 41.86% <9.09%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/io/formats/style.py 96.12% <100%> (+0.08%) ⬆️
pandas/core/frame.py 97.22% <0%> (ø) ⬆️
pandas/core/series.py 94.12% <0%> (ø) ⬆️
pandas/util/_decorators.py 82.25% <0%> (ø) ⬆️
pandas/tseries/offsets.py 97% <0%> (ø) ⬆️
pandas/core/arrays/categorical.py 95.69% <0%> (+0.01%) ⬆️
pandas/core/sparse/array.py 91.38% <0%> (+0.06%) ⬆️
pandas/core/algorithms.py 94.81% <0%> (+0.31%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c85ab08...41911af. Read the comment docs.

@@ -1028,10 +1028,25 @@ def test_background_gradient(self):
assert all("#" in x[0] for x in result.values())
assert result[(0, 0)] == result[(0, 1)]
assert result[(1, 0)] == result[(1, 1)]
for res in result:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this a separate test called test_text_color_threshold to distinguish it from the gradient testing.

Also I didn't really understand the point of the conditional - can we not be more assertive about the exact values we'd expect?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking is that we want to test if the text color is light or dark conditional on the background color. If for some unexpected reason the background color changes, this test should not fail.

I added a section to assert if the background color takes on one of its 4 expected values to prevent that test_text_color_threshold doesn't test anything (I could move this section to test_background_gradient if you think that is more suitable).

'color: #000000']

@td.skip_if_no_mpl
def test_text_color_threshold(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename this to test_text_color_threshold_raises

@td.skip_if_no_mpl
def test_text_color_threshold(self):
df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B'])
for text_color_threshold in [1.1, '1', -1, [2, 2]]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can parametrize these (check test_to_html.py for examples if you don't know what I mean)

def test_text_color_threshold(self):
df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B'])
for text_color_threshold in [1.1, '1', -1, [2, 2]]:
with pytest.raises(ValueError):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tm.assert_raises_regex would be preferable as you can use it to assert on the message as well. Can still be used as a context manager (can also see usage in test_to_html.py)

Copy link
Contributor Author

@joelostblom joelostblom Jun 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pointers, I updated accordingly.

This test still complains that a ValueError is not raised for any of the parameters. When I call the the function outside the test, it does raise ValueError: ('`text_color_threshold` must be a value from 0 to 1.', 'occurred at index A'). The test works fine if I raise ValueError(msg) manually.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This didn't raise for me locally, though it did when running the _compute method

In [12]: df.style.background_gradient(text_color_threshold='1')
Out[12]: <pandas.io.formats.style.Styler at 0x110163630>

In [13]: df.style.background_gradient(text_color_threshold='1')._compute()
ValueError: ('`text_color_threshold` must be a value from 0 to 1.', 'occurred at index A')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I missed that _compute() is called automatically in Notebooks via _repr_html_. I believe all comments are addressed now.

@@ -863,7 +863,7 @@ def highlight_null(self, null_color='red'):
return self

def background_gradient(self, cmap='PuBu', low=0, high=0, axis=0,
subset=None):
subset=None, text_color_threshold=0.408):
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a Raises section for the docstring?

@td.skip_if_no_mpl
def test_text_color_threshold(self):
df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B'])
for c_map in [None, 'YlOrRd']:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parametrize the cmap

df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B'])
for c_map in [None, 'YlOrRd']:
result = df.style.background_gradient(cmap=c_map)._compute().ctx
for res in result:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of the loop just be explicit about the dict that you expect and compare it to the result

elif result[res][0].split(' ')[1] in ['#800026', '#440154']:
assert result[(res)][1].split(' ')[1] == '#f1f1f1'
result = df.style.background_gradient(cmap=c_map)._compute().ctx
test_colors = {None: {(0, 0): ('#440154', '#f1f1f1'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the variable here to expected (standard in pandas tests)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using a nested dict you should also send this in via parametrization, so you would have "c_map,expected" and then send in a tuple of the c_map and it's expected result

'YlOrRd': {(0, 0): ('#ffffcc', '#000000'),
(1, 0): ('#800026', '#f1f1f1')}}
# Light text on dark background
assert result[0, 0][0].split(' ')[1] == test_colors[c_map][0, 0][0], (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you splitting this? Just make your expected variable account for the exact string required. Unless I'm missing something you should just have one assertion here for result == expected

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was doing this to be able to raise different assertion messages for the background and foreground color. But since pytest shows the expected and current value anyways, this should be easy to trace done without separate assertion messages.

def test_text_color_threshold(self, c_map, expected):
df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B'])
result = df.style.background_gradient(cmap=c_map)._compute().ctx
assert result[0, 0] == expected[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much closer but why not set up expected so it looks something like:

{(0, 0): [...], 
 (0, 1): [...]}

And simplify at the end to result == expected?

assert result[(1, 0)] == ['background-color: #fff7fb',
'color: #000000']

@td.skip_if_no_mpl
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these tests need to be in the MatplotlibDep test class? I didn't think this required mpl but could be wrong.

If that is the case then move the @td.skip_if_no_mpl decorator to the class instead of on each function individually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they both call background_gradient() which depends on mpl.

@@ -181,7 +181,7 @@ Reshaping
Other
^^^^^

-
- :meth: `~pandas.io.formats.style.Styler.background_gradient` now takes a ``text_color_threshold`` parameter to automatically lighten the text color based on the luminance of the background color. This improves readability with dark background colors without the need to limit the background colormap range. (:issue:`21258`, :issue:`21269`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can just reference the first issue here, since that is what you are closing anyway (second was a duplicate)

@WillAyd
Copy link
Member

WillAyd commented Jun 3, 2018

Thanks for the changes. Assuming tests pass I'll approve on my end

@joelostblom
Copy link
Contributor Author

@WillAyd Thank you for all the help, much appreciated!

@gfyoung
Copy link
Member

gfyoung commented Jun 6, 2018

@TomAugspurger : Could you take a look?

@@ -879,26 +879,39 @@ def background_gradient(self, cmap='PuBu', low=0, high=0, axis=0,
1 or 'columns' for columnwise, 0 or 'index' for rowwise
subset: IndexSlice
a valid slice for ``data`` to limit the style application to
text_color_threshold: float or int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add , default 0.408.

And may a note as to why that's the default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We chose 0.408 to stay consistent with the Seaborn implementation. Should I ask the Seaborn author for the underlying reason, just reference seaborn, or leave the value without explanation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, was just curious.

Raises
------
ValueError
If ``text_color_threshold`` is not a value from 0 to 1.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Single backtick for parameter names.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I did double tick because I saw it in other places, for example for high and low in the notes section. Do you want me to update them to single tick as well? Does the same go for data in the subset section and the expressions in the note section?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I noticed that after posting. Probably fine to be consistent w/ the rest of the docstring here.

@TomAugspurger TomAugspurger merged commit c388dde into pandas-dev:master Jun 7, 2018
@TomAugspurger
Copy link
Contributor

Thanks @joelostblom!

@soxofaan
Copy link
Contributor

soxofaan commented Jun 7, 2018

thanks for this improvement.

FYI: under PR #21259 I add support for axis=None in background_gradient. As a side effect I could simplify the implementation of the relative_luminance function that was added here, because it now just receives a rgba tuple/array directly (instead of a color hex string that had to be parsed).

david-liu-brattle-1 pushed a commit to david-liu-brattle-1/pandas that referenced this pull request Jun 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatically adjust HTML table text color when using style.background_gradient()
6 participants