Color text based on background color when using `_background_gradient()` #21263

joelostblom · 2018-05-30T20:41:26Z

The purpose of this PR is to automatically color text dark or light based on the background color of the HTML table:

[Screenshots: old behavior vs. new behavior]

As described in #21258, I use the luminance-based approach from seaborn's annotated heatmaps. Tagging @WillAyd, who commented on that issue. A few comments on this PR:

Initially, I was not sure if defining the relative_luminance() method within _background_gradient() was the right way to go, but I opted for this since I saw the same approach elsewhere in the file. Let me know if you prefer a different approach.
I am not sure how intuitive it is that a parameter named text_color takes a numeric argument and not a color, but I think it is a good name for discoverability. Naming it luminance_threshold or similar might be confusing for users looking for a way to change the text color.
I opted to make the light text not completely white. The focus should be the background color so the text should not pop out too much. Thoughts?
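For readers unfamiliar with the luminance-based approach mentioned above, here is a minimal sketch of the W3C relative luminance calculation for a hex color string. It is an illustration only, not the code from this PR: the function name matches the one discussed here, but the body and the `#rrggbb` input format are assumptions.

```python
def relative_luminance(hex_color):
    """Return the W3C relative luminance of a '#rrggbb' color (0 = black, 1 = white)."""
    hex_color = hex_color.lstrip("#")
    # Split into the three 8-bit channels and scale each to the 0-1 range.
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma encoding to obtain linear-light values.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    # Weight each channel by its contribution to perceived brightness.
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


# A dark purple and a very pale pink, both of which appear in the tests below.
for color in ("#440154", "#fff7fb"):
    print(color, round(relative_luminance(color), 3))
```

A dark background gives a value near 0 and a pale background a value near 1, which is what makes a single numeric threshold workable.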

[Screenshots: light text in #ffffff vs. #f1f1f1 (current choice)]

I think 0.2 is a good default threshold value for text_color based on my own qualitative assessment, feel free to disagree. The seaborn default of 0.4 makes too much text white in my opinion. Since colors and contrast can be quite subjective, I thought I would include some comparisons.

[Screenshots: text_color=0.4 vs. text_color=0.2 (current choice)]
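To make the effect of the threshold concrete, here is a small sketch of the decision rule as I understand it from the description: text becomes light when the background luminance falls below the threshold. The helper name `choose_text_color` is hypothetical, and the luminance value used for mid gray is an approximation.

```python
LIGHT_TEXT = "#f1f1f1"  # the off-white proposed in this PR
DARK_TEXT = "#000000"


def choose_text_color(luminance, threshold):
    """Return light text for backgrounds darker than the threshold."""
    return LIGHT_TEXT if luminance < threshold else DARK_TEXT


# Approximate relative luminance of a mid gray such as #999999 (assumed value).
mid_gray_luminance = 0.32

for threshold in (0.2, 0.4):
    print(threshold, choose_text_color(mid_gray_luminance, threshold))
```

With a threshold of 0.2 the mid gray keeps dark text, while 0.4 switches it to light text. That is the sense in which the higher value "makes too much text white".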

This is my first PR, apologies if I have misunderstood something.

closes Automatically adjust HTML table text color when using style.background_gradient() #21258
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

Close pandas-dev#21258

WillAyd

Thanks for the PR. Here are some initial comments

WillAyd · 2018-05-30T22:20:24Z

pandas/io/formats/style.py

+        text_color: float or int
+            luminance threshold for determining text color. Facilitates text
+            visibility across varying background colors. From 0 to 1.
+            0 = all text is dark colored, 1 = all text is light colored.


Can add a version added for 0.24.0 for this

is this the common name for this field in mpl?

IMHO text_color=0.2 looks very counterintuitive to me. It almost looks like the exact opposite of what this feature is about (not having a constant text color).
Shouldn't the name at least contain "threshold", e.g. text_color_threshold?

I agree that text_color_threshold is suitable. Is it OK to have multiple underscores in the parameter name? I don't think there is a common name for this in mpl (for coloring text in general, they tend to use just color, but I believe it would be easy to mistake that for the background color here due to the method's name).

WillAyd · 2018-05-30T22:21:49Z

pandas/io/formats/style.py

@@ -863,7 +863,7 @@ def highlight_null(self, null_color='red'):
        return self

    def background_gradient(self, cmap='PuBu', low=0, high=0, axis=0,
-                            subset=None):
+                            subset=None, text_color=0.2):


To your question of what value to use as a default: I don't have a preference visually, but if Seaborn is using 0.4 I'd rather just fall in line with that. It would certainly make the look and feel more consistent for users using both in, say, a Jupyter notebook.

WillAyd · 2018-05-30T22:22:11Z

pandas/io/formats/style.py

+            if (not isinstance(text_color, (float, int)) or
+                    not 0 <= text_color <= 1):
+                msg = "`text_color` must be a value from 0 to 1."
+                raise ValueError(msg)


Can you add a test to ensure this raises?

I am having trouble getting the test correct for this. When I try the function manually, it raises ValueError when called with any one of the parameters in the test, but pytest keeps failing, saying that it doesn't raise a ValueError. Would you have time to check my latest commits and advise?

WillAyd · 2018-05-30T22:23:57Z

pandas/io/formats/style.py

+                raise ValueError(msg)
+
+            def relative_luminance(color):
+                """Calculate the relative luminance of a color according to W3C


There is a standard for pandas docstrings you'll want to follow:

https://python-sprints.github.io/pandas/guide/pandas_docstring.html

Off the top of my head:

The first row should be only one line

The type and description of parameter(s) should be on separate lines

You'll want a space before the Returns section

Could add a Raises section for the bad luminance
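As an illustration of these docstring points (one-line summary, type and description on separate lines, a blank line before each section, and a Raises section), here is a sketch built around the threshold check from the diff. The standalone helper `validate_threshold` is hypothetical; in the PR the check lives inside `background_gradient`.

```python
def validate_threshold(text_color_threshold):
    """
    Check that a luminance threshold lies between 0 and 1.

    Parameters
    ----------
    text_color_threshold : float or int
        Luminance threshold for determining text color, from 0 to 1.

    Returns
    -------
    float or int
        The validated threshold, unchanged.

    Raises
    ------
    ValueError
        If `text_color_threshold` is not a value from 0 to 1.
    """
    # The isinstance check runs first, so strings and lists never reach
    # the numeric comparison.
    if (not isinstance(text_color_threshold, (float, int))
            or not 0 <= text_color_threshold <= 1):
        msg = "`text_color_threshold` must be a value from 0 to 1."
        raise ValueError(msg)
    return text_color_threshold
```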

WillAyd · 2018-05-30T22:25:10Z

pandas/tests/io/formats/test_style.py

@@ -1031,7 +1031,9 @@ def test_background_gradient(self):

        result = df.style.background_gradient(
            subset=pd.IndexSlice[1, 'A'])._compute().ctx
-        assert result[(1, 0)] == ['background-color: #fff7fb']
+
+        assert result[(1, 0)] == ['background-color: #fff7fb',


Can we add a more comprehensive / dedicated test for this? Something that encompasses the full range of expected values

Added a suggestion

codecov · 2018-05-31T00:02:12Z

Codecov Report

Merging #21263 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #21263      +/-   ##
==========================================
+ Coverage   91.84%   91.85%   +<.01%     
==========================================
  Files         153      153              
  Lines       49538    49555      +17     
==========================================
+ Hits        45499    45518      +19     
+ Misses       4039     4037       -2

Flag	Coverage Δ
#multiple	`90.25% <100%> (ø)`	⬆️
#single	`41.86% <9.09%> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/io/formats/style.py	`96.12% <100%> (+0.08%)`	⬆️
pandas/core/frame.py	`97.22% <0%> (ø)`	⬆️
pandas/core/series.py	`94.12% <0%> (ø)`	⬆️
pandas/util/_decorators.py	`82.25% <0%> (ø)`	⬆️
pandas/tseries/offsets.py	`97% <0%> (ø)`	⬆️
pandas/core/arrays/categorical.py	`95.69% <0%> (+0.01%)`	⬆️
pandas/core/sparse/array.py	`91.38% <0%> (+0.06%)`	⬆️
pandas/core/algorithms.py	`94.81% <0%> (+0.31%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c85ab08...41911af. Read the comment docs.

WillAyd · 2018-06-01T00:56:52Z

pandas/tests/io/formats/test_style.py

@@ -1028,10 +1028,25 @@ def test_background_gradient(self):
            assert all("#" in x[0] for x in result.values())
            assert result[(0, 0)] == result[(0, 1)]
            assert result[(1, 0)] == result[(1, 1)]
+            for res in result:


Make this a separate test called test_text_color_threshold to distinguish it from the gradient testing.

Also I didn't really understand the point of the conditional - can we not be more assertive about the exact values we'd expect?

My thinking is that we want to test if the text color is light or dark conditional on the background color. If for some unexpected reason the background color changes, this test should not fail.

I added a section asserting that the background color takes on one of its 4 expected values, so that test_text_color_threshold cannot pass without actually testing anything (I could move this section to test_background_gradient if you think that is more suitable).

WillAyd · 2018-06-01T00:57:11Z

pandas/tests/io/formats/test_style.py

+                                  'color: #000000']
+
+    @td.skip_if_no_mpl
+    def test_text_color_threshold(self):


Rename this to test_text_color_threshold_raises

WillAyd · 2018-06-01T00:58:10Z

pandas/tests/io/formats/test_style.py

+    @td.skip_if_no_mpl
+    def test_text_color_threshold(self):
+        df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B'])
+        for text_color_threshold in [1.1, '1', -1, [2, 2]]:


You can parametrize these (check test_to_html.py for examples if you don't know what I mean)

WillAyd · 2018-06-01T00:59:31Z

pandas/tests/io/formats/test_style.py

+    def test_text_color_threshold(self):
+        df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B'])
+        for text_color_threshold in [1.1, '1', -1, [2, 2]]:
+            with pytest.raises(ValueError):


tm.assert_raises_regex would be preferable as you can use it to assert on the message as well. Can still be used as a context manager (can also see usage in test_to_html.py)
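A sketch of what the two suggestions above (parametrizing the bad values and asserting on the message) might look like together. It uses `pytest.raises(..., match=...)` rather than the pandas `tm.assert_raises_regex` helper, and `validate_threshold` is a hypothetical stand-in for the check inside `background_gradient`, so treat this as an illustration rather than the test that was committed.

```python
import pytest


def validate_threshold(value):
    # Hypothetical stand-in for the validation inside background_gradient.
    if not isinstance(value, (float, int)) or not 0 <= value <= 1:
        raise ValueError("`text_color_threshold` must be a value from 0 to 1.")


@pytest.mark.parametrize("text_color_threshold", [1.1, "1", -1, [2, 2]])
def test_text_color_threshold_raises(text_color_threshold):
    # `match` is applied as a regex search against the error message.
    msg = "must be a value from 0 to 1"
    with pytest.raises(ValueError, match=msg):
        validate_threshold(text_color_threshold)
```

Each value in the list becomes its own test case, so a failure report names the exact input that did not raise.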

Thanks for the pointers, I updated accordingly.

This test still complains that a ValueError is not raised for any of the parameters. When I call the function outside the test, it does raise ValueError: ('`text_color_threshold` must be a value from 0 to 1.', 'occurred at index A'). The test works fine if I raise ValueError(msg) manually.

This didn't raise for me locally, though it did when running the _compute method

In [12]: df.style.background_gradient(text_color_threshold='1')
Out[12]: <pandas.io.formats.style.Styler at 0x110163630>

In [13]: df.style.background_gradient(text_color_threshold='1')._compute()
ValueError: ('`text_color_threshold` must be a value from 0 to 1.', 'occurred at index A')

Thanks, I missed that _compute() is called automatically in Notebooks via _repr_html_. I believe all comments are addressed now.
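The behavior described in this exchange is a deferred-evaluation pattern: the styling call only queues work, and the validation runs when the styles are computed. The toy class below is a hypothetical stand-in written to show that pattern; it is not the pandas Styler implementation.

```python
class LazyStyler:
    """Toy styler that records styling steps and runs them later."""

    def __init__(self):
        self._todo = []

    def background_gradient(self, text_color_threshold=0.408):
        # Only queue the step; nothing is validated at this point.
        self._todo.append(lambda: self._apply(text_color_threshold))
        return self

    def _apply(self, text_color_threshold):
        if (not isinstance(text_color_threshold, (float, int))
                or not 0 <= text_color_threshold <= 1):
            raise ValueError(
                "`text_color_threshold` must be a value from 0 to 1.")

    def _compute(self):
        # Run every queued step; errors surface here, not at queue time.
        for step in self._todo:
            step()
        return self


styler = LazyStyler().background_gradient(text_color_threshold="1")  # no error yet
try:
    styler._compute()
except ValueError as err:
    print("raised on compute:", err)
```

A test that wraps only the first call in a "raises" context therefore sees no exception, which matches the failure reported above.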

WillAyd · 2018-06-01T01:00:34Z

pandas/io/formats/style.py

@@ -863,7 +863,7 @@ def highlight_null(self, null_color='red'):
        return self

    def background_gradient(self, cmap='PuBu', low=0, high=0, axis=0,
-                            subset=None):
+                            subset=None, text_color_threshold=0.408):
        """


Can you also add a Raises section for the docstring?

WillAyd · 2018-06-03T16:58:58Z

pandas/tests/io/formats/test_style.py

+    @td.skip_if_no_mpl
+    def test_text_color_threshold(self):
+        df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B'])
+        for c_map in [None, 'YlOrRd']:


Parametrize the cmap

WillAyd · 2018-06-03T17:03:38Z

pandas/tests/io/formats/test_style.py

+        df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B'])
+        for c_map in [None, 'YlOrRd']:
+            result = df.style.background_gradient(cmap=c_map)._compute().ctx
+            for res in result:


Instead of the loop just be explicit about the dict that you expect and compare it to the result

WillAyd · 2018-06-03T19:29:15Z

pandas/tests/io/formats/test_style.py

-                elif result[res][0].split(' ')[1] in ['#800026', '#440154']:
-                    assert result[(res)][1].split(' ')[1] == '#f1f1f1'
+        result = df.style.background_gradient(cmap=c_map)._compute().ctx
+        test_colors = {None: {(0, 0): ('#440154', '#f1f1f1'),


Change the variable here to expected (standard in pandas tests)

Instead of using a nested dict you should also send this in via parametrization, so you would have "c_map,expected" and then send in a tuple of the c_map and its expected result

WillAyd · 2018-06-03T19:31:08Z

pandas/tests/io/formats/test_style.py

+                       'YlOrRd': {(0, 0): ('#ffffcc', '#000000'),
+                                  (1, 0): ('#800026', '#f1f1f1')}}
+        # Light text on dark background
+        assert result[0, 0][0].split(' ')[1] == test_colors[c_map][0, 0][0], (


Why are you splitting this? Just make your expected variable account for the exact string required. Unless I'm missing something you should just have one assertion here for result == expected

I was doing this to be able to raise different assertion messages for the background and foreground color. But since pytest shows the expected and current value anyway, this should be easy to track down without separate assertion messages.

WillAyd · 2018-06-03T20:24:33Z

pandas/tests/io/formats/test_style.py

+    def test_text_color_threshold(self, c_map, expected):
+        df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B'])
+        result = df.style.background_gradient(cmap=c_map)._compute().ctx
+        assert result[0, 0] == expected[0]


Much closer but why not set up expected so it looks something like:

{(0, 0): [...], (0, 1): [...]}

And simplify at the end to result == expected?
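A sketch of the shape suggested here, using the 'YlOrRd' colors quoted earlier in this review. The `result` dict is a hand-built stand-in for `df.style.background_gradient(cmap='YlOrRd')._compute().ctx`, and giving both columns the same colors within a row is an assumption, so the exact values are illustrative.

```python
def cell(background, text):
    """Build the list of CSS declarations stored for one cell."""
    return ["background-color: {}".format(background),
            "color: {}".format(text)]


# Stand-in for the computed style context: row 0 holds the column minimum
# (pale background, dark text), row 1 the maximum (dark background, light text).
row_colors = [("#ffffcc", "#000000"), ("#800026", "#f1f1f1")]
result = {
    (row, col): cell(*colors)
    for row, colors in enumerate(row_colors)
    for col in range(2)
}

expected = {
    (0, 0): ["background-color: #ffffcc", "color: #000000"],
    (0, 1): ["background-color: #ffffcc", "color: #000000"],
    (1, 0): ["background-color: #800026", "color: #f1f1f1"],
    (1, 1): ["background-color: #800026", "color: #f1f1f1"],
}

# One comparison covers every cell and both CSS declarations.
assert result == expected
```

If the dicts differ, pytest prints the differing entries, so separate per-cell assertion messages are not needed.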

WillAyd · 2018-06-03T20:25:40Z

pandas/tests/io/formats/test_style.py

+        assert result[(1, 0)] == ['background-color: #fff7fb',
+                                  'color: #000000']
+
+    @td.skip_if_no_mpl


Do these tests need to be in the MatplotlibDep test class? I didn't think this required mpl but could be wrong.

If that is the case then move the @td.skip_if_no_mpl decorator to the class instead of on each function individually.

Yes, they both call background_gradient() which depends on mpl.

WillAyd · 2018-06-03T20:46:44Z

doc/source/whatsnew/v0.24.0.txt

@@ -181,7 +181,7 @@ Reshaping
 Other
 ^^^^^

-
+- :meth: `~pandas.io.formats.style.Styler.background_gradient` now takes a ``text_color_threshold`` parameter to automatically lighten the text color based on the luminance of the background color. This improves readability with dark background colors without the need to limit the background colormap range. (:issue:`21258`, :issue:`21269`)


Can just reference the first issue here, since that is what you are closing anyway (second was a duplicate)

WillAyd · 2018-06-03T20:47:15Z

Thanks for the changes. Assuming tests pass I'll approve on my end

joelostblom · 2018-06-03T23:27:25Z

@WillAyd Thank you for all the help, much appreciated!

gfyoung · 2018-06-06T16:33:22Z

@TomAugspurger : Could you take a look?

TomAugspurger · 2018-06-06T21:38:04Z

pandas/io/formats/style.py

@@ -879,26 +879,39 @@ def background_gradient(self, cmap='PuBu', low=0, high=0, axis=0,
            1 or 'columns' for columnwise, 0 or 'index' for rowwise
        subset: IndexSlice
            a valid slice for ``data`` to limit the style application to
+        text_color_threshold: float or int


Add , default 0.408.

And maybe a note as to why that's the default?

We chose 0.408 to stay consistent with the Seaborn implementation. Should I ask the Seaborn author for the underlying reason, just reference seaborn, or leave the value without explanation?

No worries, was just curious.

TomAugspurger · 2018-06-06T21:38:19Z

pandas/io/formats/style.py

+        Raises
+        ------
+        ValueError
+            If ``text_color_threshold`` is not a value from 0 to 1.


Single backtick for parameter names.

Thanks, I used double backticks because I saw them in other places, for example for high and low in the notes section. Do you want me to update those to single backticks as well? Does the same go for data in the subset section and the expressions in the notes section?

Yes, I noticed that after posting. Probably fine to be consistent w/ the rest of the docstring here.

TomAugspurger · 2018-06-07T12:52:27Z

Thanks @joelostblom!

soxofaan · 2018-06-07T19:26:03Z

thanks for this improvement.

FYI: in PR #21259 I add support for axis=None in background_gradient. As a side effect I could simplify the implementation of the relative_luminance function that was added here, because it now receives an RGBA tuple/array directly (instead of a color hex string that had to be parsed).

joelostblom added 2 commits May 30, 2018 16:39

Color text based on background gradient

36dd331

Close pandas-dev#21258

Add additional return value to the test

04d30f8

WillAyd requested changes May 30, 2018

This was referenced May 31, 2018

Use light text color on dark colors in style.background_gradient #21269

Closed

ENH: Add support for tablewise application of style.background_gradient with axis=None #21259

Merged

joelostblom added 6 commits May 31, 2018 20:20

Format docstring to pandas standards

6ce03f3

Set relative luminance default to same as in seaborn

a878359

Add version added note

e7e8444

Test a larger range of expected values

ee79b10

Change parameter name to text_color_threshold

075cd54

Add test for bad text_color_threshold values

5023dd6

WillAyd requested changes Jun 1, 2018

joelostblom added 5 commits June 1, 2018 09:51

Add Raises section to docstring

163c364

Add separate test for text_color_threshold

ba6f981

Add parameterized test for text_color_threshold ValueError

a6dc14d

Fix test by calling ._compute()

8993dff

Add whatsnew entry

6eb0930

WillAyd requested changes Jun 3, 2018

Make threshold test explicit and parameterized

c30255b

WillAyd requested changes Jun 3, 2018

Parametrize test further to simplify assertions

2c34fb9

WillAyd requested changes Jun 3, 2018

joelostblom added 2 commits June 3, 2018 16:32

Simplify test further

ae2e849

Move matplotlib decorator to the class

7be4559

WillAyd reviewed Jun 3, 2018

Remove ref to duplicate issue

41911af

WillAyd approved these changes Jun 4, 2018

gfyoung added Enhancement Visualization plotting labels Jun 6, 2018

TomAugspurger approved these changes Jun 6, 2018

TomAugspurger merged commit c388dde into pandas-dev:master Jun 7, 2018

david-liu-brattle-1 pushed a commit to david-liu-brattle-1/pandas that referenced this pull request Jun 18, 2018

ENH: Color text based on background in Styler (pandas-dev#21263)

2bdefef

* Color text based on background gradient Closes pandas-dev#21258
