# Pandas styling

* Pandas can add visual formatting when it displays a DataFrame.
* The style property returns a Styler object, which provides many options for formatting and displaying DataFrames.
* A Styler object is essentially a DataFrame with styling rules attached. The exercises below show how styling works.
* There are two ways to apply custom style functions. One is element-wise styling, which is done with the applymap method.
* The other is column-wise or row-wise styling, which uses the apply method.

## Exercise 1 - creating a DataFrame

In [1]:
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'A': np.linspace(1, 8, 8),
        'B': np.random.random(8),
        'C': np.random.randn(8),
        'D': np.random.randn(8),
        'E': np.random.randint(-5, 5, 8)
    }
)

df.iloc[[1,5],[1,3]] = np.nan

df

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4


## Exercise 2

* We can write a function that displays values in a different color based on a condition.
* For instance, we can display negative values in red.

In [2]:
def color_negative_values(val):
    # Return a CSS rule: red text for negative values, black otherwise
    color = "red" if val < 0 else "black"
    return f"color: {color}"

In [3]:
df.style.applymap(color_negative_values)

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4
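To see what the element-wise method collects, it can help to call the style function directly. The sketch below repeats color_negative_values and runs it on a few sample values. The helper style_elementwise is a hypothetical name, added on the assumption that pandas 2.1 and later prefer Styler.map over Styler.applymap.

```python
import numpy as np

def color_negative_values(val):
    """Return a CSS rule: red text for negative numbers, black otherwise."""
    color = "red" if val < 0 else "black"
    return f"color: {color}"

# Each call returns one CSS string; the Styler attaches it to one cell.
sample_values = [-1.5, 0.0, 2.0, np.nan]
css_rules = [color_negative_values(v) for v in sample_values]
print(css_rules)

def style_elementwise(styler, func):
    """Hypothetical helper: use Styler.map when present, else applymap."""
    method = getattr(styler, "map", None) or styler.applymap
    return method(func)
```

A missing value compares as not less than zero, so it falls into the black branch. With the helper, the cell above could be written as style_elementwise(df.style, color_negative_values).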


## Exercise 3

* The applymap method works element by element, whereas apply works on whole columns or rows.
* Here is a function that changes the background color of the maximum value in each column.

In [4]:
def color_max(s):
    is_max = s == s.max()
    return ["background-color: lightblue" if v else "" for v in is_max]

In [5]:
df.style.apply(color_max)

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4


## Exercise 4

* We can also apply the function from the previous exercise to rows by setting the axis parameter to 1.

In [6]:
df.style.apply(color_max, axis=1)

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4
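The difference between the two axis settings is easier to see by calling color_max by hand on a column and on a row. This is a minimal sketch on a made-up three-row frame (the name small and its values are illustrative, not part of the exercise data).

```python
import numpy as np
import pandas as pd

def color_max(s):
    is_max = s == s.max()
    return ["background-color: lightblue" if v else "" for v in is_max]

small = pd.DataFrame({"A": [1.0, 5.0, 3.0], "B": [9.0, np.nan, 2.0]})

# With the default axis=0, apply passes one column at a time.
column_a_css = color_max(small["A"])
column_b_css = color_max(small["B"])

# With axis=1, apply passes one row at a time.
row_0_css = color_max(small.loc[0])

print(column_a_css)
print(column_b_css)
print(row_0_css)
```

The function returns one CSS string per element of the Series it receives, which is why the same function works for both axes. Missing values never equal the maximum, so they stay unstyled.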


## Exercise 5

* We can combine different style functions by chaining them.

In [7]:
df.style.applymap(color_negative_values).apply(color_max)

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4


## Exercise 6

* Style functions can be applied to part of a DataFrame by selecting particular rows or columns with the subset parameter.

In [8]:
df.style.apply(color_max, subset=['B','C'])

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4


## Exercise 7

* In addition to custom functions, pandas has some built-in style functions that cover common tasks. For instance, the highlight_null function marks missing values.

In [9]:
df.style.highlight_null(color="yellow")

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4


## Exercise 8

* Another useful built-in function is background_gradient, which shades each cell in proportion to its value.
* Assign a Matplotlib colormap to the cmap parameter.

In [10]:
df.style.background_gradient(cmap="PuBu")

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4


In [11]:
df.style.background_gradient(cmap="YlGn_r")

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4


## Exercise 9

* The highlight_max and highlight_min functions mark the maximum and minimum values in a column or row, like our custom color_max function.

In [12]:
df.style.highlight_min(color="yellow")

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4


In [13]:
df.style.highlight_min(color="yellow").highlight_max(color="lightgreen")

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4


## Exercise 10

* The set_properties function lets us set several CSS properties in one call. The same properties are applied to every selected cell, regardless of its value.

In [14]:
df.style.set_properties(**{"background-color": "lightgreen",
                           "color": "blue",
                           "border-color": "black"})

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4


## Exercise 11

* The bar function draws bars in the cells, with lengths proportional to the cell values.

In [15]:
df.style.bar(color="lightgreen")

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4


## Exercise 12

* By setting the align parameter and passing two colors, we can show negative and positive values with different colors.

In [16]:
df.style.bar(align="mid", color=["yellow", "lightgreen"])

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4


## Exercise 13

* The style functions we used here are fairly simple.
* However, we can also create more complex style functions that make DataFrames more informative.
* We may want to use the same styling on all the DataFrames we work on.
* Pandas offers a way to transfer styles between DataFrames.
* We first save the style to a Styler object.

In [17]:
style = df.style.applymap(color_negative_values).apply(color_max)

style

Unnamed: 0,A,B,C,D,E
0,1.0,0.617934,-0.846906,-0.339649,-3
1,2.0,,1.24639,,-5
2,3.0,0.249473,-0.409464,-1.133222,1
3,4.0,0.047224,1.212635,-0.387533,3
4,5.0,0.652359,-1.104956,-0.650134,-2
5,6.0,,-1.321813,,-3
6,7.0,0.264685,-1.748838,0.153076,0
7,8.0,0.953899,0.035857,-0.146805,-4


## Exercise 14

* We then create a second DataFrame and transfer the saved style to it with the export and use methods.

In [18]:
df2 = pd.DataFrame(
    {
        "col1": np.random.random(8),
        "col2": np.random.randn(8),
        "col3": np.random.randint(-5, 5, 8)
    }
)

df2

Unnamed: 0,col1,col2,col3
0,0.563522,-0.358622,-2
1,0.424504,0.584879,-4
2,0.186882,-0.69776,-2
3,0.245806,-0.030282,-1
4,0.607126,0.382591,-2
5,0.674171,-0.14609,0
6,0.263375,0.253176,0
7,0.853755,2.930175,4


In [19]:
style2 = df2.style

style2.use(style.export())

style2

Unnamed: 0,col1,col2,col3
0,0.563522,-0.358622,-2
1,0.424504,0.584879,-4
2,0.186882,-0.69776,-2
3,0.245806,-0.030282,-1
4,0.607126,0.382591,-2
5,0.674171,-0.14609,0
6,0.263375,0.253176,0
7,0.853755,2.930175,4


## Exercise 15

* Styling also works on aggregated results, such as the mean customer age for each education level.

In [20]:
churn = pd.read_csv(
    "Data/BankChurners.csv",
    usecols=["Attrition_Flag", "Customer_Age", "Education_Level", "Gender"]
)

churn.head()

Unnamed: 0,Attrition_Flag,Customer_Age,Gender,Education_Level
0,Existing Customer,45,M,High School
1,Existing Customer,49,F,Graduate
2,Existing Customer,51,M,Graduate
3,Existing Customer,40,F,High School
4,Existing Customer,40,M,Uneducated


In [21]:
(
    churn.groupby(["Education_Level"], as_index=False)["Customer_Age"].mean()
    .style.highlight_min(color="yellow", subset=["Customer_Age"])
    .highlight_max(color="lightgreen", subset=["Customer_Age"])
)

Unnamed: 0,Education_Level,Customer_Age
0,College,45.970385
1,Doctorate,47.261641
2,Graduate,46.323529
3,High School,46.345256
4,Post-Graduate,45.562016
5,Uneducated,46.423672
6,Unknown,46.428571
