<img src='graphics/style_pandas.jpeg'>

<img src='graphics/spacer.png'>

<center><font style="font-size:40px;">Styling Pandas Output</font></center>
<center>Coded by and Adapted from Cornellius Yudha Wijaya</center>





[https://towardsdatascience.com/my-top-4-functions-to-style-the-pandas-dataframe-932cdc79be39]


The pandas DataFrame is the object data scientists use most often to analyze their data. Its main job is to hold your data while you get on with the analysis, but we can also style a DataFrame, for example to present results or simply to make the output easier to read.

<p>_____

</p>
<font style="font-size:24px;">Loading the Libraries and Data</font>

Let’s take an example with a dataset. We will use the ‘planets’ data available from the Seaborn library.

In [None]:
#Importing the modules
import pandas as pd
import seaborn as sns

#Loading the dataset
planets = sns.load_dataset('planets')

#Showing the first 5 rows of the planets data (head() defaults to 5 rows)
planets.head()

# Hiding Function

Sometimes when you present the results of an analysis, you only want to show the most important parts. When I present a DataFrame to a non-technical audience, I am often asked about the default numeric index: “What is this number?” For that reason, we can hide the index with the `.hide_index()` function. Note that `.hide_index()` was deprecated in pandas 1.4 and removed in pandas 2.0; on newer versions, use `.hide(axis='index')` instead.

## .hide_index( )

In [None]:
#Hiding the index through the style accessor
#pandas >= 1.4: .hide(axis='index'); older versions: .hide_index()
planets.head(10).style.hide(axis='index')

## .hide_columns( ) 

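Additionally, we can hide unnecessary columns. Let’s say we don't want to show the ‘method’ and ‘year’ columns. To do this, we can use the `.hide_columns([ ])` function. Like `.hide_index()`, it was deprecated in pandas 1.4 and removed in pandas 2.0; on newer versions, use `.hide([ ], axis='columns')` instead.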

In [None]:
#Hiding the unnecessary columns (pandas >= 1.4; older versions: .hide_index().hide_columns([...]))
planets.head(10).style.hide(axis='index').hide(['method','year'], axis='columns')

# Highlight Function

## .highlight_max( )

There are times when we want to highlight only the important numbers in our DataFrame, for example the highest number. In this case, we can use the built-in `highlight_max( )` method.

In [None]:
#Highlight the maximum number for each column
planets.head(10).style.highlight_max(color = 'yellow')

## .highlight_min( )

But what if we want the minimum values highlighted instead? The built-in `.highlight_min( )` function comes to our rescue.

In [None]:
planets.head(10).style.highlight_min(color = 'lightblue')

If you want to chain the two, you can do that as well.

In [None]:
#Highlight the minimum number with lightblue color and the maximum number with yellow color
planets.head(10).style.highlight_max(color='yellow').highlight_min(color = 'lightblue')

In [None]:
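#Adding axis=1 to change the direction from column to row
#(subset keeps the comparison to numeric columns, so strings and numbers are never compared)
planets.head(10).style.highlight_max(color = 'yellow', axis = 1, subset = planets.select_dtypes('number').columns)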

In addition, we can highlight null values with the following code.

In [None]:
#Highlight the null values (pandas >= 1.5 uses color=; older versions use null_color=)
planets.head(10).style.highlight_null(color = 'red')

# Gradient Function

In [None]:
#Gradient background color for the numerical columns
planets.head(10).style.background_gradient(cmap = 'Blues')

In [None]:
planets.head(10).style.set_caption('Colormap with a caption.').background_gradient(cmap='Blues')

# Custom Function

In [None]:
#Create the function to color numerical values of 20 or less in red
def color_below_20_red(value):
    #Strings (such as the 'method' column) stay black
    if isinstance(value, str):
        return 'color: black'
    color = 'red' if value <= 20 else 'black'
    return 'color: {}'.format(color)

#We apply the function to every element in the DataFrame by using the applymap function
#(pandas 2.1 renamed it to Styler.map and deprecated applymap)
planets.head(10).style.applymap(color_below_20_red)

# Bar Chart Integration

In Excel, conditional formatting can draw bar charts inside table cells. We can do the same in Python using the `.bar()` method. Let's run the next blocks to see how it works.

First, let's generate a DataFrame with which to work. 

In [None]:
import pandas as pd
import numpy as np

np.random.seed(26)
df = pd.DataFrame({'A': np.linspace(1, 10, 10)})
df = pd.concat([df, pd.DataFrame(np.random.randn(10, 4), columns=list('BCDE'))],
               axis=1)
df.iloc[4, 4] = np.nan
df.iloc[0, 2] = np.nan

df

In [None]:
df.style.bar(color='red')

We can also do this for selected columns only by tweaking the parameters. The `subset` parameter names the columns of the DataFrame that should be styled. Look at the code below to see how this is done.

In [None]:
df.style.bar(subset=['A', 'D'], color='red')

## Let's get really fancy

New in pandas version 0.20.0 is the ability to customize the bar chart further: `df.style.bar` can now be centered on zero or on the midpoint value (in addition to the existing behavior of placing the minimum value at the left side of the cell), and you can pass a list of `[color_negative, color_positive]`.

Here’s how you can change the above with the new align='mid' option:

In [None]:
df.style.bar(subset=['C', 'E'], align='mid', color=['red', 'green'])

# Exercises

Let's practice what we have learned.

First, let's randomly generate a DataFrame with which to work on these exercises. 

In [None]:
import pandas as pd
import numpy as np

np.random.seed(42)
df = pd.DataFrame({'A': np.linspace(1, 10, 10)})
df = pd.concat([df, pd.DataFrame(np.random.randn(10, 4), columns=list('BCDE'))],
               axis=1)
df.iloc[3, 3] = np.nan
df.iloc[0, 2] = np.nan

df

## Hide the `A` column

In the next block, write the code that hides the `A` column of our DataFrame. We want to retain the DataFrame's index column.  

In [None]:
# Write your code here



### Bonus 1

We have only hidden the `A` column; we have not eliminated it. You can see this by displaying the `df` variable in the next block. 

But what if we want to save our DataFrame without the `A` column? Write the code to save the DataFrame without the `A` column. Call this new DataFrame `df_no_A`. 

In [None]:
df

In [None]:
# Write your code here


## Highlight Values (Conditional Formatting)

In the next block, write the code to highlight the maximum value in each column in yellow. Use the `df` variable so we are using the full DataFrame we established for these exercises.  

In [None]:
# Write your code here.



### Bonus 2

As a bonus, let's highlight the minimum value in each column in yellow and the maximum value in each column in red. Do this in one line of code. 

In [None]:
# Write your code here. 



## Gradient Highlighting

In the next block, let's write the code to highlight the DataFrame's values in graduated shades of green.

In [None]:
# Write your code here. 



## Integration of Several Attributes

Now for the big challenge. Let's integrate several elements to make our DataFrame much more insightful.

In the next block, write the code to:
1. hide column `A`
1. draw an integrated bar chart where the positive values are yellow and the negative values are orange 
1. add a caption

(**Note**: This can be done in one line of code.)

In [1]:
# Write your code here.

