___

<a href='https://github.com/ai-vithink'> <img src='https://avatars1.githubusercontent.com/u/41588940?s=200&v=4' /></a>
___

# Matrix Plots

Matrix plots allow you to plot data as color-encoded matrices and can also be used to indicate clusters within the data (later in the machine learning section we will learn how to formally cluster data).

Let's begin by exploring seaborn's heatmap and clustermap:

In [None]:
import seaborn as sns

In [None]:
from IPython.display import HTML
HTML('''<script>
code_show_err=false; 
function code_toggle_err() {
 if (code_show_err){
 $('div.output_stderr').hide();
 } else {
 $('div.output_stderr').show();
 }
 code_show_err = !code_show_err
} 
$( document ).ready(code_toggle_err);
</script>
To toggle on/off output_stderr, click <a href="javascript:code_toggle_err()">here</a>.''')
# To hide warnings, which won't change the desired outcome.

In [None]:
%%HTML
<style type="text/css">
table.dataframe td, table.dataframe th {
    border: 3px  black solid !important;
  color: black !important;
}
</style>
<!-- Adds visible borders (gridlines) to DataFrame tables. -->

In [None]:
import warnings
warnings.filterwarnings("ignore")

In [None]:
%matplotlib inline
sns.set_style('darkgrid')

In [None]:
tips = sns.load_dataset('tips')
flights = sns.load_dataset('flights')
tips.head()

In [None]:
flights.head()

## Heatmap

For a heatmap to work properly, your data should already be in matrix form; the sns.heatmap function basically just colors it in for you.
* Matrix form means that both the index (row) labels and the column labels are meaningful, so that each cell value relates to both its row label and its column label.
* Right now, in tips, the columns are variables (for example total_bill, whose first value is 16.99 dollars), but the rows are just observation numbers, not a variable. To get tips into matrix form we need variables on both the rows and the columns.
* We can do this in several ways, such as building a pivot table or computing correlations.
* We will compute the correlations first:

In [None]:
tc = tips.corr(numeric_only=True)  # numeric_only=True skips the categorical columns (sex, smoker, day, time); needs pandas >= 1.5.
# tc is now in matrix form: the rows and the columns are both labelled with variables,
# so each cell holds the correlation between its row variable and its column variable.
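
To make "matrix form" concrete without loading the full dataset, here is a minimal sketch on a tiny hand-built frame. The frame `small` and its few rows are illustrative assumptions, not the real tips data; the point is only that a correlation matrix carries the same variable names on its rows and its columns.

```python
import pandas as pd

# Illustrative stand-in for tips: a few numeric columns only (assumed values).
small = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68],
    "tip": [1.01, 1.66, 3.50, 3.31],
    "size": [2, 3, 3, 2],
})

corr = small.corr()

# Rows and columns are labelled with the same variables.
print(list(corr.index))
print(list(corr.columns))
print(corr)
```

Each variable correlates perfectly with itself, so the diagonal is 1, and the matrix is symmetric about that diagonal.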

In [None]:
sns.heatmap(tc, annot=True, cmap='coolwarm')
# A heatmap colours each cell according to its value on a gradient scale.
# annot=True prints the numbers in the cells; cmap selects the colourmap you prefer.

In [None]:
# We have year, month and passengers columns, and we need to get them into matrix form.
flights.head()

In [None]:
flights.pivot_table(index='month', columns='year', values='passengers')
# index gives the row labels, columns gives the column labels,
# and values fills the cell at each row/column intersection.
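
The reshaping from long form to matrix form can be seen on a very small example. This is a sketch with made-up numbers (the frame `long_df` is hypothetical, not the flights data), using the same `pivot_table` arguments as above.

```python
import pandas as pd

# Hypothetical long-form data: one row per (year, month) observation.
long_df = pd.DataFrame({
    "year": [2000, 2000, 2001, 2001],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "passengers": [10, 20, 30, 40],
})

# Months become the rows, years become the columns,
# and passengers fills each (month, year) cell.
matrix = long_df.pivot_table(index="month", columns="year", values="passengers")
print(matrix)
```

If several rows share the same month and year, `pivot_table` aggregates them (by default with the mean), which is why it is safe to use even when the long data has duplicates.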

In [None]:
fp = flights.pivot_table(index='month',columns='year',values='passengers')
sns.heatmap(fp,cmap='magma',linecolor='white',linewidths=1)
#  cmap schemes : magma, coolwarm, spring

In [None]:
# !jt -r
# !jt -t monokai -T -N -kl

## Clustermap

The clustermap uses hierarchical clustering to produce a clustered version of the heatmap. For example:

In [None]:
sns.clustermap(fp,cmap='coolwarm')
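
To get a feel for why a clustermap reorders rows, here is a rough sketch that does the row clustering by hand with SciPy. It assumes average linkage on Euclidean distances, which to my knowledge are seaborn's defaults, and it uses a made-up frame `toy` rather than the flights data.

```python
import pandas as pd
from scipy.cluster.hierarchy import linkage, leaves_list

# Made-up data: r0 and r2 are similar, r1 and r3 are similar.
toy = pd.DataFrame(
    {"a": [1.0, 10.0, 1.2, 9.8], "b": [2.0, 20.0, 2.1, 19.5]},
    index=["r0", "r1", "r2", "r3"],
)

# Build the hierarchy over rows (assumed to mirror clustermap's defaults).
Z = linkage(toy.values, method="average", metric="euclidean")

# leaves_list gives the left-to-right leaf order of the dendrogram.
order = leaves_list(Z)
reordered = toy.iloc[order]
print(reordered)
```

Similar rows end up next to each other in `reordered`, which is the same effect that moves months and years out of their natural order in the plot.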

* A clustermap tries to group rows and columns together based on their similarity.
* Notice that the months on the y-axis are no longer in calendar order, because similar months have been placed next to each other.
* Some years are also out of order. For example, 1959 and 1960 are similar to each other, and so are July and August.
* For clarity, try changing cmap.
* We can also standardize the scale. In the clustermap above, the colour scale covers the raw passenger counts, up to roughly 600. Passing standard_scale=1 rescales each column (here, each year) to the range 0 to 1.

In [None]:
sns.clustermap(fp,cmap='coolwarm',standard_scale=1)
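
As a rough illustration of what standard_scale=1 does, the sketch below applies min-max scaling to each column by hand. This is my reading of the seaborn documentation (subtract each column's minimum, then divide by its range), not seaborn's own code, and the frame `counts` holds illustrative values only.

```python
import pandas as pd

# Illustrative passenger counts: rows are months, columns are years.
counts = pd.DataFrame(
    {1949: [112, 148, 104], 1960: [417, 606, 390]},
    index=["Jan", "Aug", "Nov"],
)

# Min-max scale each column so it runs from 0 (lowest month) to 1 (highest month).
scaled = (counts - counts.min()) / (counts.max() - counts.min())
print(scaled)
```

Because every year is scaled separately, the busy later years no longer dominate the colours, and the seasonal pattern within each year becomes easier to compare.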

* After normalizing, we can see that the winter months (such as January, February and November) have relatively few passengers, while the busiest months are in summer.
* Check out the clustermap documentation if you are curious about what is actually happening here.
* You may want to wait until we reach the machine learning clustering algorithms to understand and appreciate the mathematics behind the method.
* For now, treat it as an interesting way to get more information than you would from a plain heatmap.