# 2-1 Data Visualization with Heatmaps

Heatmaps visualise data through variations in colouring.

When applied to tabular data, heatmaps are useful for cross-examining multivariate data: variables are placed in the rows and columns, and each cell of the table is coloured according to its value.

Heatmaps are good for showing variance across multiple variables, revealing patterns, showing whether any variables are similar to each other, and detecting whether any correlations exist between them.

# 2-1.0 Setup

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
%matplotlib inline

# 2-1.1 Reading data from a csv file

We're going to use the same cyclist dataset that we analyzed last week. The dataset records how many people were counted each day on 7 different bike paths in Montreal.

In [None]:
df = pd.read_csv('comptagevelo2012.csv', parse_dates=['Date'], dayfirst=True, index_col='Date')
del df['Unnamed: 1']
df.head()

Now everything looks just fine :-)

# 2-1.2 Appending month and weekday columns

In [None]:
df['Month'] = df.index.month
df['Weekday'] = df.index.weekday
df['Weekday_name'] = df.index.day_name()  # weekday_name was removed in pandas 1.0
df.head()
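To make the meaning of these columns concrete, here is a small sketch on a made-up three-day index (the `demo` frame and its `count` values are hypothetical, not part of the cyclist data). It assumes a recent pandas, where `day_name()` provides the weekday names, and it shows that `weekday` numbers the days from Monday = 0 to Sunday = 6.

```python
import pandas as pd

# Hypothetical three-day frame, starting on 1 January 2012 (a Sunday).
idx = pd.date_range("2012-01-01", periods=3, freq="D")
demo = pd.DataFrame({"count": [1, 2, 3]}, index=idx)

demo["Month"] = demo.index.month               # 1 = January ... 12 = December
demo["Weekday"] = demo.index.weekday           # 0 = Monday ... 6 = Sunday
demo["Weekday_name"] = demo.index.day_name()   # full English day names

print(demo)
```

Because 0 stands for Monday, a list of weekday labels used later for plotting should also start with Monday.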

# 2-1.3 Playing with pivot tables

A pivot table is a table that summarizes data in another table, and is made by applying an operation such as sorting, averaging, or summing to data in the first table, typically including grouping of the data.

In [None]:
pvt = df.pivot_table(index="Month", columns="Weekday", values="Berri1")
pvt
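As a rough illustration of what `pivot_table` does here, the sketch below builds a tiny made-up table (the `toy` frame and its numbers are hypothetical) and compares the pivot with an explicit group-and-average. When no `aggfunc` is given, `pivot_table` averages the values that fall into each cell.

```python
import pandas as pd

# Hypothetical toy data: two months, two weekdays, made-up counts.
toy = pd.DataFrame({
    "Month":   [1, 1, 1, 2, 2, 2],
    "Weekday": [0, 0, 1, 0, 1, 1],
    "Berri1":  [10, 20, 30, 40, 50, 70],
})

# pivot_table uses the mean as its default aggregation.
toy_pvt = toy.pivot_table(index="Month", columns="Weekday", values="Berri1")

# The same grid built by hand: group, average, then spread weekdays into columns.
toy_grp = toy.groupby(["Month", "Weekday"])["Berri1"].mean().unstack()

print(toy_pvt)
```

So each cell of `pvt` above is the average `Berri1` count for one month and weekday combination.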

# 2-1.4 Heatmap visualization

A heatmap is a natural way to visualize a data matrix.

In [None]:
sns.heatmap(pvt)

Customize your heatmap by adding more details: borders between cells, the value printed in each cell, and weekday names on the x-axis.

In [None]:
weekday_names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
sns.heatmap(pvt, linewidths=1, annot=True, fmt='.0f', xticklabels=weekday_names)

# 2-1.5 Standardization on your dataset

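Use a z-transform (z-score) to rescale the data to zero mean and unit variance. With `axis=1`, each row (month) is standardized across its weekdays, so the weekly pattern can be compared between busy and quiet months.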

In [None]:
from scipy.stats import zscore

In [None]:
zs = pd.DataFrame(zscore(pvt, axis=1), index=pvt.index, columns=weekday_names)
zs
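The row-wise z-score can also be written out by hand, which may help show what the numbers mean. The sketch below uses a made-up two-row table (the `toy` frame is hypothetical) and the population standard deviation (`ddof=0`), which is also the default of `scipy.stats.zscore`.

```python
import pandas as pd

# Hypothetical toy matrix: rows are months, columns are weekdays.
toy = pd.DataFrame(
    [[1.0, 2.0, 3.0], [10.0, 20.0, 60.0]],
    index=[1, 2],
    columns=["Mon", "Tue", "Wed"],
)

# Per-row mean and population standard deviation (ddof=0).
row_mean = toy.mean(axis=1)
row_std = toy.std(axis=1, ddof=0)

# z = (x - mean) / std, applied row by row.
toy_zs = toy.sub(row_mean, axis=0).div(row_std, axis=0)

print(toy_zs.round(2))
```

After this transform every row has mean 0 and standard deviation 1, so a positive value marks a weekday that is busier than that month's average and a negative value marks a quieter one.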

In [None]:
sns.heatmap(zs, linewidths=1, annot=True, fmt='.1f')

In [None]:
sns.heatmap(zs, linewidths=1, annot=True, fmt='.1f', center=0)

# 2-1.6 Questions

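* If the location is changed to Parc, can a similar pattern be observed?
  Yes, because the pattern on weekends and holidays is the same.
* Using Parc as the subject, repeat the same analysis and upload the z-standardized heatmap you plotted as an attachment to submit the assignment.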