## 🕶 All You Need is Time-Series Visualization

![](https://images.unsplash.com/photo-1518186285589-2f7649de83e0?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=1267&q=80)

Example code and explanations of more than 20 techniques for visualizing time series data.

I will continue to update the content further.

### Table of Contents

- **1. Bar**
    - single 
        - vertical [1]
        - horizontal [2]
    - multiple
        - subplots [3]
        - overlapped [4]
        - stacked 
            - amount [5]
            - ratio [6]
        - adjacent [7]
        
    
- **2. Line**
    - single 
        - line [8]
        - area [9]
        - step [10]
        - step area [11]
    - multiple 
        - subplots [12]
        - horizon chart [13]
        - overlapped [14]
        - overlapped area [15]
        - stacked
            - amount [16]
            - streamgraph [17]
            - ratio [18]
        

In [None]:
!pip install seaborn==0.11.0

In [None]:
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

In [None]:
sns.color_palette(["#00798c", "#d1495b", '#edae49', '#66a182'])

In [None]:
pd.options.display.max_columns = 999

In [None]:
data = pd.read_csv('/kaggle/input/house-prices-advanced-regression-techniques/train.csv')
data.head()

In [None]:
data.describe(include='O')

## 1. Bar

### Bar > Single > Vertical 

The most basic is a **bar** graph. 

Bars make it easy to compare absolute sizes, and ratios are easy to judge against neighboring bars.

In [None]:
built = data['YearBuilt'].value_counts().sort_index()
fig, ax = plt.subplots(1, 1, figsize=(18, 5))
color = ['#4a4a4a' if val != max(built) else '#e3120b' for val in built]
ax.bar(built.index, built, color=color)

for s in ['top', 'right']:
    ax.spines[s].set_visible(False)

ax.grid()

plt.show()

### Bar > Single > Horizontal

On the web and in print, pages are usually taller than they are wide.

If there are many bars, another option is to swap the axes and draw the bars horizontally.

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(6, 12))
ax.barh(built.index, built, color=color)
ax.grid()

plt.show()

### Bar > Multiple > Subplots

The first way to compare several series over the same time period is to draw one subplot per series.

In [None]:
data['HouseStyle'].value_counts()

In [None]:
data['HouseStyle'] = data['HouseStyle'].apply(lambda x : 'ETC' if x in ['SLvl', 'SFoyer', '1.5Unf', '2.5Unf', '2.5Fin'] else x)

In [None]:
fig, ax = plt.subplots(4, 1, figsize=(20, 12), sharex=True)
color = ["#00798c", "#d1495b", '#edae49', '#66a182']

for i, hs in enumerate(data['HouseStyle'].value_counts().index):
    hs_built = data[data['HouseStyle']==hs]['YearBuilt'].value_counts()
    ax[i].bar(hs_built.index, hs_built, color=color[i], label=hs)
    ax[i].set_ylim(0, 50)
    ax[i].legend(loc='upper left')
    for s in ['top', 'right']:
        ax[i].spines[s].set_visible(False)



plt.show()

### Bar > Multiple > Overlapped

If you draw multiple graphs, absolute comparison is difficult, so you can draw them on one graph at the same time.

If you draw at the same time, you have to deal with the overlap.

The first is to adjust the transparency so that the overlapping part is visible.

Comparisons are much easier.

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(18, 5))
color = ["#00798c", "#d1495b", '#edae49', '#66a182']

for i, hs in enumerate(data['HouseStyle'].value_counts().index):
    hs_built = data[data['HouseStyle']==hs]['YearBuilt'].value_counts()
    ax.bar(hs_built.index, hs_built, color=color[i], label=hs, alpha=0.4, edgecolor=color[i])
    
for s in ['top', 'right']:
    ax.spines[s].set_visible(False)

ax.set_ylim(0, 50)
ax.legend(loc='upper left')
plt.show()

### Bar > Multiple > Stacked > Amount

Another way to draw at the same time is to stack the bars.

In [None]:
data_sub = data.groupby('HouseStyle')['YearBuilt'].value_counts().unstack().fillna(0).loc[['ETC','1.5Fin','2Story', '1Story']].cumsum(axis=0).T
data_sub

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(18, 5))
color = ["#00798c", "#d1495b", '#edae49', '#66a182']

for i, hs in enumerate(data['HouseStyle'].value_counts().index):
    hs_built = data_sub[hs]
    ax.bar(hs_built.index, hs_built, color=color[i], label=hs)
    
for s in ['top', 'right']:
    ax.spines[s].set_visible(False)

ax.legend(loc='upper left')
ax.grid()
plt.show()
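
The cell above stacks by drawing cumulative totals from tallest to shortest, so each shorter bar covers the lower part of the taller one. As an alternative, here is a minimal sketch that uses the `bottom` argument of `ax.bar` to place each series on top of a running total, which does not depend on the drawing order. The `counts` table is made-up toy data, not the house-price dataset.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Toy counts per year and category (made-up values for illustration)
counts = pd.DataFrame(
    {'A': [3, 5, 2], 'B': [1, 4, 6], 'C': [2, 0, 3]},
    index=[2000, 2001, 2002],
)

fig, ax = plt.subplots(1, 1, figsize=(8, 4))
bottom = np.zeros(len(counts))  # running total under the next series

for col in counts.columns:
    ax.bar(counts.index, counts[col], bottom=bottom, label=col)
    bottom += counts[col].to_numpy()

ax.legend(loc='upper left')
plt.show()
```

After the loop, `bottom` holds the total height of each stacked bar.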

### Bar > Multiple > Stacked > Ratio

You can also visualize how the share of each category changes over time.

In [None]:
data_sub = data.groupby('HouseStyle')['YearBuilt'].value_counts().unstack().fillna(0).loc[['ETC','1.5Fin','2Story', '1Story']].T
data_sum = data_sub.sum(axis=1)
data_sub = (data_sub.T / data_sum).cumsum().T

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(18, 5))
color = ["#00798c", "#d1495b", '#edae49', '#66a182']

for i, hs in enumerate(data['HouseStyle'].value_counts().index):
    hs_built = data_sub[hs]
    ax.bar(hs_built.index, hs_built, color=color[i], label=hs)
    
for s in ['top', 'right']:
    ax.spines[s].set_visible(False)

ax.legend(loc='upper left')
ax.grid()
plt.show()
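
The normalization step is the core of the ratio chart: each year's counts are divided by that year's total. The following sketch shows the same arithmetic with `DataFrame.div` on a small made-up table (the column names `A`, `B`, `C` are placeholders).

```python
import numpy as np
import pandas as pd

# Toy counts per year and category (made-up values for illustration)
counts = pd.DataFrame(
    {'A': [3, 5, 2], 'B': [1, 4, 6], 'C': [2, 1, 2]},
    index=[2000, 2001, 2002],
)

# Divide each row by its total so every year sums to 1
ratio = counts.div(counts.sum(axis=1), axis=0)

# Cumulative share across categories; the last column is 1.0 for every year
ratio_cum = ratio.cumsum(axis=1)
print(ratio_cum)
```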

### Bar > Multiple > Adjacent

You can also draw using neighboring bars for a specific time period. 

However, this is not a good idea for time series, and line charts are more efficient than doing this.

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(20, 5))
color = ["#00798c", "#d1495b", '#edae49', '#66a182']

width = 0.25
for i, hs in enumerate(data['HouseStyle'].value_counts().index):
    hs_built = data[(data['HouseStyle']==hs)&(data['YearBuilt']>1980)]['YearBuilt'].value_counts()
    # Offset by (i - 1.5) so the four bars are centered on each year
    ax.bar(hs_built.index+(width*(i-1.5)), hs_built, width, color=color[i], label=hs)
    
for s in ['top', 'right']:
    ax.spines[s].set_visible(False)

ax.set_ylim(0, 50)
ax.legend(loc='upper left')
plt.show()
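
Adjacent bars are positioned by shifting each series by a multiple of the bar width. A small sketch of that arithmetic, assuming the group should be centered on the tick (`group_width` is an illustrative value, not taken from the cell above):

```python
import numpy as np

n_series = 4          # number of bars per year
group_width = 0.8     # total width of one group of bars (assumed value)
width = group_width / n_series

# Offsets of each bar center relative to the year tick, symmetric around 0
offsets = (np.arange(n_series) - (n_series - 1) / 2) * width
print(offsets)
```

Choosing a `group_width` below 1 leaves a gap between the groups of neighboring years.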

## 2. Line

### Line > Single > Normal

For time series, line graphs are more efficient than bar graphs.

It shows trends and emphasizes continuity.

In [None]:
built = data['YearBuilt'].value_counts().sort_index()
fig, ax = plt.subplots(1, 1, figsize=(18, 5))

ax.plot(built.index, built, color='#4a4a4a')

for s in ['top', 'right']:
    ax.spines[s].set_visible(False)

ax.grid()

plt.show()

### Line > Single > Area

It is also a good idea to plot the area based on what the time series represents.

Area can represent trends and quantities.

In [None]:
built = data['YearBuilt'].value_counts().sort_index()
fig, ax = plt.subplots(1, 1, figsize=(18, 5))
ax.plot(built.index, built, color='#4a4a4a')

ax.fill_between(built.index, 0, built, color='#4a4a4a')

for s in ['top', 'right']:
    ax.spines[s].set_visible(False)

ax.grid()

plt.show()

### Line > Single > Step

A line graph can also be drawn as steps.

A step line makes absolute quantities easier to compare and avoids the misleading impression of gradual change that sloped segments can give.

In [None]:
built = data['YearBuilt'].value_counts().sort_index()
fig, ax = plt.subplots(1, 1, figsize=(18, 5))
ax.step(built.index, built, color='#4a4a4a')

for s in ['top', 'right']:
    ax.spines[s].set_visible(False)

ax.grid()

plt.show()

### Line > Single > Step Area 

You can plot this as an area graph.

In [None]:
built = data['YearBuilt'].value_counts().sort_index()
fig, ax = plt.subplots(1, 1, figsize=(18, 5))
ax.step(built.index, built, color='#4a4a4a')

ax.fill_between(built.index, 0, built, color='#4a4a4a', step='pre')

for s in ['top', 'right']:
    ax.spines[s].set_visible(False)

ax.grid()

plt.show()

Looks like a row of buildings, doesn't it?
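
When combining a step line with a filled area, the step rule of the line (`where`) and of the fill (`step`) should match, otherwise the fill does not follow the outline. A minimal sketch with made-up values, using `'post'` for both:

```python
import matplotlib.pyplot as plt

x = [2000, 2001, 2002, 2003]
y = [3, 5, 2, 4]  # made-up values

fig, ax = plt.subplots(1, 1, figsize=(8, 4))

# where='post': each value holds from its own x until the next x
line, = ax.step(x, y, where='post', color='#4a4a4a')

# Use the same rule for the fill so the area follows the outline
ax.fill_between(x, 0, y, step='post', color='#4a4a4a', alpha=0.4)

plt.show()
```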

### Line > Multiple > Subplots

When drawing multiple graphs at the same time, pay attention to the axis scale.

In [None]:
fig, ax = plt.subplots(4, 1, figsize=(20, 12), sharex=True)
color = ["#00798c", "#d1495b", '#edae49', '#66a182']

for i, hs in enumerate(data['HouseStyle'].value_counts().index):
    hs_built = data[data['HouseStyle']==hs]['YearBuilt'].value_counts().sort_index()
    ax[i].plot(hs_built.index, hs_built, color=color[i], label=hs)
    ax[i].set_ylim(0, 50)
    ax[i].legend(loc='upper left')
    for s in ['top', 'right']:
        ax[i].spines[s].set_visible(False)



plt.show()
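
The cell above fixes the scale by hand with `set_ylim(0, 50)`. Another option, sketched here with made-up series, is to pass `sharey=True` so that matplotlib keeps one common y-range across all panels:

```python
import numpy as np
import matplotlib.pyplot as plt

years = np.arange(2000, 2010)
# Toy series with very different magnitudes (made-up values)
series = {'small': np.arange(10), 'large': np.arange(10) * 5}

# sharey=True forces one common y-range across all panels
fig, axes = plt.subplots(2, 1, figsize=(8, 5), sharex=True, sharey=True)

for ax, (name, values) in zip(axes, series.items()):
    ax.plot(years, values, label=name)
    ax.legend(loc='upper left')

plt.show()
```

A shared scale keeps absolute comparisons honest, at the cost of flattening the smaller series.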

### Line > Multiple > Horizon Chart

If you draw these as area graphs and remove the space between the subplots, you get a compact view of several time series at once.

In [None]:
fig, ax = plt.subplots(4, 1, figsize=(20, 12), sharex=True)
color = ["#00798c", "#d1495b", '#edae49', '#66a182']

for i, hs in enumerate(data['HouseStyle'].value_counts().index):
    hs_built = data[data['HouseStyle']==hs]['YearBuilt'].value_counts().sort_index()
    ax[i].plot(hs_built.index, hs_built, color=color[i], label=hs)
    ax[i].fill_between(hs_built.index, 0, hs_built, color=color[i])
    ax[i].set_ylim(0, 30)
    ax[i].legend(loc='upper left')

plt.subplots_adjust(hspace=0)
plt.show()
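
Horizon charts are often described as slicing each series into equal-height bands and overlaying the bands in progressively darker shades, so that a short panel can still show large values. The cell above keeps a single band per series. The following is only a rough sketch of the banding step with made-up values; `n_bands` and `band_height` are illustrative names.

```python
import numpy as np

y = np.array([2.0, 7.0, 12.0, 5.0, 0.0])  # made-up values
n_bands = 3
band_height = 5.0  # chosen so that n_bands * band_height >= y.max()

# Band k holds the part of y that lies between k*band_height and (k+1)*band_height
bands = [np.clip(y - k * band_height, 0, band_height) for k in range(n_bands)]

for k, band in enumerate(bands):
    print(k, band)
```

Each band could then be drawn with `fill_between` on the same axes, using a darker color for higher bands.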

### Line > Multiple > Overlapped

Unlike bar charts, drawing at the same time is very efficient. 

Since it is a line, there is no need to handle most of the overlapping parts.

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(18, 6))
color = ["#00798c", "#d1495b", '#edae49', '#66a182']

for i, hs in enumerate(data['HouseStyle'].value_counts().index):
    hs_built = data[data['HouseStyle']==hs]['YearBuilt'].value_counts().sort_index()
    ax.plot(hs_built.index, hs_built, color=color[i], label=hs)

ax.set_ylim(0, 50)
ax.legend(loc='upper left')
for s in ['top', 'right']:
    ax.spines[s].set_visible(False)

ax.grid()

plt.show()

You can also customize the line style.

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(18, 6))
color = ["#00798c", "#d1495b", '#edae49', '#66a182']
linestyles = ['-', '--', '-.', ':']

for i, hs in enumerate(data['HouseStyle'].value_counts().index):
    hs_built = data[data['HouseStyle']==hs]['YearBuilt'].value_counts().sort_index()
    ax.plot(hs_built.index, hs_built, color=color[i], linestyle=linestyles[i], label=hs)

ax.set_ylim(0, 50)
ax.legend(loc='upper left')
for s in ['top', 'right']:
    ax.spines[s].set_visible(False)

ax.grid()

plt.show()

### Line > Multiple > Overlapped Area

You can plot this as an area graph.

This is effective because a line alone does not convey total quantity well.

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(18, 6))
color = ["#00798c", "#d1495b", '#edae49', '#66a182']

for i, hs in enumerate(data['HouseStyle'].value_counts().index):
    hs_built = data[data['HouseStyle']==hs]['YearBuilt'].value_counts().sort_index()
    ax.plot(hs_built.index, hs_built, color=color[i], label=hs)
    ax.fill_between(hs_built.index, 0, hs_built, color=color[i], alpha=0.4)

ax.set_ylim(0, 50)
ax.legend(loc='upper left')
for s in ['top', 'right']:
    ax.spines[s].set_visible(False)

ax.grid()

plt.show()

### Line > Multiple > Stacked > Amount

Stacking is also possible.

In [None]:
data_sub = data.groupby('HouseStyle')['YearBuilt'].value_counts().unstack().fillna(0).loc[['ETC','1.5Fin','2Story', '1Story']].cumsum(axis=0).T

fig, ax = plt.subplots(1, 1, figsize=(18, 5))
color = ["#00798c", "#d1495b", '#edae49', '#66a182']

for i, hs in enumerate(data['HouseStyle'].value_counts().index):
    hs_built = data_sub[hs]
    ax.fill_between(hs_built.index, 0, hs_built, color=color[i], label=hs)
    
for s in ['top', 'right']:
    ax.spines[s].set_visible(False)

ax.legend(loc='upper left')
ax.grid()
plt.show()
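
The cell above builds the stack manually from cumulative sums. Matplotlib also provides `ax.stackplot`, which accumulates the series itself. A minimal sketch with made-up counts:

```python
import matplotlib.pyplot as plt

years = [2000, 2001, 2002]
# Toy per-category counts (made-up values), not cumulative
a = [3, 5, 2]
b = [1, 4, 6]
c = [2, 0, 3]

fig, ax = plt.subplots(1, 1, figsize=(8, 4))

# stackplot accumulates the series itself, so no manual cumsum is needed
polys = ax.stackplot(years, a, b, c, labels=['A', 'B', 'C'])

ax.legend(loc='upper left')
plt.show()
```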

### Line > Multiple > Stacked > Stream graph

A streamgraph, or stream graph, is a type of stacked area graph which is displaced around a central axis, resulting in a flowing, organic shape.

In [None]:
data_sub = data.groupby('HouseStyle')['YearBuilt'].value_counts().unstack().fillna(0).loc[['ETC','1.5Fin','2Story', '1Story']].cumsum(axis=0).T
data_sub.insert(0, "base", np.zeros(len(data_sub)))


data_sub = data_sub.add(-data['YearBuilt'].value_counts()/2, axis=0)
fig, ax = plt.subplots(1, 1, figsize=(18, 5))
color = ["#00798c", "#d1495b", '#edae49', '#66a182'][::-1]
hs_list = data_sub.columns


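for i, hs in enumerate(hs_list):
    if i == 0:
        continue  # 'base' is only the lower boundary of the first band
    # Use data_sub's own index and pass a label so the legend has entries
    ax.fill_between(data_sub.index, data_sub.iloc[:,i-1], data_sub.iloc[:,i], color=color[i-1], label=hs)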
    
for s in ['top', 'right', 'bottom', 'left']:
    ax.spines[s].set_visible(False)

ax.set_yticks([])
ax.legend(loc='upper left')
ax.grid(axis='x')
plt.show()
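
The cell above shifts the whole stack down by half of each year's total so that it is centered on zero. As a sketch under the assumption that a symmetric baseline is what is wanted, `ax.stackplot` offers the same idea through `baseline='sym'`. The counts below are made up.

```python
import numpy as np
import matplotlib.pyplot as plt

years = [2000, 2001, 2002]
counts = np.array([[3, 5, 2],    # category A (made-up values)
                   [1, 4, 6],    # category B
                   [2, 1, 3]], dtype=float)  # category C

# Manual version of the idea: start the stack at minus half the yearly total
total = counts.sum(axis=0)
lower = -total / 2
upper = lower + total

fig, ax = plt.subplots(1, 1, figsize=(8, 4))

# baseline='sym' asks stackplot for a stack that is symmetric around zero
polys = ax.stackplot(years, counts, baseline='sym', labels=['A', 'B', 'C'])

ax.set_yticks([])
ax.legend(loc='upper left')
plt.show()
```

`stackplot` also accepts `baseline='wiggle'`, which chooses the baseline to reduce the overall slope of the layers instead of centering them.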

### Line > Multiple > Stacked > Ratio

Stacking by proportion is another well-known technique.

In [None]:
data_sub = data.groupby('HouseStyle')['YearBuilt'].value_counts().unstack().fillna(0).loc[['ETC','1.5Fin','2Story', '1Story']].T
data_sum = data_sub.sum(axis=1)
data_sub = (data_sub.T / data_sum).cumsum().T

fig, ax = plt.subplots(1, 1, figsize=(18, 5))
color = ["#00798c", "#d1495b", '#edae49', '#66a182']

for i, hs in enumerate(data['HouseStyle'].value_counts().index):
    hs_built = data_sub[hs]
    # Area only: the label moves to fill_between so the legend still has entries
    ax.fill_between(hs_built.index, 0, hs_built, color=color[i], label=hs)
    
ax.legend(loc='upper left')
ax.grid()
ax.set_ylim(0, 1)
ax.set_xlim(1872, 2010)
plt.show()

## Related Work

- [🕶 Awesome Visualization with Titanic Dataset📊](https://www.kaggle.com/subinium/awesome-visualization-with-titanic-dataset)
- [Tips for making the Right Visualization](https://www.kaggle.com/subinium/tips-for-making-the-right-visualization)
- [Simple Matplotlib & Visualization Tips 💡](https://www.kaggle.com/subinium/simple-matplotlib-visualization-tips)
- [🛣️ Road to Viz Expert (1) - Unusual tools](https://www.kaggle.com/subinium/road-to-viz-expert-1-unusual-tools)