# Visualization with Matplotlib and Pandas
## Recipes

* [Getting started with matplotlib](#Getting-started-with-matplotlib)
* [Plotting basics with pandas](#Plotting-basics-with-pandas)
* [Visualizing the flights dataset](#Visualizing-the-flights-dataset)

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

# (1) Getting started with matplotlib

## Getting Ready: Hierarchy of Matplotlib Objects


<img align = 'left' src="pic/structure.png" alt="insert" width="400"/>

- "Matplotlib uses a hierarchy of objects to display all of its plotting items in the output. This hierarchy is key to understanding everything about matplotlib. The Figure and Axes objects are the two main components of the hierarchy."
- "The Figure object is at the top of the hierarchy. It is the container for everything that will be plotted."
- "Contained within the Figure is one or more Axes object(s). The Axes is the primary object that you will interact with when using matplotlib and can be more commonly thought of as the actual plotting surface. The Axes contains the x/y axis, points, lines, markers, labels, legends, and any other useful item that is plotted."
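
The containment described above can be checked directly. This is a minimal sketch, not part of the original recipe; the names `demo_fig`, `ax_left`, and `ax_right` are chosen here purely for illustration.

```python
import matplotlib.pyplot as plt

# One Figure (the container) holding two Axes (the plotting surfaces).
# demo_fig, ax_left and ax_right are illustrative names, not from the recipe.
demo_fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))

# Every Axes created by plt.subplots is registered on its parent Figure.
print(len(demo_fig.axes))
print(demo_fig.axes[0] is ax_left)

# Each Axes keeps a reference back to the Figure that contains it.
print(ax_right.figure is demo_fig)

# Each Axes owns an x-axis and a y-axis object that manage ticks and labels.
print(type(ax_left.xaxis).__name__, type(ax_left.yaxis).__name__)
```

The Figure never draws data itself; all plotting calls go to one of its Axes.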

### MATLAB-like stateful interface

In [None]:
x = [-3, 5, 7]
y = [10, 2, 5]

plt.figure(figsize=(15,3))
plt.plot(x, y)
plt.xlim(-4, 8)
plt.ylim(0, 11)
plt.xlabel('X axis')
plt.ylabel('Y axis')
plt.title('Line Plot')
plt.suptitle('Figure Title', size=20, y=1.03)

### Object-oriented interface

In [None]:
x = [-3, 5, 7]
y = [10, 2, 5]

fig, ax = plt.subplots(figsize=(15,3))
ax.plot(x, y)
ax.set_xlim(-4, 8)
ax.set_ylim(0, 11)
ax.set_xlabel('X axis')
ax.set_ylabel('Y axis')
ax.set_title('Line Plot')
fig.suptitle('Figure Title', size=20, y=1.03)
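
The two interfaces act on the same underlying objects. As a hedged sketch (the names `cur_fig` and `cur_ax` are illustrative), `plt.gcf()` and `plt.gca()` return the "current" Figure and Axes that the stateful `plt.*` calls modify implicitly, so a plot started with the stateful interface can be finished with the object-oriented one.

```python
import matplotlib.pyplot as plt

# Start with the stateful interface: pyplot tracks a "current" Figure and Axes.
plt.figure(figsize=(6, 2))
plt.plot([-3, 5, 7], [10, 2, 5])

# Retrieve the objects that the plt.* calls above were acting on.
cur_fig = plt.gcf()   # get current Figure
cur_ax = plt.gca()    # get current Axes

# Continue with the object-oriented interface on the same Axes.
cur_ax.set_title('Line Plot')
print(cur_fig.axes[0] is cur_ax)
```

The object-oriented form is usually easier to follow once a Figure holds more than one Axes, because each call names the Axes it changes.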

In [None]:
type(fig)

In [None]:
type(ax)

In [None]:
dir(fig)

In [None]:
fig.axes

In [None]:
fig.axes[0] is ax

In [None]:
plot_objects = plt.subplots()

In [None]:
type(plot_objects)

In [None]:
len(plot_objects)

In [None]:
fig = plot_objects[0]
ax = plot_objects[1]

In [None]:
plot_objects = plt.subplots(2, 4, figsize=(14, 4))

In [None]:
plot_objects[0]

In [None]:
plot_objects[1]

In [None]:
plot_objects[1][0][1]
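
When more than one row and column are requested, the second item returned by `plt.subplots` is a NumPy array of Axes, which is why `plot_objects[1][0][1]` works. The sketch below (with illustrative names `grid_fig` and `axes`) unpacks the tuple directly and uses array-style indexing, which is arguably easier to read.

```python
import numpy as np
import matplotlib.pyplot as plt

# Unpack the (Figure, array-of-Axes) tuple in one step.
grid_fig, axes = plt.subplots(2, 4, figsize=(14, 4))

# The Axes come back as a 2-D NumPy array matching the requested grid.
print(type(axes).__name__, axes.shape)

# Row 0, column 1: the same object as plot_objects[1][0][1] in the cell above
# would be for its own grid.
top_second = axes[0, 1]
print(top_second is axes[0][1])

# axes.flat iterates over every Axes in row-major order.
for i, ax in enumerate(axes.flat):
    ax.set_title(f'Axes {i}')
```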

# (2) Plotting basics with pandas

In [None]:
from IPython.display import HTML
HTML('<iframe src="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.html" width="950" height="700"></iframe>')

In [None]:
df = pd.DataFrame(
    {'Apples': [20, 10, 40, 20, 50], 'Oranges': [35, 40, 25, 19, 33]},
    index=['Atiya', 'Abbas', 'Cornelia', 'Stephanie', 'Monte'],
)
df

In [None]:
df.plot(kind='bar')

In [None]:
df.transpose().plot(kind='bar')

In [None]:
df.transpose().plot(kind='bar', figsize=(16,4))

In [None]:
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(16,4))
fig.suptitle('Two Variable Plots', size=20, y=1.02)
df.plot(kind='line', ax=ax1, title='Line plot')
df.plot(kind='bar', ax=ax2, title='Bar plot')
df.plot(x='Apples', y='Oranges', kind='scatter', ax=ax3, title='Scatterplot')

In [None]:
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(16,4))
fig.suptitle('One Variable Plots', size=20, y=1.02)
df.plot(kind='kde', ax=ax1, title='KDE plot')
df.plot(kind='box', ax=ax2, title='Boxplot')
df.plot(kind='hist', ax=ax3, title='Histogram')

In [None]:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16,4))
fig.suptitle('Histograms', size=20, y=1.02)
df.loc[:, 'Apples'].plot(kind='hist', ax=ax1, color='lime', title='Histogram: Apples')
df.loc[:, 'Oranges'].plot(kind='hist', ax=ax2, color='tab:orange', title='Histogram: Oranges')

<img align = 'left' src="pic/color.png" alt="insert" width="900"/>

# (3) Visualizing the flights dataset

In [None]:
flights = pd.read_csv('data/flights.csv')
flights

In [None]:
ac = flights['AIRLINE'].value_counts()
ac

In [None]:
ac.plot(kind='barh', title='Airline')

In [None]:
oc = flights['ORG_AIR'].value_counts()
oc

In [None]:
oc.plot(kind='bar', title='Origin City')

In [None]:
# rot takes an angle in degrees; rot=0 keeps the tick labels horizontal
oc.plot(kind='bar', rot=0, title='Origin City')

In [None]:
dc = flights['DEST_AIR'].value_counts()
dc

In [None]:
dc.plot(kind='bar', title='Destination City')

In [None]:
dc = flights['DEST_AIR'].value_counts().head(10)
dc.plot(kind='bar', title='Destination City')

In [None]:
dc.plot(kind='bar', rot=0, title='Destination City')

In [None]:
flights

In [None]:
flights.info()

In [None]:
flights['DELAYED'] = flights['ARR_DELAY'].ge(15).astype('int64')

In [None]:
flights.columns

In [None]:
flights[['DIVERTED', 'CANCELLED', 'DELAYED']]

In [None]:
flights[['DIVERTED', 'CANCELLED', 'DELAYED']].any()

In [None]:
flights[['DIVERTED', 'CANCELLED', 'DELAYED']].any(axis='columns')

In [None]:
1 - flights[['DIVERTED', 'CANCELLED', 'DELAYED']].any(axis='columns')

In [None]:
flights['ON_TIME'] = 1 - flights[['DIVERTED', 'CANCELLED', 'DELAYED']].any(axis=1)
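
To see what the `DELAYED` and `ON_TIME` steps compute, here is a small sketch on made-up data. The `toy` frame and its values are hypothetical and only mirror the column names used above; `(~problem).astype('int64')` is used here as an explicit equivalent of `1 - problem`.

```python
import pandas as pd

# Hypothetical four-flight sample; values are invented for illustration.
toy = pd.DataFrame({
    'DIVERTED':  [0, 1, 0, 0],
    'CANCELLED': [0, 0, 1, 0],
    'ARR_DELAY': [5.0, None, None, 30.0],  # missing for diverted/cancelled
})

# A flight counts as delayed when it arrives 15 or more minutes late.
# Comparisons against missing values are False, so those rows get 0.
toy['DELAYED'] = toy['ARR_DELAY'].ge(15).astype('int64')

# True for any row where at least one of the three flags is non-zero.
problem = toy[['DIVERTED', 'CANCELLED', 'DELAYED']].any(axis='columns')

# On time means none of the three flags is set.
toy['ON_TIME'] = (~problem).astype('int64')
print(toy)
```

Because each flight falls into at most one flag in this toy sample, the four status columns sum to the number of rows.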

In [None]:
flights

In [None]:
flights[['DIVERTED', 'CANCELLED', 'DELAYED', 'ON_TIME']]

In [None]:
flights[['DIVERTED', 'CANCELLED', 'DELAYED', 'ON_TIME']].sum()

In [None]:
status = flights[['DIVERTED', 'CANCELLED', 'DELAYED', 'ON_TIME']].sum()
status

In [None]:
status.plot(kind='bar', title='Flight Status')

In [None]:
status.plot(kind='bar', rot=0, title='Flight Status')

In [None]:
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(18,8))

fig.suptitle('US Flights - Univariate Summary', size=20)

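# Top-left: number of flights per airline (horizontal bars)
ac = flights['AIRLINE'].value_counts()
ac.plot(kind='barh', ax=ax1, rot=0, title='Airline')

# Top-right: number of flights per origin airport
oc = flights['ORG_AIR'].value_counts()
oc.plot(kind='bar', ax=ax2, rot=0, title='Origin City')

# Bottom-left: the ten most frequent destination airports
dc = flights['DEST_AIR'].value_counts().head(10)
dc.plot(kind='bar', ax=ax3, rot=0, title='Destination City')

# Bottom-right: flight status totals computed earlier
status.plot(kind='bar', ax=ax4, rot=0, title='Flight Status')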
