# ===========================================================
# 09 Plotting

# Objectives
- Create a time series plot showing a single data set.
- Create a scatter plot showing the relationship between two data sets.

## Matplotlib
- [`matplotlib`](https://matplotlib.org/) is a widely used scientific plotting library in Python.
- The plotting methods in Pandas are built on top of Matplotlib.
- A commonly used sub-library is called [`matplotlib.pyplot`](https://matplotlib.org/api/pyplot_api.html).
- The Jupyter Notebook will render plots inline if we ask it to using a "magic" command.

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

## Matplotlib usage
- Here is the general outline for creating a plot using Matplotlib.

In [None]:
time = [0, 1, 2, 3]
position = [0, 100, 200, 300]

plt.plot(time, position)
plt.xlabel('Time (hr)')
plt.ylabel('Position (km)');  # The trailing `;` stops the notebook from echoing the return value of the last call (a Matplotlib Text object)
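
To see what the trailing `;` hides, the sketch below captures the value returned by `plt.ylabel` instead of letting the notebook echo it. The variable name `label` is only illustrative; assigning the result to a name is another way to keep the cell output clean.

```python
import matplotlib.pyplot as plt
from matplotlib.text import Text

time = [0, 1, 2, 3]
position = [0, 100, 200, 300]

plt.plot(time, position)
plt.xlabel('Time (hr)')

# plt.ylabel returns the Text object for the axis label.
# A notebook echoes the value of the last expression in a cell,
# which is why an unsuppressed call prints something like Text(...).
label = plt.ylabel('Position (km)')

print(type(label))
print(label.get_text())
```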

## Plotting with Pandas
- Since the Pandas plotting methods are built on Matplotlib, we can plot data directly from a dataframe.
- Before plotting, we convert the column headings from the `string` to the `integer` data type, since they represent numerical values.

In [None]:
import pandas as pd

data = pd.read_csv('data/gapminder_gdp_oceania.csv', index_col='country')

# Extract the year from each column name.
# The current column names have the form 'gdpPercap_(year)',
# so we want to keep the (year) part only; this provides clarity when plotting GDP against years.
# To do this we use strip(), which removes any of the characters listed in its argument
# from both ends of the string (it works here because no digit appears in 'gdpPercap_').
# strip() is a string method, so we reach it through the .str accessor of data.columns.

years = data.columns.str.strip('gdpPercap_')

# Convert year values to integers, saving results back to dataframe

data.columns = years.astype(int)

# Look at it now

data

In [None]:
# Plot the data for Australia
data.loc['Australia'].plot();

In [None]:
# 1. Plot the data for New Zealand
data.loc['New Zealand'].plot();
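
Selecting one row with `.loc` gives a Pandas `Series` whose index holds the years, and `.plot()` uses that index for the X axis. The sketch below uses a small stand-in dataframe with placeholder values (not the real Gapminder figures) so that it runs without the CSV file. `plt.subplots` and the axis labels are additions for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in for the Oceania table; the numbers are placeholders, not real data
data = pd.DataFrame(
    {1952: [10000.0, 10500.0], 1957: [11000.0, 12000.0], 1962: [12000.0, 13000.0]},
    index=pd.Index(['Australia', 'New Zealand'], name='country'),
)

row = data.loc['Australia']      # a Series indexed by year
fig, ax = plt.subplots()
row.plot(ax=ax)                  # the years go on the X axis, the values on the Y axis
ax.set_xlabel('Year')
ax.set_ylabel('GDP per capita')

print(type(row).__name__)
print(list(row.index))
```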

## Transposing for a Plot

- By default, `DataFrame.plot` uses the index (the row labels) as the X axis and draws one line per column.
- We can transpose the data so that the years become the index and each country becomes its own series.

In [None]:
data.T.plot()
plt.ylabel('GDP per capita')
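
The effect of `.T` can be checked on a small stand-in dataframe. The sketch below uses placeholder values rather than the real Gapminder figures, and shows that transposing swaps rows and columns so that each country becomes one plotted line.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in for the Oceania table; the numbers are placeholders, not real data
data = pd.DataFrame(
    {1952: [10000.0, 10500.0], 1957: [11000.0, 12000.0], 1962: [12000.0, 13000.0]},
    index=pd.Index(['Australia', 'New Zealand'], name='country'),
)

print(data.shape)     # (2, 3): countries are rows, years are columns
print(data.T.shape)   # (3, 2): years are rows, countries are columns

fig, ax = plt.subplots()
data.T.plot(ax=ax)    # one line per column of the transposed table
ax.set_ylabel('GDP per capita')

labels = [line.get_label() for line in ax.get_lines()]
print(labels)
```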

## Plot Types
- Many styles of plot are available.

In [None]:
plt.style.use('ggplot')
data.T.plot(kind='bar')
plt.ylabel('GDP per capita');

- `.plot` has many attributes, including all the plot types it can produce

In [None]:
# List available plots
[method_name for method_name in dir(data.plot) if not method_name.startswith("_")]
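
Each name in that list can also be called as a method, so `data.T.plot(kind='bar')` and `data.T.plot.bar()` are two spellings of the same request. The sketch below draws both side by side on a stand-in dataframe with placeholder values. The use of `plt.subplots` and the `ax=` argument is an addition for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in for the Oceania table; the numbers are placeholders, not real data
data = pd.DataFrame(
    {1952: [10000.0, 10500.0], 1957: [11000.0, 12000.0], 1962: [12000.0, 13000.0]},
    index=pd.Index(['Australia', 'New Zealand'], name='country'),
)

fig, (ax_kind, ax_method) = plt.subplots(1, 2)

data.T.plot(kind='bar', ax=ax_kind)   # plot type chosen with the kind argument
data.T.plot.bar(ax=ax_method)         # same plot type chosen with a method

# Both axes hold one bar per (year, country) pair
print(len(ax_kind.patches), len(ax_method.patches))
```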

- Let's make a scatter plot of Australia's GDP against New Zealand's GDP.

In [None]:
data.T.plot.scatter(x = 'Australia', y = 'New Zealand');
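
For comparison, roughly the same scatter plot can be drawn by passing the two rows to Matplotlib directly. This is a sketch on a stand-in dataframe with placeholder values; the axis labels are added by hand because Matplotlib does not take them from the dataframe here.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in for the Oceania table; the numbers are placeholders, not real data
data = pd.DataFrame(
    {1952: [10000.0, 10500.0], 1957: [11000.0, 12000.0], 1962: [12000.0, 13000.0]},
    index=pd.Index(['Australia', 'New Zealand'], name='country'),
)

fig, ax = plt.subplots()
# One point per year: x is Australia's value, y is New Zealand's value
ax.scatter(data.loc['Australia'], data.loc['New Zealand'])
ax.set_xlabel('Australia')
ax.set_ylabel('New Zealand')

points = ax.collections[0].get_offsets()
print(len(points))
```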

# Objectives
- Create a time series plot showing a single data set.
- Create a scatter plot showing the relationship between two data sets.