# Plotting Data
---
## Using the `matplotlib` library to plot data
*   `matplotlib` is the most widely used scientific plotting library in Python.
*   We commonly use a sub-library called `matplotlib.pyplot`.


*   The Jupyter Notebook will render plots inline if we ask it to using a "magic" command.

In [None]:
# "Magic"
%matplotlib inline
import pandas
# Import the matplotlib.pyplot library as plt
import matplotlib.pyplot as plt


*   Simple plots are then (fairly) simple to create.

In [None]:
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

plt.plot(x, y)
plt.xlabel('Numbers')
plt.ylabel('Doubles')

## Plot data directly from a Pandas data frame.

*   We can also plot Pandas data frames.
*   A data frame's `plot()` method implicitly uses `matplotlib.pyplot` to draw.

In [None]:
df = pandas.read_csv('../data/gapminder_gdp_oceania.csv', index_col='country')
df.loc['Australia'].plot()

Our graph is plotted, but the x axis labels are hard to read. Use `plt.xticks(rotation=90)` to rotate them.
*   Remember that we imported `matplotlib.pyplot` under the alias `plt`.

In [None]:
df.loc['Australia'].plot()
plt.xticks(rotation=90)
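
Because pandas draws with `matplotlib` under the hood, `plt` functions called after `.plot()` act on the axes that pandas just created. The following is a minimal sketch of that relationship, assuming made-up values in place of a real Gapminder row (the `series` variable and its numbers are illustrative only):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import pandas

# Made-up values standing in for one row of the GDP table
series = pandas.Series([10.0, 12.5, 15.0], index=["1952", "1957", "1962"])

ax = series.plot()            # pandas returns the matplotlib Axes it drew on
plt.xticks(rotation=90)       # plt functions act on the current Axes
plt.ylabel("GDP per capita")

print(ax is plt.gca())        # True: the pandas Axes is the current Axes
print(ax.get_ylabel())        # GDP per capita
```

Inside the notebook, leave out the `matplotlib.use("Agg")` line, since `%matplotlib inline` already selects a backend.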

## Select and transform data, then plot it.

*   By default, `DataFrame.plot` plots with the rows as the X axis.
*   We can transpose the data in order to plot multiple series.

In [None]:
df.T.plot()
plt.ylabel('GDP per capita') # adds a label to our y axis
plt.xticks(rotation=90)
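
To see why transposing helps, recall that `DataFrame.plot` draws one line per column and uses the row labels for the X axis. Below is a small sketch with made-up numbers; the `df_small` frame is hypothetical, and its column names only imitate the `gdpPercap_YYYY` pattern:

```python
import pandas

# Made-up numbers: two countries (rows) by three years (columns)
df_small = pandas.DataFrame(
    {
        "gdpPercap_1952": [10.0, 11.0],
        "gdpPercap_1957": [12.0, 13.0],
        "gdpPercap_1962": [14.0, 15.0],
    },
    index=["Australia", "New Zealand"],
)

transposed = df_small.T  # swap rows and columns

print(df_small.shape)            # (2, 3)
print(transposed.shape)          # (3, 2)
print(list(transposed.columns))  # ['Australia', 'New Zealand']
```

After the transpose, each country is a column, so plotting gives one line per country with the years along the X axis.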

---
## EXERCISE:
1. Read in the Gapminder Asia data and plot the data for Vietnam, Nepal, and Mongolia across all years.

---

## Customize Axis Names

Let's create new label names for our X axis.
* Extract years from the last four characters of the columns' names.
* Store these in a list.

In [None]:
# Create an empty list called 'years'
years = []

# Iterate through the column names, trim only the year off of each, and append it to our new list
for col in df.columns:
    years.append(col[-4:])


* We can also convert data frame data to a list.

In [None]:
# Get all GDP data for Australia (remember .loc) as a list using the .tolist() function
gdp_australia = df.loc['Australia'].tolist()

# Plot: 'b-' sets the line style.
plt.plot(years, gdp_australia, 'b-')

* The `'b-'` option in the plot function above sets the line style. Use the `help` function to learn about more options.

In [None]:
help(plt.plot)
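
As a quick illustration of what the help text describes, a format string combines a color letter with a line style or marker code. This is a minimal sketch with made-up data (all variable names are illustrative), showing three common combinations:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]

# plt.plot returns a list of lines; the trailing comma unpacks the single line
solid_blue, = plt.plot(x, [1, 2, 3, 4], "b-")    # blue, solid line
dashed_red, = plt.plot(x, [2, 3, 4, 5], "r--")   # red, dashed line
green_dots, = plt.plot(x, [3, 4, 5, 6], "go")    # green circle markers, no connecting line

print(dashed_red.get_linestyle())  # --
print(green_dots.get_marker())     # o
```

Inside the notebook, leave out the `matplotlib.use("Agg")` line.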

## Can plot many sets of data together.

In [None]:
# Select all of the data for Australia
gdp_australia = df.loc['Australia']

# Select all of the data for New Zealand
gdp_nz = df.loc['New Zealand']

# Plot with differently-colored lines.
plt.plot(years, gdp_australia, 'b-', label='Australia')
plt.plot(years, gdp_nz, 'g-', label='New Zealand')

# Create legend.
plt.legend(loc='upper left')

# Set axis labels
plt.xlabel('Year')
plt.ylabel('GDP per capita ($)')


---
## EXERCISE:
1. Create a new plot for Thailand, Nepal, and Mongolia for 1982 and all later years.
1. Place the legend in the bottom right corner.
1. Label each axis appropriately.
1. Give the legend a title.

---

## Create a Scatter Plot
* We can create a different style of plot by calling `plt.scatter` instead of `plt.plot`.
* Create a scatter plot correlating the GDP of Australia and New Zealand.

In [None]:
plt.scatter(gdp_australia, gdp_nz)

* We'll need to add some labels to these axes.

In [None]:
plt.scatter(gdp_australia, gdp_nz)
plt.xlabel('Australia')
plt.ylabel('New Zealand')
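
To make the difference between the two functions concrete, here is a small sketch with made-up values (the lists `a` and `b` are not Gapminder data). It assumes nothing beyond `matplotlib` itself and shows what each call returns:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt

# Made-up values, not the Gapminder data
a = [10.0, 12.0, 15.0, 19.0]
b = [9.0, 11.5, 13.0, 18.0]

lines = plt.plot(a, b)      # a list holding one Line2D that connects the points in order
points = plt.scatter(a, b)  # one PathCollection of separate markers, no connecting line

print(type(lines[0]).__name__)  # Line2D
print(type(points).__name__)    # PathCollection
```

Inside the notebook, leave out the `matplotlib.use("Agg")` line.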

---
## EXERCISE:
1. Fill in the blanks below to plot the minimum and maximum GDP per capita over time for all the countries in Europe.
    ~~~
    data_europe = pandas.read_csv('../data/gapminder_gdp_europe.csv', index_col='country')
    data_europe.____.plot(label='min')
    data_europe.max().plot(label=____)
    plt.legend(loc='best')
    plt.xticks(rotation=90)
    ~~~

---

## Plotting Correlations

This short program creates a plot showing the correlation between GDP per capita and life expectancy for 2007, scaling marker size by population:

    data_all = pandas.read_csv('../data/gapminder_all.csv')
    data_all.plot(kind='scatter', x='gdpPercap_2007', y='lifeExp_2007',
                   s=data_all['pop_2007']/1e6)

Using online help and other resources, explain what each argument to `plot` does.
A good place to look is the documentation for the plot function: `help(data_all.plot)`.

>**kind:**

>**x:**

>**y:**

>**s:**

# -- COMMIT TO GITHUB --

---
# Keypoints:
 - `matplotlib` is the most widely used scientific plotting library in Python.
 - Plot data directly from a Pandas data frame.
 - Select and transform data, then plot it.
 - Many styles of plot are available.
 - Can plot many sets of data together.