# Matplotlib Basic Data Plotting


### Introduction
In the previous notebook, we created Figure and Axes objects, and proceeded to change their properties without plotting any actual data. In this notebook, we will learn how to make basic line and scatter plots with the Axes object.

### The Axes API
The [matplotlib documentation][1] has a very nice layout of the Axes API. There are around 300 different calls you can make with an Axes object. The API page categorizes and groups each method by its functionality. The first third (approximately) of the categories in the API are used to create plots.

The simplest and most common plots are found in the **Basics** category and include **`plot`**, **`scatter`**, **`bar`**, **`pie`**, and others.

[1]: https://matplotlib.org/api/axes_api.html

## Creating a single Axes in a Figure
If you do not specify the number of rows or columns in the call to the **`subplots`** function, a single Axes will be created inside of a Figure. Instead of the NumPy array of Axes we saw in the last notebook, a single Axes object is returned. Let's see an example of this.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
fig, ax = plt.subplots()

In [None]:
type(fig)

In [None]:
type(ax)
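
The difference between the two return types can be checked directly. The following is a minimal sketch; the variable names (`ax_single`, `ax_grid`, and so on) are chosen only for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.axes import Axes

# No rows/columns given: a single Axes object comes back
fig_single, ax_single = plt.subplots()
single_is_axes = isinstance(ax_single, Axes)

# A 2 x 2 grid: a NumPy array holding four Axes comes back
fig_grid, ax_grid = plt.subplots(2, 2)
grid_is_array = isinstance(ax_grid, np.ndarray)
grid_shape = ax_grid.shape

print(single_is_axes, grid_is_array, grid_shape)
```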

## The `plot` method - Creates a line plot
As usual, the bulk of the work will come from the Axes object. You will mainly be calling methods with it. The **`plot`** method's primary purpose is to create **line plots**. It does have the ability to create scatter plots as well, but that task is best reserved for **`scatter`**.

### Plotting 2D Data
The **`plot`** method is very flexible and can take on a variety of different inputs. The following teaches a straightforward and consistent approach that is explicit and easy to read.

The first two arguments to the **`plot`** method can be the x and y coordinates of the data. Below, we use NumPy arrays to hold our data. We simply plot the square of the x value.

In [None]:
x = np.arange(-5, 5)
y = x ** 2
ax.plot(x, y)

### What was returned?
As usual, no image was produced. The figure got updated, but you must explicitly put its name as the last line in your cell to have it output to the screen.

A list of Line objects was returned from our call to the **`plot`** method. The **`plot`** method can produce many lines in a single call, which is why it returns the results as a list. Let's output our Figure, revealing the line plot.

In [None]:
fig
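
To see why a list is returned, here is a small sketch that draws two lines with a single call by passing two x/y pairs. The `_demo` names and the cubic curve are made up for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x_demo = np.arange(-5, 5)
fig_demo, ax_demo = plt.subplots()

# Two x/y pairs in one call produce two Line2D objects
lines = ax_demo.plot(x_demo, x_demo ** 2, x_demo, x_demo ** 3)
n_lines = len(lines)
line_type = type(lines[0]).__name__

print(n_lines, line_type)
```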

## Formatting the line
The line can be formatted using many different parameters. Please see the documentation for the [Line object][1]. All of the possible parameters are available on that page. 

[1]: https://matplotlib.org/api/_as_gen/matplotlib.lines.Line2D.html#matplotlib.lines.Line2D

## Use Pandas to scrape the HTML table

In [None]:
pd.options.display.max_colwidth = 300

In [None]:
dfs = pd.read_html('https://matplotlib.org/api/_as_gen/matplotlib.lines.Line2D.html#matplotlib.lines.Line2D')
dfs[0]

### Change line style
Look at the documentation above and change the line style. The default is solid. If you don't recreate the figure, the old line will still remain.

In [None]:
fig, ax = plt.subplots()
ax.plot(x, y, linestyle='--')
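
The formatting parameters are stored on the returned Line object, so they can also be read and changed after plotting. This is a minimal sketch using the `get_` and `set_` methods of the Line object; the `_fmt` names are chosen only for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x_fmt = np.arange(-5, 5)
y_fmt = x_fmt ** 2

fig_fmt, ax_fmt = plt.subplots()

# plot returns a list; unpack its single Line object
(line,) = ax_fmt.plot(x_fmt, y_fmt, linestyle='--')
style_before = line.get_linestyle()

# Change the same line in place instead of plotting again
line.set_linestyle(':')
line.set_linewidth(3)
style_after = line.get_linestyle()
width_after = line.get_linewidth()

print(style_before, style_after, width_after)
```

Changing the existing line this way avoids leaving the old line behind on the Axes.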

### Remove all the objects from the Axes with `clear`

In [None]:
ax.clear()
fig
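
As a rough check of what `clear` does, the sketch below counts the lines on an Axes before and after the call. Note that the title is reset as well. The `_demo` names and the title text are made up for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x_demo = np.arange(-5, 5)
fig_demo, ax_demo = plt.subplots()
ax_demo.plot(x_demo, x_demo ** 2)
ax_demo.set_title('Before clear')

lines_before = len(ax_demo.lines)

# clear removes the plotted line and resets the title
ax_demo.clear()
lines_after = len(ax_demo.lines)
title_after = ax_demo.get_title()

print(lines_before, lines_after, repr(title_after))
```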

# Create several new lines with different line styles

In [None]:
# your code here

# Matplotlib Colors

There are many possible ways to identify a color in matplotlib. Read the [color documentation][1] to see all 8 ways to specify a color. Some of the most useful are:

* an RGB or RGBA tuple of float values in [0, 1] (e.g., (0.1, 0.2, 0.5) or (0.1, 0.2, 0.5, 0.3)). RGBA is short for Red, Green, Blue, Alpha;
* a hex RGB or RGBA string (e.g., '#0F0F0F' or '#0F0F0F0F');
* a string representation of a float value in [0, 1] inclusive for gray level (e.g., '0.5');
* one of {'b', 'g', 'r', 'c', 'm', 'y', 'k', 'w'}; **I don't use these because they are confusing**
* an X11/CSS4 color name - **I do use these**
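
One way to compare these formats is to convert each of them to an RGBA tuple with `to_rgba` from `matplotlib.colors`. This is a small sketch; the variable names are chosen only for this illustration.

```python
from matplotlib.colors import to_rgba, to_hex

# RGB tuple: alpha defaults to 1 when it is left out
rgba_tuple = to_rgba((0.1, 0.2, 0.5))

# Hex string: each pair of hex digits is one channel (0F is 15 out of 255)
rgba_hex = to_rgba('#0F0F0F')

# Gray level given as a string
rgba_gray = to_rgba('0.5')

# Single-letter code ('k' is black)
rgba_letter = to_rgba('k')

# Named web color, shown here as a hex string
name_as_hex = to_hex('mediumorchid')

print(rgba_tuple, rgba_hex, rgba_gray, rgba_letter, name_as_hex)
```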

### Web Colors
You can use any of the following colors that are available to web developers in CSS - a language that styles web pages.
![][2]

[1]: https://matplotlib.org/tutorials/colors/colors.html#sphx-glr-tutorials-colors-colors-py
[2]: images/named_colors.png

In [None]:
fig, ax = plt.subplots()
ax.plot(x, y, color='mediumorchid')

### Grayscale
Use a **string** with a number between 0 and 1 for grayscale.

In [None]:
fig, ax = plt.subplots()
ax.plot(x, y, color='.7')

## Markers
There are a few dozen [styles for markers][1]. These are plotted on every point. Use a string to reference the one you want.

[1]: https://matplotlib.org/api/markers_api.html

In [None]:
dfs = pd.read_html('https://matplotlib.org/api/markers_api.html')
dfs[0]

In [None]:
fig, ax = plt.subplots()
ax.plot(x, y, color='darkred', linestyle='--', marker='s')

## Marker styles

In [None]:
fig, ax = plt.subplots()
ax.plot(x, y, color='darkred',
              linestyle='--',
              linewidth=4,
              marker='D',
              markerfacecolor='blue',
              markersize=15)

# Integration with Pandas - plotting real data
Matplotlib makes it quite simple to create some plots when our data is in a Pandas DataFrame.

In [None]:
pd.options.display.max_columns = 100
flights = pd.read_csv('data/flights.csv')
flights.head()

Let's first find the average delay for each month, along with the number of flights.

In [None]:
month_delay = flights.groupby('MONTH', as_index=False).agg({'DEPARTURE_DELAY': ['mean', 'size']})
month_delay.columns = ['month', 'average delay', 'count']
month_delay
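
The same kind of summary can also be written with named aggregation, which sets the column names in one step. The sketch below is an alternative, not the approach used above, and it runs on a tiny made-up DataFrame (`toy_flights`) rather than the real flights data. It also shows that `mean` skips missing values while `size` counts every row.

```python
import numpy as np
import pandas as pd

# Hypothetical data: month 2 has one missing delay
toy_flights = pd.DataFrame({
    'MONTH': [1, 1, 2, 2, 2],
    'DEPARTURE_DELAY': [10.0, 20.0, 5.0, np.nan, 15.0],
})

# Named aggregation: new_column=(source_column, function)
toy_delay = toy_flights.groupby('MONTH', as_index=False).agg(
    average_delay=('DEPARTURE_DELAY', 'mean'),
    count=('DEPARTURE_DELAY', 'size'),
)

print(toy_delay)
```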

### Make a line plot with a DataFrame 
Matplotlib simplifies the process by providing a **`data`** parameter. Set this equal to the DataFrame itself. Then pass the column names as strings as the first two arguments to the **`plot`** method.

In [None]:
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot('month', 'average delay', data=month_delay, marker='s')
ax.set_xlabel('Month', fontsize=15)
ax.set_title('Average Delay', fontsize=20, color='tomato')
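
Using the **`data`** parameter is a shorthand for passing the columns themselves. The sketch below plots the same made-up DataFrame (`toy_delay`, with hypothetical values) both ways and checks that the two lines hold identical data.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical values, standing in for the month_delay DataFrame
toy_delay = pd.DataFrame({
    'month': [1, 2, 3],
    'average delay': [9.5, 11.0, 4.2],
})

fig_cmp, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))

# Column names as strings plus the data parameter
(line_a,) = ax_left.plot('month', 'average delay', data=toy_delay)

# The columns passed directly
(line_b,) = ax_right.plot(toy_delay['month'], toy_delay['average delay'])

same_x = np.array_equal(line_a.get_xdata(), line_b.get_xdata())
same_y = np.array_equal(line_a.get_ydata(), line_b.get_ydata())

print(same_x, same_y)
```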

# Most common Plots
Visit the [Axes API][1] to see the most common plotting methods.

[1]: https://matplotlib.org/api/axes_api.html

In [None]:
dfs = pd.read_html('https://matplotlib.org/api/axes_api.html', match=r'Axes[.]plot')
axes_api = dfs[0]
axes_api.columns = ['Plotting Method', 'Description']
axes_api

## Univariate Analysis
These are the primary plots that you will make from your Axes. We just plotted a line with **`plot`** in the example above. Let's see a few more plots in action.

In [None]:
fig, ax = plt.subplots(figsize=(10, 5))
ax.boxplot(x='TAXI_OUT', data=flights)

In [None]:
fig, ax = plt.subplots(figsize=(10, 5))
ax.boxplot(x='TAXI_OUT', data=flights, vert=False)
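
Like **`plot`**, the **`boxplot`** method returns the objects it draws, in this case as a dictionary of lists of Line objects. The sketch below uses a short made-up list of values (not the flights data) to show how the median and the outlier can be read back from that dictionary.

```python
import matplotlib.pyplot as plt

# Hypothetical values with one obvious outlier
values = [1, 2, 3, 4, 100]

fig_box, ax_box = plt.subplots()
artists = ax_box.boxplot(values)

# The dictionary keys name each part of the box plot
part_names = sorted(artists)

# For a vertical box plot the median line is horizontal, so its y value is the median
median_value = artists['medians'][0].get_ydata()[0]

# Points beyond the whiskers are stored as fliers
flier_values = list(artists['fliers'][0].get_ydata())

print(part_names, median_value, flier_values)
```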

# Bar plot

In [None]:
dd = flights.groupby('AIRLINE', as_index=False).agg({'DEPARTURE_DELAY': 'mean'})
dd

In [None]:
fig, ax = plt.subplots(figsize=(10, 5))
ax.bar(x='AIRLINE', height='DEPARTURE_DELAY', data=dd)
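
Bars are drawn in the order of the rows in the DataFrame, so sorting first gives a chart that is easier to read. This is a minimal sketch on a made-up DataFrame (`toy_dd`, with hypothetical airline codes and delays), not the real result from above.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical average delays per airline
toy_dd = pd.DataFrame({
    'AIRLINE': ['AA', 'DL', 'UA'],
    'DEPARTURE_DELAY': [9.0, 7.5, 14.2],
})

# Sort from largest to smallest delay and renumber the rows
toy_sorted = (toy_dd.sort_values('DEPARTURE_DELAY', ascending=False)
                    .reset_index(drop=True))

fig_bar, ax_bar = plt.subplots(figsize=(10, 5))
bars = ax_bar.bar(x='AIRLINE', height='DEPARTURE_DELAY', data=toy_sorted)

# The returned container holds one rectangle per bar, in plotting order
heights = [float(bar.get_height()) for bar in bars]

print(heights)
```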