<a href="https://colab.research.google.com/github/MonroeDustin/odscwest-dv-python/blob/master/Test_notebooks/01_pandas_plotting.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://snipboard.io/hzEuw1.jpg">

# Data Visualization:  From Jupyter to Dashboards
### Session 1: Plotting with Pandas
Author:  David Yerrington

## Learning Objectives

- Become familiar with Jupyter Lab plotting conventions
- Describe the integration between matplotlib and Pandas
- Understand the concept of "figure" and "axes."
- Implement standard plots using Pandas

### Prerequisite Knowledge
- Basic Pandas 
  - Difference between Series and DataFrame
  - Bitmasks, query function, selecting data
  - Aggregations

## Environment Setup

<span style="color: red">If you are reading this and you haven't set up your local Python environment, please review [the setup guide](../environment.md) ASAP!</span>

### Imports

In [7]:
import pandas as pd, numpy as np, matplotlib.pyplot as plt

### Load a Dataset
<img src="https://storage.googleapis.com/kaggle-datasets-images/2619/4359/e3ef5846d64dc9a747afd82273456328/dataset-cover.jpg" class="Header_CoverImg-sc-1431b7d ibFJYv">

This is a Pokemon dataset from [Kaggle](https://www.kaggle.com/terminus7/pokemon-challenge).

> Pokemon are creatures that fight each other, in a turn-based RPG game.

Inspect, then view a few records of the dataset. 

In [5]:
from google.colab import files

In [9]:
uploaded = files.upload()

Saving pokemon.csv to pokemon (1).csv


In [16]:
import io 
df = pd.read_csv(io.BytesIO(uploaded['pokemon.csv']))

In [17]:
from google.colab import drive 
drive.mount('/mntDrive')

Mounted at /mntDrive


In [18]:
df.head()

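| | # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 3 | 4 | Mega Venusaur | Grass | Poison | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
| 4 | 5 | Charmander | Fire | NaN | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False |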


### Enable Inline Plotting

Plots can be displayed in several ways: saved as separate files, opened in a display application independent of Jupyter, or rendered within Jupyter itself.

> Read more about Jupyter [magic commands](https://ipython.readthedocs.io/en/stable/interactive/magics.html).

In [None]:
# Turn on inline plotting so figures render directly below each cell
%matplotlib inline


### Quick Pandas Review:  Axis, Series, and DataFrame

![](https://snipboard.io/8i3yIz.jpg)

### Plotting with Pandas .plot

Both Series and DataFrame objects in Pandas can be plotted directly through their `.plot` accessor.

>  We are only going to go an inch deep into this topic, but to get a full sense of what plots are possible with Pandas, see the [user guide](https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html).

### A line plot from a series
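
A minimal sketch of a line plot from a Series. The `sample` DataFrame is a stand-in built from the five rows shown by `df.head()` above, so the block runs on its own; in the notebook you would call the same method on a column of `df`.

```python
import pandas as pd

# Stand-in for the notebook's `df`: the five rows shown by df.head().
sample = pd.DataFrame({
    "Name": ["Bulbasaur", "Ivysaur", "Venusaur", "Mega Venusaur", "Charmander"],
    "HP": [45, 60, 80, 80, 39],
})

# Series.plot() draws a line plot by default and returns a Matplotlib Axes.
# The x values are the Series index (here the row numbers 0 to 4).
ax = sample["HP"].plot(title="HP by row index")
ax.set_xlabel("Row index")
ax.set_ylabel("HP")
```

Because the return value is a Matplotlib Axes, labels and titles can be adjusted after the plot is drawn.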

### A scatter plot from 2 dimension columns
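
A scatter plot needs two numeric columns, one for each axis. The sketch below again uses a small stand-in built from the `df.head()` rows; the choice of `Attack` and `Defense` is only an illustration.

```python
import pandas as pd

# Stand-in for the notebook's `df`: two columns from the df.head() rows.
sample = pd.DataFrame({
    "Attack": [49, 62, 82, 100, 52],
    "Defense": [49, 63, 83, 123, 43],
})

# DataFrame.plot.scatter() takes the column names to use for x and y.
ax = sample.plot.scatter(x="Attack", y="Defense", title="Attack vs. Defense")
```

Pandas labels both axes with the column names automatically.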

### A histogram plot from a series
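
A histogram bins the values of a single Series and counts how many fall in each bin. This sketch uses the `Speed` values from the `df.head()` rows; `bins=4` is an arbitrary choice for such a small sample.

```python
import pandas as pd

# Stand-in for df["Speed"], taken from the df.head() rows.
speed = pd.Series([45, 60, 80, 80, 65], name="Speed")

# Series.plot.hist() bins the values; `bins` sets the number of bars.
ax = speed.plot.hist(bins=4, title="Distribution of Speed")
ax.set_xlabel("Speed")
```

On the full dataset a larger `bins` value would usually show more of the distribution's shape.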

### A barh plot from a group DataFrame aggregation
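
A grouped aggregation returns a DataFrame indexed by the group key, which `.plot.barh()` turns into one set of horizontal bars per group. The sketch assumes we want the mean `Attack` and `Defense` per `Type 1`, using the `df.head()` rows as stand-in data.

```python
import pandas as pd

# Stand-in for the notebook's `df`, built from the df.head() rows.
sample = pd.DataFrame({
    "Type 1": ["Grass", "Grass", "Grass", "Grass", "Fire"],
    "Attack": [49, 62, 82, 100, 52],
    "Defense": [49, 63, 83, 123, 43],
})

# Aggregate: one row per primary type, one column per averaged stat.
type_means = sample.groupby("Type 1")[["Attack", "Defense"]].mean()

# Each index value becomes a group of bars, each column a bar color.
ax = type_means.plot.barh(title="Mean Attack and Defense by primary type")
ax.set_xlabel("Mean value")
```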

### Box plot from masked filter 
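
A boolean mask selects the rows of interest before plotting. In this sketch the mask keeps only Grass-type rows from the stand-in data (the `df.head()` rows), then draws one box per selected column.

```python
import pandas as pd

# Stand-in for the notebook's `df`, built from the df.head() rows.
sample = pd.DataFrame({
    "Type 1": ["Grass", "Grass", "Grass", "Grass", "Fire"],
    "HP": [45, 60, 80, 80, 39],
    "Attack": [49, 62, 82, 100, 52],
    "Defense": [49, 63, 83, 123, 43],
})

# Boolean mask: True where the primary type is Grass.
grass_mask = sample["Type 1"] == "Grass"

# Filter rows with the mask, pick the numeric columns, then plot.
grass_stats = sample.loc[grass_mask, ["HP", "Attack", "Defense"]]
ax = grass_stats.plot.box(title="Grass-type stats")
```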

### Plot Aesthetics

### Controlling Aesthetics

[Matplotlib Colormaps Reference](https://matplotlib.org/examples/color/colormaps_reference.html)
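
As one possible illustration of controlling aesthetics, the sketch below colors scatter points by a third column with a colormap and sets the figure size and marker size. The specific values (`viridis`, `s=80`, `figsize=(8, 5)`) are arbitrary choices, and the data is a stand-in from the `df.head()` rows.

```python
import pandas as pd

# Stand-in for the notebook's `df`, built from the df.head() rows.
sample = pd.DataFrame({
    "Attack": [49, 62, 82, 100, 52],
    "Defense": [49, 63, 83, 123, 43],
    "Speed": [45, 60, 80, 80, 65],
})

ax = sample.plot.scatter(
    x="Attack",
    y="Defense",
    c="Speed",            # color each point by its Speed value
    colormap="viridis",   # any name from the colormaps reference
    s=80,                 # marker size in points squared
    figsize=(8, 5),       # figure width and height in inches
    title="Attack vs. Defense, colored by Speed",
)
```

Pandas adds a colorbar automatically when `c` refers to a numeric column.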

### With Slack: Try to plot something from this dataset and post a screenshot (thread).

We will check out the thread at the end of the 1st session.  Also share any code!

### Matplotlib: Figure vs Axes

**Figure** and **Axes** are two of the most fundamental constructs within Matplotlib.  

#### Figure

A figure is like a canvas.  It defines the space in which objects can be placed.  Objects can be annotations and text, lines, scatters, and many other types of visual elements.


#### Axes

An Axes is a single plotting region within a figure, with its own x and y axis.  Its methods "generate" the visual objects that are placed on the figure, such as:

- Histograms
- Lines
- Horizontal or vertical lines
- x or y ticks
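
The relationship can be sketched with `plt.subplots()`, which returns a figure and an axes together. The data values are the `HP` column from the `df.head()` rows; the reference line at 60 is an arbitrary illustration.

```python
import matplotlib.pyplot as plt

hp = [45, 60, 80, 80, 39]  # HP values from the df.head() rows

# One Figure (the canvas) containing one Axes (the plotting region).
fig, ax = plt.subplots(figsize=(6, 4))

ax.plot(hp, label="HP")                        # a line
ax.axhline(60, linestyle="--", color="gray")   # a horizontal reference line
ax.set_xticks(range(len(hp)))                  # explicit x ticks
ax.set_title("HP of the first five rows")
ax.legend()
```

Everything drawn here belongs to `ax`, and `ax` in turn belongs to `fig`.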

## Matplotlib vs Pandas

Matplotlib:  Line plot
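
A minimal sketch of the Matplotlib version, assuming the `HP` values from the `df.head()` rows. With Matplotlib you create the axes and set the labels yourself.

```python
import matplotlib.pyplot as plt

hp = [45, 60, 80, 80, 39]  # HP values from the df.head() rows

fig, ax = plt.subplots()
ax.plot(range(len(hp)), hp)   # x values and y values are passed explicitly
ax.set_xlabel("Row index")
ax.set_ylabel("HP")
ax.set_title("HP (Matplotlib)")
```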

Pandas:  Line Plot
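
The Pandas version of the same plot, as a sketch on the same stand-in values. The x values come from the Series index, and the object returned is a regular Matplotlib Axes.

```python
import matplotlib.axes
import pandas as pd

hp = pd.Series([45, 60, 80, 80, 39], name="HP")  # from the df.head() rows

ax = hp.plot(title="HP (Pandas)")   # x values come from the index
ax.set_xlabel("Row index")
ax.set_ylabel("HP")

# The return value is a plain Matplotlib Axes.
is_matplotlib_axes = isinstance(ax, matplotlib.axes.Axes)
```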

Matplotlib:  Scatter
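
A sketch of a scatter plot written directly against Matplotlib, using `Attack` and `Defense` values from the `df.head()` rows.

```python
import matplotlib.pyplot as plt

attack = [49, 62, 82, 100, 52]    # from the df.head() rows
defense = [49, 63, 83, 123, 43]

fig, ax = plt.subplots()
ax.scatter(attack, defense)       # one point per (attack, defense) pair
ax.set_xlabel("Attack")
ax.set_ylabel("Defense")
ax.set_title("Attack vs. Defense (Matplotlib)")
```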

Pandas: Scatter
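
The Pandas equivalent, sketched here drawing onto an axes created by Matplotlib through the `ax=` parameter. This is one way to mix the two libraries; the stand-in data again comes from the `df.head()` rows.

```python
import matplotlib.pyplot as plt
import pandas as pd

sample = pd.DataFrame({
    "Attack": [49, 62, 82, 100, 52],    # from the df.head() rows
    "Defense": [49, 63, 83, 123, 43],
})

fig, ax = plt.subplots(figsize=(6, 4))

# Pandas draws onto the axes we pass in and labels them from the columns.
sample.plot.scatter(x="Attack", y="Defense", ax=ax)
ax.set_title("Attack vs. Defense (Pandas)")
```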

#### Figure vs Axes vs Plt Object

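A Matplotlib figure is a canvas that can contain one or many axes.  `plt` is the `matplotlib.pyplot` module, a stateful interface that keeps track of the current figure and the current axes.  Most calls on `plt.` (for example `plt.title()`) are forwarded to the current axes, which is why the same plot can be written either through `plt.` or through the figure and axes objects.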
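
A small sketch of how the three relate, using the `HP` values from the `df.head()` rows. `plt.gcf()` and `plt.gca()` return the figure and axes that `plt.` calls currently act on.

```python
import matplotlib.pyplot as plt

hp = [45, 60, 80, 80, 39]  # HP values from the df.head() rows

fig, ax = plt.subplots()

# The newly created figure and axes become the "current" ones.
same_figure = plt.gcf() is fig
same_axes = plt.gca() is ax

# These plt calls act on the current axes, i.e. on `ax`.
plt.plot(hp)
plt.title("Drawn through plt")
```

After this runs, `ax.get_title()` returns the title that was set through `plt.title()`.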

### Time Permitting:  Subplots with Matplotlib

Almost everyone asks about subplots with Matplotlib.  We may not cover the topic fully, but I wanted to give you this example to show that it's actually not that bad.

**What's different with a single plot vs multiple plots?**
 - `ax` (the axes) is now an array with the same size and shape as the visual subplot matrix.
 - Each position within `ax` refers to one subplot within the figure matrix.

**What's the same?**
- `fig` still refers to the entire canvas / plot image

> The size of the subplot matrix is determined by `nrows` and `ncols`.  If you passed `nrows = 5` and `ncols = 3`, ax would be a 5x3 matrix.  `ax` would have 5 rows and 3 columns.
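
A sketch of a 2x2 grid filled with Pandas plots, assuming the stand-in rows from `df.head()`. Each Pandas call receives one element of `ax` through the `ax=` parameter.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the notebook's `df`, built from the df.head() rows.
sample = pd.DataFrame({
    "HP": [45, 60, 80, 80, 39],
    "Attack": [49, 62, 82, 100, 52],
    "Defense": [49, 63, 83, 123, 43],
    "Speed": [45, 60, 80, 80, 65],
})

# `ax` is a 2x2 NumPy array of Axes; `fig` is the whole canvas.
fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(10, 8))

sample["HP"].plot(ax=ax[0, 0], title="HP (line)")
sample.plot.scatter(x="Attack", y="Defense", ax=ax[0, 1], title="Attack vs. Defense")
sample["Speed"].plot.hist(bins=4, ax=ax[1, 0], title="Speed (histogram)")
sample[["HP", "Attack", "Defense"]].plot.box(ax=ax[1, 1], title="Stats (box)")

fig.tight_layout()  # keep titles and tick labels from overlapping
```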

The same thing but with matplotlib plots
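
A sketch of the same grid using Matplotlib methods only. The values are the `df.head()` rows again, and the box labels are set by hand because Matplotlib has no column names to read.

```python
import matplotlib.pyplot as plt

# Values from the df.head() rows.
hp = [45, 60, 80, 80, 39]
attack = [49, 62, 82, 100, 52]
defense = [49, 63, 83, 123, 43]
speed = [45, 60, 80, 80, 65]

fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(10, 8))

ax[0, 0].plot(hp)
ax[0, 0].set_title("HP (line)")

ax[0, 1].scatter(attack, defense)
ax[0, 1].set_title("Attack vs. Defense")

ax[1, 0].hist(speed, bins=4)
ax[1, 0].set_title("Speed (histogram)")

ax[1, 1].boxplot([hp, attack, defense])   # boxes are drawn at x = 1, 2, 3
ax[1, 1].set_xticklabels(["HP", "Attack", "Defense"])
ax[1, 1].set_title("Stats (box)")

fig.tight_layout()
```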

## Summary

We will take this poll first
`/poll "What is the hardest point to understand in this session?" "Matplotlib" "Plotting with Pandas" "Nothing!" anonymous`

- Plotting with Pandas is a higher-level way of plotting with Matplotlib through a single entry point, `.plot()`.
- A Matplotlib `figure` is like a canvas that defines the space in which visual objects can be placed.
- Matplotlib `axes` are the plotting regions on a figure; they hold visual objects such as lines, scatter points, and histograms.
