# Data Visualization with Python

**Objectives**

- Review `matplotlib` fundamentals
- Use `subplots` with `figure` and `axes` objects
- Plot directly from `DataFrame`
- Introduce `pandas.plotting`
- Make interactive plots with `widgets`


In [1]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

### Anatomy of a `matplotlib` figure

[code](https://matplotlib.org/3.3.1/gallery/showcase/anatomy.html)
<center>
<img src="https://matplotlib.org/3.3.1/_images/sphx_glr_anatomy_001.png" height="200" width="400"/>
</center>
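
As a rough illustration of the parts named in the figure above, the sketch below plots a quadratic on the domain [-3, 3] and sets a title, axis labels, a grid, and a legend. The function `quadratic` and its coefficients are assumptions chosen only for this example.

```python
import matplotlib.pyplot as plt
import numpy as np


def quadratic(x):
    """Example quadratic f(x) = x**2 - 1 (coefficients are illustrative)."""
    return x**2 - 1


# 100 evenly spaced points on the domain [-3, 3]
x = np.linspace(-3, 3, 100)

plt.figure(figsize=(6, 4))
plt.plot(x, quadratic(x), label="f(x) = x**2 - 1")  # the line
plt.title("A quadratic function")                   # the title
plt.xlabel("x")                                     # x axis label
plt.ylabel("f(x)")                                  # y axis label
plt.grid(True)                                      # the grid
plt.legend()                                        # the legend
```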

In [None]:
#define a quadratic function


In [None]:
#determine a domain (-3, 3)


In [None]:
#plot 


### Interacting with the figure *and* axes object 

```python
fig, ax = plt.subplots()
```
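
A minimal sketch of working with the two objects returned by `plt.subplots()`: the data is drawn on the axes object, and the spines (the lines that frame the plot area) are adjusted through `ax.spines`. The quadratic used here is only an example.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-3, 3, 100)

# create the figure and a single axes object
fig, ax = plt.subplots()

# plot to the axes rather than through plt.plot
ax.plot(x, x**2 - 1)

# move the left and bottom spines so they pass through zero
ax.spines["left"].set_position("zero")
ax.spines["bottom"].set_position("zero")

# hide the top and right spines
ax.spines["top"].set_color("none")
ax.spines["right"].set_color("none")

ax.set_title("Axes through the origin")
```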

In [None]:
###create the figure and axis


In [None]:
###plot to the axis


In [None]:
###create figure and axis

###control other elements -- spines
###set left and bottom to zero position


###set top and right color to none




### Subplots

In addition to exposing the figure and axes objects, `plt.subplots` can place multiple axes on one figure.  When we pass the number of rows and columns for the arrangement, it returns the axes as an array of axes objects.
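
A short sketch of a grid with one row and two columns. With a single row, the returned array is one-dimensional, so each axes object is reached with a single index; the `figsize` value and the plotted functions are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-3, 3, 100)

# 1 row, 2 columns; figsize is (width, height) in inches
fig, ax = plt.subplots(1, 2, figsize=(10, 4))

# ax is a NumPy array holding two axes objects
print(type(ax), ax.shape)

# plot to each axes by index
ax[0].plot(x, x**2)
ax[0].set_title("Quadratic")
ax[1].plot(x, x**3)
ax[1].set_title("Cubic")
```

With more than one row and more than one column, the array is two-dimensional and is indexed as `ax[row, col]`.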

In [None]:
###subplots 1 row 2 columns


In [None]:
#examine the ax object


In [None]:
#now plot to each


In [None]:
#control the figure size

#plot to the first axis


### Problems

1. Load the `bank-full.csv` file from the data folder into a `DataFrame`.  
2. Create a 2 column 1 row plot with:
 - a histogram of `age` on one plot
 - a histogram of `balance` on the second.
3. Create a 1 column 3 row plot with:
 - a scatterplot of `balance` vs. `age`
 - an `age` boxplot
 - a bar plot of counts of the `marital` feature

In [None]:
bank = pd.read_csv('bank-full.csv', sep = ';')

In [None]:
bank.info()

### Using `matplotlib` with `pandas`

In [None]:
# !pip install pandas-datareader

In [None]:
import pandas_datareader as pdr

In [None]:
#load tesla from yahoo


In [None]:
#take a peek


In [None]:
#plot adjusted close


In [None]:
#plot adjusted close
#and distribution of pct_change

#plot the adjusted close

#plot percent change


### `pandas.plotting`

The `pandas.plotting` module provides additional plotting utilities.  We will examine:

- `scatter_matrix`
- `kde` (reached through the `.plot.kde()` accessor rather than `pandas.plotting`)
- `andrews_curves`
- `parallel_coordinates`


In [None]:
import seaborn as sns

In [None]:
#load iris data


In [None]:
#examine


##### `scatter_matrix`

Plots each numeric feature against every other numeric feature in a grid, with each feature's own distribution on the diagonal.
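
A minimal sketch of a scatter matrix colored by class. To keep it self-contained, it uses a small synthetic `DataFrame` rather than the iris data; the column names `f1`, `f2`, `f3`, `group` and the color mapping are assumptions made for this example.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# synthetic data: three groups of 20 rows with shifted means (illustrative only)
rng = np.random.default_rng(0)
offsets = np.repeat([0.0, 2.0, 4.0], 20)
df = pd.DataFrame({
    "f1": rng.normal(size=60) + offsets,
    "f2": rng.normal(size=60) - offsets,
    "f3": rng.normal(size=60),
    "group": np.repeat(["a", "b", "c"], 20),
})

# map each class label to a color, one color per row
color_dict = {"a": "purple", "b": "orange", "c": "green"}
colors = df["group"].map(color_dict).tolist()

# extra keyword arguments such as c are passed on to the scatter calls
axes = scatter_matrix(df[["f1", "f2", "f3"]], c=colors, figsize=(6, 6))
```

The return value is a square array of axes objects, one per pair of features.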

In [None]:
from pandas.plotting import scatter_matrix, andrews_curves, parallel_coordinates

In [None]:
#make a scatter matrix


In [None]:
color_dict = {'setosa': 'purple', 'virginica': 'yellow', 'versicolor': 'green'}

In [None]:
#new color series


In [None]:
#scatter matrix colored by class


##### `kde`

A kernel density estimate: a smoothed version of the histogram for a continuous feature.
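
The sketch below places a histogram and a KDE of the same values side by side. The values are synthetic and stand in for a column such as `sepal_width`; note that the KDE plot in `pandas` relies on SciPy being installed.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# synthetic continuous feature (illustrative stand-in for a real column)
rng = np.random.default_rng(0)
values = pd.Series(rng.normal(loc=3.0, scale=0.4, size=200), name="width")

fig, ax = plt.subplots(1, 2, figsize=(10, 4))

# histogram: counts per bin
values.plot.hist(bins=20, ax=ax[0], title="Histogram")

# kde: smooth estimate of the density (requires scipy)
values.plot.kde(ax=ax[1], title="KDE")
```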

In [None]:
#plot the histogram of sepal_width


In [None]:
#kde of sepal_width


##### `andrews_curves`

Used to explore structure in multivariate data.  Each observation is drawn as one curve, so observations with similar feature values produce similar curves.  For details see [Wikipedia](https://en.wikipedia.org/wiki/Andrews_plot).
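
A minimal sketch of `andrews_curves` on synthetic data. The function takes the frame and the name of the class column; all remaining columns are treated as numeric features. The column names and group labels below are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import andrews_curves

# synthetic data: three groups of 20 rows with shifted means (illustrative only)
rng = np.random.default_rng(0)
offsets = np.repeat([0.0, 2.0, 4.0], 20)
df = pd.DataFrame({
    "f1": rng.normal(size=60) + offsets,
    "f2": rng.normal(size=60) - offsets,
    "f3": rng.normal(size=60),
    "group": np.repeat(["a", "b", "c"], 20),
})

# one curve per row, colored by the class column
fig, ax = plt.subplots(figsize=(8, 4))
andrews_curves(df, "group", ax=ax)
```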

In [None]:
#andrews of iris


##### `parallel_coordinates`

Similar to `andrews_curves`, but not smoothed: each observation is a line drawn across one vertical axis per feature.  Again, it can be used to explore structure in higher-dimensional data.
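
A matching sketch for `parallel_coordinates`, again on synthetic data with assumed column names. The call signature mirrors `andrews_curves`: the frame first, then the name of the class column.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import parallel_coordinates

# synthetic data: three groups of 20 rows with shifted means (illustrative only)
rng = np.random.default_rng(0)
offsets = np.repeat([0.0, 2.0, 4.0], 20)
df = pd.DataFrame({
    "f1": rng.normal(size=60) + offsets,
    "f2": rng.normal(size=60) - offsets,
    "f3": rng.normal(size=60),
    "group": np.repeat(["a", "b", "c"], 20),
})

# one line per row; each feature gets its own position on the x axis
fig, ax = plt.subplots(figsize=(8, 4))
parallel_coordinates(df, "group", color=["purple", "orange", "green"], ax=ax)
```

Because the features share one y axis, features on very different scales may need rescaling before the plot is readable.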

In [None]:
#parallel_coordinates of iris


### PROBLEM

1. Load in the `penguins` dataset from `seaborn`.  
2. Create a scatter matrix colored by species.
3. Create an `andrews_curves` plot and a `parallel_coordinates` plot.
4. After examining these plots, do you think there are features that help to differentiate between species?  Which?

### Subplots with `pandas`

We can use categorical features as grouping methods to create subplots.  To do so, we use the `by` argument and pass the grouping column.
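
A small sketch of the `by` argument with `DataFrame.hist`, using synthetic data in place of the penguins dataset. The column names `length` and `species` and the group labels are assumptions for this example.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# synthetic data: one numeric column and one categorical column (illustrative)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "length": rng.normal(loc=200, scale=10, size=90),
    "species": np.repeat(["a", "b", "c"], 30),
})

# one histogram of length per value of species
axes = df.hist(column="length", by="species", figsize=(8, 6))
```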

In [None]:
#Histogram of flipper_length by species


We get similar behavior plotting from a `.groupby`.
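
For comparison, a sketch of a boxplot drawn from a `.groupby`, once more on synthetic data with assumed column names. Each group gets its own subplot, with one box per listed column.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# synthetic data: two numeric columns and one categorical column (illustrative)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "f1": rng.normal(size=90),
    "f2": rng.normal(loc=1.0, size=90),
    "species": np.repeat(["a", "b", "c"], 30),
})

# one subplot per species, each with a box for f1 and a box for f2
boxes = df.groupby("species").boxplot(column=["f1", "f2"],
                                      figsize=(9, 4), layout=(1, 3))
```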

In [None]:
#group by species and boxplot


By setting `subplots = True`, each column is drawn on its own subplot, in column order.
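
A minimal sketch with two synthetic price series standing in for downloaded stock data. The column names `stock_a` and `stock_b`, the dates, and the values are all made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# synthetic random-walk "prices" on a daily index (not real market data)
rng = np.random.default_rng(0)
dates = pd.date_range("2021-01-01", periods=100, freq="D")
prices = pd.DataFrame({
    "stock_a": 100 + rng.normal(size=100).cumsum(),
    "stock_b": 50 + rng.normal(size=100).cumsum(),
}, index=dates)

# one subplot per column instead of both lines on shared axes
axes = prices.plot(subplots=True, figsize=(8, 6))
```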

In [None]:
#new df of TSLA and MSFT stocks


In [None]:
#subplots = True


We can also use the `layout` argument to specify the grid dimensions.
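
A short sketch of `layout` with four synthetic columns arranged in a 2 by 2 grid. The column names and values are assumptions for this example; `layout` is given as `(rows, columns)` and needs at least as many cells as there are columns to plot.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# four synthetic random-walk series (illustrative only)
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(100, 4)).cumsum(axis=0),
                    columns=["w", "x", "y", "z"])

# arrange the four subplots in 2 rows and 2 columns
axes = data.plot(subplots=True, layout=(2, 2), figsize=(10, 6))
```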

In [None]:
#(6, 2) layout


### BONUS: Widgets

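In a terminal, run the following.  These steps apply to older JupyterLab releases (1.x and 2.x); from JupyterLab 3 onward, installing `ipywidgets` itself is normally sufficient.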

```
conda install -c conda-forge nodejs
jupyter labextension install @jupyter-widgets/jupyterlab-manager
```

Restart your JupyterLab instance and run the cell below.

In [None]:
import ipywidgets as widgets
from ipywidgets import interact
def f(x):
    return x**2

# an integer default makes interact build an integer slider for x
interact(f, x = 5)

In [None]:
###using a slider


In [None]:
###exploring quadratic parameters
