# Tutorial 14 - Scatter Plots with `pandas`

The purpose of this tutorial is to demonstrate the `pandas` built-in functionality for creating scatter plots.

The financial task we will accomplish is demonstrating SPY's *implied leverage effect*: when the market suffers losses, implied volatility increases; when the market experiences gains, implied volatility decreases.

Our measure of SPY implied volatility will be the VIX index.  To verify the above relationship, we will plot SPY daily returns against daily changes in the VIX for 2014-2018.

### Loading Packages

Let's begin by loading the packages we will need.

In [None]:
##> import numpy as np
##> import pandas as pd
##> %matplotlib inline




### Wrangling SPY Data

Next, let's read in the SPY price data from 2014-2018:

In [None]:
##> df_spy = pd.read_csv('../data/spy_2014_2018.csv')
##> df_spy.head()




Next, we convert the `date` column from strings to a `datetime` `dtype`.

In [None]:
##> df_spy['date'] = pd.to_datetime(df_spy['date'])
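
To check that a conversion like this took effect, it can help to inspect the column's `dtype` before and after. The following is a minimal sketch on a hypothetical two-row table (`toy_dates` is an illustrative name, not part of the tutorial data):

```python
import pandas as pd

# Hypothetical toy table; the dates are illustrative, not read from the SPY file.
toy_dates = pd.DataFrame({'date': ['2014-01-02', '2014-01-03']})

# Before conversion the column holds plain strings.
print(toy_dates['date'].dtype)

# pd.to_datetime parses the strings into timestamps.
toy_dates['date'] = pd.to_datetime(toy_dates['date'])

# After conversion the column has a datetime64 dtype.
print(toy_dates['date'].dtype)
```

Once the column is a `datetime`, the `.dt` accessor (for example `toy_dates['date'].dt.year`) becomes available.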




Finally, let's calculate the daily log-returns from the `adjusted` prices:

In [None]:
##> df_spy['return'] = np.log(df_spy['adjusted']).diff()
##> df_spy.head()
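
The expression `np.log(...).diff()` computes $\ln(P_t) - \ln(P_{t-1}) = \ln(P_t / P_{t-1})$ for each day. As a rough illustration, the sketch below applies the same calculation to three hypothetical prices (not SPY data) and compares the result with simple returns from `pct_change()`:

```python
import numpy as np
import pandas as pd

# Hypothetical toy prices, not taken from the SPY data file.
toy_prices = pd.Series([100.0, 102.0, 99.96])

# Log-return: difference of consecutive log prices, i.e. log(P_t / P_{t-1}).
toy_log_ret = np.log(toy_prices).diff()

# Simple return for comparison: P_t / P_{t-1} - 1.
toy_simple_ret = toy_prices.pct_change()

toy_compare = pd.DataFrame({
    'log_return': toy_log_ret,
    'simple_return': toy_simple_ret,
})
print(toy_compare)
```

The first row is `NaN` in both columns because there is no prior price to compare against. For small daily moves, the two measures are close to each other.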




### Wrangling VIX Data

The second data set that we are going to need is the VIX data from 2014-2018:

In [None]:
##> df_vix_all = pd.read_csv('../data/vix_2014_2018.csv')
##> df_vix_all.head()




We only want the `date` and the `close` columns, so let's create a new `DataFrame` by copying these two columns.

In [None]:
##> df_vix = df_vix_all[['date', 'close']].copy()
##> df_vix.head()




**Knowledge Challenge:** What is the difference between using `.copy()` and not using `.copy()` in the code above?
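
As a hint, here is a minimal sketch using a hypothetical toy table (`toy_all` and `toy_sub` are illustrative names). With an explicit `.copy()`, the new `DataFrame` owns its data, so modifying it leaves the original untouched. Without `.copy()`, some versions of `pandas` may emit a `SettingWithCopyWarning` when a column of the subset is later assigned.

```python
import pandas as pd

# Hypothetical toy table standing in for df_vix_all; values are illustrative.
toy_all = pd.DataFrame({
    'date': ['2014-01-02', '2014-01-03'],
    'close': [14.23, 13.76],
    'volume': [0, 0],
})

# Explicit copy: toy_sub is independent of toy_all.
toy_sub = toy_all[['date', 'close']].copy()

# Modify the copy only.
toy_sub['close'] = toy_sub['close'] / 100

print(toy_all['close'].tolist())  # original values
print(toy_sub['close'].tolist())  # rescaled values
```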

With our newly copied data, `df_vix`, let's convert the `date` column to a `datetime`.

In [None]:
##> df_vix['date'] = pd.to_datetime(df_vix['date'])
##> df_vix.dtypes




To help keep things organized down the road, we will rename the `close` column and call it `vix`.

In [None]:
##> df_vix.rename({'close':'vix'}, axis='columns', inplace=True)
##> df_vix.head()
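
An alternative to `inplace=True` is to assign the result of `.rename()` back to a variable, which leaves the original object unchanged. The sketch below shows this on a hypothetical one-row table (`toy_vix` is an illustrative name); which style to use is largely a matter of preference.

```python
import pandas as pd

# Hypothetical toy table; the value is illustrative, not read from the VIX file.
toy_vix = pd.DataFrame({'date': ['2014-01-02'], 'close': [14.23]})

# rename() returns a new DataFrame; the columns= keyword is equivalent
# to passing the mapping with axis='columns'.
toy_renamed = toy_vix.rename(columns={'close': 'vix'})

print(toy_vix.columns.tolist())      # original column names are unchanged
print(toy_renamed.columns.tolist())  # 'close' has become 'vix'
```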




Let's calculate the daily change in the VIX, and put it in a new column called `vix_chng`.

In [None]:
##> df_vix['vix_chng'] = df_vix['vix'].diff()
##> df_vix.head()




The `return` column in `df_spy` is expressed as a decimal, so let's change the `vix` and `vix_chng` columns of `df_vix` to also be expressed as decimals.

In [None]:
##> df_vix['vix'] = df_vix['vix'] / 100
##> df_vix['vix_chng'] = df_vix['vix_chng'] / 100
##> df_vix.head()




### Adding `vix` and `vix_chng` to `df_spy`

We next add the `vix` and `vix_chng` columns to `df_spy` by joining together the two tables with `pd.merge()`.  

We use the `date` columns to match entries of the two tables.

In [None]:
##> df_spy = pd.merge(df_spy, df_vix, on=['date'])
##> df_spy.head()
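
By default, `pd.merge()` performs an inner join, so only dates present in both tables survive. The sketch below illustrates this on two hypothetical toy tables with partially overlapping dates, and contrasts it with a left join; the `validate` argument is an optional check that each date appears at most once per table.

```python
import pandas as pd

# Hypothetical toy tables; dates and values are illustrative only.
toy_left = pd.DataFrame({
    'date': pd.to_datetime(['2014-01-02', '2014-01-03', '2014-01-06']),
    'return': [0.001, -0.002, 0.003],
})
toy_right = pd.DataFrame({
    'date': pd.to_datetime(['2014-01-03', '2014-01-06', '2014-01-07']),
    'vix_chng': [0.005, -0.004, 0.001],
})

# Default (inner) join: keeps only the dates found in both tables.
toy_inner = pd.merge(toy_left, toy_right, on='date')

# Left join: keeps every row of toy_left; unmatched dates get NaN.
toy_left_join = pd.merge(
    toy_left, toy_right, on='date', how='left', validate='one_to_one'
)

print(toy_inner)
print(toy_left_join)
```

Comparing the row count before and after a merge is a quick way to see whether any dates were dropped.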




### Scatter Plot

Now that we have our data wrangled, we are in a position to use the `DataFrame.plot.scatter()` method to plot daily SPY returns against daily changes in the VIX.

In [None]:
##> df_spy.plot.scatter('return', 'vix_chng');



The following code improves the aesthetics of our plot:

In [None]:
##> df_spy.plot.scatter(
##>     x = 'return'
##>     , y = 'vix_chng'
##>     , grid=True   
##>     , c='k'
##>     , alpha=0.75
##>     , s=10  # changing the size of the dots
##>     , figsize=(8, 6)
##>     , title='SPY Return vs VIX Changes (2014-2018: daily)'
##> );
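
A scatter plot shows the relationship visually; one possible way to summarize it in a single number is the correlation between the two columns. The sketch below uses synthetic data generated with a negative slope by construction (it is not the SPY/VIX data, and the parameters are arbitrary assumptions), purely to show the `Series.corr()` call.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only; the slope and noise levels are
# arbitrary assumptions, not estimates from the SPY/VIX data.
rng = np.random.default_rng(0)
toy_ret = rng.normal(0.0, 0.01, size=500)
toy_chng = -1.2 * toy_ret + rng.normal(0.0, 0.005, size=500)

toy_scatter = pd.DataFrame({'return': toy_ret, 'vix_chng': toy_chng})

# Pearson correlation between the two columns (missing values are skipped).
toy_corr = toy_scatter['return'].corr(toy_scatter['vix_chng'])
print(toy_corr)
```

Assuming the merged `df_spy` from above, the same call would be `df_spy['return'].corr(df_spy['vix_chng'])`; a negative value would be consistent with the implied leverage effect.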


