# Exercise 7 - an analysis of SPY calls

This exercise analyzes a set of 12 SPY option trades from 2018.

#### 1) What do you do at the beginning of each notebook?

In [1]:
import numpy as np
import pandas as pd

#### 2) Read in the two CSVs named `spy_2018_call_trade.csv` and `spy_2018_call_pnl.csv`. Call them `df_trade` and `df_pnl` respectively. The former contains data on the individual trades; the latter gives the price history and pnl data for those trades.

In [2]:
df_trade = pd.read_csv('../data/spy_2018_call_trade.csv')
df_pnl = pd.read_csv('../data/spy_2018_call_pnl.csv')

In [15]:
df_trade.head()

Unnamed: 0,execution_date,direction,quantity,underlying,type,strike,expiration,d2x,trade_price
0,2017-12-15,sell,1,SPY,call,270,2018-01-19,22,1.14
1,2018-01-19,sell,1,SPY,call,284,2018-02-16,20,1.27
2,2018-02-16,sell,1,SPY,call,278,2018-03-16,19,1.95
3,2018-03-16,sell,1,SPY,call,280,2018-04-20,24,1.57
4,2018-04-20,sell,1,SPY,call,272,2018-05-18,20,1.71


In [4]:
df_pnl.head()

Unnamed: 0,underlying,upx,type,expiration,data_date,strike,bid,ask,implied_vol,delta,dly_opt_pnl,dly_dh_pnl
0,SPY,266.529999,call,2018-01-19,2017-12-15,270,1.14,1.16,0.068257,0.328344,-0.02,0.0
1,SPY,268.230011,call,2018-01-19,2017-12-18,270,1.68,1.69,0.07145,0.421353,-0.53,0.558189
2,SPY,267.25,call,2018-01-19,2017-12-19,270,1.39,1.41,0.074841,0.365808,0.28,-0.412931
3,SPY,267.100006,call,2018-01-19,2017-12-20,270,1.1,1.11,0.070911,0.327058,0.3,-0.054869
4,SPY,267.540009,call,2018-01-19,2017-12-21,270,1.31,1.32,0.072183,0.372113,-0.21,0.143906


#### 3) Check that there are a total of 12 options represented in `df_trade`.  Recall that an option is uniquely defined by its `underlying`, `type`, `strike`, and `expiration`.

In [5]:
# checking the number of unique options in df_trade
df_trade.groupby(['underlying', 'type', 'strike', 'expiration']).size().reset_index()

Unnamed: 0,underlying,type,strike,expiration,0
0,SPY,call,270,2018-01-19,1
1,SPY,call,272,2018-05-18,1
2,SPY,call,275,2018-06-15,1
3,SPY,call,278,2018-03-16,1
4,SPY,call,280,2018-04-20,1
5,SPY,call,281,2018-07-20,1
6,SPY,call,281,2018-12-21,1
7,SPY,call,283,2018-08-17,1
8,SPY,call,283,2018-11-16,1
9,SPY,call,284,2018-02-16,1


#### 4) Check that there are a total of 12 options represented in `df_pnl`.

In [6]:
df_pnl.groupby(['underlying', 'type', 'strike', 'expiration']).size().reset_index().shape

(12, 5)
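
The counts in 3) and 4) can also be checked programmatically rather than by reading the output. The following is a minimal sketch on a toy frame (`df_toy` and its values are illustrative assumptions, not the real data); the same two expressions should work on `df_trade` and `df_pnl` with an expected count of 12.

```python
import pandas as pd

# Toy stand-in for df_trade; the values are illustrative, not the real data.
df_toy = pd.DataFrame({
    'underlying': ['SPY', 'SPY', 'SPY'],
    'type': ['call', 'call', 'call'],
    'strike': [270, 284, 281],
    'expiration': ['2018-01-19', '2018-02-16', '2018-07-20'],
})

# The four columns that uniquely define an option.
option_keys = ['underlying', 'type', 'strike', 'expiration']

# Each distinct combination of the key columns is one option.
n_options = len(df_toy.drop_duplicates(subset=option_keys))

# Equivalent count taken from the groupby object itself.
n_groups = df_toy.groupby(option_keys).ngroups

print(n_options, n_groups)
```

Either count can be wrapped in an `assert` so the notebook fails loudly if the data changes.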

#### 5) These trades were selected so that, at execution, each option had a delta close to 0.30. Verify that this is actually the case.

In [7]:
# separating out execution dates from df_trade
df_execution = df_trade[['expiration', 'execution_date']]

# joining df_pnl and df_execution to add execution_date to df_pnl
df_pnl = \
    pd.merge(df_pnl, df_execution, on=['expiration'])

In [8]:
# filtering the trades on execution-date and using .describe()
# to examine the distributions of the deltas
df_pnl[df_pnl.execution_date == df_pnl.data_date]['delta'].describe()

count    12.000000
mean      0.307980
std       0.014992
min       0.283518
25%       0.296913
50%       0.306119
75%       0.318692
max       0.330158
Name: delta, dtype: float64
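
The merge-then-filter pattern used here can be made a little more defensive. The sketch below uses toy frames with illustrative values; the 0.05 tolerance around 0.30 is a hypothetical threshold chosen for the example, not something specified by the exercise.

```python
import pandas as pd

# Toy frames; values are illustrative only.
pnl = pd.DataFrame({
    'expiration': ['2018-01-19', '2018-01-19', '2018-02-16'],
    'data_date': ['2017-12-15', '2017-12-18', '2018-01-19'],
    'delta': [0.33, 0.42, 0.29],
})
trades = pd.DataFrame({
    'expiration': ['2018-01-19', '2018-02-16'],
    'execution_date': ['2017-12-15', '2018-01-19'],
})

# validate='many_to_one' raises MergeError if an expiration appears
# more than once in `trades`, which would duplicate rows in the result.
merged = pd.merge(pnl, trades, on='expiration', validate='many_to_one')

# Keep only the rows observed on the execution date.
on_execution = merged[merged['execution_date'] == merged['data_date']]

# Hypothetical tolerance: every execution-date delta within 0.05 of 0.30.
within_band = (on_execution['delta'] - 0.30).abs() <= 0.05

print(within_band.all())
```

Joining on `expiration` alone works in this exercise because there is exactly one trade per expiration; with several strikes per expiration, the join key would need to include the other identifying columns.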

#### 6) Verify that the trade-price for each of the trades is equal to the bid-price on execution date.

In [9]:
# separating premium by expiration
df_premium = df_trade[['expiration', 'trade_price']]

# getting bid prices on execution date
df_bid = \
    df_pnl[df_pnl.execution_date == df_pnl.data_date][['expiration', 'bid']]

# joining them together to do a comparison
# since it's a small number you can do a visual comparison
# for a larger set, you could save this to a dataframe and
# do a proper check with masking
pd.merge(df_premium, df_bid, on='expiration')

Unnamed: 0,expiration,trade_price,bid
0,2018-01-19,1.14,1.14
1,2018-02-16,1.27,1.27
2,2018-03-16,1.95,1.95
3,2018-04-20,1.57,1.57
4,2018-05-18,1.71,1.71
5,2018-06-15,1.46,1.46
6,2018-07-20,1.48,1.48
7,2018-08-17,1.35,1.35
8,2018-09-21,1.39,1.39
9,2018-10-19,1.15,1.15
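
The "proper check with masking" mentioned in the comments could look something like the sketch below. The frame `df_check` is a toy stand-in for the merged result, with one row deliberately mismatched so that the mask has something to find.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the merged premium/bid table; values are illustrative.
df_check = pd.DataFrame({
    'expiration': ['2018-01-19', '2018-02-16', '2018-03-16'],
    'trade_price': [1.14, 1.27, 1.95],
    'bid': [1.14, 1.27, 1.90],   # last row deliberately mismatched
})

# np.isclose avoids false mismatches from floating-point rounding.
matches = np.isclose(df_check['trade_price'], df_check['bid'])

# Boolean mask selects only the rows where the two prices disagree.
df_mismatch = df_check[~matches]

print(df_mismatch)
```

On the real data, an empty `df_mismatch` would confirm that every trade was executed at the bid.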


#### 7) Our data consists of one option trade per monthly expiration in 2018.  Calculate the total PNL by expiration.

In [10]:
# this is a straightforward .groupby().agg() pattern
# (passing the string 'sum' avoids the FutureWarning that
# recent pandas versions emit for np.sum)
df_pnl_expiration = \
    df_pnl.groupby(['expiration'])['dly_opt_pnl'].agg(['sum']).reset_index()

# renaming the total PNL column
df_pnl_expiration.rename(columns={'sum':'exp_pnl'}, inplace=True)

df_pnl_expiration.head()

Unnamed: 0,expiration,exp_pnl
0,2018-01-19,-9.250015
1,2018-02-16,1.27
2,2018-03-16,1.95
3,2018-04-20,1.57
4,2018-05-18,1.71
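
Several of the `exp_pnl` values above equal the corresponding trade price (for example 1.27 for the 2018-02-16 expiration). That is what you would expect for a short call that expires worthless, assuming each day's pnl is the change in the option's mark. The sketch below illustrates the arithmetic; the first three marks mirror the `ask` column shown earlier, and the remaining marks are made up.

```python
import numpy as np

# A call sold at 1.14 and marked daily until it expires worthless.
# The last two marks are invented for illustration.
trade_price = 1.14
marks = np.array([1.16, 1.69, 1.41, 0.60, 0.0])

# Short position: pnl is positive when the mark falls.
# Day 0 compares the sale price to the first mark.
prev = np.concatenate(([trade_price], marks[:-1]))
daily_pnl = prev - marks

# The daily pnls telescope to (sale price - final mark).
total_pnl = daily_pnl.sum()

print(daily_pnl, total_pnl)
```

With a final mark of zero, the total is just the premium collected; a final mark above the premium produces a loss, as in the 2018-01-19 expiration.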


#### 8) Delta-hedging is a risk-management practice that is foundational to option pricing and option replication.  The details of delta-hedging are beyond the scope of this course, but we'll touch on it a bit in the next few questions.

#### The `df_pnl['dly_dh_pnl']` column is the pnl from the delta-hedge that you would hold against the option position.  As a preliminary step, create a new column called `df_pnl['dly_tot_pnl']` that is the sum of the option pnl and the delta-hedge pnl.

In [16]:
# creating the total PNL column
df_pnl['dly_tot_pnl'] = df_pnl.dly_opt_pnl + df_pnl.dly_dh_pnl
df_pnl.head()

Unnamed: 0,underlying,upx,type,expiration,data_date,strike,bid,ask,implied_vol,delta,dly_opt_pnl,dly_dh_pnl,execution_date,dly_tot_pnl
0,SPY,266.529999,call,2018-01-19,2017-12-15,270,1.14,1.16,0.068257,0.328344,-0.02,0.0,2017-12-15,-0.02
1,SPY,268.230011,call,2018-01-19,2017-12-18,270,1.68,1.69,0.07145,0.421353,-0.53,0.558189,2017-12-15,0.028189
2,SPY,267.25,call,2018-01-19,2017-12-19,270,1.39,1.41,0.074841,0.365808,0.28,-0.412931,2017-12-15,-0.132931
3,SPY,267.100006,call,2018-01-19,2017-12-20,270,1.1,1.11,0.070911,0.327058,0.3,-0.054869,2017-12-15,0.245131
4,SPY,267.540009,call,2018-01-19,2017-12-21,270,1.31,1.32,0.072183,0.372113,-0.21,0.143906,2017-12-15,-0.066094
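
To get a feel for where `dly_dh_pnl` comes from, the sketch below rebuilds it from the five rows shown above. It assumes the hedge for the short call is a long position of the previous day's `delta` in shares of the underlying; that rule is an assumption inferred from the sample rows, not something the data files document.

```python
import pandas as pd

# First five rows of df_pnl as shown in the output above.
df = pd.DataFrame({
    'upx': [266.529999, 268.230011, 267.25, 267.100006, 267.540009],
    'delta': [0.328344, 0.421353, 0.365808, 0.327058, 0.372113],
    'dly_dh_pnl': [0.0, 0.558189, -0.412931, -0.054869, 0.143906],
})

# Assumed rule: hold yesterday's delta in shares against the short call,
# so today's hedge pnl is yesterday's delta times today's price change.
df['dh_rebuilt'] = (df['delta'].shift(1) * df['upx'].diff()).fillna(0.0)

# Largest absolute gap between the rebuilt and the published column.
max_gap = (df['dh_rebuilt'] - df['dly_dh_pnl']).abs().max()

print(df[['dly_dh_pnl', 'dh_rebuilt']], max_gap)
```

On these rows the rebuilt values agree with the published column to within rounding, which supports the assumed rule for this sample.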


#### 9) Calculate the daily aggregates of the option-pnl, delta-hedge pnl, and total pnl.  Put these aggregate daily pnls into a dataframe called `df_comparison`.

In [12]:
# aggregating to get daily total
df_comparison = (
    df_pnl.groupby(['data_date'])
    .agg({'dly_opt_pnl': 'sum', 'dly_dh_pnl': 'sum', 'dly_tot_pnl': 'sum'})
    .reset_index()
)

df_comparison.head()

Unnamed: 0,data_date,dly_opt_pnl,dly_dh_pnl,dly_tot_pnl
0,2017-12-15,-0.02,0.0,-0.02
1,2017-12-18,-0.53,0.558189,0.028189
2,2017-12-19,0.28,-0.412931,-0.132931
3,2017-12-20,0.3,-0.054869,0.245131
4,2017-12-21,-0.21,0.143906,-0.066094


#### 10) Delta-hedging is a *hedge* against the option position, which means the pnl from delta-hedging should move in the opposite direction to the pnl from the option. If this is the case, what should the correlation between `dly_opt_pnl` and `dly_dh_pnl` be? Use `np.corrcoef()` on `df_comparison` to verify this.

In [13]:
# the delta-hedging pnl is strongly negatively correlated with the option pnl
np.corrcoef(df_comparison.dly_opt_pnl, df_comparison.dly_dh_pnl)

array([[ 1.       , -0.9274256],
       [-0.9274256,  1.       ]])
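
`np.corrcoef()` returns the full 2x2 correlation matrix, so the number of interest is the off-diagonal entry. As a small sketch, the snippet below pulls that entry out and compares it with pandas' `Series.corr()`, using only the five daily rows of `df_comparison` shown earlier (so the value will differ from the full-sample figure above).

```python
import numpy as np
import pandas as pd

# First five daily rows of df_comparison as shown above.
df = pd.DataFrame({
    'dly_opt_pnl': [-0.02, -0.53, 0.28, 0.30, -0.21],
    'dly_dh_pnl': [0.0, 0.558189, -0.412931, -0.054869, 0.143906],
})

# np.corrcoef returns the full 2x2 matrix; [0, 1] is the cross term.
rho_np = np.corrcoef(df['dly_opt_pnl'], df['dly_dh_pnl'])[0, 1]

# Series.corr returns the same Pearson coefficient as a scalar.
rho_pd = df['dly_opt_pnl'].corr(df['dly_dh_pnl'])

print(rho_np, rho_pd)
```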

#### 11) A hedge should also reduce the risk of your portfolio. Check that this is the case by comparing the standard deviations of `dly_opt_pnl` and `dly_tot_pnl`.

In [18]:
# comparing standard deviation of PNLs
# the delta-hedged position has less than half the risk of naked options
print('naked options: ', np.std(df_comparison.dly_opt_pnl))
print('delta hedged:  ', np.std(df_comparison.dly_tot_pnl))

naked options:  0.581733284292837
delta hedged:   0.2373096110672859
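
One detail worth knowing when comparing standard deviations: NumPy and pandas use different defaults. `np.std` on an array divides by `n` (`ddof=0`), while `Series.std` divides by `n - 1` (`ddof=1`). The sketch below shows the difference on a short, made-up series; for a ratio of two standard deviations computed the same way, the choice does not matter.

```python
import numpy as np
import pandas as pd

# Illustrative pnl series; not the real data.
pnl = pd.Series([-0.02, 0.03, -0.13, 0.25, -0.07])

# np.std defaults to ddof=0 (population standard deviation).
std_population = np.std(pnl.to_numpy())

# Series.std defaults to ddof=1 (sample standard deviation).
std_sample = pnl.std()

# The two agree once ddof is set explicitly.
std_sample_np = np.std(pnl.to_numpy(), ddof=1)

print(std_population, std_sample, std_sample_np)
```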
