This notebook extracts the price observations of all options that share a given expiry.

### Set up environment

Loads:
- data
- relevant packages

In [1]:
@time include("../startup_script.jl")

elapsed time: 0.48341778 seconds (52396804 bytes allocated, 6.27% gc time)
elapsed time: 56.18824704 seconds (5419744236 bytes allocated, 74.06% gc time)
elapsed time: 0.732155466 seconds (112671656 bytes allocated, 58.50% gc time)
elapsed time: 74.062693611 seconds (5419744236 bytes allocated, 79.95% gc time)
elapsed time: 153.800062536 seconds (14794073832 bytes allocated, 68.11% gc time)


| Row | Date | ID | Bid | Ask | Volume | Open_Interest |
|---|---|---|---|---|---|---|
| 1 | 2006-07-03 | c_20061215_1800 | | | 1 | 104 |
| 2 | 2006-07-03 | p_20061215_1800 | | | 0 | 5515 |
| 3 | 2006-07-03 | c_20061215_2000 | | | 0 | 2152 |
| 4 | 2006-07-03 | p_20061215_2000 | | | 0 | 20941 |
| 5 | 2006-07-03 | c_20061215_2200 | | | 0 | 2 |
| 6 | 2006-07-03 | p_20061215_2200 | | | 0 | 4626 |


- show workspace variables

In [3]:
whos()

ArrayViews                    Module
Base                          Module
Compat                        Module
Core                          Module
DataArrays                    Module
DataFrames                    Module
DataStructures                Module
Dates                         Module
Docile                        Module
GZip                          Module
IJulia                        Module
IPythonDisplay                Module
JSON                          Module
Main                          Module
Nettle                        Module
Reexport                      Module
SortingAlgorithms             Module
StatsBase                     Module
ZMQ                           Module
addObs                        2025129x6 DataFrame
cohortParams                  21053x4 DataFrame
convertColToDates!            Function
daxVals                       1908x2 DataFrame
optPrices                     2025129x3 DataFrame
opts                          12917x4 DataFrame


- expiration dates lie in the middle of the month; every date shown in the output is the third Friday of its month:

In [5]:
sort(unique(cohortParams[:, :Expiry]))

97-element DataArray{Date,1}:
 2006-07-21
 2006-08-18
 2006-09-15
 2006-10-20
 2006-11-17
 2006-12-15
 2007-01-19
 2007-02-16
 2007-03-16
 2007-04-20
 2007-05-18
 2007-06-15
 2007-07-20
 ⋮         
 2013-07-19
 2013-08-16
 2013-09-20
 2013-10-18
 2013-11-15
 2013-12-20
 2014-01-17
 2014-02-21
 2014-03-21
 2014-06-20
 2014-09-19
 2014-12-19
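
The third-Friday pattern can be checked programmatically. The following is a minimal sketch, written for current Julia 1.x rather than the older version used in this notebook. The helper `is_third_friday` is a hypothetical name introduced here, and only three of the expiries from the output are checked.

```julia
using Dates

# Hypothetical helper: true if `d` is the third Friday of its month.
is_third_friday(d::Date) =
    Dates.dayofweek(d) == Dates.Friday && Dates.dayofweekofmonth(d) == 3

# Three expiries copied from the notebook output.
expiries = [Date(2006, 7, 21), Date(2007, 6, 15), Date(2014, 12, 19)]
all_third_fridays = all(is_third_friday, expiries)
```

Applied to the full `:Expiry` column, the same predicate would flag any expiry that was moved, for example because of a holiday.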

- number of different expiration dates / cohorts:

In [6]:
nExpiry = length(unique(cohortParams[:, :Expiry]))

97

- select an arbitrary expiry by picking the expiry of an arbitrarily chosen option

In [7]:
optionIndex = 473
expDate = opts[optionIndex, :Expiry]

2007-06-15

- get all observations of options with this expiry

In [8]:
@time begin
    allObsData = join(optPrices, opts, on = :ID)
    relevObsData = allObsData[allObsData[:Expiry] .== expDate, :]
end

head(relevObsData)

elapsed time: 3.382853769 seconds (333553020 bytes allocated, 37.92% gc time)


| Row | Date | ID | Price | Expiry | Strike | IsCall |
|---|---|---|---|---|---|---|
| 1 | 2006-07-03 | c_20070615_1800 | 3963.8 | 2007-06-15 | 1800 | True |
| 2 | 2006-07-04 | c_20070615_1800 | 3984.4 | 2007-06-15 | 1800 | True |
| 3 | 2006-07-05 | c_20070615_1800 | 3883.6 | 2007-06-15 | 1800 | True |
| 4 | 2006-07-06 | c_20070615_1800 | 3949.7 | 2007-06-15 | 1800 | True |
| 5 | 2006-07-07 | c_20070615_1800 | 3938.1 | 2007-06-15 | 1800 | True |
| 6 | 2006-07-10 | c_20070615_1800 | 3964.1 | 2007-06-15 | 1800 | True |
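
In the rows shown, the `ID` string appears to encode the option type (`c` or `p`), the expiry and the strike. A minimal sketch of decoding such an ID follows, written for current Julia 1.x. The helper `parse_option_id` is hypothetical, and it assumes that every ID follows the `type_yyyymmdd_strike` pattern with an integer strike, which the notebook does not confirm.

```julia
using Dates

# Hypothetical helper: split an ID such as "c_20070615_1800" into
# (is_call, expiry, strike). Assumes the pattern type_yyyymmdd_strike.
function parse_option_id(id::AbstractString)
    kind, expiry, strike = split(id, "_")
    return (kind == "c", Date(expiry, dateformat"yyyymmdd"), parse(Int, strike))
end

is_call, expiry, strike = parse_option_id("c_20070615_1800")
```

Under that assumption, the expiry could be read off the ID directly, without joining against `opts`.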


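- calculate time values (option price minus intrinsic value) and check what share of them is negative; this uses `tvData`, which is built in the "create dataset" cell further down (the `In [..]` counters show that the cells were run out of order)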

In [44]:
function intrinsicValue(x::DataFrame)
    # intrinsic value: max(DAX - Strike, 0) for calls, max(Strike - DAX, 0) for puts
    nObs = size(x, 1)
    intrVals = zeros(nObs)
    daxPrices, strikes = x[:DAX], x[:Strike]
    diffs = daxPrices - strikes
    for ii=1:nObs
        if x[ii, :IsCall]
            # scalar max avoids allocating a temporary array for every row
            intrVals[ii] = max(diffs[ii], 0)
        else
            intrVals[ii] = max(-diffs[ii], 0)
        end
    end
    return intrVals
end

intrinsicValue (generic function with 1 method)

In [47]:
@time intrVals = intrinsicValue(tvData)
tvs = tvData[:Price] - intrVals[:]

sum(tvs .< 0)/size(tvData, 1)

elapsed time: 5.182277295 seconds (405649896 bytes allocated, 76.13% gc time)


0.08721814758467239
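
The check rests on the usual split of an option price into intrinsic value and time value. Written out, with $S_t$ the DAX level, $K$ the strike and $P_t$ the option price (symbols introduced here for illustration, they are not used in the notebook):

$$
\mathrm{IV}_t =
\begin{cases}
\max(S_t - K,\, 0) & \text{for a call} \\
\max(K - S_t,\, 0) & \text{for a put}
\end{cases}
\qquad
\mathrm{TV}_t = P_t - \mathrm{IV}_t
$$

For the first row of `tvData` (a call with price 1212.0, strike 4500 and DAX 5712.69), the intrinsic value is $5712.69 - 4500 = 1212.69$ and the time value is $1212.0 - 1212.69 = -0.69$, so this observation counts towards the negative share.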

- create dataset: join option prices with option parameters and DAX values

In [34]:
tvData = join(optPrices, opts, on = :ID) |>
x -> join(x, daxVals, on = :Date)

head(tvData)

| Row | Date | ID | Price | Expiry | Strike | IsCall | DAX |
|---|---|---|---|---|---|---|---|
| 1 | 2006-07-03 | c_20060721_4500 | 1212.0 | 2006-07-21 | 4500 | True | 5712.69 |
| 2 | 2006-07-03 | c_20060721_4600 | 1112.3 | 2006-07-21 | 4600 | True | 5712.69 |
| 3 | 2006-07-03 | c_20060721_4700 | 1012.7 | 2006-07-21 | 4700 | True | 5712.69 |
| 4 | 2006-07-03 | c_20060721_4800 | 913.2 | 2006-07-21 | 4800 | True | 5712.69 |
| 5 | 2006-07-03 | c_20060721_4850 | 863.5 | 2006-07-21 | 4850 | True | 5712.69 |
| 6 | 2006-07-03 | c_20060721_4900 | 813.9 | 2006-07-21 | 4900 | True | 5712.69 |


- split into calls and puts and compute the share of negative time values for each

In [35]:
tvDataCall = tvData[tvData[:IsCall], :]

tvsCall = timeValueCall(tvDataCall)
sum(tvsCall .< 0)/length(tvsCall)

0.029890330587380224

In [36]:
tvDataPut = tvData[!tvData[:IsCall], :]

tvsPut = timeValuePut(tvDataPut)
sum(tvsPut .< 0)/length(tvsPut)

0.14181716865059663
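
`timeValueCall` and `timeValuePut` are not defined in the cells shown here. The following is a minimal sketch of what such helpers might compute, assuming they return price minus intrinsic value. It is written for current Julia 1.x, works on plain numbers rather than DataFrames, and the names `time_value_call`, `time_value_put` and `frac_negative` are hypothetical.

```julia
# Hypothetical helpers; the notebook's own timeValueCall/timeValuePut are not shown.
time_value_call(price, spot, strike) = price - max(spot - strike, 0)
time_value_put(price, spot, strike) = price - max(strike - spot, 0)

# Second row of tvData: call with price 1112.3, strike 4600, DAX 5712.69.
tv_call = time_value_call(1112.3, 5712.69, 4600)

# Made-up put quote, for illustration only.
tv_put = time_value_put(30.0, 5712.69, 5700)

# Share of negative time values in a vector.
frac_negative(tvs) = sum(tvs .< 0) / length(tvs)
share = frac_negative([tv_call, tv_put])
```

The call from `tvData` has a slightly negative time value, while the made-up out-of-the-money put has no intrinsic value, so its time value equals its price.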

- add the DAX values to the single-expiry observations

In [None]:
@time relevObsData = join(relevObsData, daxVals, on = :Date)

head(relevObsData)

- export for visualization

In [None]:
writetable("../data/chart_data/singleCohortLong.csv", relevObsData)

- transform to wide format

In [None]:
relevObsDataWide = unstack(relevObsData, :Date, :ID, :Price)
rename!(relevObsDataWide, :ID, :Date)
relevObsDataWide = join(daxVals, relevObsDataWide, on = :Date)
head(relevObsDataWide)
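
To illustrate what the long-to-wide step produces, here is a minimal sketch in plain Julia 1.x that does not use DataFrames. The records, the shortened IDs and the use of `NaN` for missing quotes are all made up for illustration and say nothing about how `unstack` represents missing values.

```julia
# Long-format toy records (date, id, price); all values are made up.
long = [("2006-07-03", "c_1800", 10.0),
        ("2006-07-03", "p_1800", 2.0),
        ("2006-07-04", "c_1800", 11.0)]

dates = sort(unique([r[1] for r in long]))
ids = sort(unique([r[2] for r in long]))
row_of = Dict(d => i for (i, d) in enumerate(dates))
col_of = Dict(id => j for (j, id) in enumerate(ids))

# One row per date, one column per option ID; NaN marks combinations without a quote.
wide = fill(NaN, length(dates), length(ids))
for (d, id, p) in long
    wide[row_of[d], col_of[id]] = p
end
```

Each option becomes its own column, which is the layout written to `singleCohortWide.csv`.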

In [None]:
writetable("../data/chart_data/singleCohortWide.csv", relevObsDataWide)