## Extracting Option Prices from CSVs

The purpose of this code is to extract the Delta-Neutral option prices from CSVs.  The basic idea is to loop through all the large raw CSV files from Delta-Neutral, grab only the underlyings I want to focus on, and create temp CSV files for each underlying on each trade-date.  Then, in the next notebook, I am going to load those into the postgres database.

## Importing Packages

In [None]:
import os
import glob
import shutil
import pandas as pd
from zipfile import ZipFile

## Reading-In Underlyings to Focus On

We are going to focus our attention on the underlyings that have weekly options.  Ultimately only 40 of them made it into my `otm_history` database.

In [None]:
df_etf = pd.read_csv('weekly_underlyings_20210405.csv')
df_etf

Unnamed: 0,ticker,name
0,AMLP,ALPS ETF TR ALERIAN MLP
1,ARKF,ARK ETF TR FINTECH INNOVA
2,ARKK,ARK ETF TR INNOVATION ETF
3,ASHR,DBX ETF TR XTRACK HRVST CSI
4,BRZU,DIREXION SHS ETF TR BRZ BL 2X SHS
...,...,...
78,XLV,SELECT SECTOR SPDR TR SBI HEALTHCARE
79,XLY,SELECT SECTOR SPDR TR SBI CONS DISCR
80,XME,SPDR SER TR S&P METALS MNG
81,XOP,SPDR SER TR S&P OILGAS EXP
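
The `Unnamed: 0` column in the output above is what pandas typically shows when a CSV was saved with its index and then read back without `index_col`. Here is a minimal sketch of reading such a file cleanly. It assumes the first column of the file is an unnamed saved index, and it uses a small in-memory stand-in (two rows copied from the output above) rather than the real file.

```python
import io
import pandas as pd

# stand-in for weekly_underlyings_20210405.csv (assumption: first column is an unnamed saved index)
csv_text = ",ticker,name\n0,AMLP,ALPS ETF TR ALERIAN MLP\n1,ARKF,ARK ETF TR FINTECH INNOVA\n"

# index_col=0 tells pandas to treat the first column as the index instead of data
df_sample = pd.read_csv(io.StringIO(csv_text), index_col=0)

# plain list of tickers, convenient for looping or for isin() filters
lst_ticker = df_sample['ticker'].tolist()
print(lst_ticker)
```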


## Removing Old Temp Directories and Creating New Temp Directories

Let's remove the old temp directories and make the new temp directories.

In [None]:
# removing old directories
if os.path.isdir('/home/pritam/files/data/delta_neutral/temp/option_price'):
    shutil.rmtree('/home/pritam/files/data/delta_neutral/temp/option_price')
if os.path.isdir('/home/pritam/files/data/delta_neutral/temp/etl'):
    shutil.rmtree('/home/pritam/files/data/delta_neutral/temp/etl')


# making new directories
os.makedirs('/home/pritam/files/data/delta_neutral/temp/option_price')
os.makedirs('/home/pritam/files/data/delta_neutral/temp/etl')
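
The remove-then-recreate pattern above appears again inside the main loop, so it could be wrapped in a small helper. The sketch below is one way to do that; `reset_dir` is a hypothetical name, and the demonstration runs against a throwaway directory from `tempfile` rather than the real temp paths.

```python
import os
import shutil
import tempfile

def reset_dir(path):
    """Delete path if it exists, then recreate it empty (hypothetical helper)."""
    if os.path.isdir(path):
        shutil.rmtree(path)
    os.makedirs(path)

# demonstration in a throwaway location, not the real temp directories
demo_root = tempfile.mkdtemp()
demo_dir = os.path.join(demo_root, 'etl')

reset_dir(demo_dir)                                      # creates the directory
open(os.path.join(demo_dir, 'stale.csv'), 'w').close()   # leaves a stale file behind
reset_dir(demo_dir)                                      # wipes it and starts fresh
print(os.listdir(demo_dir))
```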

Next, let's grab all the paths for all the raw Delta-Neutral files.  They live on an external hard drive.

In [None]:
lst_path_zip = glob.glob('/media/pritam/250gb/delta_neutral/*.zip')
#lst_path_zip

Finally, let's loop through all the Delta-Neutral files and create the temp files for all the underlying/trade-date combinations.

In [None]:
# looping through all the zip files on the external hard drive
for ix_path_zip in lst_path_zip:
    print(os.path.basename(ix_path_zip))
    #print(ix_path_zip)

    # removing existing etl directory and creating a new one
    if os.path.isdir('/home/pritam/files/data/delta_neutral/temp/etl'):
        shutil.rmtree('/home/pritam/files/data/delta_neutral/temp/etl')
    os.makedirs('/home/pritam/files/data/delta_neutral/temp/etl')

    # extracting files in ix_path_zip (zip_file avoids shadowing the built-in zip)
    with ZipFile(ix_path_zip, 'r') as zip_file:
        zip_file.extractall(path = '/home/pritam/files/data/delta_neutral/temp/etl/')

    # depending on how the extraction went, change the location where we look for the CSVs
    # sometimes extractall creates a subdirectory like L2_2019_04 and sometimes it doesn't
    temp_unzipped_dir = os.path.basename(ix_path_zip)[0:10]
    path_csv = '/home/pritam/files/data/delta_neutral/temp/etl/'
    if os.path.isdir(path_csv  + temp_unzipped_dir):
        path_csv = path_csv + temp_unzipped_dir

    # collecting the daily CSV paths in filename (chronological) order
    lst_path_csv = sorted(glob.glob(path_csv  + '/L3_options_*.csv'))

    for ix_csv_path in lst_path_csv:
        print(ix_csv_path)
        df_all = pd.read_csv(ix_csv_path)
        df_all.rename(columns=lambda x: x.strip(), inplace=True)
        df_all = df_all.rename(columns={
            'Last':'LastPx',
            'Bid':'BidPx',
            'Ask':'AskPx'
            })
        #lst_underlying = ['SPY', 'IWM', 'GLD', 'SLV', 'QQQ']
        lst_underlying = df_etf['ticker']
        cols = ['UnderlyingSymbol', 'UnderlyingPrice', 'Flags', 'OptionSymbol',
                'Type', 'Expiration', 'DataDate', 'Strike', 'LastPx', 'BidPx', 'AskPx', 'Volume',
                'OpenInterest', 'T1OpenInterest']
        # the output directory and trade-date are the same for every underlying in this file
        base_path = "/home/pritam/files/data/delta_neutral/temp/option_price/"
        ymd = pd.to_datetime(df_all['DataDate'].iloc[0]).strftime('%Y%m%d')
        for ix_underlying in lst_underlying:
            # keeping only this underlying's rows and writing them to a temp CSV
            df_underlying = df_all.query('UnderlyingSymbol == @ix_underlying')[cols]
            file_name = ymd + '_' + ix_underlying.lower() + '_options.csv'
            df_underlying.to_csv(base_path + file_name, index = False)

print("DONE!")
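
The inner loop above filters `df_all` once per ticker. As an alternative sketch (not what this notebook does), a single `groupby` pass can split one day's data by underlying. The `split_by_underlying` name, the toy data, the ISO date format, and the output directory are all assumptions for illustration. Unlike the loop above, this version writes no file for a ticker that has no rows on that date.

```python
import os
import tempfile
import pandas as pd

def split_by_underlying(df_all, tickers, out_dir):
    """Write one CSV per requested underlying present in df_all (hypothetical helper)."""
    # every row in a daily file shares the same trade-date, so the first row is enough
    ymd = pd.to_datetime(df_all['DataDate'].iloc[0]).strftime('%Y%m%d')
    # drop everything outside the ticker list before grouping
    df_keep = df_all[df_all['UnderlyingSymbol'].isin(tickers)]
    written = []
    for ticker, df_underlying in df_keep.groupby('UnderlyingSymbol'):
        file_name = ymd + '_' + ticker.lower() + '_options.csv'
        df_underlying.to_csv(os.path.join(out_dir, file_name), index=False)
        written.append(file_name)
    return sorted(written)

# toy stand-in for one daily file (made-up rows, only three of the real columns)
df_toy = pd.DataFrame({
    'UnderlyingSymbol': ['SPY', 'SPY', 'IWM', 'AAPL'],
    'DataDate': ['2002-02-01'] * 4,
    'Strike': [100.0, 105.0, 50.0, 20.0],
})
out_dir = tempfile.mkdtemp()
lst_written = split_by_underlying(df_toy, ['SPY', 'IWM', 'GLD'], out_dir)
print(lst_written)
```

Whether skipping empty files is desirable depends on how the next notebook loads the temp CSVs, so treat this as a design option rather than a drop-in replacement.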

L3_2002_02.zip
/home/pritam/files/data/delta_neutral/temp/etl/L3_options_20020201.csv
/home/pritam/files/data/delta_neutral/temp/etl/L3_options_20020204.csv
/home/pritam/files/data/delta_neutral/temp/etl/L3_options_20020205.csv
/home/pritam/files/data/delta_neutral/temp/etl/L3_options_20020206.csv
/home/pritam/files/data/delta_neutral/temp/etl/L3_options_20020207.csv
/home/pritam/files/data/delta_neutral/temp/etl/L3_options_20020208.csv
/home/pritam/files/data/delta_neutral/temp/etl/L3_options_20020211.csv
/home/pritam/files/data/delta_neutral/temp/etl/L3_options_20020212.csv
/home/pritam/files/data/delta_neutral/temp/etl/L3_options_20020213.csv
/home/pritam/files/data/delta_neutral/temp/etl/L3_options_20020214.csv
/home/pritam/files/data/delta_neutral/temp/etl/L3_options_20020215.csv
/home/pritam/files/data/delta_neutral/temp/etl/L3_options_20020219.csv
/home/pritam/files/data/delta_neutral/temp/etl/L3_options_20020220.csv
/home/pritam/files/data/delta_neutral/temp/etl/L3_options_2002