# Democratic Caucus Miscellaneous Measurements

This notebook contains some miscellaneous calculations added to our paper during the review process. Among other things, it calculates the Holding Times found in Table 5.

To run this notebook, you'll first need to run the Trader Analysis notebooks.

In [1]:
import os
import sys
import pickle

from collections import defaultdict

import numpy as np
import pandas as pd
from pandas import Series

from research_tools import storage

# Load Data

In [2]:
os.chdir('..')

basename = 'dem'

trader_analysis, = storage.retrieve_all([basename + '.trader_analysis'])

Reading data from data/dem.trader_analysis.p


In [3]:
trader_analysis.head()

| contract_id (index) | user_guid (index) | index | seq | market_id | contract_id | user_guid | date_executed | trade_type | price_per_share | quantity | placed_order_id | ... | corrected_side | take_provide | notional | buy_sell | cash_flow | yes_no | gross_pnl | fee | pnl_net_fee | close_trade |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 840 | 0034C80D-C854-3C60-8F01-64B48B565AA5 | 20625 | 1453407294297 | 1448 | 840 | 0034C80D-C854-3C60-8F01-64B48B565AA5 | 2016-01-21 15:14:54.297000-05:00 | Buy No | 0.39 | 100 | 1137522 | ... | -1 | T | 39.0 | 1 | -39.0 | No | 0.0 | 0.0 | 0.0 | False |
| 840 | 0034C80D-C854-3C60-8F01-64B48B565AA5 | 6478 | 1454440788963 | 1448 | 840 | 0034C80D-C854-3C60-8F01-64B48B565AA5 | 2016-02-02 15:35:18.690000-05:00 | Close | 0.0 | 100 | -1 | ... | -1 | C | 0.0 | -1 | 0.0 | No | -39.0 | 0.0 | -39.0 | True |
| 840 | 005E1296-C898-3911-A4C1-0B33FAB05A29 | 13173 | 1451717188823 | 1448 | 840 | 005E1296-C898-3911-A4C1-0B33FAB05A29 | 2016-01-02 01:46:28.823000-05:00 | Buy No | 0.31 | 10 | 832820 | ... | -1 | T | 3.1 | 1 | -3.1 | No | 0.0 | 0.0 | 0.0 | False |
| 840 | 005E1296-C898-3911-A4C1-0B33FAB05A29 | 13174 | 1451717189107 | 1448 | 840 | 005E1296-C898-3911-A4C1-0B33FAB05A29 | 2016-01-02 01:46:29.107000-05:00 | Buy No | 0.32 | 50 | 832820 | ... | -1 | T | 16.0 | 1 | -16.0 | No | 0.0 | 0.0 | 0.0 | False |
| 840 | 005E1296-C898-3911-A4C1-0B33FAB05A29 | 13175 | 1451717189357 | 1448 | 840 | 005E1296-C898-3911-A4C1-0B33FAB05A29 | 2016-01-02 01:46:29.357000-05:00 | Buy No | 0.33 | 50 | 832820 | ... | -1 | T | 16.5 | 1 | -16.5 | No | 0.0 | 0.0 | 0.0 | False |


# Traders with Positions at Market Close

How many traders had a position at market close?

In [4]:
close_traders = trader_analysis.groupby('user_guid').apply(lambda x: x.close_trade.any())

close_traders.head()

user_guid
0022AC92-4A31-3308-BCB9-D94C6F507A31    False
00318BA5-01FC-34A4-A4A1-3523BF5485C6     True
0034C80D-C854-3C60-8F01-64B48B565AA5     True
005E1296-C898-3911-A4C1-0B33FAB05A29    False
005E56D2-76B6-39DA-9199-366D761FE63D     True
dtype: bool

In [5]:
close_traders.sum() / close_traders.count()

0.66826666666666668

In [6]:
close_traders.sum(), close_traders.count()

(2506, 3750)
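
The ratio above is the mean of a per-trader boolean flag. A minimal sketch of that calculation on made-up data follows; the toy frame and its values are hypothetical and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical toy trades: three traders, one of whom never has a closing trade.
toy_trades = pd.DataFrame({
    'user_guid':   ['a', 'a', 'b', 'c', 'c'],
    'close_trade': [False, True, False, False, True],
})

# True for a trader if any of their rows is flagged as a market-close trade.
toy_close_traders = toy_trades.groupby('user_guid')['close_trade'].any()

# The mean of a boolean Series is the fraction of True values,
# which equals sum() / count() as used in the cell above.
toy_share = toy_close_traders.mean()
print(toy_close_traders.sum(), toy_close_traders.count(), toy_share)
```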

# Average Holding Time

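What are the mean and median holding times? Each share bought is matched first-in, first-out against the same trader's later sales in the same contract, and its holding time is the difference between the two `seq` values, which are treated as millisecond timestamps.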

In [7]:
holding_times = []
# For each (contract, trader) pair: one open timestamp per share still held, oldest first.
position_open_timestamps = defaultdict(list)

# Assumes rows are in chronological order within each (contract, trader) pair.
for trade in trader_analysis.itertuples():
    key = (trade.contract_id, trade.user_guid)
    if trade.buy_sell == 1:
        # Buy: record the open time once per share bought.
        position_open_timestamps[key].extend([trade.seq] * trade.quantity)
    elif trade.buy_sell == -1:
        # Sell or market close: match against the oldest open shares (FIFO).
        open_timestamps = position_open_timestamps[key]
        for t in open_timestamps[:trade.quantity]:
            holding_times.append(trade.seq - t)
        # Drop the shares that were just closed.
        position_open_timestamps[key] = open_timestamps[trade.quantity:]

# Holding times in milliseconds, one entry per share.
holding_times = Series(holding_times)
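
The first-in, first-out matching in the cell above can be checked on a small hand-made example. The sketch below is a variant, not the code used for the paper: it stores open lots as `[open_seq, remaining_quantity]` pairs in a `deque` instead of one list entry per share. The function name `fifo_holding_times`, the tuple layout, and the toy trades are all hypothetical.

```python
from collections import defaultdict, deque


def fifo_holding_times(trades):
    """Return one holding time per share closed.

    trades: iterable of (key, seq, buy_sell, quantity) tuples, assumed to be
    in chronological order within each key.
    """
    open_lots = defaultdict(deque)  # key -> deque of [open_seq, remaining_quantity]
    holding_times = []
    for key, seq, buy_sell, quantity in trades:
        lots = open_lots[key]
        if buy_sell == 1:
            lots.append([seq, quantity])
        elif buy_sell == -1:
            remaining = quantity
            # Consume the oldest lots first until the sale is fully matched.
            while remaining > 0 and lots:
                open_seq, lot_qty = lots[0]
                matched = min(lot_qty, remaining)
                holding_times.extend([seq - open_seq] * matched)
                remaining -= matched
                if matched == lot_qty:
                    lots.popleft()
                else:
                    lots[0][1] = lot_qty - matched
    return holding_times


# Hypothetical trades for a single (contract, trader) pair.
toy_key = ('contract', 'trader')
toy_trades = [
    (toy_key, 0, 1, 2),    # buy 2 shares at t=0
    (toy_key, 10, 1, 2),   # buy 2 shares at t=10
    (toy_key, 30, -1, 3),  # sell 3 shares at t=30
    (toy_key, 50, -1, 1),  # sell 1 share at t=50
]
toy_result = fifo_holding_times(toy_trades)
print(toy_result)  # [30, 30, 20, 40]
```

Storing lots rather than individual shares avoids building very long lists when quantities are large, while producing the same per-share holding times.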

In [8]:
pd.Timedelta(holding_times.mean(), 'ms')

Timedelta('13 days 09:38:50.557663')

In [9]:
holding_times.mean() / (24 * 60 * 60 * 1000)

13.401974047027426

In [10]:
pd.Timedelta(holding_times.quantile(0.5), 'ms')

Timedelta('1 days 16:25:25.883000')

In [11]:
holding_times.quantile(0.5) / (24 * 60 * 60 * 1000)

1.684327349537037
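
The two representations above (a `Timedelta` and a number of days) describe the same quantity. As a small sanity check, assuming the holding times are in milliseconds, the conversion can be done in either direction; the names `MS_PER_DAY`, `mean_ms`, `as_timedelta`, and `as_days` are introduced here for illustration only.

```python
import pandas as pd

MS_PER_DAY = 24 * 60 * 60 * 1000

# Mean holding time in days, copied from the output above, converted back to ms.
mean_ms = 13.401974047027426 * MS_PER_DAY

# Milliseconds -> Timedelta, then Timedelta -> fractional days.
as_timedelta = pd.Timedelta(mean_ms, unit='ms')
as_days = as_timedelta / pd.Timedelta(days=1)
print(as_timedelta, as_days)
```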

What are the combined holding times across both markets? Save this market's holding times, then load the pickle file saved by the corresponding GOP notebook (`data/gop.holding_times.p`) and combine the two.

In [12]:
with open('data/dem.holding_times.p', 'wb') as f:
    pickle.dump(holding_times, f)

with open('data/gop.holding_times.p', 'rb') as f:
    rep_holding_times = pickle.load(f)

In [13]:
combined_holding_times = pd.concat([rep_holding_times, holding_times])

In [14]:
pd.Timedelta(combined_holding_times.mean(), 'ms')

Timedelta('15 days 09:16:05.284847')

In [15]:
combined_holding_times.mean() / (24 * 60 * 60 * 1000)

15.386172278326093

In [16]:
pd.Timedelta(combined_holding_times.quantile(0.5), 'ms')

Timedelta('3 days 17:29:33.996000')

In [17]:
combined_holding_times.quantile(0.5) / (24 * 60 * 60 * 1000)

3.7288656944444445
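
Because the two Series are concatenated share by share, the combined mean is the count-weighted average of the two market means, while the combined median cannot be derived from the two market medians. A minimal sketch with hypothetical values (not taken from either market):

```python
import pandas as pd

# Hypothetical holding times in milliseconds for two markets.
market_a = pd.Series([1000, 3000])
market_b = pd.Series([2000, 4000, 6000, 8000])

# ignore_index is optional here; mean and quantile do not depend on the index.
pooled = pd.concat([market_a, market_b], ignore_index=True)

# Pooled mean equals the mean of the market means weighted by share counts.
weighted_mean = (
    market_a.mean() * len(market_a) + market_b.mean() * len(market_b)
) / len(pooled)

print(pooled.mean(), weighted_mean)  # 4000.0 4000.0
print(pooled.median())               # 3500.0
```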

# Pre-Margin Linking Volume

What share of total volume was traded during the pre-margin-linking period?

In [18]:
dem_margin_linking_cutoff_date = '2015-10-22'
pre_margin_linking = trader_analysis.date_executed < dem_margin_linking_cutoff_date

trader_analysis[pre_margin_linking].quantity.sum() / trader_analysis.quantity.sum()

0.050190773683601031
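
The same volume-share calculation can be sketched on toy data with an explicit timezone-aware cutoff. The frame below is hypothetical, and the `America/New_York` timezone is an assumption based on the `-05:00` offsets shown in the sample rows.

```python
import pandas as pd

# Hypothetical trades around the cutoff date.
toy = pd.DataFrame({
    'date_executed': pd.to_datetime(
        ['2015-10-20 12:00', '2015-10-21 23:59', '2015-10-22 00:00', '2015-11-01 09:30']
    ).tz_localize('America/New_York'),
    'quantity': [10, 30, 40, 20],
})

# Assumed cutoff: midnight Eastern time at the start of 2015-10-22.
cutoff = pd.Timestamp('2015-10-22', tz='America/New_York')

# Strictly-before comparison, so a trade exactly at the cutoff is excluded.
before_cutoff = toy['date_executed'] < cutoff
share = toy.loc[before_cutoff, 'quantity'].sum() / toy['quantity'].sum()
print(share)  # 0.4
```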