# Replication Explanation 
This notebook explains the variable definitions and the data sources, and documents how we replicate the columns of the He table.

## Variable Descriptions
intermediary_capital_ratio
--------------------------
The end-of-period ratio of total market cap to (total market cap + book assets - book equity)
of NY Fed primary dealers' publicly-traded holding companies.
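
As a rough illustration of the definition above, the ratio can be sketched in pandas. The column names and toy numbers are hypothetical, and summing across dealers before taking the ratio is our reading of "total", not something confirmed by the source data.

```python
import pandas as pd


def capital_ratio(market_cap: pd.Series, book_assets: pd.Series, book_equity: pd.Series) -> float:
    """Aggregate ratio: total ME / (total ME + total book assets - total book equity)."""
    total_me = market_cap.sum()
    # Book assets minus book equity is the book value of debt.
    total_book_debt = (book_assets - book_equity).sum()
    return total_me / (total_me + total_book_debt)


# Toy data for two hypothetical holding companies (values are made up).
toy = pd.DataFrame(
    {
        "market_cap": [50.0, 30.0],
        "book_assets": [900.0, 500.0],
        "book_equity": [60.0, 40.0],
    }
)
ratio = capital_ratio(toy["market_cap"], toy["book_assets"], toy["book_equity"])
print(round(ratio, 4))
```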

intermediary_capital_risk_factor
--------------------------------
AR(1) innovations to the intermediary_capital_ratio scaled by lagged intermediary_capital_ratio.
Note: In extending the intermediary_capital_risk_factor, we retain the AR(1)
coefficients used in the paper.
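
A minimal sketch of this construction, assuming the AR(1) is written as ratio_t = intercept + slope * ratio_{t-1} + u_t and the factor is u_t divided by ratio_{t-1}. The function name and the coefficient values below are placeholders for illustration only; they are not the coefficients from the paper.

```python
import pandas as pd


def capital_risk_factor(ratio: pd.Series, intercept: float, slope: float) -> pd.Series:
    """AR(1) innovation scaled by the lagged capital ratio."""
    lagged = ratio.shift(1)
    # Innovation = realized ratio minus its AR(1) forecast from the lagged ratio.
    innovation = ratio - (intercept + slope * lagged)
    # The first observation has no lag, so it is dropped.
    return (innovation / lagged).dropna()


# Toy capital-ratio series; intercept and slope are placeholder values.
ratio = pd.Series([0.060, 0.058, 0.061, 0.055])
factor = capital_risk_factor(ratio, intercept=0.003, slope=0.95)
print(factor)
```

Passing the coefficients in as arguments keeps them fixed when the sample is extended, which mirrors the note above.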

intermediary_value_weighted_investment_return
---------------------------------------------
The value-weighted investment return to a portfolio of NY Fed primary dealers'
publicly-traded holding companies. Unlike the intermediary_capital_risk_factor,
this portfolio is tradable, and performed similarly as a pricing factor.
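
One way to sketch a value-weighted return is to weight each firm's return by its lagged market cap within each period. The column names (`date`, `ret`, `lag_me`) and the numbers are hypothetical, and the choice of lagged market cap as the weight is an assumption.

```python
import pandas as pd


def value_weighted_return(panel: pd.DataFrame) -> pd.Series:
    """Per-date sum(ret * lag_me) / sum(lag_me)."""
    weighted = panel["ret"] * panel["lag_me"]
    numerator = weighted.groupby(panel["date"]).sum()
    denominator = panel["lag_me"].groupby(panel["date"]).sum()
    return numerator / denominator


# Toy panel: two firms in the first month, one firm in the second.
panel = pd.DataFrame(
    {
        "date": ["2014-01", "2014-01", "2014-02"],
        "ret": [0.10, -0.05, 0.02],
        "lag_me": [100.0, 300.0, 200.0],
    }
)
vw = value_weighted_return(panel)
print(vw)
```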

intermediary_leverage_ratio_squared
-----------------------------------
This level variable, defined as (1/intermediary_capital_ratio)^2, was used
for preliminary predictive tests in the paper, as prescribed by the HK model.
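
For concreteness, the transformation is a one-liner in pandas. The input values are made up.

```python
import pandas as pd

# Toy capital ratios (hypothetical values).
capital_ratio_series = pd.Series([0.05, 0.10])

# Leverage is the inverse of the capital ratio; the variable is its square.
leverage_squared = (1.0 / capital_ratio_series) ** 2
print(leverage_squared.tolist())
```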

## Pull NY Fed Data
(+variable codebook)

In [2]:
# TODO: decide whether to commit this file to GitHub so it can be downloaded automatically.
import pandas as pd

# Skip the first two rows of the "Dealer Alpha" sheet.
dfs_raw = pd.read_excel(
    "../data_manual/Dealer_Lists_1960_to_2014.xls", sheet_name="Dealer Alpha", skiprows=2
)

In [3]:
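# Keep rows whose End Date is "Current Dealer" and drop the last column of the sheet.
dealer_df = dfs_raw[dfs_raw["End Date"] == "Current Dealer"].drop(dfs_raw.columns[-1], axis=1)
# Remove stray whitespace around the dealer names.
dealer_df["Primary Dealer"] = dealer_df["Primary Dealer"].str.strip()
dealer_df.count()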

Primary Dealer    22
Start Date        22
End Date          22
dtype: int64

In [4]:
dealer_df

Unnamed: 0,Primary Dealer,Start Date,End Date
9,"BANK OF NOVA SCOTIA, NEW YORK AGENCY",2011-10-04,Current Dealer
11,BARCLAYS CAPITAL INC.,1998-04-01,Current Dealer
19,BMO CAPITAL MARKETS CORP.,2011-10-04,Current Dealer
21,BNP PARIBAS SECURITIES CORP.,2000-09-15,Current Dealer
28,CANTOR FITZGERALD & CO.,2006-08-01,Current Dealer
43,CITIGROUP GLOBAL MARKETS INC.,2003-04-07,Current Dealer
50,CREDIT SUISSE SECURITIES (USA) LLC,2006-01-16,Current Dealer
54,DAIWA CAPITAL MARKETS AMERICA INC.,2010-04-01,Current Dealer
61,DEUTSCHE BANK SECURITIES INC.,2002-03-30,Current Dealer
79,"GOLDMAN, SACHS & CO.",1974-12-04,Current Dealer


### Manually add company name

In [61]:
comp_holding_company = ["(bank of nova scotia)", "BARCLAYS PLC", "(bank of montreal)", "BNP PARIBAS", "(cantor fitzgerald)", "CITIGROUP INC", "CREDIT SUISSE GROUP", "DAIWA SECURITIES GROUP INC", "DEUTSCHE BANK AG", "GOLDMAN SACHS GROUP INC", "HSBC HLDGS PLC", "JPMORGAN CHASE & CO", "JEFFERIES FINANCIAL GRP INC", "BANK OF AMERICA CORP", "MIZUHO FINANCIAL GROUP INC", "MORGAN STANLEY", "NOMURA HOLDINGS INC", "(rbc)", "ROYAL BANK OF SCOTLAND LTD", "SOCIETE GENERALE GROUP", "(td)", "UBS GROUP AG"]

In [59]:
# Deutsche Bank AG and Societe Generale could not be found in redlookup.
# Mizuho does not match Table 1.
# RBC does not match either.
markit_holding_company = ["The Bank of Nova Scotia","Barclays PLC", "Bank of Montreal", "BNP PARIBAS", "Cantor Fitzgerald, L.P.", "Citigroup Inc.", "Credit Suisse Group AG", "Daiwa Securities Group Inc.", "Deutsche Bk AG", "The Goldman Sachs Group, Inc.", "HSBC HOLDINGS plc", "JPMorgan Chase & Co.", "Jefferies Group LLC", "Bank of America Corporation", "Mizuho Financial Group, Inc.", "Morgan Stanley", "Nomura Holdings, Inc.", "Royal Bank of Canada", "The Royal Bank of Scotland N.V.", "Societe Generale", "The Toronto-Dominion Bank", "UBS AG" ]

In [62]:
dealer_df["markit"] = markit_holding_company
dealer_df["comp"] = comp_holding_company
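
Assigning the lists by position relies on the list order matching the row order of `dealer_df` exactly. A possible alternative, sketched here on a two-row toy frame, keys the holding-company names by dealer name and reports any dealer left unmatched. The dictionary name `comp_by_dealer` is hypothetical; the two pairings are taken from the lists above.

```python
import pandas as pd

# Toy subset of the dealer table (index values as in the notebook output).
dealers = pd.DataFrame(
    {"Primary Dealer": ["BARCLAYS CAPITAL INC.", "CITIGROUP GLOBAL MARKETS INC."]},
    index=[11, 43],
)

# Hypothetical lookup keyed by dealer name instead of list position.
comp_by_dealer = {
    "BARCLAYS CAPITAL INC.": "BARCLAYS PLC",
    "CITIGROUP GLOBAL MARKETS INC.": "CITIGROUP INC",
}

dealers["comp"] = dealers["Primary Dealer"].map(comp_by_dealer)

# Any dealer missing from the lookup shows up as NaN and is listed here.
missing = dealers.loc[dealers["comp"].isna(), "Primary Dealer"].tolist()
print(missing)
```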

In [65]:
dealer_df = dealer_df.drop(columns=["End Date"])

In [58]:
# dfdf is the markit.redlookup table (loaded with db.get_table in a separate cell).
dfdf[dfdf.referenceentity.str.contains("UBS")]
# dfdf[dfdf.ticker.str.contains("DB")]

Unnamed: 0,redcode,alt_redcode,distance,cusip,ticker,referenceentity
1878,137EBM,,0.0,12592K,COMUBS,COMM 2014-UBS5 Mortgage Trust
11508,8HHC9L,9BBB8Z,-1.0,89845G,TRUVO-SubCo,TRUVO SUBSIDIARY CORP.
11509,8HHC9L,,0.0,89845G,TRUVO-SubCo,TRUVO SUBSIDIARY CORP.
11510,8HHC9L,9BBB8Z,1.0,89845G,TRUVO-SubCo,TRUVO SUBSIDIARY CORP.
12739,9J287V,,0.0,90261U,UBS-Ams,UBS Americas Inc.
12799,9J37BM,,0.0,90347F,UBSPRE,UBS PREFERRED FUNDING TRUST V
12800,9J37CR,,0.0,90348J,UBBANK,UBS BANK USA
12807,9J387L,,0.0,90352J,UBSG,UBS Group Funding (Switzerland) AG
12808,9J3880,,0.0,90353X,UBSAME,UBS AMERICAS HOLDING LLC
12810,9J389V,9J37DO,-1.0,90354R,UBSGL,UBS GLOBAL INVESTMENTS LLC
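
The plain substring search also returns "TRUVO SUBSIDIARY CORP." because "SUBSIDIARY" contains "UBS". A whole-word search is one way to narrow the result; the sketch below uses a few reference entities copied from the output above, and the helper name `find_whole_word` is hypothetical.

```python
import re

import pandas as pd


def find_whole_word(lookup: pd.DataFrame, keyword: str, column: str = "referenceentity") -> pd.DataFrame:
    """Return rows where `keyword` appears as a whole word in `column`."""
    # \b anchors the keyword at word boundaries; re.escape guards special characters.
    pattern = rf"\b{re.escape(keyword)}\b"
    mask = lookup[column].str.contains(pattern, regex=True, na=False)
    return lookup.loc[mask]


toy_lookup = pd.DataFrame(
    {
        "referenceentity": [
            "COMM 2014-UBS5 Mortgage Trust",
            "TRUVO SUBSIDIARY CORP.",
            "UBS Americas Inc.",
            "UBS BANK USA",
        ]
    }
)
result = find_whole_word(toy_lookup, "UBS")
print(result["referenceentity"].tolist())
```

Note that the whole-word pattern also drops "COMM 2014-UBS5 Mortgage Trust", since "UBS5" is a single token.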


## Pull WRDS data 

Reference: https://wrds-www.wharton.upenn.edu/pages/support/manuals-and-overviews/compustat/north-america-global-bank/wrds-overview-compustat-north-america-global-bank/

Fields selected in the Compustat queries in this section:

- GVKEY
- DATADATE
- INDFMT
- DATAFMT
- CONSOL
- POPSRC
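
Compustat can hold several records per GVKEY and DATADATE that differ in these format fields, so extracts are often restricted to one combination. The sketch below applies a commonly used combination to a toy frame. Whether these particular values suit the dealers' holding companies is an assumption to verify, and the helper name and toy rows are hypothetical.

```python
import pandas as pd

# Commonly used screen for Compustat North America; treat as an assumption here.
STANDARD_FILTERS = {"indfmt": "INDL", "datafmt": "STD", "consol": "C", "popsrc": "D"}


def apply_standard_filters(funda: pd.DataFrame, filters: dict = STANDARD_FILTERS) -> pd.DataFrame:
    """Keep only rows matching every (column, value) pair in `filters`."""
    mask = pd.Series(True, index=funda.index)
    for column, value in filters.items():
        mask &= funda[column] == value
    return funda.loc[mask]


# Toy extract: the same firm-year reported under two industry formats.
toy_funda = pd.DataFrame(
    {
        "gvkey": ["000001", "000001"],
        "datadate": ["2014-12-31", "2014-12-31"],
        "indfmt": ["INDL", "FS"],
        "datafmt": ["STD", "STD"],
        "consol": ["C", "C"],
        "popsrc": ["D", "D"],
    }
)
filtered = apply_standard_filters(toy_funda)
print(len(filtered))
```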

In [12]:
import wrds

# Connect to WRDS
from settings import config
# Make sure that you have included the line
# WRDS_USERNAME = your_wrds_username
# in the .env file
WRDS_USERNAME = config("WRDS_USERNAME") 
db = wrds.Connection(wrds_username=WRDS_USERNAME)

# Check available datasets
db.list_libraries()[:10]

Loading library list...
Done


['aha_sample',
 'ahasamp',
 'audit',
 'audit_acct_os',
 'audit_audit_comp',
 'audit_common',
 'audit_corp_legal',
 'auditsmp',
 'auditsmp_all',
 'bank']

In [91]:
# from IPython.display import display, HTML
# db.connection.rollback()
# db.describe_table(library="comp", table="funda")

Approximately 909402 rows in comp.funda.


In [61]:
dealer_names = tuple(dealer_df["Primary Dealer"])
if len(dealer_names) == 1:
    dealer_names = f"('{dealer_names[0]}')"  # Ensure proper SQL syntax for a single value
else:
    dealer_names = str(dealer_names)  # Convert tuple to SQL-compatible format
    
query = f"""
    SELECT gvkey, datadate, indfmt, datafmt, consol, popsrc, conm
    FROM comp.funda
    WHERE datadate = '2014-12-31'
"""
# Disabled dealer-name filter; to enable, append to the query: AND UPPER(conm) IN {dealer_names}
df = db.raw_sql(query)
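
Converting a Python tuple with `str()` yields Python quoting, which stops being valid SQL as soon as a name contains an apostrophe. A minimal sketch of building the `IN (...)` list by hand follows; the helper name `sql_in_list` is hypothetical, and binding parameters through the database driver would be preferable where that is available.

```python
def sql_in_list(values) -> str:
    """Build a SQL list such as ('A', 'B') with single quotes escaped."""
    values = list(values)
    if not values:
        raise ValueError("at least one value is required for an IN clause")
    # Upper-case to match UPPER(conm); double any single quote, as SQL expects.
    quoted = ["'" + str(v).upper().replace("'", "''") + "'" for v in values]
    return "(" + ", ".join(quoted) + ")"


names_sql = sql_in_list(["Goldman, Sachs & Co.", "O'Neil Securities"])
example_query = f"""
    SELECT gvkey, datadate, conm
    FROM comp.funda
    WHERE datadate = '2014-12-31'
      AND UPPER(conm) IN {names_sql}
"""
print(names_sql)
```

The second name is a made-up example chosen only to show the apostrophe handling. The helper also covers the single-value case, so no special branch is needed.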

In [None]:
# db.list_tables(library='markit')
# https://wrds-www.wharton.upenn.edu/pages/wrds-research/database-linking-matrix/
db.describe_table(library="markit",table="cds2014")

Approximately 58357712 rows in markit.cds2014.


Unnamed: 0,name,nullable,type,comment
0,date,True,DATE,Data Contribution Date
1,batch,True,VARCHAR(3),
2,ticker,True,VARCHAR(25),Ticker
3,shortname,True,VARCHAR(100),Abbreviation of Reference Entity
4,redcode,True,VARCHAR(6),RED Code
...,...,...,...,...
158,irdv01,True,DOUBLE PRECISION,
159,rec01,True,DOUBLE PRECISION,
160,dp,True,DOUBLE PRECISION,
161,jtd,True,DOUBLE PRECISION,


In [None]:
db.get_table(library="markit", table="entity")

In [None]:
db.list_tables(library="markit")

['cds2001',
 'cds2002',
 'cds2003',
 'cds2004',
 'cds2005',
 'cds2006',
 'cds2007',
 'cds2008',
 'cds2009',
 'cds2010',
 'cds2011',
 'cds2012',
 'cds2013',
 'cds2014',
 'cds2015',
 'cds2016',
 'cds2017',
 'cds2018',
 'cds2019',
 'cds2020',
 'cds2021',
 'cds2022',
 'cds2023',
 'cds2024',
 'cds2025',
 'cdslookup',
 'cdxcomps',
 'cdxcomps_chars',
 'cdxconst',
 'cdxconst_chars',
 'chars',
 'isdatt',
 'itraxxascomps',
 'itraxxascomps_chars',
 'itraxxasconst',
 'itraxxasconst_chars',
 'itraxxeucomps',
 'itraxxeucomps_chars',
 'itraxxeuconst',
 'itraxxeuconst_chars',
 'itraxxsocomps',
 'itraxxsocomps_chars',
 'itraxxsoconst',
 'itraxxsoconst_chars',
 'red_chars',
 'redent',
 'redentaucccy',
 'redentcorpact',
 'redentcredauc',
 'redentisda',
 'redentrating',
 'redentsuccession',
 'redindex',
 'redlookup',
 'redobl',
 'redoblindxconst',
 'redobllookup',
 'redoblprospectus',
 'redoblrefentity']

In [None]:
db.describe_table(library="markit", table="red_chars")

Approximately 1 rows in markit.red_chars.


Unnamed: 0,name,nullable,type,comment
0,redcode,True,VARCHAR(6),RED Code
1,tier,True,VARCHAR(8),Instrument Seniority Tier
2,ccy,True,VARCHAR(3),Instrument Currency
3,ticker,True,VARCHAR(100),Ticker
4,shortname,True,VARCHAR(100),Abbreviation of Reference Entity
...,...,...,...,...
82,pairiscurrent,True,VARCHAR(1),Pair Is Current
83,ispreferred,True,VARCHAR(1),Pref Obligation for this entity/tier
84,publiccomments,True,VARCHAR(4000),Special Features Comments
85,isdatt_tier,True,VARCHAR(10),Transaction Tier


In [21]:
dfdf = db.get_table(library="markit",table="redlookup")
dfdf

Unnamed: 0,redcode,alt_redcode,distance,cusip,ticker,referenceentity
0,001AAV,6EA36E,-1.0,00191U,ASGNINC,ASGN Incorporated
1,001AAV,6EA36E,1.0,00191U,ASGNINC,ASGN Incorporated
2,001AEC,,0.0,001957,T,AT&T Corp.
3,002AA6,,0.0,002824,ABT,Abbott Laboratories
4,002ABI,007D9P,-1.0,00283F,ABT-AMOI,Abbott Medical Optics Inc.
...,...,...,...,...,...,...
25846,YZAA9A,,0.0,Y19182,DAHSIN,Dah Sing Financial Holdings Limited
25847,YZAGF8,,0.0,Y19780,DANGCAP,Danga Capital Berhad
25848,YZSR4V,,0.0,Y1R04R,CHINNAT,China National Petroleum Corporation
25849,YZSR6N,,0.0,Y1R06H,CHINAPE,China Petrochemical Corporation


In [9]:
df.datadate


0    1961-12-31
1    1962-12-31
2    1963-12-31
3    1964-12-31
4    1965-12-31
5    1966-12-31
6    1967-12-31
7    1968-12-31
8    1969-12-31
9    1970-12-31
Name: datadate, dtype: object

In [68]:
db.list_tables(library="comp")[-20:]

['seg_annfund',
 'seg_customer',
 'seg_geo',
 'seg_naics',
 'seg_product',
 'seg_type',
 'spidx_cst',
 'spind',
 'spind_dly',
 'spind_mth',
 'tmptable_pkg6153_tbl4023',
 'wrds_g_exrate',
 'wrds_idx_cst_current',
 'wrds_ratios',
 'wrds_seg_customer',
 'wrds_seg_geo',
 'wrds_seg_product',
 'wrds_segmerged',
 'xfl_column',
 'xfl_table']

In [75]:
dealer_df

Unnamed: 0,Primary Dealer,Start Date,End Date
9,"BANK OF NOVA SCOTIA, NEW YORK AGENCY",2011-10-04,Current Dealer
11,BARCLAYS CAPITAL INC.,1998-04-01,Current Dealer
19,BMO CAPITAL MARKETS CORP.,2011-10-04,Current Dealer
21,BNP PARIBAS SECURITIES CORP.,2000-09-15,Current Dealer
28,CANTOR FITZGERALD & CO.,2006-08-01,Current Dealer
43,CITIGROUP GLOBAL MARKETS INC.,2003-04-07,Current Dealer
50,CREDIT SUISSE SECURITIES (USA) LLC,2006-01-16,Current Dealer
54,DAIWA CAPITAL MARKETS AMERICA INC.,2010-04-01,Current Dealer
61,DEUTSCHE BANK SECURITIES INC.,2002-03-30,Current Dealer
79,"GOLDMAN, SACHS & CO.",1974-12-04,Current Dealer


In [7]:
query = f"""
    SELECT gvkey, datadate, indfmt, datafmt, consol, popsrc, conm
    FROM comp.fundq
    WHERE datadate = '2014-12-31'
"""
# Disabled dealer-name filter; to enable, append to the query: AND UPPER(conm) IN {dealer_names}
df = db.raw_sql(query)

In [10]:
df[df.conm.str.contains("DAIWA")]

Unnamed: 0,gvkey,datadate,indfmt,datafmt,consol,popsrc,conm
