# Replicating _Intermediary asset pricing: New evidence from many asset classes_
### James, Young Jin Song, Jaehwa Youm, Monica Panigrahy, and Jacob Simeral 

In [1]:
import pandas as pd
import wrds
import config
from datetime import datetime
import unittest
import matplotlib.pyplot as plt
import numpy as np
import Table_01
import Table_A1
import Table02Analysis
import Table02Prep
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

Attempting to load from: /Users/jamessong/Desktop/GitRepository/intermediary-asset-pricing/data/pulled/nyfed_primary_dealers_list.xls
Loading from cache.


* Use the 'pull_nyfed_primary_dealers_list' function, defined in the 'load_nyfed.py' file, to download the 'nyfed_primary_dealers_list.xls' Excel file. The function saves the downloaded file in the 'data/pulled' directory.

### Table 01

#### 1.1 Pull the NY Fed primary dealers list from the NY Fed website

* This process is managed by the task_pull_nyfed() function in the 'dodo.py' file. It handles downloading the 'nyfed_primary_dealers_list.xls' Excel file and saving it in the 'data/pulled' directory.

#### 1.2 Load the NY Fed primary dealers list from the cached file in the 'data/pulled' directory

* Use the 'load_nyfed_primary_dealers_list' function to access the NY Fed primary dealers list from the cached data in the 'data/pulled' directory. Focus on the '2000s' and 'Dealer Alpha' worksheets to extract the list of primary dealers as of February 2014, aiming to replicate Table 1.

#### 1.3 Replicate Table 1 using data from the 'nyfed_primary_dealers_list.xls' Excel file

* The '2000s' worksheet provides annual listings of primary dealers from 2000 to 2014; refine this data to isolate dealers active as of February 2014.

In [2]:
Table_01.df_2014.head()

Unnamed: 0,Primary Dealer
0,"BANK OF NOVA SCOTIA, NEW YORK AGENCY"
1,BARCLAYS CAPITAL INC.
2,BMO CAPITAL MARKETS CORP.
3,BNP PARIBAS SECURITIES CORP.
4,CANTOR FITZGERALD & CO.


* The 'Dealer Alpha' worksheet covers all primary dealers from 1960 to 2014, including their start and end dates. Clean this data to facilitate matching with the primary dealer listings found in the '2000s' worksheet.

In [3]:
Table_01.df_dealer_alpha.head()

Unnamed: 0,Primary Dealer,Start Date,End Date
0,"ABN AMRO BANK, N.V., NY BR",2002-12-09,2006-09-15
1,ABN AMRO INCORPORATED,1998-09-29,2002-12-08
2,"AUBREY G. LANSTON & CO., INC.",1960-05-19,2000-04-17
3,"BA SECURITIES, INC.",1994-04-18,1997-09-30
4,BANC OF AMERICA SECURITIES LLC,1999-05-17,2010-11-01


* Match each dealer on the 2014 list with its start date, taking care to handle name discrepancies between the two worksheets, such as extra spaces or differences in punctuation. For dealers that were active, paused, and then resumed, use the latest start date. The code adjusts for these variations so that the result aligns with the table in the paper. Lastly, sort the rows by start date, from earliest to most recent.

In [4]:
Table_01.merged_df.head()

Unnamed: 0,Primary Dealer,Start Date
0,"GOLDMAN, SACHS & CO.",1974-12-04
1,BARCLAYS CAPITAL INC.,1998-04-01
2,HSBC SECURITIES (USA) INC.,1999-06-01
3,BNP PARIBAS SECURITIES CORP.,2000-09-15
4,DEUTSCHE BANK SECURITIES INC.,2002-03-30
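
The matching step described above can be sketched roughly as follows. This is a minimal illustration on made-up dealer names, not the code in 'Table_01.py': the `normalize_name` helper, the toy DataFrames, and the exact normalization rule (upper-case, strip punctuation, collapse spaces) are all assumptions for the example.

```python
import pandas as pd


def normalize_name(name: str) -> str:
    """Upper-case, drop punctuation, and collapse whitespace (illustrative rule)."""
    cleaned = "".join(ch for ch in name.upper() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


# Toy stand-ins for the two worksheets; names and dates are invented for the example.
toy_2014 = pd.DataFrame(
    {"Primary Dealer": ["ALPHA SECURITIES  INC.", "BETA CAPITAL, LLC"]}
)
toy_alpha = pd.DataFrame(
    {
        "Primary Dealer": [
            "ALPHA SECURITIES INC",
            "ALPHA SECURITIES INC.",
            "BETA CAPITAL LLC",
        ],
        "Start Date": pd.to_datetime(["1990-01-02", "2005-06-01", "1985-03-15"]),
    }
)

# Build a normalized key on both sides so spacing and punctuation do not block the match.
toy_2014["key"] = toy_2014["Primary Dealer"].map(normalize_name)
toy_alpha["key"] = toy_alpha["Primary Dealer"].map(normalize_name)

# Keep only the most recent start date per dealer (covers dealers that left and returned).
latest_start = toy_alpha.groupby("key", as_index=False)["Start Date"].max()

matched = (
    toy_2014.merge(latest_start, on="key", how="left")
    .drop(columns="key")
    .sort_values("Start Date")
    .reset_index(drop=True)
)
```

A left merge keeps every 2014 dealer in the result, so any name that still fails to match shows up with a missing start date rather than silently disappearing.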


* The authors manually matched dealers with their publicly traded holding companies. For the replication, we created a 'ticks.csv' file in the 'data/manual' directory that maps each primary dealer to its holding company. We then added a column to the 'merged_df' table showing the holding company for each dealer, using the mapping in 'ticks.csv'.

In [5]:
Table_01.merged_df_final.head()

Unnamed: 0,Primary Dealer,Holding Company,Start Date
,"GOLDMAN, SACHS & CO.","The Goldman Sachs Group, Inc.",12/4/1974
,BARCLAYS CAPITAL INC.,Barclays PLC,4/1/1998
,HSBC SECURITIES (USA) INC.,HSBC Holdings PLC,6/1/1999
,BNP PARIBAS SECURITIES CORP.,BNP Paribas Group,9/15/2000
,DEUTSCHE BANK SECURITIES INC.,Deutsche Bank AG,3/30/2002
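
A sketch of how such a mapping could be attached, assuming the manual file has one row per dealer. The column names used for the mapping here are assumed for illustration and may differ from the actual header of 'ticks.csv'; an in-memory string stands in for the file.

```python
import io

import pandas as pd

# Stand-in for data/manual/ticks.csv; the column names are assumed, not the file's actual header.
ticks_csv = io.StringIO(
    "Primary Dealer,Holding Company\n"
    '"GOLDMAN, SACHS & CO.","The Goldman Sachs Group, Inc."\n'
    "BARCLAYS CAPITAL INC.,Barclays PLC\n"
)
ticks = pd.read_csv(ticks_csv)

dealers = pd.DataFrame(
    {
        "Primary Dealer": [
            "GOLDMAN, SACHS & CO.",
            "BARCLAYS CAPITAL INC.",
            "HSBC SECURITIES (USA) INC.",
        ],
        "Start Date": pd.to_datetime(["1974-12-04", "1998-04-01", "1999-06-01"]),
    }
)

# Left merge keeps every dealer; validate raises if the mapping has duplicate dealer rows.
with_holding = dealers.merge(
    ticks, on="Primary Dealer", how="left", validate="many_to_one"
)
with_holding = with_holding[["Primary Dealer", "Holding Company", "Start Date"]]

# Dealers without a mapping surface as NaN, so gaps in the manual file are easy to spot.
unmapped = with_holding.loc[
    with_holding["Holding Company"].isna(), "Primary Dealer"
].tolist()
```

The third dealer is deliberately left out of the toy mapping to show how a missing entry would appear in `unmapped`.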


#### 1.4 Convert the table to LaTeX format using the to_latex() function

* The to_latex function, located in the 'Table_01_to_latex.py' file, is utilized to convert the 'merged_df_final' table into LaTeX format. The resulting LaTeX code is then saved into the 'Table_01_to_latex.tex' file within the output directory.

* Before converting the table to LaTeX format, replace '&' with '\&' in company names (written as '\\&' inside a Python string literal). LaTeX uses '&' as the column separator in tables, so an unescaped '&' in a company name can cause errors when the table is compiled.
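
The escaping step might look like the sketch below. It uses two rows taken from the table above and builds the tabular rows by hand purely to show where the column separator comes from; it is not the code in 'Table_01_to_latex.py'.

```python
import pandas as pd

table = pd.DataFrame(
    {
        "Primary Dealer": ["GOLDMAN, SACHS & CO.", "BARCLAYS CAPITAL INC."],
        "Start Date": ["12/4/1974", "4/1/1998"],
    }
)

escaped = table.copy()
# regex=False treats '&' literally; the raw string r"\&" is one backslash followed by '&'.
escaped["Primary Dealer"] = escaped["Primary Dealer"].str.replace(
    "&", r"\&", regex=False
)

# Join cells with ' & ' (the LaTeX column separator) and end each row with '\\'.
rows = [
    " & ".join(row) + r" \\"
    for row in escaped.itertuples(index=False, name=None)
]
```

After escaping, the only unescaped '&' left in each row is the one separating the two columns. pandas' `DataFrame.to_latex` also accepts an `escape` argument that may make the manual replacement unnecessary, depending on the pandas version in use.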

#### 1.5 Complete the LaTeX setup to replicate Table 1, incorporating the formatted table

* Use the 'Table_01_to_latex.tex' file from the 'output' directory, which contains the table in LaTeX format, to create a 'Report_Table_01.tex' file in the 'reports' directory. This step completes the replication of the table. The table is inserted into 'Report_Table_01.tex' with the '\input{\PathToOutput/Table_01_to_latex.tex}' command.

### Table A.1

### Table 02

The following code reads in a manual data file that contains necessary information on primary dealers from 1960-2012 and then merges it with the CRSP Compustat Merge Linkhist table to get additional information on each, such as the SIC codes. The linkhist table is also used as the main reference table to pull the other comparison groups for the table.

In [None]:
db = wrds.Connection(wrds_username=config.WRDS_USERNAME)
merged_main, link_hist = Table02Prep.prim_deal_merge_manual_data_w_linktable()
merged_main

As mentioned above, the next step is to use the linkhist table to identify the other comparison groups. We use SIC codes to identify broker dealers and banks, and we exclude any firms already in the primary dealer group so that no firm is counted twice. Below is the reference table for broker dealers, whose SIC codes are stated explicitly in the paper. The paper does not state SIC codes for banks, so we had to research them.

In [None]:
comparison_group_link_dict = Table02Prep.create_comparison_group_linktables(link_hist, merged_main)
comparison_group_link_dict['BD']
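
The selection logic can be illustrated on a toy link table. Everything in this sketch is hypothetical: the `sic` column name, the gvkey values, and the single SIC code are placeholders, not the codes or schema used in 'Table02Prep.py'.

```python
import pandas as pd

# Placeholder code for illustration only; substitute the SIC codes used in the replication.
EXAMPLE_SIC_CODES = {"6211"}

# Toy link table with made-up identifiers.
link_table = pd.DataFrame(
    {
        "gvkey": ["001", "002", "003", "004"],
        "sic": ["6211", "6211", "6020", "6211"],
    }
)
primary_dealer_gvkeys = {"002"}

in_group = link_table["sic"].isin(EXAMPLE_SIC_CODES)
is_primary_dealer = link_table["gvkey"].isin(primary_dealer_gvkeys)

# Comparison group = firms with a matching SIC code that are not already primary dealers.
comparison_group = link_table[in_group & ~is_primary_dealer].reset_index(drop=True)
```

Firm "002" has a matching SIC code but is dropped because it is already in the primary dealer set, which is the de-duplication described above.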

We then used each of the reference tables (primary dealers, broker dealers, banks, and all firms in Compustat) to pull data from the Compustat Fundamentals Annual table. The paper mentions monthly data, but we found no monthly Compustat table for financial statement data, so it is unclear how the authors generated their ratios. One possibility is that they computed ratios for whichever months appear in the annual data and then averaged them. The annual data did arrive in an unexpected format: observations fall in several different months of the year, with uneven counts across months. We did not use the quarterly dataset because it did not appear to have the same gvkey compatibility as the annual table.

![title](img/table02_annuals_dataset.png)

Below is the dataset for primary dealers. We calculate or directly pull the values we need inside the query so that no further computation is needed afterward. This was mentioned in class as a best practice because the work then runs on the WRDS servers.

In [None]:
datasets = Table02Prep.pull_data_for_all_comparison_groups(db, comparison_group_link_dict)
datasets['PD']

We then prep that data further by aggregating by year and standardizing the date to the first of the year. We also convert the datadate to a datetime column that can be sliced.

In [None]:
prepped_datasets = Table02Prep.prep_datasets(datasets)
prepped_datasets['PD']
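
A rough sketch of this kind of preparation on made-up numbers. The `total_assets` column and the choice to sum within each year are assumptions for the example; `prep_datasets` may aggregate differently.

```python
import pandas as pd

# Toy data: two observations in 2010 and one in 2011.
raw = pd.DataFrame(
    {
        "datadate": ["2010-03-31", "2010-12-31", "2011-06-30"],
        "total_assets": [100.0, 250.0, 400.0],
    }
)

# Parse to datetime so the column supports date comparisons and slicing.
raw["datadate"] = pd.to_datetime(raw["datadate"])

# Snap every observation to January 1 of its year, then sum within each year.
raw["datadate"] = raw["datadate"].dt.to_period("Y").dt.to_timestamp()
annual = raw.groupby("datadate", as_index=False)["total_assets"].sum()

# With datetime values, a boolean mask gives a simple date-range slice.
since_2011 = annual[annual["datadate"] >= pd.Timestamp("2011-01-01")]
```

Standardizing on the first day of the year gives every comparison group the same date key, which keeps later joins across groups straightforward.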

We weave parts of our analysis into the process as soon as the dataset each one needs is available. Below is our main table of ratios, where each entry is computed as
$$
\frac{\text{Primary dealers amount}}{\text{Comparison group amount (less PD) + Primary dealers amount}}
$$

In [None]:
Table02Analysis.create_summary_stat_table_for_data(datasets)
table = Table02Prep.create_ratios_for_table(prepped_datasets)
table
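
As a worked instance of the ratio, the sketch below uses a hypothetical helper, `primary_dealer_share`, with made-up amounts. It is not part of 'Table02Prep.py', and returning NaN for an empty total is an assumption about how to handle that edge case.

```python
def primary_dealer_share(pd_amount: float, comparison_less_pd: float) -> float:
    """Primary dealers' share of the combined total (comparison group plus dealers)."""
    total = comparison_less_pd + pd_amount
    if total == 0:
        # No data on either side: report a missing value instead of dividing by zero.
        return float("nan")
    return pd_amount / total


# Made-up amounts: dealers hold 30 and the rest of the comparison group holds 70.
share = primary_dealer_share(pd_amount=30.0, comparison_less_pd=70.0)
```

Because the comparison group excludes primary dealers, adding the dealer amount back into the denominator keeps the ratio between 0 and 1 for non-negative inputs.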

We create a figure that shows how the ratios have shifted over time for each category and comparison group. We had to clean some of the data and fill null values so that the graph is readable, while taking care not to change its overall shape too much.

In [None]:
Table02Analysis.create_figure_for_data(table)

Lastly, we produce the final table, which is the one we set out to replicate from the paper. This table is then converted to LaTeX and written to a .tex file.

In [None]:
formatted_table = Table02Prep.format_final_table(table)
formatted_table

### Table 03