Workflow for TMX project
1. Market-maker labels and risk-sharing results.
- All the Python scripts below can be run (in order) from the batch file Code\0_batch_inventory.py.
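The contents of 0_batch_inventory.py are not reproduced in this document. A minimal sketch of such a batch runner might look like the following. It assumes the scripts of sections 1.1 to 1.3 sit in the Code folder and are run in the order listed here; build_commands and run_in_order are hypothetical helper names, not functions from the repository.

```python
import subprocess
import sys
from pathlib import Path

# Script order taken from sections 1.1 to 1.3 of this workflow.
INVENTORY_SCRIPTS = [
    "generate_list_traders.py",
    "account_dailysummary.py",
    "BBOpresence.py",
    "mm_hft_label.py",
    "inventory_panel.py",
    "mean_reversion_inventory.py",
    "mean_reversion_graph.py",
    "risksharing_correlations.py",
    "risksharing_inefficiency.py",
    "risksharing_graphs.py",
]


def build_commands(scripts, code_dir="Code", python=sys.executable):
    """Return one command line per script, preserving the given order.

    code_dir is an assumption: the scripts are taken to live next to the
    batch file in the Code folder.
    """
    return [[python, str(Path(code_dir) / name)] for name in scripts]


def run_in_order(commands, runner=subprocess.run):
    """Run each command in turn; check=True stops at the first failure."""
    for cmd in commands:
        runner(cmd, check=True)


# Example call (left commented out so importing this sketch runs nothing):
# run_in_order(build_commands(INVENTORY_SCRIPTS))
```

Stopping at the first failure matters here because each script reads files written by the previous ones.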
1.1 Generate market-maker labels
- Run generate_list_traders.py to get a list of trading accounts. Saves the resulting panel at ProcessedData\account.csv.
- Run account_dailysummary.py to obtain a trader-stock-day panel with the Kirilenko, Kyle, Samadi and Tuzun (2017, JF) variables: net position, number of trades/volume traded, and standard deviation of inventory. Saves the panel at ProcessedData\account_EODdata.csv.
- Run BBOpresence.py to obtain a panel with each trader's share of time at the best bid and at the best ask (or at both). Saves the panel at ProcessedData\BBO_presence_MM.csv.
- Run mm_hft_label.py to assign market-maker labels (using the KKST-based methodology and presence at the BBO) as well as HFT labels based on TMX information. Saves the file with labels at ProcessedData\mm_hft_labels.csv.
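As an illustration of the trader-stock-day variables built by account_dailysummary.py, here is a minimal pandas sketch. The column names (account, symbol, timestamp, signed_qty) are hypothetical, and the sketch assumes every account starts each day with zero inventory; the actual script may handle both differently.

```python
import pandas as pd


def daily_summary(trades: pd.DataFrame) -> pd.DataFrame:
    """Collapse a trade-level panel to one row per account, stock and day.

    Assumed (hypothetical) columns: account, symbol, timestamp (datetime),
    signed_qty (positive for buys, negative for sells).
    """
    trades = trades.sort_values("timestamp", kind="stable").copy()
    trades["date"] = trades["timestamp"].dt.date
    keys = ["account", "symbol", "date"]

    # Intraday inventory path: cumulative signed volume, flat at the open.
    trades["inventory"] = trades.groupby(keys)["signed_qty"].cumsum()
    trades["abs_qty"] = trades["signed_qty"].abs()

    summary = trades.groupby(keys).agg(
        net_position=("signed_qty", "sum"),   # end-of-day net position
        n_trades=("signed_qty", "size"),      # number of trades
        volume=("abs_qty", "sum"),            # shares traded
        inventory_sd=("inventory", "std"),    # std. dev. of intraday inventory
    )
    return summary.reset_index()
```

An account that buys 100 shares and then sells 40 and 60 on the same day ends with a net position of 0, three trades, and a volume of 200.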
1.2 Study market-maker inventories
- Run inventory_panel.py to get inventory panels for each market maker at various frequencies (e.g., 30 seconds, 1 minute, 30 minutes, 1 hour). Panels are saved in ProcessedData/Inventories/inventory_panel_FREQUENCY.csv.
- Run mean_reversion_inventory.py to estimate the inventory half-life (in minutes) for each market maker using an AR(1) process. The half-life panel is saved as ProcessedData/inventory_halflife.csv.
- Run mean_reversion_graph.py to generate a two-panel figure of BBO presence and inventory half-lives across market makers. The figure is saved as Output/BBOshares_Halflife_MM.png.
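The half-life estimate in mean_reversion_inventory.py rests on a standard AR(1) result: if inventory follows x_t = c + rho * x_{t-1} + e_t with 0 < rho < 1, a shock decays by half after ln(0.5) / ln(rho) periods. The sketch below illustrates that calculation with a plain OLS slope. The function names are hypothetical and the estimator used in the actual script may differ.

```python
import math


def ar1_coefficient(series):
    """OLS slope of x_t on x_{t-1}, with an intercept."""
    x_lag, x_now = series[:-1], series[1:]
    n = len(x_lag)
    mean_lag = sum(x_lag) / n
    mean_now = sum(x_now) / n
    cov = sum((a - mean_lag) * (b - mean_now) for a, b in zip(x_lag, x_now))
    var = sum((a - mean_lag) ** 2 for a in x_lag)
    return cov / var


def half_life_minutes(rho, step_minutes):
    """Half-life implied by an AR(1) coefficient, converted to minutes.

    step_minutes is the sampling interval of the inventory panel
    (0.5 for a 30-second panel). Returns NaN when rho is outside (0, 1),
    where the half-life is not defined.
    """
    if not 0 < rho < 1:
        return math.nan
    return math.log(0.5) / math.log(rho) * step_minutes
```

For example, a coefficient of 0.8 estimated on a 30-second panel implies a half-life of about 3.1 periods, or roughly 1.55 minutes.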
1.3 Study risk sharing by market makers
- Run risksharing_correlations.py to obtain the average pairwise correlation between MM inventories, based on the 30-second inventory panel. Output is saved as ProcessedData/Inventories/InvCorrelations.csv.
- Run risksharing_inefficiency.py to obtain estimates of MM risk-sharing inefficiency (based on a quadratic inventory model). The inefficiency panel is saved as ProcessedData/Inventories/InvInefficiency.csv.
- Run risksharing_graphs.py to plot the average pairwise correlation and risk-sharing inefficiency. Output is saved as Output/mm_risksharing.png.
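The average pairwise correlation computed by risksharing_correlations.py can be illustrated as follows. This is a sketch under assumptions: the input is taken to be a wide panel with one row per 30-second timestamp and one column per market maker, and average_pairwise_correlation is a hypothetical name. The actual script may group by stock or day before averaging.

```python
import numpy as np
import pandas as pd


def average_pairwise_correlation(inventories: pd.DataFrame) -> float:
    """Mean Pearson correlation over all distinct pairs of columns.

    inventories: rows are timestamps, columns are market makers
    (an assumed layout, not necessarily the one used in the repository).
    """
    corr = inventories.corr().to_numpy()
    # Keep each pair once: entries strictly above the diagonal.
    pairs = corr[np.triu_indices_from(corr, k=1)]
    return float(np.nanmean(pairs))
```

Averaging only the upper triangle keeps the diagonal (always 1) out of the mean and counts each pair of market makers once.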
2. Generate the snapshot-trader marginal quote panel and snapshot depth panel.
- All the scripts below can be run (in order) from the batch file Code\1_batch_getmainpanel.py.
- Run build_panel.py on the raw data to generate a folder with marginal quote panels for all traders (ProcessedData\MarginalQuotePanels\). The code uses auxiliary functions from functions_tmxdata.py.
- Run fill_zero_quant.py to take each marginal quote panel and:
  - Assign quantities of zero on the no-quote side for traders active on one side of the book. Such zero quantities enter the book "as if" they are at the lowest priority.
  - Merge the trader IDs with the market-maker or HFT label.
  - Resample the panels every 30s (to avoid double-counting quotes).
  - Save the resulting panels in ProcessedData\MarginalQuotePanels_ZF\.
- Running fill_zero_quant.py takes a long time. The folder Code\ForSlurm\ includes a slightly modified version and a SLURM script to run the code in parallel on the Rotman Research Node. The resulting panels need to be transferred manually to the repo, e.g., using WinSCP.
- Run select_mm_data.py to select only the market-maker quotes and save them to ProcessedData\mquotes_ZF_mm.csv. This is the main panel for regressions.
- Run depth_processing.py to build a panel with 30-second order book snapshots and correlation coefficients between inventory and queue position. The panel is saved as ProcessedData\depth_snapshots.csv. The script also generates two representative figures: Output/depth_correlation.png (impact of inventory-queue correlation on depth) and Output/rho_distribution.png (distribution of inventory-queue correlations).
- Use sumstats.py to generate the figures for adverse selection (Output/queue_quantities.png) and inventory (Output/inventory_concerns.png).
- Run mm_quotes_preliminaries.py to generate mquotes_mm.csv, which is the same as mquotes_ZF_mm.csv but without zero quantities filled.
- Run pivot_snapshots.py to obtain a panel with marginal inventories for MMs as a function of their relative priority level in the book. The panel is saved at ProcessedData/pivot_quotes_inventories.csv.
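The zero-filling and 30-second resampling steps of fill_zero_quant.py can be sketched in pandas as below. This is an illustration under assumptions, not the repository code. The column names (timestamp, trader, side, quantity) are hypothetical, "resampling" is read here as keeping the last observation in each 30-second bucket, and the queue-priority treatment of the zero quantities is not modelled.

```python
import pandas as pd

ID_COLS = ["timestamp", "trader"]


def fill_zero_side(quotes: pd.DataFrame) -> pd.DataFrame:
    """Give one-sided traders an explicit zero quantity on the other side.

    Assumed (hypothetical) columns: timestamp, trader,
    side ("bid" or "ask"), quantity.
    """
    wide = quotes.pivot_table(
        index=ID_COLS, columns="side", values="quantity", aggfunc="sum"
    )
    # Make sure both sides exist as columns, then turn missing sides into 0.
    wide = wide.reindex(columns=["bid", "ask"]).fillna(0)
    return wide.reset_index().melt(
        id_vars=ID_COLS, value_vars=["bid", "ask"],
        var_name="side", value_name="quantity",
    )


def resample_30s(panel: pd.DataFrame) -> pd.DataFrame:
    """Keep the last quote per trader and side in each 30-second bucket."""
    panel = panel.sort_values("timestamp", kind="stable")
    panel = panel.assign(timestamp=panel["timestamp"].dt.floor("30s"))
    return panel.groupby(
        ["timestamp", "trader", "side"], as_index=False
    )["quantity"].last()
```

Keeping a single observation per bucket is what prevents the same standing quote from being counted several times within a 30-second window.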
3. Econometric analysis
- Use RegressionCode/summary_stats.R to produce summary statistics tables for market makers (Output/Tables/mmstats.tex), non-market makers (Output/Tables/nonmmstats.tex), and all quote snapshots (Output/Tables/sumstats.tex).
- Use RegressionCode/regression_tests.R for econometric analysis on mquotes_ZF_mm.csv. Output is a table with structural regression estimates: Output/Tables/ASICtable_main.tex.
- Use RegressionCode/depth_tests.R for econometric analysis on depth_snapshots.csv. Output is a table with estimates on depth effects: Output/Tables/Depth_main.tex.
- Use RegressionCode/inventory_order_depth.R for econometric analysis on pivot_quotes_inventories.csv. Output is a table with the marginal impact of inventory as a function of queue position: Output/Tables/Depth_Ordering.tex.
4. Miscellanea
- Use theory_figure.py to generate the theory-supporting figures Output/theory_example_params.png and Output/marginal_impact_queue.png.
- Use get_average_trade.py to estimate the average trade size in the data.