# MATH97230 - Market microstructure

Imperial College London, MSc Mathematics and Finance

Academic year 2019-2020


Lecturer: Johannes Muhle-Karbe

Teaching assistant: Claudio Bellani



Coursework project

The following cells show how to load the pandas dataframes with pickle, a Python module that implements binary protocols for serializing and deserializing Python object structures. The dataframe entries are explained briefly at the end of this notebook. Our pandas dataframes were derived from LOBSTER csv files; see https://lobsterdata.com/info/DataStructure.php

Import the required modules:

In [1]:
import sys
import pandas as pd
import pickle

Specify the ticker of interest:

In [2]:
symbol='INTC'

In [3]:
# load the list of available dates 
with open('./data/{}/dates'.format(symbol),'rb') as source:
    dates=pickle.load(source)

Let us look at, for example, the trades on the first available date:

In [4]:
date=dates[0]   
print('date: {}\n'.format(date))  
with open('./data/{}/{}_{}_trades'.format(symbol,symbol,date),'rb') as source:
    df_trades = pickle.load(source)
df_trades    

date: 2019-01-02



| | direction | eventType | level | price | size | time | delta_t |
|---|---|---|---|---|---|---|---|
| 0 | -1 | 4 | 1 | 465500 | 58 | 36000.074815 | 36000.074815 |
| 1 | 1 | 4 | 0 | 465400 | 300 | 36000.120767 | 0.045953 |
| 2 | 1 | 4 | 1 | 465300 | 200 | 36000.426168 | 0.305400 |
| 3 | 1 | 4 | 0 | 465300 | 550 | 36000.441095 | 0.014927 |
| 4 | -1 | 4 | 0 | 465300 | 1 | 36001.308240 | 0.867145 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 11264 | 1 | 4 | 1 | 470800 | 397 | 57599.065836 | 0.048791 |
| 11265 | 1 | 4 | 1 | 470800 | 700 | 57599.204977 | 0.139140 |
| 11266 | 1 | 4 | 1 | 470800 | 600 | 57599.413461 | 0.208484 |
| 11267 | 1 | 4 | 0 | 470850 | 600 | 57599.906749 | 0.493288 |
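
The loading steps above could be wrapped in small helpers when several dates are needed. The following is a minimal sketch, assuming the directory layout `./data/<symbol>/<symbol>_<date>_trades` used in the cells above; the function names `load_trades` and `load_all_trades` and the parameter `data_dir` are hypothetical and not part of the coursework material.

```python
import pickle
from pathlib import Path

import pandas as pd


def load_trades(symbol, date, data_dir="./data"):
    """Load the trades dataframe of one ticker for one date (hypothetical helper)."""
    # Same file naming as in the cells above: <data_dir>/<symbol>/<symbol>_<date>_trades
    path = Path(data_dir) / symbol / "{}_{}_trades".format(symbol, date)
    with open(path, "rb") as source:
        return pickle.load(source)


def load_all_trades(symbol, dates, data_dir="./data"):
    """Stack the daily dataframes, keyed by date (hypothetical helper)."""
    daily = {date: load_trades(symbol, date, data_dir) for date in dates}
    # The outer index level holds the date, the inner level the original row index.
    return pd.concat(daily, names=["date", "row"])
```

With the variables defined earlier in this notebook, `load_all_trades(symbol, dates)` would return a single dataframe covering every available date.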


# Entries of our dataframe

The dataframe reports the trades that occurred between 10am and 4pm on the corresponding date.

The column "eventType" reports the LOBSTER code for the event, which in our case can be either $4$ (execution of a visible limit order) or $5$ (execution of a hidden limit order). The column "direction" also follows the LOBSTER labelling: it takes the value $-1$ if the executed limit order was a sell order (hence a buyer-initiated trade), and the value $1$ if the executed limit order was a buy order (hence a seller-initiated trade).
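
Since "direction" refers to the resting limit order, the sign of the trade initiator is its opposite. The sketch below illustrates this on a few rows copied from the sample output above; the derived columns `initiator`, `signed_size` and `hidden` are hypothetical names introduced here for illustration only.

```python
import pandas as pd

# A few rows copied from the sample output (INTC, 2019-01-02).
sample = pd.DataFrame(
    {
        "direction": [-1, 1, 1, 1, -1],
        "eventType": [4, 4, 4, 4, 4],
        "size": [58, 300, 200, 550, 1],
    }
)

# "direction" describes the executed limit order, so the initiator
# of the trade has the opposite sign:
# +1 for a buyer-initiated trade, -1 for a seller-initiated trade.
sample["initiator"] = -sample["direction"]

# Signed traded volume (positive when buyers initiate).
sample["signed_size"] = sample["initiator"] * sample["size"]

# Flag executions against hidden limit orders (LOBSTER event type 5).
sample["hidden"] = sample["eventType"] == 5

net_buy_volume = int(sample["signed_size"].sum())
```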

I already aggregated executions with the same timestamp, so that the column "time" is strictly increasing and the column "size" reports the aggregated size of the limit orders executed at the corresponding timestamp.
The column "price" reports the price of the trade.
Time is measured in seconds after midnight, size is measured in number of shares, and price is measured in units of $10^{-4}$ US dollars, so that, for example, a reported price of $465500$ corresponds to $46.55$ US dollars.
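
As an illustration of these units, the sketch below converts prices to US dollars and times to calendar timestamps for a few rows copied from the sample output above. The column names `price_usd` and `timestamp` are hypothetical, and the date is assumed to be available as a string in the format printed above.

```python
import pandas as pd

# A few rows copied from the sample output (INTC, 2019-01-02).
sample = pd.DataFrame(
    {
        "price": [465500, 465400, 465300],
        "time": [36000.074815, 36000.120767, 36000.426168],
    }
)
date = "2019-01-02"  # assumed format, as printed by the loading cell

# Prices are reported in units of 10^{-4} US dollars.
sample["price_usd"] = sample["price"] / 10_000

# Times are seconds after midnight; add them to the calendar date.
sample["timestamp"] = pd.Timestamp(date) + pd.to_timedelta(sample["time"], unit="s")

# After aggregation, the time column should be strictly increasing.
strictly_increasing = sample["time"].is_monotonic_increasing and sample["time"].is_unique
```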

When an order walks the book, the reported price is that of the first level. However, not all information is lost, thanks to the column "level". This column (not originally present in the LOBSTER message file) reports the level of the LOB at which the trade occurred, relative to the snapshot taken immediately after its execution. Hence, every order eats $1-\ell$ levels, where $\ell$ is the number reported in the column "level"; for instance, $\ell=1$ corresponds to $0$ levels eaten and $\ell=0$ corresponds to $1$ level eaten.
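
The rule $1-\ell$ can be applied column-wise. The following sketch does so for the "level" entries of the first five rows of the sample output above; the column name `levels_eaten` is a hypothetical choice.

```python
import pandas as pd

# "level" entries of the first five rows of the sample output.
sample = pd.DataFrame({"level": [1, 0, 1, 0, 0]})

# Number of levels eaten by each trade, following the rule 1 - level.
sample["levels_eaten"] = 1 - sample["level"]

# Share of the sampled trades that ate at least one level.
share_eating = float((sample["levels_eaten"] >= 1).mean())
```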