# FE 520 - Final Project

By
Naveen Nagarajan, Andy Dominiquez and Andrea Cherayath


## Market Microstructure Analysis Using finlib Package

This notebook demonstrates market microstructure analysis using the custom finlib package, which implements several key measures and models:

### Trade Direction Classification
- Implementation of the Lee-Ready algorithm, which classifies each trade as a buy or a sell by comparing the trade price with the prevailing bid and ask quotes
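
As an illustration of the classification rule, here is a minimal sketch of a Lee-Ready style classifier. It is not the code behind `td.lee_ready_direction`; the function name `lee_ready_sketch` and the `+1`/`-1`/`0` encoding are assumptions made for this example, and quotes are used contemporaneously rather than lagged.

```python
import numpy as np
import pandas as pd


def lee_ready_sketch(price: pd.Series, bid: pd.Series, ask: pd.Series) -> pd.Series:
    """Return +1 for buys, -1 for sells, 0 when a trade cannot be classified."""
    mid = (bid + ask) / 2
    # Quote test: a trade above the quote midpoint is a buy, below it a sell.
    # The result is NaN when no quote is available and 0 at the midpoint.
    quote_sign = np.sign(price - mid)
    # Tick test: sign of the most recent non-zero price change.
    tick_sign = np.sign(price.diff()).replace(0, np.nan).ffill()
    # Prefer the quote test; fall back to the tick test when it is indecisive.
    decisive = quote_sign.notna() & (quote_sign != 0)
    direction = quote_sign.where(decisive, tick_sign)
    return direction.fillna(0).astype(int)


# Hypothetical toy data: the last trade has no quote, so the tick test decides it.
price = pd.Series([10.5, 10.0, 10.25, 10.25, 10.0])
bid = pd.Series([10.0, 10.0, 10.0, 10.0, np.nan])
ask = pd.Series([10.5, 10.5, 10.5, 10.5, np.nan])
directions = lee_ready_sketch(price, bid, ask)
```

Trades at the midpoint and trades without a quote both fall through to the tick test, which is why the first trade of the day can remain unclassified.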

### Information-Based Trading Measures
- Probability of Informed Trading (PIN), estimated with the EKOP (Easley, Kiefer, O'Hara, and Paperman) model to measure information asymmetry
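
In the EKOP model, PIN is the share of expected order flow that comes from informed traders. The sketch below only evaluates that closed-form ratio from already estimated parameters; it does not perform the maximum-likelihood estimation, and it is not the code behind `pm.pin_ekop`. The function name `pin_from_parameters` and its argument names are assumptions made for this example.

```python
def pin_from_parameters(alpha: float, mu: float, epsilon: float) -> float:
    """PIN = alpha * mu / (alpha * mu + 2 * epsilon).

    alpha   : probability that an information event occurs
    mu      : arrival rate of informed orders
    epsilon : arrival rate of uninformed orders on each side (buy and sell)
    """
    informed_flow = alpha * mu
    return informed_flow / (informed_flow + 2 * epsilon)


# Hypothetical parameter values, chosen only to illustrate the formula.
example_pin = pin_from_parameters(alpha=0.5, mu=80.0, epsilon=80.0)
```

The factor of 2 appears because uninformed traders arrive at rate `epsilon` on both the buy side and the sell side.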

### Volatility Analysis
- Roll's model for estimating the effective spread and volatility from a series of trade prices
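
Roll's estimator infers the spread from the negative first-order autocovariance of price changes that bid-ask bounce induces. The following is a minimal sketch of that spread estimate only, under the assumption that it is computed on raw price changes; it is not the code behind `vm.roll_model_analysis`, and the name `roll_spread_sketch` is hypothetical.

```python
import numpy as np
import pandas as pd


def roll_spread_sketch(prices: pd.Series) -> float:
    """Roll spread estimate: 2 * sqrt(-Cov(dp_t, dp_{t-1}))."""
    dp = prices.diff().dropna()
    # Series.cov skips the pair lost to the shift (it ignores NaN pairs).
    autocov = dp.cov(dp.shift(1))
    # The estimator is undefined when the autocovariance is not negative.
    if not autocov < 0:
        return float("nan")
    return 2 * float(np.sqrt(-autocov))


# Hypothetical toy series that bounces between a bid of 100.0 and an ask of 100.5.
bouncing = pd.Series([100.0, 100.5] * 50 + [100.0])
bounce_estimate = roll_spread_sketch(bouncing)

# A trending series has positive autocovariance, so no estimate is returned.
trending = pd.Series([1.0, 2.0, 4.0, 7.0, 11.0, 16.0])
trend_estimate = roll_spread_sketch(trending)
```

Roll's derivation assumes trade directions are independent from one trade to the next, so a perfectly alternating series like the toy example overstates the true spread; it is used here only to produce a clearly negative autocovariance.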

### Liquidity Measures
- Quoted spread: the difference between the best ask and best bid prices
- Effective spread: the actual execution cost, which incorporates the trade direction
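
The two liquidity measures can be sketched as follows. This is an illustration of the standard definitions, not the code in `finlib.liquidity_measures`; the function names, the argument order, and the choice to average over all observations are assumptions made for this example.

```python
import pandas as pd


def quoted_spread_sketch(bid: pd.Series, ask: pd.Series) -> float:
    """Average quoted spread: mean of (ask - bid)."""
    return float((ask - bid).mean())


def effective_spread_sketch(price: pd.Series, bid: pd.Series,
                            ask: pd.Series, direction: pd.Series) -> float:
    """Average effective spread: mean of 2 * direction * (price - midpoint)."""
    mid = (bid + ask) / 2
    return float((2 * direction * (price - mid)).mean())


# Hypothetical toy data: one buy at the ask and one sell at the bid.
bid = pd.Series([10.0, 10.0])
ask = pd.Series([10.5, 10.5])
price = pd.Series([10.5, 10.0])
direction = pd.Series([1, -1])

quoted = quoted_spread_sketch(bid, ask)
effective = effective_spread_sketch(price, bid, ask, direction)
```

With the quoted spread defined as ask minus bid, the average is non-negative whenever the ask is at or above the bid, so a negative value usually points to swapped arguments or crossed quotes in the data.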

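The microstructure analysis is performed on Tesla (TSLA) high-frequency trade and quote data from October 21, 2024. The investment metrics module is demonstrated separately on Apple (AAPL) adjusted closing prices.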


In [8]:
import pandas as pd
import finlib.trade_direction as td
import finlib.pin_measure as pm
import finlib.volatility_measures as vm
import finlib.liquidity_measures as lm
import finlib.investment_metrics as im
import numpy as np
import warnings
warnings.filterwarnings('ignore')

In [9]:
# Reload the module so that local edits to finlib are picked up without restarting the kernel
import importlib
importlib.reload(lm)

<module 'finlib.liquidity_measures' from '/home/nnagarajan/workspace/Stevens FA/FE520 - Introduction to Python for Financial Apps/project/finlib/liquidity_measures.py'>

In [2]:
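# Load the condensed TSLA trade-and-quote file and parse the timestamps
tsla_taq = pd.read_csv("tsla_taq_20241021_condensed.csv")
tsla_taq['Date.Time'] = pd.to_datetime(tsla_taq['Date.Time'])
# Drop the leftover index column that was written out with the CSV
tsla_taq.drop('Unnamed: 0', axis=1, inplace=True)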

In [3]:
tsla_taq

| | Date.Time | Ex.Cntrb.ID | Bid.Price | Bid.Size | Ask.Price | Ask.Size | Tick.Dir. | Price | Volume |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2024-10-21 04:00:00.098902812-04:00 | PSE | | | | | | 219.70 | 541.0 |
| 1 | 2024-10-21 04:00:00.116955090-04:00 | DEX | 219.66 | 1.0 | 219.70 | 2.0 | | 221.00 | 6.0 |
| 2 | 2024-10-21 04:00:00.116955090-04:00 | DEX | 219.66 | 1.0 | 219.70 | 2.0 | | 220.89 | 1.0 |
| 3 | 2024-10-21 04:00:00.116955090-04:00 | DEX | 219.66 | 1.0 | 219.70 | 2.0 | | 220.89 | 10.0 |
| 4 | 2024-10-21 04:00:00.116955090-04:00 | DEX | 219.66 | 1.0 | 219.70 | 2.0 | | 220.70 | 3.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 817711 | 2024-10-21 19:59:55.963028354-04:00 | PSE | 218.50 | 4.0 | 218.55 | 30.0 | | 218.54 | 9.0 |
| 817712 | 2024-10-21 19:59:55.963570609-04:00 | PSE | 218.50 | 4.0 | 218.55 | 30.0 | | 218.54 | 1.0 |
| 817713 | 2024-10-21 19:59:56.043247510-04:00 | ADF | 218.50 | 4.0 | 218.55 | 30.0 | | 218.50 | 25.0 |
| 817714 | 2024-10-21 19:59:56.078043716-04:00 | ADF | 218.50 | 4.0 | 218.55 | 30.0 | | 218.50 | 1.0 |


In [4]:
trade_directions=td.lee_ready_direction(tsla_taq["Price"],tsla_taq["Bid.Price"],tsla_taq["Ask.Price"])

In [5]:
tsla_taq["Direction"]=trade_directions

In [6]:
# Split trade volumes by classified direction (1 = buy, -1 = sell)
tsla_buy_vols = tsla_taq[tsla_taq["Direction"] == 1]["Volume"]
tsla_sell_vols = tsla_taq[tsla_taq["Direction"] == -1]["Volume"]

In [10]:
pm.pin_ekop(tsla_buy_vols, tsla_sell_vols)

np.float64(0.09090909090909091)

In [65]:
vm.roll_model_analysis(tsla_taq["Price"].dropna())

{'Average Price': 218.31344714070912,
 'Daily Volatility (Roll)': 458.1250074867424,
 'Annualized Volatility (Roll)': 7272.509035137145,
 'Log-Normal Annualized Volatility (Roll)': 33.31223582599476,
 'Total Daily Volatility': 904.2754005279586,
 'Total Annualized Volatility': 14354.926959061826,
 'Log-Normal Total Annualized Volatility': 65.75374603383764}

In [80]:
lm.quoted_spread(tsla_taq["Bid.Price"],tsla_taq["Ask.Price"])

-0.042639700433158845

In [81]:
lm.effective_spread(tsla_taq["Price"],tsla_taq["Bid.Price"],tsla_taq["Ask.Price"],trade_directions)

0.04251152849774049

### Investment Metrics Module

In [None]:
aapl_data = pd.read_csv("AAPL_Recent_May2023_May2024.csv", parse_dates=["Date"])
aapl_data.set_index("Date", inplace=True)
prices = aapl_data["Adj Close"]

In [None]:
start_val = prices.iloc[0]
end_val = prices.iloc[-1]
returns = np.log(prices / prices.shift(1)).dropna()

# Calculate metrics
cagr_val = im.cagr(start_val, end_val, 1)
roi_val = im.roi(end_val, start_val)
sharpe_val = im.sharpe_ratio(returns)
volatility_val = im.calculate_volatility(prices)

# Display metrics
df_metrics = pd.DataFrame({
    "Metric": ["CAGR", "ROI", "Sharpe Ratio", "Volatility"],
    "Value": [cagr_val, roi_val, sharpe_val, volatility_val]
})
print(df_metrics)

# Plot
im.plot_price_and_returns(prices)
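
For reference, the first three metrics above have standard textbook definitions, sketched below. These are not the `finlib.investment_metrics` implementations; the function names, the zero default risk-free rate, and the 252-period annualization factor are assumptions made for this example.

```python
import numpy as np
import pandas as pd


def cagr_sketch(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


def roi_sketch(end_value: float, start_value: float) -> float:
    """Return on investment: (end - start) / start."""
    return (end_value - start_value) / start_value


def sharpe_sketch(returns: pd.Series, risk_free_rate: float = 0.0,
                  periods_per_year: int = 252) -> float:
    """Annualized Sharpe ratio computed from per-period returns."""
    # Convert the annual risk-free rate to a per-period rate before subtracting.
    excess = returns - risk_free_rate / periods_per_year
    return float(np.sqrt(periods_per_year) * excess.mean() / excess.std())


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
example_cagr = cagr_sketch(100.0, 121.0, 2)    # 21% over two years
example_roi = roi_sketch(121.0, 100.0)
example_sharpe = sharpe_sketch(pd.Series([0.01, 0.02, 0.03]))
```

Over a one-year horizon CAGR and ROI coincide, so the two values should only differ when the holding period passed to the CAGR calculation is not one year.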