Expected Mispricing

by Turan G. Bali, Heiner Beckmeyer and Timo Wiedemann (2023)

Overview

This repository provides replication code for the paper Expected Mispricing by Bali, Beckmeyer and Wiedemann (2023). Please cite this paper if you are using the code or data:

@techreport{bali2023expected,
  title={{Expected Mispricing}},
  author={Bali, Turan G. and Beckmeyer, Heiner and Wiedemann, Timo},
  type={{Working Paper}},
  institution={{Available at SSRN}},
  year={2023}
}

1. Dataset creation:

The file 1_create_dataset.py creates the initial dataset. We use the code provided by Jensen, Kelly and Pedersen (2023) (GitHub) to obtain a time series of 153 monthly firm-level characteristics and apply the filters proposed by those authors.

2. Obtain stock-specific realized mispricing via IPCA:

The file 2_run_ipca.py estimates monthly stock-specific realized mispricing (MP), defined as the residual return component relative to a six-factor IPCA model. The estimation procedure avoids forward-looking information: when calculating the realized mispricing for the next month, $t+1$, the model uses only information available at time $t$.
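The residual step can be illustrated with a minimal sketch. It is not the code in 2_run_ipca.py: the function name `realized_mispricing`, the argument names, and the simulated inputs are all hypothetical, and it assumes the standard IPCA setup in which conditional betas are characteristics times a loading matrix and the factor realization comes from a cross-sectional regression of returns on those betas.

```python
import numpy as np


def realized_mispricing(z_t, gamma_t, r_next):
    """Residual of month t+1 returns relative to an IPCA-style factor model.

    z_t     : (N, L) characteristics observed at time t
    gamma_t : (L, K) loading matrix, assumed estimated with data up to t only
    r_next  : (N,)   excess returns realized over month t+1
    """
    beta_t = z_t @ gamma_t  # (N, K) conditional betas, known at time t
    # Factor realization for t+1: cross-sectional regression of returns on betas
    f_next, *_ = np.linalg.lstsq(beta_t, r_next, rcond=None)
    # What the factors cannot explain is treated as realized mispricing
    return r_next - beta_t @ f_next


# Simulated inputs for illustration only (not real data)
rng = np.random.default_rng(0)
n_stocks, n_chars, n_factors = 500, 10, 6
z_t = rng.standard_normal((n_stocks, n_chars))
gamma_t = rng.standard_normal((n_chars, n_factors))
r_next = 0.1 * rng.standard_normal(n_stocks)

mp_next = realized_mispricing(z_t, gamma_t, r_next)
```

By construction the residual is orthogonal to the time-$t$ betas in the cross-section, so it captures only the part of the return that the six factors leave unexplained.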

3. Calculate expected mispricing:

Our measure of firm $i$'s expected mispricing is expressed as a non-linear function $g$ of today's firm characteristics $z_{i,t}$, $E_t[MP_{i,t+1}] = g(z_{i,t})$. We approximate $g(\cdot)$ by three well-established machine learning estimators: (i) a feed-forward neural network with three hidden layers; (ii) a gradient-boosted regression tree; and (iii) a random forest. The following files estimate the models, respectively:

3_run_nn.py estimates the feed-forward neural network.
3_run_gbt.py estimates the gradient-boosted regression tree.
3_run_rf.py estimates the random forest.
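As a rough illustration of fitting the three estimator types to the same characteristics, the sketch below uses scikit-learn. This is an assumption made for brevity: the libraries, architectures, and hyperparameters used in the 3_run_*.py scripts may differ, and `fit_estimators`, the layer sizes, and the simulated data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.neural_network import MLPRegressor


def fit_estimators(z_train, mp_train, seed=0):
    """Fit one model per estimator type on (characteristics, realized MP)."""
    models = {
        # Feed-forward network with three hidden layers (sizes are illustrative)
        "nn": MLPRegressor(hidden_layer_sizes=(32, 16, 8), max_iter=500,
                           random_state=seed),
        "gbt": GradientBoostingRegressor(n_estimators=100, max_depth=2,
                                         random_state=seed),
        "rf": RandomForestRegressor(n_estimators=100, max_depth=4,
                                    random_state=seed),
    }
    for model in models.values():
        model.fit(z_train, mp_train)
    return models


# Simulated characteristics and a non-linear target, for illustration only
rng = np.random.default_rng(0)
z = rng.uniform(-0.5, 0.5, size=(400, 5))
mp = 0.02 * np.sin(3 * z[:, 0]) + 0.01 * rng.standard_normal(400)

# Train on the first 300 observations, forecast the remaining 100
models = fit_estimators(z[:300], mp[:300])
forecasts = {name: model.predict(z[300:]) for name, model in models.items()}
```

Each entry of `forecasts` plays the role of one approximation of $g(z_{i,t})$.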

We obtain our final measure of expected mispricing as an equal-weighted ensemble of these three forecasts: $E_t[MP_{i,t+1}] = \left(g^{NN}(z_{i,t}) + g^{GBT}(z_{i,t}) + g^{RF}(z_{i,t})\right) / 3$.
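The equal-weighted ensemble amounts to an element-wise average of the three model forecasts. A minimal sketch, with a hypothetical helper `ensemble_forecast` and made-up forecast values for three stocks:

```python
import numpy as np


def ensemble_forecast(g_nn, g_gbt, g_rf):
    """Equal-weighted average of the three model forecasts, stock by stock."""
    return (np.asarray(g_nn) + np.asarray(g_gbt) + np.asarray(g_rf)) / 3.0


# Made-up forecasts for three stocks in one month (illustration only)
g_nn = np.array([0.010, -0.020, 0.000])
g_gbt = np.array([0.020, -0.010, 0.003])
g_rf = np.array([0.000, -0.030, 0.003])

expected_mp = ensemble_forecast(g_nn, g_gbt, g_rf)
```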

4. Generate mispricing dataset:

The file create_online_data.py shows how we build the ensemble forecasts and writes the file that we make publicly available. The Apache Parquet file EMP_data.pq contains the following information:

Variable	Description
date	Datetime index $t$ (monthly)
permno	CRSP permanent stock identifier
expected_mp	Next month $t+1$ (expected) mispricing (model prediction)
lead1m_mp	Next month $t+1$ realized mispricing (i.e., IPCA residual component)

We also provide a .csv file (EMP_data.csv) with the same information.

For convenience, both files can be downloaded directly from Dropbox, with firm-level mispricing data covering January 1993 through December 2022.
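A minimal sketch of working with the published file in pandas follows. The helper names `load_emp` and `monthly_forecast_correlation` are hypothetical, as are the toy values and permno codes; the sketch assumes the column names listed in the table above and that the CSV stores `date` as a regular column. To stay runnable without the download, it round-trips a small toy frame through a temporary CSV.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_emp(path):
    """Load the mispricing data from either the .csv or the Parquet file."""
    path = Path(path)
    if path.suffix == ".csv":
        df = pd.read_csv(path, parse_dates=["date"])
    else:
        df = pd.read_parquet(path)  # needs pyarrow or fastparquet installed
    if "date" not in df.columns:
        df = df.reset_index()  # assumption: date may be stored as the index
    return df


def monthly_forecast_correlation(df):
    """Cross-sectional correlation of expected and realized MP, per month."""
    return pd.Series({
        date: group["expected_mp"].corr(group["lead1m_mp"])
        for date, group in df.groupby("date")
    })


# Toy data with the documented columns (values are made up)
toy = pd.DataFrame({
    "date": pd.to_datetime(["1993-01-31"] * 3 + ["1993-02-28"] * 3),
    "permno": [10001, 10002, 10003] * 2,
    "expected_mp": [0.01, 0.00, -0.01, 0.02, 0.00, -0.02],
    "lead1m_mp": [0.03, 0.01, -0.02, -0.01, 0.00, 0.01],
})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "EMP_data.csv"
    toy.to_csv(csv_path, index=False)
    emp = load_emp(csv_path)

corr_by_month = monthly_forecast_correlation(emp)
```

With the real download, the same functions would be called as `load_emp("EMP_data.pq")` or `load_emp("EMP_data.csv")`.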
