# Topological Martingale Risk

**Author:** Zacharia Moussallati  
**Date:** 2025-08-31

---

## Overview

This project investigates the relationship between **market correlation topology** and **martingale-based pricing errors** using historical stock data. By combining **topological data analysis (TDA)** with **Monte Carlo martingale simulations**, we explore whether structural changes in asset correlations can predict deviations in pricing models.

The pipeline is fully automated, from data acquisition to integration and plotting, and is ready to run for any set of tickers.

---

## Motivation

- Traditional quantitative finance assumes that discounted asset prices follow a martingale process under the risk-neutral measure.  
- Empirically, deviations occur due to complex interdependencies between assets.  
- **Topological data analysis (TDA)** provides tools to quantify the “shape” of correlations using **persistent homology**.  
- By integrating **persistent homology features** with Monte Carlo pricing, we aim to detect periods where market topology indicates higher risk of pricing errors.

---

## Methodology

### 1. Data Acquisition
- Adjusted close prices are downloaded from Yahoo Finance using `yfinance`.  
- Daily log returns are computed for each asset.
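
The log-return step can be sketched as follows. This is a minimal illustration, not the repository's code: the function name `log_returns` and the price values are hypothetical, and the download via `yfinance` is left out so the block runs offline.

```python
import numpy as np

def log_returns(prices):
    """Daily log returns r_t = ln(P_t / P_{t-1}) for each asset column.

    `prices` is a (days x assets) array of adjusted close prices.
    """
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices), axis=0)

# Hypothetical prices for two assets over four days (illustrative only).
prices = np.array([
    [100.0, 50.0],
    [101.0, 49.5],
    [102.5, 50.5],
    [101.8, 51.0],
])
returns = log_returns(prices)
print(returns.shape)  # (3, 2)
```

Log returns are additive over time, so the column sums of `returns` equal the log of the last price divided by the first.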

### 2. Correlation Networks
- Compute the correlation matrix of asset returns.  
- Convert correlations to a distance metric:  

$$
d_{ij} = \sqrt{2(1 - \rho_{ij})}
$$

where $\rho_{ij}$ is the Pearson correlation between assets $i$ and $j$.  
- Construct a **Minimum Spanning Tree (MST)** to visualise the correlation network.
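
A minimal sketch of this step, assuming returns are held in a (days x assets) NumPy array. The helper names and the synthetic returns are illustrative assumptions, and Prim's algorithm is used here for the MST; the pipeline itself may build the tree differently.

```python
import numpy as np

def correlation_distance(returns):
    """d_ij = sqrt(2 * (1 - rho_ij)) from a (days x assets) return array."""
    rho = np.corrcoef(returns, rowvar=False)
    # Clip guards against tiny negative values caused by floating-point rounding.
    dist = np.sqrt(np.clip(2.0 * (1.0 - rho), 0.0, None))
    np.fill_diagonal(dist, 0.0)
    return dist

def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense distance matrix; returns (i, j, d_ij) edges."""
    n = dist.shape[0]
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree.
        best = min(
            ((i, j, float(dist[i, j]))
             for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda edge: edge[2],
        )
        edges.append(best)
        in_tree.add(best[1])
    return edges

# Synthetic returns: a common factor plus noise (illustrative only).
rng = np.random.default_rng(0)
common = rng.normal(size=(250, 1))
returns = 0.01 * (common + rng.normal(size=(250, 5)))

dist = correlation_distance(returns)
mst_edges = minimum_spanning_tree(dist)
print(len(mst_edges))  # 4
```

Distances lie in $[0, 2]$: perfectly correlated assets sit at distance 0, uncorrelated assets at $\sqrt{2}$, and perfectly anti-correlated assets at 2.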

### 3. Topological Analysis (TDA)
- Use **Vietoris-Rips persistence** to compute 0D and 1D homology of the correlation distance matrix.  

**Key features:**
- **H0**: connected components (Betti0)  
- **H1**: loops (Betti1 / persistence intervals)  

Compute **H1 total persistence**:

$$
\text{H1 Total Persistence} = \sum_i (\text{death}_i - \text{birth}_i)
$$

This measures the “strength” of loop structures in the correlation network.
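
The summation itself is simple once the H1 intervals are available. The sketch below assumes the intervals arrive as `(birth, death)` pairs, as a TDA library would typically return them; the interval values are hypothetical, and skipping intervals with infinite death is an assumption of this sketch rather than something the pipeline is known to do.

```python
import math

def total_persistence(intervals):
    """Sum of (death - birth) over persistence intervals with finite death."""
    return sum(
        death - birth
        for birth, death in intervals
        if math.isfinite(death)
    )

# Hypothetical H1 intervals (birth, death) for one correlation distance matrix.
h1_intervals = [(0.50, 0.80), (0.60, 0.65), (0.70, 1.10)]
h1_total = total_persistence(h1_intervals)
print(h1_total)
```

A long-lived loop such as `(0.70, 1.10)` contributes far more than a short-lived one such as `(0.60, 0.65)`, so the total is dominated by persistent structure and is less sensitive to noise.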

### 4. Monte Carlo Martingale Pricing
For each rolling window of returns:

$$
S_{t+\Delta t} = S_t \exp\Big((\mu - 0.5\sigma^2)\Delta t + \sigma \sqrt{\Delta t} Z_t\Big), \quad Z_t \sim N(0,1)
$$

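- $S_t$ is the asset price, the drift is set to $\mu = 0$ so that the simulated price is a martingale, and $\sigma$ is estimated from the returns in the window.  
- Compute the mean Monte Carlo terminal price and compare it to the observed mean price.  
- Define the pricing error as the absolute difference: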

$$
\text{Pricing Error} = \big| \text{MC Price} - \text{Observed Mean Price} \big|
$$
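
A minimal sketch of the simulation and error computation for one window. Several choices here are assumptions, not confirmed by the source: the terminal price is drawn in a single exact step over the whole horizon (equivalent in distribution to chaining daily steps), the starting price is the first price in the window, the time unit is one trading day, and the function names and price values are hypothetical.

```python
import numpy as np

def mc_terminal_mean(s0, sigma, horizon, n_paths=100_000, mu=0.0, seed=0):
    """Mean terminal price of GBM, simulated in one exact step of length `horizon`."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    drift = (mu - 0.5 * sigma**2) * horizon
    terminal = s0 * np.exp(drift + sigma * np.sqrt(horizon) * z)
    return float(terminal.mean())

def pricing_error(mc_price, observed_mean_price):
    """Absolute difference between the Monte Carlo price and the observed mean price."""
    return abs(mc_price - observed_mean_price)

# Hypothetical window of daily prices (illustrative only).
window_prices = np.array([100.0, 101.2, 100.5, 102.0, 101.1, 103.0])
window_returns = np.diff(np.log(window_prices))
sigma = window_returns.std(ddof=1)  # daily volatility estimated from the window

mc_price = mc_terminal_mean(window_prices[0], sigma, horizon=len(window_returns))
error = pricing_error(mc_price, window_prices.mean())
print(mc_price, error)
```

With $\mu = 0$ the expected terminal price equals the starting price, so `mc_price` stays close to `window_prices[0]` up to sampling noise, and the error mostly reflects how far the observed mean price drifted from the start of the window.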

### 5. Integration
- For each rolling window, compute H1 total persistence and Monte Carlo pricing error.  
- Produce a **scatter plot** of `H1 Total Persistence` vs `Pricing Error`.  
- Save integration results as CSV for further analysis.
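
The rolling-window indexing behind this step might look like the sketch below. The window length and step are not stated in this README, so the values used here are purely illustrative, and `rolling_windows` is a hypothetical helper name.

```python
def rolling_windows(n_obs, window, step=1):
    """Yield (start, end) index pairs for half-open windows [start, end)."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    for start in range(0, n_obs - window + 1, step):
        yield start, start + window

# Ten observations, windows of four, advancing three at a time.
windows = list(rolling_windows(n_obs=10, window=4, step=3))
print(windows)  # [(0, 4), (3, 7), (6, 10)]
```

Each `(start, end)` pair selects one slice of the returns, from which one H1 total persistence value and one pricing error are computed, giving one point in the scatter plot and one row in the CSV.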

---

## Installation

```bash
# Clone repository
git clone https://github.com/zachmoussallati/topological-martingale-risk.git
cd topological-martingale-risk

# Create virtual environment (recommended)
python -m venv .venv
# Activate:
# Windows:
.venv\Scripts\activate
# Linux/Mac:
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

---

## Usage

Run the full pipeline:

```bash
python main.py --tickers AAPL MSFT AMZN GOOGL META --start 2015-01-01 --end 2025-08-01
```

---

## Outputs

- `results/summaries/`: CSVs of prices, returns, persistence diagrams, Monte Carlo pricing, and integration results  
- `results/plots/`: MST plots, persistence histograms, and H1 total persistence vs pricing error scatter plots  

**Example plots:**

- `mst.png`: Correlation network  
- `persistence_histogram.png`: Histogram of H0/H1 lifetimes  
- `h1_total_vs_price.png`: H1 Total Persistence vs Pricing Error  

---

## Derivation Notes

1. **Correlation distance**: transforms correlation to Euclidean-like distance suitable for TDA.  
2. **Persistent homology**: captures multi-scale topological features of the correlation network.  
3. **Monte Carlo pricing**: simple discrete-time GBM under the martingale assumption ($\mu = 0$).  
4. **Rolling window**: ensures temporal variation and produces meaningful scatter plots rather than a single point.  
5. **Integration metric**: `H1 Total Persistence` is paired with the pricing error in each window to test whether loop structures in the correlation network relate to deviations from martingale pricing.  
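
The first and third notes can be made explicit with a short derivation, assuming each return series is standardised over the window. Let $\tilde r_i$ denote asset $i$'s returns, de-meaned and scaled to unit Euclidean norm, so that $\tilde r_i \cdot \tilde r_j = \rho_{ij}$. Then

$$
\lVert \tilde r_i - \tilde r_j \rVert^2 = \lVert \tilde r_i \rVert^2 + \lVert \tilde r_j \rVert^2 - 2\,\tilde r_i \cdot \tilde r_j = 2(1 - \rho_{ij}) = d_{ij}^2
$$

so $d_{ij}$ is the Euclidean distance between standardised return vectors and satisfies the triangle inequality. For the GBM step, using $\mathbb{E}\big[\exp(\sigma \sqrt{\Delta t}\, Z_t)\big] = \exp(0.5\sigma^2 \Delta t)$ for $Z_t \sim N(0,1)$:

$$
\mathbb{E}\big[S_{t+\Delta t} \mid S_t\big] = S_t \exp\big((\mu - 0.5\sigma^2)\Delta t\big)\exp\big(0.5\sigma^2 \Delta t\big) = S_t\, e^{\mu \Delta t}
$$

which equals $S_t$ when $\mu = 0$, the martingale property used in the simulation.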

---

## Results / Interpretation

- Scatter plots reveal periods where **high H1 total persistence** coincides with **larger pricing errors**.  
- This suggests that **complex correlation structures (loops) may signal higher risk** in martingale pricing.  
- MST and persistence histograms provide visual verification of the evolving market topology.  

---

## References

- Carlsson, G. (2009). _Topology and data_. Bulletin of the American Mathematical Society, 46(2), 255–308.  
- Chazal, F., et al. (2017). _GUDHI: Geometry Understanding in Higher Dimensions_. Journal of Machine Learning Research.  
- Adams, H., et al. (2017). _Persistence images: A stable vector representation of persistent homology_. Journal of Machine Learning Research.  
- `yfinance` (Python package for Yahoo Finance data): [https://pypi.org/project/yfinance/](https://pypi.org/project/yfinance/)
