# Volume-Weighted DiD (FAILED)

Idea: Weight observations by volume to give more importance to liquid trading days.

Theory: Low volume days = noisy prices, high volume = real price discovery

Spoiler: This doesn't work. Large IPOs dominate everything.

In [1]:
import pandas as pd
import numpy as np
from linearmodels.panel import PanelOLS
import warnings
warnings.filterwarnings('ignore')

df = pd.read_csv('../../data/processed/stock_prices_ipo_adjusted.csv',
                 parse_dates=['Date', 'IPO_Date'])

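# Post-lockup indicator: 1 once more than 180 days have passed since the IPO
df['Post_Lockup'] = (df['Days_Since_IPO'] > 180).astype(int)
# Keep only rows with a computed abnormal return
df_clean = df.dropna(subset=['Abnormal_Return']).copy()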

print(f"Loaded {len(df_clean):,} observations")

## Approach 1: Simple Volume Weights

Weight = Volume / Mean(Volume), with the mean taken per ticker and the result capped at 10
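
Written out, assuming the weights enter as a standard weighted least-squares objective. The notation ($V_{it}$ for daily volume, $\bar V_i$ for ticker $i$'s mean volume, $AR_{it}$ for the abnormal return, $\alpha_i$ and $\lambda_t$ for the ticker and date effects) is introduced here for illustration and is not from the data files:

$$
w_{it} = \min\!\left(\frac{V_{it}}{\bar V_i},\ 10\right),
\qquad
\bar V_i = \frac{1}{T_i}\sum_{t} V_{it}
$$

$$
\hat\beta = \arg\min_{\beta,\,\alpha,\,\lambda}\ \sum_{i,t} w_{it}\,\bigl(AR_{it} - \alpha_i - \lambda_t - \beta\,\mathrm{Post}_{it}\bigr)^2
$$

Before the cap is applied, $w_{it}$ averages to exactly 1 within each ticker, so this scheme reweights days within an IPO but not one IPO against another.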

In [2]:
# Normalize volume by ticker (different IPOs have very different volume scales)
df_clean['Volume_Normalized'] = df_clean.groupby('Ticker')['Volume'].transform(
    lambda x: x / x.mean()
)

# Cap at 10x to avoid extreme outliers
df_clean['Weight'] = df_clean['Volume_Normalized'].clip(upper=10)

print(f"Weight range: {df_clean['Weight'].min():.2f} to {df_clean['Weight'].max():.2f}")
print(f"Mean weight: {df_clean['Weight'].mean():.2f}")

In [3]:
# Try WLS: TWFE (ticker + date fixed effects) with volume weights, SEs clustered by ticker
df_panel = df_clean.set_index(['Ticker', 'Date'])

model_wls = PanelOLS(
    dependent=df_panel['Abnormal_Return'],
    exog=df_panel[['Post_Lockup']],
    weights=df_panel['Weight'],
    entity_effects=True,
    time_effects=True
).fit(cov_type='clustered', cluster_entity=True)

print(f"\nWLS Result: {model_wls.params['Post_Lockup']:.4f}%")
print(f"P-value: {model_wls.pvalues['Post_Lockup']:.4f}")

## Problem: Results are all over the place

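Ran this 5 times with different volume normalizations; the four variants recorded:
- Raw volume: +0.12% (p=0.67) - not significant
- Log volume: +0.89% (p=0.001) - way too big, roughly double the unweighted estimate
- Sqrt volume: +0.31% (p=0.12) - not significant at the 10% level
- Rank-based: -0.05% (p=0.85) - sign flips, not significant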

Unweighted gives +0.45% consistently.
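
For reference, a minimal sketch of how the four weight variants could be built. The notebook only names the schemes, so the exact transforms here (`log1p`, within-ticker percentile rank, rescaling to mean 1) are assumptions, `make_weights` is a hypothetical helper, and the toy data is invented:

```python
import numpy as np
import pandas as pd


def make_weights(volume: pd.Series, ticker: pd.Series, scheme: str) -> pd.Series:
    """Build positive observation weights from daily volume.

    The transforms are illustrative assumptions, not the notebook's exact ones.
    """
    if scheme == "raw":
        w = volume.astype(float)
    elif scheme == "log":
        w = np.log1p(volume)  # log(1 + volume) stays finite on zero-volume days
    elif scheme == "sqrt":
        w = np.sqrt(volume)
    elif scheme == "rank":
        # Percentile rank of each day's volume within its own ticker, in (0, 1]
        w = volume.groupby(ticker).rank(pct=True)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    # Rescale so the weights average to 1 across the whole sample
    return w / w.mean()


# Invented toy data: one heavily traded ticker, one thinly traded ticker
toy = pd.DataFrame({
    "Ticker": ["BIG"] * 3 + ["SMALL"] * 3,
    "Volume": [1_000_000, 2_000_000, 3_000_000, 10_000, 20_000, 30_000],
})

shares = {}
for scheme in ["raw", "log", "sqrt", "rank"]:
    w = make_weights(toy["Volume"], toy["Ticker"], scheme)
    # Fraction of the total regression weight that lands on the BIG ticker
    shares[scheme] = w[toy["Ticker"] == "BIG"].sum() / w.sum()
    print(f"{scheme:>4}: BIG share of total weight = {shares[scheme]:.2f}")
```

On this toy sample, every scheme that keeps cross-ticker volume differences (raw, sqrt, log) puts more than half of the total weight on the heavily traded ticker. Only the within-ticker rank splits it evenly.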

**Diagnosis**: Large IPOs (SNOW, COIN, ABNB) have 100x the volume of small ones. Even with normalization, they dominate. WLS is giving me a "large IPO effect", not a "lockup effect".
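
The mechanism can be seen in a noise-free toy panel. This is a simplified sketch, not the notebook's model: it uses ticker fixed effects only (no date effects), invented data, and a hypothetical helper `weighted_within_estimate`. When the true effect differs across tickers, WLS recovers a weight-averaged effect, so the heavily weighted ticker sets the answer:

```python
import numpy as np


def weighted_within_estimate(y, d, w, groups):
    """WLS slope of y on d with a separate intercept per group (entity effects only)."""
    y, d, w, groups = map(np.asarray, (y, d, w, groups))
    y_dm = np.empty_like(y, dtype=float)
    d_dm = np.empty_like(d, dtype=float)
    for g in np.unique(groups):
        m = groups == g
        # Subtracting the weighted group mean absorbs the group intercept under WLS
        y_dm[m] = y[m] - np.average(y[m], weights=w[m])
        d_dm[m] = d[m] - np.average(d[m], weights=w[m])
    return float(np.sum(w * d_dm * y_dm) / np.sum(w * d_dm ** 2))


# Two tickers, four days each; the last two days are "post-lockup"
groups = np.array(["BIG"] * 4 + ["SMALL"] * 4)
post = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)

# True effect is 1.0 for BIG and 0.0 for SMALL, with different intercepts and no noise
y = np.where(groups == "BIG", 5.0 + 1.0 * post, -2.0 + 0.0 * post)

# BIG trades 100x the volume of SMALL
volume = np.where(groups == "BIG", 100.0, 1.0)

beta_unweighted = weighted_within_estimate(y, post, np.ones_like(y), groups)
beta_volume = weighted_within_estimate(y, post, volume, groups)

print(f"Unweighted:      {beta_unweighted:.3f}")  # 0.500, the simple average of 1.0 and 0.0
print(f"Volume-weighted: {beta_volume:.3f}")      # 0.990, almost entirely BIG's effect
```

With date effects and staggered lockup dates the algebra is messier, but the same pull toward the heavily weighted tickers applies.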

In [None]:
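# Check which IPOs get the largest share of total weight.
# (Mean weight per ticker is ~1 by construction after per-ticker normalization,
# so compare each ticker's share of the summed weights instead.)
weight_share = (df_clean.groupby('Ticker')['Weight'].sum()
                / df_clean['Weight'].sum()).sort_values(ascending=False)
print("Top 10 IPOs by share of total weight:")
print(weight_share.head(10))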

# Yeah, it's all the mega-caps
# SNOW, COIN, ABNB, etc.
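
One way to put a number on "dominate" is the effective number of tickers, computed as the inverse Herfindahl index of the per-ticker weight shares. The sketch below uses invented toy weights; `ticker_weight_shares` and `effective_tickers` are hypothetical helpers, not part of the analysis above:

```python
import pandas as pd


def ticker_weight_shares(weights: pd.Series, ticker: pd.Series) -> pd.Series:
    """Each ticker's share of the total regression weight, largest first."""
    shares = weights.groupby(ticker).sum() / weights.sum()
    return shares.sort_values(ascending=False)


def effective_tickers(shares: pd.Series) -> float:
    """Inverse Herfindahl index: equals the ticker count when all shares are equal."""
    return 1.0 / float((shares ** 2).sum())


# Invented toy weights: one ticker carries 80% of the total weight
toy = pd.DataFrame({
    "Ticker": ["BIG", "BIG", "SMALL", "SMALL", "TINY", "TINY"],
    "Weight": [8.0, 8.0, 1.0, 1.0, 1.0, 1.0],
})

shares = ticker_weight_shares(toy["Weight"], toy["Ticker"])
n_eff = effective_tickers(shares)

print(shares)
print(f"Effective number of tickers: {n_eff:.2f} out of {shares.size}")
```

On the real data this would be called as `ticker_weight_shares(df_clean['Weight'], df_clean['Ticker'])`. An effective count far below the actual number of IPOs would back up the diagnosis.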

## Conclusion: Don't use volume weighting

Reasons:
1. Results are unstable - the estimate ranges from -0.05% to +0.89% depending on the normalization
2. Large IPOs dominate even after normalization
3. Low-volume days might actually be informative (insider trading?)
4. Unweighted is more conservative anyway

Sticking with unweighted TWFE + clustered SEs.

**Time wasted**: 3 hours

**Lesson learned**: Simple is better. Don't over-engineer.