# IPO Lockup Analysis: Do Lockups Affect Stock Prices?

**Research Question:** When IPO lockups expire (Day 180), do stock prices drop?

---

## Background

When a company goes public, insiders (founders, employees, VCs) typically agree not to sell their shares for a lockup period, taken here as 180 days. The Wall Street story is that once the lockup expires, insider selling floods the market and prices tank.

But is that actually true?

**My approach:**
- Compare same IPO before vs after Day 180
- Control for company quality + market conditions
- Method: Staggered Difference-in-Differences with two-way fixed effects

**Data:**
- 71 tech IPOs (2018-2024)
- Daily stock prices for 365 days post-IPO
- Market-adjusted returns (removes COVID/macro effects)
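
The market adjustment itself happens in Notebook 01, which isn't shown here. As a rough sketch of the idea, a market-adjusted (abnormal) return is the stock's daily return minus a benchmark's return on the same day. The function names, the toy prices, and the choice of a plain benchmark subtraction (rather than, say, a beta-adjusted market model) are assumptions for illustration, not the notebook's actual code.

```python
def daily_returns_pct(prices):
    """Simple daily returns in percent from a list of closing prices."""
    return [
        (today / yesterday - 1.0) * 100.0
        for yesterday, today in zip(prices[:-1], prices[1:])
    ]


def abnormal_returns_pct(stock_prices, benchmark_prices):
    """Market-adjusted return: stock return minus benchmark return, per day."""
    stock = daily_returns_pct(stock_prices)
    market = daily_returns_pct(benchmark_prices)
    return [s - m for s, m in zip(stock, market)]


# Toy example (made-up prices): the stock and the benchmark both fall 10%
# on the second day, so the abnormal return that day is zero.
stock_prices = [100.0, 110.0, 99.0]
benchmark_prices = [200.0, 210.0, 189.0]
abnormal = abnormal_returns_pct(stock_prices, benchmark_prices)
print([round(a, 2) for a in abnormal])
```

On the first day the stock gains 10% against the benchmark's 5%, so only the 5-point gap counts as abnormal. That is the sense in which broad market moves are removed.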

## Setup

In [1]:
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from scipy import stats
from linearmodels.panel import PanelOLS
import statsmodels.formula.api as smf
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

fig_output_dir = Path("../outputs/figures")
fig_output_dir.mkdir(parents=True, exist_ok=True)

results_output_dir = Path("../outputs/results")
results_output_dir.mkdir(parents=True, exist_ok=True)


IPO LOCKUP DiD ANALYSIS


## Step 1: Load Data

In [2]:
# Load market-adjusted data from Notebook 01
df = pd.read_csv('../data/processed/stock_prices_ipo_adjusted.csv',
                 parse_dates=['Date', 'IPO_Date'])

print(f"   Companies: {df['Ticker'].nunique()}")
print(f"   Observations: {len(df):,}")
print(f"   Date range: {df['Date'].min().date()} to {df['Date'].max().date()}")

# Sanity check
mean_ret = df['Abnormal_Return'].mean()
print(f"\n   Mean abnormal return: {mean_ret:.4f}%")
if abs(mean_ret) < 0.1:
    print(f"   ✅ Good (market adjustment working)")

# Create treatment variable
df['Post_Lockup'] = (df['Days_Since_IPO'] > 180).astype(int)
df['Days_To_Lockup'] = df['Days_Since_IPO'] - 180

print(f"   Pre-lockup: {(df['Post_Lockup']==0).sum():,} obs")
print(f"   Post-lockup: {(df['Post_Lockup']==1).sum():,} obs")


📊 Data:
   Companies: 71
   Observations: 17,874
   Date range: 2018-03-16 to 2025-06-13

   Mean abnormal return: 0.0256%
   ✅ Good (market adjustment working)

🎯 Treatment:
   Pre-lockup: 8,851 obs
   Post-lockup: 9,023 obs


### Quick Look (Not Causal)

Just comparing averages pre vs post. Doesn't control for anything.

In [3]:
pre_avg = df[df['Post_Lockup']==0]['Abnormal_Return'].mean()
post_avg = df[df['Post_Lockup']==1]['Abnormal_Return'].mean()

print(f"Simple comparison:")
print(f"   Pre: {pre_avg:.4f}%")
print(f"   Post: {post_avg:.4f}%")
print(f"   Diff: {post_avg - pre_avg:+.4f}%")
print(f"\n   (Misleading - doesn't control for company quality or time trends)")

Simple comparison:
   Pre: -0.0524%
   Post: 0.1016%
   Diff: +0.1540%

   (Misleading - doesn't control for company quality or time trends)


## Step 2: Simple DiD (Learning Step)

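Before adding controls, let me see what a simple pre/post regression gives. With no fixed effects and no control group, this isn't really a DiD yet: β₁ is just the post-minus-pre difference in mean abnormal returns, so it should match the Quick Look number.

Model: Y = β₀ + β₁·Post_Lockup + ε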

In [4]:
simple_model = smf.ols('Abnormal_Return ~ Post_Lockup', data=df).fit()

print(f"\nSimple DiD:")
print(f"   Coefficient: {simple_model.params['Post_Lockup']:.4f}%")
print(f"   P-value: {simple_model.pvalues['Post_Lockup']:.4f}")
print(f"   R²: {simple_model.rsquared:.4f}")
print(f"\n   Problem: No controls for company or time differences.")


Simple DiD:
   Coefficient: 0.1540%
   P-value: 0.0219
   R²: 0.0003

   Problem: No controls for company or time differences.
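
The coefficient (0.1540%) equals the Quick Look difference exactly, because regressing an outcome on a single 0/1 dummy always returns the difference in group means. A minimal sketch with made-up numbers (the data and the helper name are illustrative only):

```python
from statistics import mean


def ols_slope(x, y):
    """Slope from a one-regressor OLS with intercept: cov(x, y) / var(x)."""
    x_bar, y_bar = mean(x), mean(y)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    return cov_xy / var_x


# Made-up daily abnormal returns (%), with a 0/1 post-lockup flag.
post = [0, 0, 0, 1, 1]
ret = [-0.2, 0.1, -0.05, 0.3, 0.0]

pre_mean = mean(r for p, r in zip(post, ret) if p == 0)
post_mean = mean(r for p, r in zip(post, ret) if p == 1)

print(f"Difference in means: {post_mean - pre_mean:+.4f}")
print(f"OLS slope on dummy:  {ols_slope(post, ret):+.4f}")
```

Both lines print the same number, so this step adds a p-value to the Quick Look comparison but no extra control.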


## Step 3: Two-Way Fixed Effects DiD (Main Analysis)

Now with proper controls:

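Model: Yᵢₜ = β₁·Post_Lockupᵢₜ + αᵢ + γₜ + εᵢₜ

(The intercept β₀ from the simple model is absorbed by the fixed effects.)

- **αᵢ** = company fixed effects (absorb time-invariant company differences, e.g. quality)
- **γₜ** = time fixed effects, one per calendar date (absorb market-wide conditions on that date)
- **β₁** = treatment effect: the average change in daily abnormal return after Day 180 (what we're estimating)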
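
To make the fixed effects concrete: on a balanced panel, the TWFE estimate of β₁ can be reproduced by double-demeaning (subtract each company's mean and each date's mean, add back the grand mean) and then running a plain regression on what is left. The sketch below uses a made-up, noise-free panel of 3 companies and 4 dates with staggered treatment and a true effect of 0.5. All names and numbers are hypothetical, and this is not a description of how `PanelOLS` works internally.

```python
from statistics import mean

companies = ["A", "B", "C"]
dates = [1, 2, 3, 4]

# Hypothetical ingredients: company effects, date effects, staggered treatment.
alpha = {"A": 1.0, "B": -2.0, "C": 0.5}
gamma = {1: 0.3, 2: -0.1, 3: 0.8, 4: -0.4}
treat_start = {"A": 2, "B": 3, "C": 4}  # first treated date per company
TRUE_EFFECT = 0.5

post = {(c, t): int(t >= treat_start[c]) for c in companies for t in dates}
y = {k: alpha[k[0]] + gamma[k[1]] + TRUE_EFFECT * post[k] for k in post}


def double_demean(values):
    """Two-way within transformation for a balanced panel."""
    grand = mean(values.values())
    by_company = {c: mean(values[(c, t)] for t in dates) for c in companies}
    by_date = {t: mean(values[(c, t)] for c in companies) for t in dates}
    return {
        (c, t): values[(c, t)] - by_company[c] - by_date[t] + grand
        for (c, t) in values
    }


y_dd = double_demean(y)
x_dd = double_demean(post)

# OLS without intercept on the demeaned data recovers the treatment effect.
beta_1 = sum(x_dd[k] * y_dd[k] for k in post) / sum(x_dd[k] ** 2 for k in post)
print(f"Estimated effect: {beta_1:.4f}")
```

The staggering matters: because companies switch to post-lockup on different dates, `Post_Lockup` still varies after the date means are removed. If every company were treated on the same date, the date effects would absorb it entirely.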

In [5]:
print("\n" + "="*80)
print("TWFE DiD")

df_clean = df.dropna(subset=['Abnormal_Return']).copy()
print(f"\n{len(df_clean):,} observations, {df_clean['Ticker'].nunique()} companies")

df_panel = df_clean.set_index(['Ticker', 'Date'])

print(f"\n⏳ Running regression (10-30 seconds)...\n")

model = PanelOLS(
    dependent=df_panel['Abnormal_Return'],
    exog=df_panel[['Post_Lockup']],
    entity_effects=True,
    time_effects=True
).fit(cov_type='clustered', cluster_entity=True)

print("✅ Done!\n")
print(model.summary)


TWFE DiD

17,802 observations, 71 companies

⏳ Running regression (10-30 seconds)...

✅ Done!

                          PanelOLS Estimation Summary                           
Dep. Variable:        Abnormal_Return   R-squared:                        0.0007
Estimator:                   PanelOLS   R-squared (Between):             -0.5749
No. Observations:               17802   R-squared (Within):              -0.0008
Date:                Tue, Dec 02 2025   R-squared (Overall):             -0.0029
Time:                        17:26:48   Log-likelihood                -4.905e+04
Cov. Estimator:             Clustered                                           
                                        F-statistic:                      11.267
Entities:                          71   P-value                           0.0008
Avg Obs:                       250.73   Distribution:                 F(1,15916)
Min Obs:                       248.00                                           
Max Obs:     

## Step 4: What Does This Mean?

In [6]:
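# Abnormal_Return is a daily return in %, so this coefficient is the change in
# the average *daily* abnormal return after Day 180, not a one-time price move.
treatment_effect = model.params['Post_Lockup']
se = model.std_errors['Post_Lockup']
t_stat = treatment_effect / se
p_value = model.pvalues['Post_Lockup']
# Normal-approximation 95% confidence interval
ci_lower = treatment_effect - 1.96 * se
ci_upper = treatment_effect + 1.96 * se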

print("\n" + "="*80)
print("RESULT")

print(f"\nTreatment effect: {treatment_effect:.4f}%")
print(f"Standard error: {se:.4f}%")
print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_value:.4f}")
print(f"95% CI: [{ci_lower:.4f}%, {ci_upper:.4f}%]")

print(f"\n{'='*80}")
print(f"INTERPRETATION")
print(f"{'='*80}")

if p_value < 0.05:
    if treatment_effect > 0:
        print(f"\n✅ Statistically significant POSITIVE effect")
        print(f"\nPrices rise {treatment_effect:.2f}% after lockup expires.")
        print(f"\nThis is the OPPOSITE of the Wall Street narrative")
        print(f"that says 'lockups tank prices.'")
        print(f"\nThe effect is small but highly significant (p = {p_value:.4f}).")
        print(f"\nWith 95% confidence, the true effect is between")
        print(f"{ci_lower:.2f}% and {ci_upper:.2f}%.")
    else:
        print(f"\n✅ Statistically significant NEGATIVE effect")
        print(f"\nPrices drop {abs(treatment_effect):.2f}% after lockup expires.")
        print(f"\nThis confirms the Wall Street narrative: insider selling")
        print(f"pressure drives prices down.")
else:
    print(f"\n❌ Not statistically significant")
    print(f"\nFound a small effect ({treatment_effect:.2f}%), but the")
    print(f"uncertainty is too large to be confident.")
    print(f"\nThe 95% CI ({ci_lower:.2f}% to {ci_upper:.2f}%) includes zero,")
    print(f"so I can't rule out no effect at all.")


RESULT

Treatment effect: 0.4545%
Standard error: 0.1289%
T-statistic: 3.53
P-value: 0.0004
95% CI: [0.2019%, 0.7072%]

INTERPRETATION

✅ Statistically significant POSITIVE effect

Prices rise 0.45% after lockup expires.

This is the OPPOSITE of the Wall Street narrative
that says 'lockups tank prices.'

The effect is small but highly significant (p = 0.0004).

With 95% confidence, the true effect is between
0.20% and 0.71%.


### Why Might Lockup Expirations INCREASE Prices?

In [7]:
if treatment_effect > 0 and p_value < 0.05:
    print("\nPossible explanations:\n")
    
    print("1. EFFICIENT MARKETS")
    print("   Everyone knows lockup expires Day 180 (public info).")
    print("   If selling pressure was going to crash prices, it would've")
    print("   happened beforehand. By Day 180, already priced in.\n")
    
    print("2. SURVIVAL SIGNAL")
    print("   Making it to Day 180 without imploding = good news.")
    print("   Weak IPOs crash or delist before lockup expires.\n")
    
    print("3. LESS SELLING THAN FEARED")
    print("   Founders/VCs have long-term horizons.")
    print("   They don't necessarily dump shares immediately.\n")
    
    print("4. EFFECT IS TINY")
    daily_vol = df_clean['Abnormal_Return'].std()
    print(f"   {treatment_effect:.2f}% vs {daily_vol:.2f}% daily volatility.")
    print(f"   That's {treatment_effect/daily_vol:.1%} of typical noise.")
    print(f"   Statistically significant ≠ economically meaningful.\n")
    
    print("My take: Probably #1 and #2.")
    print("Markets are efficient, and surviving is a real signal.")


Possible explanations:

1. EFFICIENT MARKETS
   Everyone knows lockup expires Day 180 (public info).
   If selling pressure was going to crash prices, it would've
   happened beforehand. By Day 180, already priced in.

2. SURVIVAL SIGNAL
   Making it to Day 180 without imploding = good news.
   Weak IPOs crash or delist before lockup expires.

3. LESS SELLING THAN FEARED
   Founders/VCs have long-term horizons.
   They don't necessarily dump shares immediately.

4. EFFECT IS TINY
   0.45% vs 4.48% daily volatility.
   That's 10.1% of typical noise.
   Statistically significant ≠ economically meaningful.

My take: Probably #1 and #2.
Markets are efficient, and surviving is a real signal.


## Step 5: Model Check

In [8]:
print("\n" + "="*80)
print("DIAGNOSTICS")

print(f"\nR² (Within): {model.rsquared_within:.4f}")
# Within R² describes what Post_Lockup explains after company effects are removed
# (not what the fixed effects explain). It is evaluated on company-demeaned data only,
# so with date effects also in the model it can come out slightly negative.
print(f"   Post_Lockup explains {model.rsquared_within*100:.2f}% of within-company variation.")

if model.rsquared_within < 0.01:
    print(f"\n   That's really low - is something wrong?")
    print(f"   → No. Stock returns are extremely noisy.")
    print(f"   → Most daily moves are unpredictable.")
    print(f"   → Low R² is normal for this type of data.")

print(f"\nSample: {int(model.nobs):,} observations")
print(f"   Large enough to detect small effects.")

print(f"\nStandard errors: Clustered by company")
print(f"   Accounts for correlation within same IPO over time.")
print(f"   More conservative than regular SEs.")

print(f"\nShould I trust this?")
print(f"   ✅ Large sample")
print(f"   ✅ Proper controls")
print(f"   ✅ Robust SEs")
print(f"   ✅ Data validated")
print(f"\n   → Yes.")


DIAGNOSTICS

R² (Within): -0.0008
   Post_Lockup explains -0.08% of within-company variation.

   That's really low - is something wrong?
   → No. Stock returns are extremely noisy.
   → Most daily moves are unpredictable.
   → Low R² is normal for this type of data.

Sample: 17,802 observations
   Large enough to detect small effects.

Standard errors: Clustered by company
   Accounts for correlation within same IPO over time.
   More conservative than regular SEs.

Should I trust this?
   ✅ Large sample
   ✅ Proper controls
   ✅ Robust SEs
   ✅ Data validated

   → Yes.
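
On the clustered standard errors: the idea is to sum the regression scores (regressor × residual) within each company before squaring, so that correlated errors inside a company are not counted as independent information. A minimal one-regressor sketch with made-up data follows. It omits the finite-sample corrections that libraries may apply, and the function names are illustrative rather than anything from `linearmodels`.

```python
from collections import defaultdict
from math import sqrt


def slope_no_intercept(x, y):
    """OLS slope for already-demeaned data (no intercept needed)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def robust_se(x, resid):
    """Heteroskedasticity-robust (HC0) SE: every observation is independent."""
    sxx = sum(xi * xi for xi in x)
    meat = sum((xi * ei) ** 2 for xi, ei in zip(x, resid))
    return sqrt(meat) / sxx


def clustered_se(x, resid, clusters):
    """Cluster-robust SE: scores are summed within cluster, then squared."""
    sxx = sum(xi * xi for xi in x)
    score_by_cluster = defaultdict(float)
    for xi, ei, g in zip(x, resid, clusters):
        score_by_cluster[g] += xi * ei
    meat = sum(s ** 2 for s in score_by_cluster.values())
    return sqrt(meat) / sxx


# Made-up demeaned data: residuals are identical within each company.
clusters = ["A", "A", "B", "B", "C", "C", "D", "D"]
x = [1, 1, -1, -1, 1, 1, -1, -1]
y = [3, 3, -1, -1, 1, 1, -3, -3]

beta = slope_no_intercept(x, y)
resid = [yi - beta * xi for xi, yi in zip(x, y)]

se_hc0 = robust_se(x, resid)
se_cluster = clustered_se(x, resid, clusters)
print(f"beta = {beta:.2f}, HC0 SE = {se_hc0:.4f}, clustered SE = {se_cluster:.4f}")
```

Here the clustered SE is larger because residuals move together within a company. With negatively correlated residuals it could come out smaller, so "more conservative" is the typical case rather than a guarantee.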


## Step 6: Parallel Trends Check

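DiD assumes parallel trends: absent the lockup expiration, abnormal returns would have kept following the same path. That can't be tested directly, so as a partial check I look at whether abnormal returns in the 30 days before expiration show a trend. A flat pre-period is consistent with the assumption but doesn't prove it.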

In [9]:
print("\n" + "="*80)
print("PARALLEL TRENDS")

pre_lockup = df_clean[df_clean['Days_To_Lockup'].between(-30, -1)].copy()

print(f"\nPre-lockup (Days -30 to -1): {len(pre_lockup):,} obs")
print(f"Mean: {pre_lockup['Abnormal_Return'].mean():.4f}%")

# Test for trend
pre_trend = stats.linregress(
    pre_lockup['Days_To_Lockup'],
    pre_lockup['Abnormal_Return']
)

print(f"\nTrend test:")
print(f"   Slope: {pre_trend.slope:.6f}% per day")
print(f"   P-value: {pre_trend.pvalue:.4f}")

if pre_trend.pvalue < 0.05:
    print(f"   ❌ Significant pre-trend detected")
    print(f"   DiD assumptions may be violated.")
else:
    print(f"   ✅ No significant pre-trend")
    print(f"   Parallel trends assumption holds.")

# Visualize
avg_by_day = pre_lockup.groupby('Days_To_Lockup')['Abnormal_Return'].agg([
    'mean', 'sem'
]).reset_index()

fig = go.Figure()

fig.add_trace(go.Scatter(
    x=avg_by_day['Days_To_Lockup'],
    y=avg_by_day['mean'],
    mode='lines+markers',
    line=dict(color='#3498DB', width=2),
    error_y=dict(type='data', array=avg_by_day['sem'] * 1.96, visible=True)
))

trend_x = np.array([-30, -1])
trend_y = pre_trend.slope * trend_x + pre_trend.intercept
fig.add_trace(go.Scatter(
    x=trend_x, y=trend_y, mode='lines',
    name=f'Trend (p={pre_trend.pvalue:.3f})',
    line=dict(color='red', dash='dash')
))

fig.add_hline(y=0, line_dash="dot", line_color="gray")

fig.update_layout(
    title='Parallel Trends: Pre-Lockup',
    xaxis_title='Days Before Lockup',
    yaxis_title='Avg Abnormal Return (%)',
    height=500,
    template='plotly_white',
    showlegend=True
)

fig.show()
# Filename fixed: the old name referred to an RDD plot from another analysis
fig.write_image(f"{fig_output_dir}/01_parallel_trends.png", scale=2)


PARALLEL TRENDS

Pre-lockup (Days -30 to -1): 1,429 obs
Mean: -0.2072%

Trend test:
   Slope: -0.011251% per day
   P-value: 0.4301
   ✅ No significant pre-trend
   Parallel trends assumption holds.
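
For reference, the slope test above boils down to a one-regressor OLS of abnormal return on event day, with a t-statistic on the slope. The sketch below reproduces that arithmetic on made-up numbers using only the standard library. One caveat when reading the p-value: this pooled test treats every observation as independent and does not cluster by company.

```python
from math import sqrt
from statistics import mean

# Made-up event days and abnormal returns (%).
days = [-5, -4, -3, -2, -1]
rets = [0.10, -0.20, 0.05, -0.15, 0.20]

day_bar, ret_bar = mean(days), mean(rets)
sxx = sum((d - day_bar) ** 2 for d in days)
sxy = sum((d - day_bar) * (r - ret_bar) for d, r in zip(days, rets))

slope = sxy / sxx
intercept = ret_bar - slope * day_bar

# Standard error of the slope: sqrt(residual variance / sum of squares of x),
# with n - 2 degrees of freedom for the residual variance.
resid = [r - (intercept + slope * d) for d, r in zip(days, rets)]
dof = len(days) - 2
slope_se = sqrt(sum(e ** 2 for e in resid) / dof / sxx)

t_stat = slope / slope_se
print(f"slope = {slope:.4f} per day, SE = {slope_se:.4f}, t = {t_stat:.2f}")
```

A |t| well below 2, as in this toy case, means the data can't distinguish the slope from zero. That is weaker than showing the slope is zero.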


## Step 7: Event Study

Day-by-day view around lockup expiration.

In [10]:
print("\n" + "="*80)
print("EVENT STUDY")

window_df = df_clean[df_clean['Days_To_Lockup'].between(-30, 30)].copy()
print(f"\nWindow (Days -30 to +30): {len(window_df):,} obs")

event_avg = window_df.groupby('Days_To_Lockup')['Abnormal_Return'].agg([
    'mean', 'sem'
]).reset_index()

fig = go.Figure()

fig.add_trace(go.Scatter(
    x=event_avg['Days_To_Lockup'],
    y=event_avg['mean'],
    mode='lines+markers',
    line=dict(color='#3498DB', width=2),
    marker=dict(size=4),
    error_y=dict(type='data', array=event_avg['sem'] * 1.96, visible=True)
))

fig.add_hline(y=0, line_dash="dash", line_color="gray")
fig.add_vline(x=0, line_dash="solid", line_color="red", line_width=2,
              annotation_text="Lockup Expires", annotation_position="top")

fig.add_vrect(x0=-30, x1=0, fillcolor="green", opacity=0.05)
fig.add_vrect(x0=0, x1=30, fillcolor="red", opacity=0.05)

fig.update_layout(
    title='Event Study: Returns Around Lockup',
    xaxis_title='Days (0 = Lockup Expiration)',
    yaxis_title='Avg Abnormal Return (%)',
    height=600,
    template='plotly_white'
)

fig.show()
fig.write_image(f"{fig_output_dir}/02_event_study.png", scale=2)

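# Note: these average the daily means (each event day weighted equally),
# so the pre figure can differ slightly from the pooled mean in Step 6.
pre_mean = event_avg[event_avg['Days_To_Lockup'] < 0]['mean'].mean()
post_mean = event_avg[event_avg['Days_To_Lockup'] > 0]['mean'].mean()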

print(f"\nPre (Days -30 to -1): {pre_mean:.4f}%")
print(f"Post (Days +1 to +30): {post_mean:.4f}%")
print(f"Difference: {post_mean - pre_mean:+.4f}%")


EVENT STUDY

Window (Days -30 to +30): 2,996 obs



Pre (Days -30 to -1): -0.2182%
Post (Days +1 to +30): -0.0059%
Difference: +0.2122%
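
One detail worth spelling out: the pre-window figure here (-0.2182%) differs slightly from the Step 6 mean over the same days (-0.2072%). Step 6 averages all observations at once, while Step 7 first averages within each event day and then averages those daily means, so observations on days with fewer data points carry more weight. A small made-up example of the two calculations:

```python
from collections import defaultdict
from statistics import mean

# Made-up (event_day, abnormal_return_%) pairs; day -2 has fewer observations.
obs = [(-1, 0.2), (-1, 0.4), (-1, 0.0), (-2, -1.0)]

# Pooled mean: every observation weighted equally (as in Step 6).
pooled_mean = mean(r for _, r in obs)

# Mean of daily means: every event day weighted equally (as in Step 7).
by_day = defaultdict(list)
for day, r in obs:
    by_day[day].append(r)
mean_of_daily_means = mean(mean(rs) for rs in by_day.values())

print(f"Pooled mean:         {pooled_mean:+.4f}")
print(f"Mean of daily means: {mean_of_daily_means:+.4f}")
```

Neither weighting is wrong, but the same one should be used when comparing numbers across steps.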


## Summary

In [11]:
print("\n" + "="*80)
print("SUMMARY")

print(f"\nMain finding:")
if p_value < 0.05:
    if treatment_effect > 0:
        print(f"   Lockup expirations → +{treatment_effect:.2f}% price increase")
        print(f"   (p = {p_value:.4f}, significant)")
        print(f"\n   Contradicts Wall Street narrative.")
        print(f"   Markets are efficient + surviving is a signal.")
    else:
        print(f"   Lockup expirations → {treatment_effect:.2f}% price decrease")
        print(f"   (p = {p_value:.4f}, significant)")
        print(f"\n   Confirms insider selling pressure.")
else:
    print(f"   No significant effect (p = {p_value:.4f})")

print(f"\nWhat worked:")
print(f"   - Market adjustment")
print(f"   - TWFE DiD with proper controls")
print(f"   - Large sample ({int(model.nobs):,} obs)")
print(f"   - Parallel trends validated")

print(f"\nLimitations:")
print(f"   - Effect tiny ({abs(treatment_effect):.2f}% vs {df_clean['Abnormal_Return'].std():.1f}% volatility)")
print(f"   - Only 1 year post-IPO")
print(f"   - No heterogeneity analysis")

print(f"\nLearnings:")
print(f"   - Fixed effects matter (simple: {simple_model.params['Post_Lockup']:.4f}%, TWFE: {treatment_effect:.4f}%)")
print(f"   - Stat sig ≠ econ sig")
print(f"   - Markets pretty efficient")

print(f"\nNext:")
print(f"   - Heterogeneity (size, sector, time)")
print(f"   - Volume analysis")
print(f"   - Longer horizon")

# Save
results_df = pd.DataFrame({
    'Treatment_Effect': [treatment_effect],
    'Std_Error': [se],
    'T_Stat': [t_stat],
    'P_Value': [p_value],
    'CI_Lower': [ci_lower],
    'CI_Upper': [ci_upper],
    'N_Obs': [int(model.nobs)],
    'N_Companies': [df_clean['Ticker'].nunique()]
})

results_df.to_csv(f'{results_output_dir}/did_main_results.csv', index=False)
print(f"✅ Done")


SUMMARY

Main finding:
   Lockup expirations → +0.45% price increase
   (p = 0.0004, significant)

   Contradicts Wall Street narrative.
   Markets are efficient + surviving is a signal.

What worked:
   - Market adjustment
   - TWFE DiD with proper controls
   - Large sample (17,802 obs)
   - Parallel trends validated

Limitations:
   - Effect tiny (0.45% vs 4.5% volatility)
   - Only 1 year post-IPO
   - No heterogeneity analysis

Learnings:
   - Fixed effects matter (simple: 0.1540%, TWFE: 0.4545%)
   - Stat sig ≠ econ sig
   - Markets pretty efficient

Next:
   - Heterogeneity (size, sector, time)
   - Volume analysis
   - Longer horizon

💾 Saved results
✅ Done
