Headline estimate. U.S. common-stock IPOs from Jay Ritter's IPO database, 1975–2021, earn a mean 3-year buy-and-hold abnormal return (BHAR) of −19.7% relative to the CRSP value-weighted market (skewness-adjusted t = −7.2; bootstrap 95% CI [−23.2%, −16.1%]). The median IPO does far worse (−55.7%); 70.5% underperform the market. Equivalently, a dollar in the average IPO grows to $0.86 per dollar in the market over three years (wealth relative = 0.862). This is the "new-issues puzzle" (Ritter 1991; Loughran-Ritter 1995): going public is followed by significant long-run underperformance.
Full numbers: output/results_table.md.
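
For readers new to these measures, the headline quantities can be sketched as follows. This is a minimal illustration in plain Python, not the repository's code: the function names and the toy returns are hypothetical, and the wealth relative follows the usual definition (mean gross IPO return over mean gross benchmark return).

```python
from math import prod


def buy_and_hold(returns):
    """Compound a list of simple monthly returns into one holding-period return."""
    return prod(1.0 + r for r in returns) - 1.0


def bhar(firm_returns, market_returns):
    """Buy-and-hold abnormal return: firm BHR minus benchmark BHR."""
    return buy_and_hold(firm_returns) - buy_and_hold(market_returns)


def wealth_relative(firm_bhrs, market_bhrs):
    """Mean gross IPO return divided by mean gross benchmark return."""
    mean_firm = sum(1.0 + r for r in firm_bhrs) / len(firm_bhrs)
    mean_mkt = sum(1.0 + r for r in market_bhrs) / len(market_bhrs)
    return mean_firm / mean_mkt


# Toy data (made up): two IPOs, three event months each, same market path.
market = [0.01, 0.02, -0.01]
ipos = [[0.10, -0.20, -0.05], [0.00, 0.03, 0.02]]

firm_bhrs = [buy_and_hold(r) for r in ipos]
market_bhrs = [buy_and_hold(market)] * len(ipos)
mean_bhar = sum(bhar(r, market) for r in ipos) / len(ipos)
wr = wealth_relative(firm_bhrs, market_bhrs)
```

A wealth relative below 1 means the average IPO ended with less wealth than the benchmark, which is how the 0.862 figure above should be read.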
CAR cross-check. A cumulative-abnormal-return (CAR, additive) analysis tells the same
story: the mean 36-month CAAR is −17.8% vs the VW market (cross-sectional t = −14.9;
54.3% of IPOs negative). CAAR is small and positive for the first ~3 months (a brief
post-IPO drift) then declines steadily. It is slightly less negative than BHAR because
summing returns does not compound the typical IPO's losses (firm-level corr ≈ 0.61). See
output/car_table.md and the figure below.
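
The difference between the additive and compounded measures can be seen in a small sketch. The toy firm below is hypothetical and chosen only to show one mechanism by which CAR can come out less negative than BHAR: volatile returns that sum to zero still compound to a loss, while the benchmark compounds upward.

```python
from math import prod


def car(firm_returns, market_returns):
    """Cumulative abnormal return: sum of monthly firm-minus-market returns."""
    return sum(f - m for f, m in zip(firm_returns, market_returns))


def bhar(firm_returns, market_returns):
    """Buy-and-hold abnormal return: compounded firm minus compounded market."""
    return prod(1.0 + f for f in firm_returns) - prod(1.0 + m for m in market_returns)


# Toy firm (made up): alternates +20% / -20% for 12 months; market earns 1% a month.
firm = [0.20, -0.20] * 6
market = [0.01] * 12

toy_car = car(firm, market)    # additive: the firm's swings cancel out
toy_bhar = bhar(firm, market)  # compounded: each +20%/-20% pair loses 4%
```

Here the additive measure only reflects the market's 12 monthly 1% returns, while the compounded measure also captures the firm's volatility drag.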
- Builds the event sample from Ritter's `IPO-age.xlsx` (offer dates + CRSP permnos).
- Validates Ritter's permnos against CRSP and keeps U.S. common stock (`shrcd ∈ {10,11}`).
- Pulls CRSP monthly returns, delisting returns, and the market index (legacy SIZ tables).
- Computes delisting-adjusted, event-time buy-and-hold abnormal returns over months 1–36.
- Produces the BHAR results table and figure with skewness-adjusted and bootstrap inference.
- Runs a companion cumulative-abnormal-return (CAR/CAAR) analysis with its own table/figure.
- Validates the constructed measures against Ritter's own published series (`IPOALL.xlsx`): IPO counts and first-day initial returns (see "Validation against Ritter" below).
- Reconstructs the analysis from CRSP daily data, aggregated up to monthly, so the IPO month is included and a higher-frequency view is possible (see "Daily reconstruction" below).
- Recovers true first-day (initial) returns by bringing in IPO offer prices: from Ritter's 1975–84 `IPO2609.xls` (stage 8), Capital IQ (stage 9), and parsed SEC 424B prospectuses for 1996+ (stage 10). See "First-day returns via offer prices" below.
- Unifies those three sources into one per-IPO first-day-return series with provenance (stage 11) → `data/processed/firstday_returns.parquet`.

Stages 3b/4b/5b (daily) run alongside the monthly stages 3/4/5.
Method choices (BHAR vs CAR, benchmark, windows, delisting, matching) are each argued in
DECISIONS.md; a stage-by-stage narrative is in LOG.md; and a
sample-attrition table is written for every stage (output/attrition_*.csv).

    pip install -r requirements.txt   # or: make deps
    ./run.sh                          # or: make all

Outputs land in output/. Re-running is deterministic (bootstrap is seeded).
- Their own WRDS access. Create a `.pgpass` file (libpq format, one line: `host:port:dbname:user:password`, e.g. `wrds-pgdata.wharton.upenn.edu:9737:wrds:USER:PW`) and either place it at `./.pgpass` or set `PGPASSFILE=/path/to/.pgpass`. The `.pgpass` file is git-ignored and must never be committed.
- The Ritter file is auto-downloaded from https://site.warrington.ufl.edu/ritter/ipo-data/ (`IPO-age.xlsx`); if the site is unreachable, download it manually into `data/raw/`.
- Raw WRDS extracts and the Ritter file are git-ignored (`data/`); only code, docs, and the final figure/table are tracked.

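
To make the password-file convention concrete, here is a minimal sketch of how the file location and its single line could be read. The helper names are hypothetical and this is not the repository's loader; libpq normally reads the file itself, so the sketch is mainly useful for checking a credentials line before a run.

```python
import os
from pathlib import Path


def pgpass_path(default="./.pgpass"):
    """Return the password-file location: PGPASSFILE if set, else the repo-local default."""
    return Path(os.environ.get("PGPASSFILE", default))


def parse_pgpass_line(line):
    """Split one 'host:port:dbname:user:password' line into its five fields.

    Simplification: backslash-escaped colons, which libpq allows, are not handled.
    The port is kept as a string because libpq also accepts '*' as a wildcard.
    """
    host, port, dbname, user, password = line.strip().split(":", 4)
    return {"host": host, "port": port, "dbname": dbname, "user": user, "password": password}


# Placeholder credentials from the example line above, not real ones.
example = parse_pgpass_line("wrds-pgdata.wharton.upenn.edu:9737:wrds:USER:PW")
```

On Unix, libpq ignores a password file that group or others can read, so `chmod 600 .pgpass` is usually needed.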
Python 3.11; pandas 2.x, numpy, matplotlib, scipy, psycopg2, openpyxl (see
requirements.txt). CRSP monthly data on the WRDS instance used ran 1925-12 .. 2024-12,
which sets the 2021 sample cutoff (every cohort gets a full 36-month window).
Ritter's IPO-age file already contains a CRSP Perm (permno) for each IPO, so we match on
permno — never tickers (which are recycled). We validate every permno against
crsp.msenames and cross-check Ritter's CUSIP against CRSP ncusip. Match results:
| step | N |
|---|---|
| Ritter rows (IPO-age) | 16,030 |
| with a CRSP permno, 1975–2021, non-ADR, unique | 14,084 |
| permno present in CRSP | 14,080 |
| U.S. common stock (shrcd ∈ {10,11}) | 11,283 |
| after $5 penny screen + ≥1 aftermarket month | 10,101 |
CUSIP is sparse in Ritter (so it is not the key), but conditional on Ritter carrying a CUSIP, 6-char issuer agreement with CRSP is 95.6%, confirming the permnos are correct.
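
The 6-character issuer check can be sketched as below. This is an illustrative helper, not the repository's code; the function name and the sample pairs are made up, and the comparison simply uses the first six characters of each identifier.

```python
def issuer_agreement(pairs):
    """Share of (ritter_cusip, crsp_ncusip) pairs whose first 6 characters match.

    Pairs with a missing identifier are skipped, so the rate is conditional
    on Ritter carrying a CUSIP.
    """
    usable = [(r, c) for r, c in pairs if r and c]
    if not usable:
        return float("nan")
    hits = sum(r[:6].upper() == c[:6].upper() for r, c in usable)
    return hits / len(usable)


# Illustrative pairs: two agree on the issuer, one has no Ritter CUSIP, one disagrees.
pairs = [
    ("037833100", "03783310"),
    (None, "59491810"),
    ("594918104", "59491810"),
    ("123456789", "65432110"),
]
rate = issuer_agreement(pairs)
```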
Jay Ritter publishes his own constructed IPO measures in IPOALL.xlsx (monthly IPO counts
and average first-day returns). Stage 7 (src/s07_validate_ritter.py) checks ours
against his; full numbers in output/ritter_validation_table.md.
- Event universe — validated. Our final sample of 10,101 IPOs (1975–2021) is bracketed by Ritter's restrictive net count (9,199) and gross count (15,905), and the year-by-year counts track his almost one-for-one: corr = 0.96 vs the net count (0.96 vs gross). We picked up the right events.
- First-day initial returns — not recoverable from CRSP monthly data, by construction.
In the IPO month, 95.7% of our firms have a CRSP closing price but only 0.1% have a CRSP
monthly return: CRSP does not compute the first partial-month return (no prior
month-end base price, and the offer price is not used), so the offer→first-close "pop"
that is Ritter's first-day return is absent from `ret`. Our nearest measure, the first full aftermarket month (event month 1), averages 1.6% vs Ritter's 17.3% first-day average (annual corr 0.38). This is precisely the "CRSP issue" anticipated in the request, and it is exactly why the long-run analysis excludes the IPO month and starts at month 1. The underpricing pop sits outside our window, so it neither inflates nor contaminates the −19.7% three-year BHAR headline.
Because the monthly file misses the IPO month (above), stages 3b/4b/5b pull CRSP
daily returns, aggregate them up to monthly, and rebuild the analysis. Full numbers in
output/daily_results_table.md.
- Consistency check passes. Over the same window (months 1–36), the daily→monthly reconstruction gives mean BHAR −20.0% vs the monthly analysis's −19.7% — agreement to a fraction of a point confirms the daily pull and aggregation are sound.
- The IPO month is now included. Daily CRSP carries a return from the first trading day, so we can start the event clock at month 0. The fuller months 0–36 window gives mean BHAR −20.5% (the recovered IPO-month return averages +1.5%) — the new-issues puzzle is unchanged.
- Higher-frequency view. A daily event-time CAAR over the first ~3 months shows the early post-IPO drift at daily resolution (+2.6% by trading day 63).
- First-day underpricing still isn't recoverable, even from daily data. CRSP has no offer price, and the pop is mostly the overnight offer→open move; CRSP's first secondary-market return averages just 0.5% vs Ritter's 17.3% (annual corr 0.33). Matching Ritter's first-day return requires his offer-price-based data, not CRSP. This refines #1: the monthly file misses the IPO month entirely; daily recovers the month but not the first-day pop. Either way the underpricing is outside the long-run window and doesn't drive the headline.
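
The daily-to-monthly aggregation described above amounts to compounding daily returns within each calendar month. A minimal sketch, with a hypothetical function name and made-up returns (the repository's stage 3b/4b/5b code may differ):

```python
from collections import defaultdict
from datetime import date


def daily_to_monthly(daily):
    """Compound (date, return) pairs into {(year, month): monthly return}."""
    gross = defaultdict(lambda: 1.0)
    for day, ret in daily:
        gross[(day.year, day.month)] *= 1.0 + ret
    return {ym: g - 1.0 for ym, g in sorted(gross.items())}


# Toy firm (made up) that starts trading mid-March: the partial IPO month
# becomes event month 0, and April becomes event month 1.
daily = [
    (date(2020, 3, 18), 0.05),
    (date(2020, 3, 19), -0.02),
    (date(2020, 4, 1), 0.01),
]
monthly = daily_to_monthly(daily)
```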
Issues #1/#3 showed the one thing CRSP lacks is the offer price, so the offer→first-close underpricing pop is invisible to CRSP at any frequency. Stages 8–10 bring the offer price in.
Stage 8 — Ritter IPO2609.xls (1975–84). The only public Ritter file with per-firm offer
prices. Splicing its offer price onto the CRSP first-day close we already pull recovers a true
first-day return of 9.7% (median 2.7%) over 1,163 firms — matching Ritter's own
8.3% (corr 0.96) — versus the CRSP-only first-day return of 0.4% (corr with Ritter
just 0.12). Ritter's first aftermarket price equals our CRSP first close (corr 0.995), so the
splice is clean. Numbers: output/offer_price_table.md.
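
The splice itself is a one-line calculation. The sketch below is illustrative only (the function name is hypothetical); it takes the absolute value of the close because CRSP stores bid/ask-midpoint prices with a negative sign.

```python
def first_day_return(offer_price, first_close):
    """Offer-to-first-close return, or None if either price is unusable."""
    if offer_price is None or first_close is None:
        return None
    close = abs(first_close)  # CRSP flags bid/ask midpoints with a negative sign
    if offer_price <= 0 or close == 0:
        return None
    return close / offer_price - 1.0


# Made-up prices: offered at $10, first close at $11.
pop = first_day_return(10.0, 11.0)
```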
Stage 9 — Capital IQ (post-1984). Ritter posts no offer prices after 1984, so for the
modern era we use Capital IQ's ready-made first-day return (eqtypuboffr1dayretpct), linked to
our sample via permno→gvkey→CIQ companyid (≈2,200 firms, mostly 1996+). Its annual median
first-day return tracks Ritter's IPOALL series (corr 0.50); the field has a clean median
(~12%) but a junk tail, so we screen outliers and report the median and a winsorized mean,
never the raw mean. CRSP-only first-day returns for the same firms are ~0% (corr 0.01 with
CIQ) — the pop is still outside CRSP. Numbers:
output/capitaliq_table.md.
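
Why the median and a winsorized mean are reported instead of the raw mean can be seen in a small sketch. The 1%/99% cutoffs, the quantile rule, and the sample values are assumptions for illustration, not the repository's actual screen.

```python
from statistics import median


def winsorized_mean(values, lower=0.01, upper=0.99):
    """Mean after clamping values to the [lower, upper] empirical quantiles."""
    xs = sorted(values)
    n = len(xs)
    lo = xs[int(lower * (n - 1))]
    hi = xs[int(upper * (n - 1))]
    clamped = [min(max(x, lo), hi) for x in xs]
    return sum(clamped) / n


# Made-up first-day returns with one junk value (a 5,000% "return").
returns = [0.10, 0.12, 0.08, 0.15, 0.11, 50.0]

raw_mean = sum(returns) / len(returns)  # dominated by the junk value
robust_mean = winsorized_mean(returns)  # junk value clamped to the upper cutoff
mid = median(returns)                   # unaffected by the tail
```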
Stage 10 — SEC 424B prospectuses (1996+). The authoritative source: the final Rule
424(b) prospectus carries the offer price on its cover. We map each IPO permno→gvkey→CIK
(Compustat), pull the firm's filings from EDGAR, take the 424B nearest the offer date, and
parse the offer price off the cover (filings cached under data/interim/edgar/). We parsed an
offer price for 4,572 of 5,444 (84%) 1996–2021 IPOs with a CIK — 4,429 kept after a
plausibility screen (81% overall; 92% for 2010–2021). Splicing the offer price onto the
CRSP first-day close yields a true first-day return whose annual mean tracks Ritter's IPOALL
series with corr 0.80 — far better than Capital IQ (0.50) or CRSP-only (~0). This is the
cleanest path for 1996–present and the natural complement to Ritter's 1975–84 file. (The
permno→CIK link uses the CCM link nearest the offer date, not one required valid at it —
an IPO's Compustat link usually starts at its first fiscal year-end, so the strict version
dropped ~half the firms.) Numbers: output/edgar_table.md.
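
The "nearest the offer date" rule can be sketched as a minimum over absolute date gaps. The tuple layout, function name, and sample filings below are hypothetical; the same pattern would apply to picking the CCM link nearest the offer date.

```python
from datetime import date


def nearest_filing(filings, offer_date, form_prefix="424B"):
    """Pick the 424B-family filing whose filing date is closest to the offer date.

    `filings` is a list of (form_type, filing_date, label) tuples.
    Returns None when the firm has no 424B filing at all.
    """
    candidates = [f for f in filings if f[0].startswith(form_prefix)]
    if not candidates:
        return None
    return min(candidates, key=lambda f: abs((f[1] - offer_date).days))


# Made-up filing history for one firm.
filings = [
    ("S-1", date(2012, 2, 1), "registration"),
    ("424B4", date(2012, 5, 18), "final prospectus"),
    ("424B3", date(2013, 1, 10), "later supplement"),
]
chosen = nearest_filing(filings, offer_date=date(2012, 5, 17))
```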
Stage 11 — unified series. Stitches the three offer-price sources into one per-IPO
first-day return, best-source-per-firm (Ritter 1975–84 → SEC 424B 1996+ → Capital IQ fallback),
in data/processed/firstday_returns.parquet alongside the CRSP-only first-day
return and the provenance of each value. Coverage: 6,122 of 10,231 analysis IPOs (60%
overall, but 97% of 1975–84, 87% of 1996–2009, 97% of 2010–21 — the 60% is dragged down
only by the 1985–95 gap). Independent cross-check: SEC 424B and Capital IQ agree at corr 0.70
on 3,808 overlapping firms. Numbers:
output/unified_firstday_table.md.
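
The best-source-per-firm rule is a priority-ordered fallback that also records provenance. A minimal sketch, with hypothetical source labels and function name (the parquet file's actual column names are not shown here):

```python
# Priority order from the text: Ritter 1975-84, then SEC 424B, then Capital IQ.
SOURCE_PRIORITY = ("ritter_ipo2609", "sec_424b", "capital_iq")


def unify(first_day_by_source):
    """Return (value, source) from the highest-priority source that has a value."""
    for source in SOURCE_PRIORITY:
        value = first_day_by_source.get(source)
        if value is not None:
            return value, source
    return None, None


# Made-up firm covered by both SEC 424B and Capital IQ: the 424B value wins.
value, provenance = unify({"capital_iq": 0.12, "sec_424b": 0.15})
```

A firm with no offer-price source, as in the 1985–95 gap, comes back as `(None, None)`.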
Still missing: offer prices for 1985–1995 (before EDGAR's electronic coverage and after Ritter's IPO2609). The remaining options are licensed SDC New Issues (not on this WRDS instance) or hand-keyed sources.
- Benchmark = market, not a size/B-M matched firm. IPOs are small, high-growth firms; a style-matched benchmark (Loughran-Ritter's preferred control) typically shows more underperformance than the market. Using the equal-weighted market already shrinks the estimate to −9.6%, so the benchmark is the single biggest lever. A Compustat-based size×book-to-market control firm is the natural next step.
- Delisting-return handling. We incorporate `dlret` and impute −30% for performance delists missing it (Shumway 1997). This affects only 21 firms, and the realized-horizon variant gives nearly the same answer (−19.1%), so this is unlikely to flip the result, but it is the classic place survivorship bias hides.
- BHAR vs CAR and skewness. BHAR is right-skewed; the mean is sensitive to a few extreme winners (winsorizing pushes it to −25.1%, the median to −55.7%). We report skewness-adjusted and bootstrap inference, but a calendar-time-portfolio (Fama-French alpha) cross-check would be the most decisive robustness test.

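
The delisting convention described above can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: the function name is hypothetical, and the `is_performance_delist` flag stands in for whatever delisting-code rule the analysis actually applies.

```python
def month_return_with_delisting(ret, dlret, is_performance_delist, imputed=-0.30):
    """Combine the final monthly return with the delisting return.

    A performance delist with no recorded dlret gets the imputed value (-30%).
    """
    if dlret is None and is_performance_delist:
        dlret = imputed
    if dlret is None:
        return ret
    if ret is None:
        return dlret
    return (1.0 + ret) * (1.0 + dlret) - 1.0


# Made-up final month: +2% before a performance delisting with no dlret on file.
final_month = month_return_with_delisting(0.02, None, is_performance_delist=True)
```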
Across 10,101 U.S. common-stock IPOs from Ritter's database (1975–2021), the average firm
underperforms the CRSP value-weighted market by 19.7% over the three years after going
public (median −55.7%; wealth relative 0.86), a result that is statistically strong under
skewness-adjusted and bootstrap inference and robust to the delisting treatment, the return
horizon, winsorizing, and using cumulative rather than buy-and-hold abnormal returns. The
three judgment calls a reviewer should check first: (1) the benchmark — market-adjusted
rather than size/book-to-market matched, which is the choice most likely to move the number;
(2) the delisting-return convention — incorporating dlret and imputing −30% for
performance delists, the usual home of survivorship bias; and (3) the abnormal-return
construction — fixed 36-month BHAR reinvesting delisting proceeds in the market, with the
known right-skew of BHAR addressed via skewness-adjusted/bootstrap statistics.
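
The two inference tools named above can be sketched in a few lines. The t-statistic uses one common skewness-adjusted form (Johnson 1978, as applied to BHARs by Lyon, Barber and Tsai 1999); whether the repository uses exactly this form, and a percentile rather than a studentized bootstrap, is an assumption. The sample BHARs are made up.

```python
import random
from math import sqrt
from statistics import mean, stdev


def skewness_adjusted_t(xs):
    """t = sqrt(n) * (S + g*S^2/3 + g/(6n)), with S = mean/sd and g = sample skewness."""
    n = len(xs)
    m, s = mean(xs), stdev(xs)
    g = sum((x - m) ** 3 for x in xs) / (n * s ** 3)
    S = m / s
    return sqrt(n) * (S + g * S * S / 3.0 + g / (6.0 * n))


def bootstrap_ci(xs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean, seeded for reproducibility."""
    rng = random.Random(seed)
    n = len(xs)
    means = sorted(mean(rng.choices(xs, k=n)) for _ in range(n_boot))
    lo = means[int((alpha / 2) * (n_boot - 1))]
    hi = means[int((1 - alpha / 2) * (n_boot - 1))]
    return lo, hi


# Made-up right-skewed BHARs: many moderate losers and one extreme winner.
bhars = [-0.60, -0.55, -0.50, -0.40, -0.30, -0.20, 0.10, 2.50]
t_sa = skewness_adjusted_t(bhars)
ci = bootstrap_ci(bhars, seed=42)
```

With symmetric data the skewness term vanishes and the statistic reduces to the ordinary one-sample t.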







