
dackclup/quantrank


QuantRank

Open-source US equity stock ranking — fundamental, technical, factor, sentiment, and ML signals combined into a single 0–100 composite StockRank, refreshed every US trading day.

QuantRank is a static web app. A Python pipeline runs in GitHub Actions on a Mon-Fri cron (after US market close), computes scores for the S&P 500, and writes JSON files into the repo. A Next.js static site reads those JSON files at build time and is served from Vercel's free tier. No backend. No database. No live API calls from the browser.


⚠️ Disclaimer — please read

QuantRank is for educational and research purposes only.

  • Nothing here is investment advice, a recommendation, or an offer to buy or sell securities.
  • Scores and "fair prices" are model outputs derived from public data. They can be wrong, stale, or misleading.
  • Do not use these scores for real-money trading decisions.
  • Past performance does not predict future results.
  • The author is not a registered investment adviser.
  • This project does not connect to a brokerage and never will.

Honest limits of quantitative analysis — full breakdown in the Honest Limitations section below. The short version:

  • Quantitative fraud detection has irreducible false-positive (~30% in broad market) and false-negative (~25-40%) rates. Defense flags indicate elevated risk, not confirmed fraud.
  • Madoff-style fabrication (where revenue, cash, and counter-parties are all fictitious) is undetectable by any system based on filed financials.
  • Published anomalies typically decay 30-60% post-publication (McLean-Pontiff 2016).

If you're not comfortable losing 100% of any capital you might allocate based on quantitative models, do not use this app for investing.


Honest Limitations

QuantRank ships academic-quality defenses (Altman Z″, Sloan accruals, net-stock-issuance, going-concern phrase scan, Beneish M-score, Dechow F-score, tangible-book sanity guard). Despite this, several classes of manipulation and several structural realities remain outside what any filed-financials-based screener can address. v1.0 ships with the honest accounting below — readers should weight QuantRank's outputs accordingly.
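To show how mechanical these defenses are, here is a minimal sketch of the four-variable Altman Z″ model using its published coefficients; the function and parameter names are illustrative, not QuantRank's actual pipeline API:

```python
def altman_z_double_prime(working_capital, retained_earnings, ebit,
                          book_equity, total_assets, total_liabilities):
    """Four-variable Altman Z'' score. Z'' < 1.1 is the distress zone,
    1.1-2.6 is grey, > 2.6 is the safe zone."""
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = book_equity / total_liabilities
    return 6.56 * x1 + 3.26 * x2 + 6.72 * x3 + 1.05 * x4
```

Note the asymmetry the error-rate table below quantifies: a score under 1.10 flags elevated distress risk, it does not predict bankruptcy.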

Frauds we cannot catch

Pure financial-statement screeners can only detect manipulation that leaves a footprint in the filed numbers. Four manipulation classes leave no detectable footprint:

  1. Madoff-style total fabrication. When revenue, cash, customers, and bank confirmations are all fictitious, the screener has no anchor to a real economy to compare against. Detection requires forensic accounting + bank cross-confirmation outside SEC EDGAR's reach.
  2. Off-shore related-party round-trips. Wirecard's Asian "third-party acquirers", booked simultaneously as customers and suppliers, cancel out on the consolidated balance sheet. The ratios all behave normally: the cash never existed, but the offsetting fiction is symmetric.
  3. Audit-firm complicity. When the audit itself is fraudulent (Arthur Andersen / Enron pattern), the 10-K is a primary source for manipulated numbers — no quantitative cross-check inside the same document can break the loop.
  4. Post-acquisition baseline reset. Fraud disguised by an acquisition that resets the accounting baseline (Tyco-pattern roll-ups) — the year-over-year ratios all reset with the acquisition, so prior-period manipulation gets washed out of the comparison window.

Realistic error rates

Every defense layer ships with documented false-positive and false-negative rates from the academic literature. v1.0 uses these unmodified — no proprietary "tuning":

| Defense | False positive rate | False negative rate | Source |
| --- | --- | --- | --- |
| Beneish M-score (M > −2.22) | ~30% broad market, ~15-20% S&P 500 | ~25-40% | Beneish 1999, Beneish 2022 |
| Dechow F-score (F > 2.45) | ~27% broad market | ~27% (sensitivity 73%) | Dechow et al. 2011 |
| Going-concern phrase scan | ~1-3% (post-MD&A restriction) | ~10-15% | Mayew-Sethuraman-Venkatachalam 2015 |
| Altman Z″ < 1.10 (distress) | ~5-10% manufacturing | ~20% (FE-heavy sectors) | Altman 1968, Altman 2017 |
| Sloan accruals top-decile | ~10% (sector-confounded) | n/a (definitional) | Sloan 1996 |
| Net stock issuance top-decile | ~10% (definitional) | n/a (definitional) | Pontiff-Woodgate 2008 |

All defense flags are risk stratifiers, not fraud verdicts. A flagged stock is a "look harder" signal, not a "this is fraud" signal. Multiple flags compounding (e.g., Beneish + Dechow + going-concern simultaneously) is the actionable pattern.
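The "multiple flags compounding" rule can be sketched as a simple counter; the function name and tier labels below are illustrative, not QuantRank's actual output schema:

```python
def triage(flags: dict[str, bool]) -> str:
    """Map a set of boolean defense flags to a triage tier.

    A single flag is only a risk stratifier; several firing at once
    (e.g. Beneish + Dechow + going-concern) is the actionable pattern.
    """
    fired = sum(flags.values())
    if fired == 0:
        return "clear"
    if fired == 1:
        return "monitor"
    return "look harder"

example = {"beneish": True, "dechow": True, "going_concern": True,
           "altman": False, "sloan_accruals": False, "net_issuance": False}
```

With the error rates in the table above, a single Beneish hit in the broad market is wrong roughly a third of the time, which is why one flag alone maps only to "monitor".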

Anomaly decay reality

Academic factor research has a well-documented decay curve:

  • Out-of-sample drop: ~26% lower returns vs the original in-sample period (McLean-Pontiff 2016).
  • Post-publication drop: an additional ~32% lower (publication-informed trading erodes the edge).
  • Cumulative: returns roughly 58% below the original in-sample estimate, on average, by year 5 post-publication.

This applies to most of QuantRank's classical signals (value, momentum, quality, low-vol). The methodology page tracks expected vs realized IC where comparable; the rolling-IC decay monitor will land in Phase 4+.
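The McLean-Pontiff haircuts reduce to simple arithmetic on an expected premium; the 6%/yr in-sample figure below is a made-up number for illustration:

```python
def decayed_premium(in_sample, oos_haircut=0.26, post_pub_haircut=0.58):
    """Apply the average McLean-Pontiff (2016) haircuts to an in-sample
    anomaly premium. Returns (out-of-sample, post-publication) estimates."""
    return in_sample * (1 - oos_haircut), in_sample * (1 - post_pub_haircut)

oos, post_pub = decayed_premium(0.06)  # hypothetical 6%/yr in-sample premium
```

So a backtest showing 6%/yr should be mentally discounted to about 4.4%/yr out-of-sample and 2.5%/yr once the anomaly is published and traded on.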

Free-data fragility

QuantRank uses only free data sources, which trade cost for fragility:

  • yfinance is an unofficial scraper — multiple 2023-2024 incidents broke fundamental endpoints without warning. Daily-close prices have been the most reliable surface; pre-market and after-hours data are not.
  • SEC EDGAR XBRL has documented 2025 taxonomy drift — the CostOfGoodsAndServicesSold vs CostOfRevenue split, the IntangibleAssetsNetExcludingGoodwill vs OtherIntangibleAssetsNet variant — every Phase 3 PR added a fallback chain for one of these. Some filers report values only under tags the parser doesn't know about yet, and the data goes missing silently.
  • A cross-source validator (Phase 4) catches large discrepancies but not small systematic biases.
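The taxonomy drift above is what the fallback chains guard against; this sketch shows the pattern only, with hypothetical chain contents and no claim to match the real parser:

```python
import logging

FALLBACKS = {
    "cogs": ["CostOfGoodsAndServicesSold", "CostOfRevenue"],
    "intangibles_ex_goodwill": ["IntangibleAssetsNetExcludingGoodwill",
                                "OtherIntangibleAssetsNet"],
}

def resolve_tag(facts: dict, chain: list[str]):
    """Return (value, tag) for the first XBRL tag present in the filing,
    or (None, None) with a warning, so gaps never go missing silently."""
    for tag in chain:
        if facts.get(tag) is not None:
            return facts[tag], tag
    logging.warning("no known tag matched chain %s", chain)
    return None, None
```

The logging call is the important part: a filer reporting under an unknown tag should surface in the Actions logs rather than silently producing a null.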

Diminishing returns on stacking defenses

Beneish-Vorst 2021 measured marginal AAER capture across an ensemble of fraud signals:

  • Beneish + Dechow + Sloan + going-concern = ~4 signals
  • Adding a 5th signal (e.g., Bao-Ke 2020 ML) captures less than 5% additional AAERs beyond the first 4
  • Adding more produces proportionally more false positives without proportional true positives

QuantRank's v1.0 defense set is intentionally fixed at this "diminishing-returns inflection point." Future versions will rotate signals based on IC decay rather than stacking. Treat the v1.0 defense list as ceiling, not floor.

What QuantRank is — and is not

QuantRank is:

  • A risk-stratifier and screener built from public filings + free data
  • An educational research tool with transparent methodology
  • A pre-computed JSON pipeline tied to a git commit (every score is reproducible)

QuantRank is not:

  • A fraud guarantor — flags indicate elevated risk, not confirmed fraud
  • A backtested live-trading strategy — anomaly decay is real and unpredictable in direction
  • A registered investment adviser — the author is not, and this is not investment advice
  • A connection to any brokerage — and will not become one

See docs/RESEARCH_FINDINGS.md for the academic bibliography backing each defense layer.


Architecture

```mermaid
flowchart LR
    A[GitHub Actions cron<br/>Mon-Fri 22:00 UTC] -->|run daily| B[Python compute pipeline]
    B -->|fetch| C[(yfinance / SEC EDGAR<br/>FRED / Finnhub / Reddit)]
    B -->|write| D[JSON files in<br/>frontend/public/data/]
    D -->|git push| E[GitHub repo]
    E -->|webhook| F[Vercel build]
    F -->|next build, static export| G[Static HTML/CSS/JS on CDN]
    H[User browser] -->|fetch| G
```

Why this design (Option D — static site):

  • Free forever: public GitHub repo = unlimited Actions minutes; Vercel hobby tier = unlimited static hosting.
  • One system to debug: only the Python script + the GitHub Actions logs.
  • Fast for users: pre-computed JSON served via CDN — no DB queries, no rate limits.
  • Reproducible: every score is tied to a git commit.

This is not a FastAPI/Flask backend, not a database, and not a live-data system. See SKILL.md for the full architecture rules.
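The "write JSON into the repo" step is the entire persistence layer; a minimal sketch of it (file name and payload shape are assumptions, not the pipeline's real schema):

```python
import datetime
import json
from pathlib import Path

def write_rankings(scores: dict[str, float], out_dir="frontend/public/data"):
    """Persist composite StockRanks as a dated JSON file that the static
    site reads at build time -- no database, no API server."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    payload = {"as_of": datetime.date.today().isoformat(), "scores": scores}
    path = out / "rankings.json"
    path.write_text(json.dumps(payload, indent=2))
    return path
```

Because the file is committed by the Actions run, every published score maps back to one git commit, which is what makes the rankings reproducible.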


Tech stack

| Layer | Tool |
| --- | --- |
| Compute language | Python 3.11+ |
| Compute runtime | GitHub Actions (ubuntu-latest) |
| Frontend framework | Next.js 14 (App Router, static export) |
| Styling | Tailwind CSS |
| Charts | Recharts |
| Data storage | JSON files in frontend/public/data/ |
| Hosting | Vercel (frontend) + GitHub (data) |
| Free data sources | yfinance, edgartools, fredapi, finnhub-python, PRAW |
| ML | LightGBM + SHAP (Phase 5+) |

Setup

You don't need to run anything locally. The whole app builds in CI.

  1. Push this repo to GitHub as a public repository.
  2. Connect Vercel:
    • vercel.com → "Add New Project" → import the repo.
    • Framework preset: Next.js.
    • Root directory: frontend.
    • Build command: npm run build.
    • Output directory: out.
    • Production branch: main.
    • Click Deploy.
  3. Trigger first compute (after Phase 1 lands): GitHub → Actions → "Compute Rankings" → "Run workflow".
  4. Done. From now on, every US trading day (Mon-Fri) at 22:00 UTC the pipeline refreshes the JSON, commits it, and Vercel auto-deploys.

Required GitHub secrets — by phase

| Phase | Secret | Why |
| --- | --- | --- |
| 0 | none | Stub workflow only |
| 1 | none | yfinance + Wikipedia are unauthenticated |
| 2 | EDGAR_USER_AGENT | SEC requires "<Your Name> <email>" for EDGAR access |
| 4 | FINNHUB_API_KEY, REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, REDDIT_USER_AGENT | News + Reddit sentiment |
| 6 | FRED_API_KEY | Macro / regime detection |

Add secrets at: Repo → Settings → Secrets and variables → Actions → New repository secret.
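Inside the pipeline those secrets surface as ordinary environment variables; a minimal fail-fast guard (the helper name is illustrative):

```python
import os

def require_secret(name: str) -> str:
    """Fail fast with a pointer to the repo settings if a phase-gated
    secret (e.g. EDGAR_USER_AGENT) is missing from the Actions env."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it under Settings -> Secrets and "
            "variables -> Actions -> New repository secret")
    return value
```

Failing at startup keeps a missing key from producing a partially computed JSON file that would otherwise be committed and deployed.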


Project status

See PHASE_STATUS.md for the current build phase and acceptance checklist.

Methodology

See docs/METHODOLOGY.md for the user-facing methodology summary, and stock_ranking_knowledge.md for the full formula reference (~1600 lines covering fundamental, technical, factor, sentiment, ML, regime, and validation techniques).

Architecture rules: SKILL.md and docs/ARCHITECTURE.md. Phase-by-phase build plan: WORKFLOW.md.


License

MIT — see LICENSE.