MLB Intelligence

MLB betting intelligence system with ML-driven predictions, live execution on Kalshi, and transparent performance analytics.

Problem

Sports betting (especially MLB) has a few persistent problems:

Too much data, too little time — schedule context, pitcher changes, weather, team form, and live odds all move quickly.
Execution is the hard part — even good models fail if betting decisions aren’t consistent, timed correctly, and recorded.
Post-hoc analysis is usually missing — without clean history, you can’t answer “did this strategy work?” with confidence.

Action

MLB Intelligence turns the day-to-day work into a repeatable system:

Ingest schedules, odds, weather, pitcher stats, and team stats
Engineer features (80+ game-level features)
Predict win probabilities (logistic regression by default; optional LightGBM/XGBoost/MLP/ensemble)
Compare to the market to identify edge
Execute on Kalshi (live or dry-run)
Settle + audit results and ROI in a single database
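
The "compare to the market" step can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes a binary contract that pays 100 cents if the team wins, so the price in cents divided by 100 is read as the market's implied probability, and it ignores fees. The names `EdgeResult`, `compute_edge`, `should_bet`, and the 3-point `min_edge` threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeResult:
    model_prob: float   # model's win probability for the team
    market_prob: float  # probability implied by the contract price
    edge: float         # model_prob - market_prob


def compute_edge(model_prob: float, yes_price_cents: int) -> EdgeResult:
    """Compare a model probability with a binary-contract price.

    Assumption: the contract pays 100 cents on a win, so price / 100
    approximates the market-implied probability (fees ignored).
    """
    if not 0.0 <= model_prob <= 1.0:
        raise ValueError("model_prob must be in [0, 1]")
    if not 0 < yes_price_cents < 100:
        raise ValueError("yes_price_cents must be between 1 and 99")
    market_prob = yes_price_cents / 100.0
    return EdgeResult(model_prob, market_prob, model_prob - market_prob)


def should_bet(result: EdgeResult, min_edge: float = 0.03) -> bool:
    """Flag a bet only when the edge clears a (hypothetical) threshold."""
    return result.edge >= min_edge
```

For example, a model probability of 0.58 against a 52-cent price gives an edge of roughly 0.06, which clears the illustrative threshold; a model probability of 0.50 against the same price does not.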

Solution

MLB Intelligence is an end-to-end stack for MLB trading:

One orchestrator that runs the full workflow (run_pipeline.py)
A dashboard (FastAPI) for transparent performance analytics + operator controls
A Postgres-backed history for bets, balances, orders, and model artifacts
A public modeling snapshot (data/master_mlb.csv) that can be updated periodically (not continuously)

What’s inside

Dashboard — FastAPI app serving public analytics + private operator controls
Pipeline — runs ahead of scheduled first pitch; predicts and places bets in parallel
Model — calibrated classifiers + walk-forward validation; Kelly stake sizing
DB — PostgreSQL source of truth for games, bets, and run history
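
As a rough illustration of Kelly stake sizing for a binary contract, the sketch below uses the standard Kelly formula rewritten for a contract bought at `price` dollars that pays $1 on a win: f* = (p - price) / (1 - price). The fractional multiplier and the bankroll cap are assumptions added for the example, and `kelly_fraction` and `stake_dollars` are hypothetical names, not functions from this repository.

```python
def kelly_fraction(p: float, price: float, multiplier: float = 0.25) -> float:
    """Fraction of bankroll to stake on a contract paying $1 on a win.

    With net odds b = (1 - price) / price, the Kelly fraction
    (b*p - (1 - p)) / b simplifies to (p - price) / (1 - price).
    `multiplier` scales it down (fractional Kelly, an assumption here).
    """
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    full_kelly = (p - price) / (1.0 - price)
    # Never stake when the model sees no edge.
    return max(0.0, full_kelly) * multiplier


def stake_dollars(
    bankroll: float,
    p: float,
    price: float,
    multiplier: float = 0.25,
    max_fraction: float = 0.05,
) -> float:
    """Dollar stake, capped at a (hypothetical) maximum bankroll fraction."""
    fraction = min(kelly_fraction(p, price, multiplier), max_fraction)
    return round(bankroll * fraction, 2)
```

Scaling the full Kelly fraction down is a common way to reduce variance when the probability estimates themselves are uncertain; whether the project uses full or fractional Kelly is not stated here.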

Key components:

run_pipeline.py               # Main orchestrator
run_pipeline_v2.py            # Parallel V2 orchestrator (LightGBM; dry-run by default)
fetch/                        # Data ingestion: odds, weather, stats
models/model_v1/train.py      # Model training + walk-forward evaluation
models/model_v1/predict.py    # Inference: win probabilities + sizing
models/model_v2/              # V2 LightGBM predict + nightly deterministic eval (eval.py)
bet/place_bet.py              # Kalshi execution
settle_games.py               # Post-game settlement
dashboard/app.py              # Dashboard API
homelab.py                    # SSH helper for app/db LXCs
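
Walk-forward evaluation, mentioned for `train.py` above, can be sketched generically as an expanding-window split over chronologically ordered games: each fold trains on everything before a cutoff and tests on the block that follows, so no future game leaks into training. The function below is a hypothetical illustration of that idea and is not taken from the repository; `walk_forward_splits` and its parameters are assumed names.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_samples: int, n_folds: int, min_train: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) with an expanding train window.

    Rows are assumed to be sorted by game date, oldest first.
    """
    if n_folds < 1:
        raise ValueError("n_folds must be at least 1")
    if not 0 < min_train < n_samples:
        raise ValueError("min_train must be between 1 and n_samples - 1")
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * test_size
        # The last fold absorbs any remainder so every late row is tested.
        test_end = n_samples if k == n_folds - 1 else train_end + test_size
        yield range(0, train_end), range(train_end, test_end)


if __name__ == "__main__":
    for train_idx, test_idx in walk_forward_splits(10, n_folds=3, min_train=4):
        print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

Each fold's test block starts exactly where its training window ends, which mirrors how the model would be used in production: fit on past games, then score the next ones.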

Data model philosophy

Postgres is the source of truth.
Files under data/ are primarily caches/backups and may be stale.
data/master_mlb.csv is intended as a public snapshot of modeling data and can be refreshed on a controlled cadence.

Docs

Contributing / local development

See CONTRIBUTING.md for local setup, testing, and dev commands.

Deployment

Push to GitHub → GitHub Actions runner auto-deploys to the app LXC.

Direct LXC SSH debugging via homelab.py:

python3 homelab.py app "systemctl status mlb-dashboard --no-pager -l | tail -40"
python3 homelab.py app "journalctl -u mlb-dashboard -n 100 --no-pager"

Data sources

License

BUSL-1.1 (Business Source License 1.1)

See LICENSE
See LICENSE-FAQ.md
