Proteus

v0 Alpha -- Proof of Concept | BASE Sepolia Testnet

A prediction market protocol on BASE where users stake ETH on the exact text a public figure will post. Winners are determined by on-chain Levenshtein distance. This is a working prototype that demonstrates why text prediction markets are a fundamentally different -- and more durable -- market structure than binary yes/no contracts.

Status: Alpha prototype on BASE Sepolia testnet. Not audited. Not production-ready. Do not deploy to mainnet.

tl;dr

A research project, not intended for commercial use, exploring how AIs can roleplay as public figures and bet on the exact words those figures will post on X. Example: "I bet 1 ETH that @elonmusk will post 'Mars by 2030' on X between Mar 1 2026 and Apr 1 2026." If the real post is "Moon base by 2030", the Levenshtein distance between each competitor's prediction and the text actually posted in that window (or the __NULL__ sentinel if nothing was posted) is computed, and the prediction closest to the real post, character by character, wins.

The Thesis

Binary prediction markets encode exactly one bit of information per contract. The outcome space is {0, 1}, and as AI systems approach superhuman forecasting along an exponential capability curve, the edge any participant can capture in a binary market collapses toward zero because the correct probability becomes trivially computable.

Text prediction over an alphabet Σ with strings up to length n has a combinatorially large outcome space (on the order of |Σ|ⁿ strings), and Levenshtein distance is a proper metric on that space. Scoring is therefore not a binary cliff but a graded surface on which every character of precision counts. Information per market scales as O(n log|Σ|) bits versus O(1) for yes/no contracts. The score is also 1-Lipschitz in the prediction: a single-character edit changes the distance to the actual text by at most 1, so marginal improvements in language modeling translate into marginal improvements in score and therefore in expected payout.

As AI capabilities hit the steep part of the curve, binary markets become commoditized: everyone's model says 87% yes and the spread vanishes. The text prediction space remains strategically rich because the gap between a 99th and a 99.9th percentile language model can still correspond to several edit operations, each worth money. In this market structure, the approaching AI capability explosion doesn't destroy the game; it deepens it.

Property	Binary (Polymarket, Kalshi)	Text (Proteus)
Outcome space	{0, 1}	Σ^≤n (all strings up to length n)
Information density	O(1): one bit per market	O(n log|Σ|): scales with text length
Payoff function	Cliff: right or wrong	Graded: every character counts
AI resistance	Collapses: models converge on the same probability	Deepens: the edit gap between frontier models is monetizable
Scoring	None (binary)	Levenshtein distance (proper metric space)
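
To make the information-density row concrete, here is a rough sketch that counts the strings of length at most n over an alphabet and converts that count to bits. The 280-character limit and the 95-symbol printable-ASCII alphabet are illustrative assumptions, not parameters taken from the contracts, and the result is an upper bound rather than a measure of how much information a real post carries.

```python
import math


def outcome_space_bits(max_len: int, alphabet_size: int) -> float:
    """log2 of the number of strings with length 0..max_len over the alphabet."""
    # Exact count with Python big integers: sum of |alphabet|^k for k = 0..max_len.
    count = sum(alphabet_size ** k for k in range(max_len + 1))
    return math.log2(count)


# A binary market has two outcomes: exactly one bit.
binary_bits = math.log2(2)

# Assumed example: up to 280 characters of printable ASCII (95 symbols).
text_bits = outcome_space_bits(280, 95)

print(f"binary market: {binary_bits:.0f} bit")
print(f"text market:   {text_bits:.0f} bits (upper bound)")
```

The count is dominated by the longest strings, so the result stays within one bit of n * log2(|Σ|), which is the O(n log|Σ|) scaling quoted in the table.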

Coinbase/Kalshi launched binary prediction markets to all 50 US states in January 2026. They run off-chain through Kalshi's CFTC-regulated backend. Polymarket does ~$12B/month in binary yes/no volume on Polygon. Neither supports text prediction. Neither scores on a continuous distance metric. That's the gap this prototype explores.

X API Update (Feb 2026): X now offers pay-per-use API access for individual developers -- no subscriptions, no monthly caps, just credit-based billing. This is a significant unlock for oracle resolution: each oracle node can independently fetch posts by ID, verify authorship, and confirm timestamps via the X API v2. Previously, the $200/mo Basic tier (15K reads) or $5,000/mo Pro tier (1M reads) made multi-oracle verification cost-prohibitive. Pay-per-use makes it feasible for multiple independent oracles to verify the same post at minimal cost.
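
As a rough illustration of the per-oracle checks described above (authorship and timestamp), the sketch below validates one oracle's report against a market's handle and time window. The `OracleReport` shape, the function name, and the half-open window are hypothetical; no X API call is made here, and the real API's response fields are not assumed.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class OracleReport:
    """What one oracle claims to have fetched (hypothetical shape, not an X API type)."""
    post_id: str
    author_handle: str
    created_at: datetime
    text: str


def is_valid_report(report: OracleReport, expected_handle: str,
                    window_start: datetime, window_end: datetime) -> bool:
    """Accept a report only if the author matches and the post falls in the window."""
    # Handles are compared case-insensitively; a leading "@" is ignored.
    same_author = report.author_handle.lstrip("@").lower() == expected_handle.lstrip("@").lower()
    # Assumption: the window includes its start and excludes its end.
    in_window = window_start <= report.created_at < window_end
    return same_author and in_window


# Window from the tl;dr example: Mar 1 2026 to Apr 1 2026 (UTC assumed).
start = datetime(2026, 3, 1, tzinfo=timezone.utc)
end = datetime(2026, 4, 1, tzinfo=timezone.utc)
report = OracleReport(
    post_id="0",  # placeholder, not a real post ID
    author_handle="elonmusk",
    created_at=datetime(2026, 3, 15, 12, 0, tzinfo=timezone.utc),
    text="Moon base by 2030",
)
print(is_valid_report(report, "@elonmusk", start, end))
```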

Rendered Futures

Proteus validates Rendered Futures against reality. Winners are the best renderers — their predictions are candidates for graduation to the Clockchain as validated causal paths. Every resolved market strengthens the Bayesian prior.

flowchart LR
    V1["V1: Exact text\n(Levenshtein)"] --> V2["V2: Event descriptions\n(semantic distance)"]
    V2 --> V3["V3: Causal subgraphs\n(graph distance)"]
    V3 --> CC["Clockchain\n(validated paths)"]
    CC -. "strengthens\nBayesian prior" .-> V1

The distance metric evolves across three levels. The continuous-metric primitive — closest match wins on a gradient, not a cliff — generalizes at every level.

Proteus predictions will be expressible as TDF records (Phase 2), enabling direct interoperability with the Clockchain and the broader Timepoint suite.

flowchart TD
    P["Proteus\n(resolve markets)"] --> |"validated predictions"| CC["Clockchain"]
    CC --> |"grounding"| Pro["Pro\n(simulate futures)"]
    Pro --> |"rendered futures"| P
    CC --> |"stronger prior"| F["Flash\n(render past)"]
    F --> |"verified events"| CC

What This Is (and Isn't)

This is a v0 alpha. It was largely vibe-coded to validate the core idea: that on-chain Levenshtein distance creates a viable, AI-resistant prediction market primitive. The smart contracts work. The market lifecycle works. The math works. Everything else -- the Flask backend, the wallet auth, the admin dashboard -- is scaffolding around that proof of concept.

Do not deploy this to mainnet. There is no security audit, no multisig, and no production wallet integration. The embedded wallet service uses a PBKDF2 shim, and the resolution mechanism is centralized (a single EOA). These are known, accepted tradeoffs for a prototype. This project is not meant for commercial use of any kind.

What works (BASE Sepolia testnet)

Full market lifecycle: create -> submit predictions -> resolve -> claim payouts
On-chain Levenshtein distance for winner determination (PredictionMarketV2)
259+ passing tests (109 contract, 135 unit, 15 integration)
Genesis NFT (60/100 minted, finalized) with on-chain SVG art
JWT wallet auth (MetaMask) + email OTP (Coinbase Embedded Wallet shim)
Admin resolution dashboard, Redis caching, rate limiting, structured logging
CI/CD pipeline, Slither static analysis complete
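
The market lifecycle in the first item (create -> submit predictions -> resolve -> claim payouts) can be pictured as a small state machine. This is only a sketch: the state and action names below are hypothetical and are not taken from PredictionMarketV2.

```python
from enum import Enum


class MarketState(Enum):
    OPEN = "open"          # created, accepting predictions and stakes
    RESOLVED = "resolved"  # actual text submitted, winner determined
    PAID = "paid"          # winner has claimed the pool


# Allowed (state, action) -> next state, mirroring submit -> resolve -> claim.
TRANSITIONS = {
    (MarketState.OPEN, "submit"): MarketState.OPEN,
    (MarketState.OPEN, "resolve"): MarketState.RESOLVED,
    (MarketState.RESOLVED, "claim"): MarketState.PAID,
}


def step(state: MarketState, action: str) -> MarketState:
    """Apply one action, rejecting anything that is out of order."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action!r} while market is {state.value}") from None


state = MarketState.OPEN  # "create" puts a new market in the OPEN state
for action in ("submit", "submit", "resolve", "claim"):
    state = step(state, action)
print(state)
```

Keeping the transitions in one table makes out-of-order calls, such as claiming before resolution, fail loudly instead of silently.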

What's intentionally not done

External security audit
Real Coinbase CDP wallet integration (no credentials)
Multisig for contract owner key
Production RPC (Alchemy/QuickNode)
Production monitoring (Sentry)
Decentralized oracle resolution

How It Works

Market: "What will @elonmusk post?"
             |
             v
    Competitors submit predictions + stake ETH
    ┌────────────────────────────────────────────────────┐
    │ AI (Claude):  "Starship flight 2 confirmed for     │
    │               March. Humanity becomes               │
    │               multiplanetary or dies trying."       │
    │ Human fan:    "The future of humanity is Mars       │
    │               and beyond"                           │
    │ Random bot:   "a8j3kd9xmz pqlw7 MARS ufk2         │
    │               rocket lol"                           │
    └────────────────────────────────────────────────────┘
             |
             v
    Market ends. Oracle submits actual text:
    "Starship flight 2 is GO for March. Humanity
     becomes multiplanetary or we die trying."
             |
             v
    On-chain Levenshtein distance:
      AI (Claude) → 12 edits   ← WINNER
      Human fan   → 59 edits
      Random bot  → 72 edits
             |
             v
    AI (Claude) wins the pool (minus 7% platform fee)

The scoring is continuous, not binary. Every character of precision is rewarded. The closest match wins.
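
The scoring step in the flow above can be sketched off-chain as follows. This is Python rather than the Solidity used by PredictionMarketV2 (whose source is not shown here), using the standard dynamic-programming recurrence for Levenshtein distance. Tie-breaking is an assumption: this README does not specify it, so the sketch lets the earliest submission win ties.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                   # delete ch_a
                current[j - 1] + 1,                # insert ch_b
                previous[j - 1] + (ch_a != ch_b),  # substitute, or free match
            ))
        previous = current
    return previous[-1]


def pick_winner(actual: str, predictions: dict[str, str]) -> tuple[str, int]:
    """Return (submitter, distance) for the prediction closest to the actual text."""
    distances = {who: levenshtein(text, actual) for who, text in predictions.items()}
    # min() returns the first minimal key, so earlier submissions win ties (assumption).
    winner = min(distances, key=distances.get)
    return winner, distances[winner]


actual = "Starship flight 2 is GO for March. Humanity becomes multiplanetary or we die trying."
predictions = {
    "AI (Claude)": "Starship flight 2 confirmed for March. Humanity becomes multiplanetary or dies trying.",
    "Human fan": "The future of humanity is Mars and beyond",
    "Random bot": "a8j3kd9xmz pqlw7 MARS ufk2 rocket lol",
}
winner, distance = pick_winner(actual, predictions)
print(winner, distance)
```

The comparison is case-sensitive and character-level, so "GO" versus "go" would cost two edits. Whether the contract normalizes case or whitespace before scoring is not stated here.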

Worked Examples

Six scenarios showing the full spectrum of prediction quality in a Levenshtein-scored market. Each demonstrates a different strategic insight.

Example 1: AI Roleplay Wins (Elon Musk)

Market: What will @elonmusk post?

Actual text: Starship flight 2 is GO for March. Humanity becomes multiplanetary or we die trying.

Submitter	Prediction	Distance
AI Roleplay (Claude)	`Starship flight 2 confirmed for March. Humanity becomes multiplanetary or dies trying.`	12
Human fan	`The future of humanity is Mars and beyond`	59
AI (lazy prompt GPT)	`Elon will probably tweet about SpaceX rockets going to space soon`	66
Bot (entropy)	`a8j3kd9xmz pqlw7 MARS ufk2 rocket lol`	72

Winner: AI Roleplay (Claude) at distance 12.

Lesson: A well-prompted AI captures tone, structure, and vocabulary. The 47-edit gap between the AI roleplay and the human fan is monetizable -- that's the entire pool. The human got the theme right ("Mars") but theme doesn't pay; exact wording does.

Example 2: Human Insider Beats AI (Sam Altman)

Market: What will @sama post?

Actual text: we are now confident AGI is achievable with current techniques. announcement soon.

Submitter	Prediction	Distance
Ex-OpenAI engineer	`we are now confident AGI is achievable with current techniques. big announcement soon.`	4
AI Roleplay (GPT)	`we now believe AGI is achievable with current techniques. announcement coming soon.`	18
Human (cynical)	`Sam will say AGI is close again like he always does nothing new`	59

Winner: Ex-OpenAI engineer at distance 4.

Lesson: Insider information beats AI. Someone who heard rehearsed phrasing gets within 4 edits. The AI roleplay is good (distance 18) but the insider's edge -- knowing the exact phrase "we are now confident" -- is worth 14 edits of advantage. Information asymmetry is priced continuously.

Example 3: Insider Leaks Exact Wording (Zuckerberg)

Market: What will @zuck post?

Actual text: Introducing Meta Ray-Ban with live AI translation. 12 languages. The future is on your face.

Submitter	Prediction	Distance
Meta intern	`Introducing Meta Ray-Ban with live AI translation in 12 languages. The future is on your face.`	3
AI Roleplay	`Introducing Meta Ray-Ban AI glasses with real-time translation in 8 languages. The future is on your face.`	25
Human (guessing)	`zuck will announce glasses or something idk`	73
Spam bot	`BUY META NOW GLASSES MOONSHOT 1000X GUARANTEED`	83

Winner: Meta intern at distance 3.

Lesson: Product launches have rehearsed copy. Seeing a draft deck is worth a 22-edit advantage over the best AI. The AI gets the structure right ("Introducing Meta Ray-Ban... The future is on your face.") but misses the exact product wording and the number of languages. Insider access to marketing materials is worth money in this market.

Example 4: Null Submission Wins (Jensen Huang Stays Silent)

Market: What will @JensenHuang post?

Actual text: (nothing posted) -- resolved with __NULL__

Submitter	Prediction	Distance
Null trader	`__NULL__`	0
Human (guessing)	`Jensen will flex about Blackwell sales numbers`	46
AI Roleplay	`NVIDIA Blackwell Ultra is sampling ahead of schedule. The next era of computing starts now.`	90

Winner: Null trader at distance 0 (exact match).

Lesson: Binary markets can't express "person won't post." AI roleplay always generates text -- it can't predict silence. The __NULL__ sentinel lets traders bet on inaction, and distance 0 means they take the entire pool. This is a market primitive that doesn't exist in yes/no contracts.
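
A small sketch of how the sentinel could fit into resolution, assuming silence is represented as `None` before being mapped to `__NULL__` (the helper name is hypothetical). It also shows why the human guess scores exactly its own length: it shares no characters with the sentinel, so every character needs a substitution or a deletion.

```python
from typing import Optional

NULL_SENTINEL = "__NULL__"  # sentinel named in the example above


def resolution_text(post: Optional[str]) -> str:
    """Text the market resolves against; None models 'nothing was posted'."""
    return NULL_SENTINEL if post is None else post


def levenshtein(a: str, b: str) -> int:
    """Two-row dynamic-programming edit distance."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ch_a != ch_b)))
        previous = current
    return previous[-1]


actual = resolution_text(None)  # the account stayed silent
null_trader = levenshtein("__NULL__", actual)
human_guess = levenshtein("Jensen will flex about Blackwell sales numbers", actual)
print(null_trader, human_guess)
```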

Example 5: AI vs AI Race -- THE THESIS EXAMPLE (Satya Nadella)

Market: What will @satyanadella post?

Actual text: Copilot is now generating 46% of all new code at GitHub-connected enterprises. The AI transformation of software is just beginning.

Submitter	Prediction	Distance
Claude roleplay	`Copilot is now generating 45% of all new code at GitHub-connected enterprises. The AI transformation of software is just beginning.`	1
GPT roleplay	`Copilot is now generating 43% of all new code at GitHub-connected enterprises. The AI transformation of software has just begun.`	8
Human (vague)	`Microsoft AI is great and will change the world of coding forever`	101

Winner: Claude roleplay at distance 1 (single character: 5 → 6).

Lesson: This is the thesis example. Two frontier AI models, same public training corpus, same prompt. Claude gets within 1 edit. GPT gets within 8. The 7-edit gap between two frontier models is worth the entire pool. A binary market would split nothing -- both AIs "predicted correctly" in a yes/no framing. Levenshtein rewards marginal calibration. The game deepens as models improve.

Example 6: Bot Entropy Wastes Money (Tim Cook)

Market: What will @tim_cook post?

Actual text: Apple Intelligence is now available in 30 countries. Privacy and AI, together.

Submitter	Prediction	Distance
AI Roleplay	`Apple Intelligence is now available in 24 countries. We believe privacy and AI go hand in hand.`	28
Human (thematic)	`Tim will say something about privacy and AI like always`	53
Random bot	`x7g APPLE j2m PHONE kq9 BUY zw3 intelligence p5 cook`	65
Degenerate bot	`aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa`	73

Winner: AI Roleplay at distance 28.

Lesson: Levenshtein distance is a natural anti-bot mechanism. Random strings have expected distance ≈ max(len(a), len(b)). Bots can't get lucky -- in a character-level outcome space, there's no shortcut. Even the degenerate "aaaa..." bot that tries to game string length scores worse than a thematic human guess. The metric itself is the spam filter.
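
Two cheap checks make the anti-bot argument concrete. For any pair of strings, the distance is at least the difference in lengths and at most the longer length. For a string made of one repeated character that is no longer than the actual text, the distance is exactly the actual length minus the number of occurrences that can be matched. The helper names are hypothetical, and the bot string is assumed to be 40 characters long.

```python
def length_bounds(prediction: str, actual: str) -> tuple[int, int]:
    """(lower, upper) bounds on Levenshtein distance from lengths alone."""
    return abs(len(prediction) - len(actual)), max(len(prediction), len(actual))


def repeated_char_distance(ch: str, count: int, actual: str) -> int:
    """Exact distance from ch * count to actual, valid when count <= len(actual)."""
    if len(ch) != 1 or count > len(actual):
        raise ValueError("needs a single character repeated at most len(actual) times")
    # Each occurrence of ch in actual can be matched for free, up to `count` of them.
    # Every other character of actual costs one substitution or one insertion.
    matches = min(count, actual.count(ch))
    return len(actual) - matches


actual = "Apple Intelligence is now available in 30 countries. Privacy and AI, together."
degenerate = "a" * 40  # assumed length of the "aaaa..." bot submission

print(length_bounds(degenerate, actual))
print(repeated_char_distance("a", len(degenerate), actual))
```

Because the actual text contains only five lowercase "a" characters, padding with more of them cannot bring the bot closer than 73 edits, which is the score in the table.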

Deployed Contracts (BASE Sepolia)

Contract	Address	Status
PredictionMarketV2	`0x5174Da96BCA87c78591038DEe9DB1811288c9286`	Active
GenesisNFT	`0x1A5D4475881B93e876251303757E60E524286A24`	60/100 minted
PredictionMarket (V1)	`0x667121e8f22570F2c521454D93D6A87e44488d93`	Deprecated

Use PredictionMarketV2 for everything. V1 lacks a resolution mechanism.

Deployment

The backend auto-deploys from main.

Service	Purpose
Backend (gunicorn + Flask)	API, admin dashboard, marketing pages
Redis	Caching, Celery broker, auth stores
Postgres	Available but unused (chain-only mode)
Smart contracts (BASE Sepolia)	All market data on-chain

Local Development

# Install dependencies
npm install
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[test]"

# Run tests
make test-all             # Everything
make test-unit            # Python unit tests
make test-contracts       # Solidity tests (Hardhat)

# Start the app locally
redis-server &
python main.py            # http://localhost:5000

You'll need BASE Sepolia ETH from the faucet.

Project Structure

contracts/src/     # Solidity smart contracts (the core primitive)
contracts/test/    # Hardhat tests
services/          # Python backend services (prototype scaffolding)
routes/            # Flask API endpoints
scripts/           # Deployment, seeding, and utility scripts
static/js/         # Frontend JavaScript
templates/         # HTML templates (marketing, app, admin)
tests/             # Python tests (unit, integration, load)
docs/              # Documentation
railway.json       # Railway start command config

Architecture

Frontend (Web3.js, wallet connect)
    |  JWT Auth
Flask Backend (gunicorn, Railway)
    |  Web3.py          |  Redis
BASE Sepolia            Cache, Celery, Auth
(PredictionMarketV2,    (nonces, OTPs,
 GenesisNFT, + 12)      rate limiting)

All market data lives on-chain. Zero database. Redis is used only for caching RPC responses, auth nonces/OTPs, and rate limiting.

Fee Structure

7% platform fee on market volume, split:

Recipient	Share of fee	Share of volume
Genesis NFT Holders	20%	1.4%
Oracles	28.6%	2%
Market Creators	14.3%	1%
Node Operators	14.3%	1%
Builder Pool	28.6%	2%
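
A minimal sketch of the fee arithmetic in basis points, using integer floor division as is typical for on-chain math. The constant and function names are hypothetical, and the per-recipient figures are copied from the table exactly as listed. Those rows add up to 7.4% of volume rather than 7%, so one of them is presumably rounded or out of date; the contract source is the place to confirm the real split.

```python
PLATFORM_FEE_BPS = 700  # 7% of market volume, as stated above

# Per-recipient shares of volume in basis points, copied from the table as listed.
RECIPIENT_BPS = {
    "Genesis NFT Holders": 140,
    "Oracles": 200,
    "Market Creators": 100,
    "Node Operators": 100,
    "Builder Pool": 200,
}


def winner_payout(pool_wei: int) -> int:
    """Pool minus the platform fee, rounding the fee down."""
    return pool_wei - pool_wei * PLATFORM_FEE_BPS // 10_000


def recipient_amounts(pool_wei: int) -> dict[str, int]:
    """Amount per recipient if each listed share is applied to the pool."""
    return {name: pool_wei * bps // 10_000 for name, bps in RECIPIENT_BPS.items()}


pool = 10 ** 18  # 1 ETH expressed in wei
print(winner_payout(pool))
print(recipient_amounts(pool))
print(sum(RECIPIENT_BPS.values()), "bps listed vs", PLATFORM_FEE_BPS, "bps stated")
```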

Technology

Blockchain: BASE (Coinbase L2, OP Stack)
Contracts: Solidity 0.8.20, OpenZeppelin, Hardhat
Backend: Python 3.11+, Flask, gunicorn, Web3.py, Celery, Redis
Auth: JWT (MetaMask) + Firebase email OTP (Coinbase Embedded Wallet shim)
Infra: Railway (auto-deploy from GitHub)

Documentation

Architecture - System design and contract stack
Setup Guide - Development environment
API Reference - REST API endpoints
Contracts - Smart contract reference
Roadmap - What's done, what's next
Gap Analysis - Honest accounting of remaining work
Security Analysis - Slither static analysis results
Audit Preparation - Contract inventory for future audit
Whitepaper - Full research paper

What Would Make This Real

In rough priority order:

Validate demand -- Do people actually want to bet on exact tweet text? Ship the testnet demo and find out.
Security audit -- The ~1,025 lines of in-scope Solidity need a real audit before touching mainnet.
Decentralize resolution -- Replace single-EOA resolution with oracle consensus (commit-reveal). X API pay-per-use access now makes multi-oracle tweet verification economically viable.
Real wallet integration -- Replace PBKDF2 shim with Coinbase CDP Server Signer.
Multisig -- Gnosis Safe 2-of-3 for contract owner key.
Production RPC -- Alchemy/QuickNode for mainnet (public RPC is fine for Sepolia testnet).
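
The commit-reveal idea in the resolution item can be sketched as below. This is an illustration under stated assumptions, not the planned design: SHA-256 stands in for whatever hash the contracts would use, the quorum rule is a simple vote count, and all function names are hypothetical.

```python
import hashlib
import secrets
from collections import Counter
from typing import Optional


def commit(text: str, salt: bytes) -> str:
    """Commitment an oracle publishes first; hides the text until the reveal."""
    return hashlib.sha256(salt + text.encode("utf-8")).hexdigest()


def verify_reveal(commitment: str, text: str, salt: bytes) -> bool:
    """Check that a revealed (text, salt) pair matches the earlier commitment."""
    return secrets.compare_digest(commitment, commit(text, salt))


def consensus(revealed: list[str], quorum: int) -> Optional[str]:
    """Text reported by at least `quorum` oracles, or None if no text reaches it."""
    if not revealed:
        return None
    text, votes = Counter(revealed).most_common(1)[0]
    return text if votes >= quorum else None


# Three oracles commit, then reveal; one of them disagrees.
reports = ["Moon base by 2030", "Moon base by 2030", "Moon base by 2031"]
salts = [secrets.token_bytes(16) for _ in reports]
commitments = [commit(text, salt) for text, salt in zip(reports, salts)]

valid = [text for c, text, salt in zip(commitments, reports, salts)
         if verify_reveal(c, text, salt)]
print(consensus(valid, quorum=2))
```

Committing before revealing keeps an oracle from copying another oracle's answer, and the random salt keeps short or guessable posts from being brute-forced out of the commitment.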

Timepoint Suite

Open-source engines for temporal AI. Render the past. Simulate the future. Score the predictions. Accumulate the graph.

Service	Type	Repo	Role
Flash	Open Source	timepoint-flash	Reality Writer — renders grounded historical moments (Synthetic Time Travel)
Pro	Open Source	timepoint-pro	Rendering Engine — SNAG-powered simulation, TDF output, training data
Clockchain	Open Source	timepoint-clockchain	Temporal Causal Graph — Rendered Past + Rendered Future, growing 24/7
SNAG Bench	Open Source	timepoint-snag-bench	Quality Certifier — measures Causal Resolution across renderings
Proteus	Open Source	proteus	Settlement Layer — prediction markets that validate Rendered Futures
TDF	Open Source	timepoint-tdf	Data Format — JSON-LD interchange across all services
Web App	Private	timepoint-web-app	Browser client at app.timepointai.com
iPhone App	Private	timepoint-iphone-app	iOS client — Synthetic Time Travel on mobile
Billing	Private	timepoint-billing	Payment processing — Apple IAP + Stripe
Landing	Private	timepoint-landing	Marketing site at timepointai.com

The Timepoint Thesis — a forthcoming paper formalizing the Rendered Past / Rendered Future framework, the mathematics of Causal Resolution, the TDF specification, and the Proof of Causal Convergence protocol. Follow @seanmcdonaldxyz for updates.

License

MIT
