Submind — Cryptographically-Anchored AI Predictions

Submind cryptographically anchors AI predictions at mint time, so anyone can confirm a claim wasn't altered after the AI made it. Calibration is tracked separately via public Brier score.

Live: submind.us

What "verification" means here (precise framing): Submind's verification rail proves that a prediction's content, evidence, and timestamp have not been altered since the moment the AI made it. It does NOT prove the prediction is true. A receipt for "the moon is made of cheese" with high confidence would still come back VERIFIED: the verifier confirms that the AI said it, not that the AI was right. Calibration over time (tracked via the public Brier score on /scores) is the mechanism for measuring truth. See tools/README.md for the full integrity-vs-correctness contract.


What It Does

VERDICT takes a claim and produces a structured verdict with:

  • Evidence gathered from multiple independent search providers
  • Adversarial dual-search: supporting AND counter-evidence required for every claim
  • Probabilistic verdicts with confidence intervals derived from source quality variance
  • Signal convergence scoring across independent sources
  • A cryptographic receipt anchoring the verdict's content + evidence + timestamp at mint time (SHA-256 of the canonicalized payload — RFC 8785 JCS). Anyone can independently confirm the receipt wasn't altered after issuance via tools/verify-receipt.js.
  • Public Brier score accountability — every prediction's resolved outcome is tracked over time so calibration (the real measure of "is this AI any good?") is measurable.
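The Brier score mentioned in the last bullet is the mean squared difference between a forecast probability and the resolved outcome (1 for true, 0 for false). Lower is better, and a constant 0.5 forecast scores 0.25. The sketch below is a minimal illustration, not a copy of lib/brier.js; the `{ probability, outcome }` record shape is an assumption made for the example.

```javascript
// Mean squared error between forecast probabilities and binary outcomes.
// The record shape { probability, outcome } is hypothetical, not Submind's schema.
function brierScore(resolved) {
  if (resolved.length === 0) {
    throw new Error("need at least one resolved prediction");
  }
  const total = resolved.reduce((sum, { probability, outcome }) => {
    const actual = outcome ? 1 : 0;
    return sum + (probability - actual) ** 2;
  }, 0);
  return total / resolved.length;
}

const sample = [
  { probability: 0.9, outcome: true },
  { probability: 0.2, outcome: false },
  { probability: 0.6, outcome: false },
];
console.log(brierScore(sample).toFixed(4));
```

The confident miss (0.6 on an outcome that resolved false) contributes 0.36 of the 0.41 total squared error, which is why a single overconfident call can dominate a small track record.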

Quick Start

Verify mode (primary)

Submit a claim — returns a structured VERDICT with full audit trail:

curl -X POST https://submind.us/api/predict \
  -H "Content-Type: application/json" \
  -d '{"claim": "Inflation in the US will fall below 2% by end of 2025"}'

Response:

{
  "verdict": {
    "claim": "Will inflation in the US fall below 2% by end of 2025?",
    "probability": 0.38,
    "confidence_interval": [0.25, 0.52],
    "rating": "unlikely",
    "summary": "Deterministic synthesis over 4 sub-claims. 18 raw sources...",
    "audit_trail": {
      "synthesizer": "deterministic",
      "formula": "base = effective_count_for / (effective_count_for + effective_count_against); confidence_width = 0.4 / (1 + effective_sources * avg_independence * grounding_rate)",
      "claim_audits": [...],
      "weight_sum": 4.0,
      "overall_inputs": {
        "weighted_probability": 0.38,
        "weighted_lower": 0.25,
        "weighted_upper": 0.52
      }
    }
  },
  "evidence": {
    "sources_analyzed": 18,
    "effective_sources": 9.4,
    "for": [...],
    "against": [...]
  },
  "reasoning": {
    "sub_claims": [...],
    "convergence_score": 0.74
  },
  "meta": {
    "pipeline_version": "v2",
    "processing_time_ms": 4821,
    "model_used": "deterministic"
  }
}
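A client will usually want only the headline numbers from that response. The sketch below reads just the fields visible in the sample above; `summarizeVerdict` is a hypothetical helper for illustration and is not part of the repo.

```javascript
// Build a one-line summary from a verify-mode response body.
// Field names follow the sample response; summarizeVerdict itself is illustrative.
function summarizeVerdict(body) {
  const { probability, confidence_interval: [lower, upper], rating } = body.verdict;
  const pct = (x) => `${Math.round(x * 100)}%`;
  return `${rating}: ${pct(probability)} (${pct(lower)} to ${pct(upper)}), ` +
    `${body.evidence.sources_analyzed} sources analyzed`;
}

const example = {
  verdict: { probability: 0.38, confidence_interval: [0.25, 0.52], rating: "unlikely" },
  evidence: { sources_analyzed: 18 },
};
console.log(summarizeVerdict(example));
```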

Predict mode (legacy SSE)

Submit a question for probabilistic forecasting — returns a Server-Sent Events stream:

curl -X POST https://submind.us/api/predict \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{"query": "Will the Fed cut rates in Q3 2025?"}'
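Server-Sent Events arrive as text blocks separated by a blank line, with payload lines prefixed by `data:` and an optional `event:` name. The sketch below parses that generic wire format only. It assumes nothing about Submind's event names or payload shape, which this README does not specify, and it expects complete blocks (a real client would buffer partial chunks first).

```javascript
// Parse complete SSE text into { event, data } records.
// Follows generic text/event-stream framing; Submind's event names are not assumed.
function parseSse(text) {
  const records = [];
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message"; // default event type when no "event:" line is present
    const dataLines = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) dataLines.push(line.slice(5).trim());
    }
    // Blocks without any data line (comments, keep-alives) are skipped.
    if (dataLines.length > 0) records.push({ event, data: dataLines.join("\n") });
  }
  return records;
}
```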

Rating scale: false (<10%) · likely_false (10–30%) · uncertain (30–70%) · likely_true (70–90%) · true (>90%)
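The scale above maps directly onto a threshold function. This is a sketch under one stated assumption: the README does not say which bucket owns the exact boundary values (0.10, 0.30, 0.70, 0.90), so the choice below is arbitrary.

```javascript
// Map a probability in [0, 1] to the rating labels listed above.
// Which bucket owns exactly 0.10, 0.30, 0.70 and 0.90 is an assumption.
function ratingFor(probability) {
  if (Number.isNaN(probability) || probability < 0 || probability > 1) {
    throw new RangeError("probability must be between 0 and 1");
  }
  if (probability < 0.1) return "false";
  if (probability < 0.3) return "likely_false";
  if (probability <= 0.7) return "uncertain";
  if (probability <= 0.9) return "likely_true";
  return "true";
}
```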


Architecture

src/           → Next.js frontend (React 19 + Tailwind CSS 4)
lib/           → Floor layer: shared utilities, LLM routing, search, grounding
pipeline/      → Product layer: evidence → synthesis → formatting pipeline
api/           → Thin glue: Vercel serverless API routes

Pipeline

  1. Parse Query (pipeline/parseQuery.js) — Normalize claim, detect domain, generate search queries
  2. Decompose (pipeline/decompose.js) — Break into 3–6 independently verifiable sub-claims
  3. Fetch Evidence (pipeline/fetchEvidence.js) — Adversarial dual-search (FOR + AGAINST each claim), source independence scoring, URL grounding
  4. Synthesize (pipeline/synthesize.js) — LLM analysis over evidence with Zod-validated structured output (used in predict mode; verify mode uses the deterministic synthesizer instead)
  5. Format (pipeline/format.js) — Assemble verdict with provenance tags and convergence scores; persist to Supabase
  6. Extract Entities (pipeline/extractEntities.js) — Fire-and-forget NER for knowledge graph
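The stages above form a linear chain: each stage consumes the previous stage's output, and entity extraction is started without being awaited. A rough sketch of that control flow follows. The stage functions are stand-ins supplied by the caller; the real signatures of the modules under pipeline/ are not shown in this README and are not assumed here.

```javascript
// Run stages in order, threading each result into the next stage.
// Stage functions are hypothetical stand-ins for the modules under pipeline/.
async function runPipeline(claim, stages, fireAndForget) {
  let value = claim;
  for (const [name, stage] of stages) {
    try {
      value = await stage(value);
    } catch (err) {
      throw new Error(`stage "${name}" failed: ${err.message}`);
    }
  }
  // Fire-and-forget: start the side task, never await it, swallow its errors.
  Promise.resolve()
    .then(() => fireAndForget(value))
    .catch(() => {});
  return value;
}
```

Because the side task is never awaited, a failure in it cannot delay or break the response, at the cost of the caller never learning that it failed.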

Core Libraries

| Module | Purpose |
| --- | --- |
| lib/llm.js | Multi-provider LLM router with auto-failover and native structured output |
| lib/search.js | Cascading search (Tavily → Brave → Serper) with Supabase caching and source independence scoring |
| lib/grounding/sourceExistenceChecker.js | URL verification: HEAD/GET checks with concurrency limiting |
| lib/grounding/entityVerifier.js | Named entity verification via Yahoo Finance (tickers) and Wikidata (organizations) |
| lib/schemas.js | Zod schemas for structured LLM output validation |
| lib/scoreConvergence.js | Signal convergence scoring across independent sources |
| lib/provenanceTag.js | Source provenance tier classification |
| lib/brier.js | Brier score computation, decomposition, calibration, and benchmarking |
| lib/deduplicateSources.js | Source deduplication by URL and content hash |
| lib/supabase.js | Database client with search cache and prediction persistence |
| lib/ratelimit.js | IP-based rate limiting |

Tech Stack

| Layer | Technology |
| --- | --- |
| Frontend | Next.js 15, React 19, Tailwind CSS 4 |
| API | Vercel Serverless Functions (Node.js) |
| LLM (primary) | Gemini 2.0 Flash (native structured output) |
| LLM (self-hosted) | vLLM, Ollama, or any OpenAI-compatible server (optional, second priority) |
| LLM (failover) | Cerebras, OpenRouter, Anthropic |
| Search (primary) | Tavily API |
| Search (failover) | Brave Search, Serper.dev |
| Validation | Zod (JSON schema enforcement) |
| Database | Supabase (PostgreSQL) |
| Hosting | Vercel (Hobby tier, auto-deploys from main) |

API Endpoints

| Method | Route | Body / Params | Description |
| --- | --- | --- | --- |
| POST | /api/predict | { "claim": "..." } | Verify mode: deterministic pipeline, returns structured VERDICT with audit_trail, model_used: "deterministic" |
| POST | /api/predict | { "query": "..." } | Predict mode (legacy): LLM pipeline, returns SSE stream or JSON |
| GET | /api/predict?id=<uuid> | | Fetch a saved prediction by ID |
| GET | /api/predict?q=<text> | | Fuzzy-match a cached prediction by question text |
| PUT | /api/predict | { id, resolved: true, outcome: true/false } | Resolve a prediction and compute Brier score |
| GET | /api/scores | | Public track record (Brier scores, calibration, history) |
| GET | /api/dashboard | | Realtime dashboard metrics (JSON or SSE stream) |
| GET | /api/health | | System health check (includes provider status) |
| GET | /api/scores?action=brier | | Brier score details |

See docs/API.md for the full OpenAPI 3.0 specification with request/response schemas.

API Key Authentication

External developers can authenticate POST requests with an API key for higher rate limits. Anonymous (unauthenticated) access continues to work unchanged.

Authentication headers

Send the key via either header — both are equivalent:

Authorization: Bearer vrd_live_<32 hex chars>
X-API-Key: vrd_live_<32 hex chars>
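A client can check the key shape before sending it and then pick either header. The sketch below encodes the documented format (`vrd_live_` followed by 32 hex characters). Accepting upper-case hex digits is an assumption, and `authHeaders` is a hypothetical helper name.

```javascript
// Build request headers for an authenticated call.
// The pattern encodes the documented shape; accepting upper-case hex is an assumption.
const API_KEY_PATTERN = /^vrd_live_[0-9a-f]{32}$/i;

function authHeaders(apiKey, useBearer = true) {
  if (!API_KEY_PATTERN.test(apiKey)) {
    throw new Error("API key does not match vrd_live_<32 hex chars>");
  }
  const headers = { "Content-Type": "application/json" };
  if (useBearer) {
    headers.Authorization = `Bearer ${apiKey}`;
  } else {
    headers["X-API-Key"] = apiKey;
  }
  return headers;
}
```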

Rate limit tiers

| Tier | Requests / 60s | Notes |
| --- | --- | --- |
| Anonymous (IP) | 10 | Default for unauthenticated requests |
| free | 30 | Default API key tier |
| pro | 100 | |
| internal | unlimited | Internal services only |

Response headers when authenticated:

  • X-RateLimit-Limit — limit for the active tier
  • X-RateLimit-Remaining — requests remaining in the current window
  • X-RateLimit-Reset — Unix timestamp when the window resets
  • X-API-Key-Tier — active tier (free, pro, or internal)
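These headers are enough for a client to pace itself. The sketch below reads them from any object with a `get(name)` method (such as the `headers` of a fetch `Response`) and reports how long to wait once the window is exhausted. It assumes X-RateLimit-Reset is expressed in seconds, not milliseconds; `rateLimitStatus` is a hypothetical helper.

```javascript
// Read the documented rate-limit headers and report how long to wait.
// Accepts anything with a get(name) method, such as a fetch Response's headers.
function rateLimitStatus(headers, nowMs = Date.now()) {
  const limit = Number(headers.get("X-RateLimit-Limit"));
  const remaining = Number(headers.get("X-RateLimit-Remaining"));
  // Assumption: the reset value is a Unix timestamp in seconds.
  const resetSec = Number(headers.get("X-RateLimit-Reset"));
  const waitMs = remaining > 0 ? 0 : Math.max(0, resetSec * 1000 - nowMs);
  return { limit, remaining, tier: headers.get("X-API-Key-Tier"), waitMs };
}
```

How the unlimited internal tier reports its limit is not documented, so `limit` may come back as `NaN` for that tier.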

Example curl commands

Unauthenticated (anonymous):

curl -X POST https://submind.us/api/predict \
  -H "Content-Type: application/json" \
  -d '{"claim": "Inflation in the US will fall below 2% by end of 2025"}'

With API key (Authorization header):

curl -X POST https://submind.us/api/predict \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer vrd_live_your32hexkeyhere00000000000000" \
  -d '{"claim": "Inflation in the US will fall below 2% by end of 2025"}'

With API key (X-API-Key header):

curl -X POST https://submind.us/api/predict \
  -H "Content-Type: application/json" \
  -H "X-API-Key: vrd_live_your32hexkeyhere00000000000000" \
  -d '{"claim": "Inflation in the US will fall below 2% by end of 2025"}'

Error responses:

| Status | Code | Meaning |
| --- | --- | --- |
| 401 | INVALID_API_KEY | Key not found, disabled, or expired |
| 429 | RATE_LIMITED | Rate limit exceeded; check retryAfter field |
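A client should treat the two errors differently: a 401 will not succeed on retry, while a 429 will once the window resets. The sketch below makes that decision. It assumes `retryAfter` is a number of seconds in the JSON body, which the README does not state, and falls back to the 60-second window when the field is missing; `nextAction` is a hypothetical helper.

```javascript
// Decide whether and when to retry after an error response.
// Assumption: body.retryAfter is in seconds; fall back to the 60 s window otherwise.
function nextAction(status, body) {
  if (status === 401) {
    return { retry: false, reason: "INVALID_API_KEY" };
  }
  if (status === 429) {
    const seconds = Number(body && body.retryAfter);
    const delayMs = Number.isFinite(seconds) && seconds > 0 ? seconds * 1000 : 60000;
    return { retry: true, delayMs, reason: "RATE_LIMITED" };
  }
  return { retry: false, reason: `HTTP ${status}` };
}
```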

Deterministic Pipeline (Verify Mode)

VERDICT's verify mode uses pure math with no LLM judgment. Every number in the output is fully traceable back to the raw evidence:

claim → parse → decompose → fetchEvidence → deterministicSynthesizer → format
                                                        ↑
                                          No LLM. Auditable arithmetic only.

Per sub-claim formula:

base               = effective_count_for / (effective_count_for + effective_count_against)
quality_factor     = effective_sources × avg_independence_score × grounding_rate
confidence_width   = 0.4 / (1 + quality_factor)
lower              = max(0.01, base − confidence_width / 2)
upper              = min(0.99, base + confidence_width / 2)

Overall probability: weighted average of per-claim probabilities, with weights from the decomposition step.

The verdict.audit_trail in the verify response exposes every intermediate value — effective source counts, independence scores, grounding rates, confidence widths — for every sub-claim. Nothing is hidden.
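The arithmetic above is small enough to restate as code. The sketch below follows the per-sub-claim formula and the weighted average as written; it is not the repo's deterministicSynthesizer. The input field names are illustrative, and returning 0.5 when a sub-claim has no evidence on either side is an assumption, since the README does not define that case.

```javascript
// Per-sub-claim arithmetic, following the formulas above.
// Field names are illustrative; the 0.5 fallback for empty evidence is an assumption.
function scoreSubClaim({ countFor, countAgainst, effectiveSources, avgIndependence, groundingRate }) {
  const totalCount = countFor + countAgainst;
  const base = totalCount > 0 ? countFor / totalCount : 0.5;
  const qualityFactor = effectiveSources * avgIndependence * groundingRate;
  const confidenceWidth = 0.4 / (1 + qualityFactor);
  return {
    base,
    confidenceWidth,
    lower: Math.max(0.01, base - confidenceWidth / 2),
    upper: Math.min(0.99, base + confidenceWidth / 2),
  };
}

// Weighted average of per-claim probabilities; weights come from decomposition.
function overallProbability(scored) {
  const weightSum = scored.reduce((sum, s) => sum + s.weight, 0);
  return scored.reduce((sum, s) => sum + s.weight * s.base, 0) / weightSum;
}
```

As a worked case: 3 effective counts for and 1 against give a base of 0.75. With 3 effective sources, independence 1.0, and grounding rate 1.0, the quality factor is 3, the width is 0.4 / 4 = 0.1, and the interval is roughly 0.70 to 0.80. More independent, well-grounded sources shrink the interval but never move the base.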

Verifying receipt integrity (external)

Every prediction Submind issues is anchored to a SHA-256 content hash of its canonicalized payload (RFC 8785 JCS). Anyone — including parties who don't trust Submind — can independently confirm a receipt's integrity (i.e. that no byte of its content, evidence, or timestamp was altered after the AI issued it) by cloning this repo and running one command:

node tools/verify-receipt.js <verdict-id | url | path-to-receipt.json>

Exit codes: 0=VERIFIED, 1=TAMPERED, 2=UNREACHABLE, 3=WARN. Stub-mode receipts (formula_version: deterministic-synth-v1) get a full hash + replay check; live-mode receipts (formula_version: live-llm-v1) get a hash-anchor-only check with WARN because LLM output is stochastic by construction. See tools/README.md for the integrity-vs-correctness distinction and threat model. The verifier proves the AI said this; it does NOT prove the AI was right. Truth is measured separately, over time, via public Brier scores on /scores.
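The core of an integrity check is to canonicalize the payload, hash it, and compare against the anchored hash. The sketch below shows that idea with a simplified canonical form (sorted object keys, no insignificant whitespace) that approximates RFC 8785 for plain JSON data. It is not tools/verify-receipt.js, and which receipt fields are covered by the hash, and where the anchored hash is stored, are not specified here.

```javascript
const crypto = require("node:crypto");

// Simplified canonical JSON: object keys sorted, no insignificant whitespace.
// Approximates RFC 8785 for plain JSON data; it is not the repo's verifier.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const members = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalize(value[key])}`);
    return `{${members.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Hex SHA-256 of the canonical form, encoded as UTF-8.
function contentHash(payload) {
  return crypto.createHash("sha256").update(canonicalize(payload), "utf8").digest("hex");
}

// Recompute the hash and compare it with the anchored one.
function isUnaltered(payload, anchoredHash) {
  return contentHash(payload) === anchoredHash;
}
```

Canonicalizing first is what makes the hash stable: two payloads with the same content but different key order or whitespace produce the same digest, while changing any value produces a different one.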


Frontend

| Route | Description |
| --- | --- |
| / | Main verify UI: submit a claim or question, watch the pipeline stages live, view the full reasoning tree and source evidence |
| /scores | Public Brier score dashboard: benchmark comparisons, calibration curve, domain breakdown, rolling trend, and recent resolution feed (live SSE) |
| /p/[slug] | Shareable permalink for any saved prediction |
| /history | Browse all past predictions |

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| GEMINI_API_KEY | Yes* | Google AI Studio key for Gemini 2.0 Flash |
| TAVILY_API_KEY | Yes* | Tavily search API key |
| SUPABASE_URL | Yes | Supabase project URL |
| SUPABASE_SERVICE_ROLE | Yes | Supabase service role key |
| SELF_HOSTED_LLM_URL | No | Self-hosted LLM endpoint URL (vLLM/Ollama) |
| SELF_HOSTED_LLM_KEY | No | Auth key for self-hosted LLM (default: "not-needed") |
| SELF_HOSTED_LLM_MODEL | No | Model name for self-hosted LLM (default: "llama3") |
| CEREBRAS_API_KEY | No | Cerebras API key (LLM failover) |
| OPENROUTER_API_KEY | No | OpenRouter API key (LLM failover) |
| ANTHROPIC_API_KEY | No | Anthropic API key (LLM failover) |
| BRAVE_API_KEY | No | Brave Search API key (search failover) |
| SERPER_API_KEY | No | Serper.dev API key (search failover) |

* At least one LLM provider and one search provider must be configured.
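The "at least one of each" rule is easy to get wrong when deploying, so a startup check can help. The sketch below validates an environment object against the table. `checkEnv` is a hypothetical helper, and treating SELF_HOSTED_LLM_URL as satisfying the LLM requirement is an assumption.

```javascript
// Variable groups taken from the table above.
// Assumption: a self-hosted URL counts as a configured LLM provider.
const LLM_VARS = [
  "GEMINI_API_KEY", "SELF_HOSTED_LLM_URL", "CEREBRAS_API_KEY",
  "OPENROUTER_API_KEY", "ANTHROPIC_API_KEY",
];
const SEARCH_VARS = ["TAVILY_API_KEY", "BRAVE_API_KEY", "SERPER_API_KEY"];
const ALWAYS_REQUIRED = ["SUPABASE_URL", "SUPABASE_SERVICE_ROLE"];

// Return a list of problems; an empty list means the config looks complete.
function checkEnv(env) {
  const isSet = (name) => typeof env[name] === "string" && env[name].trim() !== "";
  const problems = ALWAYS_REQUIRED
    .filter((name) => !isSet(name))
    .map((name) => `${name} is required`);
  if (!LLM_VARS.some(isSet)) problems.push("configure at least one LLM provider");
  if (!SEARCH_VARS.some(isSet)) problems.push("configure at least one search provider");
  return problems;
}
```

Calling `checkEnv(process.env)` at startup and exiting on a non-empty result surfaces a missing key before the first request does.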

Run with your own model (self-hosted LLM)

VERDICT can route to any OpenAI-compatible inference server: vLLM, Ollama, LM Studio, or your own deployment. Set SELF_HOSTED_LLM_URL and the provider slots into the cascade at position 4, after groq, gemini, openrouter, and huggingface and before the paid cerebras, anthropic, and openai fallbacks. There is a single self-hosted slot; switch backends by pointing the URL (and SELF_HOSTED_LLM_MODEL) at a different server. Only one backend is active at a time.

# Ollama — http://localhost:11434
ollama pull llama3 && ollama serve
SELF_HOSTED_LLM_URL=http://localhost:11434/v1/chat/completions
SELF_HOSTED_LLM_MODEL=llama3
SELF_HOSTED_LLM_KEY=not-needed

# vLLM — http://localhost:8000
vllm serve meta-llama/Meta-Llama-3-8B-Instruct --port 8000
SELF_HOSTED_LLM_URL=http://localhost:8000/v1/chat/completions
SELF_HOSTED_LLM_MODEL=meta-llama/Meta-Llama-3-8B-Instruct

# LM Studio — http://localhost:1234 (default in LM Studio's local server tab)
SELF_HOSTED_LLM_URL=http://localhost:1234/v1/chat/completions
SELF_HOSTED_LLM_MODEL=<model name shown in LM Studio>

Provider definition lives in lib/llm.js as PROVIDERS[4] (name "self-hosted"). The slot only activates when SELF_HOSTED_LLM_URL is set; otherwise it's skipped entirely.
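The slot behaviour described here can be pictured as a filter over an ordered list. The sketch below models only what this section states: the provider order, the self-hosted slot at index 4, its activation by SELF_HOSTED_LLM_URL, and the documented defaults. It is not the code in lib/llm.js, and the activation rules of the other providers are deliberately left out.

```javascript
// Provider order as described in this section; the self-hosted slot sits at index 4.
const CASCADE = [
  "groq", "gemini", "openrouter", "huggingface",
  "self-hosted",
  "cerebras", "anthropic", "openai",
];

// Drop the self-hosted slot when no URL is configured.
// Other providers stay in unconditionally; their activation rules are not modeled.
function activeCascade(env) {
  const hasSelfHosted = Boolean(env.SELF_HOSTED_LLM_URL);
  return CASCADE.filter((name) => name !== "self-hosted" || hasSelfHosted);
}

// Settings for the self-hosted slot, using the documented defaults.
function selfHostedConfig(env) {
  if (!env.SELF_HOSTED_LLM_URL) return null;
  return {
    url: env.SELF_HOSTED_LLM_URL,
    model: env.SELF_HOSTED_LLM_MODEL || "llama3",
    key: env.SELF_HOSTED_LLM_KEY || "not-needed",
  };
}
```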

Deployment

  1. Fork this repo
  2. Import to Vercel
  3. Add environment variables (see table above)
  4. Deploy — auto-deploys on every push to main

Database Schema

See DB_SCHEMA.md for a full reference of all Supabase tables, columns, and indexes.


Built with adversarial evidence search, source independence scoring, and public Brier score accountability at the core.
