Operational infrastructure for running five ForeFlow agents on Foresight Arena.
Part of the research system described in "Coordination as an Architectural Layer for LLM-Based Multi-Agent Systems" (Nechepurenko & Shuvalov, 2026).
| Repo | Role |
|---|---|
| coordination-experiment | LLM harness + five coordination configurations |
| foreflow-agents | Five on-chain agent entry points |
| foreflow-agents-engine (this repo) | Registration, scheduling, healthcheck, deployment |
- register-all: interactive Twitter voucher flow that generates wallets for all five agents and registers them on Foresight Arena.
- healthcheck: wallet balances, on-chain registration status, last successful run.
- run-agent: invoked by cron via ops/run-agent.sh; spawns the agent subprocess, captures its JSONL stdout, and writes predictions + LLM traces to SQLite.
- post-daily-status: composes and posts a cumulative stats tweet per agent (cron: 18:00 UTC).
- post-resolution-status: detects newly resolved rounds and posts one tweet per round (cron: every 30 min).
- receive-events: reads JSONL agent events from stdin and writes them to the DB (useful for testing).
- dump-data: exports predictions, traces, and tweets to JSONL files for research.
- bootstrap-vps: guided one-shot setup from a blank VPS.
npm install
npm run build
node dist/cli.js --help

Or after npm link:

foreflow-engine --help

On-chain interaction uses the foresight-arena
SDK (v0.1.6+). The SDK provides: requestChallenge, verifyTweet, register, isRegistered,
getNonce, getAllScores — everything needed for registration and healthchecks.
The agent runtime (gaslessCommit, gaslessReveal, getActiveRounds, etc.) is consumed by
foreflow-agents, not this repo.
foreflow-engine register-all                    Register all 5 agents via Twitter voucher flow
  --dry-run                                     Simulate, no network calls
  --no-manual-fallback                          Skip agents without Twitter tokens
  --no-confirm-pause                            Skip confirmation + 3s tweet countdown
foreflow-engine register --agent <name>         Register a single agent (same flags)
foreflow-engine healthcheck                     Wallet balances + registration status
foreflow-engine run-agent <name>                Run one agent (called by cron)
  --mode discover|predict|all
  --live
foreflow-engine bootstrap-vps                   One-shot VPS setup
foreflow-engine twitter-auth <agent>            OAuth 2.0 PKCE: authorize a Twitter account
foreflow-engine test-tweet <agent>              Post a test tweet from an agent account
foreflow-engine twitter-status                  Show token and tweet status for all agents
foreflow-engine post-daily-status <agent>       Post cumulative stats tweet (18:00 UTC cron target)
  --dry-run                                     Compose and print; do not post
foreflow-engine post-resolution-status <agent>  Check for new round resolutions and post if any
  --dry-run                                     Print tweets that would be posted; do not post
foreflow-engine receive-events                  Read JSONL agent events from stdin → DB
  --agent <name>                                Agent name (default: foreflow-ensemble)
foreflow-engine dump-data <output-dir>          Export predictions/traces/tweets to JSONL
Defaults to Polygon Amoy testnet (safe for development). Switch to mainnet by
updating three env vars — see .env.example and docs/DEPLOYMENT.md.
| Network | CHAIN_ID | ARENA_ADDRESS |
|---|---|---|
| Amoy testnet | 80002 | 0x219937292A48266681ECf08d4c2D1B45b4517Fd2 |
| Polygon mainnet | 137 | 0xB81e4F6D37f036508F584B8e9Cc1dceA096D554d |
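The table above can be expressed as a small lookup keyed by CHAIN_ID. This is a minimal sketch, not the engine's actual configuration code: the `ArenaNetwork` type and the `resolveNetwork` helper are hypothetical names, and the fallback to Amoy when CHAIN_ID is unset is an assumption based on the stated testnet default.

```typescript
// Network parameters copied from the table above, keyed by CHAIN_ID.
interface ArenaNetwork {
  name: string;
  chainId: number;
  arenaAddress: string;
}

const NETWORKS: Record<number, ArenaNetwork> = {
  80002: {
    name: "Amoy testnet",
    chainId: 80002,
    arenaAddress: "0x219937292A48266681ECf08d4c2D1B45b4517Fd2",
  },
  137: {
    name: "Polygon mainnet",
    chainId: 137,
    arenaAddress: "0xB81e4F6D37f036508F584B8e9Cc1dceA096D554d",
  },
};

// Hypothetical helper: resolve the CHAIN_ID env value to a known network.
// Assumption: an unset or empty value falls back to Amoy (the safe default).
function resolveNetwork(chainIdEnv: string | undefined): ArenaNetwork {
  const chainId =
    chainIdEnv === undefined || chainIdEnv === "" ? 80002 : Number(chainIdEnv);
  const network = NETWORKS[chainId];
  if (network === undefined) {
    throw new Error(`Unsupported CHAIN_ID: ${chainIdEnv}`);
  }
  return network;
}
```

Failing fast on an unknown chain ID avoids pairing a mainnet key with a testnet contract address, or the reverse.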
See .env.example for the full list. Key variables:
| Variable | Description |
|---|---|
| FOREFLOW_<AGENT>_AGENT_KEY | Per-agent private key (written by register-all) |
| FOREFLOW_AGENTS_DIR | Path to built foreflow-agents repo |
| DRY_RUN=1 | Skip on-chain calls (safe default) |
| RPC_URL | Polygon JSON-RPC endpoint |
| CHAIN_ID | 80002 (Amoy) or 137 (mainnet) |
| TWITTER_CLIENT_ID | Twitter Developer App OAuth 2.0 client ID |
| TWITTER_CLIENT_SECRET | Twitter Developer App OAuth 2.0 client secret |
Engine state lives in ~/.foreflow-state/:
- foreflow.db: SQLite database (0600); stores predictions, LLM traces, Twitter tokens, tweets, and runtime state
- <agent-name>/registered.json: agentId, txHash, registration timestamp
- <agent-name>/last-discover.txt: timestamp of last successful discover run
Agent SDK state (reveal queue) lives in ~/.foreflow-state/<agent-name>/.foresight-arena/,
isolated per agent.
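The layout described above can be summarized as a path builder. This is an illustrative sketch only: `agentStatePaths` and `AgentStatePaths` are hypothetical names, and the home directory is passed in explicitly rather than read from the environment.

```typescript
// Paths under ~/.foreflow-state/ as described above.
interface AgentStatePaths {
  db: string;           // shared SQLite database
  registered: string;   // per-agent registration record
  lastDiscover: string; // per-agent timestamp of last discover run
  sdkState: string;     // per-agent SDK state (reveal queue)
}

// Hypothetical helper: build the state paths for one agent.
function agentStatePaths(homeDir: string, agentName: string): AgentStatePaths {
  const root = `${homeDir}/.foreflow-state`;
  const agentDir = `${root}/${agentName}`;
  return {
    db: `${root}/foreflow.db`,
    registered: `${agentDir}/registered.json`,
    lastDiscover: `${agentDir}/last-discover.txt`,
    sdkState: `${agentDir}/.foresight-arena/`,
  };
}
```

For example, `agentStatePaths("/home/ops", "foreflow-ensemble").sdkState` yields `/home/ops/.foreflow-state/foreflow-ensemble/.foresight-arena/`, so each agent's reveal queue stays in its own directory.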
| Job | Schedule | Action |
|---|---|---|
| discover | every 2h | drain reveal queue, post on-chain reveals |
| predict | every 5m | commit predictions when round is within LEAD_TIME_SECONDS (600s) |
| daily-status | 18:00 UTC | post cumulative stats tweet per agent |
| resolution | every 30m | check for newly resolved rounds, post one tweet per round |
See ops/crontab.example for the full crontab and ops/run-status.sh for the wrapper.
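The predict job's timing rule can be sketched as a simple window check. This assumes the window is measured back from the round's commit deadline, which the text above does not state explicitly; `withinLeadTime` and its parameter names are hypothetical.

```typescript
// Value documented above for LEAD_TIME_SECONDS.
const LEAD_TIME_SECONDS = 600;

// Hypothetical helper: true when the round is close enough to commit.
// Assumption: "within LEAD_TIME_SECONDS" means the commit deadline is
// still in the future and at most leadTimeSec seconds away.
function withinLeadTime(
  commitDeadlineSec: number,
  nowSec: number,
  leadTimeSec: number = LEAD_TIME_SECONDS,
): boolean {
  const remaining = commitDeadlineSec - nowSec;
  return remaining > 0 && remaining <= leadTimeSec;
}
```

Under that assumption, a 5-minute cron interval and a 600-second window give each round about two predict ticks inside the window, so one missed or failed run does not necessarily skip the round.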
foreflow-agents (subprocess)
    stdout (JSONL events)
    └─► run-agent → EventHandler → foreflow.db
                                     └─► predictions + traces

foreflow.db
    └─► post-daily-status      → Twitter (18:00 UTC)
    └─► post-resolution-status → Twitter (every 30m)
    └─► dump-data              → JSONL export (research)
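The first hop in the flow above, turning the agent's JSONL stdout into discrete events, might look like the following. This is a sketch under stated assumptions: the real event schema is documented in docs/DATA_COLLECTION.md and is not reproduced here, so events are treated as opaque JSON objects, and `parseJsonl`, `AgentEvent`, and `ParseResult` are hypothetical names rather than the engine's EventHandler API.

```typescript
// Events are treated as opaque JSON objects; the real schema is not assumed.
type AgentEvent = Record<string, unknown>;

interface ParseResult {
  events: AgentEvent[];
  skipped: number; // lines that were not valid JSON objects
}

// Hypothetical helper: split captured stdout into one event per line.
function parseJsonl(stdout: string): ParseResult {
  const events: AgentEvent[] = [];
  let skipped = 0;
  for (const rawLine of stdout.split("\n")) {
    const line = rawLine.trim();
    if (line === "") continue; // blank lines carry no event
    try {
      const value: unknown = JSON.parse(line);
      if (typeof value === "object" && value !== null && !Array.isArray(value)) {
        events.push(value as AgentEvent);
      } else {
        skipped += 1; // valid JSON, but not an object
      }
    } catch {
      skipped += 1; // malformed line, e.g. stray log output
    }
  }
  return { events, skipped };
}
```

Counting skipped lines instead of throwing keeps one stray log line from discarding an entire run's predictions.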
Status tweets are never published for a round until all predictions in that round
have been revealed on-chain and the round's reveal_deadline has passed. This is
enforced at the SQL query level in getRevealedRoundsForAgent. See
docs/STATUS_POSTS.md for details.
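The rule can be restated as an in-memory predicate. This is only an illustration of the condition, not the query inside getRevealedRoundsForAgent, which enforces it in SQL; the row shapes and the `isSafeToTweet` name are hypothetical, and treating a round with no predictions as not postable is an assumption.

```typescript
// Hypothetical row shapes; the real schema is in docs/DATA_COLLECTION.md.
interface PredictionRow {
  roundId: number;
  revealed: boolean;
}

interface RoundRow {
  roundId: number;
  revealDeadlineSec: number; // corresponds to reveal_deadline
}

// True only when every prediction in the round is revealed
// and the round's reveal deadline has passed.
function isSafeToTweet(
  round: RoundRow,
  predictions: PredictionRow[],
  nowSec: number,
): boolean {
  const inRound = predictions.filter((p) => p.roundId === round.roundId);
  // Assumption: a round with no predictions has nothing to post.
  const allRevealed = inRound.length > 0 && inRound.every((p) => p.revealed);
  return allRevealed && nowSec > round.revealDeadlineSec;
}
```

Both conditions are required: a fully revealed round is still withheld until the deadline passes, and a round past its deadline is withheld while any prediction remains unrevealed.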
foreflow-engine dump-data ~/foreflow-dataset/

Writes predictions.jsonl, traces.jsonl, tweets.jsonl, and manifest.json.
Private keys and Twitter OAuth tokens are never included. See
docs/DATA_COLLECTION.md for the full schema and query guide.
| Package | Version | Role |
|---|---|---|
| foresight-arena | ^0.1.6 | SDK: registration, healthcheck queries |
| viem | ^2.27.0 | Wallet generation, balance queries |
Each of the five agents posts updates to its dedicated Twitter account. Authentication uses OAuth 2.0 PKCE — agents authorize a shared Developer App once, then the engine posts on their behalf.
- Set TWITTER_CLIENT_ID and TWITTER_CLIENT_SECRET in .env (obtain from https://developer.twitter.com).
- Register http://localhost:8765/callback as an OAuth callback URL in the Twitter Developer Portal app settings.
- For each agent, run the OAuth flow:
  engine twitter-auth foreflow-ensemble
  The CLI will print a URL; open it in a browser, log in as the correct agent account, and approve. Repeat for each of the five agents.
- Verify status:
  engine twitter-status
engine test-tweet foreflow-ensemble
engine test-tweet foreflow-ensemble --text "Custom test tweet"
register-all and register automatically post the challenge tweet for any agent
whose Twitter account is authorized. Unauthorized agents fall back to a manual
URL-paste prompt. Pass --no-manual-fallback to skip unready agents (non-zero exit)
or --dry-run to preview without touching the network.
See REGISTRATION.md for the full flow and flag reference.
Internal callers use postFromAgent() from src/twitter/post.ts.
Access and refresh tokens are stored in the local SQLite database at
~/.foreflow-state/foreflow.db. The DB file has 0600 permissions. Tokens
auto-refresh when within 60 seconds of expiry.
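The refresh rule amounts to a comparison against a 60-second skew. A minimal sketch, assuming expiry is stored as a Unix timestamp in milliseconds; `needsRefresh` and `REFRESH_SKEW_MS` are hypothetical names, not identifiers from src/twitter/.

```typescript
// 60 seconds, as documented above.
const REFRESH_SKEW_MS = 60_000;

// Hypothetical helper: true when the access token is expired
// or will expire within the skew window.
function needsRefresh(
  expiresAtMs: number,
  nowMs: number,
  skewMs: number = REFRESH_SKEW_MS,
): boolean {
  return expiresAtMs - nowMs <= skewMs;
}
```

Refreshing slightly early means a token that is valid when a post starts is unlikely to expire before the request completes.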
- DEPLOYMENT.md — step-by-step from blank VPS
- REGISTRATION.md — Twitter voucher flow walkthrough
- TROUBLESHOOTING.md — common errors
- TWITTER.md — Twitter integration setup and troubleshooting
- DATA_COLLECTION.md — SQLite schema, JSONL event protocol, query guide, privacy
- STATUS_POSTS.md — daily/resolution tweet templates, reveal-leak rules, troubleshooting
If you use this software, please cite the accompanying paper. See CITATION.cff.
If you use this code, please cite the papers it implements:
@misc{nechepurenko2026arena,
title = {Foresight Arena: An On-Chain Benchmark for Evaluating AI Forecasting Agents},
author = {Nechepurenko, Maksym and Shuvalov, Pavel},
year = {2026},
url = {https://papers.ssrn.com/abstract=6674059},
note = {SSRN Working Paper 6674059}
}

Full preprint: https://foresightflow.org/publications/foresight-arena.
@misc{nechepurenko2026coordination,
title = {Coordination as an Architectural Layer for LLM-Based Multi-Agent Systems: An Information-Controlled Empirical Study on Prediction Markets},
author = {Nechepurenko, Maksym and Shuvalov, Pavel},
year = {2026},
url = {https://papers.ssrn.com/abstract=6687518},
note = {SSRN Working Paper 6687518}
}

Full preprint: https://foresightflow.org/publications/coordination-architectural-layer.