A monitoring bot that tracks Chinese regulatory sources for CATL lithium mine license updates and automatically posts analysis to Manifold Markets.
Oreacle-bot v1.0 monitors the exact sources cited in the "CATL receives license renewal for Yichun Lithium mine by x" prediction market:
- CNINFO - Official CATL (300750.SZ) company disclosures
- SZSE - Shenzhen Stock Exchange notices
- Jiangxi Natural Resources - Provincial mining rights announcements
The bot uses GPT-4o-mini to analyze Chinese regulatory documents as each 15-minute polling cycle picks them up, extracting exact quotes, translating them, and assessing confidence levels. It then posts bilingual analysis to Manifold Markets, with optional conservative trading.
1. Install dependencies

       pip install -r requirements.txt

2. Set environment variables

       export MANIFOLD_API_KEY="your_manifold_api_key"
       export MARKET_SLUG="MikhailTal/catl-receives-license-renewal-for-y"
       export OPENAI_API_KEY="your_openai_api_key"  # For LLM analysis

3. Run the monitor
You can run the bot either locally or in GitHub Actions (or both simultaneously):
Local (Continuous):
source .env && python src/monitor.py # Runs forever with 15-minute cycles until stopped
GitHub Actions (Cloud Automation):
# Set up repository secrets, then push to GitHub
# Automatically runs every 15 minutes in the cloud
# See GITHUB_ACTIONS_SETUP.md for complete setup guide
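As a rough illustration of the local continuous mode, the sketch below runs one cycle, then sleeps for whatever remains of the interval. It is not the code in src/monitor.py: `run_forever`, `run_cycle`, `max_cycles` and the injectable `sleep` are hypothetical names introduced here so the loop can be exercised without waiting 15 minutes.

```python
import time
from typing import Callable, Optional


def run_forever(
    run_cycle: Callable[[], None],
    interval_seconds: float = 900,  # documented default for OREACLE_INTERVAL
    max_cycles: Optional[int] = None,  # hypothetical: None means "until stopped"
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run cycles at a fixed interval and return how many were started."""
    completed = 0
    while max_cycles is None or completed < max_cycles:
        started = time.monotonic()
        try:
            run_cycle()
        except Exception as exc:  # keep the loop alive if one cycle fails
            print(f"[ERROR] cycle failed: {exc}")
        completed += 1
        if max_cycles is not None and completed >= max_cycles:
            break
        # Sleep only for the remainder of the interval, never a negative time.
        elapsed = time.monotonic() - started
        sleep(max(0.0, interval_seconds - elapsed))
    return completed
```

Subtracting the elapsed time keeps cycles roughly 15 minutes apart even when a cycle itself takes a few minutes.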
- MANIFOLD_API_KEY - Your Manifold Markets API key
- MARKET_SLUG - The market to monitor (e.g., "MikhailTal/catl-receives-license-renewal-for-y")

- OPENAI_API_KEY - OpenAI API key for GPT-4o-mini analysis
- OREACLE_MODEL - OpenAI model to use (default: "gpt-4o-mini")
- OREACLE_MIN_CONFIDENCE - Minimum confidence for decisions (default: 0.75)

- DEEPL_API_KEY - DeepL API key for translation fallback
- GOOGLE_TRANSLATE_API_KEY - Google Translate API key fallback

- OREACLE_COMMENT_ONLY - Set to "1" to disable trading (default: "1")
- OREACLE_INTERVAL - Check interval in seconds (default: 900)
- OREACLE_DB - SQLite database path (default: "./tmp/oreacle.db")
- OREACLE_LOG - Log level: DEBUG, INFO, WARNING, ERROR (default: "INFO")
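To make the defaults above concrete, here is a minimal sketch of reading these variables in Python. The `Settings` class and `load_settings` function are hypothetical names, not taken from the repository, and treating MANIFOLD_API_KEY and MARKET_SLUG as mandatory is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    manifold_api_key: str
    market_slug: str
    openai_api_key: Optional[str]
    model: str
    min_confidence: float
    comment_only: bool
    interval_seconds: int
    db_path: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, applying the documented defaults."""
    # Assumption: these two variables are required; the rest are optional.
    missing = [k for k in ("MANIFOLD_API_KEY", "MARKET_SLUG") if not env.get(k)]
    if missing:
        raise RuntimeError(f"Missing required variables: {', '.join(missing)}")
    return Settings(
        manifold_api_key=env["MANIFOLD_API_KEY"],
        market_slug=env["MARKET_SLUG"],
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        model=env.get("OREACLE_MODEL", "gpt-4o-mini"),
        min_confidence=float(env.get("OREACLE_MIN_CONFIDENCE", "0.75")),
        comment_only=env.get("OREACLE_COMMENT_ONLY", "1") == "1",
        interval_seconds=int(env.get("OREACLE_INTERVAL", "900")),
        db_path=env.get("OREACLE_DB", "./tmp/oreacle.db"),
        log_level=env.get("OREACLE_LOG", "INFO").upper(),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to test, for example `load_settings({"MANIFOLD_API_KEY": "k", "MARKET_SLUG": "user/slug"})`.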
CNINFO:
- URL: https://www.cninfo.com.cn/
- Purpose: Official CATL company filings and announcements
- Keywords: Chinese terms for mining permits, renewals, production status
SZSE:
- URL: https://www.szse.cn/disclosure/listed/notice/
- Purpose: Exchange notices related to CATL (300750.SZ)
- Scope: Regulatory notices and company announcements
Jiangxi Natural Resources:
- URLs:
- Purpose: Mining rights transactions and permits for Yichun region
- Focus: License renewals, exploration permits, production approvals
- Monitors 3 Chinese regulatory sources every 15 minutes for CATL lithium mine updates
- Analyzes documents with AI using GPT-4o-mini to extract structured information
- Posts bilingual comments to Manifold Markets with exact Chinese quotes + English translations
- Makes conservative trades only when high-confidence analysis passes strict safety gates
- Tracks all activity in SQLite database to prevent duplicate processing
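The deduplication step in the last bullet can be sketched with the standard-library sqlite3 module. The table name `seen`, its columns, and the helper names below are assumptions for illustration. The actual schema in src/storage.py may differ.

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the dedup database; ':memory:' is handy for tests."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen ("
        " source TEXT NOT NULL,"
        " item_id TEXT NOT NULL,"
        " PRIMARY KEY (source, item_id))"
    )
    return conn


def mark_if_new(conn: sqlite3.Connection, source: str, item_id: str) -> bool:
    """Record an item and return True only the first time it is seen."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO seen (source, item_id) VALUES (?, ?)",
        (source, item_id),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted, 0 when the primary key already existed.
    return cur.rowcount == 1
```

Letting the primary key reject repeats keeps the check and the write in one statement, so two overlapping runs against the same file cannot both claim the item.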
- Document Classification: YES_CONDITION (license renewals, production approvals), NO_CONDITION (exploration only, suspensions), AMBIGUOUS, IRRELEVANT
- Mine Matching: Identifies references to Jianxiawo/枧下窝 mine specifically vs other locations
- Authority Recognition: Maps regulatory bodies (Jiangxi Natural Resources, Yichun authorities, etc.)
- Confidence Scoring: 0.0-1.0 confidence with evidence-based reasoning
- Risk Detection: Flags exploration-only permits, typos, unclear language
- Quote Extraction: Preserves exact Chinese regulatory language with literal English translations
- Conservative Gates: Requires entity match + license action + high confidence + clear evidence
- NO False Positives: Multiple validation layers prevent bad trades
- Small Positions: Default 5 M$ limit orders at 55% probability
- Comment-Only Mode: Trading disabled by default (OREACLE_COMMENT_ONLY=1)
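A minimal sketch of how these gates might combine, assuming the extraction is a dict with the fields `mine_match`, `proposed_label`, `confidence` and `evidence`. The function name `evaluate_gates` and the exact checks are illustrative, not the logic in src/decision.py, and only the YES side is covered.

```python
from typing import Any, Dict, List, Tuple

MIN_CONFIDENCE = 0.75  # documented default for OREACLE_MIN_CONFIDENCE


def evaluate_gates(
    extraction: Dict[str, Any], comment_only: bool = True
) -> Tuple[str, List[str]]:
    """Return ("TRADE" or "COMMENT", list of gates that blocked a trade)."""
    failed: List[str] = []
    # Gate 1: the document must be about the Jianxiawo mine specifically.
    if extraction.get("mine_match") != "JIANXIAWO_MATCH":
        failed.append("entity match")
    # Gate 2 (assumption): a license action is signalled by a YES_CONDITION label.
    if extraction.get("proposed_label") != "YES_CONDITION":
        failed.append("license action")
    # Gate 3: confidence must reach the configured minimum.
    if float(extraction.get("confidence", 0.0)) < MIN_CONFIDENCE:
        failed.append("confidence")
    # Gate 4: at least one exact Chinese quote must back the claim.
    evidence = extraction.get("evidence") or []
    if not any(item.get("exact_zh_quote") for item in evidence):
        failed.append("evidence")
    # Comment-only mode blocks trading regardless of the other gates.
    if comment_only:
        failed.append("comment-only mode")
    return ("COMMENT" if failed else "TRADE", failed)
```

Returning the list of failed gates, rather than a bare boolean, makes it easy to log why a trade was skipped.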
- Enhanced Data Collection:
- CNINFO: Keyword search + stock code 300750 sweep
- SZSE: Retry logic with reduced keyword set on failures
- Jiangxi: Mining rights portal scraping with relevance filtering
- Boolean Prefiltering: Requires (company OR mine OR geo) AND (license-action OR resumption verb)
- LLM Analysis: Structured JSON Schema output with exact quote preservation
- Decision Gates: Conservative logic prevents false positives in trading
- Rich Comments: Bilingual format with confidence indicators, evidence quotes, source links
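The boolean prefilter rule, (company OR mine OR geo) AND (license-action OR resumption verb), could look roughly like this. The keyword tuples are illustrative assumptions. The bot's real term lists (for example in phrasebook.yml) are not shown in this document.

```python
from typing import Iterable

# Illustrative keyword groups (assumptions, not the bot's actual lists).
COMPANY_TERMS = ("宁德时代", "CATL", "300750")
MINE_TERMS = ("枧下窝", "Jianxiawo")
GEO_TERMS = ("宜春", "Yichun")
LICENSE_TERMS = ("采矿许可证", "延续", "mining license", "renewal")
RESUMPTION_TERMS = ("恢复生产", "复产", "resume production")


def _has_any(text: str, terms: Iterable[str]) -> bool:
    """Case-insensitive substring match against any of the terms."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)


def passes_prefilter(text: str) -> bool:
    """True when the text names a relevant entity AND a relevant action."""
    entity = _has_any(text, COMPANY_TERMS + MINE_TERMS + GEO_TERMS)
    action = _has_any(text, LICENSE_TERMS + RESUMPTION_TERMS)
    return entity and action
```

Because this is plain substring matching, it is cheap enough to run on every scraped item before any LLM call is made.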
- Regex-based classification with Chinese/English keyword matching
- Optional DeepL/Google translation for Chinese text
- Basic relevance filtering and comment generation
- Automatically enabled when OPENAI_API_KEY is not provided
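As an illustration of the regex-based fallback, the sketch below maps text to the four labels used elsewhere in this document. The patterns and the `classify` function are assumptions for demonstration and are not copied from src/classify.py.

```python
import re

# Illustrative patterns (assumptions); real keyword coverage would be broader.
ENTITY_PATTERN = re.compile(r"枧下窝|宜春|宁德时代|Jianxiawo|Yichun|CATL", re.IGNORECASE)
YES_PATTERN = re.compile(
    r"采矿许可证延续|恢复生产|mining licen[cs]e renewal|resume production",
    re.IGNORECASE,
)
NO_PATTERN = re.compile(
    r"探矿权|停产|exploration (?:permit|licen[cs]e)|suspension",
    re.IGNORECASE,
)


def classify(text: str) -> str:
    """Return YES_CONDITION, NO_CONDITION, AMBIGUOUS, or IRRELEVANT."""
    if not ENTITY_PATTERN.search(text):
        return "IRRELEVANT"
    has_yes = bool(YES_PATTERN.search(text))
    has_no = bool(NO_PATTERN.search(text))
    if has_yes and not has_no:
        return "YES_CONDITION"
    if has_no and not has_yes:
        return "NO_CONDITION"
    # Either no signal or conflicting signals: leave it for a human to read.
    return "AMBIGUOUS"
```

Conflicting signals fall through to AMBIGUOUS on purpose, which matches the conservative stance the bot takes elsewhere.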
Raw Document → Boolean Prefilter → LLM Extraction → Decision Gates → Action
                      ↓                  ↓                ↓
               Entity + Action     Structured JSON   Comment/Trade
               Match Required      with Evidence     if Gates Pass
- src/models.py - Pydantic schemas for structured LLM outputs
- src/llm_client.py - OpenAI client with JSON Schema validation
- src/decision.py - Conservative decision gates for trading safety
- src/comment_renderer.py - Rich bilingual comment formatting
- phrasebook.yml - Chinese/English term definitions for LLM context

- src/translate.py - DeepL/Google translation (used when no LLM)
- src/classify.py - Regex-based classification (used when no LLM)

- src/sources/cninfo.py - CATL official filings scraper
- src/sources/szse.py - Stock exchange notices scraper
- src/sources/jiangxi.py - Provincial mining authority scraper
- src/storage.py - SQLite deduplication database
- src/client.py - Manifold Markets API client

- src/monitor.py - Main monitoring loop with LLM integration (local continuous)
- src/monitor_single.py - Single-cycle version for GitHub Actions (cloud)
When the bot finds a relevant document, it extracts structured information:
{
"mine_match": "JIANXIAWO_MATCH",
"proposed_label": "YES_CONDITION",
"confidence": 0.85,
"key_terms_found_zh": ["采矿许可证延续", "恢复生产"],
"key_terms_found_en": ["mining license renewal", "resume production"],
"evidence": [{
"exact_zh_quote": "同意宜春枧下窝锂云母矿采矿许可证延续申请",
"en_literal": "Approve the mining license renewal application for Yichun Jianxiawo lithium mica mine",
"where_in_doc": "main announcement"
}]
}
The bot posts bilingual comments like:
🤖 Oreacle LLM Analysis — 🟢 Confidence: 85%
📄 Source: [江西省自然资源厅批复](https://example.com/doc)
🏛️ Authority: Jiangxi Natural Resources Department
⛏️ Mine Match: JIANXIAWO_MATCH
Key Evidence (ZH→EN):
> 中文: 「同意宜春枧下窝锂云母矿采矿许可证延续申请」
> English: Approve the mining license renewal application for Yichun Jianxiawo lithium mica mine
LLM Verdict: YES_CONDITION → Final: YES_CONDITION
Terms Found:
- 🇨🇳 采矿许可证延续, 恢复生产
- 🇬🇧 mining license renewal, resume production
*Automated analysis by Oreacle Bot*
[INFO] LLM integration: ENABLED
[INFO] LLM model: gpt-4o-mini, min confidence: 0.75
[INFO] Connected to market: CATL receives license renewal...
[INFO] CNINFO: Found 2 items
[INFO] Processing cninfo item: 宜春枧下窝采矿许可证延续申请获批...
[INFO] Running LLM extraction...
[INFO] LLM Analysis - Proposed: YES_CONDITION, Final: YES_CONDITION, Confidence: 0.85
[INFO] Posted LLM comment for cninfo item_12345
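To show how the extraction JSON relates to the posted comment, the sketch below renders only the evidence and terms portion of the comment format shown earlier. `render_evidence` is a hypothetical helper, not the function in src/comment_renderer.py, and it assumes the field names from the example extraction.

```python
from typing import Any, Dict, List


def render_evidence(extraction: Dict[str, Any]) -> str:
    """Render the bilingual evidence quotes and the matched terms as text."""
    lines: List[str] = ["Key Evidence (ZH→EN):"]
    for item in extraction.get("evidence", []):
        # Keep the Chinese quote verbatim and pair it with its literal translation.
        lines.append(f"> 中文: 「{item['exact_zh_quote']}」")
        lines.append(f"> English: {item['en_literal']}")
    zh_terms = ", ".join(extraction.get("key_terms_found_zh", []))
    en_terms = ", ".join(extraction.get("key_terms_found_en", []))
    lines.append("Terms Found:")
    lines.append(f"- 🇨🇳 {zh_terms}")
    lines.append(f"- 🇬🇧 {en_terms}")
    return "\n".join(lines)
```

Rendering from the structured fields, rather than from free-form model text, means the quote in the comment is always the exact string the model extracted.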
You can run the bot in three ways:
1. Local Only: Run python src/monitor.py on your machine
   - Persistent SQLite database in ./tmp/oreacle.db
   - Perfect deduplication across runs
   - Requires your machine to stay online
2. GitHub Actions Only: Automated cloud execution every 15 minutes
   - Database resets each run (ephemeral runners)
   - Still works perfectly for regulatory monitoring
   - No machine uptime requirements
3. Both Simultaneously: Local + GitHub Actions running in parallel
   - Redundancy ensures no missed announcements
   - Each maintains its own deduplication state
   - Maximum reliability
GitHub Actions resets the SQLite database each run, but this doesn't cause problems because:
- Regulatory announcements are rare - New CATL mining documents appear maybe 1-10 times per month
- Quick re-analysis is cheap - GPT-4o-mini costs ~$0.01 per document, so re-analyzing the same document a few times costs pennies
- Manifold prevents spam - If you post the same comment twice, Manifold will reject duplicates
- Time-based filtering - Most regulatory sources show recent items first, so old items naturally age out
- Short analysis window - Each run only takes 2-3 minutes, so the overlap window is minimal
In practice: A new regulatory announcement might get analyzed 2-3 times total before it's no longer in the "recent items" feed from the sources. This costs ~$0.03 instead of ~$0.01 - negligible for the reliability benefits of cloud automation.
Use GitHub Actions for production because:
- ✅ 24/7 uptime without your laptop
- ✅ Scheduled 15-minute intervals (GitHub may delay scheduled workflows during periods of high load)
- ✅ Built-in monitoring and logs
- ✅ Free hosting (GitHub Actions)
- ✅ Redundant with occasional re-analysis (~$0.02 extra cost per document)
Use Local for development because:
- ✅ Perfect deduplication
- ✅ Immediate feedback and debugging
- ✅ No CI/CD setup required
- Target Manifold Market - The prediction market being monitored
- CNINFO Portal - CATL official filings
- SZSE Disclosure Portal - Stock exchange notices
- Jiangxi Natural Resources - Provincial mining authority