Discord message archival system with thread reconstruction capabilities. Scrapes channels via browser automation and stores messages in SQLite for analysis.
Prerequisites: Bun 1.3+, Claude Code CLI with --chrome flag
# 1. Clone and install
git clone https://github.com/justSteve/DReader.git
cd DReader
bun install
# 2. Start Claude Code with Chrome browser control
claude --chrome
# Log into Discord in the browser window that opens
# 3. Start the API server
bun run dev
# 4. Start scraping a channel
curl -X POST http://localhost:3001/api/scrape/start \
-H "Content-Type: application/json" \
-d '{"channel_id": "YOUR_CHANNEL_ID", "scrape_type": "initialization"}'┌─────────────────────┐ ┌──────────────────┐ ┌─────────────┐
│ Claude Chrome │────▶│ Scrape │────▶│ SQLite DB │
│ Extension │ │ Orchestrator │ │ (bun:sqlite)│
└─────────────────────┘ └──────────────────┘ └─────────────┘
│ │ │
│ ▼ ▼
Browser DOM ┌──────────────────┐ ┌─────────────┐
Extraction │ Hono REST API │◀────│ Thread │
│ :3001 │ │ Analyzer │
└──────────────────┘ └─────────────┘
| Component | Purpose |
|---|---|
| ClaudeChromeController | Browser automation via Claude's Chrome extension |
| InitializationScrapeOrchestrator | Historical scraping (backward paging) |
| IncrementalScrapeOrchestrator | New message scraping (forward paging) |
| DatabaseService | SQLite storage with bun:sqlite |
| ThreadAnalyzer | Reconstruct conversation trees from reply chains |
Scrapes backward in time to capture channel history.
# Full history
curl -X POST http://localhost:3001/api/scrape/start \
-d '{"channel_id": "123", "scrape_type": "initialization"}'
# With date cutoff
curl -X POST http://localhost:3001/api/scrape/start \
-d '{"channel_id": "123", "scrape_type": "initialization", "oldest_date_required": "2024-01-01"}'Scrapes forward from last known message to capture new activity.
curl -X POST http://localhost:3001/api/scrape/start \
-d '{"channel_id": "123", "scrape_type": "incremental"}'Base URL: http://localhost:3001/api
| Method | Endpoint | Description |
|---|---|---|
POST |
/scrape/start |
Start new scrape job |
POST |
/scrape/resume/:id |
Resume interrupted job |
GET |
/scrape/status/:id |
Get job status |
GET |
/scrape/channel/:id/status |
Get channel scrape status |
| Method | Endpoint | Description |
|---|---|---|
GET |
/messages |
List messages (paginated) |
GET |
/messages/:id |
Get single message |
GET |
/servers |
List scraped servers |
GET |
/channels |
List scraped channels |
| Method | Endpoint | Description |
|---|---|---|
GET |
/threads/:id |
Get thread tree (returns 501) |
SQLite database with these core tables:
| Table | Purpose |
|---|---|
servers |
Discord servers |
channels |
Discord channels |
messages |
Scraped messages with metadata |
scrape_jobs |
Job tracking with progress state |
oldest_scraped_id- Earliest message capturednewest_scraped_id- Latest message capturedscrape_scope- "initialization" or "incremental"status- pending, running, completed, failed
| Variable | Default | Description |
|---|---|---|
PORT |
3001 | API server port |
DB_PATH |
./data/dreader.db |
SQLite database path |
{
"scrape": {
"scroll_delay_ms": 1000,
"max_scrolls": 100,
"messages_per_batch": 50
}
}DReader/
├── src/
│ ├── api/ # Hono REST API
│ │ ├── index.ts # Server entry point
│ │ └── routes/ # Route handlers
│ ├── domain/
│ │ ├── scrape-engine/ # Browser control & orchestration
│ │ │ ├── ClaudeChromeController.ts
│ │ │ ├── InitializationScrapeOrchestrator.ts
│ │ │ └── IncrementalScrapeOrchestrator.ts
│ │ ├── metadata-capture/ # DOM extraction
│ │ ├── thread-reconstruction/ # Thread tree building
│ │ └── models/ # TypeScript types
│ ├── services/ # Database layer
│ │ ├── DatabaseService.ts
│ │ └── schema.sql
│ └── cli/ # CLI tools
│ └── auth-setup.ts
├── packages/ # Shared packages
├── tests/ # Test suites
├── bunfig.toml # Bun configuration
└── package.json
# Install dependencies
bun install
# Run tests (50 passing, 11 skipped)
bun test
# Start dev server
bun run dev
# Type check
bun run typecheckDReader uses Claude Code's Chrome browser for Discord authentication:
- Start Claude Code with Chrome:
claude --chrome - Navigate to Discord in the browser
- Log in with your Discord account
- Keep Claude Code running during scraping
No Discord API access (no bot token, no OAuth, no REST API). All retrieval is computer-use: browser session automation or native app control.
- Code Review - Architecture assessment and recommendations
- Implementation Plan - Refactoring status and roadmap
- Thread analysis is deferred (returns 501)
- Requires Claude Code with
--chromeflag - Windows-focused (PowerShell scripts)
- No rate limiting on API endpoints yet
- Runtime: Bun 1.3+
- API: Hono
- Database: SQLite (bun:sqlite)
- Browser: Claude Chrome Extension
- Language: TypeScript (ESNext)