A production-grade learning system that simulates Kafka-like reliability, ACK/replay protocols, and real-time observability — all in one monolithic repo.
```
┌──────────────────────────────────────────────────────────────┐
│                    TRACEHUB ARCHITECTURE                     │
│                                                              │
│  ┌──────────────┐   batch+seq    ┌──────────────────┐        │
│  │ auth-svc     │───────────────▶│                  │        │
│  │ payment-svc  │  POST /ingest  │   Backend API    │        │
│  │ notif-svc    │◀───────────────│   (Express.js)   │        │
│  └──────────────┘  ACK/missing   └────────┬─────────┘        │
│  (Producer Sim)                           │                  │
│                                    ┌──────▼─────┐            │
│                                    │   Redis    │            │
│                                    │ queue:logs │            │
│                                    │ queue:retry│            │
│                                    │ producer:* │            │
│                                    └──────┬─────┘            │
│                                           │                  │
│                                    ┌──────▼─────┐            │
│                                    │ Log Worker │            │
│                                    │   (batch   │            │
│                                    │   insert)  │            │
│                                    └──────┬─────┘            │
│                                           │                  │
│                                    ┌──────▼─────┐            │
│                                    │ PostgreSQL │            │
│                                    │ logs table │            │
│                                    │ ack_state  │            │
│                                    │ replays    │            │
│                                    └────────────┘            │
│                                                              │
│  ┌──────────────┐    Socket.io   ┌──────────────────┐        │
│  │   Next.js    │◀───────────────│   Backend WS     │        │
│  │  Dashboard   │                │   (real-time)    │        │
│  └──────────────┘                └──────────────────┘        │
└──────────────────────────────────────────────────────────────┘
```
```bash
# Clone and start everything
git clone <repo>
cd tracehub
docker-compose up --build

# Services:
#   Dashboard:  http://localhost:3000
#   Backend:    http://localhost:3001
#   PostgreSQL: localhost:5432
#   Redis:      localhost:6379
```

```
tracehub/
├── apps/
│   ├── backend/                  # Node.js + Express + Socket.io
│   │   └── src/
│   │       ├── index.ts          # App entry + Socket.io setup
│   │       ├── routes/
│   │       │   ├── ingest.ts     # POST /ingest — receive log batches
│   │       │   ├── logs.ts       # GET /logs — query + replay API
│   │       │   └── metrics.ts    # GET /metrics + POST /control
│   │       ├── services/
│   │       │   ├── database.ts   # PostgreSQL pool + helpers
│   │       │   ├── redis.ts      # Redis client + queue ops
│   │       │   ├── ackService.ts # ACK/replay protocol
│   │       │   └── metricsService.ts
│   │       └── workers/
│   │           └── logWorker.ts  # Queue consumer + batch insert
│   │
│   ├── dashboard/                # Next.js 14 + Tailwind + Recharts
│   │   └── src/
│   │       ├── app/
│   │       │   ├── page.tsx      # Overview
│   │       │   ├── live-logs/    # Real-time log stream
│   │       │   ├── replay/       # Replay center
│   │       │   ├── queue/        # Queue monitor
│   │       │   └── services/     # Per-service metrics
│   │       ├── components/
│   │       │   ├── SocketProvider.tsx
│   │       │   ├── dashboard/
│   │       │   └── ui/
│   │       └── hooks/
│   │           └── useMetrics.ts # Socket.io hooks
│   │
│   └── producer-simulator/       # Fake EC2 services
│       └── src/index.ts          # auth + payment + notification producers
│
├── postgres/init.sql             # Schema: logs, ack_state, replay_requests
├── redis/redis.conf
├── shared/src/index.ts           # Shared TypeScript types
└── docker-compose.yml
```
Every log carries a monotonically increasing seq number per service:
```json
{
  "seq": 1001,
  "service": "payment-service",
  "level": "error",
  "message": "payment failed",
  "requestId": "req_abc123",
  "timestamp": "2024-01-15T10:30:00.000Z"
}
```

Producer → Backend flow:
- Producer generates logs and saves them in a Redis sorted set (`producer:{service}`) keyed by `seq`
- Producer sends the batch via `POST /ingest`
- Backend enqueues the logs into the Redis `queue:logs` list
- Backend computes the ACK response: `ackTill` = highest contiguous seq received; `missing` = gaps in the sequence window
- Producer removes ACKed logs from its buffer
- Missing seqs are queued in the `replay_requests` table
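The ACK computation in that flow ("highest contiguous seq received" plus gap detection) can be sketched as a pure function. This is an illustrative reconstruction, not the actual `ackService.ts` code; `computeAck` and its parameter names are assumptions:

```typescript
// Sketch: given the last ACKed seq and the seqs seen in the current
// window, advance ackTill through the contiguous prefix and report
// every gap below the highest seq observed.
interface AckResult {
  ackTill: number;   // highest contiguous seq received
  missing: number[]; // gaps between ackTill and the highest seen seq
  status: "ok" | "partial";
}

function computeAck(lastAcked: number, seenSeqs: number[]): AckResult {
  const seen = new Set(seenSeqs);
  const maxSeen = Math.max(lastAcked, ...seenSeqs);

  // Advance ackTill while the next expected seq is present.
  let ackTill = lastAcked;
  while (seen.has(ackTill + 1)) ackTill++;

  // Anything between ackTill and maxSeen that never arrived is missing.
  const missing: number[] = [];
  for (let s = ackTill + 1; s <= maxSeen; s++) {
    if (!seen.has(s)) missing.push(s);
  }
  return { ackTill, missing, status: missing.length ? "partial" : "ok" };
}

// With seqs 1001-1002 and 1004-1005 received but 1003 dropped:
// ackTill stops at 1002 and 1003 is reported as missing.
const ack = computeAck(1000, [1001, 1002, 1004, 1005]);
```

A single dropped seq therefore pins `ackTill` in place until the replay fills the gap, which is what forces the producer to keep un-ACKed logs buffered.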
ACK Response Format:
```json
{
  "batchId": "ack_1705312200000",
  "ackTill": 1099,
  "missing": [1088, 1092, 1095],
  "status": "partial"
}
```

Redis key layout:

```
queue:logs      LIST   → main processing queue (RPUSH / LPOP)
queue:retry     LIST   → failed/retry queue
producer:{svc}  ZSET   → producer buffer sorted by seq (for replay lookup)
ack:{svc}       STRING → current ACK state per service
```
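The list-based queues reduce to `RPUSH` on ingest and `LPOP` in the worker. A minimal in-memory stand-in (not the real `redis.ts`, which would go through an actual Redis client) shows the FIFO semantics the worker depends on:

```typescript
// In-memory stand-in for the three list operations the pipeline uses.
// Real code would issue these commands against Redis; this only
// demonstrates the tail-append / head-drain contract.
class FakeRedisList<T> {
  private items: T[] = [];

  rpush(...values: T[]): number {
    this.items.push(...values);         // append to tail, like RPUSH
    return this.items.length;
  }

  lpop(count = 1): T[] {
    return this.items.splice(0, count); // drain from head, like LPOP with COUNT
  }

  llen(): number {
    return this.items.length;           // queue depth, like LLEN
  }
}

// Ingest enqueues a batch; the worker later drains up to 50 entries.
const queueLogs = new FakeRedisList<string>();
queueLogs.rpush("auth-svc:1", "auth-svc:2", "payment-svc:7");
const batch = queueLogs.lpop(50);
```

Because producers push to the tail and the worker pops from the head, ordering within a single service's batch is preserved end to end.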
The LogWorker runs continuously:
- Pops up to 50 logs from `queue:logs`
- Deduplicates by `service:seq`
- Batch inserts into PostgreSQL
- Checks `replay_requests` for pending replays
- Fetches buffered logs from Redis for replay
- Marks replays completed

Deduplication is layered:
- In-memory sliding window Set of `service:seq` pairs
- PostgreSQL `ON CONFLICT DO NOTHING` on insert
- Maximum 10,000 entries tracked before eviction
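The in-memory layer of that dedup can be sketched as a bounded Set with oldest-first eviction. This is an illustrative sketch (`SeqDeduper` is a made-up name, not the worker's actual class); evicted entries are still caught by the `ON CONFLICT` backstop:

```typescript
// Sliding-window deduplicator: remembers recent service:seq keys and
// evicts the oldest once the cap is hit. JavaScript Sets iterate in
// insertion order, so the first value is always the oldest entry.
class SeqDeduper {
  private seen = new Set<string>();

  constructor(private maxEntries = 10_000) {}

  /** Returns true if the entry is new and should be inserted. */
  check(service: string, seq: number): boolean {
    const key = `${service}:${seq}`;
    if (this.seen.has(key)) return false; // duplicate within the window

    if (this.seen.size >= this.maxEntries) {
      const oldest = this.seen.values().next().value; // oldest key
      if (oldest !== undefined) this.seen.delete(oldest);
    }
    this.seen.add(key);
    return true;
  }
}

const dedupe = new SeqDeduper(10_000);
dedupe.check("payment-service", 1001); // first sighting → insert
dedupe.check("payment-service", 1001); // duplicate → dropped
```

The cap bounds memory at the cost of letting very old duplicates through, which is exactly why the database-level `ON CONFLICT DO NOTHING` is still required.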
| Control | Effect |
|---|---|
| Crash Worker | Stops log processing for 5s, auto-recovers |
| Network Failure | Drops all /ingest requests for 10s |
| Delay ACK | Adds 3s latency to ACK responses |
| Flush Queue | Drops all queued logs (demonstrates data loss) |
| Reset All | Clears all failure states |
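One plausible way to model these timed faults (hypothetical; not the actual `metricsService` code) is as per-fault expiry timestamps that hot paths check on every request, which gives the auto-recovery behavior for free:

```typescript
// Hypothetical fault-injection model: each control records an expiry
// timestamp; a fault is "active" until the clock passes it, so recovery
// needs no timers or cleanup.
type Fault = "crashWorker" | "networkFailure" | "delayAck";

class FaultInjector {
  private until = new Map<Fault, number>();

  trigger(fault: Fault, durationMs: number, now = Date.now()): void {
    this.until.set(fault, now + durationMs); // e.g. 5000 for Crash Worker
  }

  isActive(fault: Fault, now = Date.now()): boolean {
    return (this.until.get(fault) ?? 0) > now; // auto-recovers on expiry
  }

  reset(): void {
    this.until.clear(); // "Reset All": clear every failure state
  }
}

// An /ingest handler would consult this before accepting a batch:
const faults = new FaultInjector();
faults.trigger("networkFailure", 10_000);
const dropRequest = faults.isActive("networkFailure");
```

Passing `now` explicitly keeps the expiry logic deterministic and testable without real sleeps.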
| Page | Path | Features |
|---|---|---|
| Overview | `/` | System metrics, EPS chart, service status, fault controls |
| Live Logs | `/live-logs` | Real-time log stream, filters, search |
| Replay Center | `/replay` | Replay requests, manual trigger, live events |
| Queue Monitor | `/queue` | Queue depth charts, worker status, sim events |
| Service Metrics | `/services` | Per-service EPS, errors, ACK state, charts |
| Endpoint | Description |
|---|---|
| `POST /ingest` | Receive log batch, return ACK |
| `GET /logs` | Query logs (filters: `service`, `level`, `search`, `from`, `to`) |
| `GET /logs/replay` | List replay requests |
| `POST /logs/replay` | Trigger manual replay |
| `GET /metrics` | System metrics snapshot |
| `GET /metrics/queue` | Queue depth + worker stats |
| `GET /metrics/replay` | Replay stats by status |
| `POST /metrics/control` | Fault injection controls |
| `GET /health` | Health check |
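The `GET /logs` filters can be pictured as a predicate chain over rows. This in-memory sketch assumes the filters combine with AND (the real route would translate them into a SQL `WHERE` clause); the type and function names are illustrative:

```typescript
// Illustrative in-memory version of the GET /logs filters.
interface LogRow {
  service: string;
  level: string;
  message: string;
  timestamp: string; // ISO 8601, so string comparison orders correctly
}

interface LogQuery {
  service?: string;
  level?: string;
  search?: string; // substring match on message
  from?: string;   // inclusive ISO timestamp bounds
  to?: string;
}

function filterLogs(rows: LogRow[], q: LogQuery): LogRow[] {
  return rows.filter((r) =>
    (!q.service || r.service === q.service) &&
    (!q.level || r.level === q.level) &&
    (!q.search || r.message.includes(q.search)) &&
    (!q.from || r.timestamp >= q.from) &&
    (!q.to || r.timestamp <= q.to)
  );
}
```

Omitted filters fall through as always-true, so an empty query returns everything, mirroring how optional query-string parameters typically behave.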
| Event | Direction | Payload |
|---|---|---|
| `metrics:snapshot` | Server → Client | Full `SystemMetrics` every 2s |
| `log:new` | Server → Client | Individual `LogEntry` |
| `ack:sent` | Server → Client | ACK details |
| `replay:completed` | Server → Client | Replay result |
| `worker:crashed` | Server → Client | Crash notification |
| `worker:recovered` | Server → Client | Recovery notification |
| `sim:control` | Server → Client | Fault injection event |
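This event contract can be captured as a typed event map on the client. The payload shapes below are illustrative stand-ins (the real types live in `shared/src/index.ts`), and `TypedEmitter` is a minimal dispatcher with the same `on`/`emit` shape Socket.io exposes, not the library itself:

```typescript
// Typed map of the server→client events above. Giving each event a
// handler signature lets the compiler catch payload mismatches.
interface ServerToClientEvents {
  "metrics:snapshot": (m: { eps: number; queueDepth: number }) => void;
  "log:new": (log: { service: string; seq: number; message: string }) => void;
  "ack:sent": (ack: { ackTill: number; missing: number[] }) => void;
  "replay:completed": (r: { service: string; replayed: number }) => void;
  "worker:crashed": () => void;
  "worker:recovered": () => void;
  "sim:control": (c: { fault: string; active: boolean }) => void;
}

// Minimal dispatcher demonstrating how a SocketProvider-style component
// could fan events out to hooks like useMetrics.
class TypedEmitter {
  private handlers = new Map<keyof ServerToClientEvents, Function[]>();

  on<E extends keyof ServerToClientEvents>(event: E, fn: ServerToClientEvents[E]): void {
    const list = this.handlers.get(event) ?? [];
    list.push(fn);
    this.handlers.set(event, list);
  }

  emit<E extends keyof ServerToClientEvents>(
    event: E,
    ...args: Parameters<ServerToClientEvents[E]>
  ): void {
    for (const fn of this.handlers.get(event) ?? []) fn(...args);
  }
}
```

Socket.io itself accepts such an event-map interface as a type parameter, so the same `ServerToClientEvents` shape can type both the real socket and tests.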
This monolith is structured to extract into:
```
tracehub-ingest-service   → /ingest route
tracehub-worker-service   → logWorker.ts
tracehub-query-service    → /logs route
tracehub-metrics-service  → metricsService.ts
tracehub-replay-service   → ackService.ts + replay worker
```
Each shares the same PostgreSQL schema and Redis keys, making migration additive rather than disruptive.
```bash
# Start dependencies
docker-compose up postgres redis -d

# Backend
cd apps/backend
npm install
npm run dev

# Producer
cd apps/producer-simulator
npm install
npm run dev

# Dashboard
cd apps/dashboard
npm install
npm run dev
```