A production-ready distributed API monitoring platform that tracks requests across microservices, detects issues, and provides real-time visibility through a modern dashboard.
Runs on Linux, macOS, and Windows via Docker.
- **Real-time API Monitoring** - Track all API requests across your microservices
- **Interactive Dashboard** - Modern Next.js UI with charts and filters
- **Smart Alerting** - Automatic alerts for latency issues and errors
- **Secure by Default** - JWT authentication, role-based access control
- **Rate Limiting** - Built-in token bucket rate limiter
- **Docker Ready** - One-command deployment with Docker Compose
- **Easy Integration** - Drop-in library for Spring Boot applications
- Architecture
- Database Schemas
- Design Decisions
- Dual MongoDB Setup
- Rate Limiter Deep Dive
- Quick Start
- Integration Guide
- API Reference
- Troubleshooting
```
+------------------------------------------------------------------------------+
|                             YOUR MICROSERVICES                               |
|  +-----------------+    +-----------------+    +-----------------+          |
|  |  Order Service  |    |  User Service   |    | Payment Service |   ...    |
|  |  + Monitoring   |    |  + Monitoring   |    |  + Monitoring   |          |
|  |    Library      |    |    Library      |    |    Library      |          |
|  +--------+--------+    +--------+--------+    +--------+--------+          |
+-----------|----------------------|----------------------|-------------------+
            |                      |                      |
            |       Async HTTP (non-blocking)             |
            +----------------------+----------------------+
                                   v
+------------------------------------------------------------------------------+
|                             COLLECTOR SERVICE                                |
|  +---------------+  +---------------+  +---------------+  +---------------+ |
|  | Log Ingestion |  | Alert         |  | Issue         |  | Auth &        | |
|  | & Processing  |  | Generator     |  | Management    |  | Security      | |
|  +---------------+  +---------------+  +---------------+  +---------------+ |
|                       Spring Boot + Kotlin (JDK 21)                          |
+---------+---------------------------------------------------------+---------+
          |                                                         |
          v                                                         v
+------------------------+                       +------------------------+
|     LOGS DATABASE      |                       |   METADATA DATABASE    |
|    (MongoDB:27017)     |                       |    (MongoDB:27018)     |
|                        |                       |                        |
|  Collections:          |                       |  Collections:          |
|  - api_logs            |                       |  - users               |
|  - rate_limit_events   |                       |  - incidents           |
|                        |                       |  - alerts              |
|  TTL: 30 days          |                       |  TTL: 90 days          |
+------------------------+                       +------------------------+
            ^                                                ^
            +-----------------------+------------------------+
                                    |
                 +----------------------------------+
                 |        NEXT.JS DASHBOARD         |
                 |           (Port 9000)            |
                 |                                  |
                 |  - Real-time Log Monitoring      |
                 |  - Log Explorer with Filters     |
                 |  - Issue Management              |
                 |  - Alert Viewer                  |
                 +----------------------------------+
```
| Component | Technology | Purpose |
|---|---|---|
| Monitoring Library | Kotlin + Spring Boot | Drop-in library for any microservice |
| Collector Service | Kotlin + Spring Boot 3.2 | Central log aggregation & analysis |
| Dashboard | Next.js 14 + React | Web UI for monitoring |
| Logs Database | MongoDB 7 | High-volume log storage |
| Metadata Database | MongoDB 7 | Users, incidents, alerts |
1. **Request Interception** - Monitoring library intercepts HTTP requests/responses
2. **Rate Limiting Check** - Token bucket algorithm checks request rate
3. **Async Shipping** - Logs sent asynchronously to collector (non-blocking)
4. **Processing** - Collector analyzes logs, generates alerts if thresholds exceeded
5. **Storage** - Data persisted to appropriate MongoDB database
6. **Visualization** - Dashboard queries collector API for real-time data
Stores all API request/response logs from monitored services.
```js
{
  "_id": ObjectId,
  "timestamp": ISODate,       // When request occurred
  "serviceName": String,      // e.g., "order-service"
  "endpoint": String,         // e.g., "/api/orders"
  "method": String,           // HTTP method: GET, POST, etc.
  "statusCode": Int32,        // HTTP status code
  "latency": Int64,           // Response time in milliseconds
  "requestSize": Int64,       // Request body size in bytes
  "responseSize": Int64,      // Response body size in bytes
  "traceId": String,          // Distributed tracing ID
  "clientId": String,         // Optional client identifier
  "userId": String,           // Optional user identifier
  "isRateLimited": Boolean,   // Was this request rate-limited?
  "error": String,            // Error message if any
  "createdAt": ISODate        // TTL index: auto-delete after 30 days
}
```

```js
// Indexes for query performance
CompoundIndex: { "serviceName": 1, "endpoint": 1, "timestamp": -1 }
CompoundIndex: { "serviceName": 1, "timestamp": -1 }
TTL Index: { "createdAt": 1 }, expireAfterSeconds: 2592000 (30 days)
```

Tracks when rate limits are exceeded.
```js
{
  "_id": ObjectId,
  "timestamp": ISODate,     // When rate limit was hit
  "serviceName": String,    // Service that hit the limit
  "currentRate": Int32,     // Current request rate
  "limit": Int32,           // Configured limit
  "traceId": String,        // Request trace ID
  "createdAt": ISODate      // TTL index: auto-delete after 1 day
}
```

User authentication and authorization.
```js
{
  "_id": ObjectId,
  "username": String,       // Unique username (indexed)
  "email": String,          // Unique email (indexed)
  "passwordHash": String,   // BCrypt hashed password
  "role": String,           // "ADMIN" or "DEVELOPER"
  "createdAt": ISODate,
  "lastLogin": ISODate
}
```

Tracks issues detected in monitored services.
```js
{
  "_id": ObjectId,
  "serviceName": String,         // Affected service
  "endpoint": String,            // Affected endpoint
  "issueType": String,           // "SLOW", "BROKEN", "RATE_LIMITED"
  "firstSeen": ISODate,          // When issue first detected
  "lastSeen": ISODate,           // Most recent occurrence
  "openIncidentsCount": Int32,   // Number of occurrences
  "status": String,              // "OPEN", "RESOLVED", "RE_OPENED"
  "assignee": String,            // Assigned user (optional)
  "version": Int64,              // Optimistic locking version
  "auditTrail": [                // Change history
    {
      "timestamp": ISODate,
      "action": String,
      "changedBy": String,
      "changeDetails": String
    }
  ],
  "createdAt": ISODate
}
```

Active and historical alerts.
```js
{
  "_id": ObjectId,
  "serviceName": String,    // Affected service
  "endpoint": String,       // Affected endpoint
  "alertType": String,      // "HIGH_LATENCY", "ERROR_5XX", "RATE_LIMIT"
  "severity": String,       // "CRITICAL", "WARNING", "INFO"
  "message": String,        // Human-readable alert message
  "triggeredAt": ISODate,   // When alert fired
  "resolvedAt": ISODate,    // When resolved (null if active)
  "isActive": Boolean,      // Current alert status
  "metadata": Object,       // Additional context
  "createdAt": ISODate      // TTL index: auto-delete after 90 days
}
```

Decision: Use two separate MongoDB instances instead of one.
Rationale:
| Benefit | Description |
|---|---|
| Isolation | High-volume log writes don't impact user authentication queries |
| Scaling | Each database can be scaled independently based on load |
| Retention | Different TTL policies (logs: 30 days, metadata: 90 days) |
| Performance | Separate connection pools prevent resource contention |
Trade-offs:
- Increased infrastructure complexity
- Higher memory footprint
- No cross-database transactions (acceptable for this use case)
Decision: Implement token bucket algorithm instead of sliding window.
Rationale:
- Burst Handling: Allows legitimate traffic bursts while maintaining average rate
- Non-blocking: Requests continue even when rate-limited (flagged for monitoring)
- Memory Efficient: O(1) space complexity per service
- Thread-safe: Atomic operations using `AtomicLong`
Decision: Use the `@Version` annotation for concurrent incident updates.
Rationale:
- No Database Locks: Better scalability than pessimistic locking
- Conflict Detection: Clear feedback when concurrent modifications occur
- Retry Logic: Client can fetch fresh data and retry
- Audit Trail: All changes tracked with timestamps
Decision: Send logs asynchronously from monitoring library.
Rationale:
- Zero Latency Impact: Monitored services not blocked by collector
- Resilience: If collector is down, service continues operating
- Batching: Multiple logs sent in single HTTP request
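The fire-and-forget shape of this design can be sketched as follows. This is a minimal illustration, not the library's actual implementation: `AsyncLogShipper` and the `send` callback (standing in for the HTTP POST to the collector's batch endpoint) are assumed names.

```kotlin
import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.TimeUnit
import kotlin.concurrent.thread

// Minimal sketch of a non-blocking batch shipper: callers enqueue in O(1)
// and a daemon thread ships batches, so the request path never waits on
// the collector. Logs are dropped (never blocked on) if the queue fills
// or the collector is down.
class AsyncLogShipper(
    private val batchSize: Int = 50,
    private val flushIntervalMs: Long = 1000,
    private val send: (List<String>) -> Unit
) {
    private val queue = LinkedBlockingQueue<String>(10_000)

    private val worker = thread(isDaemon = true, name = "log-shipper") {
        val batch = mutableListOf<String>()
        while (true) {
            // Wait up to flushIntervalMs for the next log, then grab
            // whatever else is already queued, up to the batch size.
            val item = queue.poll(flushIntervalMs, TimeUnit.MILLISECONDS)
            if (item != null) batch.add(item)
            queue.drainTo(batch, batchSize - batch.size)
            if (batch.size >= batchSize || (item == null && batch.isNotEmpty())) {
                try { send(batch.toList()) } catch (_: Exception) { /* drop; never block callers */ }
                batch.clear()
            }
        }
    }

    // Called from the request path: never blocks, returns false if the queue is full.
    fun enqueue(logJson: String): Boolean = queue.offer(logJson)
}
```

The timeout-flush branch is what keeps partial batches from sitting in the queue during quiet periods.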
Decision: Stateless JWT tokens instead of session-based auth.
Rationale:
- Scalability: No server-side session storage needed
- Microservice Ready: Token validated independently by each service
- Expiry: 24-hour token lifetime balances security and UX
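The "no server-side session storage" claim boils down to signature verification: any service holding the shared secret can check a token without a lookup. The toy signer below illustrates that idea with JDK-only HMAC-SHA256; a real deployment would use a proper JWT library with header, claims, and expiry handling, which this sketch omits.

```kotlin
import java.util.Base64
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Simplified stateless token: base64url(payload) + "." + base64url(HMAC).
// Validity is proven by recomputing the MAC, not by consulting a session store.
object HmacSigner {
    private val b64 = Base64.getUrlEncoder().withoutPadding()

    fun sign(payload: String, secret: ByteArray): String {
        val mac = Mac.getInstance("HmacSHA256")
        mac.init(SecretKeySpec(secret, "HmacSHA256"))
        val sig = b64.encodeToString(mac.doFinal(payload.toByteArray()))
        return "${b64.encodeToString(payload.toByteArray())}.$sig"
    }

    // Returns the payload if the signature checks out, null otherwise.
    fun verify(token: String, secret: ByteArray): String? {
        val parts = token.split(".")
        if (parts.size != 2) return null
        val payload = String(Base64.getUrlDecoder().decode(parts[0]))
        return if (sign(payload, secret) == token) payload else null
    }
}
```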
Single database problems:

- High log write volume blocks user authentication
- Log queries slow down incident management
- Single point of failure
- Cannot scale components independently

Dual database benefits:

- Write-heavy logs isolated from read-heavy metadata
- Independent scaling (add replicas to logs DB if needed)
- Different backup strategies per database
- Separate connection pools prevent resource contention
```kotlin
// LogsDatabaseConfig.kt - High-volume log storage
@Configuration
class LogsDatabaseConfig {

    @Value("\${spring.data.mongodb.logs.uri}")
    private lateinit var logsUri: String

    @Bean("logsMongoTemplate")
    fun logsMongoTemplate(): MongoTemplate {
        return MongoTemplate(SimpleMongoClientDatabaseFactory(
            MongoClients.create(logsUri), "monitoring_logs"
        ))
    }
}

// MetadataDatabaseConfig.kt - Users, incidents, alerts
@Configuration
class MetadataDatabaseConfig {

    @Value("\${spring.data.mongodb.metadata.uri}")
    private lateinit var metadataUri: String

    @Bean("metadataMongoTemplate")
    @Primary // Default template for Spring Data repositories
    fun metadataMongoTemplate(): MongoTemplate {
        return MongoTemplate(SimpleMongoClientDatabaseFactory(
            MongoClients.create(metadataUri), "monitoring_metadata"
        ))
    }
}
```

```yaml
services:
  logs-mongodb:
    image: mongo:7
    ports:
      - "127.0.0.1:27017:27017"   # Localhost only for security
    environment:
      MONGO_INITDB_ROOT_USERNAME: ${MONGO_LOGS_ROOT_USERNAME}
      MONGO_INITDB_ROOT_PASSWORD: ${MONGO_LOGS_ROOT_PASSWORD}
    volumes:
      - logs_data:/data/db

  metadata-mongodb:
    image: mongo:7
    ports:
      - "127.0.0.1:27018:27017"   # Different host port
    environment:
      MONGO_INITDB_ROOT_USERNAME: ${MONGO_METADATA_ROOT_USERNAME}
      MONGO_INITDB_ROOT_PASSWORD: ${MONGO_METADATA_ROOT_PASSWORD}
    volumes:
      - metadata_data:/data/db
```
```
+---------------------------------------------------------------------------+
|                           TOKEN BUCKET CONCEPT                            |
+---------------------------------------------------------------------------+
|                                                                           |
|  Bucket Capacity: 100 tokens                                              |
|  Refill Rate:     100 tokens/second                                       |
|                                                                           |
|  +------------------------+                                               |
|  | ###################### | <- 100 tokens (full)                          |
|  | ###################### |                                               |
|  | ###################### |    Each request consumes 1 token              |
|  | ###################### |                                               |
|  | ###################### |    Tokens refill continuously                 |
|  +------------------------+                                               |
|                                                                           |
|  Request arrives -> Token available? -> YES: Process, remove token        |
|                                      -> NO:  Flag as rate-limited         |
|                                                                           |
+---------------------------------------------------------------------------+
```
```kotlin
import java.util.concurrent.atomic.AtomicLong
import kotlin.math.min

class TokenBucketLimiter(
    private val capacity: Int,          // Max tokens (e.g., 100)
    private val refillRate: Double,     // Tokens added per interval
    private val refillIntervalMs: Long  // Refill interval in ms
) {
    private val tokens = AtomicLong(capacity.toLong())
    private val lastRefillTime = AtomicLong(System.currentTimeMillis())

    fun allowRequest(): Boolean {
        refillTokens() // Add tokens based on elapsed time
        // decrementAndGet is atomic: if the count goes negative, another
        // thread consumed the last token first, so undo the decrement.
        return if (tokens.decrementAndGet() >= 0) {
            true    // Request allowed
        } else {
            tokens.incrementAndGet()
            false   // Rate limited (but the request still continues!)
        }
    }

    private fun refillTokens() {
        val last = lastRefillTime.get()
        val now = System.currentTimeMillis()
        val timePassed = now - last
        if (timePassed >= refillIntervalMs) {
            // Only the thread that wins the CAS performs the refill,
            // so tokens are never added twice for the same interval.
            if (lastRefillTime.compareAndSet(last, now)) {
                val tokensToAdd = ((timePassed / refillIntervalMs.toDouble()) * refillRate).toLong()
                tokens.updateAndGet { min(capacity.toLong(), it + tokensToAdd) }
            }
        }
    }
}
```

| Property | Value | Why? |
|---|---|---|
| Non-blocking | Requests continue when limited | Service availability > strict enforcement |
| Thread-safe | Uses `AtomicLong` | Safe for concurrent Spring requests |
| Burst-friendly | Full bucket handles bursts | Legitimate traffic spikes allowed |
| Memory efficient | O(1) per limiter | Scales to many services |
```yaml
# In your microservice's application.yml
monitoring:
  rateLimit:
    enabled: true
    limit: 100     # requests per second
    window: 1000   # milliseconds (refill interval)
```

```
Scenario 1: Normal Traffic (50 req/sec)
---------------------------------------
Bucket stays near full, all requests allowed
Tokens: [####################----] (80-100)

Scenario 2: Traffic Spike (150 req/sec for 2 seconds)
-----------------------------------------------------
First 100 requests allowed (bucket drains)
Next 50 flagged as rate-limited
After spike: bucket refills in ~1 second
Tokens: [####--------------------] (0-20)

Scenario 3: Sustained High Traffic (120 req/sec)
------------------------------------------------
~100/sec allowed, ~20/sec flagged as rate-limited
Service continues operating, alerts generated
Tokens: [#-----------------------] (0-5)
```
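Scenario 2 can be reproduced with a small virtual-clock simulation. This is a deliberately simplified, single-threaded model (not the thread-safe limiter above), and it treats the burst as arriving at one instant, so refill during the burst itself is ignored.

```kotlin
import kotlin.math.min

// Virtual-clock token bucket: capacity 100, refill 100 tokens/second.
class SimBucket(private val capacity: Long, private val refillPerMs: Double) {
    private var tokens = capacity.toDouble()
    private var lastMs = 0L

    fun allow(nowMs: Long): Boolean {
        tokens = min(capacity.toDouble(), tokens + (nowMs - lastMs) * refillPerMs)
        lastMs = nowMs
        return if (tokens >= 1) { tokens -= 1; true } else false
    }
}

// 150 requests hit an initially full bucket at the same instant.
fun simulateSpike(): Pair<Int, Int> {
    val bucket = SimBucket(capacity = 100, refillPerMs = 100.0 / 1000)
    var allowed = 0
    var limited = 0
    repeat(150) {
        if (bucket.allow(nowMs = 0)) allowed++ else limited++
    }
    return allowed to limited
}
```

With these numbers the burst splits exactly as in Scenario 2: 100 requests allowed, 50 flagged. One virtual second later, `allow(1000)` succeeds again because the bucket has fully refilled.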
- Docker & Docker Compose (v2.0+)
- Java 21 (for local development)
- Node.js 18+ (for dashboard development)
| Platform | Setup Script | Init Script |
|---|---|---|
| Windows | `.\scripts\setup.ps1` | `.\scripts\init-data.ps1` |
| Linux/macOS | `./scripts/setup.sh` | `./scripts/init-data.sh` |
```bash
git clone https://github.com/YOUR_USERNAME/api-monitoring-platform.git
cd api-monitoring-platform
```

For development (quick setup):

```bash
# Linux/macOS
cp .env.example .env

# Windows PowerShell
Copy-Item .env.example .env
```

For production (secure setup):

```bash
# Linux/macOS
chmod +x scripts/setup.sh && ./scripts/setup.sh

# Windows PowerShell
.\scripts\setup.ps1
```

Start the stack:

```bash
docker-compose up -d
```

Seed initial data:

```bash
# Linux/macOS
./scripts/init-data.sh

# Windows PowerShell
.\scripts\init-data.ps1
```

| Service | URL | Description |
|---|---|---|
| Dashboard | http://localhost:9000 | Web UI |
| API | http://localhost:9080 | Backend API |
| Logs DB | mongodb://localhost:27017 | MongoDB for logs |
| Metadata DB | mongodb://localhost:27018 | MongoDB for metadata |

Default credentials:

- Username: `admin`
- Password: `admin123`
1. Add the dependency:

```kotlin
// build.gradle.kts
dependencies {
    implementation(files("libs/monitoring-library-1.0.0.jar"))
}
```

2. Configure:

```yaml
# application.yml
monitoring:
  enabled: true
  serviceName: "your-service-name"
  collectorUrl: "http://localhost:9080/api/logs"
  rateLimit:
    enabled: true
    limit: 100
    window: 1000
  tracking:
    captureRequestBody: false
    excludePaths:
      - "/health"
      - "/actuator/**"
```

3. Done! The library auto-configures via Spring's `spring.factories`.
```bash
# Login
curl -X POST http://localhost:9080/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "password": "admin123"}'
```

```bash
# Get logs with filters
curl -H "Authorization: Bearer TOKEN" \
  "http://localhost:9080/api/logs?serviceName=order-service&limit=100"

# Ingest single log
curl -X POST http://localhost:9080/api/logs/single \
  -H "Content-Type: application/json" \
  -d '{
    "timestamp": "2024-12-07T10:00:00Z",
    "serviceName": "order-service",
    "endpoint": "/api/orders",
    "method": "POST",
    "statusCode": 201,
    "latency": 150,
    "traceId": "trace-123"
  }'
```

| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/auth/login` | JWT login |
| POST | `/api/logs/batch` | Batch log ingest |
| POST | `/api/logs/single` | Single log ingest |
| GET | `/api/logs` | Query logs with filters |
| GET | `/api/incidents` | List issues |
| POST | `/api/incidents/{id}/resolve` | Resolve issue |
| GET | `/api/alerts` | Get active alerts |
| Variable | Description | Required |
|---|---|---|
| `MONGO_LOGS_ROOT_PASSWORD` | Logs DB password | Yes |
| `MONGO_METADATA_ROOT_PASSWORD` | Metadata DB password | Yes |
| `APP_JWT_SECRET` | JWT signing key (256-bit) | Yes |
| `APP_ADMIN_USERNAME` | Default admin username | No |
| `APP_ADMIN_PASSWORD` | Default admin password | No |
| Condition | Severity |
|---|---|
| Latency > 500ms | WARNING |
| Latency > 1000ms | CRITICAL |
| Status Code 5xx | CRITICAL |
| Rate Limit Exceeded | WARNING |
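The threshold table above can be read as a pure classification function. The sketch below is illustrative: the enum values mirror the alert schema's severity strings, but the exact rule ordering inside the real Alert Generator is an assumption.

```kotlin
// Hypothetical classifier mirroring the alert threshold table.
// 5xx wins over latency, and higher latency wins over lower.
enum class Severity { CRITICAL, WARNING, NONE }

fun classify(statusCode: Int, latencyMs: Long, rateLimited: Boolean): Severity = when {
    statusCode in 500..599 -> Severity.CRITICAL   // Status Code 5xx
    latencyMs > 1000       -> Severity.CRITICAL   // Latency > 1000ms
    latencyMs > 500        -> Severity.WARNING    // Latency > 500ms
    rateLimited            -> Severity.WARNING    // Rate limit exceeded
    else                   -> Severity.NONE
}
```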
Problem: Multiple developers marking same issue as resolved
Solution: Optimistic Locking with MongoDB version field
1. Read incident with version (v=5)
2. Update: WHERE id=X AND version=5
3. Increment version to 6
4. If modified by another writer → 409 Conflict → retry
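The version-check semantics can be modeled in memory with a compare-and-set, which behaves like MongoDB's conditional `WHERE id=X AND version=N` update. This is a sketch of the concept, not the service's actual repository code; `Incident` and `IncidentStore` are illustrative names.

```kotlin
import java.util.concurrent.atomic.AtomicReference

data class Incident(val status: String, val version: Long)

// In-memory model of optimistic locking: an update succeeds only if the
// stored version still matches the version the caller originally read.
class IncidentStore(initial: Incident) {
    private val ref = AtomicReference(initial)

    fun read(): Incident = ref.get()

    /** Returns true on success; false means another writer won (HTTP 409 in the API). */
    fun resolve(expectedVersion: Long): Boolean {
        val current = ref.get()
        if (current.version != expectedVersion) return false
        return ref.compareAndSet(current, Incident("RESOLVED", current.version + 1))
    }
}
```

A caller that receives `false` re-reads the incident (fresh version) and retries, exactly as in steps 1-4 above.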
1. Start MongoDB instances:

```bash
# Logs DB
docker run -d --name logs-mongo -p 27017:27017 \
  -e MONGO_INITDB_ROOT_USERNAME=root \
  -e MONGO_INITDB_ROOT_PASSWORD=rootpassword mongo:7

# Metadata DB
docker run -d --name metadata-mongo -p 27018:27017 \
  -e MONGO_INITDB_ROOT_USERNAME=root \
  -e MONGO_INITDB_ROOT_PASSWORD=rootpassword mongo:7
```

2. Start the Collector Service:

```bash
cd collector-service && ./gradlew bootRun
```

3. Start the Dashboard:

```bash
cd dashboard && npm install && npm run dev
```

Build and test:

```bash
# Backend
cd collector-service && ./gradlew clean build test

# Frontend
cd dashboard && npm run build && npm run lint
```

| Issue | Solution |
|---|---|
| Cannot connect to MongoDB | Run `docker-compose ps` to check containers |
| Dashboard shows "API Error" | Check the API with `curl http://localhost:9080/health` |
| Login fails | Re-run `./scripts/init-data.sh` |
```
api-monitoring-platform/
├── monitoring-library/          # Reusable tracking library
│   └── src/main/kotlin/com/monitoring/
│       ├── config/              # Spring auto-configuration
│       ├── interceptor/         # HTTP request interceptor
│       ├── ratelimit/           # Token bucket implementation
│       └── client/              # Collector HTTP client
│
├── collector-service/           # Central collector service
│   └── src/main/kotlin/com/collector/
│       ├── config/              # Dual MongoDB configuration
│       ├── controller/          # REST endpoints
│       ├── model/               # Domain models (see DB schemas)
│       ├── repository/          # Data access
│       └── security/            # JWT authentication
│
├── dashboard/                   # Next.js frontend
│   ├── app/                     # App router pages
│   ├── components/              # React components
│   └── lib/                     # API client & utilities
│
├── scripts/                     # Cross-platform setup scripts
│   ├── setup.ps1                # Windows setup
│   ├── setup.sh                 # Linux/macOS setup
│   ├── init-data.ps1            # Windows DB init
│   └── init-data.sh             # Linux/macOS DB init
│
├── docker-compose.yml           # Development stack
└── .env.example                 # Environment template
```
MIT License - see LICENSE for details.
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
- Issues: GitHub Issues
- Discussions: GitHub Discussions