@rodrigodlima
Contributor

Features and Improvements

🎯 Cache Metrics Monitoring

Integration of Redis Exporter (port 9121) to collect Redis metrics
Integration of Memcached Exporter (port 9150) to collect Memcached metrics
Addition of Telegraf for scraping Prometheus metrics from exporters
Telegraf configuration (
Cache metrics are now sent to InfluxDB every 10 seconds
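
For reviewers unfamiliar with what the exporters emit, here is a minimal sketch of reading a Prometheus text-format payload, roughly the kind of input Telegraf scrapes from ports 9121 and 9150. The `parse_metrics` helper and `SAMPLE_PAYLOAD` are hypothetical and only for illustration; the metric names follow the usual naming of the Redis and Memcached exporters, but which series this PR's Telegraf config actually forwards is an assumption.

```python
import re

# Matches `name{labels} value [timestamp]`; labels and timestamp are optional.
_SAMPLE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)(?:\s+-?\d+)?$'
)


def parse_metrics(text):
    """Return {(name, labels): value} for each sample line in the payload."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and `# HELP` / `# TYPE` comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue  # Ignore anything this simple pattern cannot read.
        name, labels, value = match.groups()
        samples[(name, labels or "")] = float(value)
    return samples


# Hypothetical payload, shaped like exporter output (values are made up).
SAMPLE_PAYLOAD = """\
# HELP redis_keyspace_hits_total keyspace_hits metric
# TYPE redis_keyspace_hits_total counter
redis_keyspace_hits_total 1200
redis_keyspace_misses_total 300
memcached_commands_total{command="get",status="hit"} 900
memcached_commands_total{command="get",status="miss"} 100
"""

metrics = parse_metrics(SAMPLE_PAYLOAD)
```

This is not a full parser for the exposition format (it does not unescape label values, for example); it is only meant to show the shape of the data that flows from the exporters toward InfluxDB.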

📊 Enhanced Grafana Dashboard

New "Cache Hit/Miss Rate Over Time" panel showing hit and miss rates in real time
4 new stat panels:
Redis Cache Hit Rate
Redis Cache Miss Rate
Memcached Cache Hit Rate
Memcached Cache Miss Rate
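
As a rough illustration of what the hit/miss panels represent: exporter counters only ever increase, so a rate over a time window is computed from the difference between two readings. The sketch below assumes that definition (hits divided by hits plus misses over the window); the actual dashboard queries may differ, and `hit_and_miss_rate` is a hypothetical helper name.

```python
def hit_and_miss_rate(hits_before, misses_before, hits_after, misses_after):
    """Return (hit_rate, miss_rate) for the window between two counter readings.

    Returns None when there were no lookups in the window, or when a counter
    went backwards (for example after a cache restart reset it to zero).
    """
    hits = hits_after - hits_before
    misses = misses_after - misses_before
    if hits < 0 or misses < 0:
        return None
    total = hits + misses
    if total == 0:
        return None
    hit_rate = hits / total
    return hit_rate, 1.0 - hit_rate


# Example: 90 hits and 10 misses observed between two scrapes.
rates = hit_and_miss_rate(1000, 200, 1090, 210)
```

Both rates fall in the 0-1 range, the same scale used by the success rate gauge.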

Improved visual thresholds for response times (P95 and P99):
🟢 Green (0-50ms): Excellent
🟡 Yellow (50-100ms): Good
🟠 Orange (100-150ms): Attention needed
🔴 Red (>150ms): Problematic
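
The bands above can be expressed as a small lookup, sketched here for illustration. One assumption to flag: the list does not say which band a value exactly on a boundary belongs to, so this sketch assigns boundary values (50, 100, 150) to the higher band. `classify_latency` is a hypothetical name, not something defined in this PR.

```python
from bisect import bisect_right

# Upper bounds of the green, yellow, and orange bands, in milliseconds.
_BOUNDS_MS = [50, 100, 150]
_BANDS = [
    ("green", "Excellent"),
    ("yellow", "Good"),
    ("orange", "Attention needed"),
    ("red", "Problematic"),
]


def classify_latency(ms):
    """Map a P95/P99 response time in milliseconds to (color, meaning)."""
    if ms < 0:
        raise ValueError("latency cannot be negative")
    # bisect_right counts how many bounds are <= ms, which selects the band.
    # Assumption: a value exactly on a boundary goes to the higher band.
    return _BANDS[bisect_right(_BOUNDS_MS, ms)]
```

For example, `classify_latency(30)` falls in the green band and `classify_latency(120)` in the orange band.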

Fixed min/max limits for success rate gauge (0-1)

⚙️ Resource Limits
Defined CPU and memory limits for all containers:
App: 1 CPU, 1GB RAM
Redis: 0.25 CPU, 256MB RAM
Memcached: 0.25 CPU, 128MB RAM
InfluxDB: 0.5 CPU, 512MB RAM
Grafana: 0.5 CPU, 512MB RAM
K6: 0.5 CPU, 256MB RAM
Redis Exporter: 0.1 CPU, 64MB RAM
Memcached Exporter: 0.1 CPU, 64MB RAM
Telegraf: 0.25 CPU, 256MB RAM
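
To check whether the full stack fits on a given machine, the limits above can be summed. The sketch below copies the values from the list and assumes 1GB means 1024MB; the `LIMITS` table and `total_budget` helper are hypothetical names for illustration.

```python
# (CPU limit, memory limit in MB) per container, copied from the list above.
# Assumption: "1GB" is treated as 1024 MB.
LIMITS = {
    "App": (1.0, 1024),
    "Redis": (0.25, 256),
    "Memcached": (0.25, 128),
    "InfluxDB": (0.5, 512),
    "Grafana": (0.5, 512),
    "K6": (0.5, 256),
    "Redis Exporter": (0.1, 64),
    "Memcached Exporter": (0.1, 64),
    "Telegraf": (0.25, 256),
}


def total_budget(limits):
    """Return (total CPUs, total memory in MB) across all containers."""
    cpus = sum(cpu for cpu, _ in limits.values())
    memory_mb = sum(mem for _, mem in limits.values())
    # Round to hide floating-point noise from adding values like 0.1.
    return round(cpus, 2), memory_mb


cpus, memory_mb = total_budget(LIMITS)
```

With these values the stack is capped at 3.45 CPUs and 3072MB (3GB) of memory in total.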

📚 Expanded Documentation
Complete rewrite of benchmark/README.md
Detailed explanation of monitoring architecture
Documentation about percentiles (P95, P99) and their importance
Visual diagram of metrics flow
Troubleshooting guide for common issues
Instructions for viewing cache metrics
Explanation of visual thresholds in Grafana
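
As a quick illustration of why the documentation focuses on P95 and P99 rather than the average, here is a minimal sketch using the nearest-rank percentile method. This is a simplified stand-in: the load-testing tool may compute percentiles differently (for example with interpolation), and the sample latencies are made up.

```python
def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it.

    `p` is an integer between 1 and 100.
    """
    if not values:
        raise ValueError("no samples")
    if not 1 <= p <= 100:
        raise ValueError("p must be between 1 and 100")
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical sample: 19 fast requests and one slow outlier (milliseconds).
latencies_ms = [20] * 19 + [400]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p95_ms = percentile(latencies_ms, 95)
p99_ms = percentile(latencies_ms, 99)
```

In this sample the mean is 39ms and P95 is 20ms, while P99 is 400ms: the single slow request only shows up in the high percentile, which is the kind of tail behavior P99 is meant to expose.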

🔧 Infrastructure Improvements
Upgraded InfluxDB to version 1.12.2 (previously 1.8)
Disabled Grafana's forced change of the default password in the development environment
Added appropriate health checks and dependencies between containers
