Add comprehensive monitoring infrastructure with cache metrics and resource limits #23

rodrigodlima · 2025-10-20T11:46:24Z

Features and Improvements

🎯 Cache Metrics Monitoring

Integration of Redis Exporter (port 9121) to collect Redis metrics
Integration of Memcached Exporter (port 9150) to collect Memcached metrics
Addition of Telegraf for scraping Prometheus metrics from exporters
Telegraf configuration (
Cache metrics now sent to InfluxDB every 10 seconds
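As a rough illustration of what the hit/miss panels derive from the exporter output, the sketch below parses a Prometheus text-format sample and computes a hit rate. The metric names `redis_keyspace_hits_total` and `redis_keyspace_misses_total` are assumed to be what the Redis Exporter exposes, the sample values are made up, and the helper names are hypothetical. None of this is taken from the PR's Telegraf configuration.

```python
def parse_samples(exposition_text):
    """Parse Prometheus text-format lines into {series: value}.

    Assumes samples carry no trailing timestamp, so the value is the
    last space-separated token on each line.
    """
    samples = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def hit_rate(hits, misses):
    """Return hits / (hits + misses), or 0.0 when there were no lookups."""
    total = hits + misses
    return hits / total if total else 0.0


# Illustrative sample only; real exporter output contains many more series.
REDIS_SAMPLE = """\
# TYPE redis_keyspace_hits_total counter
redis_keyspace_hits_total 900
# TYPE redis_keyspace_misses_total counter
redis_keyspace_misses_total 100
"""

samples = parse_samples(REDIS_SAMPLE)
redis_hit_rate = hit_rate(
    samples["redis_keyspace_hits_total"],
    samples["redis_keyspace_misses_total"],
)
print(f"{redis_hit_rate:.2%}")  # prints 90.00%
```

Both counters are cumulative, so this gives the lifetime hit rate. A rate over time would instead use the difference between two consecutive scrapes.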

📊 Enhanced Grafana Dashboard

New "Cache Hit/Miss Rate Over Time" panel showing hit/miss rates in real-time
4 new stat panels:
Redis Cache Hit Rate
Redis Cache Miss Rate
Memcached Cache Hit Rate
Memcached Cache Miss Rate

Improved visual thresholds for response times (P95 and P99):
🟢 Green (0-50ms): Excellent
🟡 Yellow (50-100ms): Good
🟠 Orange (100-150ms): Attention needed
🔴 Red (>150ms): Problematic
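The bands above can be written as a small lookup, sketched here for reference. The boundary handling is an assumption: each upper bound is treated as inclusive so that exactly 150 ms stays orange, consistent with red being ">150ms". Grafana's own threshold steps may resolve exact boundary values differently. The function name is hypothetical.

```python
# Upper bound (ms, inclusive by assumption), color, label; taken from the list above.
BANDS = [
    (50.0, "green", "Excellent"),
    (100.0, "yellow", "Good"),
    (150.0, "orange", "Attention needed"),
]


def classify_latency(ms):
    """Map a P95/P99 response time in milliseconds to its color band."""
    if ms < 0:
        raise ValueError("latency cannot be negative")
    for upper, color, label in BANDS:
        if ms <= upper:
            return color, label
    return "red", "Problematic"


print(classify_latency(42))   # ('green', 'Excellent')
print(classify_latency(180))  # ('red', 'Problematic')
```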

Fixed min/max limits for success rate gauge (0-1)

⚙️ Resource Limits
Defined CPU and memory limits for all containers:
App: 1 CPU, 1GB RAM
Redis: 0.25 CPU, 256MB RAM
Memcached: 0.25 CPU, 128MB RAM
InfluxDB: 0.5 CPU, 512MB RAM
Grafana: 0.5 CPU, 512MB RAM
K6: 0.5 CPU, 256MB RAM
Redis Exporter: 0.1 CPU, 64MB RAM
Memcached Exporter: 0.1 CPU, 64MB RAM
Telegraf: 0.25 CPU, 256MB RAM
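To check that the whole stack fits on a given host, the limits listed above can be summed. The sketch below does that arithmetic, assuming 1GB means 1024MB. The dictionary is a transcription of the list, not the compose file itself.

```python
from fractions import Fraction

# (CPU limit, memory limit in MB), transcribed from the list above.
# Assumption: "1GB" is treated as 1024 MB.
LIMITS = {
    "App": ("1", 1024),
    "Redis": ("0.25", 256),
    "Memcached": ("0.25", 128),
    "InfluxDB": ("0.5", 512),
    "Grafana": ("0.5", 512),
    "K6": ("0.5", 256),
    "Redis Exporter": ("0.1", 64),
    "Memcached Exporter": ("0.1", 64),
    "Telegraf": ("0.25", 256),
}

# Fractions avoid floating-point drift when adding values such as 0.1.
total_cpu = sum(Fraction(cpu) for cpu, _ in LIMITS.values())
total_mem_mb = sum(mem for _, mem in LIMITS.values())

print(float(total_cpu), total_mem_mb)  # prints 3.45 3072
```

Under these limits the full stack can use at most 3.45 CPUs and 3072 MB of memory when every container is at its ceiling.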

📚 Expanded Documentation
Complete rewrite of [benchmark/README.md]
Detailed explanation of monitoring architecture
Documentation about percentiles (P95, P99) and their importance
Visual diagram of metrics flow
Troubleshooting guide for common issues
Instructions for viewing cache metrics
Explanation of visual thresholds in Grafana
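A small example of why P95 and P99 matter alongside the mean, using made-up latencies. The nearest-rank definition used here is one common convention and is an assumption; k6 and Grafana may compute percentiles differently (for example with interpolation).

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical run: 95 fast requests and 5 slow ones.
latencies_ms = [20] * 95 + [400] * 5

mean_ms = sum(latencies_ms) / len(latencies_ms)
print(mean_ms)                       # 39.0
print(percentile(latencies_ms, 95))  # 20
print(percentile(latencies_ms, 99))  # 400
```

In this made-up run the mean of 39 ms looks healthy and P95 is still 20 ms, while P99 shows the 400 ms tail that a few requests actually experience.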

🔧 Infrastructure Improvements
Upgraded InfluxDB to version 1.12.2 (previously 1.8)
Disabled Grafana default password change enforcement for development environment
Added appropriate health checks and dependencies between containers


rodrigodlima added 7 commits October 19, 2025 18:14

Add resource limits for containers

c0ed508

Add resource limits for containers

42c549b

Modify doc to explain benchmark tests

d351d82

Remove files

880e8dd

Add telegraf configuration

4d68473

Update documentation

4ad6f2e

Enhance Bitonic comparison dashboard with additional metrics and thresholds for cache hit/miss rates

cea3b1b
