This repository was archived by the owner on Jun 2, 2026. It is now read-only.

feat: add Prometheus and Grafana monitoring stack#58

Merged: 5000user5000 merged 1 commit into dev from feature/production-like-monitoring on May 30, 2026

@5000user5000
Owner

Summary

Adds a production-like observability stack for the local Docker Compose environment.

This introduces Spring Boot Actuator and Prometheus metrics for the backend, plus Prometheus, Grafana, PostgreSQL exporter, cAdvisor, and Nginx exporter in deploy/production-like.

Changes

  • Add backend Actuator and Micrometer Prometheus dependencies.
  • Expose backend health, readiness, metrics, and Prometheus endpoints.
  • Allow unauthenticated access to Actuator health and Prometheus scrape endpoints.
  • Change production-like backend healthcheck to use Actuator readiness.
  • Add Prometheus scrape config for:
    • backend
    • PostgreSQL exporter
    • cAdvisor
    • Nginx exporter
  • Add Grafana datasource and dashboard provisioning.
  • Add a production-like dashboard covering:
    • service health
    • API traffic
    • backend runtime
    • database metrics
    • container metrics
    • Nginx proxy metrics
  • Update production-like deployment documentation.
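
The Actuator exposure described in the list above could look roughly like the following. This is a minimal sketch, not the PR's actual configuration: the file name `application.yml` and the exact set of exposed endpoints are assumptions, and only standard Spring Boot property keys are used.

```yaml
# Hypothetical application.yml fragment; the PR's real settings are not shown on this page.
management:
  endpoints:
    web:
      exposure:
        # Expose only the endpoints named in the change list.
        include: health,metrics,prometheus
  endpoint:
    health:
      probes:
        # Adds the /actuator/health/liveness and /actuator/health/readiness groups,
        # which a Compose healthcheck can poll.
        enabled: true
```

The `prometheus` endpoint only appears when the Micrometer Prometheus registry is on the classpath, which matches the dependency change listed first.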

Validation

  • cd backend && mvn clean verify
  • cd frontend && npm run test:coverage
  • docker-compose -f deploy/production-like/compose.yml up -d --build --scale backend=2
  • Verified Prometheus targets are UP:
    • cloud-native-backend
    • cloud-native-postgres
    • cloud-native-containers
    • cloud-native-nginx
  • Verified Grafana dashboard provisioning.
  • Verified root SonarQube Quality Gate passes.
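
For reference, a scrape configuration that would produce the four targets listed above might resemble this sketch. It assumes the target names are Prometheus job names; every service host name and port is an assumption (the ports are the exporters' usual defaults), and the actual file in deploy/production-like may differ.

```yaml
# Hypothetical prometheus.yml fragment. Job names come from the validation list;
# all host names and ports are assumptions.
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: cloud-native-backend
    metrics_path: /actuator/prometheus
    # DNS discovery so each replica started with --scale backend=2 is scraped separately.
    dns_sd_configs:
      - names: ["backend"]
        type: A
        port: 8080
  - job_name: cloud-native-postgres
    static_configs:
      - targets: ["postgres-exporter:9187"]
  - job_name: cloud-native-containers
    static_configs:
      - targets: ["cadvisor:8080"]
  - job_name: cloud-native-nginx
    static_configs:
      - targets: ["nginx-exporter:9113"]
```

A plain `static_configs` entry for the backend would also work, but it would typically resolve to a single replica per scrape, which is why DNS discovery is used in this sketch.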

@vercel

vercel Bot commented May 30, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| cloud-native | Ready | Preview, Comment | May 30, 2026 4:25am |

@5000user5000
Owner Author

Monitored Items

Service Health

Backend up
PostgreSQL up
Nginx exporter up
Container metrics up

Purpose: first confirm that the system's main components are still alive. This is the first step in troubleshooting.

API Traffic

HTTP request rate
HTTP 5xx error rate
Average HTTP latency
Nginx request rate

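Purpose: observe user traffic, whether the API is slowing down, and whether server errors are occurring. These metrics reflect the user experience most directly.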
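
The four API traffic panels listed above could be backed by queries like the ones in this sketch, written here as Prometheus recording rules. The rule and group names are hypothetical, and the dashboard's real queries are not shown in this PR. The metric names are the ones Micrometer and the Nginx Prometheus exporter usually emit, so they are an assumption about this deployment.

```yaml
# Hypothetical Prometheus rules file; rule and group names are illustrative only.
groups:
  - name: api-traffic-sketch
    rules:
      # HTTP request rate (requests per second over 5 minutes).
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_server_requests_seconds_count[5m]))
      # HTTP 5xx error rate.
      - record: job:http_requests_5xx:rate5m
        expr: sum by (job) (rate(http_server_requests_seconds_count{status=~"5.."}[5m]))
      # Average HTTP latency: total time spent divided by request count.
      - record: job:http_request_latency_seconds:avg5m
        expr: |
          sum by (job) (rate(http_server_requests_seconds_sum[5m]))
          /
          sum by (job) (rate(http_server_requests_seconds_count[5m]))
      # Nginx request rate.
      - record: job:nginx_http_requests:rate5m
        expr: sum by (job) (rate(nginx_http_requests_total[5m]))
```

An average hides tail latency, so a histogram-based percentile would be a natural follow-up if the backend publishes histogram buckets.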

Backend Runtime

JVM memory used
Process CPU usage
HikariCP active / idle connections

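Purpose: check whether the Spring Boot backend itself is under resource pressure, for example excessive memory use, high CPU usage, or a DB connection pool that is close to exhaustion.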

Database

PostgreSQL connections
PostgreSQL commits / rollbacks

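Purpose: observe whether the DB is available, whether the connection count is abnormal, and whether transactions are proceeding normally. The DB is a core dependency of the system; if it fails, most APIs are directly affected.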

Container And Proxy

Container CPU usage
Container memory usage
Nginx active / reading / writing / waiting connections

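Purpose: observe resource usage at the Docker container layer and whether the reverse proxy is under connection pressure. This helps determine whether a problem lies in the app, the DB, container resources, or the gateway/proxy layer.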

Wording for the Report

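We first monitor whether the services are available, then the API latency and error rate that users feel directly, and then the backend runtime, database, container resources, and reverse proxy. This lets us determine whether a problem originates in the application, the database, container resources, or the traffic entry layer.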

Owner Author

@5000user5000 5000user5000 left a comment

This will not be merged into main for now.
It mainly serves as a demonstration of observability.

5000user5000 merged commit f8129cc into dev on May 30, 2026
4 checks passed
5000user5000 deleted the feature/production-like-monitoring branch on May 30, 2026 04:27