fix: Simplify and remove load from /livez#39070
Conversation
|
Looks like this PR is not ready to merge, because of the following issues:
Please fix the issues and try again If you have any trouble, please check the PR guidelines |
🦋 Changeset detectedLatest commit: 259eceb The changes in this PR will be included in the next version bump. This PR includes changesets to release 41 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughThe changes implement Kubernetes health probe optimizations by splitting health check responsibilities: Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Suggested labels
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
apps/meteor/server/routes/health.ts (1)
116-131: Trim added implementation comments to match repo style guidance.The newly expanded endpoint rationale comments should be minimized or moved to docs to keep implementation lean.
As per coding guidelines: "Avoid code comments in the implementation".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/meteor/server/routes/health.ts` around lines 116 - 131, The added multi-line explanatory comments above the /livez handler and before the readiness probe are too verbose for in-code docs; reduce them to a single-line summary or remove them and move detailed rationale to external docs. Specifically, trim the block comment describing liveness/readiness to one concise line (e.g., "Liveness probe: verifies process responds" and "Readiness probe: verifies service readiness") or delete it around the WebApp.rawHandlers.use('/livez', ...) and the readiness probe comment area so only minimal, repo-style comments remain.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@apps/meteor/server/routes/health.ts`:
- Around line 116-131: The added multi-line explanatory comments above the
/livez handler and before the readiness probe are too verbose for in-code docs;
reduce them to a single-line summary or remove them and move detailed rationale
to external docs. Specifically, trim the block comment describing
liveness/readiness to one concise line (e.g., "Liveness probe: verifies process
responds" and "Readiness probe: verifies service readiness") or delete it around
the WebApp.rawHandlers.use('/livez', ...) and the readiness probe comment area
so only minimal, repo-style comments remain.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
.changeset/optimize-health-probes-for-k8s.mdapps/meteor/server/routes/health.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: cubic · AI code reviewer
- GitHub Check: CodeQL-Build
- GitHub Check: CodeQL-Build
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx,js}
📄 CodeRabbit inference engine (.cursor/rules/playwright.mdc)
**/*.{ts,tsx,js}: Write concise, technical TypeScript/JavaScript with accurate typing in Playwright tests
Avoid code comments in the implementation
Files:
apps/meteor/server/routes/health.ts
🧠 Learnings (4)
📓 Common learnings
Learnt from: geekgonecrazy
Repo: RocketChat/Rocket.Chat PR: 36955
File: apps/meteor/server/routes/health.ts:101-111
Timestamp: 2025-09-22T21:37:05.164Z
Learning: In Rocket.Chat's health check implementation, the /livez endpoint intentionally performs the same checks as /readyz rather than being a lightweight process check. This design ensures that pods stuck in a permanently unready state will eventually be terminated by the liveness probe, with orchestrator configuration controlling the different failure thresholds and timeouts between readiness and liveness checks.
Learnt from: ahmed-n-abdeltwab
Repo: RocketChat/Rocket.Chat PR: 38974
File: apps/meteor/app/api/server/v1/im.ts:220-221
Timestamp: 2026-02-24T19:09:09.561Z
Learning: In RocketChat/Rocket.Chat OpenAPI migration PRs for apps/meteor/app/api/server/v1 endpoints, maintainers prefer to avoid any logic changes; style-only cleanups (like removing inline comments) may be deferred to follow-ups to keep scope tight.
Learnt from: ggazzo
Repo: RocketChat/Rocket.Chat PR: 35995
File: apps/meteor/app/api/server/v1/rooms.ts:1107-1112
Timestamp: 2026-02-23T17:53:18.785Z
Learning: In Rocket.Chat PR reviews, maintain strict scope boundaries—when a PR is focused on a specific endpoint (e.g., rooms.favorite), avoid reviewing or suggesting changes to other endpoints that were incidentally refactored (e.g., rooms.invite) unless explicitly requested by maintainers.
📚 Learning: 2025-09-22T21:37:05.164Z
Learnt from: geekgonecrazy
Repo: RocketChat/Rocket.Chat PR: 36955
File: apps/meteor/server/routes/health.ts:101-111
Timestamp: 2025-09-22T21:37:05.164Z
Learning: In Rocket.Chat's health check implementation, the /livez endpoint intentionally performs the same checks as /readyz rather than being a lightweight process check. This design ensures that pods stuck in a permanently unready state will eventually be terminated by the liveness probe, with orchestrator configuration controlling the different failure thresholds and timeouts between readiness and liveness checks.
Applied to files:
.changeset/optimize-health-probes-for-k8s.mdapps/meteor/server/routes/health.ts
📚 Learning: 2026-02-24T19:09:09.561Z
Learnt from: ahmed-n-abdeltwab
Repo: RocketChat/Rocket.Chat PR: 38974
File: apps/meteor/app/api/server/v1/im.ts:220-221
Timestamp: 2026-02-24T19:09:09.561Z
Learning: In RocketChat/Rocket.Chat OpenAPI migration PRs for apps/meteor/app/api/server/v1 endpoints, maintainers prefer to avoid any logic changes; style-only cleanups (like removing inline comments) may be deferred to follow-ups to keep scope tight.
Applied to files:
.changeset/optimize-health-probes-for-k8s.md
📚 Learning: 2026-02-24T19:05:56.710Z
Learnt from: ahmed-n-abdeltwab
Repo: RocketChat/Rocket.Chat PR: 0
File: :0-0
Timestamp: 2026-02-24T19:05:56.710Z
Learning: In Rocket.Chat PRs, keep feature PRs free of unrelated lockfile-only dependency bumps; prefer reverting lockfile drift or isolating such bumps into a separate "chore" commit/PR, and always use yarn install --immutable with the Yarn version pinned in package.json via Corepack.
Applied to files:
.changeset/optimize-health-probes-for-k8s.md
🔇 Additional comments (3)
.changeset/optimize-health-probes-for-k8s.md (1)
1-9: Changeset summary is clear and release-friendly.This correctly communicates the probe behavior split and expected operational benefit.
apps/meteor/server/routes/health.ts (2)
41-42: Nice refactor: shared histogram + readiness-specific naming improves clarity.The
performReadinessChecks()rename and single histogram usage reduce duplication while keeping readiness checks explicit.Also applies to: 93-100, 134-134
120-125: The current/livezendpoint design is intentional and follows standard Kubernetes probe patterns. The code explicitly separates liveness (lightweight process check) from readiness (comprehensive dependency checks), with comments documenting this design. This architecture is correct—liveness probes should be minimal to avoid cascading failures, while readiness probes handle resource/dependency validation. The review comment incorrectly assumes/livezshould mirror/readyz, but they are intentionally different by design.Likely an incorrect or invalid review comment.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## develop #39070 +/- ##
===========================================
+ Coverage 70.65% 70.83% +0.18%
===========================================
Files 3190 3193 +3
Lines 112732 113257 +525
Branches 20431 20584 +153
===========================================
+ Hits 79649 80225 +576
+ Misses 31035 30981 -54
- Partials 2048 2051 +3
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
Proposed changes (including videos or screenshots)
This PR refactors the Kubernetes health probe endpoints (
/livezand/readyz) to follow SRE best practices and reduce unnecessary database load.Problem
Both
/livezand/readyzendpoints were performing identical health checks including MongoDB connectivity, memory usage, and event loop lag. This caused:Solution
Implemented the SRE standard "dumb liveness probe" pattern:
/livez(Liveness Probe)200 OKimmediately without any health checks/readyz(Readiness Probe)Code Changes
/livezendpoint to immediately return successeventLoopHistogramLiveness(only need one for readiness)performHealthChecks()→performReadinessChecks()for clarityBenefits
Issue(s)
Steps to test or reproduce
Summary by CodeRabbit
Release Notes