fix(health): gate memory severity on RSS floor #160

Merged
rohitg00 merged 2 commits into main from
fix/158-health-rss-gate
Apr 18, 2026

@rohitg00 rohitg00 commented Apr 18, 2026

Fixes #158.

Problem

evaluateHealth decided memory severity from heapUsed / heapTotal alone. Node's V8 heap naturally fills its current allocation, so a small steady-state process with ~45 MB heapUsed, ~46 MB heapTotal, and ~120 MB RSS reported memory_critical_97% even though the service was healthy and the host had plenty of RAM.
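The arithmetic behind the false positive can be reproduced in a few lines. This is a minimal sketch, not the project's code: `heapPercent` is a hypothetical helper, the flooring is an assumption chosen to match the reported `97%`, and the 95% critical threshold is inferred from the warn band (80-95%) quoted later in this thread.

```typescript
// Hypothetical helper: heap usage as a whole-number percentage.
// Flooring is an assumption; the real implementation may round differently.
function heapPercent(heapUsed: number, heapTotal: number): number {
  return Math.floor((heapUsed * 100) / heapTotal);
}

const MB = 1024 * 1024;

// Approximate numbers from the reporter's deployment (issue #158).
const sample = { heapUsed: 45 * MB, heapTotal: 46 * MB, rss: 120 * MB };

const percent = heapPercent(sample.heapUsed, sample.heapTotal);

// Assumption: inferred critical threshold, not confirmed by the source.
const assumedCriticalPercent = 95;

// The ratio alone crosses the critical line even though RSS is only 120 MB.
const ratioOnlyCritical = percent > assumedCriticalPercent;
console.log(percent, ratioOnlyCritical);
```

The ratio says nothing about absolute size: a 46 MB heap that is nearly full looks identical to a 4 GB heap that is nearly full.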

Fix

Require two signals to degrade memory status: high heap ratio AND RSS above memoryRssFloorBytes (default 512 MB, configurable). When heap ratio is high but RSS is still small, record a non-alerting memory_heap_tight_NN%_rssMMmb note so the info is captured without inflating status.
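The two-signal gate can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in src/health/thresholds.ts: `evaluateMemory`, `MemorySnapshot`, and `MemoryThresholds` are hypothetical names, the 80/95 percent thresholds are inferred from the review discussion, and the heap-tight branch follows the behavior described in the merged walkthrough (any heap ratio above the warn threshold).

```typescript
type Status = "healthy" | "degraded" | "critical";

interface MemorySnapshot {
  heapUsed: number;
  heapTotal: number;
  rss: number;
}

interface MemoryThresholds {
  memoryWarnPercent: number;
  memoryCriticalPercent: number;
  memoryRssFloorBytes: number;
}

// Only the 512 MB floor is stated in the PR; 80 and 95 are assumptions.
const DEFAULT_THRESHOLDS: MemoryThresholds = {
  memoryWarnPercent: 80,
  memoryCriticalPercent: 95,
  memoryRssFloorBytes: 512 * 1024 * 1024,
};

// Hypothetical stand-in for the memory portion of evaluateHealth.
function evaluateMemory(
  memory: MemorySnapshot,
  overrides: Partial<MemoryThresholds> = {},
): { status: Status; alerts: string[] } {
  const cfg = { ...DEFAULT_THRESHOLDS, ...overrides };
  const memPercent = Math.floor((memory.heapUsed * 100) / memory.heapTotal);
  const memMb = Math.round(memory.rss / (1024 * 1024));
  const rssAboveFloor = memory.rss >= cfg.memoryRssFloorBytes;
  const alerts: string[] = [];
  let status: Status = "healthy";

  if (memPercent > cfg.memoryCriticalPercent && rssAboveFloor) {
    status = "critical";
    alerts.push(`memory_critical_${memPercent}%_rss${memMb}mb`);
  } else if (memPercent > cfg.memoryWarnPercent && rssAboveFloor) {
    status = "degraded";
    alerts.push(`memory_warn_${memPercent}%_rss${memMb}mb`);
  } else if (memPercent > cfg.memoryWarnPercent) {
    // Heap is tight but RSS is below the floor: record a note, stay healthy.
    alerts.push(`memory_heap_tight_${memPercent}%_rss${memMb}mb`);
  }
  return { status, alerts };
}

const MiB = 1024 * 1024;
const tiny = evaluateMemory({ heapUsed: 45 * MiB, heapTotal: 46 * MiB, rss: 120 * MiB });
const big = evaluateMemory({ heapUsed: 970 * MiB, heapTotal: 1000 * MiB, rss: 1100 * MiB });
const warm = evaluateMemory(
  { heapUsed: 850 * MiB, heapTotal: 1000 * MiB, rss: 900 * MiB },
  { memoryRssFloorBytes: 800 * MiB },
);
console.log(tiny, big, warm);
```

Under these assumptions the three calls mirror the scenarios listed in this PR: the reporter's small process stays healthy with a heap-tight note, the large process is critical, and the warn-level process above a caller-supplied floor is degraded.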

Tests

test/health-thresholds.test.ts — four cases:

  • Reporter's live example (45/46 MB heap, 120 MB RSS) → healthy, carries memory_heap_tight_*.
  • 970/1000 MB heap, 1100 MB RSS → critical.
  • 850/1000 MB heap, 900 MB RSS (RSS floor 800 MB) → degraded.
  • Honors caller-supplied memoryRssFloorBytes.

All 758 existing tests still pass.

Summary by CodeRabbit

  • New Features

    • Introduced an RSS memory floor threshold (default 512 MB) for refined memory health monitoring.
    • Memory critical and warning alerts now require RSS to meet the configured floor before triggering.
    • Added new heap-only "memory_heap_tight" alerts to surface heap pressure when RSS is below the floor.
  • Tests

    • Added comprehensive tests covering memory health outcomes across RSS/heap scenarios and custom floor values.

Previously heapUsed/heapTotal ratio alone decided memory severity, so
a small steady-state Node process whose V8 heap naturally fills its
46 MB allocation could report memory_critical_97% while the RSS sat
at 120 MB and the host had plenty of free RAM.

Require a second signal before degrading overall health: heap ratio
above threshold AND RSS above memoryRssFloorBytes (default 512 MB).
When heap is tight but RSS is below the floor, record a non-alerting
memory_heap_tight_NN% note so the info is still visible in the
snapshot without inflating status to degraded/critical.

New tests cover the reporter's live-deployment example plus the
critical / degraded / healthy branches around the RSS floor.

coderabbitai bot commented Apr 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d942ec7c-1f82-45a4-acd0-690b9c9fadfb

📥 Commits

Reviewing files that changed from the base of the PR and between c57acd7 and 0f09e97.

📒 Files selected for processing (2)
  • src/health/thresholds.ts
  • test/health-thresholds.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/health/thresholds.ts
  • test/health-thresholds.test.ts

Walkthrough

Added a configurable RSS floor to memory health evaluation (default 512 MiB). Critical/warn memory alerts now require both heap ratio thresholds and RSS at/above the floor; high heap with low RSS emits a new heap-only memory_heap_tight_* alert without changing overall status.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Health threshold logic: src/health/thresholds.ts | Added memoryRssFloorBytes to ThresholdConfig and DEFAULTS (512 * 1024 * 1024). Memory evaluation now reads snapshot.memory.rss, computes rssAboveFloor, appends _rss{memMb}mb to alert names, gates memory_critical_* and memory_warn_* on RSS >= floor, and emits a new heap-only memory_heap_tight_{%}_rss{memMb}mb alert when heap warns but RSS is below the floor (without setting critical/degraded). |
| Tests: test/health-thresholds.test.ts | New Vitest suite exercising evaluateHealth across scenarios: low heap/low RSS (healthy, no critical/warn, may include memory_heap_tight_*), high heap with RSS above floor (critical + memory_critical_*), warn-level heap with RSS below floor (healthy + memory_heap_tight_*), warn-level heap with RSS above floor (degraded + memory_warn_*), and caller-supplied memoryRssFloorBytes affecting outcomes. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibble through the memory seam,

RSS floors guard my dream.
Heap may crowd its tiny bed,
But alarms now sleep instead.
A tighter hop, a gentler chime—alerts wait for RSS and time.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: gating memory severity on RSS floor to prevent false critical alerts from high heap ratio alone. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #158 by requiring both high heap ratio AND elevated RSS to trigger critical/degraded alerts, with heap-tight visibility via non-alerting notes. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to fixing the memory severity gating issue: threshold config updates, health evaluation logic modifications, and comprehensive test coverage. |



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
test/health-thresholds.test.ts (1)

1-79: LGTM — coverage for the new RSS-floor branching looks good.

The four cases cleanly exercise: heap-tight under the floor, critical above the floor, warn above a caller-supplied floor, and both directions of caller-supplied memoryRssFloorBytes. The snap() helper is a nice minimal fixture.

Optional: consider adding an assertion on the exact memory_heap_tight_... suffix (e.g. _rss120mb) in the issue-#158 case to lock in the message format that operators/dashboards will key off of.
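The difference between a prefix check and a format-locking check can be sketched as below. The alert strings are hypothetical values shaped like the issue-#158 case; the exact percentage, the `_rss120mb` suffix, and the variable names are assumptions for illustration, not output captured from the real test.

```typescript
// Hypothetical alerts, shaped like the issue-#158 case (values are assumed).
const currentAlerts: string[] = ["memory_heap_tight_97%_rss120mb"];
// Hypothetical regression where the RSS suffix was dropped from the name.
const driftedAlerts: string[] = ["memory_heap_tight_97%"];

// Loose check: passes for any heap-tight alert, whatever follows the prefix.
const loose = (list: string[]): boolean =>
  list.some((a) => a.startsWith("memory_heap_tight_"));

// Strict check: also pins the _rss<MB>mb suffix that dashboards may key off.
const strictPattern = /^memory_heap_tight_\d+%_rss120mb$/;
const strict = (list: string[]): boolean =>
  list.some((a) => strictPattern.test(a));

console.log(loose(currentAlerts), strict(currentAlerts));
console.log(loose(driftedAlerts), strict(driftedAlerts));
```

Only the strict form would catch a change to the alert name format, which is the point of the optional suggestion.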

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/health-thresholds.test.ts` around lines 1 - 79, Add a stricter assertion
in the "stays healthy when heap fills a tiny steady-state process (issue `#158`)"
test: after calling evaluateHealth(s) (using the snap helper) assert that alerts
includes the exact memory_heap_tight_... token you expect (e.g., the suffix
"_rss120mb") rather than only checking startsWith; update the test that
references evaluateHealth and the alerts array to verify the precise alert
string so downstream dashboards/operators can rely on the exact message format.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/health/thresholds.ts`:
- Around line 68-76: The current else-if only emits memory_heap_tight when
memPercent > cfg.memoryCriticalPercent, leaving no alert when memPercent is in
the warn band but rssAboveFloor is false; update the condition for the
memory_heap_tight branch in src/health/thresholds.ts (the block referencing
memPercent, cfg.memoryWarnPercent, cfg.memoryCriticalPercent, rssAboveFloor,
alerts, memMb, critical and degraded) to fire when memPercent >
cfg.memoryWarnPercent && !rssAboveFloor (i.e., cover warn and critical heap
ratios when RSS is below the floor) so a memory_heap_tight... alert is pushed
without changing degraded/critical flags.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0eddd642-dbfe-400e-a7f9-029a706e6086

📥 Commits

Reviewing files that changed from the base of the PR and between 329e7ca and c57acd7.

📒 Files selected for processing (2)
  • src/health/thresholds.ts
  • test/health-thresholds.test.ts

Comment thread src/health/thresholds.ts
CodeRabbit flagged that the 'visibility without inflating severity'
goal was only met at critical-level heap. When heap sat in the warn
band (80-95%) with RSS below the floor, no alert was produced at
all, so operators got zero signal that heap pressure was building.

Widen the heap_tight branch to cover the full memoryWarnPercent+
band. New test: 85% heap / 50 MB RSS stays healthy but records
memory_heap_tight_*.
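
The widening described in this commit can be sketched as a before/after pair of predicates. These are simplified, hypothetical helpers (`heapTightBefore`, `heapTightAfter`, `HeapTightInput`), assuming the 80/95 percent thresholds implied by the warn band above; in the real code the check is one branch of a larger if/else chain.

```typescript
interface HeapTightInput {
  memPercent: number;
  rssAboveFloor: boolean;
  memoryWarnPercent: number;
  memoryCriticalPercent: number;
}

// Before the follow-up commit, per the review: only critical-level heap.
function heapTightBefore(i: HeapTightInput): boolean {
  return i.memPercent > i.memoryCriticalPercent && !i.rssAboveFloor;
}

// After: any heap ratio above the warn threshold while RSS is below the floor.
function heapTightAfter(i: HeapTightInput): boolean {
  return i.memPercent > i.memoryWarnPercent && !i.rssAboveFloor;
}

// The new test's scenario: 85% heap, 50 MB RSS (below the 512 MB default floor).
const warnBand: HeapTightInput = {
  memPercent: 85,
  rssAboveFloor: false,
  memoryWarnPercent: 80, // assumption
  memoryCriticalPercent: 95, // assumption
};

console.log(heapTightBefore(warnBand), heapTightAfter(warnBand)); // false true
```

With the narrower predicate the warn-band case produced no alert at all; with the wider one it records a heap-tight note while status remains healthy.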
@rohitg00 rohitg00 merged commit 3037dd2 into main Apr 18, 2026
3 checks passed
@rohitg00 rohitg00 deleted the fix/158-health-rss-gate branch April 18, 2026 10:46
rohitg00 added a commit that referenced this pull request Apr 18, 2026
Bump version + ship CHANGELOG covering everything that merged since
v0.8.13:

- #118 security advisory drafts for v0.8.2 CVEs
- #132 semantic eviction routing + batched retention audit
- #157 iii console docs + vendored screenshots in README
- #160 (#158) health gated on RSS floor
- #161 (#159) standalone MCP proxies to the running server
- #162 (#125) mem::forget audit coverage + policy doc
- #163 (#62) @agentmemory/fs-watcher filesystem connector
- #164 Next.js website (website/ root, ship to Vercel)

Version bumps (8 files):
- package.json / package-lock.json (top + packages[''])
- plugin/.claude-plugin/plugin.json
- packages/mcp/package.json (self + ~0.9.0 dep pin)
- src/version.ts (union extended, assigned 0.9.0)
- src/types.ts (ExportData.version union)
- src/functions/export-import.ts (supportedVersions set)
- test/export-import.test.ts (export assertion)

Tests: 777 passing. Build clean.
@rohitg00 rohitg00 mentioned this pull request Apr 18, 2026

Development

Successfully merging this pull request may close these issues.

Health may report critical from heap ratio alone when RSS is still low
