fix(mcp-sse): decouple /health liveness from upstream readiness by George-RD · Pull Request #151 · verygoodplugins/automem

George-RD · 2026-04-24T13:16:23Z

Summary

PR #114's /health endpoint returns HTTP 503 whenever upstream AutoMem reports anything other than "healthy" — including any non-zero FalkorDB/Qdrant sync drift. Railway treats 503 as unavailable and blocks the deploy, so the SSE bridge becomes undeployable during brief upstream degradation even though the Node process itself is perfectly able to serve traffic.

In an eventually-consistent system, drift between FalkorDB writes and Qdrant embedding is normal under write load. Gating container liveness on zero drift causes the bridge to fail to deploy whenever writes are happening — defeating the point of a bridge (which should be able to serve errors upstream when its backend is unavailable).

This PR:

GET /health now always returns 200 when the Node process is able to serve HTTP. The response body still carries upstream status (status, upstream, upstream_error, upstream_details) for observability, so the richer info the PR fix: harden MCP bridge resilience, adopt stateless transport, and update cross-client docs #114 change wanted to expose is still there — just not as HTTP status.
GET /ready (new) keeps the old strict behavior: 200 iff upstream is healthy, 503 otherwise. For orchestrators that explicitly want to gate traffic on upstream availability (e.g. Kubernetes readiness probes), this is the right endpoint.
Railway's healthcheckPath should remain /health.

Test plan

Existing health tests updated — the "upstream unreachable" test now asserts 200 + degraded body (was 503)
New tests for /ready: 200 when upstream healthy, 503 when unreachable
All 13 node tests pass locally (node --test test/server.test.js)

Context

Surfaced while trying to ship the release 0.15.1 deployment on Railway — build succeeded, but every deploy was killed by healthcheck failing with 503, even though the Node process was up and the upstream was only showing minor sync drift (17 in-flight memories out of 29,012 — 0.06%, and self-healing). The same failure mode will happen on any deployment where upstream is temporarily degraded for any reason, which this PR addresses.

🤖 Generated with Claude Code

The /health endpoint introduced in verygoodplugins#114 returned HTTP 503 whenever the upstream AutoMem service reported anything other than "healthy" (e.g. FalkorDB/Qdrant disconnected, sync drift). Railway treats 503 as unavailable and blocks the deploy — so the SSE bridge became un-deployable any time its upstream was even partially degraded, even though the bridge itself was perfectly able to serve traffic and return useful errors. Split liveness from readiness: - GET /health now always returns 200 when the Node process is able to serve HTTP. Upstream status is still reported in the response body (status, upstream, upstream_error, upstream_details) for observability. - GET /ready keeps the strict behavior: 200 iff upstream is healthy, 503 otherwise. Use this for orchestrators that should gate traffic on upstream availability. Railway's healthcheckPath should remain /health. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Adjusts the MCP SSE bridge’s health semantics so container liveness is not blocked by transient upstream AutoMem degradation, while still providing a strict upstream-gated readiness signal for orchestrators.

Changes:

Changed GET /health to always return HTTP 200 while still reporting upstream status in the JSON body.
Added GET /ready to return 200 only when upstream is healthy, otherwise 503.
Updated and expanded Node tests to cover the new /health and /ready behaviors.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
mcp-sse-server/server.js	Makes `/health` always 200 and introduces `/ready` with upstream-gated status codes.
mcp-sse-server/test/server.test.js	Updates existing `/health` test expectations and adds `/ready` tests for healthy/unreachable upstream.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0429796502

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Address Copilot review on PR verygoodplugins#151: - /health: replace await with void so the first request never blocks on an upstream probe. The body still reports upstream/checked_at for observability; status is always 200 while the process can serve HTTP. - /ready: drop the checked_at guard so readiness always reflects current upstream state. healthProbePromise dedups concurrent hits, so upstream load is bounded by orchestrator cadence rather than per-request fan-out. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jack-arturo · 2026-04-24T15:21:21Z

Good catch @George-RD , thanks!!

Copilot AI review requested due to automatic review settings April 24, 2026 13:16

Copilot started reviewing on behalf of George-RD April 24, 2026 13:16 View session

Copilot AI reviewed Apr 24, 2026

View reviewed changes

Comment thread mcp-sse-server/server.js

Comment thread mcp-sse-server/server.js Outdated

chatgpt-codex-connector Bot reviewed Apr 24, 2026

View reviewed changes

Comment thread mcp-sse-server/server.js Outdated

jack-arturo merged commit 5bcfb8b into verygoodplugins:main Apr 24, 2026
6 checks passed

This was referenced Apr 24, 2026

chore(main): release 0.15.3 #154

Open

Deploy on Railway Keeps Failing verygoodplugins/mcp-automem#96

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(mcp-sse): decouple /health liveness from upstream readiness#151

fix(mcp-sse): decouple /health liveness from upstream readiness#151
jack-arturo merged 2 commits intoverygoodplugins:mainfrom
George-RD:fix/sse-health-liveness-probe

George-RD commented Apr 24, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

jack-arturo commented Apr 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

George-RD commented Apr 24, 2026

Summary

Test plan

Context

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

jack-arturo commented Apr 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants