Skip to content

Fix pg.instanode.dev monitor: use Upptime tcp-ping schema#4

Merged
mastermanas805 merged 1 commit into
masterfrom
fix/pg-tcp-ping-monitor
May 11, 2026
Merged

Fix pg.instanode.dev monitor: use Upptime tcp-ping schema#4
mastermanas805 merged 1 commit into
masterfrom
fix/pg-tcp-ping-monitor

Conversation

@mastermanas805
Copy link
Copy Markdown
Member

Summary

Uptime CI has been recording pg.instanode.dev:5432 as down with a permanent SSL error since the 7-service-bundle expansion. The endpoint is not broken — the monitor config is. This PR rewrites that one entry to match Upptime's documented raw-TCP probe schema.

Root cause

.upptimerc.yml had:

- name: Customer Postgres (pg.instanode.dev TLS handshake)
  url: https://pg.instanode.dev:5432
  method: TCP_PING
  tcpHostPort: "pg.instanode.dev:5432"
  expectedStatusCodes: [0, 400, 401, 426]

Neither method: TCP_PING nor tcpHostPort exists in Upptime's UpptimeConfig.sites schema. They were silently ignored, leaving Upptime to do its default action: curl https://pg.instanode.dev:5432. Curl opens a TCP connection, tries a TLS handshake against the raw Postgres wire-protocol port (which speaks StartupMessage / SSLRequest, not TLS-from-byte-zero), and gets back zero bytes — hence the log line in Setup CI:

Curl attempt 1/3 returned HTTP 0: Request failed: SSL connect error
Error does not appear transient, skipping further retries

Investigation evidence

The endpoint itself is healthy:

  • nc -zv pg.instanode.dev 5432succeeded
  • dig pg.instanode.dev152.42.154.144 (the cluster LoadBalancer)
  • kubectl get svc -n ingress-nginx ingress-nginx-controller shows port 5432:30213/TCP is exposed by the LoadBalancer
  • kubectl get cm -n ingress-nginx tcp-services -o yaml routes port 5432 to instant/instant-pg-proxy:5432
  • kubectl logs -n instant deploy/instant-pg-proxy shows both replicas listening with the configured route prefix and fallback backend
  • kubectl get svc -A | grep postgres confirms the service exists in instant-data (postgres-customers) and the proxy in instant (instant-pg-proxy)

So: the port is open, the proxy is running, customer Postgres provisioning works. Only the monitor was lying.

The fix

Switch to Upptime's official raw-TCP probe shape:

- name: Customer Postgres (pg.instanode.dev TCP)
  check: "tcp-ping"
  url: pg.instanode.dev
  port: 5432

check: "tcp-ping" is the supported value in UpptimeConfig.sites[].check and is handled by a dedicated TCP-ping branch in update.ts (lines 350-426) that bypasses curl entirely. The url becomes a bare hostname, the port goes in its own field, and there is no synthetic expectedStatusCodes fallback — the check passes when the TCP connect succeeds.

Test plan

  • Merge to master — Setup CI runs on push and seeds the new site entry.
  • Trigger Uptime CI manually from the Actions tab (or wait 5 min for the next cron run).
  • Confirm history/customer-postgres-pginstanodedev-tcp.yml records status: up with a sane responseTime (single-digit ms).
  • Confirm the status page badge for the Postgres TCP row flips from red to green at the next Summary CI run (00:00 UTC, or trigger manually).
  • Optional: temporarily scale instant-pg-proxy to 0 in a test cluster and confirm the monitor flips to down with a connect failed-style error — proving the probe is actually exercising the right code path.

🤖 Generated with Claude Code

The previous config probed `https://pg.instanode.dev:5432` with
`method: TCP_PING` + `tcpHostPort`. Neither of those keys exists in
the Upptime config schema (`UpptimeConfig.sites[]` in
upptime/uptime-monitor `src/interfaces.ts`), so the action ignored
them and fell through to curl, which tried a TLS handshake against
the raw Postgres wire-protocol port and logged:

    Curl attempt 1/3 returned HTTP 0: Request failed: SSL connect error
    Error does not appear transient, skipping further retries

The endpoint itself is healthy. Port 5432 on the public LoadBalancer
IP (152.42.154.144) is exposed by `ingress-nginx` via the
`tcp-services` ConfigMap, routed to the `instant-pg-proxy` Service in
the `instant` namespace, which fronts customer Postgres pools.
`nc -zv pg.instanode.dev 5432` succeeds, and `pg-proxy` logs show
both replicas listening.

Switch to Upptime's documented raw-TCP probe shape:

    check: "tcp-ping"
    url: pg.instanode.dev
    port: 5432

No HTTPS scheme, no synthetic `expectedStatusCodes` fallback list,
no `tcpHostPort` (which Upptime doesn't read). The next Uptime CI
run should record an "up" status for this site.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mastermanas805 mastermanas805 merged commit e20e2a2 into master May 11, 2026
@mastermanas805 mastermanas805 deleted the fix/pg-tcp-ping-monitor branch May 11, 2026 16:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant