Fix Upptime pipeline: expand to 7-service bundle, document setup #2

Merged
mastermanas805 merged 1 commit into master from fix/upptime-pipeline on May 11, 2026

@mastermanas805
Member

Summary

Brings status.instanode.dev from "renders the README" to a real Upptime status board. Three concrete fixes, plus docs for two human-ops steps that only a repo admin can do.

What's broken today

curl https://status.instanode.dev returns HTTP/2 200 — but the body is the rendered README.md, not a status board. Root causes from gh run list:

| Workflow | Status | Reason |
| --- | --- | --- |
| Uptime CI (every 5 min) | green | works; history/*.yml is being populated |
| Static Site CI | red | upptime/status-page@master action does not exist (404) |
| Summary CI | red | tries to push regenerated README.md to master, blocked by enforce_admins: true branch protection |
| Graphs CI | red | same push-blocked failure |
| Response Time CI | red | same push-blocked failure |

Because Summary CI has never succeeded, README.md was never rewritten to embed the badge table / uptime % / graphs — so the Jekyll-rendered README is what Pages serves.

Meanwhile our HN draft, PH draft, dashboard MarketingPage.tsx footer, and instant-lite-web/llms-full.txt all point users at this URL. The HN draft itself flags this:

If status.instanode.dev is not yet deployed — it's marked pending in TASKS.md — this answer is dishonest.

What this PR changes

  • .upptimerc.yml — expand from 3 monitored surfaces to 11, mirroring the actual bundle the marketing copy promises:
    • Public web: marketing, agent /healthz, dashboard, /openapi.json
    • Seven provisioning endpoints: /db/new, /cache/new, /nosql/new, /queue/new, /storage/new, /webhook/new, /deploy/new. POST-only routes are probed by GET and accept 405 as the success signal (the handler is wired, the router is up). /deploy/new accepts 401 (auth-gated).
    • Customer Postgres TLS handshake on pg.instanode.dev:5432 (kept from previous config).
  • .github/workflows/static-site.yml — deleted. upptime/status-page@master returns 404; that's why every Static Site CI run has failed since the repo was scaffolded. Modern Upptime uses the Jekyll-rendered README.md as the status page — no separate static-site step.
  • README.md — updated to list the new 11 surfaces and link to SETUP.md. (Note: Summary CI will overwrite this file once human-ops below is done, with an auto-generated badge table. The hand-written copy here is the fallback that ships until Summary CI starts succeeding.)
  • SETUP.md — new file documenting the two human-ops steps below.
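
The probe rule for the provisioning endpoints can be sketched as a small lookup. This is only a model of the decision described above, not Upptime code: Upptime expresses the same idea per site through its `expectedStatusCodes` setting. The table name `EXPECTED_STATUS`, the helper `probe_is_up`, and the fallback of plain 200 for unlisted paths are assumptions made for illustration.

```python
# Accepted GET status codes per provisioning path, transcribed from the
# PR description. EXPECTED_STATUS and probe_is_up are hypothetical names.
EXPECTED_STATUS = {
    "/db/new": {405},       # POST-only: 405 shows the handler is wired
    "/cache/new": {405},
    "/nosql/new": {405},
    "/queue/new": {405},
    "/storage/new": {405},
    "/webhook/new": {405},
    "/deploy/new": {401},   # auth-gated: 401 shows the router is up
}


def probe_is_up(path: str, status: int) -> bool:
    """Return True when a GET to `path` answered with an accepted status."""
    # Assumption: paths not listed above (e.g. /healthz) must return 200.
    accepted = EXPECTED_STATUS.get(path, {200})
    return status in accepted
```

With this rule a 500 or a timeout on `/db/new` still counts as down, while the routine 405 for a GET against a POST-only route counts as up.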

Required human-ops (admin-only, ~5 min)

These cannot be done in code. Both are documented in SETUP.md.

  1. Add a GH_PAT secret (classic PAT, repo scope). All four workflows already reference secrets.GH_PAT || secrets.GITHUB_TOKEN, so no workflow edit is needed. The default GITHUB_TOKEN cannot bypass branch protection (that is the whole point of protection). A PAT can, once its owner is allowed to bypass it as described in step 2.
  2. Bypass branch protection for that PAT. Either disable enforce_admins on master (gh api -X DELETE repos/InstaNode-dev/instant-status/branches/master/protection/enforce_admins) or add the PAT owner as a bypass actor.

Once both are done, manually run Setup CI to seed the badge dirs, then Summary CI to rewrite the README. Future runs are automatic (every 5 min for probes, daily for README/graphs/response-time).
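
For reference, the two steps map onto a token fallback and a single REST call. The sketch below builds the request without sending it; it assumes the standard GitHub REST endpoint that the `gh api -X DELETE` command above targets, and the helper names `pick_token` and `build_disable_enforce_admins` are hypothetical.

```python
import os
import urllib.request

API = "https://api.github.com"


def pick_token(env=os.environ):
    """Mirror `secrets.GH_PAT || secrets.GITHUB_TOKEN`.

    A missing or empty GH_PAT falls through to GITHUB_TOKEN.
    """
    return env.get("GH_PAT") or env.get("GITHUB_TOKEN")


def build_disable_enforce_admins(token,
                                 repo="InstaNode-dev/instant-status",
                                 branch="master"):
    """Build (but do not send) the DELETE that turns off enforce_admins."""
    url = f"{API}/repos/{repo}/branches/{branch}/protection/enforce_admins"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
```

Passing the result to `urllib.request.urlopen` would perform the change, so that step is deliberately left out; the token used needs admin rights on the repository.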

Test plan

  • Merge this PR (workflow edits + config + docs only)
  • Add GH_PAT secret as documented in SETUP.md
  • Disable enforce_admins on master branch protection
  • Manually trigger Actions → Setup CI → Run workflow
  • Manually trigger Actions → Summary CI → Run workflow
  • Verify curl -s https://status.instanode.dev | grep -c 'shields.io' returns > 0 (one badge per service)
  • Verify gh run list --limit 10 shows green for Summary/Graphs/Response Time on next 00:00 UTC run
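
The badge check in the plan relies on `grep -c`, which counts matching lines rather than individual matches. A minimal sketch of the same check, assuming the page body has already been fetched into a string (the function name `count_badge_lines` is hypothetical):

```python
def count_badge_lines(html: str, needle: str = "shields.io") -> int:
    """Count lines containing `needle`, like `grep -c` does."""
    return sum(1 for line in html.splitlines() if needle in line)
```

If the rendered page puts several badges on one line, this count (like `grep -c`) will be lower than the number of services, so "greater than 0" is the safe assertion.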

Out of scope

  • Dashboard's /status page (dashboard/src/pages/StatusPage.tsx) — kept as-is. It serves a different purpose: in-product client-side probes, no historical data, no incident issues. Marketing footer already deep-links to status.instanode.dev, so no dashboard change needed.
  • Marketing/HN/PH copy edits — once this PR lands and the two ops steps are done, the "status page is at status.instanode.dev" line in the HN draft is truthful.

🤖 Generated with Claude Code

…c-site workflow

- Expand .upptimerc.yml from 3 → 11 monitored surfaces to match the actual
  agent-facing bundle promised in HN/PH/marketing copy: postgres, redis,
  mongodb, queue, storage, webhook, deploy provisioning endpoints, plus
  marketing site, dashboard, agent healthz, OpenAPI spec, and the
  pg.instanode.dev TLS handshake. POST-only routes are probed by GET and
  accept 405 as the success signal. /deploy/new accepts 401 (auth-gated).
- Remove .github/workflows/static-site.yml — the upstream action
  upptime/status-page@master no longer exists, which is why every Static
  Site CI run has failed. The Jekyll-rendered README.md is the modern
  Upptime default and is what status.instanode.dev serves.
- Document the two human-ops steps blocking Summary/Graphs/Response-Time
  workflows: add a GH_PAT secret with repo scope, and either turn off
  enforce_admins on master branch protection or add the PAT owner as a
  bypass actor. Without those, the auto-generated badge table and graphs
  never land in README.md, which is why the page currently renders as a
  plain README instead of a status board.

Refs gtm-ops/TASKS.md item U.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mastermanas805 mastermanas805 merged commit 13db8b7 into master May 11, 2026
@mastermanas805 mastermanas805 deleted the fix/upptime-pipeline branch May 11, 2026 16:09
mastermanas805 added a commit that referenced this pull request May 11, 2026
The hand-written README from PR #2 had no <!--start: status pages-->
markers, so Summary CI's 'readme' command had nowhere to inject
the auto-generated badge table — it ran successfully (per CI logs)
but the commit was a no-op.

This restores the standard Upptime template shape with the
start/end markers Upptime expects. Next Summary CI run will fill
in the badge board between them.
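
Why a marker-less README produces a no-op can be shown with a small model of marker-based injection. This is a sketch, not Upptime's implementation: the start marker is taken from the commit message above, the end marker `<!--end: status pages-->` is assumed by analogy, and `inject_status_table` is a hypothetical name.

```python
START = "<!--start: status pages-->"
END = "<!--end: status pages-->"  # assumed counterpart of START


def inject_status_table(readme: str, table: str) -> str:
    """Replace whatever sits between START and END with `table`.

    If either marker is missing, the README is returned unchanged,
    so the resulting commit would contain no changes.
    """
    start = readme.find(START)
    end = readme.find(END, start + len(START)) if start != -1 else -1
    if start == -1 or end == -1:
        return readme
    return readme[: start + len(START)] + "\n" + table + "\n" + readme[end:]
```

A hand-written README without the markers passes through untouched, which matches the behaviour reported in the CI logs: the job succeeds but nothing changes.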