deploy: optional notify_webhook URL fires HTTP POST on deploy terminal state #62

Merged
mastermanas805 merged 2 commits into master from feat/deploy-webhook-notify-fresh on May 13, 2026

@mastermanas805
Member

Summary

Adds an optional notify_webhook field to POST /deploy/new so callers can subscribe to deploy terminal-state events instead of polling. The agent supplies an https URL and, optionally, an HMAC signing secret; when the deploy reaches healthy or failed, the worker (separate PR) POSTs a payload to that URL.

  • Migration 026 (026_deploy_webhook.sql): adds notify_webhook, notify_webhook_secret, notify_state (unset/pending/sent/failed), notify_attempts columns + partial index on (notify_state, status) WHERE notify_state='pending'
  • API change (POST /deploy/new): two new multipart fields parsed, SSRF-gated, secret AES-256-GCM encrypted at rest; new fields surfaced in the response
  • OpenAPI updated so MCP / CLI / dashboard callers can discover the contract
  • agent_action_contract_test.go extended with AgentActionNotifyWebhookInvalid
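As an illustration of the "AES-256-GCM encrypted at rest" step, here is a minimal Go sketch of a seal/open round trip. It is not the code from this PR: the function names encryptSecret and decryptSecret, the base64(nonce || ciphertext) storage layout, and the way the key is supplied are all assumptions made for the example.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM AEAD. The key must be exactly 32 bytes.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encryptSecret (hypothetical name) returns base64(nonce || ciphertext+tag).
// The storage layout is an assumption, not the format used by the PR.
func encryptSecret(key []byte, plaintext string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	// Seal appends ciphertext and tag to the nonce, so the nonce is stored with the value.
	sealed := gcm.Seal(nonce, nonce, []byte(plaintext), nil)
	return base64.StdEncoding.EncodeToString(sealed), nil
}

// decryptSecret (hypothetical name) reverses encryptSecret and fails if the
// stored value was tampered with or the key is wrong.
func decryptSecret(key []byte, stored string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	raw, err := base64.StdEncoding.DecodeString(stored)
	if err != nil {
		return "", err
	}
	if len(raw) < gcm.NonceSize() {
		return "", errors.New("stored value too short")
	}
	nonce, ciphertext := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	plaintext, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return "", err
	}
	return string(plaintext), nil
}

func main() {
	key := make([]byte, 32) // in practice the key would come from configuration
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	stored, err := encryptSecret(key, "example-signing-secret")
	if err != nil {
		panic(err)
	}
	back, err := decryptSecret(key, stored)
	if err != nil {
		panic(err)
	}
	fmt.Println("stored differs from plaintext:", stored != "example-signing-secret")
	fmt.Println("round trip ok:", back == "example-signing-secret")
}
```

Because a fresh random nonce is generated on every call, encrypting the same secret twice yields different stored values, which is the expected behaviour for GCM.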

SSRF protection — which ranges we reject

The validator (validateNotifyWebhookURL) refuses any URL whose hostname resolves to (or whose literal host is in) any of:

  • IPv4 loopback 127.0.0.0/8
  • IPv4 private 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
  • IPv4 link-local 169.254.0.0/16 (covers AWS/GCP metadata 169.254.169.254)
  • IPv4 CGNAT 100.64.0.0/10
  • IPv4 multicast 224.0.0.0/4, broadcast 255.255.255.255, unspecified 0.0.0.0
  • IPv6 loopback ::1, unspecified ::, link-local fe80::/10, unique-local fc00::/7, multicast ff00::/8
  • IPv4-mapped IPv6 (::ffff:127.0.0.1) — re-checked as v4
  • Literal hostname localhost (and *.localhost) before any DNS lookup

Mixed-record DNS evasion is caught: a hostname resolving to [8.8.8.8, 10.0.0.5] is rejected, because a single blocked IP among the resolved addresses rejects the whole URL.
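The rules above can be sketched in Go roughly as follows. This is an illustrative approximation, not the body of validateNotifyWebhookURL: the names checkWebhookURL and isBlockedIP, the injected resolver parameter, and the choice to fail closed on unparseable or empty results are assumptions. In real use, net.LookupHost has a matching signature and could be passed as the resolver.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
	"net/url"
	"strings"
)

// blockedPrefixes mirrors the ranges listed in the PR description.
var blockedPrefixes = mustPrefixes(
	"127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
	"169.254.0.0/16", "100.64.0.0/10", "224.0.0.0/4",
	"255.255.255.255/32", "0.0.0.0/32",
	"::1/128", "::/128", "fe80::/10", "fc00::/7", "ff00::/8",
)

func mustPrefixes(cidrs ...string) []netip.Prefix {
	out := make([]netip.Prefix, 0, len(cidrs))
	for _, c := range cidrs {
		out = append(out, netip.MustParsePrefix(c))
	}
	return out
}

// isBlockedIP reports whether s is in a blocked range.
// Assumption: input that does not parse as an IP is treated as blocked.
func isBlockedIP(s string) bool {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return true
	}
	// Drop any zone, then unmap ::ffff:a.b.c.d so it is re-checked as IPv4.
	addr = addr.WithZone("").Unmap()
	for _, p := range blockedPrefixes {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

// checkWebhookURL (hypothetical name) applies the https-only, localhost, and
// blocked-range rules. resolve is injected so the check can run without DNS.
func checkWebhookURL(raw string, resolve func(host string) ([]string, error)) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse url: %w", err)
	}
	if u.Scheme != "https" {
		return errors.New("scheme must be https")
	}
	host := strings.TrimSuffix(strings.ToLower(u.Hostname()), ".")
	if host == "" {
		return errors.New("missing host")
	}
	// Literal localhost names are rejected before any DNS lookup.
	if host == "localhost" || strings.HasSuffix(host, ".localhost") {
		return errors.New("localhost is not allowed")
	}
	ips := []string{host}
	if _, perr := netip.ParseAddr(host); perr != nil {
		// Not an IP literal, so resolve the name.
		ips, err = resolve(host)
		if err != nil {
			return fmt.Errorf("resolve %s: %w", host, err)
		}
		if len(ips) == 0 {
			return errors.New("host resolved to no addresses")
		}
	}
	// ANY blocked address rejects the whole URL (mixed-record case).
	for _, ip := range ips {
		if isBlockedIP(ip) {
			return fmt.Errorf("address %s is in a blocked range", ip)
		}
	}
	return nil
}

func main() {
	// Fake resolver standing in for DNS; it returns a mixed public/private answer.
	mixed := func(string) ([]string, error) { return []string{"8.8.8.8", "10.0.0.5"}, nil }
	for _, raw := range []string{
		"https://hooks.example.com/deploy",
		"http://hooks.example.com/deploy",
		"https://169.254.169.254/latest/meta-data",
	} {
		fmt.Printf("%s -> %v\n", raw, checkWebhookURL(raw, mixed))
	}
}
```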

Worker dispatcher is a separate PR

This PR only persists the fields. The worker job that scans WHERE notify_state='pending' AND status IN ('healthy','failed') and POSTs to the URL lives in the worker repo and is a follow-up — its contract:

  • 2xx → mark notify_state='sent'
  • 4xx → mark notify_state='failed' (no retry — user URL is broken)
  • 5xx / network error → leave notify_state='pending', bump notify_attempts, give up after 3 attempts
  • HMAC: when notify_webhook_secret is set, decrypt it and include X-InstaNode-Signature: sha256=<hex(hmac(secret, body))>
  • Payload: {event: 'deploy.healthy' | 'deploy.failed', deploy_id, app_id, url, commit_id, build_time, duration_s, error_message?}
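The contract above can be sketched in Go as a state-transition helper plus the signature computation. This is a hedged illustration, not the worker's code: the names nextNotifyState, signatureHeader, and verifySignature are hypothetical, a status code of 0 stands in for a network error, and two points the contract leaves open are filled with assumptions (non-2xx/4xx codes such as 3xx are treated like 5xx, and the state after the third failed attempt is taken to be 'failed').

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

const maxNotifyAttempts = 3 // "give up after 3" from the contract

// nextNotifyState maps a delivery outcome to the next notify_state and
// notify_attempts. statusCode 0 represents a network error (assumption).
func nextNotifyState(statusCode, attempts int) (string, int) {
	switch {
	case statusCode >= 200 && statusCode < 300:
		return "sent", attempts
	case statusCode >= 400 && statusCode < 500:
		return "failed", attempts // no retry: the user URL is broken
	default:
		// 5xx, network errors, and (by assumption) anything else are retryable.
		attempts++
		if attempts >= maxNotifyAttempts {
			return "failed", attempts // assumption: terminal state after giving up
		}
		return "pending", attempts
	}
}

// signatureHeader returns the X-InstaNode-Signature value:
// sha256=<hex(hmac(secret, body))>.
func signatureHeader(secret string, body []byte) string {
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// verifySignature shows the receiver side: recompute the signature over the
// raw body and compare in constant time.
func verifySignature(secret string, body []byte, header string) bool {
	expected := signatureHeader(secret, body)
	return hmac.Equal([]byte(expected), []byte(header))
}

func main() {
	body := []byte(`{"event":"deploy.healthy","deploy_id":"example"}`)
	header := signatureHeader("example-secret", body)
	fmt.Println("X-InstaNode-Signature:", header)
	fmt.Println("verified:", verifySignature("example-secret", body, header))

	state, attempts := nextNotifyState(503, 0)
	fmt.Println("after one 503:", state, attempts)
}
```

A receiver should verify against the exact bytes it received, before any JSON parsing or re-serialization, since the signature covers the raw body.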

Test plan

  • make test-unit green (entire workspace; pre-existing TestAdminList_AdminUserSees200 flake is unrelated and reproduces on master)
  • TestAgentActionContract green — new AgentActionNotifyWebhookInvalid satisfies the U3 contract (227 chars, < 280)
  • 57 new test cases across validateNotifyWebhookURL and the HTTP handler:
    • URL accepted (https + public IP) → 202, notify_state='pending'
    • URL absent → 202, notify_state='unset' (backward compat)
    • http:// scheme → 400 + agent_action
    • Private IP literal / localhost → 400 + agent_action (SSRF gate)
    • Mixed-record DNS resolution → rejected
    • Plaintext secret never appears in the deployments row (AES round-trip)
  • Verify against the live deployed URL after merge (per project convention — local tests green ≠ user-visible reality)

🤖 Generated with Claude Code

…l state

Adds a new POST /deploy/new field that lets callers subscribe to deploy
terminal-state events instead of polling GET /deploy/:id. When the deploy
reaches 'healthy' or 'failed', a worker job (separate PR) will POST a
payload to the supplied URL, optionally signed with HMAC-SHA256.

Migration 026 adds four columns to deployments: notify_webhook,
notify_webhook_secret (AES-256-GCM at rest), notify_state ('unset' /
'pending' / 'sent' / 'failed'), notify_attempts. A partial index on
(notify_state, status) WHERE notify_state='pending' keeps the worker
scan cheap.

SSRF gate enforces https-only scheme and rejects hostnames resolving to
loopback / RFC1918 / link-local (incl. AWS/GCP metadata 169.254.169.254)
/ CGNAT / multicast / IPv6 unique-local. Mixed-record DNS attacks are
caught: if ANY resolved IP is in a blocked range, the URL is rejected.

This PR only persists the fields; the worker-side dispatcher lives in
the worker repo and is a separate follow-up.
…-notify-fresh

# Conflicts:
#	internal/handlers/agent_action_contract_test.go