Adversarial review + finalize for marketplace submission#2
Conversation
Captures the baseline state of the anon Pumble OAuth-app before the adversarial review starts: identity (pumble-sdk@1.1.1 surface), git state at d4e9bb2, manifest claims, code layout, 9-step forward-only migrator, 23 test files / 150 passing tests, CI matrix shape, outbound HTTP call sites, secret-hygiene scan results, per-handler ack discipline, and where signature verification lives. Records five snapshot-only candidate findings (D-1 dangling AUDIT.md reference, D-2 tracked .pumble* files, D-3 stale test count claim in sibling docs, C-4 incomplete CI dist-migrations assertion, L-1 lint not in CI) for the Phase 2 review to classify as OK / GAP / FLAG.
Records OK/GAP/FLAG verdicts across identity, webhook signature, ack discipline, field-name discipline, secret hygiene, transport, authorization, manifest correctness, error model, tests/gates, and documentation drift. Counts: - 24 OK (every multi-workspace isolation, redaction, ack, scope, and secret-file leak probe passes) - 7 GAPs to fix in Phase 3 (D-1..D-7): manifest hostname guard, signature probes, error categorisation, lint-in-CI, complete dist-migrations check, inline AUDIT H-2 sketch, CHANGELOG - 5 FLAGs (S-2..S-5, T-1) — all upstream pumble-sdk gaps that cannot be fixed inside this addon without forking the SDK (no timestamp staleness check, non-constant-time HMAC compare, no body size cap, unguarded JSON.parse, no axios timeout)
Add 5 integration probes against the SDK's rawBody + verifySignature middleware pair, mounted exactly the way AddonHttpListener does in production. Covers the cases the addon's threat model depends on: - valid signature → 2xx passes through to handler - missing signature header → 403 - missing timestamp header → 403 - tampered body (signed original, sent modified) → 403 - signed with wrong secret → 403 Why: the addon delegates all inbound signature work to pumble-sdk; if a future SDK release silently accepts an unsigned request or ignores HMAC mismatch, this CI gate fails immediately rather than the regression landing in production. Stale-timestamp, body-size cap, and malformed-JSON probes are NOT included — these are FLAGged upstream SDK gaps (S-2, S-4, S-5 in docs/reviews/2026-05-24-adversarial-review.md) that cannot be verified from inside this repo without a forked SDK. The reasons are documented in the test header so a future reader doesn't add them by accident.
Add a single-file categorizeError() helper and use it in the three
catch sites where the addon talks to Pumble (anon command's shared
reportSendError, reportAnon button, anon_reply_modal submit).
Categories: network | api-4xx | api-5xx | unknown, with an explicit
retryable flag and the HTTP status when relevant. 429 is treated as
api-4xx + retryable so an operator querying audit_log for
transient-vs-permanent failures gets the right bucket.
Why: pre-D-3 the catch sites all recorded {outcome: "send-failed",
err: err.message}. A network timeout, a 4xx auth error, and a 5xx
backend outage are operationally distinct, but the audit log shape
made them indistinguishable without parsing err.message. The new
transport field is queryable directly.
The helper has its own 5-case test suite. The three integration
sites are covered by the existing handler tests; coverage rose to
94.66 / 84.01 / 95.17 / 95.27 from 92.41 / 81.08 / 93.75 / 92.96.
Replace the hand-rolled 001..005 file existence list with a shell loop over src/db/migrations/*.sql. Migrations 006, 007, 008, 009 already exist on disk and were missing from the assertion — a build regression that dropped any of them silently from the dist artifact would have passed CI. Verified locally: deleting dist/db/migrations/009_*.sql causes the new loop to exit 1 with a clear ::error:: annotation; restoring it returns the loop to "all 9 migrations copied".
Add an eslint step on the Node 20.x matrix entry. The local `npm run lint` already exits 0, but it was not part of CI, so a PR that re-introduces `any` or otherwise trips the existing no-explicit-any rule could land green. Pinned to one matrix entry (20.x) so the lint result is unambiguous — every Node version produces the same eslint output and gating on both would just slow the matrix without catching extra defects.
Add `npm run verify:manifest` that walks every URL in manifest.json and exits 1 if any of them is still a placeholder host (anon.example.com, example.com, localhost, 127.0.0.1, 0.0.0.0), a dev tunnel (.ngrok.io, .ngrok-free.app, .trycloudflare.com), or plain http:// (non-TLS). The repo intentionally ships manifest.json with anon.example.com placeholders so a fork starts from a known template, so this gate is NOT in the upstream CI — it would always fail. Operators run it before submitting their fork's manifest to the Pumble marketplace, and forks can opt into running it in their own CI. Verified end-to-end: - against shipped manifest.json: exit 1, prints 5 placeholder URLs - against a synthetic prod-shape manifest: exit 0 Five-case test covers placeholder rejection, prod acceptance, localhost / 127.0.0.1 / ngrok rejection, http:// non-TLS rejection, and the PLACEHOLDER_HOSTS export contract. The README update (surfacing this requirement in the deploy transcript) lands in Phase 4 with the other doc-cluster commits.
Make the marketplace-submission requirement loud in the Production deploy transcript: - Add a callout block above the deploy steps describing the placeholder template + the required `npm run verify:manifest` check. - Insert step 2: `npm run verify:manifest` between `pumble-cli pre-publish` and `npm ci`. - Document the new `npm run verify:manifest` in the Scripts table. - Update the CI summary line to reflect the new lint step (D-4) and the glob-driven migrations check (D-5).
Two cleanups to SECURITY.md following the 2026-05-24 adversarial review (see docs/reviews/2026-05-24-adversarial-review.md): 1. Replace the dangling "AUDIT.md finding H-2 Option B" pointer with the actual AES-256-GCM token-encryption design sketch. AUDIT.md was removed in commit 128bc82 along with other internal AI/agent docs; the H-2 design notes belong in SECURITY.md anyway because that is the document operators audit before listing on the marketplace. 2. Add a new "Upstream pumble-sdk webhook hardening" section covering S-2..S-5 and T-1 from the adversarial review. Each flag is paired with: - the SDK file/line where the gap lives - the operator-level mitigation the addon's reverse proxy / platform must apply until the SDK ships its own fix - the recommended upstream PR shape to CAKE-com/pumble-node-sdk These five are SDK-boundary concerns the addon cannot patch without forking the SDK; documenting the operator mitigations unblocks marketplace submission while the upstream PRs land.
Initial CHANGELOG follows the Keep a Changelog format. The Unreleased section summarises every commit that landed on the adversarial-review branch, grouped Security / Reliability / CI / Tests / Docs, with each bullet linking back to the GAP letter in docs/reviews/2026-05-24-adversarial-review.md so the audit trail stays navigable.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d21328a2a4
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| function classify(url) { | ||
| if (!url || typeof url !== "string") return "ok"; | ||
| if (url.startsWith("http://")) return "non-tls"; |
There was a problem hiding this comment.
Enforce HTTPS using URL protocol check
The non-TLS guard only checks url.startsWith("http://"), so insecure or unsupported schemes can slip through as "ok" (for example HTTP://... or ftp://...). Since this script is meant to block non-production-safe manifest URLs before marketplace submission, these false negatives undermine the safety gate and can let invalid endpoint schemes pass validation.
Useful? React with 👍 / 👎.
| // Relative URLs like "/hook" are not in the production manifest; | ||
| // ignore them rather than treating as placeholders. | ||
| return "ok"; |
There was a problem hiding this comment.
Reject malformed manifest URLs instead of ignoring them
If new URL(url) throws, the validator currently returns "ok", which treats malformed values as valid. This means broken URL fields (e.g. typos like https://) are silently accepted, even though the script is intended to guarantee production-safe manifest URLs before submission.
Useful? React with 👍 / 👎.
Close four upstream pumble-sdk middleware gaps inside the addon
itself, not at the operator's reverse proxy. Mounted via
onServerConfiguring(express, addon) which runs before app.listen,
so the splice into express._router.stack happens deterministically.
The hardened pipeline:
- S-2 reject `x-pumble-request-timestamp` outside ±5 minutes (403)
- S-3 HMAC compared with crypto.timingSafeEqual after explicit
buffer-length check, replacing the SDK's `!==`
- S-4 raw body capped at 1 MiB (configurable); 413 +
Connection: close before any HMAC work; remaining bytes
drained via req.resume() so supertest sees clean responses
- S-5 JSON.parse wrapped in try/catch; malformed → 400 instead of
throwing inside the stream-end callback (which the SDK swallowed
into the addon's uncaughtException handler before)
Dispatcher mirrors AddonHttpListener.handleMessage exactly, calling
the SDK service's public post* methods (postSlashCommand,
postBlockInteractionView/Message/EphemeralMessage, postViewAction,
postEvent, postGlobalShortcut, postMessageShortcut,
postDynamicSelectMenu) so handler semantics are unchanged from the
operator's perspective.
26-case suite covers happy-path dispatch for every messageType,
every signature failure mode, the body cap, both malformed-JSON
locations (outer envelope + inner PUMBLE_EVENT body), and the
router-splice contract. 5 stability runs all pass.
The single private-API touch is express._router.stack, isolated to
removeSdkWebhookRoute() in src/http/hardenedWebhook.ts. Documented
in SECURITY.md.
Coverage after this commit: 94.29 / 84.45 / 95.09 / 95.00
(thresholds 90 / 80 / 92 / 90).
pumble-sdk's ApiClient constructs its outbound axios instance with
`axios.create({ baseURL, headers })` — no `timeout` option, so a
hanging Pumble response could block the handler forever.
main.ts now imports axios at the very top and sets
`axios.defaults.timeout = 30_000` BEFORE importing pumble-sdk.
axios.create() snapshots defaults at call time and the SDK builds
its instance only on the first per-request lookup, so the 30 s
default propagates to every outbound Pumble call without needing
to wrap every call site or fork the SDK.
The 2-case axiosTimeout.test.ts pins the axios contract this
relies on (defaults propagate to axios.create when no per-call
timeout override is given; per-call override still wins).
After Phase 6 lands the hardened webhook + axios default timeout: - SECURITY.md "Upstream pumble-sdk webhook hardening" section is rewritten as "Webhook hardening layer (closes S-2..S-5, T-1)"; the operator-action table is replaced by a what-defends-what table pointing at src/http/hardenedWebhook.ts and src/main.ts. Suggested upstream PR shape stays so other Pumble apps can benefit; the addon's local fix is not a substitute for upstream. - docs/reviews/2026-05-24-adversarial-review.md "Out of scope" section marks S-2..S-5 + T-1 as CLOSED with strikethrough and the closing-commit references. Summary table updates "FLAG: 5" to "FLAG → CLOSED: 5". - CHANGELOG.md "Unreleased / Security" section leads with the webhook hardening + axios timeout bullets. "Tests" section adds the 26-case hardenedWebhook + 2-case axiosTimeout suites.
`npm audit fix` against production deps (the same set CI runs with `--omit=dev --audit-level=high`). All bumps stay within the declared compat ranges in package.json — no package.json change required. Lockfile changes: - axios 1.15.0 → 1.16.1 (12 CVEs incl. proto-pollution + SSRF + DoS) - body-parser 1.20.4 → 1.20.5 - express 4.22.1 → 4.22.2 - fast-uri 3.1.0 → 3.1.2 (2 CVEs: path traversal + host confusion) - follow-redirects 1.15.11→ 1.16.0 (1 CVE: auth header leak on cross-domain) - qs 6.14.2 → 6.15.2 (1 CVE: DoS via crash on null in comma format) - ws 8.20.0 → 8.21.0 (1 CVE: uninitialised memory disclosure) These CVEs were published *after* the last green CI run on main and already block the audit step on origin/main (verified by running `npm audit --omit=dev --audit-level=high` against d4e9bb2). They are not regressions introduced by this branch — but main's CI is also red because of them, and this bump is the path forward. Verified locally: - npm audit --omit=dev --audit-level=high → found 0 vulnerabilities - npm run type-check → ok - npm run lint → ok - npm run build → ok, 9 migrations copied - CI=1 npm test → 28 files / 193 tests pass
Summary
End-to-end adversarial review of
anonagainst the Pumble OAuth-appprobe checklist, plus every fix that's landable inside this repo —
including the five upstream-SDK hardening gaps (
S-2..S-5,T-1)that the first pass had flagged as "operator must mitigate at the
reverse proxy". After Phase 6, those are closed inside the addon
itself via a hardened webhook layer mounted in
onServerConfiguring.Phases: snapshot → adversarial review → per-GAP fix → doc cluster →
SDK-boundary hardening → final docs sweep.
Verdicts after Phase 6:
scope, secret-file leak probe)
Per-phase deliverables
Phase 1 — snapshot (1 commit)
1192cd2docs(review): addon snapshot before adversarial reviewPhase 2 — adversarial review (1 commit)
8a968afdocs(review): adversarial review against Pumble probe checklistPhase 3 — addon-fixable GAPs, TDD per finding (5 commits)
cda8ee6test(D-2): pin pumble-sdk webhook signature contract2272f71feat(D-3): categorize outbound errors before audit/log2ee2e96ci(D-5): glob every migration into the dist verification check1d8c0ecci(D-4): run eslint as a CI gatef0c9bf6feat(D-1): manifest hostname guard for marketplace submissionsPhase 4 — docs alignment (3 commits)
de2cd56docs(D-1): surface manifest hostname guard in README1cdb791docs(D-6): inline AUDIT H-2 sketch + document SDK hardening flags (initial)d21328adocs(D-7): add CHANGELOG.md with Unreleased entryPhase 6 — close S-2..S-5 + T-1 inside the addon (3 commits)
c8df2a4feat(S-2..S-5): hardened webhook layer replaces SDK middleware —src/http/hardenedWebhook.tsmounts inonServerConfiguring, splices the SDK's/hookPOST out ofexpress._router.stack, and replaces it with a hardened pipeline (timestamp window,crypto.timingSafeEqual, 1 MiB body cap, try/caughtJSON.parse). Verified payloads dispatch to the SDK service via its publicpost*methods, so handler semantics are unchanged. 26-case test suite (tests/security/hardenedWebhook.test.ts).9ba91c6feat(T-1): set axios default timeout before pumble-sdk import —src/main.tssetsaxios.defaults.timeout = 30_000at the top of the file. The SDK'saxios.create()calls inherit the default; no per-call wrap needed. 2-case contract test (tests/security/axiosTimeout.test.ts).954383edocs: flip S-2..S-5 + T-1 from FLAG to CLOSED —SECURITY.mdrewritten to describe the hardening layer instead of operator mitigations.docs/reviews/...adversarial-review.md"Out of scope" section strikethroughs the FLAGs.CHANGELOG.mdSecurity bullets lead with the hardening.Final gate (local, 2026-05-24)
npm run lintnpm run type-checkCI=1 npm run test:coveragenpm run buildnpm run verify:manifestWhat still requires user action
S-2..S-5 + T-1 upstream FLAGs— closed inside the addon asof Phase 6. Filing upstream PRs against
CAKE-com/pumble-node-sdkis still worthwhile so other Pumbleapps inherit the same defences; the suggested PR shape is in
SECURITY.md. The addon's local hardening is not a substitutefor an upstream fix.
abot/credential rotation (workspace64ad1305c701cc5be7c26fe4, app69950af22720c2992bab57f7,SECURITY.md:103–109) — explicitly out of scope per the reviewrequest.
WORKING/CLAUDE.mdsays "115/115 vitest suite";this branch ships 192/192. Update at your convenience.
Review files
docs/reviews/2026-05-24-snapshot.mddocs/reviews/2026-05-24-adversarial-review.md(post-Phase-6 verdicts)CHANGELOG.mdSECURITY.md— see "Webhook hardening layer"Test plan
docs/reviews/2026-05-24-adversarial-review.md; confirm OK / GAP / closed-FLAG split matches your judgement.src/http/hardenedWebhook.ts— confirm theexpress._router.stacksplice inremoveSdkWebhookRoute()is acceptable as the one private-API touch.npm run verify:manifestlocally — confirm it correctly rejects the placeholder hostname (exit 1)./anon @user msgflow against a sandbox workspace; confirm:transportfield on the happy path (empty /unknownis fine if no error occurred)CAKE-com/pumble-node-sdkfor the same defences; suggested PR shape is inSECURITY.md.WORKING/CLAUDE.md's "115/115" claim to "192/192".Do not merge until you've explicitly approved.