Skip to content

harden: assorted server-side tightening for 3.0.2#7784

Merged
JohnMcLear merged 2 commits into
developfrom
harden-v3.0.2
May 17, 2026
Merged

harden: assorted server-side tightening for 3.0.2#7784
JohnMcLear merged 2 commits into
developfrom
harden-v3.0.2

Conversation

@JohnMcLear
Copy link
Copy Markdown
Member

A bundle of defence-in-depth changes across the server entry points. Each change is small on its own; landing them together keeps the diff cohesive for review and the release notes simple.

Areas touched

  • HTTP API auth path (APIHandler.ts) — tightened JWT validation + constant-time API key comparison.
  • Import / export tmp file handling — switched to CSPRNG-derived path tokens.
  • /tokenTransfer flow — added TTL, single-use enforcement, and removed an unnecessary field from the response body.
  • x-proxy-path header handling — extracted a shared sanitiser used by both the admin static serving and the legacy timeslider redirect; added Vary and Cache-Control: private, no-store on the admin responses that vary by that header.
  • Pad.appendRevision — centralised an insert-op invariant the wire handler already enforces so non-wire callers get the same check.

Tests

  • 5 new mocha specs under tests/backend/specs/
  • 1 new vitest spec under tests/backend-new/specs/
  • 1 helper added to tests/backend/common.ts

20 new test cases, all passing.

Regression posture

Ran the touched-surface tests (16 spec files) against this branch and unmodified develop: identical pass/fail counts. The five failures both branches share are pre-existing on develop — issues filed separately.

Notes for reviewers

  • Detailed per-fix breakdown will accompany the 3.0.2 release notes.
  • Behaviour change worth flagging: pads created via internal flows that did not previously attribute the initial revision (e.g. getPad with no author, setHTML with no authorId) now attribute the initial write to the stable a.etherpad-system author — same substitution setText/spliceText already use. Public-facing API surfaces are unchanged.

🤖 Generated with Claude Code

@qodo-code-review
Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@qodo-free-for-open-source-projects
Copy link
Copy Markdown

Review Summary by Qodo

Harden server-side security across API, auth, and file handling

🐞 Bug fix ✨ Enhancement

Grey Divider

Walkthroughs

Description
• Tighten JWT validation: verify signature before reading claims, require admin === true
• Switch API key comparison to constant-time crypto.timingSafeEqual
• Replace Math.random() with CSPRNG for import/export temp file paths
• Enforce 5-minute TTL and single-use redemption on /tokenTransfer flow
• Sanitize x-proxy-path header to prevent XSS and path traversal attacks
• Centralize insert-op author attribute invariant in Pad.appendRevision
• Add Vary and Cache-Control headers to admin responses varying by proxy path
Diagram
flowchart LR
  A["HTTP API Auth"] -->|"Verify JWT signature first"| B["Claim validation"]
  B -->|"Require admin=true"| C["Authorization check"]
  D["API Key"] -->|"Timing-safe compare"| C
  E["Import/Export"] -->|"CSPRNG temp paths"| F["File handling"]
  G["Token Transfer"] -->|"TTL + single-use"| H["Redemption"]
  I["Proxy Path Header"] -->|"Sanitize + validate"| J["XSS/traversal prevention"]
  K["Pad Operations"] -->|"Author attribute check"| L["Insert op validation"]
Loading

Grey Divider

File Changes

1. src/node/handler/APIHandler.ts Security hardening +28/-14

Tighten JWT validation and API key comparison

src/node/handler/APIHandler.ts


2. src/node/handler/ImportHandler.ts Security hardening +5/-1

Use CSPRNG for temp file path generation

src/node/handler/ImportHandler.ts


3. src/node/handler/ExportHandler.ts Security hardening +4/-2

Use CSPRNG for temp file path generation

src/node/handler/ExportHandler.ts


View more (12)
4. src/node/hooks/express/tokenTransfer.ts Security hardening +30/-5

Add TTL, single-use enforcement, remove token from response

src/node/hooks/express/tokenTransfer.ts


5. src/node/hooks/express/admin.ts Security hardening +12/-3

Sanitize proxy path and add cache control headers

src/node/hooks/express/admin.ts


6. src/node/hooks/express/specialpages.ts Refactoring +4/-6

Use shared sanitizeProxyPath helper

src/node/hooks/express/specialpages.ts


7. src/node/utils/sanitizeProxyPath.ts Security hardening +38/-0

New shared sanitizer for x-proxy-path header

src/node/utils/sanitizeProxyPath.ts


8. src/node/db/Pad.ts 🐞 Bug fix +77/-2

Centralize insert-op author attribute validation

src/node/db/Pad.ts


9. src/node/utils/ImportHtml.ts 🐞 Bug fix +24/-1

Substitute system author for unattributed imports

src/node/utils/ImportHtml.ts


10. src/tests/backend/common.ts 🧪 Tests +16/-0

Add JWT token generator with admin=false

src/tests/backend/common.ts


11. src/tests/backend/specs/api/jwtAdminClaim.ts 🧪 Tests +83/-0

New tests for JWT admin claim validation

src/tests/backend/specs/api/jwtAdminClaim.ts


12. src/tests/backend/specs/padInsertAuthorInvariant.ts 🧪 Tests +85/-0

New tests for insert-op author attribute invariant

src/tests/backend/specs/padInsertAuthorInvariant.ts


13. src/tests/backend/specs/proxyPathRedirect.ts 🧪 Tests +100/-0

New tests for proxy path sanitization

src/tests/backend/specs/proxyPathRedirect.ts


14. src/tests/backend/specs/tokenTransfer.ts 🧪 Tests +153/-0

New tests for token transfer TTL and single-use

src/tests/backend/specs/tokenTransfer.ts


15. src/tests/backend-new/specs/sanitizeProxyPath.test.ts 🧪 Tests +105/-0

New vitest unit tests for sanitizeProxyPath helper

src/tests/backend-new/specs/sanitizeProxyPath.test.ts


Grey Divider

Qodo Logo

@qodo-free-for-open-source-projects
Copy link
Copy Markdown

qodo-free-for-open-source-projects Bot commented May 17, 2026

Code Review by Qodo

🐞 Bugs (4) 📘 Rule violations (0)

Grey Divider


Action required

1. Legacy restore/copy breaks 🐞 Bug ≡ Correctness
Description
Pad.appendRevision now throws if any insert op lacks an author attribute, but restoreRevision()
and copyPadWithoutHistory() rebuild changesets from stored attrib strings that can legitimately omit
author on legacy pads (notably .etherpad imports and older server-internal flows). Those APIs can
therefore start failing at runtime for existing pads with historical unattributed inserts.
Code

src/node/db/Pad.ts[R281-287]

+    // Centralised "every insert op carries an author attribute"
+    // invariant. The socket handler enforces the same rule at the wire
+    // boundary; checking here covers the non-wire callers (HTTP API
+    // setHTML/setText/restoreRevision, plugin paths that call
+    // appendRevision directly).
+    Pad._assertInsertOpsCarryAuthor(aChangeset, this.pool);
+
Evidence
appendRevision now enforces the invariant, while other parts of the codebase explicitly acknowledge
historical stored data can violate it and there are code paths that replay stored attrib strings
into new insert ops without adding an author, which will trigger the new guard.

src/node/db/Pad.ts[280-334]
src/node/db/Pad.ts[939-945]
src/node/db/API.ts[587-640]
src/node/db/Pad.ts[714-753]
src/node/utils/ImportEtherpad.ts[110-120]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`Pad.appendRevision()` now enforces that every `+` op has an `author` attribute, but some existing stored pads (legacy `.etherpad` imports and historical server-internal flows) can contain unattributed insert ops. Code paths that *replay* stored content as new changesets (notably `API.restoreRevision()` and `Pad.copyPadWithoutHistory()`) will start throwing when they encounter such legacy attribution.

## Issue Context
- `Pad.appendRevision()` now calls `_assertInsertOpsCarryAuthor()`.
- `pad.check()` explicitly documents that historical stored data can violate this invariant.
- `API.restoreRevision()` inserts text using historical `op.attribs` verbatim.
- `Pad.copyPadWithoutHistory()` packs ops from `opsFromAText(oldAText)` which can also propagate missing authors.

## Fix Focus Areas
- src/node/db/API.ts[587-640]
- src/node/db/Pad.ts[280-334]
- src/node/db/Pad.ts[714-753]

### Implementation sketch
1. In `restoreRevision()`: when iterating over `atext.attribs`, detect inserts whose attrib string lacks `author`. Merge in an author attribute (prefer `authorId` if provided, else `Pad.SYSTEM_AUTHOR_ID`) using `AttributeMap.fromString(..., pad.pool)` and `.set('author', ...)` to keep canonical order.
2. In `copyPadWithoutHistory()`: before calling `dstPad.appendRevision(...)`, ensure the packed ops string has `author` on every insert op (same merge strategy; prefer passed `authorId` or system author for missing segments).
3. Add a regression test that restores/copies a pad containing a stored revision with an insert op missing `author` (simulating legacy data) and assert the operation succeeds and produces canonical attribs.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

2. ImportHtml loses authorId 🐞 Bug ≡ Correctness
Description
ImportHtml.setPadHTML computes an effectiveAuthorId and injects it into insert op attribs, but
still calls pad.appendRevision(theChangeset, authorId), which stores an empty meta.author and
passes an empty authorId to hooks/authorManager. This creates an inconsistency between op
attribution and revision metadata/hook context.
Code

src/node/utils/ImportHtml.ts[R80-91]

+  // Every insert op needs an `author` attribute (the appendRevision
+  // precondition). The contentcollector tags ops with style
+  // attributes (bold, italic, etc.) but doesn't add an author; for
+  // server-side imports the author is implicit in the caller, so
+  // substitute the system author when no explicit one was supplied —
+  // same pattern setText/spliceText already use.
+  const effectiveAuthorId =
+      (newText.length > 0 && !authorId) ? SYSTEM_AUTHOR_ID : authorId;
+
  // assemble each line into the builder
  let textIndex = 0;
  const newTextStart = 0;
Evidence
The new effectiveAuthorId is computed and applied to op attribs, but appendRevision persists the
author from its parameter, and setPadHTML still passes the original authorId parameter rather than
effectiveAuthorId.

src/node/utils/ImportHtml.ts[80-118]
src/node/db/Pad.ts[280-320]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`setPadHTML()` now computes `effectiveAuthorId` (system author fallback) and merges it into insert op attribs, but it still calls `pad.appendRevision(theChangeset, authorId)` using the original (possibly empty) `authorId`. `appendRevision()` persists `meta.author` from its `authorId` parameter and also uses it for hooks and authorManager linkage.

## Issue Context
This inconsistency is introduced by the new `effectiveAuthorId` logic in ImportHtml.

## Fix Focus Areas
- src/node/utils/ImportHtml.ts[80-118]
- src/node/db/Pad.ts[280-320]

### Implementation sketch
- Change the final writes in `setPadHTML()` to use `effectiveAuthorId` consistently:
 - `await pad.setText('\n', effectiveAuthorId);` (optional but consistent)
 - `await pad.appendRevision(theChangeset, effectiveAuthorId);`
- Keep the current attrib merge logic so the changeset itself carries `author` on every insert op.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Proxy-path may be relative 🐞 Bug ≡ Correctness
Description
sanitizeProxyPath’s docstring claims the output always starts with exactly one '/', but the
implementation can return values like 'pad' (no leading slash). Call sites concatenate this into
redirects and entrypoint URLs, which can produce relative URLs and broken redirects/asset paths.
Code

src/node/utils/sanitizeProxyPath.ts[R21-37]

+export const sanitizeProxyPath = (req: {header: (n: string) => string|undefined} | string | undefined): string => {
+  const raw = typeof req === 'string'
+      ? req
+      : req && typeof req.header === 'function'
+          ? (req.header('x-proxy-path') || '')
+          : '';
+  let cleaned = raw.replace(/[^a-zA-Z0-9\-_\/\.]/g, '');
+  if (!cleaned) return '';
+  // Collapse leading "//+" to a single "/" so the value can never be
+  // interpreted as a protocol-relative URL when concatenated into an
+  // href / Location / iframe src.
+  cleaned = cleaned.replace(/^\/{2,}/, '/');
+  // Refuse "/.." / "../" segments — path-traversal shapes that some
+  // downstream URL joiners would still honour even after the character
+  // filter above.
+  if (/(?:^|\/)\.\.(?:\/|$)/.test(cleaned)) return '';
+  return cleaned;
Evidence
The sanitizer returns cleaned as-is after only collapsing multiple leading slashes, and
specialpages/admin consume it as a URL-path prefix for redirects and asset entrypoints (which break
if the prefix is relative).

src/node/utils/sanitizeProxyPath.ts[11-37]
src/node/hooks/express/specialpages.ts[175-180]
src/node/hooks/express/specialpages.ts[404-411]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`sanitizeProxyPath()` documents that non-empty outputs start with exactly one `/`, but it does not enforce this (it only collapses *multiple* leading slashes). This can produce relative path prefixes when the header lacks a leading slash, and several call sites treat the sanitized value as an absolute path prefix.

## Issue Context
- specialpages constructs `entrypoint: proxyPath + '/watch/...'` and redirects using `${proxyPath}/p/...`, which assume `proxyPath` is either `''` or `'/...'`.

## Fix Focus Areas
- src/node/utils/sanitizeProxyPath.ts[21-37]
- src/node/hooks/express/specialpages.ts[175-180]
- src/node/hooks/express/specialpages.ts[404-411]

### Implementation sketch
- After cleaning/collapsing traversal:
 - If `cleaned` is non-empty and does not start with `/`, prefix it with `/`.
 - Keep the existing `//+` collapse and `..` rejection.
- Update/extend unit tests to include an input like `'pad/etherpad'` and assert it becomes `'/pad/etherpad'` (or is rejected, but then update callers/docs accordingly).
- Ensure the docstring matches the actual enforced semantics.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Advisory comments

4. createdAt type mismatch 🐞 Bug ⚙ Maintainability
Description
TokenTransferRequest.createdAt is now required in the TypeScript type, but the GET handler
explicitly treats missing/non-numeric createdAt as a legacy case. This makes the type inaccurate and
increases the chance of future refactors removing the defensive handling.
Code

src/node/hooks/express/tokenTransfer.ts[10]

+  createdAt: number;
Evidence
The type requires createdAt but the handler code path and comments explicitly allow it to be absent,
demonstrating a mismatch between static typing and runtime behavior.

src/node/hooks/express/tokenTransfer.ts[7-71]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`TokenTransferRequest` declares `createdAt: number` as required, but runtime logic treats it as optional (legacy records). The type should reflect reality so future edits don’t accidentally assume it is always present.

## Issue Context
The handler comment and logic explicitly handle missing `createdAt` by treating it as expired.

## Fix Focus Areas
- src/node/hooks/express/tokenTransfer.ts[7-71]

### Implementation sketch
- Change the type to `createdAt?: number;` (or `createdAt: number | undefined`).
- Keep the existing runtime guard `typeof tokenData.createdAt === 'number' ? ... : 0`.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

Comment thread src/node/db/Pad.ts
Comment on lines +281 to +287
// Centralised "every insert op carries an author attribute"
// invariant. The socket handler enforces the same rule at the wire
// boundary; checking here covers the non-wire callers (HTTP API
// setHTML/setText/restoreRevision, plugin paths that call
// appendRevision directly).
Pad._assertInsertOpsCarryAuthor(aChangeset, this.pool);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. Legacy restore/copy breaks 🐞 Bug ≡ Correctness

Pad.appendRevision now throws if any insert op lacks an author attribute, but restoreRevision()
and copyPadWithoutHistory() rebuild changesets from stored attrib strings that can legitimately omit
author on legacy pads (notably .etherpad imports and older server-internal flows). Those APIs can
therefore start failing at runtime for existing pads with historical unattributed inserts.
Agent Prompt
## Issue description
`Pad.appendRevision()` now enforces that every `+` op has an `author` attribute, but some existing stored pads (legacy `.etherpad` imports and historical server-internal flows) can contain unattributed insert ops. Code paths that *replay* stored content as new changesets (notably `API.restoreRevision()` and `Pad.copyPadWithoutHistory()`) will start throwing when they encounter such legacy attribution.

## Issue Context
- `Pad.appendRevision()` now calls `_assertInsertOpsCarryAuthor()`.
- `pad.check()` explicitly documents that historical stored data can violate this invariant.
- `API.restoreRevision()` inserts text using historical `op.attribs` verbatim.
- `Pad.copyPadWithoutHistory()` packs ops from `opsFromAText(oldAText)` which can also propagate missing authors.

## Fix Focus Areas
- src/node/db/API.ts[587-640]
- src/node/db/Pad.ts[280-334]
- src/node/db/Pad.ts[714-753]

### Implementation sketch
1. In `restoreRevision()`: when iterating over `atext.attribs`, detect inserts whose attrib string lacks `author`. Merge in an author attribute (prefer `authorId` if provided, else `Pad.SYSTEM_AUTHOR_ID`) using `AttributeMap.fromString(..., pad.pool)` and `.set('author', ...)` to keep canonical order.
2. In `copyPadWithoutHistory()`: before calling `dstPad.appendRevision(...)`, ensure the packed ops string has `author` on every insert op (same merge strategy; prefer passed `authorId` or system author for missing segments).
3. Add a regression test that restores/copies a pad containing a stored revision with an insert op missing `author` (simulating legacy data) and assert the operation succeeds and produces canonical attribs.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

A bundle of defence-in-depth hardening picked up during an internal
audit pass. Each change is small on its own; landing them together
keeps the diff cohesive for review and the release notes simple.

Production-side changes:
  - src/node/handler/APIHandler.ts: tighten the OAuth JWT validation
    path on the HTTP API. Verify the signature before reading any
    claim off the payload, and require the admin claim to be strictly
    true (not just present). Switch the apikey comparison to
    crypto.timingSafeEqual.
  - src/node/handler/{Import,Export}Handler.ts: derive temp-file path
    tokens from crypto.randomBytes(16) instead of Math.random.
  - src/node/hooks/express/tokenTransfer.ts: enforce a 5-minute TTL
    on transfer records, make redemption single-use (remove before
    response), and drop the author token from the response body —
    the HttpOnly cookie is the only delivery channel.
  - src/node/utils/sanitizeProxyPath.ts (new): shared sanitiser for
    the `x-proxy-path` header. Used by admin.ts (HTML/JS/CSS
    substitution) and specialpages.ts (legacy timeslider redirect).
    Strips characters outside [A-Za-z0-9_./-], collapses leading
    `//+` to a single `/`, rejects `..` traversal. admin.ts also
    emits Vary: x-proxy-path and Cache-Control: private, no-store.
  - src/node/db/Pad.ts + src/node/utils/ImportHtml.ts: centralise
    the "every insert op carries an author attribute" invariant in
    Pad.appendRevision so all non-wire callers (setText, setHTML,
    restoreRevision, plugin paths) get the same check the socket
    handler already enforces. Pad.init and setPadHTML now
    substitute SYSTEM_AUTHOR_ID when no author is supplied — same
    pattern setText/spliceText already use.

Tests:
  - src/tests/backend/specs/api/jwtAdminClaim.ts (5 cases)
  - src/tests/backend/specs/tokenTransfer.ts (6 cases)
  - src/tests/backend/specs/proxyPathRedirect.ts (5 cases)
  - src/tests/backend/specs/padInsertAuthorInvariant.ts (4 cases)
  - src/tests/backend-new/specs/sanitizeProxyPath.test.ts (14 vitest cases)
  - src/tests/backend/common.ts: add generateJWTTokenAdminFalse helper.

Regression sweep across 16 backend spec files: same 5 pre-existing
failures on develop reproduce after this change; +20 new passing
tests; no new failures introduced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@JohnMcLear
Copy link
Copy Markdown
Member Author

Addressed the 4 Qodo findings — force-pushed an amend to the single commit.

  • Bug 1 (legacy restore/copy): API.restoreRevision and Pad.copyPadWithoutHistory now merge in an author marker via AttributeMap before handing the changeset to appendRevision. Caller-supplied authorId wins; otherwise falls back to the system author. Added a regression test that copies a pad whose source atext has unattributed inserts (the symmetric restoreRevision test is skipped as a TODO — the historical revs need on-disk surgery rather than in-memory poison to exercise that path, and the merge logic is identical to the copyPad case).
  • Bug 2 (ImportHtml loses authorId): setPadHTML now passes effectiveAuthorId to both pad.setText('\n', ...) and pad.appendRevision(...) so meta.author, the padCreate/padUpdate hook context, and the op attribs all reference the same identity.
  • Bug 3 (proxy-path may be relative): sanitizeProxyPath now prepends / when the cleaned value is non-empty and doesn't already start with one. Docstring updated. New vitest cases assert 'pad/etherpad' → '/pad/etherpad' and that double-prefix is avoided.
  • Bug 4 (createdAt type mismatch): TokenTransferRequest.createdAt is now createdAt?: number to match the runtime treatment of legacy records.

Regression sweep across 16 backend specs: 207 passing, 13 pending, 5 failing — same 5 pre-existing failures present on unmodified develop (filed separately as #7785#7788).

…nvariant

Legacy .etherpad exports (and exports from older server-internal flows
that didn't substitute SYSTEM_AUTHOR_ID) can contain `+content` insert
ops without an `author` attribute. The previous commit's appendRevision
guard rejects that shape, but setPadRaw bulk-writes records directly
to the DB and never goes through appendRevision -- so a hand-crafted
.etherpad file could persist non-conforming data that any subsequent
setText / setHTML / restoreRevision call would then refuse to extend.

Add a pre-pass over the parsed import that:

  - walks revs in numeric order, sanitising each changeset's `+` ops
    against the cumulative pad pool (mutating the pool to register
    SYSTEM_AUTHOR_ID when needed);
  - re-applies each (post-sanitisation) changeset to a running atext
    so the head atext and any key-rev meta.atext / meta.pool
    snapshots are re-derived in lock-step with the rewritten revs
    (otherwise pad.check's deep-equal at the end of setPadRaw would
    fail on the attribute-number drift between the sanitised head
    state and the now-stale key-rev snapshot);
  - leaves already-conforming payloads untouched (no log noise on
    good imports).

Returns the number of ops rewritten so the import can log a single
warning per legacy file rather than per-record. Pure-newline `+` ops
are exempted -- same whitelist as the wire-side guard.

Tests:
  - tests/backend/specs/padInsertAuthorInvariant.ts: two new cases
    drive setPadRaw end-to-end with a hand-crafted legacy payload
    (asserts the head atext gains a `*N` attribute reference and the
    pool registers `[author, a.etherpad-system]`) and with a
    conforming payload (asserts the data round-trips unchanged).

Regression sweep across 17 backend spec files: 209 passing / 12
pending / 5 failing -- same 5 pre-existing failures present on
unmodified develop. +2 new passing tests; no new failures introduced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@JohnMcLear JohnMcLear merged commit 8c6104c into develop May 17, 2026
30 of 31 checks passed
@JohnMcLear JohnMcLear deleted the harden-v3.0.2 branch May 17, 2026 12:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant