Skip to content

Fix/page content double encoding#38

Merged
bigin merged 2 commits into
imanager-2.0from
fix/page-content-double-encoding
May 4, 2026
Merged

Fix/page content double encoding#38
bigin merged 2 commits into
imanager-2.0from
fix/page-content-double-encoding

Conversation

@bigin
Copy link
Copy Markdown
Owner

@bigin bigin commented May 4, 2026

Page content double-encoding (d7aaf6a)

PagesModule.saveAction() wrapped the posted markdown in htmlentities()
before storing, while the textarea round-trip re-escaped on every render.
Each save added one extra layer (>>> → …), so
content edited multiple times accumulated visible HTML entities in
both blockquotes and code blocks.

  • PagesModule: store posted content raw; drop the orphan
    htmlspecialchars_decode() from the markdown preview action.
  • Site::renderContent(): pass $page->content straight to Parsedown.
  • BasicTheme::renderArticleSummary(): drop the orphan decode.
  • bin/fix-page-content-encoding.php: one-shot migration. Walks all
    Pages items and runs html_entity_decode() on data.content until
    stable (cap 10 rounds). Supports --dry-run and --db=.

Drop dead allowHtmlOutput flag (bd08e0c)

allowHtmlOutput had no live consumer left after the renderContent
cleanup, and the editor.js reference read it off a editConf JS bag
that no PHP ever populated. A global "trust author HTML" flag is the
wrong granularity for the planned WYSIWYG plugin (which will need
per-page format dispatch), so dropping it now keeps the security
story crisp without closing any door for later.

  • scriptor-config.php: remove the key + docblock.
  • editor.js: hard-code html: false for the Remarkable preview.
  • Site::renderContent(): refresh the comment to document the safe-mode
    contract and the per-page dispatch hint.

Smoke

  • Migration on live db: 3 rewritten (-4 / -1 / -1 layers), 6 unchanged.
  • Frontend: blockquote renders as <blockquote>, code block has
    single-layer escaping (-&gt; → browser shows ->), no
    &amp;amp; in source after merge.
  • Storage round-trip: 3× save+reload returns byte-identical content.
  • XSS regression check: 7 attack vectors (<script>, inline event
    handlers, javascript:/data: URIs, <svg onload=>, <iframe>,
    <input autofocus onfocus=>) all neutralised by Parsedown safe-mode.

bigin added 2 commits May 4, 2026 06:23
PagesModule.saveAction() wrapped the posted markdown in `htmlentities()`
before storing, while the textarea round-trip re-escaped on every render.
Each save added one extra layer (`>` → `&gt;` → `&amp;gt;` → …), so
content edited multiple times accumulated visible HTML entities in
both blockquotes and code blocks.

Changes:
- PagesModule: store the posted content raw; drop the extra
  `htmlspecialchars_decode()` from the markdown preview action so the
  preview matches what the frontend now does.
- Site::renderContent(): pass `$page->content` straight to Parsedown.
  No decode pass — content is raw from save time. Parsedown safe-mode
  still escapes embedded HTML, and `allowHtmlOutput` no longer needs a
  branch since it never decoded anyway.
- BasicTheme::renderArticleSummary(): same — drop the orphan decode.
- bin/fix-page-content-encoding.php: one-shot migration that walks all
  Pages items and runs `html_entity_decode()` on `data.content` until
  stable (cap 10 rounds). Supports `--dry-run` and `--db=`. Required
  to clean rows that were edited under the buggy save path.
`allowHtmlOutput` had no live consumer left:
- The backend branch in Site::renderContent() was removed in the same
  PR (content is now always rendered via Parsedown safe-mode).
- The editor.js reference reads it off `editConf`, but no PHP file
  ever populates that JS bag — `typeof editConf !== 'undefined'` was
  always false, so the preview already always used `html: false`.

A global "trust author HTML" flag is also the wrong granularity for
the upcoming WYSIWYG editor plugin, which needs per-page format
dispatch (Markdown vs. HTML vs. plugin-specific). Dropping the flag
now keeps the security story crisp ("page content is Markdown,
period") without closing any door for that future feature.

Changes:
- data/settings/scriptor-config.php: remove the config key + docblock.
- editor/theme/scripts/editor.js: hard-code `html: false` for the
  Remarkable preview, with a comment pointing at Parsedown safe-mode
  as the server-side authority.
- boot/Frontend/Site.php: refresh the renderContent() comment to
  document the safe-mode contract and the per-page dispatch hint.
@bigin bigin merged commit 24a487d into imanager-2.0 May 4, 2026
@bigin bigin deleted the fix/page-content-double-encoding branch May 4, 2026 06:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant