Fix/page content double encoding #38
Merged
PagesModule.saveAction() wrapped the posted markdown in `htmlentities()` before storing, while the textarea round-trip re-escaped on every render. Each save added one extra layer (`>` → `&gt;` → `&amp;gt;` → …), so content edited multiple times accumulated visible HTML entities in both blockquotes and code blocks.

Changes:
- PagesModule: store the posted content raw; drop the extra `htmlspecialchars_decode()` from the markdown preview action so the preview matches what the frontend now does.
- Site::renderContent(): pass `$page->content` straight to Parsedown. No decode pass is needed, since content is raw from save time. Parsedown safe-mode still escapes embedded HTML, and `allowHtmlOutput` no longer needs a branch since it never decoded anyway.
- BasicTheme::renderArticleSummary(): same change, drop the orphan decode.
- bin/fix-page-content-encoding.php: one-shot migration that walks all Pages items and runs `html_entity_decode()` on `data.content` until stable (cap 10 rounds). Supports `--dry-run` and `--db=`. Required to clean rows that were edited under the buggy save path.
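
As a rough illustration of the failure mode and of the "decode until stable" repair described above, the following standalone PHP sketch may help. It is not the code from this PR: `buggySave()` and `decodeUntilStable()` are hypothetical names, the entity flags are assumptions, and the real migration script additionally handles database access, `--dry-run`, and `--db=`.

```php
<?php
// Hypothetical stand-in for the old save path: entity-encode on every save.
// (Assumed flags; the real saveAction() may have used different ones.)
function buggySave(string $posted): string
{
    return htmlentities($posted, ENT_QUOTES, 'UTF-8');
}

// Sketch of the migration idea: decode repeatedly until the text stops
// changing, giving up after $maxRounds passes (the PR caps this at 10).
function decodeUntilStable(string $content, int $maxRounds = 10): string
{
    for ($round = 0; $round < $maxRounds; $round++) {
        $decoded = html_entity_decode($content, ENT_QUOTES | ENT_HTML5, 'UTF-8');
        if ($decoded === $content) {
            break; // stable: no encoded layers left
        }
        $content = $decoded;
    }
    return $content;
}

$original = "> quoted line with a -> arrow";

// Simulate three edits under the buggy save path: one extra layer per save.
$stored = $original;
for ($i = 0; $i < 3; $i++) {
    $stored = buggySave($stored);
}

echo $stored, PHP_EOL;
// &amp;amp;gt; quoted line with a -&amp;amp;gt; arrow

echo decodeUntilStable($stored), PHP_EOL;
// > quoted line with a -> arrow
```

One trade-off of decoding until stable: an entity the author typed on purpose (for example a literal `&amp;` in the markdown) is decoded as well, which is presumably why the script offers a dry-run mode for inspecting changes first.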
`allowHtmlOutput` had no live consumer left:
- The backend branch in Site::renderContent() was removed in the same
PR (content is now always rendered via Parsedown safe-mode).
- The editor.js reference reads it off `editConf`, but no PHP file
ever populates that JS bag — `typeof editConf !== 'undefined'` was
always false, so the preview already always used `html: false`.
A global "trust author HTML" flag is also the wrong granularity for
the upcoming WYSIWYG editor plugin, which needs per-page format
dispatch (Markdown vs. HTML vs. plugin-specific). Dropping the flag
now keeps the security story crisp ("page content is Markdown,
period") without closing any door for that future feature.
Changes:
- data/settings/scriptor-config.php: remove the config key + docblock.
- editor/theme/scripts/editor.js: hard-code `html: false` for the
Remarkable preview, with a comment pointing at Parsedown safe-mode
as the server-side authority.
- boot/Frontend/Site.php: refresh the renderContent() comment to
document the safe-mode contract and the per-page dispatch hint.
Page content double-encoding (d7aaf6a)

PagesModule.saveAction() wrapped the posted markdown in `htmlentities()` before storing, while the textarea round-trip re-escaped on every render. Each save added one extra layer (`>` → `&gt;` → `&amp;gt;` → …), so content edited multiple times accumulated visible HTML entities in both blockquotes and code blocks.

- PagesModule: store the posted content raw; drop the extra `htmlspecialchars_decode()` from the markdown preview action.
- Site::renderContent(): pass `$page->content` straight to Parsedown.
- bin/fix-page-content-encoding.php: one-shot migration that walks all Pages items and runs `html_entity_decode()` on `data.content` until stable (cap 10 rounds). Supports `--dry-run` and `--db=`.

Drop dead allowHtmlOutput flag (bd08e0c)

`allowHtmlOutput` had no live consumer left after the renderContent cleanup, and the editor.js reference read it off an `editConf` JS bag that no PHP ever populated. A global "trust author HTML" flag is the wrong granularity for the planned WYSIWYG plugin (which will need per-page format dispatch), so dropping it now keeps the security story crisp without closing any door for later.

- editor/theme/scripts/editor.js: hard-code `html: false` for the Remarkable preview.
- boot/Frontend/Site.php: refresh the renderContent() comment to document the safe-mode contract and the per-page dispatch hint.

Smoke
- `<blockquote>`, code block has single-layer escaping (`-&gt;` → browser shows `->`), no `&amp;` in source after merge.
- `<script>`, inline event handlers, `javascript:`/`data:` URIs, `<svg onload=>`, `<iframe>`, `<input autofocus onfocus=>` all neutralised by Parsedown safe-mode.