Decide policy for <u>/<sub>/<sup>/<mark> on storage → markdown export #155

@pchuri

Background

This was originally raised as a "while we're there" item in #141 and initially included in #154. It was dropped after self-review showed that the proposed approach (raw-HTML pass-through in the walker) breaks markdown → storage re-import.

The problem

Markdown has no native syntax for <u> (underline), <sub> / <sup> (sub/superscript, common in chemistry/math), or <mark> (highlight). Storage XHTML routinely contains them. The walker currently has no handler for these tags, so they fall through to the default branch in _dispatchElement and the wrapper is dropped — same loss pattern that #141 fixed for <s>/<del>.

The naive fix — pass them through as raw HTML, like the existing <details>/<summary> handler — fails round-trip:

```js
const c = new MacroConverter({ isCloud: true });
const md = c.storageToMarkdown('<p>H<sub>2</sub>O</p>'); // 'H<sub>2</sub>O'
const back = c.markdownToStorage(md); // '<p>H&lt;sub&gt;2&lt;/sub&gt;O</p>'
```

Cause: lib/macro-converter.js constructs new MarkdownIt() with the default html: false, so raw HTML in the markdown source is treated as plain text and entity-escaped on re-import.

Options to evaluate

  1. Enable html: true on the MarkdownIt instance. Largest behaviour change. Audit needed: every prose path now lets arbitrary HTML through to storage XHTML, and Confluence's own sanitizer behaviour on edge tags (<script>, <style>, attribute injection) needs to be characterized. Probably the right answer if it audits clean, since it unlocks pass-through for all five tags consistently.

  2. Selective tag whitelist via a markdown-it plugin or a pre-processor. Allow only <u>/<sub>/<sup>/<mark> (and maybe <details>/<summary>) to survive markdownToStorage. More surgical, but more code to maintain.

  3. Keep dropping these tags on export. Status quo. Information loss on <u> etc., but no risk of round-trip surprise. Document the limitation in README.
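
To make option 2 concrete, here is a minimal sketch of one possible shape: a post-processor that runs on the HTML produced by markdownToStorage and un-escapes only bare, attribute-free tags from an allowlist. The names `ALLOWED_INLINE_TAGS` and `restoreAllowedTags` are hypothetical and do not exist in lib/macro-converter.js; a markdown-it plugin could achieve the same result at the token level instead.

```js
// Hypothetical helper, not part of the current codebase.
// Assumption: input is HTML rendered with html: false, so raw tags
// arrive entity-escaped (e.g. "&lt;sub&gt;").
const ALLOWED_INLINE_TAGS = ['u', 'sub', 'sup', 'mark'];

function restoreAllowedTags(html, allowed = ALLOWED_INLINE_TAGS) {
  // Match only bare opening/closing tags with no attributes, so
  // "&lt;sub onclick=...&gt;" and "&lt;script&gt;" stay escaped.
  const pattern = new RegExp(`&lt;(/?)(${allowed.join('|')})&gt;`, 'g');
  return html.replace(pattern, '<$1$2>');
}

// Usage on the escaped output from the round-trip example:
const fixed = restoreAllowedTags('<p>H&lt;sub&gt;2&lt;/sub&gt;O</p>');
console.log(fixed);
```

This sketch is deliberately naive. It would also un-escape matching text inside code spans and code blocks, and it does not check that tags are balanced, so a real implementation would likely need to work on tokens rather than on the rendered string.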

Why deferred from #141 / #154

The walker-side change is one line per tag, but the upload side is the load-bearing decision. Bundling a full HTML-allowance audit into a strikethrough bug-fix would have been scope creep. Splitting it lets the audit get its own review.
