Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure void elements have closing slash in mermaid svg #15661

Merged
merged 23 commits into from
Feb 2, 2024

Conversation

bollwyvl
Copy link
Contributor

@bollwyvl bollwyvl commented Jan 18, 2024

References

Code changes

  • transform <br> tags into closed <br/> when generating the img tag containing mermaid SVG
    • the regular expression will actually find all void elements, and preserves attributes

  • test
    • add a test/notebook with examples of all (but one) of the upstream documentation examples
    • add screenshots

User-facing changes

  • users will be able to use expressive more typographic features in mermaid diagrams

Security/Compatibility considerations

  • some implementations (e.g. GitHub) will not even parse diagrams with any void elements (or probably any fancy HTML) except hr and br
    • currently, in this PR, as in the mermaid playground, one can do really interesting/outrageous things, for example:
      • the style="border-color: red" attribute can be used on an hr to change its color
      • <input type="checkbox" /> and radio can be used to do weird things
    • in the playground, these are actually clickable, etc. but sadly not feasible with the (otherwise robust and portable) <img/> approach
      • though text labels may contain further img tags
  • no implementation render <script> tags
  • onclick="javascript:alert('pwnd')" attributes do appear to pass through on the playground, and in 4.1.0b1 if a user goes through right-click, open in new tab and then interacts with the diagram

Backwards-incompatible changes

  • if we decide to do more aggressive sanitizing, this could break some diagrams/examples in the wild

Copy link

Thanks for making a pull request to jupyterlab!
To try out this branch on binder, follow this link: Binder

@bollwyvl bollwyvl changed the title Force self-closing breaking space tags in generated mermaid svg Ensure void tags have closing slash in mermaid svg Jan 18, 2024
packages/mermaid/src/manager.ts Fixed Show fixed Hide fixed
packages/mermaid/src/manager.ts Fixed Show fixed Hide fixed
packages/mermaid/src/manager.ts Fixed Show fixed Hide fixed
@bollwyvl bollwyvl changed the title Ensure void tags have closing slash in mermaid svg Ensure void elements have closing slash in mermaid svg Jan 18, 2024
@krassowski krassowski added this to the 4.1.0 milestone Jan 19, 2024
@krassowski krassowski added the bug label Jan 19, 2024
@krassowski
Copy link
Member

krassowski commented Jan 22, 2024

Upstream issue: mermaid-js/mermaid#1766

What effect would the flowchart.htmlLabels=false suggested by mermaid team have on compatibility? Can we use it instead of manual HTML substitution?

@bollwyvl
Copy link
Contributor Author

bollwyvl commented Jan 22, 2024 via email

@bollwyvl
Copy link
Contributor Author

htmlLabels=false

This does not solve the problem for \n in node labels.

sandbox

This resulted in strictly worse results e.g. &gt;br&lt; as these are "magic" entities, not defined in the SVG spec.

The <iframe srcdoc> approach is very attractive, as, for example, HTML (and therefore markdown) images can do icky things like:

This is totally valid markdown:

<img src="/api/shutdown" alt="shutdown"/>

And pretty much all frontends will render it, including (at least in preview) GitHub

This is totally valid markdown:

shutdown

But marked communicates over strings, and filters out iframes... we probably don't want to change that posture, today, as they can very much be used for deceptive things, not just their "white hat" security techniques).

@krassowski
Copy link
Member

But marked communicates over strings, and filters out iframes... we probably don't want to change that posture, today, as they can very much be used for deceptive things, not just their "white hat" security techniques).

Could we use the same splicing approach as we do with MathJax? See:

// Separate math from normal markdown text.
const parts = removeMath(source);
// Convert the markdown to HTML.
html = await markdownParser.render(parts['text']);
// Replace math.
html = replaceMath(html, parts['math']);

@krassowski
Copy link
Member

Kicking this PR to check if the labeller for mermaid now works (#15717), also FYI once you merge/rebase the tests for mermaid should now run on CI.

@bollwyvl
Copy link
Contributor Author

Thanks. Alas, it might be a tick before I can get back to this...

@bollwyvl bollwyvl marked this pull request as ready for review January 31, 2024 17:30
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks amiss? In general I am not a fan of that many snapshots but I am ok with these being here as long as they are isolated to the diagrams alone (no other UI).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of all the things a screenshot is good at, diagram renders right seems like a legit use case vs all the "whole screen" shots that are sensitive to individual pixel skew.

At any rate, these are now as isolated as they can be... the "screenshot element" function doesn't handle bigger-than-viewport stuff very well, and there are two examples diagrams that are larger than the viewport.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a fix for this using workaround from microsoft/playwright#13486 (an upstream fix is apparently not planned).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix! Pretty weird its a non-priority upstream, as this hardly seems to that unique. As commented on the new file, perhaps this belongs someplace more prominent/easier to find?

Comment on lines 85 to 86
svg = svg.replace(Private.RE_VOID_ELEMENT, Private.replaceVoidElement);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to see this unit tested. For example input '<br/>' (which I understand we do not expect from the current version of marked but I guess user or future versions could introduce it) results in '<br / >' which strictly speaking is invalid XHTML. Basically:

new DOMParser().parseFromString('<br>'.replace(RE_VOID_ELEMENT, replaceVoidElement), 'image/svg+xml')

works, but:

new DOMParser().parseFromString('<br/>'.replace(RE_VOID_ELEMENT, replaceVoidElement), 'image/svg+xml')

fails.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will add some coverage.

I'm semi-disappointed we still don't have any property-based testing or overarching frontend coverage that combines jest and puppeteer outputs, but certainly won't be wading into the stack any more than i have to on this PR.

@bollwyvl
Copy link
Contributor Author

bollwyvl commented Feb 1, 2024

Ok, all of the screenshots are now working (and less sensitive to scroll skew). Will merge upstream, and see if anything pops out, but no further work planned.

);

const clean = MermaidManager.cleanMermaidSvg(raw);
const parsed = parser.parseFromString(clean, SVG_MIME);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would now error if the cleaned string is not valid XML... it's harder to verify that the browser would go into the "fallback" rendering mode (the most likely failure case).

Because of that, it's probably not worth roundtripping inside the implementation, which is already pretty heavy.

* for elements larger than viewport, derived from:
* https://github.com/microsoft/playwright/issues/13486#issuecomment-1112012053
*/
async function resizePageAndScreenshot(locator: Locator) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this is something that should be hoisted up outside of a deeply-nested utils?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we could add it as a galata helper in the future if it turns out useful elsewhere too. For now it's easier to do it here because this way we don't need to worry about it becoming a public API yet ;)

@bollwyvl
Copy link
Contributor Author

bollwyvl commented Feb 2, 2024

hooray all 🟢

Copy link
Member

@krassowski krassowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @bollwyvl!

@krassowski krassowski merged commit e51ff33 into jupyterlab:main Feb 2, 2024
79 checks passed
@bollwyvl
Copy link
Contributor Author

bollwyvl commented Feb 2, 2024

Thanks. Alas, the wheels turn: since this was merged, https://github.com/mermaid-js/mermaid/releases/tag/v10.8.0 😿

I don't see any hard blockers/features that would require going out, but will take a look, especially to kick the tires on the screenshot approach.

@bollwyvl bollwyvl mentioned this pull request Feb 2, 2024
2 tasks
@bollwyvl bollwyvl deleted the gh-15660-mermaid-newlines branch February 2, 2024 14:11
bollwyvl added a commit to bollwyvl/jupyterlab that referenced this pull request Feb 3, 2024
bollwyvl added a commit to bollwyvl/jupyterlab that referenced this pull request Feb 5, 2024
krassowski added a commit to bollwyvl/jupyterlab that referenced this pull request Jul 19, 2024
bollwyvl added a commit to bollwyvl/jupyterlab that referenced this pull request Jul 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants