Paragraph whitespace #307

Merged
merged 3 commits into from
Nov 25, 2019

Conversation

bachbui
Contributor

@bachbui bachbui commented Nov 22, 2019

This PR updates the CommonMark renderer to support documents containing Unicode em spaces (\u2003). There are two relevant changes:
First, when rendering Paragraphs, we were stripping all leading and trailing whitespace characters from their text so that they could not be interpreted as Markdown syntax: leading spaces might be read as an indented code block, and trailing spaces as a line break. This stripping was overzealous, since Markdown treats only certain whitespace characters as meaningful in this way. We obtained a narrower set of characters to strip by taking the characters matched by [\s], which is [ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff], and checking which of these are meaningful to Markdown.
Second, we need to encode Unicode whitespace characters as HTML entities in the rendered output, as otherwise many Markdown parsers (including markdown-it) will ignore them. This PR does that for em space characters only; other Unicode whitespace characters can be added as needed in subsequent PRs.
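
To illustrate the two changes, here is a minimal TypeScript sketch. It is not the code from this PR: the function names are hypothetical, the set of characters stripped (space, tab, newline, carriage return) is an assumption about which whitespace Markdown treats as meaningful, and the named entity `&emsp;` is one possible encoding of \u2003.

```typescript
// Hypothetical sketch; not the actual @atjson/renderer-commonmark code.

// Assumption: only these ASCII whitespace characters need to be stripped
// from the edges of a paragraph. The set chosen in the PR may differ.
const MARKDOWN_EDGE_WHITESPACE = " \t\n\r";

// Strip only Markdown-meaningful whitespace from both ends of the text,
// leaving other Unicode whitespace (such as \u2003) in place.
function stripMarkdownWhitespace(text: string): string {
  let start = 0;
  let end = text.length;
  while (start < end && MARKDOWN_EDGE_WHITESPACE.indexOf(text.charAt(start)) !== -1) {
    start++;
  }
  while (end > start && MARKDOWN_EDGE_WHITESPACE.indexOf(text.charAt(end - 1)) !== -1) {
    end--;
  }
  return text.slice(start, end);
}

// Encode em spaces as an HTML entity so that a Markdown parser keeps them.
function encodeEmSpaces(text: string): string {
  return text.replace(/\u2003/g, "&emsp;");
}

// Combine both steps for the text of a single paragraph.
function renderParagraphText(text: string): string {
  return encodeEmSpaces(stripMarkdownWhitespace(text));
}
```

With this sketch, `renderParagraphText(" \u2003Indented paragraph \n")` returns `"&emsp;Indented paragraph"`: the ASCII space and newline at the edges are dropped, while the em space survives as an entity.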

 - @atjson/renderer-commonmark@0.21.14-dev.0
 - @atjson/source-commonmark@0.21.13-dev.0
@bachbui
Contributor Author

bachbui commented Nov 22, 2019

@gnorsilva I've made a dev build for you to test out these changes if you'd like. You can bump to these package versions:
@atjson/renderer-commonmark@0.21.14-dev.0
@atjson/source-commonmark@0.21.13-dev.0

Contributor

@gnorsilva gnorsilva left a comment

Looks good, checked in copilot-atjson and this fixes the issue 👍

Collaborator

@tim-evans tim-evans left a comment

This is lovely! Thanks for handling this so quickly @bachbui

I'm going to create a follow-up issue to track other non-breaking spaces that we should handle :)

@bachbui
Contributor Author

bachbui commented Nov 25, 2019

@gnorsilva This has been merged and released:
@atjson/renderer-commonmark@0.21.14
@atjson/source-commonmark@0.21.13

3 participants