Make Markdown's newline & space handling conform to web-platform-tests & Firefox's behavior #15081
92276e4 to 01d93c6
894c31f to 01eb21b
24e63a7 to 65182d7
Should I add |
Can you send a separate PR for this first?
And we can use this module to do the html trim. |
@fisker For the form feed in the CommonMark spec, see https://spec.commonmark.org/0.30/#unicode-whitespace-character Either way, it is bound to be removed when the converted HTML is processed, so Prettier may as well remove it at the conversion stage. https://codepen.io/tats-u/pen/bGQjrvM
Trailing form feeds don't always seem to be removed. |
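For reference, a minimal sketch of an HTML-style trim, assuming the set to strip is the HTML Standard's ASCII whitespace (tab, line feed, form feed, carriage return, space). The name `htmlTrim` is hypothetical and is not the module mentioned above. Unlike `String.prototype.trim()`, it leaves characters such as U+00A0 in place.

```javascript
// Hypothetical helper: trims only HTML "ASCII whitespace"
// (U+0009, U+000A, U+000C, U+000D, U+0020), including form feeds.
const HTML_ASCII_WHITESPACE = "\t\n\f\r ";

function htmlTrim(text) {
  let start = 0;
  let end = text.length;
  // Skip leading ASCII whitespace.
  while (start < end && HTML_ASCII_WHITESPACE.includes(text[start])) {
    start++;
  }
  // Skip trailing ASCII whitespace.
  while (end > start && HTML_ASCII_WHITESPACE.includes(text[end - 1])) {
    end--;
  }
  return text.slice(start, end);
}

// Leading and trailing form feeds are dropped; the no-break space is kept.
console.log(JSON.stringify(htmlTrim("\f a\u00a0\f")));
```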
Interesting: https://bugs.webkit.org/show_bug.cgi?id=13159. It seems we already made a mistake in the HTML/Handlebars printer. |
Anyway, I think we should reuse |
@fisker Your change has just been applied. |
90de55f to 304b1de
I found an interesting description in https://drafts.csswg.org/css-text-3/#line-break-transform
What sucks is not the HTML or CSS specs but browsers' implementations. |
if (isLink) {
  return true;
}
function lineBreakCanBeConvertedToSpace() {
Do you think we should remove this function?
It depends on whether we'll have to add it again.
I can't predict when all browsers will change their behavior.
The result looks good, thank you!
@fisker I'll change the behavior. I looked into the detailed behavior of Firefox. Table: Is a newline between the following converted to a space in Firefox?
A newline between [Chinese & Japanese & CJK punctuation] should not be converted to a space.
↓ OK
↓ STOP!
This will improve compatibility with the conventional behavior to some extent and reduce browser-defined behaviors. A space around Chinese (& Japanese) is not going to be converted to a newline. That's not compatible with Firefox. |
Didn't get it. |
These seem to be treated as a single group in Firefox. |
Shouldn't we only need to make sure the output is equivalent in HTML? Ah, maybe you are right: in Markdown, a single newline doesn't generate a space. But that should not depend on the language. |
Does it mean it's better to always treat a newline in Markdown as a space, even if it's surrounded only by Han or kana? |
I found that the latest version of CSS is 4, but the specification related to this issue has not changed. |
For the record, I don't. I still think we only need to make sure we don't change the output HTML. a b. is equivalent to <p>a b.</p>, and a
b. is equivalent to <p>a
b.</p>. And you are saying they are not equal when we replace This means we also handle it incorrectly in the HTML printer. Playground link I think we should deal with the HTML printer first, then. |
@thorn0 You are more familiar with the markdown text wrapping. Do you have a comment about the problem here? |
I'm not saying that about the Markdown-to-HTML conversion. Your interpretation of the Markdown-to-HTML conversion is correct. Every newline is kept as is when converted to HTML.
The current behavior breaks documents that explain how browsers handle a newline in HTML. I have to add prettier-ignore. I'll add concrete examples later. The CommonMark specification misunderstands how a newline in HTML should be interpreted by browsers. Its authors can't have read the CSS specification.
https://spec.commonmark.org/0.30/#soft-line-breaks The last sentence isn't applicable to Chinese and Japanese documents. |
<p>
In the CSS specification, it depends on browsers how newlines in HTML are treated. When viewed in Firefox, no extra spaces appear in the following sentences.
</p>
<blockquote>
<p>
これは日本語
の文です。这
是一个中文句
子。
</p>
</blockquote> |
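The behavior described for the blockquote above can be approximated with a short sketch. It assumes a simplified rule: a line break is dropped when both neighbors are Han, Hiragana, Katakana, or in the CJK Symbols and Punctuation block (U+3000 to U+303F), and becomes a space otherwise. This is not Firefox's full table, and the names `breakBecomesSpaceSketch` and `collapseLineBreaksSketch` are hypothetical.

```javascript
// Assumption: "Chinese & Japanese" is approximated by three Unicode scripts
// plus the CJK Symbols and Punctuation block. Real browsers use richer rules.
const CJ_CHAR =
  /[\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}\u3000-\u303F]/u;

// Hypothetical predicate: does a line break between these two characters
// turn into a space?
function breakBecomesSpaceSketch(before, after) {
  return !(CJ_CHAR.test(before) && CJ_CHAR.test(after));
}

// Hypothetical helper: joins the lines of a paragraph the way the
// simplified rule above would display them.
function collapseLineBreaksSketch(text) {
  const lines = text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "");
  let result = "";
  let lastChar = "";
  for (const line of lines) {
    const chars = Array.from(line); // iterate by code point, not UTF-16 unit
    if (result !== "" && breakBecomesSpaceSketch(lastChar, chars[0])) {
      result += " ";
    }
    result += line;
    lastChar = chars[chars.length - 1];
  }
  return result;
}

console.log(
  collapseLineBreaksSketch("これは日本語\nの文です。这\n是一个中文句\n子。"),
);
console.log(collapseLineBreaksSketch("a\nb."));
```

Under this rule the sample paragraph is joined with no extra spaces, while a break between Latin words still yields a space.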
https://wpt.fyi/results/css/css-text/line-breaking?label=master&label=experimental&aligned |
Shall we follow all its tests? |
6032cc7 to e9c58f8
I wonder why KATAKANA MIDDLE DOT U+30FB (・) has been treated as non-CJK in Prettier. <p>中点
・
中点</p> This is rendered as "中点・中点" in Firefox but Prettier can format the following Markdown as "中点 ・ 中点". 中点
・
中点 |
Katakana-Hiragana Double Hyphen U+30A0 (゠) is also excluded from CJK in Prettier even though it's in the Katakana Unicode block. チンギス゠カン (Genghis Khan) |
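One way to investigate the two characters above is to compare a Script-property test with a plain block-range test. This is only a diagnostic sketch: the thread does not say how Prettier builds its CJK set, so the idea that a Script-based set would skip characters whose script is not Katakana is an assumption, and `isInKatakanaBlock` is a hypothetical helper.

```javascript
// Script-property test: matches characters whose Unicode Script is Katakana.
const KATAKANA_SCRIPT = /\p{Script=Katakana}/u;

// Block-range test: the Katakana block spans U+30A0 to U+30FF.
function isInKatakanaBlock(ch) {
  const codePoint = ch.codePointAt(0);
  return codePoint >= 0x30a0 && codePoint <= 0x30ff;
}

// U+30A2 KATAKANA LETTER A, U+30FB KATAKANA MIDDLE DOT,
// U+30A0 KATAKANA-HIRAGANA DOUBLE HYPHEN
for (const ch of ["\u30A2", "\u30FB", "\u30A0"]) {
  console.log(
    `U+${ch.codePointAt(0).toString(16).toUpperCase()}`,
    "script match:",
    KATAKANA_SCRIPT.test(ch),
    "block match:",
    isInKatakanaBlock(ch),
  );
}
```

If the two columns disagree for a character, a set built only from Script properties would treat it differently from one built from block ranges.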
f21da19 to 78d7874
@fisker I've finished the changes. You can start a review or merge. |
78d7874 to 21bff02
Is this planned to be included in v3.1? |
I think this has too much impact to release as a minor update... |
I'll have to split changes in this PR into about 3 PRs:
Of course, I ultimately want Prettier's spec to match the current changes. I hope you agree with this plan. |
@fisker @sosukesuzuki Recently I have noticed the plan for the release of v4. |
The current v4 is published to test the new CLI. |
Description
Fixes #14936
Don't wrap even Chinese & Japanese when proseWrap is always
proseWrap is always or never
Checklist
docs/ directory).
changelog_unreleased/*/XXXX.md file following changelog_unreleased/TEMPLATE.md.
✨Try the playground for this PR✨