New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pasting: Respect plain-text pastes #3609

Merged
merged 1 commit into from Nov 28, 2017

Conversation

Projects
None yet
4 participants
@mcsf
Contributor

mcsf commented Nov 22, 2017

Partly fixes #3062. This change only affects shortcodes that are handled by registered blocks via transforms.

Description

When pasting plain-text-only clipboard data, use that plain-text data in the paste processing. This overrides the default behavior whereby an HTML-encoded counterpart of the plain text is used for processing.

Previously, pasting the following text as a plain-text attachment:

[gallery ids="1,2,3"]

Would be processed as:

[gallery ids="1,2,3"]

Thus affecting the correct parsing of shortcodes. With this change, parsing of shortcodes works when pasting a plain-text object — like the one above — but also when pasting rich-text objects. For instance, copying a shortcode snippet from a rich-text page, even if not wrapped with a <pre>, will be handled properly and yield a valid Gallery block:

<meta charset='utf-8'><span style="…">[gallery ids="10,11,9"]</span>

How Has This Been Tested?

  1. Either manually or using the Classic Editor, compose a gallery shortcode comprised of a few images, e.g. [gallery ids="1,2,3"].
  2. Copy that code from a source that will provide it as a plain-text-only clipboard object, e.g. a terminal or a basic HTML input.
  3. Prior to application of this diff, verify that pasting that data into an empty Paragraph block results in a Gallery block that's missing at least one image.¹
  4. After application of diff, verify that the same experience results in a Gallery block comprised of all the expected images.
  5. Grab the same shortcode but from an HTML-formatted source, e.g. http://txti.es/kph06. Repeat the experiment and verify that HTML pastes still work.

¹: This is because the Gallery shortcode transform received the named ids attribute as '&quot;1,2,3&quot;, which gets split as [ '&quot;1', '2', '3&quot;' ], then mapped over parseInt(_, 10), which evals to [ NaN, 2, 3 ].

Screenshots (jpeg or gifs if applicable):

gutenberg-plain-text-paste

Checklist:

  • My code is tested.
  • My code follows the WordPress code style.
  • My code follows has proper inline documentation.
@codecov

This comment has been minimized.

Show comment
Hide comment
@codecov

codecov bot Nov 22, 2017

Codecov Report

Merging #3609 into master will decrease coverage by 0.01%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3609      +/-   ##
==========================================
- Coverage   37.47%   37.45%   -0.02%     
==========================================
  Files         277      277              
  Lines        6704     6706       +2     
  Branches     1220     1222       +2     
==========================================
  Hits         2512     2512              
  Misses       3535     3535              
- Partials      657      659       +2
Impacted Files Coverage Δ
blocks/editable/index.js 10.16% <0%> (-0.07%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 139dbcd...086dd25. Read the comment docs.

codecov bot commented Nov 22, 2017

Codecov Report

Merging #3609 into master will decrease coverage by 0.01%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3609      +/-   ##
==========================================
- Coverage   37.47%   37.45%   -0.02%     
==========================================
  Files         277      277              
  Lines        6704     6706       +2     
  Branches     1220     1222       +2     
==========================================
  Hits         2512     2512              
  Misses       3535     3535              
- Partials      657      659       +2
Impacted Files Coverage Δ
blocks/editable/index.js 10.16% <0%> (-0.07%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 139dbcd...086dd25. Read the comment docs.

@mcsf mcsf referenced this pull request Nov 22, 2017

Merged

Pasting: Convert unknown shortcodes to Shortcode block #3610

0 of 3 tasks complete
@mtias

This comment has been minimized.

Show comment
Hide comment
@mtias

mtias Nov 22, 2017

Contributor

Nice debugging here.

Contributor

mtias commented Nov 22, 2017

Nice debugging here.

@iseulde

This comment has been minimized.

Show comment
Hide comment
@iseulde

iseulde Nov 22, 2017

Member

I'm trying to reproduce this issue, but am failing, though I see the HTML entities encoded. The reason seems to be that, if HTML entities are encoded, it is seen as a plain text paste, and passed through the markdown converter, which decodes all these. By the time the shortcode arrives at the shortcode converter, HTML entities will be decoded. I'm not sure then in which circumstances this would not be the case?

Member

iseulde commented Nov 22, 2017

I'm trying to reproduce this issue, but am failing, though I see the HTML entities encoded. The reason seems to be that, if HTML entities are encoded, it is seen as a plain text paste, and passed through the markdown converter, which decodes all these. By the time the shortcode arrives at the shortcode converter, HTML entities will be decoded. I'm not sure then in which circumstances this would not be the case?

@iseulde

This comment has been minimized.

Show comment
Hide comment
@iseulde

iseulde Nov 22, 2017

Member

(Just trying to understand everything that is going on here.)

Member

iseulde commented Nov 22, 2017

(Just trying to understand everything that is going on here.)

@iseulde

This comment has been minimized.

Show comment
Hide comment
@iseulde

iseulde Nov 22, 2017

Member

Oh, okay, this is without any other content. So it seems the source of this bug is the plainText.indexOf( '\n\n' ) !== -1 check? Things seem to work correctly with that check removed. This check is added though to ensure just one string without line breaks is pasted inline, not parsed as blocks. The markdown converter would add P tags, then later it would convert to a paragraph block.

Member

iseulde commented Nov 22, 2017

Oh, okay, this is without any other content. So it seems the source of this bug is the plainText.indexOf( '\n\n' ) !== -1 check? Things seem to work correctly with that check removed. This check is added though to ensure just one string without line breaks is pasted inline, not parsed as blocks. The markdown converter would add P tags, then later it would convert to a paragraph block.

@iseulde

This comment has been minimized.

Show comment
Hide comment
@iseulde

iseulde Nov 22, 2017

Member

There are some other related inline content issues where we're just checking too early if content is inline. Content that is later stripped (e.g. spans) is not caught as inline this early, so it won't be pasted inline. This makes me wonder if we should rethink this whole inline pasting thing to check at a later point.

Member

iseulde commented Nov 22, 2017

There are some other related inline content issues where we're just checking too early if content is inline. Content that is later stripped (e.g. spans) is not caught as inline this early, so it won't be pasted inline. This makes me wonder if we should rethink this whole inline pasting thing to check at a later point.

@mcsf

This comment has been minimized.

Show comment
Hide comment
@mcsf

mcsf Nov 22, 2017

Contributor

Thanks for having a look, you know much more about this than I do.

Oh, okay, this is without any other content. So it seems the source of this bug is the plainText.indexOf( '\n\n' ) !== -1 check?

That makes a lot of sense. Do you see that as entirely supplanting the current fix, or should they be complements?

Contributor

mcsf commented Nov 22, 2017

Thanks for having a look, you know much more about this than I do.

Oh, okay, this is without any other content. So it seems the source of this bug is the plainText.indexOf( '\n\n' ) !== -1 check?

That makes a lot of sense. Do you see that as entirely supplanting the current fix, or should they be complements?

@iseulde

This comment has been minimized.

Show comment
Hide comment
@iseulde

iseulde Nov 22, 2017

Member

Do you see that as entirely supplanting the current fix, or should they be complements?

Yes, it could be alternative fix, as long as it's maybe clearer commented all the things the markdown converter is doing, in this case decoding HTML entities for plain text paste.

Member

iseulde commented Nov 22, 2017

Do you see that as entirely supplanting the current fix, or should they be complements?

Yes, it could be alternative fix, as long as it's maybe clearer commented all the things the markdown converter is doing, in this case decoding HTML entities for plain text paste.

@ephox-mogran

This comment has been minimized.

Show comment
Hide comment
@ephox-mogran

ephox-mogran Nov 23, 2017

Contributor

OK .. so I've looked into this and have some questions / comments:

Q. If I copy and paste this *italic* text into the editor from a terminal should it convert it to italics? This solution won't, because it doesn't run through the showdown converter (since it does not contain two new line characters). However, I'm not sure if that's desired.

Looking at the pasting process, the general path seems to be:

  1. Firstly, store any plain text data on the clipboard as this.pastedPlainText
  2. If there is selected text and selected text is not a link, and there is a link on the clipboard, add a link to the current text with the clipboard link and stop any other pasting

Note, this is an interesting user experience. Where did you get the idea for this kind of behaviour? Is there another product that does it or is it just innovation? Have you thought about ways of making people aware of it?

  1. Identify the mode of pasting. There are three possible modes

a) INLINE: skips initial block parsing, allows for markdown conversion of plain text and always returns a string for TinyMCE to paste
b) AUTO: will parse as blocks if block delimiters detected, otherwise acts as INLINE if the HTML only has inline elements
c) BLOCKS: will parse as blocks if block delimiters detected, allows for markdown conversion of plain text and will always returns blocks

So for the markdown conversion of plain text, it will only take place if there is at least a double line break as @iseulde said. My understanding of why that check is there is to ensure that wrapping the output in paragraph tags isn't going to stop it being inline. Someone asked showdown to support not wrapping with an outer root node here:

showdownjs/showdown#189

And it looks like the answer is to use extensions. You could also just postprocess the HTML if you detected that there weren't linebreaks in the original (and remove the check in the if)

const processed = converter.makeHtml( plainText );
return plainText.indexOf( '\n\n' ) !== -1 ? processed : stripOuterPTag( processed );
Contributor

ephox-mogran commented Nov 23, 2017

OK .. so I've looked into this and have some questions / comments:

Q. If I copy and paste this *italic* text into the editor from a terminal should it convert it to italics? This solution won't, because it doesn't run through the showdown converter (since it does not contain two new line characters). However, I'm not sure if that's desired.

Looking at the pasting process, the general path seems to be:

  1. Firstly, store any plain text data on the clipboard as this.pastedPlainText
  2. If there is selected text and selected text is not a link, and there is a link on the clipboard, add a link to the current text with the clipboard link and stop any other pasting

Note, this is an interesting user experience. Where did you get the idea for this kind of behaviour? Is there another product that does it or is it just innovation? Have you thought about ways of making people aware of it?

  1. Identify the mode of pasting. There are three possible modes

a) INLINE: skips initial block parsing, allows for markdown conversion of plain text and always returns a string for TinyMCE to paste
b) AUTO: will parse as blocks if block delimiters detected, otherwise acts as INLINE if the HTML only has inline elements
c) BLOCKS: will parse as blocks if block delimiters detected, allows for markdown conversion of plain text and will always returns blocks

So for the markdown conversion of plain text, it will only take place if there is at least a double line break as @iseulde said. My understanding of why that check is there is to ensure that wrapping the output in paragraph tags isn't going to stop it being inline. Someone asked showdown to support not wrapping with an outer root node here:

showdownjs/showdown#189

And it looks like the answer is to use extensions. You could also just postprocess the HTML if you detected that there weren't linebreaks in the original (and remove the check in the if)

const processed = converter.makeHtml( plainText );
return plainText.indexOf( '\n\n' ) !== -1 ? processed : stripOuterPTag( processed );
@mcsf

This comment has been minimized.

Show comment
Hide comment
@mcsf

mcsf Nov 23, 2017

Contributor

Thanks for the feedback, @ephox-mogran.

Q. If I copy and paste this *italic* text into the editor from a terminal should it convert it to italics? This solution won't, because it doesn't run through the showdown converter (since it does not contain two new line characters). However, I'm not sure if that's desired.

It could be beneficial to handle those pastes as Markdown too. Your suggestion:

And it looks like the answer is to use extensions. You could also just postprocess the HTML if you detected that there weren't linebreaks in the original (and remove the check in the if)

could be the way to go.

Note, this is an interesting user experience. Where did you get the idea for this kind of behaviour? Is there another product that does it or is it just innovation? Have you thought about ways of making people aware of it?

I don't believe I was around when it got in Gutenberg, but it's been a feature of P2 for as long as I can remember.

Contributor

mcsf commented Nov 23, 2017

Thanks for the feedback, @ephox-mogran.

Q. If I copy and paste this *italic* text into the editor from a terminal should it convert it to italics? This solution won't, because it doesn't run through the showdown converter (since it does not contain two new line characters). However, I'm not sure if that's desired.

It could be beneficial to handle those pastes as Markdown too. Your suggestion:

And it looks like the answer is to use extensions. You could also just postprocess the HTML if you detected that there weren't linebreaks in the original (and remove the check in the if)

could be the way to go.

Note, this is an interesting user experience. Where did you get the idea for this kind of behaviour? Is there another product that does it or is it just innovation? Have you thought about ways of making people aware of it?

I don't believe I was around when it got in Gutenberg, but it's been a feature of P2 for as long as I can remember.

@mcsf

This comment has been minimized.

Show comment
Hide comment
@mcsf

mcsf Nov 24, 2017

Contributor

Actually, a problem with allowing Markdown parsing for single-line pastes (which includes our shortcode examples) is that a shortcode like [su_spoiler open="yes" style="fancy" icon="chevron"]Hidden content[/su_spoiler] gets picked up and rendered as [su<em>spoiler open="yes" style="fancy" icon="chevron"]Hidden content[/su_spoiler]. Therefore, I didn't push that change to the branch, but you can grab it as a gist.

For pastes of a single shortcode, it's trivial to work around this by testing the paste string against wp.shortcode. However, when pasting a fragment of a document, which could as likely be Markdown-formatted and shortcode-infused, which should be favored? In a way, Markdown and WP shortcodes may be fundamentally incompatible, as bracket notation is an important and differing aspect of each. Dealing with this isn't computationally infeasible, but likely impractical for what we want to achieve at this stage of Gutenberg.

I still believe the current fix to be useful. Do you both agree?

Contributor

mcsf commented Nov 24, 2017

Actually, a problem with allowing Markdown parsing for single-line pastes (which includes our shortcode examples) is that a shortcode like [su_spoiler open="yes" style="fancy" icon="chevron"]Hidden content[/su_spoiler] gets picked up and rendered as [su<em>spoiler open="yes" style="fancy" icon="chevron"]Hidden content[/su_spoiler]. Therefore, I didn't push that change to the branch, but you can grab it as a gist.

For pastes of a single shortcode, it's trivial to work around this by testing the paste string against wp.shortcode. However, when pasting a fragment of a document, which could as likely be Markdown-formatted and shortcode-infused, which should be favored? In a way, Markdown and WP shortcodes may be fundamentally incompatible, as bracket notation is an important and differing aspect of each. Dealing with this isn't computationally infeasible, but likely impractical for what we want to achieve at this stage of Gutenberg.

I still believe the current fix to be useful. Do you both agree?

@iseulde

This comment has been minimized.

Show comment
Hide comment
@iseulde

iseulde Nov 24, 2017

Member
Member

iseulde commented Nov 24, 2017

Pasting: Respect plain-text pastes
When pasting plain-text-only clipboard data, use that plain-text data in
the paste processing. This overrides the default behavior whereby an
HTML-encoded counterpart of the plain text is used for processing.

Previously, pasting the following text as a plain-text attachment:

  [gallery ids="1,2,3"]

Would be processed as:

  [gallery ids=&quot;1,2,3&quot;]

Thus affecting the correct parsing of shortcodes. With this change,
parsing of shortcodes works when pasting a plain-text object — like the
one above — but also when pasting rich-text objects. For instance,
copying a shortcode snippet from a rich-text page, even if not wrapped
with a `<pre>`, will be handled properly and yield a valid Gallery
block:

  <meta charset='utf-8'><span style="…">[gallery ids="10,11,9"]</span>
@mcsf

This comment has been minimized.

Show comment
Hide comment
@mcsf

mcsf Nov 28, 2017

Contributor

Let's get this one and #3610 in so that more eyeballs can make all the bugs shallow.

Contributor

mcsf commented Nov 28, 2017

Let's get this one and #3610 in so that more eyeballs can make all the bugs shallow.

@mcsf mcsf merged commit d5d7064 into master Nov 28, 2017

3 checks passed

codecov/project 37.45% (-0.02%) compared to 139dbcd
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@mtias mtias deleted the fix/shortcode-pasting-plain-text branch Nov 28, 2017

@mcsf mcsf referenced this pull request Feb 7, 2018

Closed

Blocks: Shortcode: Support multi-line shortcodes #4942

0 of 3 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment