Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Markdown: Ignore mid-word underscores #4944

Merged
merged 2 commits into from Mar 19, 2018

Conversation

mcsf
Copy link
Contributor

@mcsf mcsf commented Feb 8, 2018

Description

Related: #4942, #4456

Fixes #5335

See Showdown documentation for literalMidWordUnderscores option:

Two _words_ -> Two <em>words</em>
One_word    -> One_word

This change removes a small point of collision between two competing formatting systems: Markdown and WordPress shortcode syntax. It fixes cases where pasting a shortcode like [my_shortcode]foo bar[/my_shortcode] would be mishandled due to a perceived emphasis (<em>) mark in that content.

When pasting:

[bw_csv]One,Two,Three
A,B,C
D,E,F
[/bw_csv]
Before After
before after

How Has This Been Tested?

Types of changes

Bug fix, but also a decision in what we kind of markdown we want to support. I think this is a good chance in itself, but cc @iseulde for thoughts.

Checklist:

  • My code is tested.
  • My code follows the WordPress code style.
  • My code has proper inline documentation.

Two _words_ -> Two <em>words</em>
One_word    -> One_word
@mcsf mcsf added [Feature] Raw Handling Related to the ability to convert content to blocks, paste handling, etc [Feature] Writing Flow Block selection, navigation, splitting, merging, deletion... labels Feb 8, 2018
@ellatrix
Copy link
Member

ellatrix commented Feb 8, 2018

I can agree with this change. I was thinking we could maybe "protect" shortcakes before handing it over to the Markdown parser. I'm also wondering, since we've been moving shortcake handling up and up, if we should just move it in front of all other handling. This might be less ideal though, as we'd probably make shortcakes out of every link (with the shortcake fallback in place).

In any case, this would need at least a comment, preferably tests to ensure all works as expected. For this we will need to create a WordPress module for shortcakes, instead of using a global.

@ellatrix
Copy link
Member

ellatrix commented Feb 8, 2018

Unrelated to this, if we continue to use shortcake fallbacks, we might want to rethink how we parse shortcakes. If we just want to match anything (/\[[a-z][a-z0-9_-]*\]/), we don't need to parse in so many steps.

@mcsf
Copy link
Contributor Author

mcsf commented Feb 8, 2018

(I think your autocorrect is suffering from sugar withdrawal. 😬)

This might be less ideal though, as we'd probably make shortcakes out of every link (with the shortcake fallback in place).

Yeah, this was my thinking in a previous PR; I wouldn't want to break links/images, and actually I'd be happy if we could fix a different kind of link, the footer kind:

This contains an [awesome] link to something. This here is even [better][awesomer].

[awesome]: http://example.org
[awesomer]: http://example.org

Which makes me think we might want to perform Markdown parsing before anything else, assuming we can get it to leave undefined references alone. That is, given:

This contains an [awesome] link to something. This here is even [better][awesomer].

[awesomer]: http://example.org

It would generate:

This contains an [awesome] link to something. This here is even better.

@ellatrix
Copy link
Member

ellatrix commented Feb 8, 2018

Which makes me think we might want to perform Markdown parsing before anything else, assuming we can get it to leave undefined references alone.

It seems to leave it alone? Or what do you mean?

@ellatrix
Copy link
Member

ellatrix commented Feb 8, 2018

(I think your autocorrect is suffering from sugar withdrawal. 😬)

Oops.

@mcsf
Copy link
Contributor Author

mcsf commented Feb 8, 2018

It seems to leave it alone? Or what do you mean?

It does. :) I hadn't tested yet and didn't know for sure. However, thinking about backwards compatibility, I wonder about the need to disable conversion within shortcode tags:

This contains an [awesome]hello, _world_[/awesome] link to something. This here is even [better][awesomer].

[awesomer]: http://example.org

becomes

This contains an [awesome]hello, world[/awesome] link to something. This here is even better.

that destruction between the awesome tags could be a deal breaker. I may have been too optimistic to think we could let Showdown work before we parsed for shortcodes.

@mcsf
Copy link
Contributor Author

mcsf commented Feb 9, 2018

@iseulde, so I made this little bundle of horrors:

https://gist.github.com/mcsf/57a4f328ff20ace2b3423bb69c2bcc53

I suggest reading the tests first to get the intention. Showdown's extension mechanism feels a bit clunky—from the docs, it doesn't seem like extensions can be dynamic, per-execution, etc., so this patch definitely cheats a lot. Anyway, it... is possible to disable Markdown parsing of shortcode content using Showdown extensions. 😅 It now ignores Markdown-like syntax here:

[my_shortcode]hello, _there_.[/my_shortcode]

but not here:

**oh**, [unknown]hello, _there_.

@aduth aduth added this to the 2.3 milestone Mar 1, 2018
@mtias mtias modified the milestones: 2.3, 2.4 Mar 2, 2018
@aduth aduth modified the milestones: 2.4, 2.5 Mar 14, 2018
@mcsf mcsf merged commit e33d866 into master Mar 19, 2018
@mcsf mcsf deleted the update/markdown-ignore-mid-word-underscores branch March 19, 2018 15:47
@mcsf mcsf mentioned this pull request Mar 19, 2018
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Raw Handling Related to the ability to convert content to blocks, paste handling, etc [Feature] Writing Flow Block selection, navigation, splitting, merging, deletion...
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Copying text with underscores should not convert to italics
4 participants