Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Markdown: Added support for nested bold and italic tokens #1897

Merged

Conversation

RunDevelopment
Copy link
Member

This PR adds support for nested bold and italic tokens.

This also resolves #1849.

Example

Before:
image

After:
image

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • bold and italic share nearly identical regular expressions
  • bold, italic, and strike share nearly the same content
  • pandoc's markdown defines ::: as well, for classes
  • pandoc's markdown supports { } after various elements, also for classes
  • not immediately obvious why token.content[2] isn't used/defined (line 196)
  • don't see any tests for ** bold with _ italic inside
  • don't see any tests for __ bold with * italic inside
  • is bold/italic/strike a possibility?

@RunDevelopment
Copy link
Member Author

@DaveJarvis Thank you for taking the time to review this PR!

bold and italic share nearly identical regular expressions
bold, italic, and strike share nearly the same content

The new createInline function should address both issues.

pandoc's markdown defines ::: as well, for classes
pandoc's markdown supports { } after various elements, also for classes

I don't think we need to support a parser specific feature. If it's a commonly used feature, then sure we'll add it.
But you might want to open a new issue for this or add it to #1558 as this isn't within the scope of this PR.

not immediately obvious why token.content[2] isn't used/defined (line 196)

I added a comment explaining the structure of token.content.

don't see any tests for ** bold with _ italic inside
don't see any tests for __ bold with * italic inside

Added.

is bold/italic/strike a possibility?

Yes, you can nest bold, italic, and strike, or is there any issue with it?
The reason why support for nested bold, italic tokens was so difficult is that they can use the same character as delimiters. This makes parsing very hard and error-prone (sometimes even actual MD parser get wrong). Because strikes use a different character, there shouldn't be an issue, or is there?

@ghost
Copy link

ghost commented Jun 24, 2019

If it's a commonly used feature, then sure we'll add it.

Available since Oct 2017 (jgm/pandoc#168), so it's gaining in usage and popularity.

Yes, you can nest bold, italic, and strike, or is there any issue with it?

I didn't see any specific unit tests that go through all the combinations, but maybe they aren't necessary.

@RunDevelopment
Copy link
Member Author

Because strike uses different delimiters than bold and italic, so it's generally less problematic. So I think that this should be sufficent testing.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit test blocks are comprehensive, though a little unwieldy. Not sure if it would be worthwhile to write a recursive function to iterate over an array of inline elements. Something like:

generate_inline_test( [ '**', '__' ] )
generate_inline_test( [ '__', '**' ] )
generate_inline_test( [ '*', '__', '~~' ] )

If so, once the generate_inline_test function is created, it would be possible to call it programmatically to generate all possible combinations and permutations based on a single array of inline elements (e.g., ['**', '__', '~~', '*', '_', ...]).

@RunDevelopment
Copy link
Member Author

Regarding the combinations: Yes we could write such a function but I don't think we have to. Looking at the regular expression, it's enough to test the nested tokens one level deep (It doesn't make sense the nest e.g. bold inside bold which makes things simpler) which gives us 6 * 4 combinations which are all covered by the bold, italic and strike tests in this PR.
More extensive tests might be necessary in the future but for now, I think it's fine as is.

@RunDevelopment RunDevelopment merged commit 1190372 into PrismJS:master Jun 24, 2019
@RunDevelopment RunDevelopment deleted the markdown_nested_bold_italic branch June 24, 2019 20:44
@RunDevelopment
Copy link
Member Author

@DaveJarvis Again, thank you for reviewing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Markdown: Too greedy italic punctuation
1 participant