Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't require whitespace before closing ATX ### marks #652

Closed

Conversation

urbanautomaton
Copy link
Contributor

Prior to 135f1fd, ATX headers with closing ### marks did not require preceding whitespace. While this is a requirement in GFM markdown, the original markdown spec doesn't require it.

This change therefore adds test cases to cover the reinstatement of this behaviour, and adjusts the trimming regex to exclude closing ### marks only if they are preceded by a backslash (for which a test case already exists).

Feedback

I'm not the most experienced with either different markdown formats or this project specifically, so I'd welcome any and all feedback. Specific questions, though:

  1. I think this is safe from a performance POV, because the negative lookbehind only affects the capture scope, not the match logic. I'm sure people here have more regex perf experience than me, though; do you agree? Are there any benchmarks I should be running to validate my changes?
  2. While I think the change I'm fixing was unintentional, it's not completely clear from the history that that's the case; are we happy to go with a looser interpretation of the original spec, and leave the stricter one to the GFM parser?

Thanks for taking the time to read this!

Prior to 135f1fd, ATX headers with closing ### marks did not require
preceding whitespace. While this is a requirement in GFM markdown[1],
the original markdown spec[2] doesn't require it.

This change therefore adds test cases to cover the reinstatement of this
behaviour, and adjusts the trimming regex to exclude closing ### marks
only if they are preceded by a backslash (for which a test case already
exists).

[1] https://github.github.com/gfm/#atx-heading
[2] https://daringfireball.net/projects/markdown/syntax
@gettalong gettalong self-assigned this Apr 21, 2020
@gettalong
Copy link
Owner

Thanks for taking the time to write this up!

Looking at the referenced commit, at the issue #505 which was the reason for the commit, at the various test cases involving ATX headers and the repository history, I'm inclined to say that, although not explicitly mentioned, the space is required before the closing hash marks.

However, as you have written there was an unintentional behaviour change (because of missing test cases). So let my think about it and I'll come back to you tomorrow.

As for performance: I also think you are okay but you can run the simple benchmarks described in the commit message of your linked commit to verify.

@urbanautomaton
Copy link
Contributor Author

urbanautomaton commented Apr 22, 2020

Thanks for the response! The benchmark does indeed look fine:

$ ruby -Ilib benchmark.rb
Warming up --------------------------------------
  require whitespace    18.000  i/100ms
don't require whitespace
                        19.000  i/100ms
Calculating -------------------------------------
  require whitespace    195.417  (± 1.5%) i/s -    990.000  in   5.067471s
don't require whitespace
                        192.750  (± 2.6%) i/s -    969.000  in   5.030580s

Comparison:
  require whitespace:      195.4 i/s
don't require whitespace:      192.7 i/s - same-ish: difference falls within error

I agree that it's unclear what's required. I guess I'd make two arguments: if it's not mentioned by the spec, it probably shouldn't be required (although I do appreciate that sometimes decisions have to be made that aren't represented in the original spec, which is very loose).

The second is aesthetic, really: requiring trailing content whitespace without also requiring leading content whitespace feels oddly asymmetric:

## Good ##
##Good ##
## Bad##
##Bad##

Meanwhile GFM does keep things symmetric:

## Good ##
##Bad ##
## Bad##
##Bad##

The more permissive parsing gets us back to a place where the base behaviour is symmetric.

However, I must declare a vested interest in that I have 7 years' worth of user-generated content, much of it using the no-trailing-space style, which I'd rather not edit. So please take my arguments with as big a pinch of salt as you like. 😄

@gettalong
Copy link
Owner

Thanks for doing the benchmarks!

Regarding whether the space is required or not: kramdown took inspiration from the original Markdown syntax as well as more advanced syntaxes. So required or not is something I defined (or tried to define as good as possible) when developing kramdown. Maybe this part was intentional or not, I don't really remember.

Aesthetically I prefer having whitespace separating the hashes from the content. However, that's just my personal opinion.

My main reason for merging this is that there was a unintentional behavioral change.

Btw. thanks for being so open about your personal reasons regarding this PR. You argued both sides very well and researched the problem in depth nonetheless which made it possible for me to come to a decision quicker (besides me wanting to do a new release today and closing as many PRs/issues as possible 😉).

Rebased and merged - thank you!

@gettalong gettalong closed this Apr 22, 2020
@urbanautomaton
Copy link
Contributor Author

Haha, thanks very much for the merge! For what it's worth I completely agree that it looks much nicer with whitespace; if only my users were of the same mind. 😄

@urbanautomaton urbanautomaton deleted the fix-atx-header-closing branch April 22, 2020 11:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants