Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a GFM (Github Flavored Markdown) parser #68

Closed
wants to merge 4 commits into from

Conversation

@plexus
Copy link
Contributor

@plexus plexus commented Jul 22, 2013

As a first step, it supports "fenced code blocks" delimited by three or more
backticks.

I am aware that this has been discussed in #15, #31, #53, and #64, and that the general concensus is that using tildes is a more standard and portable approach.

However in #53 it is also hinted that a pull request for a separate parser for this specific variety of Markdown might be accepted. And since the popularity of GFM is not decreasing, I would also argue it's a very useful feature to have.

This PR adds a GFM parser, including tests and docs.

Kramdown::Document.new(text, :input => 'GFM')

I am testing the water with this PR, to see what the current sentiments are regarding GFM. The parser currently only implements a single extra feature, but if this is deemed acceptable then at least it creates a place to expand GFM conformance. I'll be happy to respond to remarks as to make this worthy of merging.

As a first step, it supports "fenced code blocks" delimited by three or more
backticks.
@plexus
Copy link
Contributor Author

@plexus plexus commented Jul 23, 2013

Wow... that's a bit strong. In #31 @gettalong wrote

Fork kramdown on github, implement the github-markdown parser based on the kramdown parser (you may not know, but kramdown supports multiple parsers, not just the default kramdown parser), send me a pull request and I will have a look at it.

So that's all I'm doing.

@src.pos += @src.matched_size
el = new_block_el(:codeblock, @src[2])
lang = @src[1].to_s.strip
el.attr['class'] = "language-#{lang}" unless lang.empty?

This comment has been minimized.

@yaauie

yaauie Jul 23, 2013

The "language-#{lang}" doesn't seem to be covered in your tests.

@ghost ghost assigned gettalong Jul 29, 2013
@gettalong
Copy link
Owner

@gettalong gettalong commented Jul 29, 2013

I will have a look at the implementation.

@akshatpradhan
Copy link

@akshatpradhan akshatpradhan commented Jul 31, 2013

+1 for this feature

@yaauie
Copy link

@yaauie yaauie commented Jul 31, 2013

👎

@svnpenn would you mind being constructive with your feedback? Why don't you think this contribution should be accepted?

@ghost
Copy link

@ghost ghost commented Jul 31, 2013

@yaauie,

@gettalong said it best, maybe people could stop opening duplicate issues.

As for the inclusion of another code block fence style: this may not really be
necessary since the only other player using this is Github itself and nobody
else...

and

The problem is that Github just invented the ``` syntax out of no apparent reason, they could have just as easily gone with the ~~~ syntax that is used by PHP Markdown Extra, python-markdown, pandoc, ... Some of this implementations just bowed to Github and implemented this other syntax but I don't see any reason to add another syntax variation.

and

If you want to create compatible docs just use ~~~ because this works in
nearly ALL markdown parsers that allow fenced code blocks.

and

If you desire portability, use the tilde version which is implemented in most
versions.

@yaauie
Copy link

@yaauie yaauie commented Jul 31, 2013

@svnpenn I appreciate you taking your time to dig those back up, but the context here is different – those were in response to trying to cram GFM synatx into the Kramdown parser; @gettalong specifically suggested adding GFM as a new parser based on the kramdown parser, which is what this pull-request does. It provides an additional option without cluttering up the prestine kramdown parser.

@yaauie
Copy link

@yaauie yaauie commented Jul 31, 2013

@gettalong wrote

One more thing: If you really want to use the Github syntax, the best option would be to create a "github-markdown" parser based on the kramdown parser (similar to the Markdown parser). This would allow for any options you may need to ensure compatibility with github flavored markdown (like github style fenced code blocks and hard
line breaks).

And later clarified:

You misunderstood me... Fork kramdown on github, implement the github-markdown parser based on the kramdown parser (you may not know, but kramdown supports multiple parsers, not just the default kramdown parser), send me a pull request and I will have a look at it.

I never said that you should fork kramdown and release a separate library. The only thing I said is that I don't like to implement this in the default kramdown parser.

@plexus
Copy link
Contributor Author

@plexus plexus commented Jul 31, 2013

I would appreciate if someone with merge access to this repo could say if a GFM based parser could be considered in principle. If not just say "thanks, but no thanks" and close this PR. I'm fine with that. I just ended up needing this to work on existing GFM documents, and decided to contribute it back so people don't have to reinvent the wheel.

To clarify : Kramdown already has parsers for two different dialects of Markdown, the Markdown parser and the Kramdown parser. This PR adds a new one, it does not impact any existing uses. It is a separate, opt-in alternative.

As for creating another "issue", this is a Pull Request, a contribution of working code. It shows up as a new issue because that's how github works. It is an invitation to review and discuss the code so that it can be made better.

@yaauie thank you for your constructive criticism. I'll look into it as soon as the author lets us know if this PR stands a chance.

@svnpenn how come you deleted the remarks you made earlier?

@gettalong
Copy link
Owner

@gettalong gettalong commented Jul 31, 2013

Below are my 2c...

In general adding new syntax to kramdown or changing existing syntax at this stage is not so easy anymore because people tend to rely on "a minor update won't break my texts".

And as stated multiple times I don't want to include the the "fenced code block with backtick" syntax in the main kramdown parser. Creating new issues when it has been clearly said that this won't be implemented is just a waste of all of our time.

However, if someone wants to implement a parser based on the kramdown parser with different functionality, that is fine by me. So @plexus: This pull request is fine by me.

@plexus
Copy link
Contributor Author

@plexus plexus commented Jul 31, 2013

I completely agree that changing the original Markdown/Kramdown parsers at this point is a bad idea. It's unfortunate that these two things (adding backtick syntax to the old parser vs creating a new parser) have gotten mixed up in a single discussion.

I added a test case for class="language=#{lang}", and now also added the GFM specific behavior where line breaks in paragraphs get turned in to actual <br> tags.

Because of this a lot of the test cases are no longer valid for the GFM parser since the output now differs. I solved this by adding a ".gfm" extension, so that file will be tested against if it exists. Although I've only done this for one test case at the moment, and added another one to explicitly test the line breaks. The other ones are currently excluded from the gfm-to-html tests.

I ran the tests on MRI 1.8.7, 1.9.3, 2.0.0 and jruby-1.7.3 and they all run green.

</div>
</div>

<div><div class="CodeRay">

This comment has been minimized.

@yaauie

yaauie Jul 31, 2013

@plexus I'm not seeing the ruby lang reference reflected in the output.

This comment has been minimized.

@gettalong

gettalong Jul 31, 2013
Owner

This is because Coderay is enabled and at work. Using kramdown --no-enable-coderay you will get the following:

<pre><code class="language-ruby">language no space
</code></pre>
'/home/arne/github/kramdown/test/testcases/span/text_substitutions/entities_as_char.text',
'/home/arne/github/kramdown/test/testcases/span/text_substitutions/entities.text',
'/home/arne/github/kramdown/test/testcases/span/text_substitutions/typography.text'
]

This comment has been minimized.

@gettalong

gettalong Jul 31, 2013
Owner

@plexus The full path should not be contained, ie. strip /home/arne/github/kramdown/ from the paths.

This comment has been minimized.

@plexus

plexus Aug 2, 2013
Author Contributor

Sorry, I missed that. I took it out now.

@trans
Copy link

@trans trans commented Aug 8, 2013

I am so glad to see work done on this! Please merge.

Providing this feature should allow Smeagol to run properly on Rubinus and JRuby.

@plexus
Copy link
Contributor Author

@plexus plexus commented Aug 14, 2013

Can this go in as it is, or are there still things I should improve first? Just checking if something is still in the way of this being merged. Thanks!

@lhazlewood
Copy link

@lhazlewood lhazlewood commented Aug 20, 2013

Huge +1 for this feature.

While I understand that GitHub is the one who invented it, GitHub's Markdown is the predominant Markdown flavor and more people know it better than any other flavor of Markdown.

Please incorporate this feature as a pragmatic solution - it will make many people happy!

@gettalong
Copy link
Owner

@gettalong gettalong commented Aug 21, 2013

I have merged your changes and updated the implementation to avoid too much code duplication.

Anything else to do?

@plexus
Copy link
Contributor Author

@plexus plexus commented Aug 22, 2013

Awesome! I think that's it for now. Thanks for the help and for a great
piece of software!
On 22 Aug 2013 00:09, "Thomas Leitner" notifications@github.com wrote:

I have merged your changes and updated the implementation to avoid too
much code duplication.

Anything else to do?


Reply to this email directly or view it on GitHubhttps://github.com//pull/68#issuecomment-23054419
.

@konklone
Copy link

@konklone konklone commented Dec 9, 2013

I've opened jekyll/jekyll#1791 to update the Jekyll docs with the existence of this option.

Notably though, this feature doesn't do any syntax-specific highlighting (e.g. Pygments) itself, it just surrounds the <code> block with a class named after the language. I recognize that could be considered out of scope, but this is something that Maruku does, and looks to be a dealbreaker for me.

It looks like in #24 the answer was, "write a plugin". @NavarroJ did that over at krampygs. If there's interest in merging that into core as an option, I bet @NavarroJ or myself could be persuaded to prepare such a PR.

@plexus
Copy link
Contributor Author

@plexus plexus commented Dec 9, 2013

Syntax highlighting in Kramdown is done with coderay. If you set the enable_coderay option to true you should get HTML with actual syntax highlighting.

@konklone
Copy link

@konklone konklone commented Dec 9, 2013

Yeah, you're right, I realized that and should have been more precise in my comment. But Pygments seems simpler, like it has more of a future, and would make Kramdown easier to switch to from a Redcarpet or Maruku based site (which is where I am now).

@gettalong
Copy link
Owner

@gettalong gettalong commented Dec 9, 2013

@konklone There will be plugable syntax highlighting in the future so that rouge can also be used (rouge can use pygments stylesheets but is comparable in speed to coderay, so much faster than any pygments solution).

@ghost ghost mentioned this pull request Dec 9, 2013
@jneen
Copy link

@jneen jneen commented Dec 9, 2013

Neat! I'm happy to add a plugin for Kramdown (similar to the existing one for [redcarpet][]) if there's a stable integration point. I've seen some code floating around that subclasses Kramdown::Converters::Html and overrides #convert_codeblock - is there a different way you envision pluggable highlighting?

@jneen
Copy link

@jneen jneen commented Dec 9, 2013

@gettalong
Copy link
Owner

@gettalong gettalong commented Dec 10, 2013

I would like to add an option that can be used to specify the syntax highlighter per converter. And another option for setting syntax highlighter specific data, like the current coderay* options. This would also allow, for example, minted to be used with the LaTeX converter.

@konklone
Copy link

@konklone konklone commented Dec 10, 2013

Both sound like great ideas to me.

@ePirat
Copy link

@ePirat ePirat commented Feb 2, 2016

Is this still in kramdown? I can't seem to use three backticks for code highlighting using jekyll with kramdown and input set to GFM.

@gettalong
Copy link
Owner

@gettalong gettalong commented Feb 16, 2016

@ePirat Yes, GFM is still in the kramdown distribution. You need to tell Jekyll that kramdown should use the GFM parser, not the standard kramdown parser - please ask the Jekyll people about this, they can help you with this.

@ePirat
Copy link

@ePirat ePirat commented Feb 16, 2016

@gettalong I actually did this, I am using the GFM option:

kramdown:
  input: GFM
  syntax_highlighter: rouge
@gettalong
Copy link
Owner

@gettalong gettalong commented Feb 16, 2016

@ePirat Then you still have to ask the Jekyll guys why it doesn't work because I don't use Jekyll myself.

@ePirat
Copy link

@ePirat ePirat commented Feb 16, 2016

@gettalong Thanks, just wanted to make sure it's not an issue of kramdown before I ask them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

9 participants
You can’t perform that action at this time.