with_toc_data: GitHub style anchors #186

Merged
merged 8 commits into from Aug 5, 2013


mattr- commented Aug 1, 2013

GitHub uses different anchors for TOC data than redcarpet, which is odd because my understanding is that GitHub uses Redcarpet to render markdown files.

The markdown # Getting Started becomes the anchor <a name="getting-started"> on GitHub, whereas for Redcarpet it becomes <h1 id="toc_1">.

Could Redcarpet output both styles of id, or have the option of outputting the GitHub style id?

Thanks,
Shaun

zakkain commented Mar 11, 2013

Bump – I'd also like to know this! I'm using redcarpet for the fenced code block goodness, but would also like the GitHub-style id

carwin commented Mar 12, 2013

👍 for this. Does anyone have an idea what this would take to implement? Maybe it's something I can take a whack at.

FSX commented Mar 12, 2013

You can do this by implementing your own renderer methods. Subclass the HTML and TOC renderers and override the header method. You can read the Sundown source code to see how it works.
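
A minimal sketch of that approach, assuming Redcarpet's documented `header(text, header_level)` render callback. The `slugify` helper and the `GitHubAnchorHTML` class name are made up for illustration, and the slug rules only approximate what GitHub does:

```ruby
# Hypothetical helper: approximate a GitHub-style anchor from heading text.
# Lowercase, drop anything that is not an ASCII word character, whitespace,
# or a hyphen, then turn each run of whitespace into a single hyphen.
def slugify(text)
  text.downcase.gsub(/[^\w\s-]/, "").strip.gsub(/\s+/, "-")
end

begin
  require "redcarpet"

  # Hypothetical subclass overriding the `header` callback so that the id
  # is derived from the heading text instead of a counter.
  class GitHubAnchorHTML < Redcarpet::Render::HTML
    def header(text, header_level)
      %(<h#{header_level} id="#{slugify(text)}">#{text}</h#{header_level}>\n)
    end
  end
rescue LoadError
  # Redcarpet is not installed; only the helper above is available.
end
```

With Redcarpet installed, something like `Redcarpet::Markdown.new(GitHubAnchorHTML.new).render("# Getting Started")` should then produce an `<h1>` whose id is `getting-started`. The TOC renderer would need the same override to keep the links in sync.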

carwin commented Mar 12, 2013

Hey thanks @FSX, that's good information 👍

Collaborator

robin850 commented May 18, 2013

Even though this can be done simply in pure Ruby, I think that we should improve the headers' ids. I'm working on this.

Collaborator

mattr- commented May 18, 2013

I've been working on this actually.


Collaborator

robin850 commented May 18, 2013

Oops, no problem! I have about nothing for the moment so... 😄

Collaborator

mattr- commented May 20, 2013

I don't have much either. I'll see if I can't get a bit farther on it this week. No promises though. 😃

Collaborator

robin850 commented May 20, 2013

Ok, nice! :)


mattr- was assigned Jun 2, 2013

simonc commented Jul 31, 2013

Hey guys, any news on this one ?

Collaborator

mattr- commented Jul 31, 2013

This is actually next on my list of things to work on. So, hopefully it'll land soon.

simonc commented Jul 31, 2013

Awesome :)

Collaborator

robin850 commented Jul 31, 2013

@mattr- : Awesome ❤️!

zakkain commented Jul 31, 2013

​Baller!

parkr commented Jul 31, 2013

@mattr-: Looks like you could just take a manipulated version of text here:

https://github.com/vmg/redcarpet/blob/master/ext/redcarpet/html.c#L255-L270

Collaborator

mattr- commented Jul 31, 2013

That's exactly what I'm doing. 😃 

I've just put this on the back burner until now due to other commitments.


Collaborator

mattr- commented Jul 31, 2013

I've made a fair bit of progress on this today, actually.

[screenshot: 2013-07-31, 5:00:23 pm]

and

[screenshot: 2013-07-31, 5:01:09 pm]

But if the markdown is

# First level  heading

(note the two spaces between level and heading) then I get first-level--heading as the id for the header and I'd prefer to squish all the whitespace.
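
The double hyphen appears because every single space is replaced on its own. A small Ruby illustration of the difference (the variable names are only for this sketch, and it models the behaviour in plain Ruby rather than in the C extension):

```ruby
heading = "First level  heading" # two spaces between "level" and "heading"

# Replacing each space individually yields one hyphen per space.
per_space = heading.gsub(" ", "-").downcase

# Replacing each run of spaces collapses the run into one hyphen.
squished = heading.gsub(/ +/, "-").downcase

puts per_space # first-level--heading
puts squished  # first-level-heading
```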

mattr- added some commits Aug 1, 2013

Collaborator

mattr- commented Aug 1, 2013

🚧 not quite ready yet, still trying to figure out the multiple space squishing thing 🚧

@robin850 robin850 and 1 other commented on an outdated diff Aug 1, 2013

ext/redcarpet/html.c
@@ -260,8 +260,14 @@ static inline void escape_href(struct buf *ob, const uint8_t *source, size_t len
     if (ob->size)
         bufputc(ob, '\n');
-    if ((options->flags & HTML_TOC) && (level <= options->toc_data.nesting_level))
-        bufprintf(ob, "<h%d id=\"toc_%d\">", level, options->toc_data.header_count++);
+    if ((options->flags & HTML_TOC) && (level <= options->toc_data.nesting_level)) {
+        VALUE str = rb_str_new2(bufcstr(text));
+        VALUE pattern = rb_str_new2(" ");
+        VALUE heading = rb_funcall(str, rb_intern("gsub"), 2, pattern, rb_str_new2("-"));
+        heading = rb_funcall(heading, rb_intern("downcase"), 0);
+        bufprintf(ob, "<h%d id=\"%s\">", level, StringValueCStr(heading));
+        options->toc_data.header_count++;
@robin850

robin850 Aug 1, 2013

Collaborator

I haven't done any testing on this but since this works:

>> string = "First level              heading"
=> "First level              heading"
>> string.gsub(/ +/, "-")
=> "First-level-heading"

Maybe we could try:

VALUE pattern = rb_str_new2(" +");
pattern = rb_reg_new_str(pattern, 0);

Not sure whether it will work or whether it's too tricky.

@mattr-

mattr- Aug 1, 2013

Collaborator

I can't find an rb_reg_new_str in the C API anywhere. 😢

@robin850

robin850 Aug 1, 2013

Collaborator

Yup, found it in re.c but it seems that this is not part of the API. Sorry about that.

@mattr-

mattr- Aug 1, 2013

Collaborator

The folks on the ruby-talk mailing list also recommended rb_reg_new_str, and it actually works!

@robin850 robin850 commented on the diff Aug 1, 2013

ext/redcarpet/html.c
@@ -724,6 +730,7 @@ static inline void escape_href(struct buf *ob, const uint8_t *source, size_t len
     /* Prepare the options pointer */
     memset(options, 0x0, sizeof(struct html_renderopt));
     options->flags = render_flags;
+    options->toc_data.nesting_level = 99;
@robin850

robin850 Aug 1, 2013

Collaborator

Why this change? 😄

@mattr-

mattr- Aug 1, 2013

Collaborator

Because we don't generate header tags with IDs without it. options->toc_data.nesting_level was only initialized properly in the HTML_TOC renderer before this change.

@robin850

robin850 Aug 1, 2013

Collaborator

Yes, it's my bad actually! But why not 6? :-)

@mattr-

mattr- Aug 1, 2013

Collaborator

I just chose a sufficiently high number so as to never trigger the issue. 😀
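
To make the reasoning concrete, here is a tiny Ruby model of the guard from the header hunk (this is not Redcarpet code, and `emits_id?` is a made-up name). After the `memset`, `nesting_level` is 0, so no heading level from 1 to 6 can satisfy `level <= nesting_level`:

```ruby
# Hypothetical model of the C condition
#   (options->flags & HTML_TOC) && (level <= options->toc_data.nesting_level)
def emits_id?(toc_enabled, level, nesting_level)
  toc_enabled && level <= nesting_level
end

zeroed   = (1..6).map { |level| emits_id?(true, level, 0) }  # struct left zeroed
exact    = (1..6).map { |level| emits_id?(true, level, 6) }  # covers h1 through h6
headroom = (1..6).map { |level| emits_id?(true, level, 99) } # value used in the patch

puts zeroed.none?   # true: no heading would get an id
puts exact.all?     # true
puts headroom.all?  # true
```

Under this model, 6 and 99 behave identically for Markdown headings; the larger value only adds headroom.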

@robin850 robin850 and 1 other commented on an outdated diff Aug 1, 2013

test/html_render_test.rb
@@ -199,4 +199,13 @@ def test_autolink_short_domains
     assert output.include? 'mailto:auto@l.n'
     assert output.include? '<a href="http://a/u/t/o/s/h/o/r/t">http://a/u/t/o/s/h/o/r/t</a>'
   end
+
+  def test_toc_heading_id
+    renderer = Redcarpet::Render::HTML.new(:with_toc_data => true)
+    parser = Redcarpet::Markdown.new(renderer)
+    markdown = "# First level heading"
+    output = parser.render(markdown).strip
+
@robin850

robin850 Aug 1, 2013

Collaborator

Maybe you could just define a :with_toc_data entry here so we could use a renderer with this option enabled elsewhere. Also, we could rely on the @markdown variable instead of creating a new parser. What do you think?

@mattr-

mattr- Aug 1, 2013

Collaborator

totally! I wrote my test bottom up (assertion first), so I'll go back and fix this up.

Collaborator

mattr- commented Aug 1, 2013

Figured out the multiple space squishing stuff. Should be good to go.

Collaborator

robin850 commented Aug 2, 2013

@mattr- : Awesome! 🤘 ❤️ Could you just add a changelog entry please and I think we can merge this.

parkr commented Aug 2, 2013

yey

robin850 merged commit 5ed23a3 into master Aug 5, 2013

1 check passed

default The Travis CI build passed

robin850 deleted the github-style-titles branch Aug 5, 2013

Collaborator

robin850 commented Aug 5, 2013

Okay, I've added the changelog entry. Feel free to improve it, my English knowledge is pretty bad. Thanks everyone! Nice work @mattr-! ❤️

enyo commented Aug 5, 2013

👍 Great change!
I know that this is off topic, but when can I expect this feature to land on gh-pages?

parkr commented Aug 5, 2013

@enyo If it's in RedCarpet 3.0, it may not happen for a while. Jekyll still needs to support Ruby v1.8.7 until we release v2.0, and as RedCarpet 3.0 doesn't support Ruby < 1.9.2, Jekyll (and therefore GH:Pages) can't support it either.

carwin commented Aug 5, 2013

Huzzah! Thanks!

@robin850 robin850 referenced this pull request Aug 6, 2013

@robin850 robin850 Ensure GitHub style anchors are also in TOC
Pull request #286 introduced GitHub style anchors but we forgot also to
generate these nice IDs in the tables of content.

Simply move the snippets about IDs generation to a function to avoid
code duplication and make output common if we need to output it
elsewhere.

Also remove the useless header_count entry from the toc_data struct
since we are not generating header's IDs through a counter.
4a85052

Hey, so is there documentation for how to actually do this + the TOC that @robin850 worked on in #291? Or failing that, does anyone here know how to get this working?

Collaborator

robin850 commented Aug 16, 2013

@ngoldman : I'm not sure I understand what you are asking, but if you want to use this feature you will need to depend on the GitHub repository:

gem 'redcarpet', github: 'vmg/redcarpet'

But be aware, this is still an "in-development" feature. For instance, we've just noticed that tags weren't stripped out from IDs (see #298) and I think that we will find other bugs before (and certainly after) the next release.
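
To illustrate the tag problem in plain Ruby (a sketch only: `strip_tags` is a hypothetical helper using a naive regex, not the fix that went into Redcarpet), compare an id built from rendered heading text with and without the inline tags removed:

```ruby
# Hypothetical helper: naive tag stripping before building the id.
# A regex like this is only a sketch; it is not a general HTML parser.
def strip_tags(html)
  html.gsub(/<[^>]*>/, "")
end

rendered_heading = "Install <code>redcarpet</code> <em>quickly</em>"

with_tags    = rendered_heading.downcase.gsub(/ +/, "-")
without_tags = strip_tags(rendered_heading).downcase.gsub(/ +/, "-")

puts with_tags    # install-<code>redcarpet</code>-<em>quickly</em>
puts without_tags # install-redcarpet-quickly
```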

Realizing this is off-topic, but I was working with Middleman and just looking for something along the lines of how to get it working:

Use :with_toc_data

set :markdown_engine, :redcarpet
set :markdown,
    :autolink => true,
    :fenced_code_blocks => true,
    :tables => true,
    :with_toc_data => true # yay

Helper to generate TOC

def toc(page)
  html_toc = Redcarpet::Markdown.new(Redcarpet::Render::HTML_TOC) # hooray
  file = ::File.read(page.source_file).gsub(/^(---\s*\n.*?\n?)^(---\s*$\n?)/m,'') # remove YAML frontmatter
  html_toc.render file
end

Was not obvious for a redcarpet noob but I figured it out after reading the tests, thanks for adding these features!
👍 :shipit: 🆒

Do I understand it correctly that it is still not possible to get GitHub style anchors?

If it is already part of Redcarpet 3, could you please give me a link to a blog that uses it? (source + rendered)

(By now, I use https://github.com/dafi/jekyll-toc-generator)

Collaborator

robin850 commented Jan 16, 2014

@MartinThoma : Hello, this patch is currently not part of any release. This will be in version 3.1.0 but we still have a blocker to get this version out (i.e. #307). So you can set this in your Gemfile:

gem 'redcarpet', github: 'vmg/redcarpet'

(and if you are using Jekyll on GitHub pages, you will have to generate your site locally) but duplicate anchors won't be handled correctly. We are trying to release Redcarpet as soon as possible.
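
Until duplicates are handled in the library, one possible workaround is to post-process the ids yourself. The sketch below uses a numeric-suffix scheme similar to what GitHub's rendered pages appear to do; `unique_ids` is a hypothetical helper and not part of Redcarpet:

```ruby
# Hypothetical helper: make repeated slugs unique by appending -1, -2, ...
# The first occurrence keeps its plain slug.
def unique_ids(slugs)
  seen = Hash.new(0)
  slugs.map do |slug|
    count = seen[slug]
    seen[slug] += 1
    count.zero? ? slug : "#{slug}-#{count}"
  end
end

p unique_ids(%w[usage install usage usage])
# ["usage", "install", "usage-1", "usage-2"]
```

This does not guard against a heading whose own text already slugs to something like `usage-1`; a fuller version would have to check generated ids against the `seen` table as well.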

Collaborator

mattr- commented Jan 17, 2014

This is only available in master and hasn't yet been released. It is Coming Soon™ 😃

sferik referenced this pull request Jan 30, 2014

Closed

:ship: version 3.1.0 #345

Digging in the past. For your consideration: This could have been done using POSIX regex and regexec instead of calling Ruby for help. Apparently the whole ruby.h is dragged in here just for these three lines.

I understand that this is now fully a Ruby project, but clearly splitting the two (C and Ruby) would help the "fork scene".

Thanks.

Collaborator

robin850 replied Jun 1, 2014

Sorry for the inconvenience, I will try to have a shot at this later. If we can rewrite this in pure C, then we should. Thank you!

zdne replied Jun 1, 2014

Thanks. Having as much code as (reasonably) possible in pure C would help establish a GFM Markdown parser as the "standard" Markdown parser.

The reason I am asking is that I would love to have a standard GFM parser that can be used in a variety of languages (JS, Python, Ruby, C# etc.) for tools beyond rendering Markdown (e.g. API Blueprint or MSON).

Cheers!

Collaborator

robin850 replied Jun 1, 2014

Actually GFM isn't relying on Redcarpet directly anymore; they have a sort of fork of the project without the callbacks stuff and they have diverged a bit from us at several points (there are fixes and features in GFM that aren't in Redcarpet and vice versa).

If you want a pure C Markdown parser to integrate in several languages, you can have a look at Hoedown. However, as far as I remember, they haven't yet integrated the GitHub style anchors.

zdne replied Jun 1, 2014

Thank you @robin850 !

I am familiar with Hoedown, but thought GH still uses Redcarpet for rendering GFM. What do they use then? Is it a direct fork of Redcarpet?

Hm, this really does not help towards a common standard Markdown parser (or maybe it is just a utopian idea 😄 )

Collaborator

robin850 replied Jun 1, 2014

I am familiar with Hoedown, but thought GH still uses Redcarpet for rendering GFM. What do they use then? Is it a direct fork of Redcarpet?

They use their own Markdown parser; it is not an open source project.

Hm, this really does not help towards a common standard Markdown parser (or maybe it is just a utopian idea 😄 )

There were plans to create a common Markdown parser (and continue the work already done by the original implementation).

For now, in Redcarpet, we are trying to be as consistent as possible with the original Markdown implementation. There are still gaps (like encoding e-mail addresses through auto-links) but we are working on fixing them step by step.

It is hard to have the best of both worlds between features people want and being compatible with the original Markdown parser while trying to limit the number of existing options you can pass to the different objects but I hope we will meet Markdown expectations in the future.

zdne replied Jun 1, 2014

@robin850 your comments really help me get a grasp on the Markdown situation. Thanks for sharing!

Just one last™ question if I may – what is Redcarpet policy on adapting / implementing GFM syntax (changes) – is the goal to implement all the GFM in Redcarpet or just a subset of it?

Collaborator

robin850 replied Jun 2, 2014

your comments really help me get a grasp on the Markdown situation. Thanks for sharing!

No problem, you're welcome!

Just one last™ question if I may – what is Redcarpet policy on adapting / implementing GFM syntax (changes) – is the goal to implement all the GFM in Redcarpet or just a subset of it?

Actually we're not specifically trying to follow GFM, but we are trying to meet people's expectations and needs, so for the moment the only disparities seem to be:

  • Having to enable certain extensions to match GFM output (such as :no_intra_emphasis or :hard_wrap)
  • Task lists
  • Things specific to GitHub (e.g. emojis, commits/issues references, @mention)

This article and this one should gather all these disparities and features.

zdne replied Jun 2, 2014

Thank you @robin850 ! Really helpful. Much appreciated!

Collaborator

robin850 replied Jun 2, 2014

No problem! Thanks for starting the discussion; I guess it would be cool to have a place to explain these things (maybe in the wiki or something like this). I'll think about this! :-)

zdne replied Jun 2, 2014

I think it would be good – to bring clarity to the situation on GFM features and Redcarpet goals. Wiki could do the trick.

On a personal note - I am a bit disappointed (not your or Redcarpet's fault) that GitHub does not completely open source its GFM parser...

Collaborator

mattr- replied Jun 2, 2014

zdne replied Jun 2, 2014

@mattr- Just to clarify: I am interested in working with the Markdown AST, in tools built on Markdown; not in working with a rendered document.

Collaborator

robin850 replied Jun 3, 2014

@mattr- : Whoah, I didn't even know this project, it seems really cool, thanks! :-)

Collaborator

robin850 replied Oct 31, 2014

Just for the record, this area has been rewritten in C in #426. 😃

Hey guys. Being totally ignorant I created a brand new issue today. Sorry about that.

#391

~ didn't see this discussion because it was already marked as "closed".

I'm on ruhoh v2.6 and still only getting the #toc_1 #toc_2 etc.
+1million.

zdne commented on 6d881e8 Nov 3, 2014

👍

westurner referenced this pull request in github/markup Jun 19, 2016

Closed

ENH: markdown: Table of Contents (with_toc_data) #904
