New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically build TOCs for pages #920

Merged
merged 16 commits into from Oct 16, 2018

Conversation

Projects
None yet
3 participants
@allejo
Contributor

allejo commented Oct 4, 2018

Summary

Greetings @coopr and Kong Doc maintainers!

Create an issue and you shall receive, especially during Hacktoberfest 🎃 I'm the guy lacking sanity who built jekyll-toc, a GitHub Pages compatible way of building table of contents automagically

Before continuing any further work, to save both our times I just wanted to present a quick proof of concept of how this would work.

jekyll-toc works by reading in HTML and extracting out H1-H6 headings, getting their IDs, and then building a TOC in markdown with that information and then markdownifying it.

Since it seems that "Terminology" and "Configuration" are automatically rendered inside of the extensions.html layout, we'll need to combine that with {{ content }} and then feed that into toc.html so that it may generate all the headings correctly; this is being achieved by using a {% capture %}.

If this is something that interests you, I'd be more than happy to continue working on this but I would need a lot of help checking pages to make sure things are being rendered correctly. And checking that headings are in correct order. (e.g. "Usage" is an H3 instead of H2 on the ACL page)

Full changelog

  • Added cool new automated ToC

Issues resolved

Fix #894

Checklist:

@coopr

This comment has been minimized.

Member

coopr commented Oct 4, 2018

Wow, this is super-awesome of you @allejo - and yay for Hacktoberfest!!

I'm 110% on-board with doing all the work of checking each and every page, adjusting heading levels, etc.

One question/request: #894 perhaps implies incorrectly that this solution should apply only to pages within /hub - what would you think about making this solution apply to any page in the docs.KongHQ.com site?

For example, you can see a Table of Contents on https://docs.konghq.com/0.14.x/secure-admin-api/ - but when you look at https://raw.githubusercontent.com/Kong/docs.konghq.com/master/app/0.14.x/secure-admin-api.md you see that the ToC is built and maintained by hand - if anyone edits/adds/removes a header in the content, they need to know and remember to also edit it in the ToC section.

@allejo

This comment has been minimized.

Contributor

allejo commented Oct 4, 2018

One question/request: #894 perhaps implies incorrectly that this solution should apply only to pages within /hub - what would you think about making this solution apply to any page in the docs.KongHQ.com site?

Sure! jekyll-toc can take whatever syntactically sane-ish HTML you toss at it. Since all that content is coming from markdown, it'll be sane, so we would just need to include toc.html in all of the layouts you'd like. I just did the /hub layout first since that's what was mentioned but this can easily be implemented anywhere else.

In your example, since the TOC is in the middle of the content, we could add a directive like <!-- toc --> in the markdown files and in the respective layout, Jekyll can look for that directive and replace it with whatever toc.html generates.

tl;dr: Yes, it's doable

@coopr

This comment has been minimized.

Member

coopr commented Oct 4, 2018

OK, I've pulled and run this locally - seems like a great start!

I see nicely auto-generated ToCs on most extension pages - awesome!

I don't see the new ToCs on pages for Kong Enterprise extensions like http://localhost:3000/hub/kong-inc/oauth2-introspection/ nor on third-party extension pages like http://localhost:3000/hub/peter-evans/paseto/

Also, though perhaps you noticed this already, I wanted to call out an exception to

Since it seems that "Terminology" and "Configuration" are automatically rendered inside of the extensions.html layout,...

This is true, but only for extensions of type: plugin - you'll see that https://docs.konghq.com/hub/okta/okta/ doesn't have those sections, because it is type: integration - not sure if that distinction matters, just trying to help smooth your efforts to help :)

@allejo allejo force-pushed the allejo:feature/auto-toc branch from 842f045 to 5f1c55d Oct 5, 2018

@allejo

This comment has been minimized.

Contributor

allejo commented Oct 5, 2018

I don't see the new ToCs on pages for Kong Enterprise extensions like http://localhost:3000/hub/kong-inc/oauth2-introspection nor on third-party extension pages like http://localhost:3000/hub/peter-evans/paseto

Ah this was because I had left in a condition to check for page.nav; I removed it, so they should be generated now.

I'm 110% on-board with doing all the work of checking each and every page, adjusting heading levels, etc.

I just realized how helpful this will truly be! I looked through a couple of pages and I couldn't quite understand the heading levels for sure so this'll be something I'll defer this task to your team.


Information/things I need:

  • Which layouts will need a toc?
  • For pages under the 0.X.Y folders, is there a specific pattern as to where those tocs will be placed? For example, always before the second heading?
  • CSS changes to the heading sizes will need to be made. Because it seems like h4 tags are being used as they generate smaller font sizes than h3. I designed toc.html to enforce proper ordering of headings so, this may cause problems if headings are out of order.
@coopr

This comment has been minimized.

Member

coopr commented Oct 5, 2018

Nice progress!

I don't see the new ToCs on pages for Kong Enterprise extensions like http://localhost:3000/hub/kong-inc/oauth2-introspection nor on third-party extension pages like http://localhost:3000/hub/peter-evans/paseto

Ah this was because I had left in a condition to check for page.nav; I removed it, so they should be generated now.

Confirmed, thanks for adjusting that!

  • Which layouts will need a toc?

I believe it is just app/_layouts/extension.html and app/_layouts/docs.html

Some of the remaining layouts are "dead code" that'll be removed soon, and I believe the other layouts like default.html and page.html are too general to need ToCs. We certainly don't need ToCs on pages like https://docs.konghq.com/ and https://docs.konghq.com/enterprise/

  • For pages under the 0.X.Y folders, is there a specific pattern as to where those tocs will be placed? For example, always before the second heading?

I see how we've been inconsistent where we've placed the manually-created ToCs. Let's keep this simple - ToCs should go at the top of the document for docs pages like https://docs.konghq.com/enterprise/0.33-x/getting-started/adding-your-api/ and https://docs.konghq.com/0.14.x/network/, and at the top of the right-hand sidebar for Hub pages like https://docs.konghq.com/hub/kong-inc/request-termination/

  • CSS changes to the heading sizes will need to be made. Because it seems like h4 tags are being used as they generate smaller font sizes than h3. I designed toc.html to enforce proper ordering of headings so, this may cause problems if headings are out of order.

Sounds good, I'll make sure those CSS changes get done!

@coopr

This comment has been minimized.

Member

coopr commented Oct 10, 2018

Wow, you went above and beyond with 8539d5d - thanks @allejo ! I'm reviewing this PR now...

@coopr

This comment has been minimized.

Member

coopr commented Oct 10, 2018

New issues:

Now these pages have auto-generated ToCs, and they should not:
http://localhost:3000/
http://localhost:3000/enterprise/

Some pages now have mal-formed ToCs, like:
http://localhost:3000/0.14.x/loadbalancing/

which looks like:

loadbalancing_reference_-v0_14_x___kong-_open-source_api_management_and_microservice_management

@allejo

This comment has been minimized.

Contributor

allejo commented Oct 10, 2018

New issues:

Now these pages have auto-generated ToCs, and they should not:
http://localhost:3000
http://localhost:3000/enterprise

Oops... That was not my intention, let me look into that.

Some pages now have mal-formed ToCs, like:
http://localhost:3000/0.14.x/loadbalancing

So this is what I was mentioning about having headings out of order, the headings will need to be fixed to be in proper order (which is what I recommend for accessibility purposes). however, we could work around this and make the TOC generator accept incorrect orders.

e.g. Notice how in app/0.14.x/loadbalancing.md it goes from # to ### right after? Guess I missed that one when I was touching some headings 😅

@coopr

This comment has been minimized.

Member

coopr commented Oct 10, 2018

Some pages now have mal-formed ToCs, like:
http://localhost:3000/0.14.x/loadbalancing

So this is what I was mentioning about having headings out of order, the headings will need to be fixed to be in proper order (which is what I recommend for accessibility purposes). however, we could work around this and make the TOC generator accept incorrect orders.

Ohhhh, now I get it - thanks for helping me along :)

Ok, let me see if I can fix the heading order on that specific page, to see if it fixes the malformed ToC...

@coopr

This comment has been minimized.

Member

coopr commented Oct 10, 2018

Ah, got it! Now I see how the heading levels have to be in order for the ToC to work - so this will cause issues:

# Heading
## Sub-heading
#### Sub-sub-sub-heading

But this will work well:

# Heading
## Sub-heading
### Sub-sub-heading

I've tuned-up the heading levels on http://localhost:3000/0.14.x/loadbalancing and now it works well, except for one thing - the bolding of the entries is inconsistent:

loadbalancing_reference_-v0_14_x___kong-_open-source_api_management_and_microservice_management

I would imagine we'd want H1 to be bold, and the others not bold, or maybe have all heading levels not be bold in the ToC?

@coopr coopr added the wip label Oct 10, 2018

@coopr

This comment has been minimized.

Member

coopr commented Oct 10, 2018

Uh oh, I've found yet another issue - sorry for the rabbit hole here @allejo - have a look at http://localhost:3000/enterprise/changelog/ - you'll see that because we have multiple headers in that page with the same name (eg. there are multiple ### Fixes) the ToC doesn't link correctly to the relevant heading. Any ideas?

@allejo

This comment has been minimized.

Contributor

allejo commented Oct 10, 2018

I just pushed a change to sanitize the headings automatically. However, this is up to you on how you'd want to handle this. In the case of the load balancing document, your headings look like:

# **Blue-Green Deployments**

So the bolding was just being kept, by sanitizing it any custom HTML is being stripped out. The only situation where this wouldn't be ideal is if you have headings with <code> blocks like on http://localhost:3000/0.14.x/proxy/


Uh oh, I've found yet another issue - sorry for the rabbit hole here @allejo - have a look at http://localhost:3000/enterprise/changelog/ - you'll see that because we have multiple headers in that page with the same name (eg. there are multiple ### Fixes) the ToC doesn't link correctly to the relevant heading. Any ideas?

Oh boy... this is would be a kramdown issue as the toc generator doesn't do anything about generating IDs; it just uses them. Let me investigate a bit further on this one.

Worst case, you could explicitly specify the IDs for the headings, e.g.

# 0.33.0

## Fixes {#0-33-0-fixes}
@coopr

This comment has been minimized.

Member

coopr commented Oct 10, 2018

I just pushed a change to sanitize the headings automatically. However, this is up to you on how you'd want to handle this. In the case of the load balancing document, your headings look like:

# **Blue-Green Deployments**

So the bolding was just being kept, by sanitizing it any custom HTML is being stripped out. The only situation where this wouldn't be ideal is if you have headings with <code> blocks like on http://localhost:3000/0.14.x/proxy/

Ah, I see - ya know, past us was doing silly things with headers that include bolding. I'll commit to removing all the silly formatting like the bolding, and let's remove the sanitizing so we can keep intentional things like the <code> blocks in headings.

Uh oh, I've found yet another issue - sorry for the rabbit hole here @allejo - have a look at http://localhost:3000/enterprise/changelog/ - you'll see that because we have multiple headers in that page with the same name (eg. there are multiple ### Fixes) the ToC doesn't link correctly to the relevant heading. Any ideas?

Oh boy... this is would be a kramdown issue as the toc generator doesn't do anything about generating IDs; it just uses them. Let me investigate a bit further on this one.

Worst case, you could explicitly specify the IDs for the headings, e.g.

# 0.33.0

## Fixes {#0-33-0-fixes}

That doesn't seem like so bad of a worst case, but it'd sure be nicer if ToC generating took care of this automagically - I appreciate your investigating :)

@allejo

This comment has been minimized.

Contributor

allejo commented Oct 10, 2018

Ah scratch that thought, you're using redcarpet instead of kramdown for your markdown engine. Duplicate IDs are a known issue in redcarpet, see vmg/redcarpet#307 (comment). The syntax in my previous post was relating to kramdown, don't know if it'll work in redcarpet. kramdown handles duplicate IDs correctly, fwiw

Jekyll moved to kramdown as the official markdown engine in Jekyll 3.0, not sure if you'd like to follow suit. You'd have to look through the docs to make sure no special markup unique to redcarpet breaks if you switch to kramdown.

allejo added a commit to allejo/docs.konghq.com that referenced this pull request Oct 10, 2018

@coopr

This comment has been minimized.

Member

coopr commented Oct 10, 2018

Ah scratch that thought, you're using redcarpet instead of kramdown for your markdown engine. Duplicate IDs are a known issue in redcarpet, see vmg/redcarpet#307 (comment). The syntax in my previous post was relating to kramdown, don't know if it'll work in redcarpet. kramdown handles duplicate IDs correctly, fwiw

Jekyll moved to kramdown as the official markdown engine in Jekyll 3.0, not sure if you'd like to follow suit. You'd have to look through the docs to make sure no special markup unique to redcarpet breaks if you switch to kramdown.

I'd love for us to be on a modern version of Jekyll, along with using kramdown and surely other things we've neglected for a while - but I'm guessing that doing that upgrade as part of this particular PR is not the right approach.

@allejo

This comment has been minimized.

Contributor

allejo commented Oct 12, 2018

d324d6f has changed the behavior so the TOC isn't loaded on index.md files OR if you set toc: false in the front-matter to false. Is this suitable behavior?

On another topic, do you want me to rewrite the commit messages to follow a specific convention?

@coopr

This comment has been minimized.

Member

coopr commented Oct 12, 2018

At first I thought the "TOC isn't loaded on index.md files" would be a problem, because we do want the ToC on Hub pages like https://github.com/Kong/docs.konghq.com/tree/master/app/_hub/peter-evans/paseto/index.md - but somehow even though that page is index.md, I see a generated ToC at http://localhost:3000/hub/peter-evans/paseto/ - could you help me understand how that happens? (I'm happy that it happens, I just don't understand why it happens :)

I tried out toc: false on that same page ^^ and it didn't make any difference - the auto-generated ToC is still present.

I tried it out on a different page http://localhost:3000/0.14.x/auth/ and it also had no effect - changing the front-matter of that file to:

auth_md_ ___documents_github_docs_konghq_com

Causes the page to render as:

authentication_reference_-v0_14_x___kong-_open-source_api_management_and_microservice_management

On another topic, do you want me to rewrite the commit messages to follow a specific convention?

Our commit message guidelines are here https://github.com/Kong/docs.konghq.com/blob/master/CONTRIBUTING.md#commit-message-format thanks!

@allejo

This comment has been minimized.

Contributor

allejo commented Oct 12, 2018

At first I thought the "TOC isn't loaded on index.md files" would be a problem, because we do want the ToC on Hub pages like /app/_hub/peter-evans/paseto/index.md@master - but somehow even though that page is index.md, I see a generated ToC at http://localhost:3000/hub/peter-evans/paseto - could you help me understand how that happens? (I'm happy that it happens, I just don't understand why it happens :)

So, I'm checking for index.md only in the docs.html layout, since that's the pattern I noticed. Everything under /hub/ uses the extension.html layout, which doesn't check for index.md, just toc: false.

I tried out toc: false on that same page ^^ and it didn't make any difference - the auto-generated ToC is still present.

Oops! My bad, latest commit should have fixed that

Our commit message guidelines are here /CONTRIBUTING.md@master#commit-message-format thanks!

Will do 👍

allejo and others added some commits Oct 5, 2018

feat(toc) don't load TOC on index.md or on FM flag
Do not load the TOC on `index.md` files that are being loaded from the
extensions layout. Additionally add support for `toc: false` in the
front matter to explicitly disable the TOC from being generated
@coopr

This comment has been minimized.

Member

coopr commented Oct 13, 2018

@allejo can you help me understand what is going on with the ToC on app/0.14.x/proxy.md - I see the following, but all the header levels look correct to me?

proxy_reference_-v0_14_x___kong-_open-source_api_management_and_microservice_management

@allejo

This comment has been minimized.

Contributor

allejo commented Oct 13, 2018

Seems like that may be an issue with redcarpet once again, see vmg/redcarpet#615.

So I originally misunderstood that this site was being compiled by GH Pages hence using my Liquid-only solution. But if you're compiling the site yourself and just hosting on GH Pages, then a plugin such as toshimaru/jekyll-toc instead could possibly solve a lot of these problems caused by redcarpet's generation.

Otherwise, a switch to kramdown would solve the problem with my implementation. Or ignoring headings 5 and 6 from the TOC could be a temporary workaround until it's updated to kramdown.

@coopr

This comment has been minimized.

Member

coopr commented Oct 15, 2018

OK, thanks @allejo - I've created #942 to track that issue.

coopr added some commits Oct 15, 2018

@coopr

This comment has been minimized.

Member

coopr commented Oct 15, 2018

I'm asking for another set of 👀 on this, as it is a rather large changeset.

I've manually reviewed and corrected all the ToCs for all Hub pages, and for Kong 0.14 and Kong Enterprise 0.33 pages - none of them are broken now.

ToCs for pages for versions prior to 0.14 and 0.33 have not been reviewed or corrected - they'll need to be, hopefully soon.

@coopr coopr requested a review from Kong/team-devx Oct 15, 2018

@coopr coopr added needs review and removed wip labels Oct 15, 2018

@hutchic

This comment has been minimized.

Member

hutchic commented Oct 16, 2018

I'm still seeing some mal-formed TOC's similar to what @coopr noticed earlier

screenshot from 2018-10-16 10-42-05

quickstart on 0.2.x - 0.13.x
clustering on 0.11.x - 0.13.x
configuration on 0.10.x, 0.12.x, 0.13.x
cli on 0.10.x - 0.13.x

@coopr

This comment has been minimized.

Member

coopr commented Oct 16, 2018

@hutchic indeed older docs need to be tuned-up - I mentioned:

ToCs for pages for versions prior to 0.14 and 0.33 have not been reviewed or corrected - they'll need to be, hopefully soon.

These are typically the result of mis-ordered heading levels (eg. H1 -> H3, or H2 -> H4) - I've not yet figured out a way of using find and replace to fix all of these at once, so the "repair" process is, at least for me, manual and tedious. Your contributions are welcome :)

@coopr coopr merged commit 0069e8a into Kong:master Oct 16, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@coopr

This comment has been minimized.

Member

coopr commented Oct 16, 2018

Congrats @allejo - I've deployed this to production! Thanks again for your huge Hacktoberfest contribution!! If you haven't already, be sure to claim your contributor t-shirt.

@allejo allejo deleted the allejo:feature/auto-toc branch Oct 17, 2018

@coopr coopr removed the needs review label Oct 19, 2018

@coopr

This comment has been minimized.

Member

coopr commented Oct 22, 2018

@allejo sorry, can you please remind me: Is there an option to force a ToC to appear at a particular place in the text in a given page (rather than at the default top-of-the-page location)?

@allejo

This comment has been minimized.

Contributor

allejo commented Oct 23, 2018

@coopr oh no worries! There currently isn't an option to move the TOC anywhere else. On doc pages, it'll be at the top and on hub/ it'll be on the sidebar. What did you have in mind?

@vividh vividh referenced this pull request Oct 24, 2018

Merged

Switch to Kramdown engine #986

8 of 8 tasks complete
@coopr

This comment has been minimized.

Member

coopr commented Oct 24, 2018

Ah ok @allejo thanks for clarifying, much appreciated. My question came up as I was reviewing some pages that had introductory content that might have been nice to put above the ToC. Instead, I created new "Introduction" headers, and put the introductory content under those headers - problem solved!

kikito added a commit that referenced this pull request Nov 1, 2018

fix(pdk-docs) remove table of context from ldoc generation
All Tables of Contexts across the site, PDK or otherwise, are generated automatically via #920 .

@kikito kikito referenced this pull request Nov 1, 2018

Merged

Fix pdk doc headings #1015

kikito added a commit that referenced this pull request Dec 4, 2018

fix(pdk-docs) remove table of context from ldoc generation
All Tables of Contexts across the site, PDK or otherwise, are generated automatically via #920 .
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment