feat(v1): strip html from TOC #1762
The approach here is to first strip the HTML from the heading's content, then render it with markdown to get the HTML content for the TOC entry, then strip the HTML from the rendered content again to get the text for the TOC entry's link.

Adds an additional dependency on striptags (MIT licensed).

Example TOC entry, given the heading of:

```markdown
## <a name="foo"></a> _Foo_
```

```javascript
{
  hashLink: 'foo',
  rawContent: '<a name="foo"></a> _Foo_',
  content: '<em>Foo</em>',
  children: []
}
```

Previously this TOC entry would be:

```javascript
{
  hashLink: 'a-name-foo-a-_foo_',
  rawContent: '<a name="foo"></a> _Foo_',
  content: '<a name="foo"></a> <em>Foo</em>',
  children: []
}
```

closes issue #1703
Deploy preview for docusaurus-2 ready! Built with commit fe3ceee
It's also perhaps good to note that if the heading has markdown in the middle of it, e.g.,
Then the hash link will be
Deploy preview for docusaurus-preview ready! Built with commit fe3ceee
Also, should this have been a
Fixes #1703

Some context on why this is important for us, which was missing from the discussion in #1740: we manually create anchor tags in our headings because the software we ship contains links back to the documentation within error logs, so we need the links to stay the same even if we end up changing the wording of the heading. This broke with some update of Docusaurus.

The fix looks good to me, but I'm not overly familiar with the internals of Docusaurus.
@ThisIsMissEm This looks great, thank you!
We will bump the minor version as it's somewhat breaking for certain users.
P.S. I read your articles before!
@yangshun hey, can you let me know when the minor version bump is out? That way I can open a PR for the kafkajs project to upgrade us.
@ThisIsMissEm We just published v1.13.0
Motivation
As a developer using kafkajs, I was annoyed by their TOC links constantly being broken or formatted totally unreadably, so I decided to write a fix.
Have you read the Contributing Guidelines on pull requests?
yes, CLA signed.
Test Plan
Added test cases to verify the output is correct; haven't yet run Docusaurus on itself, but the test cases cover what's changed.
What changed
The approach here is to first strip the HTML from the heading's content, then render it with markdown to get the HTML content for the TOC entry, then strip the HTML from the rendered content again to get the text for the TOC entry's link.
closes issue #1703, tulios/kafkajs#450
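For anyone skimming, the three steps above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `stripTags`, `renderInline`, `toSlug` and `tocEntry` are hypothetical names, and the first two are tiny stand-ins for the `striptags` package and the real markdown renderer so that the snippet runs without any dependencies.

```javascript
// Hypothetical stand-in for the `striptags` dependency:
// removes anything that looks like an HTML tag.
function stripTags(html) {
  return html.replace(/<\/?[a-zA-Z][^>]*>/g, '');
}

// Hypothetical stand-in for the markdown renderer;
// it only understands `code` spans and _emphasis_.
function renderInline(markdown) {
  return markdown
    .replace(/`([^`]+)`/g, '<code>$1</code>')
    .replace(/_([^_]+)_/g, '<em>$1</em>');
}

// Simplified slug function (assumption: the real one differs in details).
function toSlug(text) {
  return text
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

function tocEntry(rawContent) {
  // 1. Strip the HTML from the heading's raw content.
  const withoutHtml = stripTags(rawContent).trim();
  // 2. Render what is left as markdown to get the TOC entry's HTML.
  const content = renderInline(withoutHtml);
  // 3. Strip the HTML again to get plain text for the link.
  const linkText = stripTags(content);
  return { hashLink: toSlug(linkText), rawContent, content, children: [] };
}

console.log(tocEntry('<a name="foo"></a> _Foo_'));
```

With the heading content `<a name="foo"></a> _Foo_`, this sketch yields `hashLink: 'foo'` and `content: '<em>Foo</em>'`, which matches the shape of the new TOC entry described in this PR.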
Example Output
Given the heading of:
## <a name="foo"></a> _Foo_
This change will now mean the TOC Entry for that heading will be:
Previously it would have been:
Screenshots
Related PRs
#1740: I looked at that PR first, but felt like it was removing functionality by just using `rawContent`, as this would actually result in, for instance, the `<a name="foo"></a>` from the above example being output into the document twice, which probably isn't the intention.