Skip to content

feat: add no-missing-link-fragments rule #380

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 28 commits into
base: main
Choose a base branch
from

Conversation

SwetaTanwar
Copy link
Contributor

@SwetaTanwar SwetaTanwar commented May 19, 2025

Prerequisites checklist

What is the purpose of this pull request?

This PR adds a new rule no-missing-link-fragments to ensure there is no missing link fragment in the markdown

What changes did you make? (Give an overview)

Added the no-missing-link-fragments rule, along with documentation and tests.

Related Issues

fixes #369

Is there anything you'd like reviewers to focus on?

@SwetaTanwar SwetaTanwar marked this pull request as ready for review May 19, 2025 02:20
Copy link
Contributor

@snitin315 snitin315 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for taking this up. This looks like a good start. I have added a few suggestions on documentation & few false postives. I think there are few more cases we should cover that MD051 handles as well:

  1. HTML ids:

    <a id="bookmark"></a>
    
    [Link](#bookmark) // valid
  2. #top: HTML links to #top scroll to the top of a document.

    [Link](#top)
  3. Custom fragment syntax used by GitHub to highlight specific content in a document.

    [Link](#L20) // Valid
    
    [Link](#L19C5-L21C11) // Valid

@SwetaTanwar SwetaTanwar force-pushed the feat/no-missing-link-fragments branch from 67f7cc1 to b051c1a Compare May 19, 2025 07:26
@fasttime fasttime added this to Triage May 19, 2025
@github-project-automation github-project-automation bot moved this to Needs Triage in Triage May 19, 2025
@snitin315 snitin315 moved this from Needs Triage to Implementing in Triage May 19, 2025
@SwetaTanwar SwetaTanwar force-pushed the feat/no-missing-link-fragments branch from 1301dba to a36b406 Compare May 25, 2025 12:03
@SwetaTanwar SwetaTanwar requested a review from lumirlumir May 26, 2025 04:00
@SwetaTanwar SwetaTanwar requested a review from snitin315 May 26, 2025 04:00
@SwetaTanwar SwetaTanwar requested review from lumirlumir and nzakas May 26, 2025 13:48
@nzakas
Copy link
Member

nzakas commented May 27, 2025

@SwetaTanwar please double-check the merge conflict.

nzakas
nzakas previously approved these changes May 28, 2025
Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Would like @lumirlumir to review before merging.

snitin315
snitin315 previously approved these changes May 28, 2025
Copy link
Contributor

@snitin315 snitin315 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@SwetaTanwar SwetaTanwar dismissed stale reviews from snitin315 and nzakas via b7d1ee3 May 29, 2025 06:57
@SwetaTanwar SwetaTanwar requested a review from lumirlumir May 29, 2025 08:38
@SwetaTanwar
Copy link
Contributor Author

@lumirlumir Can you please review the PR

Comment on lines +113 to +121
if (customIdMatch) {
baseId = customIdMatch[1];
} else {
const tempSlugger = new GithubSlugger();
baseId = tempSlugger.slug(rawHeadingText);
}

const finalId = slugger.slug(baseId);
fragmentIds.add(ignoreCase ? finalId.toLowerCase() : finalId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I ask why you're using slug twice?

If it's intended to handle the case I mentioned in the earlier comment, I'd like to suggest traversing the heading node's children, finding the text nodes, and combining them to make a valid base ID.

So how do we distinguish between underscores used intentionally and those used for Markdown formatting?

This distinction is important because the Markdown heading below generates a slug of test_, not test_:

test_

To handle this accurately, we can't rely on regex. Instead, we need to traverse to the deepest text nodes, extract their raw values, and reassemble them—effectively reconstructing the heading text without interpreting Markdown syntax manually. This ensures the slug generation mirrors GitHub's behavior and avoids false positives or negatives.


The current logic has a problem, since GitHubSlugger remembers duplicate heading texts.

image

For example:

# foo

# foo

The first heading foo will have the slug foo, and the second heading foo will have the slug foo-1.

If we declare GitHubSlugger only once, then we can track duplicate heading IDs and append -1, -2, etc., to the slug as we encounter duplicate headings.

But in this situation, if we declare GitHubSlugger each time, the tracking will fail and result in false positives or negatives.


So in conclusion, I'd recommend traversing the heading node, finding its children’s text nodes, and recombining them to get valid text without Markdown syntax.

It would be nice if you could reference the logic implemented in https://github.com/remarkjs/strip-markdown.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you share any example of a false positive or negative, the following case works as expected:

<!-- eslint markdown/no-missing-link-fragments: "error" -->

# foo

## foo

[Link](#foo) // No error

[Link](#foo-1) // No error

[Link](#foo-2) // Error Link fragment '#foo-2' does not reference a heading or anchor in this document

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added test cases in ff82b1b

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got it, fixed the behavior in 4c98730

Comment on lines 127 to 132
if (
htmlText.startsWith("<!--") &&
htmlText.endsWith("-->")
) {
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use a stricter check? Maybe we can use a negative lookbehind regex to find invalid comments.

Here’s another edge case:

<div>
  <!-- comment -->
</div>

References:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 757a61b

const extractedId = match[1];
const finalId = slugger.slug(extractedId);
fragmentIds.add(
ignoreCase ? finalId.toLowerCase() : finalId,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check whether the id and name are inside <a> or <h1>, <h2>, ... tags?

Maybe we can reference the current implementation of another rule.

https://github.com/eslint/markdown/blob/main/src/rules/require-alt-text.js

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already checking that, added more test cases in b9ec4da

@nzakas
Copy link
Member

nzakas commented Jun 5, 2025

@lumirlumir please re-review this when you can.

@lumirlumir
Copy link
Member

lumirlumir commented Jun 6, 2025

@nzakas Sure, but I think my reviews haven’t been addressed yet. It seems like a bit more time might be needed.

@SwetaTanwar If you don’t mind, and if you’re still working on it, could you mark this PR as a draft while you continue making progress? :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Implementing
Development

Successfully merging this pull request may close these issues.

New Rule: link-fragments-should-be-valid
4 participants